Statistics
edu.rit.numeric

## Class Statistics

• public class Statistics extends Object
Class Statistics provides static methods for doing statistical tests.

For each statistical test, there is a method that returns the "p-value" of the test statistic. This is the probability that the test statistic would have a value greater than or equal to the observed value if the null hypothesis is true.

• ### Method Summary

Methods

| Modifier and Type | Method and Description |
|---|---|
| static double | bernoulliChiSquarePvalue(double chisqr): Returns the p-value of a Bernoulli chi-square statistic. |
| static double | bernoulliChiSquareTest(long total, long measured): Do a Bernoulli chi-square test on the given data. |
| static double | binomialKsTest(int[] data, int n): Do a Kolmogorov-Smirnov (K-S) test on the given data. |
| static double | chiSquarePvalue(double N, double chisqr): Returns the p-value of a chi-square statistic. |
| static double | chiSquareTest(double[] measured, double[] expected): Do a chi-square test on the given data. |
| static double | ksPvalue(double N, double D): Returns the p-value of a K-S statistic. |
| static double | ksTest(double[] data): Do a Kolmogorov-Smirnov (K-S) test on the given data. |
| static double | ksTest(double[] data, Function cdf): Do a Kolmogorov-Smirnov (K-S) test on the given data. |
| static double | normalPvalue(double x, double mean, double stddev): Returns the p-value of a statistic drawn from a normal distribution. |
| static double | ySquarePvalue(double N, double ysqr): Returns the p-value of a Y-square statistic. |
| static double | ySquareTest(int N, double[] measured, double[] expected): Do a Y-square test on the given data. |
• ### Method Detail

• #### chiSquareTest

public static double chiSquareTest(double[] measured, double[] expected)
Do a chi-square test on the given data. The null hypothesis is that the data was drawn from the distribution given by expected. The measured and expected arrays must be the same length.
Parameters:
measured - Measured count in each bin.
expected - Expected count in each bin.
Returns:
Chi-square statistic.
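As an illustration of the statistic described above, here is a minimal self-contained sketch of the conventional Pearson chi-square sum over bins, Σ (measured − expected)² / expected. It is not the library's source: the class name ChiSquareSketch and the method chiSquare are hypothetical, and whether Statistics.chiSquareTest() uses exactly this definition is an assumption.

```java
// Sketch only: conventional Pearson chi-square statistic.
// ChiSquareSketch and chiSquare are hypothetical names, not library API.
class ChiSquareSketch {

    // Sums (measured - expected)^2 / expected over all bins.
    static double chiSquare(double[] measured, double[] expected) {
        if (measured.length != expected.length) {
            throw new IllegalArgumentException("arrays must be the same length");
        }
        double chisqr = 0.0;
        for (int i = 0; i < measured.length; i++) {
            double d = measured[i] - expected[i];
            chisqr += d * d / expected[i];
        }
        return chisqr;
    }

    public static void main(String[] args) {
        double[] measured = {18, 22, 20, 20};
        double[] expected = {20, 20, 20, 20};
        System.out.println("chi-square = " + chiSquare(measured, expected));
    }
}
```

With the library, the equivalent call would be Statistics.chiSquareTest(measured, expected), and the returned statistic would then be passed to chiSquarePvalue() together with the degrees of freedom.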
• #### chiSquarePvalue

public static double chiSquarePvalue(double N, double chisqr)
Returns the p-value of a chi-square statistic.
Parameters:
N - Degrees of freedom.
chisqr - Chi-square statistic.
Returns:
P-value.
• #### bernoulliChiSquareTest

public static double bernoulliChiSquareTest(long total, long measured)
Do a Bernoulli chi-square test on the given data. The null hypothesis is that the data was drawn from a Bernoulli distribution with both outcomes equally likely (e.g., a fair coin). total is the total number of trials. measured is the number of trials yielding one of the outcomes. (total − measured) is the number of trials yielding the other outcome.
Parameters:
total - Total number of trials.
measured - Number of trials yielding one of the outcomes.
Returns:
Chi-square statistic.
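For a fair coin, each outcome has an expected count of total / 2, so the two-bin Pearson sum collapses to (2·measured − total)² / total. The sketch below works through that arithmetic. The class and method names are hypothetical, and it is an assumption that the library's statistic matches this conventional form.

```java
// Sketch only: two-bin chi-square statistic for equally likely outcomes.
// BernoulliChiSquareSketch and statistic are hypothetical names.
class BernoulliChiSquareSketch {

    // Both bins have expected count total/2, and their deviations are
    // equal and opposite, so the sum simplifies to (2m - total)^2 / total.
    static double statistic(long total, long measured) {
        double diff = 2.0 * measured - total;
        return diff * diff / total;
    }

    public static void main(String[] args) {
        // 60 heads in 100 tosses.
        System.out.println("chi-square = " + statistic(100, 60));
    }
}
```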
• #### bernoulliChiSquarePvalue

public static double bernoulliChiSquarePvalue(double chisqr)
Returns the p-value of a Bernoulli chi-square statistic.
Parameters:
chisqr - Chi-square statistic.
Returns:
P-value.
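One plausible way to obtain such a p-value, assuming the statistic is referred to a chi-square distribution with one degree of freedom, is p = erfc(√(chisqr / 2)). The sketch below uses the Abramowitz and Stegun 7.1.26 polynomial approximation for erfc, since the Java standard library has none. The names are hypothetical, and the library may compute the value differently and more accurately.

```java
// Sketch only: upper-tail p-value of a chi-square statistic with one
// degree of freedom. Names are hypothetical, not library API.
class BernoulliPvalueSketch {

    // Abramowitz & Stegun 7.1.26 approximation to erfc(x), valid for
    // x >= 0, with absolute error of roughly 1.5e-7.
    static double erfc(double x) {
        double t = 1.0 / (1.0 + 0.3275911 * x);
        double poly = t * (0.254829592
                + t * (-0.284496736
                + t * (1.421413741
                + t * (-1.453152027
                + t * 1.061405429))));
        return poly * Math.exp(-x * x);
    }

    // P(chi-square with 1 d.o.f. >= chisqr) = P(|Z| >= sqrt(chisqr)).
    static double pvalue(double chisqr) {
        return erfc(Math.sqrt(chisqr / 2.0));
    }

    public static void main(String[] args) {
        System.out.println("p-value for chisqr = 4.0: " + pvalue(4.0));
    }
}
```

A small p-value (below a chosen significance level such as 0.05) suggests the data are unlikely under the fair-coin null hypothesis.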
• #### ySquareTest

public static double ySquareTest(int N, double[] measured, double[] expected)
Do a Y-square test on the given data. The null hypothesis is that the data was drawn from the distribution given by expected. The measured and expected arrays must be the same length.

The Y-square test is similar to the chi-square test, except the Y-square statistic is valid even if the expected counts in some of the bins are small, which is not true of the chi-square statistic. For further information, see:

L. Lucy. Hypothesis testing for meagre data sets. Monthly Notices of the Royal Astronomical Society, 318(1):92-100, October 2000.

Parameters:
N - Degrees of freedom.
measured - Measured count in each bin.
expected - Expected count in each bin.
Returns:
Y-square statistic.
• #### ySquarePvalue

public static double ySquarePvalue(double N, double ysqr)
Returns the p-value of a Y-square statistic.
Parameters:
N - Degrees of freedom.
ysqr - Y-square statistic.
Returns:
P-value.
• #### ksTest

public static double ksTest(double[] data)
Do a Kolmogorov-Smirnov (K-S) test on the given data. The null hypothesis is that the data was drawn from a uniform distribution between 0.0 and 1.0.

The values in the data array must all be in the range 0.0 through 1.0 and must be in ascending numerical order. The ksTest() method does not sort the data itself because the process that produced the data might already have sorted the data. If necessary, call Arrays.sort(data) before calling ksTest(data).

Parameters:
data - Data array.
Returns:
K-S statistic.
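To make the sorted-input requirement concrete, here is a minimal sketch of the standard two-sided K-S statistic against the uniform distribution on [0, 1]: the largest gap between the empirical CDF (a step function that jumps by 1/n at each data point) and the line F(x) = x. The names are hypothetical, and it is an assumption that the library returns this same two-sided D.

```java
// Sketch only: two-sided K-S statistic against Uniform(0, 1).
// UniformKsSketch and ksStatistic are hypothetical names.
class UniformKsSketch {

    // Expects data already in ascending order, each value in [0, 1].
    static double ksStatistic(double[] sorted) {
        int n = sorted.length;
        double d = 0.0;
        for (int i = 0; i < n; i++) {
            double x = sorted[i];
            // Gap just after the step at x, and gap just before it.
            double above = (i + 1.0) / n - x;
            double below = x - (double) i / n;
            d = Math.max(d, Math.max(above, below));
        }
        return d;
    }

    public static void main(String[] args) {
        double[] data = {0.1, 0.4, 0.7};   // already ascending
        System.out.println("D = " + ksStatistic(data));
    }
}
```

Because the loop relies on index i being the rank of sorted[i], unsorted input silently produces a meaningless statistic rather than an error.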
• #### ksTest

public static double ksTest(double[] data, Function cdf)
Do a Kolmogorov-Smirnov (K-S) test on the given data. The null hypothesis is that the data was drawn from the distribution specified by the given Function. cdf.f(x) must return the value of the cumulative distribution function at x, in the range 0.0 through 1.0.

The values in the data array must all be in the domain of cdf and must be in ascending numerical order. The ksTest() method does not sort the data itself because the process that produced the data might already have sorted the data. If necessary, call Arrays.sort(data) before calling ksTest(data,cdf).

Parameters:
data - Data array.
cdf - Cumulative distribution function.
Returns:
K-S statistic.
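The same computation generalizes to an arbitrary null distribution by comparing the empirical CDF against cdf(x) instead of x. The sketch below is self-contained, so it substitutes java.util.function.DoubleUnaryOperator for the library's Function type. That substitution, the class name, and the exponential example CDF are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.function.DoubleUnaryOperator;

// Sketch only: two-sided K-S statistic against a caller-supplied CDF.
// DoubleUnaryOperator stands in for the library's Function type.
class CdfKsSketch {

    // Expects data in ascending order; cdf must return values in [0, 1].
    static double ksStatistic(double[] sorted, DoubleUnaryOperator cdf) {
        int n = sorted.length;
        double d = 0.0;
        for (int i = 0; i < n; i++) {
            double f = cdf.applyAsDouble(sorted[i]);
            double above = (i + 1.0) / n - f;
            double below = f - (double) i / n;
            d = Math.max(d, Math.max(above, below));
        }
        return d;
    }

    public static void main(String[] args) {
        double[] data = {1.2, 0.3, 2.5, 0.8};
        Arrays.sort(data);   // the statistic assumes ascending order
        // Hypothetical null distribution: exponential with rate 1.
        DoubleUnaryOperator expCdf = x -> 1.0 - Math.exp(-x);
        System.out.println("D = " + ksStatistic(data, expCdf));
    }
}
```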
• #### binomialKsTest

public static double binomialKsTest(int[] data, int n)
Do a Kolmogorov-Smirnov (K-S) test on the given data. The null hypothesis is that the data was drawn from a binomial random variable X that is the sum of n equiprobable Bernoulli random variables. For 0 ≤ k ≤ n, the probability that X equals k is

$$\Pr[X = k] = 2^{-n} \, \frac{n!}{k! \, (n-k)!}$$

The values in the data array must all be in the range 0 .. n and must be in ascending numerical order. The binomialKsTest() method does not sort the data itself because the process that produced the data might already have sorted the data. If necessary, call Arrays.sort(data) before calling binomialKsTest(data,n).

Note: To prevent roundoff error, the internal calculations are done using exact rational arithmetic. The final K-S statistic is then converted to a double-precision floating-point number and is returned.

Parameters:
data - Data array.
n - Number of Bernoulli random variables.
Returns:
K-S statistic.
Throws:
IllegalArgumentException - (unchecked exception) Thrown if n ≤ 0.
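The binomial null distribution lends itself to exact arithmetic because every probability is a binomial coefficient divided by 2^n. As a sketch of that idea, the code below builds the cumulative distribution Pr[X ≤ k] as exact BigInteger numerators over the common denominator 2^n. How the library actually represents its rationals is not stated here, so this representation and all names are assumptions.

```java
import java.math.BigInteger;

// Sketch only: exact binomial CDF for n fair Bernoulli trials.
// BinomialCdfSketch and its methods are hypothetical names.
class BinomialCdfSketch {

    // Exact binomial coefficient C(n, k); every intermediate value is
    // itself a binomial coefficient, so each division is exact.
    static BigInteger binomial(int n, int k) {
        BigInteger c = BigInteger.ONE;
        for (int i = 1; i <= k; i++) {
            c = c.multiply(BigInteger.valueOf(n - k + i))
                 .divide(BigInteger.valueOf(i));
        }
        return c;
    }

    // Numerators of Pr[X <= k] for k = 0..n; the denominator is 2^n.
    static BigInteger[] cdfNumerators(int n) {
        BigInteger[] num = new BigInteger[n + 1];
        BigInteger sum = BigInteger.ZERO;
        for (int k = 0; k <= n; k++) {
            sum = sum.add(binomial(n, k));
            num[k] = sum;
        }
        return num;
    }

    public static void main(String[] args) {
        int n = 4;
        BigInteger denom = BigInteger.ONE.shiftLeft(n);   // 2^n
        BigInteger[] num = cdfNumerators(n);
        for (int k = 0; k <= n; k++) {
            System.out.println("Pr[X <= " + k + "] = " + num[k] + "/" + denom);
        }
    }
}
```

Keeping numerators and the denominator as integers defers any rounding to a single final conversion to double, which is the benefit the note above describes.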
• #### ksPvalue

public static double ksPvalue(double N, double D)
Returns the p-value of a K-S statistic.
Parameters:
N - Number of data points.
D - K-S statistic.
Returns:
P-value.
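A common way to turn a K-S statistic into a p-value is the asymptotic Kolmogorov series Q(λ) = 2 Σ_{j≥1} (−1)^{j−1} exp(−2 j² λ²), evaluated at λ = (√N + 0.12 + 0.11/√N) · D, a small-sample correction popularized by Numerical Recipes. The sketch below implements that approach. Whether the library's ksPvalue() uses this formula is an assumption, and the names are hypothetical.

```java
// Sketch only: asymptotic K-S p-value via the Kolmogorov series.
// KsPvalueSketch and its methods are hypothetical names.
class KsPvalueSketch {

    // Q(lambda) = 2 * sum_{j>=1} (-1)^(j-1) * exp(-2 j^2 lambda^2).
    static double kolmogorovQ(double lambda) {
        double a2 = -2.0 * lambda * lambda;
        double sign = 2.0;   // carries the factor 2 and the alternating sign
        double sum = 0.0;
        double prev = 0.0;   // magnitude of the previous term
        for (int j = 1; j <= 100; j++) {
            double term = sign * Math.exp(a2 * j * j);
            sum += term;
            // Stop once the terms are negligible relative to the
            // previous term or to the running sum.
            if (Math.abs(term) <= 1e-3 * prev || Math.abs(term) <= 1e-8 * sum) {
                return sum;
            }
            sign = -sign;
            prev = Math.abs(term);
        }
        return 1.0;   // no convergence: lambda is tiny, so p is essentially 1
    }

    // n is the number of data points, d is the K-S statistic.
    static double pvalue(double n, double d) {
        double sqrtN = Math.sqrt(n);
        return kolmogorovQ((sqrtN + 0.12 + 0.11 / sqrtN) * d);
    }

    public static void main(String[] args) {
        System.out.println("p-value for N = 100, D = 0.1: " + pvalue(100, 0.1));
    }
}
```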
• #### normalPvalue

public static double normalPvalue(double x, double mean, double stddev)
Returns the p-value of a statistic drawn from a normal distribution.
Parameters:
x - Statistic.
mean - Mean of the normal distribution.
stddev - Standard deviation of the normal distribution.
Returns:
P-value.