Chi-square distribution
{{short description|Probability distribution and special case of gamma distribution}}


{{Probability distribution
  | name      = chi-square
  | type      = density
  | pdf_image  = [[File:Chi-square pdf.svg|321px]]
  | cdf_image  = [[File:Chi-square cdf.svg|321px]]
  | notation  = <math>\chi^2(k)\;</math> or <math>\chi^2_k\!</math>
  | parameters = <math>k \in \mathbb{N}^{*}~~</math>  (known as "degrees of freedom")
  | support    = <math>x \in (0, +\infty)\;</math> if <math>k = 1</math>, otherwise <math>x \in [0, +\infty)\;</math>
  | pdf        = <math>\frac{1}{2^{k/2}\Gamma(k/2)}\; x^{k/2-1} e^{-x/2}\; </math>
  | cdf        = <math>\frac{1}{\Gamma(k/2 )} \; \gamma\left(\frac{k}{2},\,\frac{x}{2}\right)\;</math>
  | mean      = <math>k</math>
  | median    = <math>\approx k\bigg(1-\frac{2}{9k}\bigg)^3\;</math>
  | mode      = <math>\max(k-2,0)\;</math>
  | variance  = <math>2k\;</math>
  | skewness  = <math>\sqrt{8/k}\,</math>
  | kurtosis  = <math>\frac{12}{k}</math>
  | entropy    = <math>\begin{align}\frac{k}{2}&+\ln(2\Gamma(\frac{k}{2})) \\ &\!+(1-\frac{k}{2})\psi(\frac{k}{2})\end{align}</math>
  | mgf        = <math>(1-2t)^{-k/2} \text{ for } t < \frac{1}{2}\;</math>
  | char      = <math>(1-2it)^{-k/2}</math><ref>{{cite web | url=http://www.planetmathematics.com/CentralChiDistr.pdf | title=Characteristic function of the central chi-square distribution | author=M.A. Sanders | access-date=2009-03-06 | archive-url=https://web.archive.org/web/20110715091705/http://www.planetmathematics.com/CentralChiDistr.pdf# | archive-date=2011-07-15 | url-status=dead }}</ref>
|pgf=<math>(1-2\ln t)^{-k/2} \text{ for } 0<t<\sqrt{e}\;</math>}}


In [[Probability theory|probability theory]] and [[Statistics|statistics]], the '''chi-square distribution''' (also '''chi-squared''' or {{nowrap|1='''<span style="font-family:serif">''χ''</span><sup>2</sup>-distribution'''}}) with {{mvar|k}} [[Degrees of freedom (statistics)|degrees of freedom]] is the distribution of a sum of the squares of {{mvar|k}} [[Independence (probability theory)|independent]] standard normal random variables. The chi-square distribution is a special case of the [[Gamma distribution|gamma distribution]] and is one of the most widely used [[Probability distribution|probability distribution]]s in inferential statistics, notably in hypothesis testing and in construction of [[Confidence interval|confidence interval]]s.<ref name=abramowitz>{{Abramowitz_Stegun_ref|26|940}}</ref><ref>NIST (2006).  [http://www.itl.nist.gov/div898/handbook/eda/section3/eda3666.htm Engineering Statistics Handbook – Chi-Squared Distribution]</ref><ref name="Johnson_et_al">{{cite book
  | last = Johnson
  | first = N. L.
  | first2 = S. |last2=Kotz |first3=N. |last3=Balakrishnan
  | title = Continuous Univariate Distributions |edition=Second |volume=1 |chapter=Chi-Square Distributions including Chi and Rayleigh |pages=415–493
  | publisher = John Wiley and Sons
  | year = 1994
  | isbn = 978-0-471-58495-7
}}</ref><ref>{{cite book
  | last = Mood
  | first = Alexander
  | first2=Franklin A. |last2=Graybill |first3=Duane C. |last3=Boes
  | title = Introduction to the Theory of Statistics |edition=Third |pages=241–246
  | publisher = McGraw-Hill
  | year = 1974
  | isbn = 978-0-07-042864-5
}}</ref> This distribution is sometimes called the '''central chi-square distribution''', a special case of the more general noncentral chi-square distribution.


The chi-square distribution is used in the common chi-square tests for [[Goodness of fit|goodness of fit]] of an observed distribution to a theoretical one, the independence of two criteria of classification of [[Data analysis|qualitative data]], and in confidence interval estimation for a population [[Standard deviation|standard deviation]] of a normal distribution from a sample standard deviation.  Many other statistical tests also use this distribution, such as [[Friedman test|Friedman's analysis of variance by ranks]].
==Definitions==
If ''Z''<sub>1</sub>, ..., ''Z''<sub>''k''</sub> are [[Independence (probability theory)|independent]], standard normal random variables, then the sum of their squares,
: <math>
    Q\ = \sum_{i=1}^k Z_i^2 ,
  </math>
is distributed according to the chi-square distribution with ''k'' degrees of freedom. This is usually denoted as
: <math>
    Q\ \sim\ \chi^2(k)\ \ \text{or}\ \ Q\ \sim\ \chi^2_k .
  </math>
 
The chi-square distribution has one parameter: a positive integer ''k'' that specifies the number of [[Degrees of freedom (statistics)|degrees of freedom]] (the number of random variables ''Z''<sub>''i''</sub> being summed).
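The defining construction can be checked directly by simulation. The following sketch (an illustration added here, assuming NumPy is available) draws <math>k</math> independent standard normals per sample, sums their squares, and compares the empirical mean and variance with the theoretical values <math>k</math> and <math>2k</math>:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
k, n_samples = 5, 100_000                # hypothetical choices for this sketch
Z = rng.standard_normal((n_samples, k))  # each row: k independent N(0, 1) draws
Q = (Z ** 2).sum(axis=1)                 # each row sum: one chi-square(k) draw

print(Q.mean(), k)       # empirical mean vs. theoretical mean k
print(Q.var(), 2 * k)    # empirical variance vs. theoretical variance 2k
</syntaxhighlight>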
 
===Introduction===
 
The chi-square distribution is used primarily in hypothesis testing, and to a lesser extent for confidence intervals for population variance when the underlying distribution is normal. Unlike more widely known distributions such as the [[Normal distribution|normal distribution]] and the [[Exponential distribution|exponential distribution]], the chi-square distribution is not as often applied in the direct modeling of natural phenomena. It arises in the following hypothesis tests, among others:
 
*Chi-square test of independence in contingency tables
*Chi-square test of goodness of fit of observed data to hypothetical distributions
*[[Likelihood-ratio test]] for nested models
*Log-rank test in survival analysis
*[[Cochran–Mantel–Haenszel statistics|Cochran–Mantel–Haenszel test]] for stratified contingency tables
 
It is also a component of the definition of the [[Student's t-distribution|t-distribution]] and the [[F-distribution]] used in t-tests, analysis of variance, and regression analysis.
 
The primary reason that the chi-square distribution is extensively used in hypothesis testing is its relationship to the normal distribution. Many hypothesis tests use a test statistic, such as the [[T-statistic|t-statistic]] in a t-test. For these hypothesis tests, as the sample size, n, increases, the [[Sampling distribution|sampling distribution]] of the test statistic approaches the normal distribution ([[Central limit theorem|central limit theorem]]). Because the test statistic (such as t) is asymptotically normally distributed, the distribution used for hypothesis testing may be approximated by a normal distribution, provided the sample size is sufficiently large. Testing hypotheses using a normal distribution is well understood and relatively easy. The simplest chi-square distribution is the square of a standard normal distribution, so wherever a normal distribution could be used for a hypothesis test, a chi-square distribution could be used.
 
Suppose that <math>Z</math> is a random variable sampled from the standard normal distribution, where the mean is <math>0</math> and the variance is <math>1</math>: <math>Z \sim N(0,1)</math>. Now, consider the random variable <math>Q = Z^2</math>. The distribution of the random variable <math>Q</math> is an example of a chi-square distribution: <math>
    \ Q\ \sim\ \chi^2_1 .
  </math> The subscript 1 indicates that this particular chi-square distribution is constructed from only 1 standard normal distribution. A chi-square distribution constructed by squaring a single standard normal distribution is said to have 1 degree of freedom. Thus, as the sample size for a hypothesis test increases, the distribution of the test statistic approaches a normal distribution. Just as extreme values of the normal distribution have low probability (and give small p-values), extreme values of the chi-square distribution have low probability.
 
An additional reason that the chi-square distribution is widely used is that it turns up as the large-sample distribution of generalized [[Likelihood-ratio test|likelihood ratio tests]] (LRT).<ref name=Westfall2013>{{cite book|last1=Westfall|first1=Peter H.|title=Understanding Advanced Statistical Methods|date=2013|publisher=CRC Press|location=Boca Raton, FL|isbn=978-1-4665-1210-8}}</ref> LRTs have several desirable properties; in particular, simple LRTs commonly provide the highest power to reject the null hypothesis ([[Neyman–Pearson lemma]]), and this also leads to optimality properties of generalised LRTs. However, the normal and chi-square approximations are only valid asymptotically. For this reason, it is preferable to use the ''t'' distribution rather than the normal approximation or the chi-square approximation for a small sample size. Similarly, in analyses of contingency tables, the chi-square approximation will be poor for a small sample size, and it is preferable to use [[Fisher's exact test]]. Ramsey shows that the exact [[Binomial test|binomial test]] is always more powerful than the normal approximation.<ref name=Ramsey1988>{{cite journal|last1=Ramsey|first1=PH|title=Evaluating the Normal Approximation to the Binomial Test|journal=Journal of Educational Statistics|date=1988|volume=13|issue=2|pages=173–82|doi=10.2307/1164752|jstor=1164752}}</ref>
 
Lancaster shows the connections among the binomial, normal, and chi-square distributions, as follows.<ref name="Lancaster1969">{{Citation
|last=Lancaster
|first=H.O.
|title=The Chi-squared Distribution
|year=1969
|publisher=Wiley
}}</ref> De Moivre and Laplace established that a binomial distribution could be approximated by a normal distribution. Specifically, they showed the asymptotic normality of the random variable
 
:<math> \chi = {m - Np \over \sqrt{Npq}} </math>
 
where <math>m</math> is the observed number of successes in <math>N</math> trials, where the probability of success is <math>p</math>, and <math>q = 1 - p</math>.
 
Squaring both sides of the equation gives
 
:<math> \chi^2 = {(m - Np)^2\over Npq} </math>
 
Using <math>N = Np + N(1 - p)</math>, <math>N = m + (N - m)</math>, and <math>q = 1 - p</math>, this equation can be rewritten as
 
:<math> \chi^2 = {(m - Np)^2\over Np} + {(N - m - Nq)^2\over Nq} </math>
 
The expression on the right is of the form that [[Biography:Karl Pearson|Karl Pearson]] would generalize to the form:
 
:<math> \chi^2 = \sum_{i=1}^n \frac{(O_i - E_i)^2}{E_i}  </math>
 
where
 
:<math> \chi^2</math> = Pearson's cumulative test statistic, which asymptotically approaches a <math>\chi^2</math> distribution.
:<math>O_i</math> = the number of observations of type <math>i</math>.
:<math>E_i = N p_i</math> = the expected (theoretical) frequency of type <math>i</math>, asserted by the null hypothesis that the fraction of type <math>i</math> in the population is <math> p_i</math>
:<math>n</math>  = the number of cells in the table.
 
In the case of a binomial outcome (flipping a coin), the binomial distribution may be approximated by a normal distribution (for sufficiently large <math>n</math>). Because the square of a standard normal distribution is the chi-square distribution with one degree of freedom, the probability of a result such as 1 head in 10 trials can be approximated either by using the normal distribution directly, or the chi-square distribution for the normalised, squared difference between observed and expected value. However, many problems involve more than the two possible outcomes of a binomial, and instead require 3 or more categories, which leads to the multinomial distribution. Just as de Moivre and Laplace sought and found the normal approximation to the binomial, Pearson sought and found a degenerate multivariate normal approximation to the multinomial distribution (the numbers in each category add up to the total sample size, which is considered fixed). Pearson showed that the chi-square distribution arose from such a multivariate normal approximation to the multinomial distribution, taking careful account of the statistical dependence (negative correlations) between numbers of observations in different categories.<ref name="Lancaster1969" />
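As a concrete illustration of Pearson's statistic, the sketch below (added here, not from the original text; it assumes SciPy and uses hypothetical counts of <math>m = 1</math> head in <math>N = 10</math> tosses of a fair coin) evaluates <math>\sum_i (O_i - E_i)^2 / E_i</math> and the corresponding upper-tail probability:

<syntaxhighlight lang="python">
from scipy.stats import chi2

N, m, p = 10, 1, 0.5                 # hypothetical data: 1 head in 10 tosses
observed = [m, N - m]                # observed counts O_i for heads and tails
expected = [N * p, N * (1 - p)]      # expected counts E_i under the null
stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
p_value = chi2.sf(stat, df=1)        # upper-tail probability of chi-square(1)
print(stat, p_value)                 # stat = 6.4, p_value ≈ 0.011
</syntaxhighlight>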
 
===Probability density function===
The [[Physics:Probability density function|probability density function]] (pdf) of the chi-square distribution is
:<math>
f(x;\,k) =
\begin{cases}
  \dfrac{x^{\frac k 2 -1} e^{-\frac x 2}}{2^{\frac k 2} \Gamma\left(\frac k 2 \right)},  & x > 0; \\ 0, & \text{otherwise}.
\end{cases}
</math>
where <math display="inline">\Gamma(k/2)</math> denotes the [[Gamma function|gamma function]], which has [[Particular values of the gamma function|closed-form values for integer <math>k</math>]].
 
For derivations of the pdf in the cases of one, two and <math>k</math> degrees of freedom, see Proofs related to chi-square distribution.
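As a quick sanity check (a sketch under the assumption that SciPy is available), the closed-form density above can be compared against <code>scipy.stats.chi2.pdf</code>:

<syntaxhighlight lang="python">
import math
from scipy.stats import chi2

def chi2_pdf(x, k):
    """The pdf above: x^(k/2 - 1) e^(-x/2) / (2^(k/2) Gamma(k/2)) for x > 0."""
    return x ** (k / 2 - 1) * math.exp(-x / 2) / (2 ** (k / 2) * math.gamma(k / 2))

x, k = 3.0, 4                               # hypothetical evaluation point
print(chi2_pdf(x, k), chi2.pdf(x, df=k))    # both ≈ 0.1673
</syntaxhighlight>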
 
===Cumulative distribution function===
 
[[File:Chernoff-bound.svg|thumb|right|400px|Chernoff bound for the [[Cumulative distribution function|CDF]] and tail (1-CDF) of a chi-square random variable with ten degrees of freedom (<math>k</math> = 10) ]]
 
Its [[Cumulative distribution function|cumulative distribution function]] is:
: <math>
    F(x;\,k) = \frac{\gamma(\frac{k}{2},\,\frac{x}{2})}{\Gamma(\frac{k}{2})} = P\left(\frac{k}{2},\,\frac{x}{2}\right),
  </math>
where <math>\gamma(s,t)</math> is the lower incomplete gamma function and <math display="inline">P(s,t)</math> is the regularized gamma function.
 
In the special case of <math>k = 2</math>, this function has the simple form:
: <math>
    F(x;\,2) = 1 - e^{-x/2}
  </math>
which can be easily derived by integrating <math>f(x;\,2)=\frac{1}{2}e^{-\frac{x}{2}}</math> directly. The integer recurrence of the gamma function makes it easy to compute  <math>F(x;\,2)</math> for other small, even <math>k</math>.
 
Tables of the chi-square cumulative distribution function are widely available and the function is included in many [[Engineering:Spreadsheet|spreadsheet]]s and all [[Software:List of statistical packages|statistical packages]].
 
Letting <math>z \equiv x/k</math>, [[Chernoff bound#The first step in the proof of Chernoff bounds|Chernoff bounds]] on the lower and upper tails of the CDF may be obtained.<ref>{{cite journal |last1=Dasgupta |first1=Sanjoy D. A. |last2=Gupta |first2=Anupam K. |date=January 2003 |title=An Elementary Proof of a Theorem of Johnson and Lindenstrauss |journal=Random Structures and Algorithms |volume=22 |issue=1 |pages=60–65 |doi=10.1002/rsa.10073 |url=http://cseweb.ucsd.edu/~dasgupta/papers/jl.pdf |access-date=2012-05-01 }}</ref>  For the cases when <math>0 < z < 1</math> (which include all of the cases when this CDF is less than half):
: <math>
    F(z k;\,k) \leq (z e^{1-z})^{k/2}.
  </math>
 
The tail bound for the cases when <math>z > 1</math>, similarly, is
: <math>
    1-F(z k;\,k) \leq (z e^{1-z})^{k/2}.
  </math>
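The tail bound can be checked numerically; the sketch below (added for illustration, assuming SciPy) compares the exact upper tail with the Chernoff bound for hypothetical <math>k = 10</math> and <math>z = 2</math>:

<syntaxhighlight lang="python">
import math
from scipy.stats import chi2

k, z = 10, 2.0                              # hypothetical values with z > 1
tail = chi2.sf(z * k, df=k)                 # exact 1 - F(zk; k)
bound = (z * math.exp(1 - z)) ** (k / 2)    # Chernoff bound (z e^{1-z})^{k/2}
print(tail, bound)                          # tail ≈ 0.029 <= bound ≈ 0.216
</syntaxhighlight>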
 
For another [[Approximation|approximation]] for the CDF modeled after the cube of a Gaussian, see under Noncentral chi-square distribution.
 
==Properties==
===Sum of squares of independent identically distributed normal random variables minus their mean===
{{main | Cochran's theorem}}
If ''Z''<sub>1</sub>, ..., ''Z''<sub>''k''</sub> are [[Independence (probability theory)|independent]] identically distributed (i.i.d.), standard normal random variables, then
: <math>
\sum_{i=1}^k(Z_i - \overline{Z})^2 \sim \chi^2_{k-1}
</math>
where
: <math>
\overline Z = \frac{1}{k} \sum_{i=1}^k Z_i.
</math>
 
===Additivity===
It follows from the definition of the chi-square distribution that the sum of independent chi-square variables is also chi-square distributed. Specifically, if <math>X_1, \ldots, X_n</math> are independent chi-square variables with <math>k_1, \ldots, k_n</math> degrees of freedom, respectively, then <math>Y = X_1 + \cdots + X_n</math> is chi-square distributed with <math>k_1 + \cdots + k_n</math> degrees of freedom.
 
===Sample mean===
The sample mean of <math>n</math> [[Independent and identically distributed random variables|i.i.d.]] chi-square variables of degree <math>k</math> is distributed according to a gamma distribution with shape <math>\alpha</math> and scale <math>\theta</math> parameters:
:<math> \overline X = \frac{1}{n} \sum_{i=1}^n X_i \sim \operatorname{Gamma}\left(\alpha=n\, k /2, \theta= 2/n \right)  \qquad \text{where } X_i \sim \chi^2(k)</math>
 
Asymptotically, given that for a shape parameter <math> \alpha </math> going to infinity, a gamma distribution converges towards a normal distribution with expectation <math> \mu = \alpha\cdot \theta </math> and variance <math> \sigma^2 = \alpha\, \theta^2 </math>, the sample mean converges towards:
 
:<math> \overline X  \xrightarrow{n \to \infty} N(\mu = k, \sigma^2 = 2\, k /n ) </math>
 
The same result would be obtained by invoking instead the [[Central limit theorem|central limit theorem]], noting that each chi-square variable of degree <math>k</math> has expectation <math> k </math> and variance <math> 2\,k </math>, and hence the sample mean <math> \overline{X}</math> has variance <math> \sigma^2 = \frac{2k}{n} </math>.
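A short simulation sketch (added here, assuming NumPy and SciPy) illustrates the gamma law of the sample mean stated above:

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import gamma

rng = np.random.default_rng(1)
n, k = 8, 3                            # hypothetical sample size and degrees of freedom
means = rng.chisquare(k, size=(100_000, n)).mean(axis=1)
g = gamma(a=n * k / 2, scale=2 / n)    # Gamma(alpha = nk/2, theta = 2/n)
print(means.mean(), g.mean())          # both ≈ k = 3
print(means.var(), g.var())            # both ≈ 2k/n = 0.75
</syntaxhighlight>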
 
===Entropy===
The [[Differential entropy|differential entropy]] is given by
: <math>
    h = -\int_{0}^\infty f(x;\,k)\ln f(x;\,k) \, dx
      = \frac k 2 + \ln \left[2\,\Gamma \left(\frac k 2 \right)\right] + \left(1-\frac k 2 \right)\, \psi\!\left[\frac k 2 \right],
  </math>
where ''ψ''(''x'') is the [[Digamma function]].
 
The chi-square distribution is the [[Maximum entropy probability distribution|maximum entropy probability distribution]] for a random variate  <math>X</math>  for which <math>\operatorname{E}(X)=k</math> and <math>\operatorname{E}(\ln(X))=\psi(k/2)+\ln(2)</math> are fixed. Since the chi-square is in the family of gamma distributions, this can be derived by substituting appropriate values in the [[Gamma distribution#Logarithmic expectation and variance|Expectation of the log moment of gamma]]. For derivation from more basic principles, see the derivation in [[Exponential family#Moment-generating function of the sufficient statistic|moment-generating function of the sufficient statistic]].
 
===Noncentral moments===
The moments about zero of a chi-square distribution with <math>k</math> degrees of freedom are given by<ref>[http://mathworld.wolfram.com/Chi-SquaredDistribution.html Chi-squared distribution], from [[MathWorld]], retrieved Feb. 11, 2009</ref><ref>M. K. Simon, ''Probability Distributions Involving Gaussian Random Variables'', New York: Springer, 2002, eq. (2.35), {{ISBN|978-0-387-34657-1}}</ref>
: <math>
    \operatorname{E}(X^m) = k (k+2) (k+4) \cdots (k+2m-2) = 2^m \frac{\Gamma\left(m+\frac{k}{2}\right)}{\Gamma\left(\frac{k}{2}\right)}.
  </math>
 
===Cumulants===
The [[Cumulant|cumulant]]s are readily obtained by a (formal) power series expansion of the logarithm of the characteristic function:
: <math>
    \kappa_n = 2^{n-1}(n-1)!\,k
  </math>
 
===Concentration===
 
The chi-square distribution exhibits strong concentration around its mean. The standard Laurent–Massart bounds<ref>https://projecteuclid.org/journals/annals-of-statistics/volume-28/issue-5/Adaptive-estimation-of-a-quadratic-functional-by-model--selection/10.1214/aos/1015957395.full, Lemma 1, retrieved May 1, 2021</ref> are:
: <math>
    \operatorname{P}(X - k \ge 2 \sqrt{k x} + 2x) \le \exp(-x)
  </math>
: <math>
    \operatorname{P}(k - X \ge 2 \sqrt{k x}) \le \exp(-x)
  </math>
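Plugging in numbers shows how these bounds behave; the following sketch (an added illustration, assuming SciPy) evaluates the first bound for hypothetical <math>k = 10</math> and <math>x = 1.5</math>:

<syntaxhighlight lang="python">
import math
from scipy.stats import chi2

k, x = 10, 1.5                                   # hypothetical values
threshold = k + 2 * math.sqrt(k * x) + 2 * x     # k + 2 sqrt(kx) + 2x
print(chi2.sf(threshold, df=k), math.exp(-x))    # ≈ 0.023 <= exp(-1.5) ≈ 0.223
</syntaxhighlight>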
 
===Asymptotic properties===
[[File:Chi-square median approx.png|thumb|400px|Approximate formula for median (from the Wilson–Hilferty transformation) compared with numerical quantile (top); and difference (blue) and relative difference (red) between numerical quantile and approximate formula (bottom).  For the chi-squared distribution, only the positive integer numbers of degrees of freedom (circles) are meaningful.]]
 
By the [[Central limit theorem|central limit theorem]], because the chi-square distribution is the sum of <math>k</math> independent random variables with finite mean and variance, it converges to a normal distribution for large <math>k</math>. For many practical purposes, for <math>k>50</math> the distribution is sufficiently close to a [[Normal distribution|normal distribution]] for the difference to be ignored.<ref>{{cite book|title=Statistics for experimenters|author=Box, Hunter and Hunter|publisher=Wiley|year=1978|isbn=978-0471093152|page=[https://archive.org/details/statisticsforexp00geor/page/118 118]|url-access=registration|url=https://archive.org/details/statisticsforexp00geor/page/118}}</ref> Specifically, if <math>X \sim \chi^2(k)</math>, then as <math>k</math> tends to infinity, the distribution of <math>(X-k)/\sqrt{2k}</math> [[Convergence of random variables#Convergence in distribution|tends]] to a standard normal distribution. However, convergence is slow as the [[Skewness|skewness]] is <math>\sqrt{8/k}</math> and the excess kurtosis is <math>12/k</math>.
 
The sampling distribution of <math>\ln(\chi^2)</math> converges to normality much faster than the sampling distribution of <math>\chi^2</math>,<ref>{{cite journal |first=M. S. |last=Bartlett |first2=D. G. |last2=Kendall |title=The Statistical Analysis of Variance-Heterogeneity and the Logarithmic Transformation |journal=Supplement to the Journal of the Royal Statistical Society |volume=8 |issue=1 |year=1946 |pages=128–138 |jstor=2983618 |doi=10.2307/2983618 }}</ref> as the logarithm removes much of the asymmetry.<ref name=":0">{{Cite journal|last=Pillai|first=Natesh S.|year=2016|title=An unexpected encounter with Cauchy and Lévy|journal=[[Annals of Statistics]]|volume=44|issue=5|pages=2089–2097|doi=10.1214/15-aos1407|arxiv=1505.01957}}</ref> Other functions of the chi-square distribution converge more rapidly to a normal distribution. Some examples are:
* If <math>X \sim \chi^2(k)</math> then <math>\sqrt{2X}</math> is approximately normally distributed with mean <math>\sqrt{2k-1}</math> and unit variance (1922, by R. A. Fisher, see (18.23), p.&nbsp;426 of Johnson<ref name="Johnson_et_al" />).
* If <math>X \sim \chi^2(k)</math> then  <math>\sqrt[3]{X/k}</math>  is approximately normally distributed with mean <math> 1-\frac{2}{9k}</math> and variance <math>\frac{2}{9k} .</math><ref>{{cite journal |last=Wilson |first=E. B. |last2=Hilferty |first2=M. M. |year=1931 |title=The distribution of chi-squared |journal=Proc. Natl. Acad. Sci. USA |volume=17 |issue=12 |pages=684–688 |bibcode=1931PNAS...17..684W |doi=10.1073/pnas.17.12.684 |pmid=16577411 |pmc=1076144 }}</ref> This is known as the Wilson–Hilferty transformation, see (18.24), p.&nbsp;426 of Johnson.<ref name="Johnson_et_al" />
**This normalizing transformation leads directly to the commonly used median approximation <math>k\bigg(1-\frac{2}{9k}\bigg)^3\;</math> by back-transforming from the mean, which is also the median, of the normal distribution.
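The quality of the Wilson–Hilferty median approximation can be inspected directly; the sketch below (added for illustration, assuming SciPy) compares it with the exact median from the quantile function:

<syntaxhighlight lang="python">
from scipy.stats import chi2

for k in (1, 5, 10, 50):
    approx = k * (1 - 2 / (9 * k)) ** 3    # median approximation k(1 - 2/(9k))^3
    exact = chi2.ppf(0.5, df=k)            # exact median via the inverse CDF
    print(k, approx, exact)                # the two agree closely for larger k
</syntaxhighlight>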
 
==Related distributions==
 
* As <math>k\to\infty</math>, <math> (\chi^2_k-k)/\sqrt{2k} ~ \xrightarrow{d}\ N(0,1) \,</math> ([[Normal distribution|normal distribution]])
*<math> \chi_k^2 \sim  {\chi'}^2_k(0)</math> (noncentral chi-square distribution with non-centrality parameter <math> \lambda = 0 </math>)
*If <math>Y \sim \mathrm{F}(\nu_1, \nu_2)</math> then <math>X = \lim_{\nu_2 \to \infty} \nu_1 Y</math> has the chi-square distribution <math>\chi^2_{\nu_{1}}</math>
:*As a special case, if <math>Y \sim \mathrm{F}(1, \nu_2)\,</math> then <math>X = \lim_{\nu_2 \to \infty} Y\,</math> has the chi-square distribution <math>\chi^2_{1}</math>
*<math> \|\boldsymbol{N}_{i=1,\ldots,k} (0,1) \|^2 \sim \chi^2_k </math> (The squared [[Norm (mathematics)|norm]] of ''k'' standard normally distributed variables is a chi-square distribution with ''k'' [[Degrees of freedom (statistics)|degrees of freedom]])
*If <math>X \sim \chi^2(\nu)\,</math> and <math>c>0 \,</math>, then <math>cX \sim \Gamma(k=\nu/2, \theta=2c)\,</math>. ([[Gamma distribution|gamma distribution]])
*If <math>X \sim \chi^2_k</math> then <math>\sqrt{X} \sim \chi_k</math> ([[Chi distribution|chi distribution]])
*If <math>X \sim \chi^2(2)</math>, then <math>X \sim \operatorname{Exp}(1/2)</math> is an [[Exponential distribution|exponential distribution]].  (See [[Gamma distribution|gamma distribution]] for more.)
*If <math>X \sim \chi^2(2k)</math>, then <math>X \sim \operatorname{Erlang}(k, 1/2)</math> is an [[Erlang distribution]]. 
*If <math> X \sim \operatorname{Erlang}(k,\lambda)</math>, then <math> 2\lambda X\sim \chi^2_{2k}</math>
*If <math>X \sim \operatorname{Rayleigh}(1)\,</math> ([[Rayleigh distribution]]) then <math>X^2 \sim \chi^2(2)\,</math>
*If <math>X \sim \operatorname{Maxwell}(1)\,</math> (Maxwell distribution)  then <math>X^2 \sim \chi^2(3)\,</math>
*If <math>X \sim \chi^2(\nu)</math> then <math>\tfrac{1}{X} \sim \operatorname{Inv-}\chi^2(\nu)\, </math> (Inverse-chi-square distribution)
*The chi-square distribution is a special case of type III [[Pearson distribution]]
* If <math>X \sim \chi^2(\nu_1)\,</math> and <math>Y \sim \chi^2(\nu_2)\,</math> are independent then <math>\tfrac{X}{X+Y} \sim \operatorname{Beta}(\tfrac{\nu_1}{2}, \tfrac{\nu_2}{2})\,</math> ([[Beta distribution|beta distribution]])
*If <math> X \sim \operatorname{U}(0,1)\, </math> ([[Uniform distribution (continuous)|uniform distribution]]) then <math> -2\log(X) \sim \chi^2(2)\,</math>
*If <math>X_i \sim \operatorname{Laplace}(\mu,\beta)\,</math> then <math>\sum_{i=1}^n \frac{2 |X_i-\mu|}{\beta} \sim \chi^2(2n)\,</math>
* If <math>X_i</math> follows the [[Generalized normal distribution|generalized normal distribution]] (version 1) with parameters <math>\mu,\alpha,\beta</math> then <math>\sum_{i=1}^n \frac{2 |X_i-\mu|^\beta}{\alpha} \sim \chi^2\left(\frac{2n}{\beta}\right)\,</math> <ref>{{cite journal |last= Bäckström |first= T. |author2=Fischer, J. |date=January 2018|title= Fast Randomization for Distributed Low-Bitrate Coding of Speech and Audio|journal= IEEE/ACM Transactions on Audio, Speech, and Language Processing |volume= 26|issue= 1|pages= 19&ndash;30|doi=  10.1109/TASLP.2017.2757601|url= https://aaltodoc.aalto.fi/handle/123456789/33466 }}</ref>
* The chi-square distribution is a transformation of the [[Pareto distribution]]
* [[Student's t-distribution]] is a transformation of the chi-square distribution
* [[Student's t-distribution]] can be obtained from the chi-square distribution and the [[Normal distribution|normal distribution]]
* The [[Noncentral beta distribution]] can be obtained as a transformation of the chi-square distribution and the noncentral chi-square distribution
* The [[Noncentral t-distribution]] can be obtained from the normal distribution and the chi-square distribution
 
A chi-square variable with <math>k</math> degrees of freedom is defined as the sum of the squares of <math>k</math> independent standard normal random variables.
 
If <math>Y</math> is a <math>k</math>-dimensional Gaussian random vector with mean vector <math>\mu</math> and rank <math>k</math> covariance matrix <math>C</math>, then <math>X = (Y-\mu )^{T}C^{-1}(Y-\mu)</math> is chi-square distributed with <math>k</math> degrees of freedom.
 
The sum of squares of statistically independent unit-variance Gaussian variables which do ''not'' have mean zero yields a generalization of the chi-square distribution called the noncentral chi-square distribution.
 
If <math>Y</math> is a vector of <math>k</math> i.i.d. standard normal random variables and <math>A</math> is a <math>k\times k</math> [[Symmetric matrix|symmetric]], [[Idempotent matrix|idempotent matrix]] with [[Rank (linear algebra)|rank]] <math>k-n</math>, then the [[Quadratic form|quadratic form]] <math>Y^TAY</math> is chi-square distributed with <math>k-n</math> degrees of freedom.
 
If <math>\Sigma</math> is a <math>p\times p</math> positive-semidefinite covariance matrix with strictly positive diagonal entries, then for <math>X\sim N(0,\Sigma)</math> and <math>w</math> a random <math>p</math>-vector independent of <math>X</math> such that <math>w_1+\cdots+w_p=1</math> and <math>w_i\geq 0, i=1,\cdots,p,</math> it holds that
 
<math>\frac{1}{\left(\frac{w_1}{X_1},\cdots,\frac{w_p}{X_p}\right)\Sigma\left(\frac{w_1}{X_1},\cdots,\frac{w_p}{X_p}\right)^{\top}}\sim\chi_1^2.</math><ref name=":0" />
 
The chi-square distribution is also naturally related to other distributions arising from the Gaussian. In particular,
 
* <math>Y</math> is [[F-distribution|F-distributed]], <math>Y \sim F(k_1, k_2)</math> if <math>Y = \frac{ {X_1}/{k_1} }{ {X_2}/{k_2} }</math>, where <math>X_1 \sim \chi^2(k_1)</math> and <math>X_2 \sim \chi^2(k_2)</math> are statistically independent.
* If <math>X_1 \sim \chi^2(k_1)</math> and <math>X_2 \sim \chi^2(k_2)</math> are statistically independent, then <math>X_1 + X_2\sim \chi^2(k_1+k_2)</math>. If <math>X_1</math> and <math>X_2</math> are not independent, then <math>X_1+X_2</math> is generally not chi-square distributed.
 
===Generalizations===
The chi-square distribution is obtained as the sum of the squares of ''k'' independent, zero-mean, unit-variance Gaussian random variables. Generalizations of this distribution can be obtained by summing the squares of other types of Gaussian random variables. Several such distributions are described below.
 
===Linear combination===
If <math>X_1,\ldots,X_n</math> are chi-square random variables and <math>a_1,\ldots,a_n\in\mathbb{R}_{>0}</math>, then a closed expression for the distribution of <math>X=\sum_{i=1}^n a_i X_i</math> is not known. It may, however, be approximated efficiently using the [[Characteristic function (probability theory)#Properties|property of characteristic functions]] of chi-square random variables.<ref>{{cite journal
|first=J.
|last=Bausch
|title=On the Efficient Calculation of a Linear Combination of Chi-Square Random Variables with an Application in Counting String Vacua
|journal=J. Phys. A: Math. Theor.
|volume=46
|issue=50
|year=2013
|pages=505202
|doi=10.1088/1751-8113/46/50/505202 |bibcode=2013JPhA...46X5202B
|arxiv=1208.2691
}}</ref>
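In practice the distribution of such a linear combination can also be approximated by plain Monte Carlo; the sketch below (added here, assuming NumPy; it is not the characteristic-function method of the cited paper) estimates quantiles of <math>X=\sum_{i=1}^n a_i X_i</math> for hypothetical weights and degrees of freedom:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(2)
a = np.array([1.0, 0.5, 2.0])                       # hypothetical positive weights a_i
k = np.array([3, 5, 2])                             # degrees of freedom of each X_i
draws = rng.chisquare(k, size=(200_000, 3)) @ a     # samples of sum_i a_i X_i
print(np.quantile(draws, [0.5, 0.95]))              # empirical median and 95th percentile
</syntaxhighlight>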
 
===Chi-square distributions===
 
====Noncentral chi-square distribution====
The noncentral chi-square distribution is obtained from the sum of the squares of independent Gaussian random variables having unit variance and ''nonzero'' means.
 
====Generalized chi-square distribution====
The generalized chi-square distribution is obtained from the quadratic form ''z′Az'' where ''z'' is a zero-mean Gaussian vector having an arbitrary covariance matrix, and ''A'' is an arbitrary matrix.
 
===Gamma, exponential, and related distributions===
The chi-square distribution <math>X \sim \chi_k^2</math> is a special case of the [[Gamma distribution|gamma distribution]], in that <math>X \sim \Gamma \left(\frac{k}2,\frac{1}2\right)</math> using the rate parameterization of the gamma distribution (or
<math>X \sim \Gamma \left(\frac{k}2,2 \right)</math> using the scale parameterization of the gamma distribution)
where ''k'' is an integer.
 
Because the [[Exponential distribution|exponential distribution]] is also a special case of the gamma distribution, we also have that if <math>X \sim \chi_2^2</math>, then <math>X\sim \operatorname{Exp}\left(\frac 1 2\right)</math> is an [[Exponential distribution|exponential distribution]].
 
The [[Erlang distribution]] is also a special case of the gamma distribution, and thus we also have that if <math>X \sim\chi_k^2</math> with even <math>k</math>, then <math>X</math> is Erlang distributed with shape parameter <math>k/2</math> and rate parameter <math>1/2</math>.
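These special-case identities are easy to confirm numerically; the sketch below (an added check, assuming SciPy) verifies that the chi-square density coincides with the corresponding gamma density under the scale parameterization:

<syntaxhighlight lang="python">
from scipy.stats import chi2, gamma

k, x = 7, 4.2                                               # hypothetical values
print(chi2.pdf(x, df=k), gamma.pdf(x, a=k / 2, scale=2))    # identical densities
</syntaxhighlight>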
 
==Occurrence and applications{{anchor|Applications}}==
The chi-square distribution has numerous applications in inferential [[Statistics|statistics]], for instance in chi-square tests and in estimating [[Variance|variance]]s. It enters the problem of estimating the mean of a normally distributed population and the problem of estimating the slope of a [[Linear regression|regression]] line via its role in [[Student's t-distribution]]. It enters all [[Analysis of variance|analysis of variance]] problems via its role in the [[F-distribution]], which is the distribution of the ratio of two independent chi-squared [[Random variable|random variable]]s, each divided by their respective degrees of freedom.
 
Following are some of the most common situations in which the chi-square distribution arises from a Gaussian-distributed sample.
 
*if <math>X_1, ..., X_n</math> are [[Independent and identically distributed random variables|i.i.d.]] <math>N(\mu, \sigma^2)</math> [[Random variable|random variable]]s, then <math>\sum_{i=1}^n(X_i - \overline{X})^2 \sim \sigma^2 \chi^2_{n-1}</math> where <math>\overline X = \frac{1}{n} \sum_{i=1}^n X_i</math>.
*The box below shows some [[Statistics|statistics]] based on <math>X_i \sim N(\mu_i, \sigma^2_i), i= 1, \ldots, k</math> independent random variables that have probability distributions related to the chi-square distribution:
<center>
{| class="wikitable" align="center"
|-
! Name !! Statistic
|-
| chi-square distribution || <math>\sum_{i=1}^k \left(\frac{X_i-\mu_i}{\sigma_i}\right)^2</math>
|-
| noncentral chi-square distribution || <math>\sum_{i=1}^k \left(\frac{X_i}{\sigma_i}\right)^2</math>
|-
| [[Chi distribution|chi distribution]] || <math>\sqrt{\sum_{i=1}^k \left(\frac{X_i-\mu_i}{\sigma_i}\right)^2}</math>
|-
| [[Noncentral chi distribution|noncentral chi distribution]] || <math>\sqrt{\sum_{i=1}^k \left(\frac{X_i}{\sigma_i}\right)^2}</math>
|}
</center>
The chi-square distribution is also often encountered in [[Physics:Magnetic resonance imaging|magnetic resonance imaging]].<ref>den Dekker A. J., Sijbers J., (2014) "Data distributions in magnetic resonance images: a review", ''Physica Medica'', [https://dx.doi.org/10.1016/j.ejmp.2014.05.002]</ref>
 
==Computational methods==
===Table of ''χ''<sup>2</sup> values vs ''p''-values===
The [[P-value|''p''-value]] is the probability of observing a test statistic ''at least'' as extreme in a chi-square distribution.  Accordingly, since the [[Cumulative distribution function|cumulative distribution function]] (CDF) for the appropriate degrees of freedom ''(df)'' gives the probability of having obtained a value ''less extreme'' than this point, subtracting the CDF value from 1 gives the ''p''-value.  A low ''p''-value, below the chosen significance level, indicates  [[Statistical significance|statistical significance]], i.e., sufficient evidence to reject the null hypothesis. A significance level of 0.05 is often used as the cutoff between significant and non-significant results.
 
The table below gives a number of ''p''-values corresponding to <math> \chi^2 </math> values for the first 10 degrees of freedom.
{| class="wikitable"
|-
! Degrees of freedom (df)
!colspan=11| <math> \chi^2 </math> value<ref>[http://www2.lv.psu.edu/jxm57/irp/chisquar.html Chi-Squared Test] Table B.2. Dr. Jacqueline S. McLaughlin at The Pennsylvania State University. In turn citing: R. A. Fisher and F. Yates, Statistical Tables for Biological Agricultural and Medical Research, 6th ed., Table IV. Two values have been corrected, 7.82 with 7.81 and 4.60 with 4.61</ref>
|-
| style="text-align:center;" | 1
| 0.004
| 0.02
| 0.06
| 0.15
| 0.46
| 1.07
| 1.64
| 2.71
| 3.84
| 6.63
| 10.83
|-
| style="text-align:center;" | 2
| 0.10
| 0.21
| 0.45
| 0.71
| 1.39
| 2.41
| 3.22
| 4.61
| 5.99
| 9.21
| 13.82
|-
| style="text-align:center;" | 3
| 0.35
| 0.58
| 1.01
| 1.42
| 2.37
| 3.66
| 4.64
| 6.25
| 7.81
| 11.34
| 16.27
|-
| style="text-align:center;" | 4
| 0.71
| 1.06
| 1.65
| 2.20
| 3.36
| 4.88
| 5.99
| 7.78
| 9.49
| 13.28
| 18.47
|-
| style="text-align:center;" | 5
| 1.14
| 1.61
| 2.34
| 3.00
| 4.35
| 6.06
| 7.29
| 9.24
| 11.07
| 15.09
| 20.52
|-
| style="text-align:center;" | 6
| 1.63
| 2.20
| 3.07
| 3.83
| 5.35
| 7.23
| 8.56
| 10.64
| 12.59
| 16.81
| 22.46
|-
| style="text-align:center;" | 7
| 2.17
| 2.83
| 3.82
| 4.67
| 6.35
| 8.38
| 9.80
| 12.02
| 14.07
| 18.48
| 24.32
|-
| style="text-align:center;" | 8
| 2.73
| 3.49
| 4.59
| 5.53
| 7.34
| 9.52
| 11.03
| 13.36
| 15.51
| 20.09
| 26.12
|-
| style="text-align:center;" | 9
| 3.32
| 4.17
| 5.38
| 6.39
| 8.34
| 10.66
| 12.24
| 14.68
| 16.92
| 21.67
| 27.88
|-
| style="text-align:center;" | 10
| 3.94
| 4.87
| 6.18
| 7.27
| 9.34
| 11.78
| 13.44
| 15.99
| 18.31
| 23.21
| 29.59
|-
! scope="row" style="text-align:right;" | P value (Probability)
| style="background: #ffa2aa" | 0.95
| style="background: #efaaaa" | 0.90
| style="background: #e8b2aa" | 0.80
| style="background: #dfbaaa" | 0.70
| style="background: #d8c2aa" | 0.50
| style="background: #cfcaaa" | 0.30
| style="background: #c8d2aa" | 0.20
| style="background: #bfdaaa" | 0.10
| style="background: #b8e2aa" | 0.05
| style="background: #afeaaa" | 0.01
| style="background: #a8faaa" | 0.001
|-
|}
 
These values can be calculated by evaluating the [[Quantile function|quantile function]] (also known as the “inverse CDF” or “ICDF”) of the chi-square distribution;<ref>[http://www.r-tutor.com/elementary-statistics/probability-distributions/chi-squared-distribution R Tutorial: Chi-squared Distribution]</ref> e.g., the {{math|''χ''<sup>2</sup>}} ICDF for {{math|1=''p'' = 0.05}} and {{math|1=df = 7}} yields {{math|2.1673 ≈ 2.17}} as in the table above, noting that {{math|1 − ''p''}} is the ''p''-value from the table.
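For instance, with SciPy (a sketch added for illustration), the df = 7 table entries can be reproduced from the quantile function:

<syntaxhighlight lang="python">
from scipy.stats import chi2

print(chi2.ppf(0.05, df=7))      # 2.1673..., the df = 7 entry under p = 0.95
print(chi2.ppf(1 - 0.05, df=7))  # 14.067..., the df = 7 entry under p = 0.05
print(chi2.sf(14.067, df=7))     # back to the upper-tail p-value, ≈ 0.05
</syntaxhighlight>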
 
==History==
This distribution was first described by the German statistician Friedrich Robert Helmert in papers of 1875–6,{{sfn|Hald|1998|pp=633–692|loc=27. Sampling Distributions under Normality}}<ref>F. R. Helmert, "[http://gdz.sub.uni-goettingen.de/dms/load/img/?PPN=PPN599415665_0021&DMDID=DMDLOG_0018 Ueber die Wahrscheinlichkeit der Potenzsummen der Beobachtungsfehler und über einige damit im Zusammenhange stehende Fragen]", ''Zeitschrift für Mathematik und Physik'' [http://gdz.sub.uni-goettingen.de/dms/load/toc/?PPN=PPN599415665_0021 21], 1876, pp. 102–219</ref> where he computed the sampling distribution of the sample variance of a normal population. Thus in German this was traditionally known as the ''Helmert'sche'' ("Helmertian") or "Helmert distribution".
 
The distribution was independently rediscovered by the English mathematician [[Biography:Karl Pearson|Karl Pearson]] in the context of [[Goodness of fit|goodness of fit]], for which he developed his Pearson's chi-square test, published in 1900, with computed table of values published in {{Harv|Elderton|1902}}, collected in {{Harv|Pearson|1914|pp=xxxi–xxxiii, 26–28|loc=Table XII}}.
The name "chi-square" ultimately derives from Pearson's shorthand for the exponent in a [[Multivariate normal distribution|multivariate normal distribution]] with the Greek letter Chi, writing
−½χ<sup>2</sup> for what would appear in modern notation as −½'''x'''<sup>T</sup>Σ<sup>−1</sup>'''x''' (Σ being the [[Covariance matrix|covariance matrix]]).<ref>
R. L. Plackett, ''Karl Pearson and the Chi-Squared Test'', International Statistical Review, 1983,  [https://www.jstor.org/stable/1402731?seq=3 61f.]
See also Jeff Miller,  [http://jeff560.tripod.com/c.html Earliest Known Uses of Some of the Words of Mathematics].
</ref> The idea of a family of "chi-square distributions", however, is not due to Pearson but arose as a further development due to Fisher in the 1920s.{{sfn|Hald|1998|pp=633–692|loc=27. Sampling Distributions under Normality}}
 
==See also==
{{Colbegin}}
* [[Chi distribution]]
* [[Cochran's theorem]]
* [[F-distribution|''F''-distribution]]
* [[Fisher's method]] for combining independent tests of significance
* [[Gamma distribution]]
* Generalized chi-square distribution
* Hotelling's ''T''-square distribution
* Noncentral chi-square distribution
* Pearson's chi-square test
* [[Reduced chi-squared statistic]]
* [[Student's t-distribution|Student's ''t''-distribution]]
* [[Wilks's lambda distribution]]
* [[Wishart distribution]]
{{Colend}}
 
==References==
{{Reflist|30em}}
 
==Further reading==
{{refbegin}}
* {{cite book |title=A history of mathematical statistics from 1750 to 1930 |last=Hald |first=Anders |year=1998 |publisher=Wiley |location=New York |isbn=978-0-471-17912-2 }}
* {{Cite journal | last = Elderton | first = William Palin |  title = Tables for Testing the Goodness of Fit of Theory to Observation | doi = 10.1093/biomet/1.2.155 | journal = Biometrika | volume = 1 | issue = 2 | pages = 155–163 | year = 1902 | url = https://zenodo.org/record/1431595 }}
* {{springer|title=Chi-squared distribution|id=Chi-squared_distribution}}
{{refend}}
 
==External links==
*[http://jeff560.tripod.com/c.html Earliest Uses of Some of the Words of Mathematics: entry on Chi squared has a brief history]
*[http://www.stat.yale.edu/Courses/1997-98/101/chigf.htm Course notes on Chi-Squared Goodness of Fit Testing] from Yale University Stats 101 class.
*[http://demonstrations.wolfram.com/StatisticsAssociatedWithNormalSamples/ ''Mathematica'' demonstration showing the chi-squared sampling distribution of various statistics, e. g. Σ''x''², for a normal population]
*[https://www.jstor.org/stable/2348373 Simple algorithm for approximating cdf and inverse cdf for the chi-squared distribution with a pocket calculator]
* [https://www.medcalc.org/manual/chi-square-table.php Values of the Chi-squared distribution]
 
{{ProbDistributions|continuous-semi-infinite}}
 
{{DEFAULTSORT:Chi-Squared Distribution}}
[[Category:Normal distribution]]
[[Category:Infinitely divisible probability distributions]]
 
{{Sourceattribution|Chi-square distribution|1}}
