
Chi-Square Distributions

Chi-square distributions arise in the study of sample variances. Mathematically, a squared standard score (squared z-score) from a normal distribution has a chi-square distribution with one degree of freedom. Because chi-square distributions are a type of gamma distribution, and are therefore additive, and because variances are found by squaring deviations from the mean, a suitably scaled sample variance from a normal population will have a chi-square distribution.

The Formulas

If $X$ has a chi-square distribution with $r$ degrees of freedom over the interval   $[0, \infty)$,   then the following formulas apply.

\begin{align} f(x) &= \dfrac{1}{\Gamma \left(\dfrac{r}{2}\right) 2^{r/2}} x^{\frac{r}{2} - 1} e^{-\frac{x}{2}} \\ M(t) &= \dfrac{1}{(1 - 2t)^{r/2}} \\ E(X) &= r \\ Var(X) &= 2r \\ \sigma_X &= \sqrt{2r} \end{align}

When using the chi-square distribution in practice, we use either tables of values of the chi-square CDF (or values related to the CDF), or available technology.
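As a sanity check on the formulas above, the following minimal sketch evaluates the PDF with Python's standard library and integrates it numerically. The helper names `chi2_pdf` and `integrate`, the choice $r = 5$, and the truncation of the interval at 100 are illustrative assumptions, not part of the original text.

```python
import math

def chi2_pdf(x, r):
    """Chi-square density with r degrees of freedom (zero for x <= 0)."""
    if x <= 0:
        return 0.0
    # Work with logarithms to avoid overflow in Gamma(r/2) for large r.
    log_f = ((r / 2 - 1) * math.log(x) - x / 2
             - math.lgamma(r / 2) - (r / 2) * math.log(2))
    return math.exp(log_f)

def integrate(g, a, b, n=20000):
    """Composite midpoint rule on [a, b] with n subintervals."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

r = 5            # illustrative degrees of freedom
upper = 100.0    # assumed cutoff; the tail beyond it is negligible for r = 5

total = integrate(lambda x: chi2_pdf(x, r), 0.0, upper)
mean = integrate(lambda x: x * chi2_pdf(x, r), 0.0, upper)
second = integrate(lambda x: x * x * chi2_pdf(x, r), 0.0, upper)
var = second - mean ** 2

print(total, mean, var)  # should be close to 1, r, and 2r respectively
```

The three numbers should agree with a total area of 1, $E(X) = r$, and $Var(X) = 2r$ up to the error of the numerical integration.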

Computing with the Chi-Square Distribution

Analogous to the standard score, the notation $\chi^2_{\alpha}$ is used to identify the $\chi^2$-score for which $\alpha$ is the area under the chi-square curve to the right of $\chi^2_{\alpha}$. As with the normal distribution, $\chi^2_{\alpha}$ and the CDF use different areas: $\chi^2_{\alpha}$ is defined by the area to its right, while the CDF $F$ gives the area to the left, so that   $F(\chi^2_{\alpha}) = 1 - \alpha$. Also, since the chi-square curve is not symmetric,   $\chi^2_{\alpha} \ne -\chi^2_{1-\alpha}$.

Let us consider a few examples. Remember that probabilities are areas under a PDF curve.
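Where software is available, such areas can be computed directly. The sketch below is one possible approach using only Python's standard library: it evaluates the chi-square CDF through a series for the regularized lower incomplete gamma function. The function names and the sample values (4 degrees of freedom, the point 7) are assumptions chosen for illustration.

```python
import math

def reg_lower_gamma(a, x, tol=1e-15, max_terms=10000):
    """Regularized lower incomplete gamma function P(a, x), by power series."""
    if x <= 0:
        return 0.0
    term = 1.0 / a          # n = 0 term of sum x^n / (a (a+1) ... (a+n))
    total = term
    n = 0
    while abs(term) > tol * abs(total) and n < max_terms:
        n += 1
        term *= x / (a + n)
        total += term
    # Prefactor x^a e^{-x} / Gamma(a), computed in log form.
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))

def chi2_cdf(x, r):
    """Area to the LEFT of x under the chi-square curve with r degrees of freedom."""
    return reg_lower_gamma(r / 2, x / 2)

# Illustrative example: r = 4 degrees of freedom, point x = 7.
p_left = chi2_cdf(7.0, 4)      # P(X <= 7)
p_right = 1.0 - p_left         # P(X > 7), the right-tail area
print(p_left, p_right)
```

The series converges for every positive argument but becomes slow for very large values, so this sketch is intended for the moderate values typical of textbook problems.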

Finding the value from an area under a chi-square curve is a bit more problematic, since an inverse chi-square function is not generally available on calculators. If computer software is available, such values may be easy to obtain. However, a table feature on a TI graphing calculator can be used to "zoom in" on the required value. This is done as follows:

We describe this process in the following two examples.
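The same "zoom in" idea can be sketched in code as a bisection search: repeatedly halve an interval that brackets the required value until the CDF matches the target area. This is a minimal illustration using only Python's standard library, not the calculator procedure itself; the function names are assumptions, and the CDF is computed from a series for the regularized lower incomplete gamma function.

```python
import math

def chi2_cdf(x, r, tol=1e-15, max_terms=10000):
    """Left-tail area of the chi-square distribution with r degrees of freedom."""
    if x <= 0:
        return 0.0
    a, h = r / 2, x / 2
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > tol * abs(total) and n < max_terms:
        n += 1
        term *= h / (a + n)
        total += term
    return total * math.exp(-h + a * math.log(h) - math.lgamma(a))

def chi2_critical(alpha, r, tol=1e-10):
    """Value with right-tail area alpha (0 < alpha < 1), found by bisection."""
    target = 1.0 - alpha           # the CDF measures the area to the left
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, r) < target:
        hi *= 2.0                  # widen the bracket until it contains the answer
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, r) < target:
            lo = mid               # answer lies in the upper half
        else:
            hi = mid               # answer lies in the lower half
    return (lo + hi) / 2.0

# Illustrative example: right-tail area 0.05 with 2 degrees of freedom.
print(chi2_critical(0.05, 2))
```

For 2 degrees of freedom the CDF is $1 - e^{-x/2}$, so the result can be checked against the closed form $-2 \ln(0.05)$.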

Additivity of Chi-Square Distributions

If two independent random variables $X$ and $Y$ both have chi-square distributions, with degrees of freedom $r_x$ and $r_y$ respectively, then their sum   $X + Y$   has a chi-square distribution with   $r_x + r_y$   degrees of freedom. This result is an immediate consequence of the fact that a chi-square distribution is a special case of a gamma distribution. A chi-square distribution with $r$ degrees of freedom is equivalent to a gamma distribution with parameters   $k = \dfrac{r}{2}$   and $\lambda = \dfrac12$, so the additivity of the gamma distribution (for independent variables sharing the same $\lambda$) gives the additivity of the chi-square distribution.
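The additivity can also be seen directly from the moment generating function given in the formulas above. As a short worked sketch, using the standard fact that the moment generating function of a sum of independent random variables is the product of their moment generating functions (for   $t < \dfrac12$):

\begin{align} M_{X+Y}(t) &= M_X(t) \, M_Y(t) \\ &= \dfrac{1}{(1 - 2t)^{r_x/2}} \cdot \dfrac{1}{(1 - 2t)^{r_y/2}} \\ &= \dfrac{1}{(1 - 2t)^{(r_x + r_y)/2}} \end{align}

The last expression is the moment generating function of a chi-square distribution with   $r_x + r_y$   degrees of freedom.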

Squared Standard Scores

Suppose the random variable $X$ is normally distributed, so that the random variable defined by the standard score,   $Z = \dfrac{X - \mu}{\sigma}$,   has a standard normal distribution. We are interested in the random variable   $V = Z^2$. For   $v \ge 0$,   we consider the CDF of $V$ as follows.

\begin{align} F_V(v) &= P(V \le v) \\ &= P(Z^2 \le v) \\ &= P(-\sqrt{v} \le Z \le \sqrt{v}) \\ &= \int_{-\sqrt{v}}^{\sqrt{v}} \dfrac{1}{\sqrt{2 \pi}} e^{-\frac12 z^2} \, \mathrm{d}z \\ &= \int_0^{\sqrt{v}} \sqrt{\dfrac{2}{\pi}} e^{-\frac12 z^2} \, \mathrm{d}z \end{align}

Then making the substitution   $z = \sqrt{y}$,   so that   $\mathrm{d}z = \dfrac{1}{2\sqrt{y}} \, \mathrm{d}y$,   we obtain

\begin{equation} F_V(v) = \int_0^{v} \dfrac{1}{\sqrt{2 \pi y}} e^{- \frac12 y} \, \mathrm{d}y \end{equation}

We can then take the derivative of the CDF to obtain the PDF. Note that the second line uses the identity   $\Gamma\left( \frac12 \right) = \sqrt{\pi}$.

\begin{align} f_V(v) &= \dfrac{1}{\sqrt{2 \pi v}} e^{- \frac12 v} \\ &= \dfrac{1}{\Gamma\left(\frac12 \right) 2^{1/2}} v^{\frac12 - 1} e^{-\frac{v}{2}} \end{align}

We recognize this as a chi-square distribution with one degree of freedom, or alternatively as a gamma distribution with   $\lambda = \dfrac12$   and   $k = \dfrac12$.

Furthermore, when we examine the random variable that is the sum of $n$ independent squared standard scores, the additivity of the chi-square distribution shows that this sum has a chi-square distribution with $n$ degrees of freedom.
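This result can be checked empirically. The following sketch simulates sums of $n$ independent squared standard scores with Python's standard library and compares the sample mean and variance with the chi-square values $n$ and $2n$. The seed, the choice $n = 4$, and the number of trials are arbitrary assumptions made for illustration.

```python
import random
import statistics

def sum_squared_z(n, rng):
    """Sum of n independent squared standard normal draws."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))

rng = random.Random(12345)   # fixed seed so the run is repeatable
n = 4                        # illustrative number of squared z-scores
trials = 50000

samples = [sum_squared_z(n, rng) for _ in range(trials)]
sample_mean = statistics.fmean(samples)
sample_var = statistics.variance(samples)

# For a chi-square distribution with n degrees of freedom,
# the mean is n and the variance is 2n.
print(sample_mean, sample_var)
```

Agreement of the first two moments does not prove the distributions match, but it is a quick consistency check on the claim.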

Derivation of the Other Formulas

Since the chi-square distribution is a gamma distribution with parameters   $k=\dfrac{r}{2}$   and $\lambda = \dfrac12$,   we substitute these into the gamma distribution formulas and obtain the following results.

\begin{align} M(t) &= \left( \dfrac{\lambda}{\lambda -t} \right)^k = \left( \dfrac{\dfrac12}{\dfrac12 - t} \right)^{r/2} = \dfrac{1}{(1-2t)^{r/2}} \\ E(X) &= \dfrac{k}{\lambda} = \dfrac{\frac{r}2}{\frac12} = r \\ Var(X) &= \dfrac{k}{\lambda^2} = \dfrac{\frac{r}2}{\left(\frac12\right)^2} = 2r \\ \sigma_X &= \sqrt{Var(X)} = \sqrt{2r} \end{align}
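As a cross-check, the mean and variance can also be recovered by differentiating the moment generating function at   $t = 0$. This is a brief sketch using the standard relations   $E(X) = M'(0)$   and   $E(X^2) = M''(0)$.

\begin{align} M'(t) &= r (1 - 2t)^{-\frac{r}{2} - 1}, \quad M'(0) = r = E(X) \\ M''(t) &= r (r + 2) (1 - 2t)^{-\frac{r}{2} - 2}, \quad M''(0) = r^2 + 2r = E(X^2) \\ Var(X) &= E(X^2) - \left[ E(X) \right]^2 = r^2 + 2r - r^2 = 2r \end{align}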