Normal Distributions

Normal distributions are the most frequently encountered continuous distributions in basic statistics. Their ubiquity is due to their connection to the sampling process, and the distributions of sample means and sample proportions. However, even absent a sampling context, many quantities in the natural world are approximately normally distributed. Normal distributions are always defined over the interval $(-\infty, \infty)$.

The Formulas

If $X$ is normally distributed with mean $\mu$ and standard deviation $\sigma$, then the PDF and the moment generating function of the distribution are given by the following formulas.

\begin{align} f(x) &= \dfrac{1}{\sigma \sqrt{2\pi}} e^{-\dfrac{(x-\mu)^2}{2 \sigma^2}} \\ M(t) &= e^{\mu t + \frac12 \sigma^2 t^2} \end{align}

A standard normal distribution is a normal distribution with mean $\mu = 0$ and standard deviation $\sigma = 1$. These values simplify the PDF and the moment generating function.

\begin{align} f(x) &= \dfrac{1}{\sqrt{2\pi}} e^{-\frac12 x^2} \\ M(t) &= e^{\frac12 t^2} \end{align}
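
As a quick numerical check, the PDF formulas above can be evaluated directly. The following is a minimal sketch rather than a definitive implementation: the function name `normal_pdf` is hypothetical, and the comparison relies on `statistics.NormalDist` from the Python standard library (Python 3.8 or later).

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Evaluate the normal PDF formula for mean mu and standard deviation sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coeff * math.exp(exponent)

# With the defaults mu = 0 and sigma = 1 this is the standard normal PDF,
# whose peak value at x = 0 is 1 / sqrt(2 * pi).
print(normal_pdf(0.0), 1.0 / math.sqrt(2.0 * math.pi))

# The hand-written formula should agree with the library implementation.
print(normal_pdf(1.5, mu=2.0, sigma=0.5), NormalDist(2.0, 0.5).pdf(1.5))
```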

Of course, the CDF of either of these distributions would be a definite integral of their PDFs. However, the antiderivative of the function $f(x)=e^{-x^2}$ cannot be written using only elementary functions. In other words, there is no integration technique (substitution, parts, etc.) that will produce the antiderivative. There are many possible resolutions to this conundrum.

Some mathematical statisticians will define $\phi(x)$ to be the PDF of the standard normal distribution. In other words, $\phi(x) = \dfrac{1}{\sqrt{2\pi}} e^{-\frac12 x^2}$. Since a CDF is often denoted by the capital form of the letter used for its PDF, $\Phi(x)$ is the CDF. That is, $\Phi(x) = \int_{-\infty}^x \dfrac{1}{\sqrt{2\pi}} e^{-\frac12 t^2} \mathrm{d}t$. This approach simplifies the theorems and proofs in which the normal distribution appears.
Some mathematicians will define a new function in terms of the integral when the antiderivative does not have an elementary solution. With this approach, the Gaussian Error Function is defined by $\operatorname{erf}(x) = \dfrac{2}{\sqrt{\pi}} \int_0^x e^{-t^2} \mathrm{d}t$. With this definition, the CDF of the standard normal distribution is $\Phi(x) = \dfrac12 \left[ 1 + \operatorname{erf} \left( \dfrac{x}{\sqrt{2}} \right) \right]$. In this approach, the function $\operatorname{erf}(x)$ has odd symmetry and horizontal asymptotes $y = \pm 1$, which simplifies it mathematically, but makes it less transparent statistically.
Some Texas Instruments calculators offer $\operatorname{normalpdf}(x)$ and $\operatorname{normalcdf}(a,b)$. The first is the true PDF, that is, $\phi(x) = \operatorname{normalpdf}(x)$. The actual CDF is found when the first argument of the calculator function is $-\infty$. In other words, $\Phi(x) = \operatorname{normalcdf}(-\infty,x)$. The function $\operatorname{invNorm}(x)$ is used to find z-scores when the CDF value (the area to the left) is known.
Many beginning textbooks in statistics avoid the notational issue entirely, and simply speak of the area under the curve from one value to another.
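
The error-function approach described above translates almost directly into code, since many languages ship an `erf` routine. The sketch below assumes Python's standard `math.erf`; the name `phi_cdf` is a hypothetical helper, not a library function.

```python
import math

def phi_cdf(x):
    """Standard normal CDF via Phi(x) = (1/2) * [1 + erf(x / sqrt(2))]."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Half of the area lies to the left of the mean.
print(phi_cdf(0.0))

# By symmetry, the areas to the left of x and of -x should add up to 1.
print(phi_cdf(1.0) + phi_cdf(-1.0))
```

In calculator terms, `phi_cdf(x)` plays the role of $\operatorname{normalcdf}(-\infty,x)$.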

When using the normal distribution in practice, we use either tables of values of the standard normal CDF (or values related to the CDF), or available technology. Since $-\infty$ cannot be entered into the TI calculator, we typically substitute a really large number, like $1 \times 10^{99}$. However, since the area in the tails beyond $z=\pm 7$ of the standard normal distribution is less than $10^{-10}$, it is sufficient to use $\pm 7$ as our really large number.
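
The claim about the tails beyond $z = \pm 7$ can be checked numerically. This sketch uses the complementary error function `math.erfc` from the Python standard library, which avoids the loss of precision that comes from subtracting a CDF value very close to 1 from 1. The helper name `upper_tail` is hypothetical.

```python
import math

def upper_tail(z):
    """Area to the right of z under the standard normal curve, via erfc."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Combined area in both tails, to the left of -7 and to the right of +7.
two_tail_area = 2.0 * upper_tail(7.0)

print(two_tail_area)
print(two_tail_area < 1e-10)
```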

Computing with the Standard Normal Distribution

Recall that a standard score (typically called a z-score) is defined by $z = \dfrac{x - \mu}{\sigma}$, or its sample counterpart $z = \dfrac{x - \bar{x}}{s}$. Due to this definition, z-scores will always have a mean of $0$ and a standard deviation of $1$. (To verify this, compute the z-scores of $x = \mu$ and $x = \mu + \sigma$.) Therefore, we always refer to the domain values of the standard normal distribution as z-scores.

The notation $z_{\alpha}$ is used to identify the z-score for which $\alpha$ is the area under the standard normal curve to the right of $z_{\alpha}$. Note that $z_{\alpha}$ and the CDF use different areas, one to the right and one to the left.

Let us consider a few examples. In our solutions, we will use a combination of statistical and calculator notation. Remember that probabilities are areas under a PDF curve.

Find the area under the standard normal curve to the left of $z = 1.37$.

$P(z < 1.37) = \Phi(1.37) = \operatorname{normalcdf}(-\infty,1.37) = 0.9147$

Find the area under the standard normal curve to the right of $z = 1.81$.

$P(z > 1.81) = 1 - \Phi(1.81) = \operatorname{normalcdf}(1.81,\infty) = 0.0351$

Find the area under the standard normal curve between $z = -1.23$ and $z = 0.85$.

$P(-1.23 < z < 0.85) = \Phi(0.85) - \Phi(-1.23) = \operatorname{normalcdf}(-1.23,0.85) = 0.6930$

Find the z-score for which the area to the right under the standard normal curve is 0.6215. (In other words, find $z_{0.6215}$).

$z_{0.6215} = \operatorname{invNorm}(1 - 0.6215) = \operatorname{invNorm}(0.3785) = -0.31$

Find the z-score for which the area to the left under the standard normal curve is 0.8427.

$z_{1 - 0.8427} = z_{0.1573} = \operatorname{invNorm}(0.8427) = 1.01$
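
For readers working in software rather than on a calculator, the five examples above can be reproduced with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). This is a sketch under the assumption that `cdf` stands in for $\operatorname{normalcdf}(-\infty,x)$ and `inv_cdf` for $\operatorname{invNorm}$; the variable names are illustrative only.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

area_left = Z.cdf(1.37)                    # P(z < 1.37)
area_right = 1 - Z.cdf(1.81)               # P(z > 1.81)
area_between = Z.cdf(0.85) - Z.cdf(-1.23)  # P(-1.23 < z < 0.85)
z_right = Z.inv_cdf(1 - 0.6215)            # z-score with area 0.6215 to its right
z_left = Z.inv_cdf(0.8427)                 # z-score with area 0.8427 to its left

# Areas rounded to four decimal places, z-scores to two.
print(round(area_left, 4), round(area_right, 4), round(area_between, 4))
print(round(z_right, 2), round(z_left, 2))
```

Note that the area to the right must be converted to an area to the left before calling `inv_cdf`, just as with $\operatorname{invNorm}$.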

The Empirical Rule

If a data set is approximately normally distributed (bell-shaped), then

about 68% of the data will fall within 1 standard deviation of the mean
about 95% of the data will fall within 2 standard deviations of the mean
about 99.7% of the data will fall within 3 standard deviations of the mean

These results are obtained from the standard normal distribution.

$P(-1 < z < 1) = \Phi(1) - \Phi(-1) = \operatorname{normalcdf}(-1,1) = 0.6827$
$P(-2 < z < 2) = \Phi(2) - \Phi(-2) = \operatorname{normalcdf}(-2,2) = 0.9545$
$P(-3 < z < 3) = \Phi(3) - \Phi(-3) = \operatorname{normalcdf}(-3,3) = 0.9973$
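
The same three areas can be computed in a short loop. The sketch below assumes `statistics.NormalDist` from the Python standard library; the helper name `within` is hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def within(k):
    """Area under the standard normal curve within k standard deviations of 0."""
    return Z.cdf(k) - Z.cdf(-k)

# Print the area within 1, 2, and 3 standard deviations of the mean.
for k in (1, 2, 3):
    print(k, round(within(k), 4))
```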

Computing with Non-Standard Normal Distributions

In practical situations, data will not have a mean of $0$ and a standard deviation of $1$. When such data is (at least approximately) normally distributed, we can find all relevant z-scores first.

Suppose the lifetimes of a particular brand of alkaline battery are approximately normally distributed, with a mean of 6.32 hours and a standard deviation of 0.47 hours.

What is the probability that a particular battery will last at least 7 hours?

Standardizing, we obtain $z = \dfrac{7-6.32}{0.47} \approx 1.45$. Then $P(x \ge 7) = P(z \ge 1.45) = 1 - \Phi(1.45) = \operatorname{normalcdf}(1.45,\infty) = 0.0735$. There is a 7.35% probability that the battery will last at least 7 hours.

What percentage of batteries will last less than 6 hours?

Standardizing, we find $z = \dfrac{6-6.32}{0.47} \approx -0.68$. Then $P(x < 6) = P(z < -0.68) = \Phi(-0.68) = \operatorname{normalcdf}(-\infty,-0.68) = 0.2483$. Approximately 25% of the batteries will last less than 6 hours.

What lifetime, in hours, will be exceeded by 95% of the batteries? (Equivalently, what is the 5th percentile of the battery lifetimes?)

Since $z_{0.95} = \operatorname{invNorm}(0.05) \approx -1.645$, we can use this z-score in the z-score formula. That is, $-1.645 = \dfrac{x - 6.32}{0.47}$. Solving this equation for $x$ gives about 5.55 hours. Ninety-five percent of all batteries will last longer than 5.55 hours.
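
With software, the standardizing step can be skipped by building the non-standard distribution directly. The sketch below assumes `statistics.NormalDist` from the Python standard library, with variable names chosen only for illustration. Because it does not round the z-score to two decimal places first, the two probabilities come out slightly different from the hand computations above (roughly 0.074 and 0.248), while the 5th percentile still rounds to 5.55 hours.

```python
from statistics import NormalDist

# Battery lifetimes in hours: mean 6.32, standard deviation 0.47.
lifetime = NormalDist(mu=6.32, sigma=0.47)

p_at_least_7 = 1 - lifetime.cdf(7)         # P(x >= 7)
p_under_6 = lifetime.cdf(6)                # P(x < 6)
fifth_percentile = lifetime.inv_cdf(0.05)  # lifetime exceeded by 95% of batteries

print(round(p_at_least_7, 4))
print(round(p_under_6, 4))
print(round(fifth_percentile, 2))
```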

Derivation of the Moment Generating Function

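The moment generating function is obtained by evaluating $E(e^{tX})$. After completing the square in the exponent, the remaining integral is the integral of the PDF of a normal distribution with mean $\mu + \sigma^2 t$ and standard deviation $\sigma$ over the whole real line, and is therefore equal to one.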

\begin{align} M(t) &= \int_{-\infty}^{\infty} e^{tx} \dfrac{1}{\sigma \sqrt{2\pi}} e^{-\dfrac{(x-\mu)^2}{2 \sigma^2}} \, \mathrm{d}x \\ &= \int_{-\infty}^{\infty} \dfrac{1}{\sigma \sqrt{2\pi}} e^{-\dfrac{1}{2 \sigma^2} [x^2 - 2(\mu + \sigma^2 t)x + \mu^2]} \, \mathrm{d}x \\ &= \int_{-\infty}^{\infty} \dfrac{1}{\sigma \sqrt{2\pi}} e^{-\dfrac{1}{2 \sigma^2} [(x - (\mu + \sigma^2 t))^2 - 2\mu \sigma^2 t - \sigma^4 t^2 ]} \, \mathrm{d}x \\ &= e^{\dfrac{1}{2 \sigma^2}[2\mu \sigma^2 t + \sigma^4 t^2 ]} \int_{-\infty}^{\infty} \dfrac{1}{\sigma \sqrt{2\pi}} e^{-\dfrac12 \left[ \dfrac{x - (\mu + \sigma^2 t)}{\sigma} \right]^2} \, \mathrm{d}x \\ &= e^{\dfrac{1}{2 \sigma^2}[2\mu \sigma^2 t + \sigma^4 t^2 ]} \\ &= e^{\mu t + \frac12 \sigma^2 t^2} \end{align}
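
The closed form $M(t) = e^{\mu t + \frac12 \sigma^2 t^2}$ can be sanity-checked by approximating $E(e^{tX})$ with a numerical integral. The sketch below uses a plain midpoint rule over a window of $\pm 12$ standard deviations around the mean. The function names, the step count, the window width, and the sample values of $t$, $\mu$, and $\sigma$ are all assumptions made for illustration.

```python
import math

def mgf_closed_form(t, mu, sigma):
    """Closed-form MGF of a normal distribution."""
    return math.exp(mu * t + 0.5 * sigma ** 2 * t ** 2)

def mgf_numeric(t, mu, sigma, steps=20000, width=12.0):
    """Midpoint-rule approximation of E(e^{tX}) over mu +/- width * sigma."""
    a = mu - width * sigma
    b = mu + width * sigma
    dx = (b - a) / steps
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * dx  # midpoint of the i-th subinterval
        pdf = norm * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))
        total += math.exp(t * x) * pdf * dx
    return total

# Illustrative values only; the two results should agree closely.
t, mu, sigma = 0.5, 6.32, 0.47
print(mgf_numeric(t, mu, sigma), mgf_closed_form(t, mu, sigma))
```

The truncated window is adequate only when $\mu + \sigma^2 t$ lies well inside it, so very large values of $|t|$ would need a wider window.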

From the moment generating function, it is easily verified that the distribution has mean $\mu$ and variance $\sigma^2$.

\begin{align} M'(t) &= e^{\mu t + \frac12 \sigma^2 t^2} (\mu + \sigma^2 t) \\ E(X) &= M'(0) = e^0 (\mu + 0) = \mu \\ M''(t) &= e^{\mu t + \frac12 \sigma^2 t^2} (\mu + \sigma^2 t)^2 + e^{\mu t + \frac12 \sigma^2 t^2} \sigma^2 \\ E(X^2) &= M''(0) = e^0 \mu^2 + e^0 \sigma^2 = \mu^2 + \sigma^2 \\ \operatorname{Var}(X) &= E(X^2) - (E(X))^2 = \mu^2 + \sigma^2 - \mu^2 = \sigma^2 \end{align}
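
The derivative computations above can also be checked numerically by differentiating $M(t)$ at $t = 0$ with central finite differences. This is only a rough sketch: the step size `h` is an assumption, and the values of $\mu$ and $\sigma$ are borrowed from the battery example purely for illustration.

```python
import math

mu, sigma = 6.32, 0.47  # illustrative values borrowed from the battery example
h = 1e-4                # finite-difference step size (an assumption)

def M(t):
    """Moment generating function of a normal distribution."""
    return math.exp(mu * t + 0.5 * sigma ** 2 * t ** 2)

first = (M(h) - M(-h)) / (2 * h)               # approximates M'(0) = E(X)
second = (M(h) - 2 * M(0.0) + M(-h)) / h ** 2  # approximates M''(0) = E(X^2)
variance = second - first ** 2                 # approximates Var(X)

print(first, mu)
print(variance, sigma ** 2)
```

The printed pairs should agree to several decimal places, though not exactly, since finite differences carry both truncation and rounding error.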