When testing a claim about the value of a population mean, the test statistic will depend on whether the population standard deviation is known or unknown. This situation is identical to finding a confidence interval for a mean, and is resolved in exactly the same way.

If $\sigma$ is known, then for a large enough sample, the distribution of sample means will be approximately normal. More specifically, we can expect approximate normality in the following two cases.

- $\sigma$ is known, and the sample size is at least 30 (for any population)
- $\sigma$ is known, and the original population is normal (for any value of $n$)

In this situation, the test statistic will follow the standard normal distribution, and its formula is

$z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$

Suppose a high school principal claims that the mean SAT score in math at his school is better than 550. A random sample of 72 students finds a mean score of 574. Assume that the population standard deviation is $\sigma = 100$. Is the principal's claim valid?

- The hypotheses are:

$H_0: \mu = 550$

$H_a: \mu > 550$

- We shall choose $\alpha = 0.05$.
- The test statistic is $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \dfrac{574 - 550}{100/\sqrt{72}} = 2.036$.
- The p-value is $p = \operatorname{normalcdf}(2.036,\infty) = 0.0209$.
- Since $p < \alpha$, we reject $H_0$.
- The evidence supports the claim that the mean SAT math score is more than 550.
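
The arithmetic in these steps can be checked with a short script. The following is a minimal sketch using only the Python standard library; the function name `one_sample_z_test` is our own choice for illustration, and `1 - NormalDist().cdf(z)` plays the role of $\operatorname{normalcdf}(z,\infty)$.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(x_bar, mu_0, sigma, n):
    """Return (z, p) for the right-tailed test H_a: mu > mu_0.

    Assumes sigma is the known population standard deviation.
    """
    z = (x_bar - mu_0) / (sigma / sqrt(n))
    # Right-tail area under the standard normal curve.
    p = 1.0 - NormalDist().cdf(z)
    return z, p


# Values from the SAT example above.
z, p = one_sample_z_test(x_bar=574, mu_0=550, sigma=100, n=72)
alpha = 0.05
print(f"z = {z:.3f}, p = {p:.4f}, reject H0: {p < alpha}")
```

For a left-tailed claim the p-value would instead be `NormalDist().cdf(z)`, and for a two-tailed claim it would be twice the smaller tail area.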

There are also two cases in which a hypothesis test of a mean can be done when $\sigma$ is unknown. In these cases, the sample standard deviation $s$ takes the place of $\sigma$, and the standardized sample mean follows a t-distribution (approximately so when the population is not normal but the sample is large). More specifically, we can expect a t-distribution in the following two cases.

- $\sigma$ is unknown, and the sample size is at least 30 (for any population)
- $\sigma$ is unknown, and the original population is normal (for any value of $n$)

In these two cases, the test statistic will follow a t-distribution with $n-1$ degrees of freedom, and its formula is

$t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$

Suppose twelve gas stations were randomly sampled, and the mean price of the low grade of gasoline was $\$3.35$ per gallon, with a sample standard deviation of $\$0.06$ per gallon. Furthermore, a normal probability plot indicates that the data are consistent with having come from a normal population. Has the mean price changed from last week's price of $\$3.32$ per gallon?

- The hypotheses are:

$H_0: \mu = 3.32$

$H_a: \mu \ne 3.32$

- We shall choose $\alpha = 0.05$.
- The test statistic is $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}} = \dfrac{3.35 - 3.32}{0.06/\sqrt{12}} = 1.732$.
- With $n-1=11$ degrees of freedom, the p-value is $p = 2 \times \operatorname{tcdf}(1.732,\infty,11) = 0.1112$.
- Since $p > \alpha$, we fail to reject $H_0$.
- There is insufficient evidence to conclude that the mean price has changed from last week.
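
These steps can also be reproduced numerically. The Python standard library has no t-distribution, so the sketch below approximates the tail area $\operatorname{tcdf}(t,\infty,\text{df})$ by integrating the t density with Simpson's rule. The helper names `t_pdf` and `t_upper_tail` are hypothetical, and this integration is our own stand-in for a calculator's `tcdf`, not a library routine.

```python
from math import exp, lgamma, log, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + x * x / df))


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0.

    Integrates the density over [0, t] with composite Simpson's rule
    (steps must be even) and uses the symmetry P(T > 0) = 1/2.
    """
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


# Values from the gasoline example above.
x_bar, mu_0, s, n = 3.35, 3.32, 0.06, 12
t = (x_bar - mu_0) / (s / sqrt(n))
p = 2 * t_upper_tail(abs(t), n - 1)  # two-tailed test
alpha = 0.05
print(f"t = {t:.3f}, p = {p:.4f}, reject H0: {p < alpha}")
```

Doubling the tail area reflects the two-tailed alternative $\mu \ne 3.32$; a one-tailed test would use the single tail area.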

A paired sample occurs when the data are collected from the same individual at two different points in time, or on two different tasks, or in some other fashion in which the two values are connected. The actual test is done on the differences, denoted $d$. These differences may have a known or an unknown population standard deviation, resulting in two cases analogous to those already described. The sample size and normality requirements apply to the differences. Therefore, the test statistic formulas are:

If $\sigma_d$ is known, $z = \dfrac{\bar{d} - d_0}{\sigma_d/\sqrt{n}}$

If $\sigma_d$ is unknown, $t = \dfrac{\bar{d} - d_0}{s_d/\sqrt{n}}$, with $n-1$ degrees of freedom

A test preparation course measures student scores on sample tests before and after the course. For the following sample of students, test the claim that the mean score after the course is higher than before the course.

| | Al | Bob | Carrie | Don | Ellen | Faith | Gina |
|---|---|---|---|---|---|---|---|
| Before | 550 | 450 | 580 | 500 | 480 | 520 | 510 |
| After | 560 | 480 | 660 | 540 | 460 | 510 | 570 |

We begin by computing the differences.

| | Al | Bob | Carrie | Don | Ellen | Faith | Gina |
|---|---|---|---|---|---|---|---|
| Differences | 10 | 30 | 80 | 40 | -20 | -10 | 60 |

The population standard deviation is unknown, and the sample size is small. A normal probability plot provides evidence that the differences could have come from a normal population. Therefore, we can use the t-distribution. The sample statistics are $\bar{d} = 27.14$, $s_d = 36.38$, and $n = 7$.

- The hypotheses are:

$H_0: d = 0$

$H_a: d > 0$

- We shall choose $\alpha = 0.05$.
- The test statistic is $t = \dfrac{\bar{d} - d_0}{s_d/\sqrt{n}} = \dfrac{27.14 - 0}{36.38/\sqrt{7}} = 1.974$.
- With $n-1=6$ degrees of freedom, the p-value is $p = \operatorname{tcdf}(1.974,\infty,6) = 0.0479$.
- Since $p < \alpha$, we reject $H_0$.
- The evidence indicates that test scores after the course are higher than before the course.
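
The differences, their summary statistics, and the test statistic can be computed directly from the raw scores. The following is a small sketch using Python's `statistics` module; the variable names are our own, and the sketch stops at the test statistic, since the p-value then comes from the t-distribution with $n-1$ degrees of freedom as in the steps above.

```python
from math import sqrt
from statistics import mean, stdev

# Scores from the test preparation example above, in the same student order.
before = [550, 450, 580, 500, 480, 520, 510]
after = [560, 480, 660, 540, 460, 510, 570]

# The paired test works on the per-student differences (after minus before).
differences = [a - b for a, b in zip(after, before)]

n = len(differences)
d_bar = mean(differences)
s_d = stdev(differences)  # sample standard deviation (n - 1 in the denominator)

d_0 = 0  # claimed mean difference under H0
t = (d_bar - d_0) / (s_d / sqrt(n))

print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.2f}, n = {n}, t = {t:.3f}")
```

Subtracting in the order after minus before matters here: it makes a positive mean difference correspond to the right-tailed claim that scores are higher after the course.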

Independent samples occur when the data are collected from two different groups that do not consist of the same individuals, even if both groups may have come from the same population. The test is done on the difference of the population means, $\mu_1 - \mu_2$, whose claimed value is $d_0$. As in the other situations, the population standard deviations may be known or unknown, resulting in the cases below. The same assumptions are required of each sample, namely a large sample size or a normal population. The test statistic formulas are:

If the $\sigma$ values are both known, then $z = \dfrac{(\bar{x}_1 - \bar{x}_2) - d_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$

If the $\sigma$ values are unknown, but $\sigma_1 = \sigma_2$ is assumed, then $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - d_0}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$, with $n_1 + n_2 - 2$ degrees of freedom, where $s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$ is the pooled variance.

If the $\sigma$ values are unknown, but $\sigma_1 \neq \sigma_2$ is assumed, then $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - d_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$, with $\dfrac{ \left( s_1^2 / n_1 + s_2^2 / n_2 \right)^2} { \dfrac{\left(s_1^2 / n_1 \right)^2}{n_1 - 1} + \dfrac{\left(s_2^2 / n_2 \right)^2}{n_2 - 1}}$ degrees of freedom.

The following three examples are on the surface quite similar, but in fact illustrate the three different cases above. The first assumes a known population standard deviation, while the other two examples do not. The second example assumes the underlying populations are identical, while the third example does not.

Two machines are filling packages. A sample of fifty packages from the first machine has a sample mean of 4.53 kilograms, and a sample of ninety packages from the second machine has a sample mean of 4.01 kilograms. The population standard deviation of the first machine is 0.80 kilograms, and the population standard deviation of the second machine is 0.60 kilograms. Test the claim that the machines are filling packages equally.

- The hypotheses are:

$H_0: \mu_1 = \mu_2$, or $H_0: d = 0$

$H_a: \mu_1 \ne \mu_2$, or $H_a: d \ne 0$

- We shall choose $\alpha = 0.05$.
- The test statistic is $z = \dfrac{(\bar{x}_1 - \bar{x}_2) - d_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \dfrac{(4.53 - 4.01) - 0}{\sqrt{\dfrac{0.80^2}{50} + \dfrac{0.60^2}{90}}} = 4.012$.
- The p-value is $p = 2 \times \operatorname{normalcdf}(4.012,\infty) = 6.02 \times 10^{-5}$.
- Since $p < \alpha$, we reject $H_0$.
- The evidence indicates that the machines are not filling packages equally.
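
The two-sample z computation can be sketched in the same way. The function name `two_sample_z_test` below is our own for illustration, and it assumes both population standard deviations are known, as in this example.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(x1_bar, x2_bar, sigma1, sigma2, n1, n2, d_0=0.0):
    """Return (z, p) for the two-tailed test H_a: mu_1 - mu_2 != d_0.

    Assumes sigma1 and sigma2 are known population standard deviations.
    """
    standard_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    z = ((x1_bar - x2_bar) - d_0) / standard_error
    # Two-tailed p-value: twice the area beyond |z|.
    p = 2 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p


# Values from the first machine example above.
z, p = two_sample_z_test(4.53, 4.01, sigma1=0.80, sigma2=0.60, n1=50, n2=90)
alpha = 0.05
print(f"z = {z:.3f}, p = {p:.3g}, reject H0: {p < alpha}")
```

The default `d_0=0.0` corresponds to the claim that the two means are equal; a nonzero claimed difference can be passed explicitly.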

Two machines are filling packages with materials from the same population. A sample of fifty packages from the first machine has a sample mean of 4.53 kilograms and a sample standard deviation of 0.80 kilograms. A sample of ninety packages from the second machine has a sample mean of 4.01 kilograms and a sample standard deviation of 0.60 kilograms. Test the claim that the machines are filling packages equally.

- The hypotheses are:

$H_0: \mu_1 = \mu_2$, or $H_0: d = 0$

$H_a: \mu_1 \ne \mu_2$, or $H_a: d \ne 0$

- We shall choose $\alpha = 0.05$.
- Since the populations are identical, the population variances will be equal. Therefore, we pool the sample variances:

$s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \dfrac{(50 - 1)0.80^2 + (90 - 1)0.60^2}{50 + 90 - 2} = 0.4594$, so $s_p = \sqrt{0.4594} = 0.6778$.

- The test statistic is $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - d_0}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \dfrac{(4.53 - 4.01) - 0}{0.6778 \sqrt{\dfrac{1}{50} + \dfrac{1}{90}}} = 4.350$.
- The p-value is $p = 2 \times \operatorname{tcdf}(4.350,\infty,138) = 2.63 \times 10^{-5}$.
- Since $p < \alpha$, we reject $H_0$.
- The evidence indicates that the machines are not filling packages equally.
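
The pooled variance and the resulting test statistic are easy to get wrong by hand, so a short check may help. This is a sketch under the equal-variance assumption used in this example; the function name `pooled_t_statistic` is hypothetical, and the p-value step is left to the t-distribution with the returned degrees of freedom.

```python
from math import sqrt


def pooled_t_statistic(x1_bar, s1, n1, x2_bar, s2, n2, d_0=0.0):
    """Return (t, df, pooled_variance), assuming equal population variances."""
    df = n1 + n2 - 2
    # Weighted average of the two sample variances, weighted by n - 1.
    pooled_variance = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    standard_error = sqrt(pooled_variance * (1 / n1 + 1 / n2))
    t = ((x1_bar - x2_bar) - d_0) / standard_error
    return t, df, pooled_variance


# Values from the second machine example above.
t, df, sp2 = pooled_t_statistic(4.53, 0.80, 50, 4.01, 0.60, 90)
print(f"s_p^2 = {sp2:.4f}, s_p = {sqrt(sp2):.4f}, t = {t:.3f}, df = {df}")
```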

Two machines are filling packages with materials from different populations. A sample of fifty packages from the first machine has a sample mean of 4.53 kilograms and a sample standard deviation of 0.80 kilograms. A sample of ninety packages from the second machine has a sample mean of 4.01 kilograms and a sample standard deviation of 0.60 kilograms. Test the claim that the machines are filling packages equally.

- The hypotheses are:

$H_0: \mu_1 = \mu_2$, or $H_0: d = 0$

$H_a: \mu_1 \ne \mu_2$, or $H_a: d \ne 0$

- We shall choose $\alpha = 0.05$.
- Since the populations are different, we cannot expect the population variances to be equal. Therefore, the test statistic is

$t = \dfrac{(\bar{x}_1 - \bar{x}_2) - d_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \dfrac{(4.53 - 4.01) - 0}{\sqrt{\dfrac{0.80^2}{50} + \dfrac{0.60^2}{90}}} = 4.012$.

- There are $\dfrac{ \left( s_1^2 / n_1 + s_2^2 / n_2 \right)^2} { \dfrac{\left(s_1^2 / n_1 \right)^2}{n_1 - 1} + \dfrac{\left(s_2^2 / n_2 \right)^2}{n_2 - 1}} = \dfrac{ \left( 0.80^2 /50 + 0.60^2 /90 \right)^2} { \dfrac{\left(0.80^2 /50 \right)^2}{50 - 1} + \dfrac{\left(0.60^2 /90 \right)^2}{90 - 1}} = 80.1033$ degrees of freedom.
- The p-value is $p = 2 \times \operatorname{tcdf}(4.012,\infty,80.1033) = 0.000135$.
- Since $p < \alpha$, we reject $H_0$.
- The evidence indicates that the machines are not filling packages equally.
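
The degrees-of-freedom formula for unequal variances is the most error-prone part of this example, so the sketch below computes it alongside the test statistic and the p-value. The function names are our own, and because the Python standard library has no t-distribution, the tail area is approximated by integrating the t density with Simpson's rule. That integration is an assumption of this sketch, not a library routine, though it does accept the non-integer degrees of freedom that arise here.

```python
from math import exp, lgamma, log, pi, sqrt


def welch_t_and_df(x1_bar, s1, n1, x2_bar, s2, n2, d_0=0.0):
    """Return (t, df) without assuming equal population variances."""
    v1, v2 = s1**2 / n1, s2**2 / n2  # estimated variances of each sample mean
    t = ((x1_bar - x2_bar) - d_0) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df


def t_pdf(x, df):
    """Density of Student's t-distribution; df need not be an integer."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + x * x / df))


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 by Simpson's rule (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


# Values from the third machine example above.
t, df = welch_t_and_df(4.53, 0.80, 50, 4.01, 0.60, 90)
p = 2 * t_upper_tail(abs(t), df)  # two-tailed test
alpha = 0.05
print(f"t = {t:.3f}, df = {df:.4f}, p = {p:.3g}, reject H0: {p < alpha}")
```

Note that the test statistic has the same value as in the known-$\sigma$ example, but it is compared against a t-distribution rather than the standard normal, which is why the p-value is larger.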