## Binomial Distributions

A statistical experiment often consists of a number of repeated trials, where the presence or absence of some characteristic is observed at each trial. The presence of the characteristic is generically called a "success". Let the discrete random variable $X$ represent the number of "successes". Then $X$ has a binomial distribution when the following four properties are satisfied.

• There are two possible outcomes for each trial, "success" and "failure".
• The probability of a success is constant.
• The trials are independent.
• The number of trials is fixed.

A great many situations meet the requirements for using the binomial distribution.

### The Formulas

In a binomial distribution, if $n$ is the number of trials, $p$ is the probability of a success, and $x$ is the number of successes, then the following formulas apply.

 \begin{align} P(x) &=\left( {}_n C_x \right) p^x (1-p)^{n-x} \\ M(t) &= \left[pe^t + 1 - p \right]^n \\ E(X) &= np \\ Var(X) &= np(1-p) \end{align}
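These formulas translate directly into code. The following is a minimal sketch in Python using the standard library's `math.comb` for the combination ${}_n C_x$ (the function names are illustrative, not from the text):

```python
from math import comb

def binomial_pmf(n: int, p: float, x: int) -> float:
    """P(X = x) for a binomial distribution: C(n, x) * p^x * (1-p)^(n-x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binomial_mean(n: int, p: float) -> float:
    """E(X) = np."""
    return n * p

def binomial_var(n: int, p: float) -> float:
    """Var(X) = np(1 - p)."""
    return n * p * (1 - p)

# Sanity check: the probabilities over all possible x sum to 1.
assert abs(sum(binomial_pmf(10, 0.3, x) for x in range(11)) - 1) < 1e-12
```

Each of the worked examples below can be reproduced by calling these functions with the appropriate $n$, $p$, and $x$.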

### Rolling Several Dice

In the game of dice basketball, a player rolls eight dice. What is the probability that five of the dice are fours? How many fours should we expect, and what is the standard deviation for the number of fours?

Since we are interested in fours, a success is a four. There are two outcomes on each die, namely "four" and "not a four". The probability of a success is   $p=\dfrac16$,   and is constant from roll to roll. There are eight trials in this experiment, so   $n=8$,   and the trials (the individual dice) are independent. Therefore all of the conditions for using the binomial distribution have been met.

To determine the probability that five of the dice are fours, we use   $x=5$.   This gives   $P(X=5) = \left( {}_8 C_5 \right) \left(\dfrac16\right)^5 \left(\dfrac56\right)^3 = \dfrac{7000}{1679616} \approx 0.0042$.   The expected value is   $E(X) = 8\left(\dfrac16\right) = \dfrac43 \approx 1.3333$   fours, and the standard deviation is $\sigma = \sqrt{ 8\left(\dfrac16\right)\left(\dfrac56\right)} = \dfrac{\sqrt{10}}{3} \approx 1.0541$   fours.
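The exact fraction in this computation can be confirmed with Python's `fractions` module; a quick check, kept in exact rational arithmetic until the final conversion:

```python
from fractions import Fraction
from math import comb, sqrt

p = Fraction(1, 6)
prob = comb(8, 5) * p**5 * (1 - p)**3
print(prob)         # 875/209952, the reduced form of 7000/1679616
print(float(prob))  # ≈ 0.0042
print(float(8 * p))                   # E(X) ≈ 1.3333 fours
print(sqrt(float(8 * p * (1 - p))))   # σ ≈ 1.0541 fours
```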

### Selecting With Replacement

A bag of 20 jellybeans includes 9 red, 5 green, 4 yellow, and 2 orange jellybeans. Seven jellybeans are selected with replacement. Find the probability that three of the seven are red. What is the expected number of red jellybeans, and the standard deviation?

A success is a red jellybean, and the other outcome is "not red". The probability of a success is   $p=\dfrac{9}{20}=0.45$,   and is constant. There are seven trials, so   $n=7$.   The trials are independent, since the selections are being done with replacement. Therefore, all of the binomial conditions have been met.

To obtain three red jellybeans, we let   $x=3$.   Then $P(X=3) = \left( {}_7 C_3 \right) \left(0.45\right)^3 \left(0.55\right)^4 \approx 0.2918$.   The expected value is $E(X) = 7(0.45) = 3.15$   jellybeans, and the standard deviation is   $\sigma = \sqrt{ 7 (0.45)(0.55)} \approx 1.3162$   jellybeans.
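Because the selection is done with replacement, it is also easy to simulate. The sketch below runs a Monte Carlo experiment with the bag from the example; the number of simulation runs and the seed are arbitrary choices:

```python
import random

random.seed(1)
bag = ["red"] * 9 + ["green"] * 5 + ["yellow"] * 4 + ["orange"] * 2

def reds_in_sample() -> int:
    """Draw 7 jellybeans with replacement and count the reds."""
    return sum(random.choice(bag) == "red" for _ in range(7))

runs = 200_000
counts = [reds_in_sample() for _ in range(runs)]
p_three = sum(c == 3 for c in counts) / runs
mean = sum(counts) / runs
print(p_three)  # close to the theoretical 0.2918
print(mean)     # close to E(X) = 3.15
```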

### Sampling from a Very Large Population

Approximately 44% of all Americans have blood type O. Suppose 24 people are randomly selected. What is the probability that exactly 13 have blood type O? How many should we expect to have that blood type, and what is the standard deviation?

In this problem, a success is an individual with blood type O, and the other outcome is not having blood type O. The probability of a success is   $p=0.44$.   We will actually be sampling without replacement, so this problem really follows a hypergeometric distribution. But since the population of the USA is millions of times greater than the size of the sample, the probabilities are essentially constant from draw to draw, and we can use the binomial distribution as an approximation for the hypergeometric distribution. (The details of the approximation are given on the Hypergeometric Distributions web page.) The number of trials is fixed at   $n=24$.   As long as we randomly sample from the entire population, and not from a small group in which we would be likely to choose relatives, we can assume the trials are independent. Therefore, the conditions for using the binomial distribution have essentially been met, due to the large size of the population compared to the sample.

To obtain 13 people with blood type O, we want   $x=13$.   We then have   $P(X=13) = \left( {}_{24} C_{13} \right) \left(0.44\right)^{13} \left(0.56\right)^{11} \approx 0.0982$.   The expected value is   $E(X) = 24(0.44) = 10.56$   people with blood type O, and the standard deviation is   $\sigma = \sqrt{ 24 (0.44)(0.56)} \approx 2.4318$   people.
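How good is the binomial approximation here? The sketch below compares it to the exact hypergeometric probability for an illustrative population of one million people, 44% of whom have type O (the population size is an assumption for the demonstration, not a figure from the text):

```python
from math import comb

N, K = 1_000_000, 440_000   # illustrative population; 44% have type O
n, x = 24, 13

# Exact hypergeometric probability: sampling without replacement.
hyper = comb(K, x) * comb(N - K, n - x) / comb(N, n)

# Binomial approximation: treat p = 0.44 as constant across draws.
binom = comb(n, x) * 0.44**x * 0.56**(n - x)

print(hyper, binom)  # the two probabilities agree to several decimal places
```

Even at a population of only one million, the two answers are nearly identical, which is why the approximation is safe for the entire USA.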

### Derivation of the Formulas

In the following derivation, we will make use of the binomial theorem from college algebra. It says   $(x+y)^n = \sum\limits_{k=0}^n \left({}_n C_k\right) x^k y^{n-k}$.

Now imagine a scenario with $n$ trials and $x$ successes. We shall identify the successes by S and the failures by F. One possible arrangement is that all of the successes come first, followed by the failures: SS...SFF...F. In this arrangement, there are $x$ successes and   $(n-x)$   failures. The number of distinguishable permutations of the successes and failures is given by ${}_n C_x$, which is equivalent to choosing the $x$ positions in the list for the $x$ successes; this also determines the positions of the failures. Each success has probability $p$, and since there are $x$ successes and the trials are independent, the probability of these successes is the product $p^x$. Each failure has probability   $(1-p)$,   and there are   $(n-x)$   failures, so the probability of the failures is   $(1-p)^{n-x}$.   The probability of $x$ successes in $n$ trials is therefore the product of all of these factors. Therefore, we have   $P(x)=\left( {}_n C_x \right) p^x (1-p)^{n-x}$.

To obtain the moment generating function $M(t)$, we evaluate $E(e^{tX})$.

\begin{align} M(t) &= E(e^{tX}) = \sum\limits_{x=0}^n e^{tx} \left( {}_n C_x \right) p^x (1-p)^{n-x} \\ &= \sum\limits_{x=0}^n \left( {}_n C_x \right) (pe^t)^x (1-p)^{n-x} \\ &= \left[pe^t + 1 - p \right]^n \end{align}

We can obtain the expected value of a binomial distribution from the first derivative of the moment generating function.

\begin{align} M'(t) &= n \left[pe^t + 1 - p \right]^{n-1} pe^t \\ E(X) &= M'(0) = n (1^{n-1}) p = np \end{align}

Similarly, the value of $E(X^2)$ is found from the second derivative of $M(t)$, from which we can obtain the variance.

\begin{align} M''(t) &= n(n-1)\left[pe^t + 1 - p \right]^{n-2} (pe^t)^2 + n \left[pe^t + 1 - p \right]^{n-1} pe^t \\ E(X^2) &= M''(0) = n(n-1) (1^{n-2}) p^2 + n (1^{n-1}) p = n(n-1)p^2 + np \\ Var(X) &= E(X^2) - (E(X))^2 = n(n-1)p^2 + np - (np)^2 = np(1-p) \end{align}

And this result implies that the standard deviation of a binomial distribution is given by   $\sigma = \sqrt{ np(1-p)}$.
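The derivative computations above can be spot-checked numerically with central finite differences on $M(t)$; a sketch using the parameters from the blood-type example (the step size $h$ is an arbitrary small value):

```python
from math import exp

n, p = 24, 0.44  # parameters from the blood-type example

def M(t: float) -> float:
    """Binomial moment generating function [p*e^t + 1 - p]^n."""
    return (p * exp(t) + 1 - p) ** n

h = 1e-5
m1 = (M(h) - M(-h)) / (2 * h)          # central difference ≈ M'(0)
m2 = (M(h) - 2 * M(0) + M(-h)) / h**2  # central difference ≈ M''(0)

print(m1, n * p)                    # both ≈ 10.56, i.e. E(X) = np
print(m2 - m1**2, n * p * (1 - p))  # both ≈ 5.9136, i.e. Var(X) = np(1-p)
```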