Poisson Distributions

When trials can occur in a fixed continuum of time (or distance), each instant of time (or distance) is essentially a distinct trial. Because a continuum contains infinitely many points, a statistical experiment may have an infinite number of trials, and the probability of a success on any single trial approaches zero. But the rate of successes per unit of time (or distance) can still be a finite, nonzero quantity. Let us assume that there are only two possible outcomes for each instant of time (or distance), that two successes cannot occur at the same instant, that trials are independent, and that the rate of successes is constant. Let $X$ represent the number of successes in the interval. Then $X$ has a Poisson distribution.

The Formulas

In a Poisson distribution, if $\lambda$ is the rate of successes over the interval of time (or distance) being considered, and $x = 0, 1, 2, \ldots$ is the number of successes in that interval, then the following formulas apply.

\begin{align} P(x) &= \dfrac{ \lambda^x e^{-\lambda}}{x!} \\ M(t) &= e^{\lambda (e^t - 1)} \\ E(X) &= \lambda \\ Var(X) &= \lambda \end{align}

Occurrences in Time

Customers arrive at a drive-up window at the rate of 40 per hour. What is the probability that 30 customers arrive in the next hour? How many customers should we expect per hour, and what is the standard deviation for the number of customers per hour?

Since we are interested in the number of customers, a success is a customer. There are two outcomes for each instant of time: either a customer arrives or one does not. The rate is $\lambda = 40$ customers per hour, and we assume that this is constant. We also assume that the successes are independent, which implies that a group of individuals did not take two cars to the drive-up window. Cars cannot arrive simultaneously at the window, since they must line up and take turns. With these stated assumptions, all of the conditions for using the Poisson distribution have been met.

To determine the probability that 30 customers arrive in the next hour, we use $x=30$. This gives $P(X=30) = \dfrac{ 40^{30} e^{-40}}{30!} \approx 0.0185$. The expected value is $E(X) = 40$ customers per hour, and the standard deviation is $\sigma = \sqrt{40} \approx 6.32$ customers per hour.
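The arithmetic above can be checked with a short Python sketch. The helper name `poisson_pmf` is our own choice, not part of any library, and the use of logarithms is simply one way to avoid evaluating the very large numbers $40^{30}$ and $30!$ directly.

```python
import math

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for a Poisson distribution with rate lam (hypothetical helper)."""
    # Work with logarithms so that lam**x and x! never overflow:
    # log P = x*log(lam) - lam - log(x!), and lgamma(x + 1) = log(x!).
    log_p = x * math.log(lam) - lam - math.lgamma(x + 1)
    return math.exp(log_p)

lam = 40                      # customers per hour, from the example
p_30 = poisson_pmf(30, lam)   # probability of exactly 30 arrivals
sigma = math.sqrt(lam)        # standard deviation is sqrt(lambda)

print(round(p_30, 4))   # 0.0185
print(round(sigma, 2))  # 6.32
```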

Occurrences over a Distance

A particular stretch of road has an average of 2 potholes per mile. What is the probability that there are no potholes over the next 3 miles? What is the expected number of potholes in the next three miles, and the standard deviation for the number of potholes?

Since we are interested in the number of potholes, a success is a pothole (and good roads are failures!). The rate of potholes is a constant 2 per mile. We assume that potholes are independent (so the presence of one does not make it more or less likely that another will immediately occur), and that they do not occur simultaneously (an extra deep pothole does not count twice). Therefore, all of the Poisson conditions have been met.

Now we may be tempted to think $\lambda = 2$, since that was the given rate, but that would be incorrect. The rate was 2 potholes per mile, but we are interested in a three-mile stretch of road. Therefore, we use $\lambda = 3(2) = 6$ potholes per three miles. To determine the probability that no potholes occur, we use $x=0$. Then $P(X=0) = \dfrac{ 6^0 e^{-6}}{0!} = e^{-6} \approx 0.0025$. (That makes it quite likely that at least one pothole will occur in the next three miles, in fact 99.75% likely.) The expected number of potholes for that stretch of road is $E(X) = 6$ potholes, and the standard deviation is $\sigma = \sqrt{ 6} \approx 2.45$ potholes.
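The rescaling of the rate is the step most easily overlooked, so here is a minimal Python sketch of it. The variable names are illustrative only.

```python
import math

rate_per_mile = 2              # potholes per mile, as given
miles = 3                      # length of the stretch of interest
lam = rate_per_mile * miles    # rescaled rate: 6 potholes per three miles

p_none = math.exp(-lam)        # P(X = 0) = lam**0 * e**(-lam) / 0! = e**(-lam)
p_at_least_one = 1 - p_none    # complement: one or more potholes
sigma = math.sqrt(lam)         # standard deviation

print(round(p_none, 4))          # 0.0025
print(round(p_at_least_one, 4))  # 0.9975
print(round(sigma, 2))           # 2.45
```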

Derivation of the Formulas

In the following derivation, we will make use of the limiting result $\lim\limits_{n \rightarrow \infty} \left( 1+ \dfrac{a}{n}\right)^n = e^a$, and the Taylor series of the exponential function about $u=0$, which is $e^u = \sum\limits_{k=0}^{\infty} \dfrac{1}{k!} u^k$.

Suppose $X$ has a binomial distribution, with $n$ trials and probability $p$ of a success. We recall that the expected value (or mean) of a binomial distribution is given by $E(X) = np$. The rate of successes is just the expected number of successes, so we define $\lambda = np$. At this point, we can substitute $p = \dfrac{\lambda}{n}$ and simplify.

\begin{align} P(X=x) &= \left( {}_n C_x \right) p^x (1-p)^{n-x} \\ &= \dfrac{n!}{x! (n-x)!} \left(\dfrac{\lambda}{n} \right)^x \left(1-\dfrac{\lambda}{n} \right)^{n-x} \\ &= \dfrac{\lambda^x}{x!} \dfrac{n(n-1) \cdots (n-x+1)}{n^x} \dfrac{ \left(1-\dfrac{\lambda}{n} \right)^n} { \left(1-\dfrac{\lambda}{n} \right)^x} \\ &= \dfrac{\lambda^x}{x!} (1) \left(1-\dfrac{1}{n}\right) \cdots \left(1-\dfrac{x-1}{n}\right) \dfrac{ \left(1-\dfrac{\lambda}{n} \right)^n}{ \left(1-\dfrac{\lambda}{n} \right)^x} \end{align}

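Now we consider what happens when $n$ approaches infinity while $\lambda$ and $x$ stay fixed. Each of the factors $\left(1-\dfrac{k}{n}\right)$ approaches 1, as does $\left(1-\dfrac{\lambda}{n}\right)^x$, while $\left(1-\dfrac{\lambda}{n}\right)^n$ approaches $e^{-\lambda}$.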

\begin{align} \lim_{n \rightarrow \infty} P(X=x) &= \lim_{n \rightarrow \infty} \dfrac{\lambda^x}{x!} (1) \left(1-\dfrac{1}{n}\right) \cdots \left(1-\dfrac{x-1}{n}\right) \dfrac{ \left(1-\dfrac{\lambda}{n} \right)^n}{ \left(1-\dfrac{\lambda}{n} \right)^x} \\ &= \dfrac{\lambda^x}{x!} \left( 1^x \right) \dfrac{ e^{-\lambda}}{ 1^x} \\ &= \dfrac{\lambda^x e^{-\lambda}}{x!} \end{align}
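The limit can also be observed numerically. The following Python sketch holds $\lambda = np$ fixed, lets $n$ grow, and compares the binomial probability with the Poisson probability. The values $\lambda = 6$ and $x = 3$ are arbitrary choices made for illustration, and both function names are our own.

```python
import math

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for n independent trials with success probability p."""
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for a Poisson distribution with rate lam."""
    return lam**x * math.exp(-lam) / math.factorial(x)

lam, x = 6, 3                  # arbitrary illustrative values
target = poisson_pmf(x, lam)

errors = []
for n in (10, 100, 1000, 10000):
    approx = binomial_pmf(x, n, lam / n)   # keep n*p = lam while n grows
    errors.append(abs(approx - target))
    print(n, approx, target)
```

The gap between the two probabilities shrinks as $n$ increases, which is what the limit predicts.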

The moment generating function can be obtained by evaluating $E(e^{tX})$, and recognizing the Taylor series of the exponential function that results.

\begin{align} M(t) &= E(e^{tX}) = \sum\limits_{x=0}^{\infty} e^{tx} \dfrac{\lambda^x e^{-\lambda}}{x!} \\ &= e^{-\lambda} \sum\limits_{x=0}^{\infty} \dfrac{(\lambda e^t)^x}{x!} \\ &= e^{-\lambda} \left[ e^{\lambda e^t} \right] \\ &= e^{\lambda (e^t - 1)} \end{align}

Since we defined $\lambda$ by using the expected value of the binomial distribution, we could certainly expect it to also be the expected value of the Poisson distribution. We can confirm this by examining the first derivative of the moment generating function.

\begin{align} M'(t) &= e^{\lambda (e^t - 1)} \lambda e^t \\ E(X) &= M'(0) = e^0 \lambda e^0 = \lambda \end{align}

The quantities $E(X^2)$ and $Var(X)$ are obtained from the second derivative.

\begin{align} M''(t) &= e^{\lambda (e^t - 1)} (\lambda e^t)^2 + e^{\lambda (e^t - 1)} \lambda e^t \\ E(X^2) &= M''(0) = e^0 \lambda^2 + e^0 \lambda = \lambda^2 + \lambda \\ Var(X) &= E(X^2) - (E(X))^2 = \lambda^2 + \lambda - \lambda^2 = \lambda \end{align}
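As a sanity check on these results, the moments and the moment generating function can be approximated by summing the series directly. This Python sketch assumes $\lambda = 6$ and $t = 0.5$ purely for illustration, and truncates the infinite sums at 200 terms, beyond which the terms are negligible for these values.

```python
import math

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x), computed through logarithms to stay stable for large x."""
    return math.exp(x * math.log(lam) - lam - math.lgamma(x + 1))

lam = 6.0          # illustrative rate
t = 0.5            # illustrative argument for the moment generating function
xs = range(200)    # truncation of the infinite sum (an approximation)

mean = sum(x * poisson_pmf(x, lam) for x in xs)              # E(X)
second_moment = sum(x**2 * poisson_pmf(x, lam) for x in xs)  # E(X^2)
variance = second_moment - mean**2                           # Var(X)

mgf_by_sum = sum(math.exp(t * x) * poisson_pmf(x, lam) for x in xs)
mgf_closed_form = math.exp(lam * (math.exp(t) - 1))

print(mean, second_moment, variance)
print(mgf_by_sum, mgf_closed_form)
```

Up to rounding error, the sums should agree with $E(X) = \lambda$, $E(X^2) = \lambda^2 + \lambda$, $Var(X) = \lambda$, and the closed form of $M(t)$.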

This result implies that the standard deviation of a Poisson distribution is given by $\sigma = \sqrt{ \lambda}$.