## Moments and Moment Generating Functions

Measures of central tendency and dispersion are the two most common ways to summarize the features of a probability distribution. Expected value and variance are the measures most often used for these two purposes. Other features that could be summarized include skewness and kurtosis. All four of these measures are examples of a mathematical quantity called a moment.

### Moments Defined

The nth moment of a distribution (or set of data) about a number is the expected value of the nth power of the deviations about that number. In statistics, moments are needed about the mean, and about the origin.

• The nth moment of a distribution about zero is given by $E(X^n)$.
• The nth moment of a distribution about the mean is given by $E((X-\mu)^n)$.

Each of the four measures mentioned above can then be defined as a moment.

• The expected value, $E(X)$, is the first moment about zero.
• The variance, $Var(X)$, is the second moment about the mean, $E((X-\mu)^2)$.
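• A common definition of skewness is the third moment about the mean, $E((X-\mu)^3)$. (Many texts divide this quantity by $\sigma^3$ to make it unitless; the unstandardized form is used here.)
• A common definition of kurtosis is the fourth moment about the mean, $E((X-\mu)^4)$. (Many texts divide this quantity by $\sigma^4$; the unstandardized form is used here.)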

Since moments about zero are typically much easier to compute than moments about the mean, alternative formulas are often provided.

• $Var(X) = E((X-\mu)^2) = E(X^2) - E(X)^2$. This formula was derived earlier.
• $Skew(X) = E((X-\mu)^3) = E(X^3) - 3E(X)E(X^2) + 2E(X)^3$
• $Kurt(X) = E((X-\mu)^4) = E(X^4) - 4E(X)E(X^3) + 6E(X)^2 E(X^2) - 3E(X)^4$
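The three identities above can be checked numerically. The following Python sketch converts moments about zero into moments about the mean and compares the result with a direct computation from the definition. The function names and the two-point distribution are illustrative assumptions, not part of the original text.

```python
from fractions import Fraction

def central_from_raw(m1, m2, m3, m4):
    """Convert the first four moments about zero into the
    second, third, and fourth moments about the mean."""
    var = m2 - m1**2
    third = m3 - 3 * m1 * m2 + 2 * m1**3
    fourth = m4 - 4 * m1 * m3 + 6 * m1**2 * m2 - 3 * m1**4
    return var, third, fourth

def raw_moment(dist, n):
    """E(X^n) for a discrete distribution given as {x: P(x)}."""
    return sum(p * x**n for x, p in dist.items())

def central_moment(dist, n):
    """E((X - mu)^n), computed straight from the definition."""
    mu = raw_moment(dist, 1)
    return sum(p * (x - mu)**n for x, p in dist.items())

# Hypothetical two-point distribution, chosen only for illustration.
coin = {0: Fraction(1, 4), 1: Fraction(3, 4)}

raw = [raw_moment(coin, n) for n in range(1, 5)]
via_formulas = central_from_raw(*raw)
via_definition = tuple(central_moment(coin, n) for n in (2, 3, 4))

# Exact fractions are used, so the two routes should agree exactly.
print(via_formulas == via_definition)
```

Using `Fraction` avoids floating-point rounding, so agreement between the two routes is an exact equality rather than an approximate one.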

### Moment Generating Functions

Since each moment is an expected value, and the definition of expected value involves either a sum (in the discrete case) or an integral (in the continuous case), it would seem that the computation of moments could be tedious. However, there is a single expected value function whose derivatives can produce each of the required moments. This function is called a moment generating function. In particular, if $X$ is a random variable, and either $P(x)$ or $f(x)$ is the PDF of the distribution (the first is discrete, the second continuous), then the moment generating function is defined by the following formulas.

\begin{align} M_X(t) &= E(e^{tX}) = \sum\limits_{\text{all }x} e^{tx} P(x) && \text{(discrete case)} \\ M_X(t) &= E(e^{tX}) = \int_x e^{tx} f(x) \, \mathrm{d}x && \text{(continuous case)} \end{align}

When the nth derivative (with respect to $t$) of the moment generating function is evaluated at $t=0$, the nth moment of the random variable $X$ about zero will be obtained.

 $E(X^n) = \left. \dfrac{d^n}{dt^n} M_X(t) \right|_{t=0}$

This result can be easily obtained by writing the Taylor series for the exponential function about zero. In the discrete case, we get the following results. (The continuous case will be very similar.)

\begin{align} M_X(t) &= \sum\limits_{\text{all }x} e^{tx} P(x) \\ &= \sum\limits_{\text{all }x} \left( 1 + tx + \dfrac{(tx)^2}{2!} + \dfrac{(tx)^3}{3!} + \dfrac{(tx)^4}{4!} + \cdots \right) P(x) \\ M_X(t=0) &= \sum\limits_{\text{all }x} P(x) = 1 \\ \dfrac{d}{dt} M_X(t) &= \sum\limits_{\text{all }x} \left( x + tx^2 + \dfrac12 t^2 x^3 + \dfrac16 t^3 x^4 + \cdots \right) P(x) \\ M'_X(t=0) &= \sum\limits_{\text{all }x} x P(x) = E(X) \\ \dfrac{d^2}{dt^2} M_X(t) &= \sum\limits_{\text{all }x} \left( x^2 + t x^3 + \dfrac12 t^2 x^4 + \cdots \right) P(x) \\ M''_X(t=0) &= \sum\limits_{\text{all }x} x^2 P(x) = E(X^2) \\ \dfrac{d^3}{dt^3} M_X(t) &= \sum\limits_{\text{all }x} \left( x^3 + t x^4 + \cdots \right) P(x) \\ M^{(3)}_X(t=0) &= \sum\limits_{\text{all }x} x^3 P(x) = E(X^3) \\ \dfrac{d^4}{dt^4} M_X(t) &= \sum\limits_{\text{all }x} \left( x^4 + \cdots \right) P(x) \\ M^{(4)}_X(t=0) &= \sum\limits_{\text{all }x} x^4 P(x) = E(X^4) \end{align}
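The derivation above can also be checked numerically. The sketch below approximates $M'_X(0)$ and $M''_X(0)$ with central finite differences and compares them with $E(X)$ and $E(X^2)$ computed directly. The fair-die distribution, the step size `h`, and the function name `mgf` are assumptions made only for this illustration.

```python
import math

def mgf(dist, t):
    """M_X(t) = sum of e^{tx} P(x) over all x, for dist given as {x: P(x)}."""
    return sum(p * math.exp(t * x) for x, p in dist.items())

# Hypothetical distribution: a fair six-sided die.
die = {x: 1 / 6 for x in range(1, 7)}

h = 1e-3  # finite-difference step (an arbitrary small choice)

# Central difference approximations to the first two derivatives at t = 0.
first = (mgf(die, h) - mgf(die, -h)) / (2 * h)
second = (mgf(die, h) - 2 * mgf(die, 0.0) + mgf(die, -h)) / h**2

# Moments about zero, computed directly from the definition.
exact_first = sum(p * x for x, p in die.items())       # E(X)
exact_second = sum(p * x**2 for x, p in die.items())   # E(X^2)

print(first, exact_first)
print(second, exact_second)
```

The finite differences are only approximations, so the printed pairs should agree to several decimal places rather than exactly.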

### A Discrete Example

Suppose a discrete PDF is given by the following table.

| $X=x$ | $P(X=x)$ |
|-------|----------|
| $X=0$ | $0.4$ |
| $X=1$ | $0.35$ |
| $X=2$ | $0.25$ |

We obtain the moment generating function $M_X (t)$ from the expected value of the exponential function.

\begin{align} M_X (t) &= E(e^{tX}) \\ &= \sum\limits_{\text{all }x} e^{tx} P(x) \\ &= 0.4 e^{0t} + 0.35 e^{1t} + 0.25 e^{2t} \\ &= 0.4 + 0.35e^t + 0.25e^{2t} \end{align}

We can then compute derivatives and obtain the moments about zero.

 \begin{align} M'_X (t) &= 0.35e^t + 0.5e^{2t} \\ M''_X(t) &= 0.35e^t + e^{2t} \\ M^{(3)}_X (t) &= 0.35e^t + 2e^{2t} \\ M^{(4)}_X (t) &= 0.35e^t + 4e^{2t} \phantom{...............} \end{align} \begin{align} M'_X (0) &= 0.35 + 0.5 = 0.85 \\ M''_X(0) &= 0.35 + 1 = 1.35 \\ M^{(3)}_X (0) &= 0.35 + 2 = 2.35 \\ M^{(4)}_X (0) &= 0.35 + 4 = 4.35 \end{align}

Then, with the formulas above, we can produce the various measures.

\begin{align} E(X) &= M'(0) = 0.85 \\ Var(X) &= E(X^2) - E(X)^2 \\ &= 1.35 - 0.85^2 = 0.6275 \\ Skew(X) &= E(X^3) - 3E(X)E(X^2) + 2E(X)^3 \\ &= 2.35 - 3(0.85)(1.35) + 2(0.85^3) = 0.13575 \\ Kurt(X) &= E(X^4) - 4E(X)E(X^3) + 6E(X)^2 E(X^2) - 3E(X)^4 \\ &= 4.35 - 4(0.85)(2.35) + 6(0.85^2)(1.35) - 3(0.85^4) \approx 0.6462 \end{align}
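As a sanity check on the values above, the moments can also be computed directly from the table, without the moment generating function. The following Python sketch does this with exact fractions; the variable names are illustrative assumptions.

```python
from fractions import Fraction

# The table from the discrete example, written as exact fractions.
pmf = {0: Fraction(2, 5), 1: Fraction(7, 20), 2: Fraction(1, 4)}

# Moments about zero: E(X^n) = sum of x^n P(x).
raw = {n: sum(p * x**n for x, p in pmf.items()) for n in range(1, 5)}

# Moments about the mean, computed straight from the definition.
mu = raw[1]
central = {n: sum(p * (x - mu)**n for x, p in pmf.items()) for n in (2, 3, 4)}

for n in range(1, 5):
    print(f"E(X^{n}) = {float(raw[n])}")
for n in (2, 3, 4):
    print(f"E((X-mu)^{n}) = {float(central[n])}")
```

The moments about zero should match the derivative values $0.85$, $1.35$, $2.35$, $4.35$, and the moments about the mean should match the variance, skewness, and kurtosis computed above.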

### A Continuous Example

Let us now consider the PDF given by $f(x) = 4e^{-4x}$, defined on the interval $[0, \infty)$. We find the moment generating function as the expected value of an exponential quantity, which in the continuous case is an integral.

\begin{align} M_X (t) &= E(e^{tX}) \\ &= \int_0^{\infty} e^{tx} f(x) \, \mathrm{d}x \\ &= \int_0^{\infty} e^{tx} 4e^{-4x} \, \mathrm{d}x \\ &= \int_0^{\infty} 4 e^{(t-4)x} \, \mathrm{d}x \\ &= \left. \dfrac{4}{t-4} e^{(t-4)x} \right|_0^{\infty} \\ &= 0 - \dfrac{4}{t-4} \\ &= \dfrac{-4}{t-4} = -4(t-4)^{-1} \end{align}

The computations above assumed that $t<4$, which is necessary for the antiderivative to vanish as $x \to \infty$. Since $t=0$ lies inside this range, evaluating the derivatives at zero is still valid. With this result, we can compute the derivatives and the moments.

 \begin{align} M'_X (t) &= 4(t-4)^{-2} \\ M''_X (t) &= -8(t-4)^{-3} \\ M^{(3)}_X (t) &= 24(t-4)^{-4} \\ M^{(4)}_X (t) &= -96(t-4)^{-5} \phantom{...............} \end{align} \begin{align} M'_X (0) &= 4(-4)^{-2} = \dfrac14 \\ M''_X(0) &= -8(-4)^{-3} = \dfrac18 \\ M^{(3)}_X (0) &= 24(-4)^{-4} = \dfrac{3}{32} \\ M^{(4)}_X (0) &= -96(-4)^{-5} = \dfrac{3}{32} \end{align}

Then we can compute the various measures.

\begin{align} E(X) &= M'(0) = \dfrac14 \\ Var(X) &= E(X^2) - E(X)^2 \\ &= \dfrac18 - \left( \dfrac14 \right)^2 = \dfrac{1}{16} \\ Skew(X) &= E(X^3) - 3E(X)E(X^2) + 2E(X)^3 \\ &= \dfrac{3}{32} - 3\left( \dfrac14 \right) \left( \dfrac18 \right) + 2 \left( \dfrac14 \right)^3 = \dfrac{1}{32} \\ Kurt(X) &= E(X^4) - 4E(X)E(X^3) + 6E(X)^2 E(X^2) - 3E(X)^4 \\ &= \dfrac{3}{32} - 4\left( \dfrac14 \right) \left( \dfrac{3}{32} \right) + 6 \left( \dfrac14 \right)^2 \left( \dfrac18 \right) - 3 \left( \dfrac14 \right)^4 = \dfrac{9}{256} \end{align}
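The moments about zero used above can be cross-checked by integrating $x^n f(x)$ numerically instead of differentiating the moment generating function. The sketch below uses a simple midpoint Riemann sum; the truncation point `upper`, the number of steps, and the function names are assumptions chosen for illustration, not a recommended integration method.

```python
import math

def pdf(x):
    """Density from the continuous example: f(x) = 4 e^{-4x} for x >= 0."""
    return 4.0 * math.exp(-4.0 * x)

def raw_moment(n, upper=10.0, steps=100_000):
    """Approximate E(X^n) with a midpoint Riemann sum on [0, upper].

    The infinite upper limit is truncated at `upper`; the density is
    negligibly small beyond that point for this particular PDF.
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h  # midpoint of the i-th subinterval
        total += x**n * pdf(x)
    return total * h

moments = [raw_moment(n) for n in range(1, 5)]
expected = [1 / 4, 1 / 8, 3 / 32, 3 / 32]  # values obtained from the MGF above

for n, (approx, exact) in enumerate(zip(moments, expected), start=1):
    print(n, approx, exact)
```

Each printed pair should agree closely, which confirms that the derivatives of the moment generating function at zero reproduce the integrals $\int_0^\infty x^n f(x)\,\mathrm{d}x$.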

### Uniqueness of Moment Generating Functions

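Different probability distributions will always have different moment generating functions, provided those functions exist on an open interval around $t=0$. That means if a moment generating function is found and recognized as belonging to a known distribution, then that distribution is the only possible one for the random variable. For example, the function $-4(t-4)^{-1}$ found in the continuous example can only arise from the PDF $f(x) = 4e^{-4x}$ on $[0, \infty)$.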