Just as graphs in college algebra could be translated or stretched by changing the parameters in the function, so too can probability distributions, since they are also functions and have graphs. To describe the transformation, we typically define a new random variable, $Y$, in terms of the previous random variable, $X$. We shall assume that the relation between $X$ and $Y$ is linear, so it has the form $Y=aX+b$.

Now the discrete random variable $X$ takes on values $x$ in the domain of $X$, and the PDF defines the probability $P(X=x)$ associated with each value $x$. The transformation equation $Y=aX+b$, applied to the values $x$ in the domain of $X$, defines the domain of $Y$ with values $y$, but the probabilities do not change as a result of the transformation. Therefore, $P(x) = P(y)$ whenever $y=ax+b$. More precisely, we have the chain of equalities

$P(Y=y) = P(aX+b = ax+b) = P(X=x)$

which we usually abbreviate as $P(y) = P(ax+b) = P(x)$.
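A quick numerical sketch may help; the particular distribution here (a fair six-sided die with $Y = 2X + 3$) is a hypothetical example, not one from the text. The transformation relabels the values while leaving every probability unchanged.

```python
from fractions import Fraction

# Hypothetical example: X is a fair six-sided die and Y = 2X + 3.
# The transformation relabels the values but leaves every
# probability untouched: P(Y = 2x + 3) = P(X = x).
a, b = 2, 3
pmf_X = {x: Fraction(1, 6) for x in range(1, 7)}    # P(X = x)
pmf_Y = {a * x + b: p for x, p in pmf_X.items()}    # P(Y = ax + b)

assert pmf_Y[2 * 4 + 3] == pmf_X[4]   # P(Y = 11) = P(X = 4) = 1/6
assert sum(pmf_Y.values()) == 1       # still a valid PMF
```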

In the case of a continuous random variable, the probability density function itself can change. Suppose first that the continuous PDF $f_X(x)$ is defined on an interval $(p,q)$ (where $p$ and $q$ are not necessarily finite), and assume for the moment that $a > 0$. The transformation $Y=aX+b$ then produces a continuous PDF $f_Y(y)$ on the interval $(ap+b, aq+b)$. Solving the transformation for $x$, we get $x = \dfrac{y-b}{a}$, so $\mathrm{d}x = \dfrac{1}{a} \, \mathrm{d}y$, which we can substitute into the integral

\begin{equation} \int_p^q f_X(x) \, \mathrm{d}x = 1 = \int_{ap+b}^{aq+b} f_X \left( \dfrac{y-b}{a} \right) \dfrac{1}{a} \, \mathrm{d}y \end{equation}

Equating $f_Y(y)$ with the integrand of the transformed integral, we find

$f_Y(y) = \dfrac{1}{a} f_X(x) = \dfrac{1}{a} f_X \left( \dfrac{y-b}{a} \right)$

Note that the transformation scales the length of the interval (or any finite subinterval) on which the random variable is defined by a factor of $a$, so the height of the density is scaled down by the same factor and the total area remains $1$. For a general $a \neq 0$, carrying out the same substitution (with the limits of integration reversed when $a < 0$) gives $f_Y(y) = \dfrac{1}{|a|} f_X \left( \dfrac{y-b}{a} \right)$.
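As a concrete sketch under assumed data (not an example from the text), take $X$ uniform on $(0,1)$, so $f_X = 1$ there, and $Y = aX + b$ with $a = 3$, $b = 2$. Then $Y$ is uniform on $(2, 5)$: the interval stretches by a factor of $a$ while the density's height shrinks by the same factor.

```python
# Hypothetical example: X uniform on (0, 1), Y = aX + b with a = 3 > 0.
# Then f_Y(y) = (1/a) f_X((y - b)/a) is uniform on (b, a + b) = (2, 5).
a, b = 3.0, 2.0

def f_X(x):
    return 1.0 if 0.0 < x < 1.0 else 0.0

def f_Y(y):
    return (1.0 / a) * f_X((y - b) / a)

assert abs(f_Y(3.5) - 1.0 / 3.0) < 1e-12   # height drops to 1/a on (2, 5)
assert f_Y(1.0) == 0.0                     # zero outside the new interval

# Riemann-sum check that f_Y still integrates to 1 over (2, 5):
total = sum(f_Y(b + a * k / 10_000) * (a / 10_000) for k in range(10_000))
assert abs(total - 1.0) < 1e-3
```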

When a random variable is transformed, as in the equation $Y = aX + b$, the moment generating function has the following property.

$M_Y(t) = M_{aX+b}(t) = e^{tb} M_X (at)$

Although the result is not obvious, the proof is quite easy.

In the discrete case,

\begin{align} M_Y(t) &= M_{aX+b}(t) \\ &= E \left( e^{t(aX+b)} \right) \\ &= \sum\limits_x e^{t(ax+b)} P(x) \\ &= e^{tb} \sum\limits_x e^{atx} P(x) \\ &= e^{tb} M_X (at) \end{align}

and in the continuous case,

\begin{align} M_Y(t) &= M_{aX+b}(t) \\ &= E \left( e^{t(aX+b)} \right) \\ &= \int_x e^{t(ax+b)} f(x) \, \mathrm{d}x \\ &= e^{tb} \int_x e^{atx} f(x) \, \mathrm{d}x \\ &= e^{tb} M_X (at) \end{align}
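The identity can also be confirmed numerically; the distribution below (a fair six-sided die with $Y = 2X + 3$, evaluated at $t = 0.1$) is a hypothetical choice for illustration.

```python
import math

# Hypothetical example: X a fair six-sided die, Y = 2X + 3.
# Compare M_Y(t) computed directly against the identity
# M_Y(t) = e^{tb} M_X(at) at t = 0.1.
a, b, t = 2, 3, 0.1
pmf = {x: 1 / 6 for x in range(1, 7)}

def M_X(s):
    return sum(math.exp(s * x) * p for x, p in pmf.items())

M_Y_direct = sum(math.exp(t * (a * x + b)) * p for x, p in pmf.items())

assert abs(M_Y_direct - math.exp(t * b) * M_X(a * t)) < 1e-12
```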

We would anticipate that the expected value will change when the data is shifted. After all, the expected value is related to the mean, a measure of central tendency, and the center of the graph will move if the data is all moved. In fact, the relationship between the expected values of the old and new random variables is:

$E(Y) = E(aX+b) = a E(X)+b$
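A short numerical sanity check of this formula, again using a hypothetical fair six-sided die with $Y = 2X + 3$:

```python
# Hypothetical example: X a fair six-sided die with E(X) = 3.5,
# and Y = 2X + 3.  Then E(Y) should equal a E(X) + b = 10.
a, b = 2, 3
values = range(1, 7)
E_X = sum(x / 6 for x in values)              # 3.5
E_Y = sum((a * x + b) / 6 for x in values)    # direct computation

assert abs(E_Y - (a * E_X + b)) < 1e-12       # E(Y) = 2(3.5) + 3 = 10
```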

We can easily derive this formula from the moment generating function.

\begin{align} M'_Y(t) &= a e^{tb} M'_X(at) + b e^{tb} M_X (at) \\ E(Y) &= M'_Y (0) = a e^0 M'_X(0) + b e^0 M_X(0) = a E(X) + b \end{align}

where the last step uses $M_X(0) = E(e^{0 \cdot X}) = 1$. Similarly, a formula for the variance of the random variable $Y$ in terms of the variance of $X$ also exists.

$Var(Y) = Var(aX+b) = a^2 Var(X)$
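Before deriving this, a numerical sanity check with the same hypothetical die, $Y = 2X + 3$: the shift $b$ drops out of the variance, and only the scale factor $a$ survives, squared.

```python
# Hypothetical example: X a fair six-sided die, Y = 2X + 3.
# Var(Y) should equal a^2 Var(X); the shift b contributes nothing.
a, b = 2, 3
values = list(range(1, 7))

def var(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

V_X = var(values)                        # 35/12 for a fair die
V_Y = var([a * x + b for x in values])   # computed directly

assert abs(V_Y - a * a * V_X) < 1e-9
```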

Once again, we turn to the moment generating function.

\begin{align} M''_Y(t) &= a^2 e^{tb} M''_X(at) + 2ab e^{tb} M'_X(at) + b^2 e^{tb} M_X(at) \\ E(Y^2) &= M''_Y(0) = a^2 e^0 M_X''(0) + 2ab e^0 M'_X(0) + b^2 e^0 M_X(0) \\ &= a^2 E(X^2) + 2ab E(X) + b^2 \\ Var(Y) &= E(Y^2) - (E(Y))^2 = a^2 E(X^2) + 2ab E(X) + b^2 - (a E(X) + b)^2 \\ &= a^2 E(X^2) + 2ab E(X) + b^2 - a^2 (E(X))^2 - 2ab E(X) - b^2 \\ &= a^2 E(X^2) - a^2 (E(X))^2 \\ &= a^2 \left( E(X^2) - (E(X))^2 \right) = a^2 Var(X) \end{align}

An immediate corollary of the variance result is the relationship between the standard deviations of the two random variables.

$\sigma_Y = \sigma_{aX+b} = |a| \sigma_X$

The absolute value appears because $\sigma_Y = \sqrt{Var(Y)} = \sqrt{a^2 \, Var(X)} = |a| \, \sigma_X$.
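A final numerical check with a hypothetical negative scale factor makes the role of $|a|$ visible: with $Y = -2X + 3$, the standard deviation scales by $|a| = 2$, not by $a = -2$.

```python
import statistics

# Hypothetical example: X a fair six-sided die, Y = -2X + 3.
# sigma_Y = |a| sigma_X = 2 sigma_X, even though a is negative.
a, b = -2, 3
xs = list(range(1, 7))
ys = [a * x + b for x in xs]

sigma_X = statistics.pstdev(xs)   # population standard deviation
sigma_Y = statistics.pstdev(ys)

assert abs(sigma_Y - abs(a) * sigma_X) < 1e-12
```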