Probability is best studied by simultaneously considering all possible outcomes in the sample space, as this provides a check on the accuracy of the computations. Furthermore, we can apply our descriptive statistics concepts to the probability distributions that we obtain.
You should remember that in classical probability, we called the process that generated outcomes a statistical experiment. The collection of all possible outcomes was called the sample space. When we considered probabilities for the outcomes in a sample space, we saw that the outcomes were generally equally likely. Now, we want to group the outcomes by some characteristic, and consider the probabilities of the different possible events that result.
A random variable is a numerical quantity whose values are the result of a random process (i.e., statistical experiment). The random variable will most often provide the grouping of the outcomes into different events. Although it is called a "variable", a random variable is actually a function which assigns to each outcome in the sample space a numerical quantity. A random variable can be either discrete or continuous, depending on whether the numerical quantities being assigned are discrete or continuous. The domain of the random variable is the sample space. Convention uses a capital letter as a name for the random variable, and the corresponding small letter for its value.
A probability distribution is a description of all possible values of a random variable, along with their associated probabilities. A probability distribution is also a function, and is often abbreviated pdf for probability distribution function. Notationally, the pdf gives the values of $P(X=x)$, or more briefly, $P(x)$. If the random variable is discrete, then we have a discrete probability distribution function, or discrete pdf. Discrete pdfs can be described through a table, a graph, or a formula.
Every discrete pdf must satisfy the two basic rules of probability. That is, for every $x$, we have:
$0 \leq P(X=x) \leq 1$ 
$\sum\limits_{\text{all }x} P(x) = 1$ 
A cumulative distribution function, or cdf, is a description of the probabilities associated with values of a random variable up to and including some value. In other words, if $X$ is a random variable, then the cdf gives values of $P(X \leq x)$.
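The two rules and the cdf definition above can be checked mechanically. The following is a minimal sketch, assuming a discrete pdf is stored as a Python dictionary mapping each value $x$ to $P(X=x)$; the helper names `is_valid_pdf` and `cdf_from_pdf` and the fair-die example are illustrative assumptions, not part of the text.

```python
from fractions import Fraction
from itertools import accumulate


def is_valid_pdf(pdf):
    """Check the two basic rules: every probability lies in [0, 1] and they sum to 1."""
    probs = list(pdf.values())
    return all(0 <= p <= 1 for p in probs) and sum(probs) == 1


def cdf_from_pdf(pdf):
    """Return {x: P(X <= x)} by accumulating probabilities in increasing order of x."""
    xs = sorted(pdf)
    running_totals = accumulate(pdf[x] for x in xs)
    return dict(zip(xs, running_totals))


# Hypothetical example: a fair six-sided die (exact fractions avoid rounding error).
die = {face: Fraction(1, 6) for face in range(1, 7)}

print(is_valid_pdf(die))      # True
print(cdf_from_pdf(die)[4])   # 2/3, i.e. P(X <= 4)
```

Exact fractions are used so that the "sums to 1" check is not disturbed by floating-point rounding; with decimal probabilities a small tolerance would be needed instead.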
The expected value, $E(X)$, of a discrete random variable is defined by
$E(X)= \sum\limits_{\text{all }x} x P(X=x) = \sum\limits_{\text{all }x} x P(x)$ 
The expected value is an average, and in fact is equivalent to the weighted mean formula. The expected value will describe the long-run behavior that the statistical experiment can be expected to produce.
The variance, $Var(X)$, of a discrete random variable is derived from the weighted population variance formula, and is defined by $Var(X)=\sum\limits_{\text{all }x} (x-\mu)^2 P(X=x)$, where $\mu = E(X)$. After some algebra, it can be shown that the variance is given by
$Var(X)= E(X^2)-(E(X))^2$ 
To use this formula, we need to know how to compute $E(X^2)$. By the definition of the expected value, the result is $E(X^2)=\sum\limits_{\text{all }x} x^2 P(x)$.
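For readers who want to see the algebra, here is one way the shortcut formula can be obtained. It writes $\mu = E(X)$, expands the square in the definition of the variance, and uses the facts that $\sum x P(x) = \mu$ and $\sum P(x) = 1$.

$$\begin{aligned}
Var(X) &= \sum\limits_{\text{all }x} (x-\mu)^2 P(x) \\
&= \sum\limits_{\text{all }x} x^2 P(x) - 2\mu \sum\limits_{\text{all }x} x P(x) + \mu^2 \sum\limits_{\text{all }x} P(x) \\
&= E(X^2) - 2\mu \cdot \mu + \mu^2 \cdot 1 \\
&= E(X^2) - \mu^2 \\
&= E(X^2) - (E(X))^2
\end{aligned}$$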
And as before, the standard deviation, $\sigma$, of a random variable is the square root of the variance. Since Chebyshev's Theorem is valid for any distribution, it also is valid for every pdf.
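The formulas for the expected value, variance, and standard deviation translate directly into code. The sketch below again assumes the pdf is stored as a dictionary of value and probability pairs; the function names `mean_var_sd` and `prob_within` and the fair-die example are hypothetical. It also compares the actual probability within $k$ standard deviations of the mean against the Chebyshev lower bound $1 - \frac{1}{k^2}$.

```python
from fractions import Fraction
from math import sqrt


def mean_var_sd(pdf):
    """Return E(X), Var(X) = E(X^2) - (E(X))^2, and the standard deviation."""
    mean = sum(x * p for x, p in pdf.items())
    mean_of_squares = sum(x**2 * p for x, p in pdf.items())
    variance = mean_of_squares - mean**2
    return mean, variance, sqrt(variance)


def prob_within(pdf, k):
    """Return P(|X - mu| < k * sigma) for the given pdf."""
    mean, _, sd = mean_var_sd(pdf)
    return sum(p for x, p in pdf.items() if abs(x - mean) < k * sd)


# Hypothetical example: a fair six-sided die.
die = {face: Fraction(1, 6) for face in range(1, 7)}
mean, variance, sd = mean_var_sd(die)
print(mean, variance)          # 7/2 35/12

k = 1.5
chebyshev_bound = 1 - 1 / k**2
print(prob_within(die, k) >= chebyshev_bound)   # True
```

Chebyshev's bound is usually far from tight; it only guarantees a minimum, so the actual probability is often much larger than the bound.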
Suppose two coins are flipped. Let $X$ represent the number of heads obtained. Construct both the pdf and the cdf for this statistical experiment. Then determine the expected value and the standard deviation.
When two coins are flipped, there will be two possible outcomes for the first coin, and two possible outcomes for the second coin. The Multiplication Rule of Counting applies, so there will be four outcomes in the sample space of this experiment: $\{HH,HT,TH,TT\}$. Each of the outcomes in the sample space is equally likely, and therefore each has probability $\frac14$. But the random variable $X$ has only three possible values: 0, 1, and 2. We note that the value $X=1$ corresponds to two outcomes in the sample space, so the probability $P(X=1)=\frac24$. Here are three possible representations of the probability distribution. (We have not yet discussed how to obtain the algebraic formula. That will come later. For now, be aware that it exists.)

$P(X=x) = \dfrac{{}_2 C_x}{4}$ where $x \in \{0,1,2\}$ 
The cumulative distribution function, or cdf, can be obtained simply by computing subtotals in the pdf.

| $X \leq x$ | $P(X \leq x)$ |
|---|---|
| $X \leq 0$ | $\dfrac14$ |
| $X \leq 1$ | $\dfrac34$ |
| $X \leq 2$ | $1$ |

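The construction of the pdf and cdf for the two-coin experiment can be reproduced by listing the sample space and grouping outcomes by the number of heads. This is a small sketch using only the Python standard library; the variable names are illustrative. It also confirms that the algebraic formula $\dfrac{{}_2 C_x}{4}$ agrees with the enumerated probabilities.

```python
from collections import Counter
from fractions import Fraction
from itertools import accumulate, product
from math import comb

# Sample space for two coin flips: HH, HT, TH, TT (each equally likely).
outcomes = list(product("HT", repeat=2))

# The random variable X assigns to each outcome its number of heads.
counts = Counter(outcome.count("H") for outcome in outcomes)

# pdf: P(X = x) = (number of outcomes with x heads) / (size of sample space)
pdf = {x: Fraction(counts[x], len(outcomes)) for x in sorted(counts)}

# cdf: running subtotals of the pdf
cdf = dict(zip(pdf, accumulate(pdf.values())))

for x in pdf:
    print(x, pdf[x], cdf[x])

# The algebraic formula gives the same probabilities.
print(all(pdf[x] == Fraction(comb(2, x), 4) for x in pdf))   # True
```

Note that `Fraction` reduces $\frac24$ to $\frac12$ automatically, so the printed value for $X=1$ appears as `1/2`.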
The expected value, the variance, and the standard deviation can be computed through the formulas provided. We shall organize our work by extending the pdf table.

| $X = x$ | $P(x)$ | $x P(x)$ | $x^2 P(x)$ |
|---|---|---|---|
| $X = 0$ | $\dfrac14$ | $0$ | $0$ |
| $X = 1$ | $\dfrac24$ | $\dfrac24$ | $\dfrac24$ |
| $X = 2$ | $\dfrac14$ | $\dfrac24$ | $\dfrac44$ |
| Totals | $\dfrac44$ | $\dfrac44$ | $\dfrac64$ |

Therefore, we have $E(X)=\sum x P(x) = \dfrac44 = 1$, so our long-run expectation is to see an average of one head when repeatedly performing two flips of a coin. We also have $E(X^2) = \sum x^2 P(x) = \dfrac64 = 1.5$, and therefore $Var(X)= E(X^2)-(E(X))^2 = 1.5 - 1^2 = 0.5$. This implies a standard deviation of $\sqrt{0.5} \approx 0.707$ heads.
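The "long-run" interpretation can be illustrated with a simulation: repeat the two-flip experiment many times and compare the sample mean and variance with the theoretical values 1 and 0.5. This is only a sketch; the seed, the number of trials, and the function name `heads_in_two_flips` are arbitrary assumptions, and the simulated values will be close to, but not exactly equal to, the theoretical ones.

```python
import random
from statistics import fmean

rng = random.Random(0)  # arbitrary fixed seed so the run is repeatable


def heads_in_two_flips(rng):
    """Simulate flipping two fair coins and return the number of heads."""
    return sum(rng.random() < 0.5 for _ in range(2))


trials = [heads_in_two_flips(rng) for _ in range(100_000)]

sample_mean = fmean(trials)
# Same shortcut as in the text: mean of the squares minus the square of the mean.
sample_var = fmean(x * x for x in trials) - sample_mean**2

print(round(sample_mean, 3), round(sample_var, 3))
```

Increasing the number of trials should bring both values closer to $E(X)=1$ and $Var(X)=0.5$.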
Suppose a drawer of 20 socks contains 8 green, 6 white, 4 black, and 2 yellow socks. Three socks are randomly selected without replacement. Let $Y$ represent the number of green socks. Obtain the pdf, the mean, and the standard deviation of this statistical experiment.
We begin by setting up a table, and this time, we shall include the computations for the probabilities directly in the table.

| $Y = y$ | $P(y)$ | $y P(y)$ | $y^2 P(y)$ |
|---|---|---|---|
| $Y = 0$ | ${}_3 C_0 \cdot\dfrac{12}{20}\cdot\dfrac{11}{19}\cdot\dfrac{10}{18} = \dfrac{1320}{6840} \approx 0.1930$ | $0$ | $0$ |
| $Y = 1$ | ${}_3 C_1 \cdot\dfrac{8}{20}\cdot\dfrac{12}{19}\cdot\dfrac{11}{18} = \dfrac{3168}{6840} \approx 0.4632$ | $\dfrac{3168}{6840}$ | $\dfrac{3168}{6840}$ |
| $Y = 2$ | ${}_3 C_2 \cdot\dfrac{8}{20}\cdot\dfrac{7}{19}\cdot\dfrac{12}{18} = \dfrac{2016}{6840} \approx 0.2947$ | $\dfrac{4032}{6840}$ | $\dfrac{8064}{6840}$ |
| $Y = 3$ | ${}_3 C_3 \cdot\dfrac{8}{20}\cdot\dfrac{7}{19}\cdot\dfrac{6}{18} = \dfrac{336}{6840} \approx 0.0491$ | $\dfrac{1008}{6840}$ | $\dfrac{3024}{6840}$ |
| Totals | $\dfrac{6840}{6840}=1$ | $\dfrac{8208}{6840}$ | $\dfrac{14256}{6840}$ |

The pdf is displayed in the first two columns. The expected value is $E(Y)=\sum y P(y) = \dfrac{8208}{6840} = \dfrac65 = 1.2$ green socks. We also have $E(Y^2) = \dfrac{14256}{6840}=\dfrac{198}{95} \approx 2.0842$, so the variance is $Var(Y) = \dfrac{198}{95}-\left(\dfrac65\right)^2 = \dfrac{306}{475} \approx 0.6442$. Thus, the standard deviation is $\sigma = \sqrt{\dfrac{306}{475}} \approx 0.8026$ green socks.
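The sock results can be cross-checked in code. Note that this sketch uses a different but equivalent counting approach from the table: instead of multiplying the probabilities of successive draws, it counts the ways to choose $y$ green socks and $3-y$ non-green socks out of all ways to choose 3 socks from 20. The constant names `GREEN`, `OTHER`, and `DRAWN` are illustrative.

```python
from fractions import Fraction
from math import comb, sqrt

GREEN, OTHER, DRAWN = 8, 12, 3           # 8 green socks, 12 of other colors, 3 drawn
total_ways = comb(GREEN + OTHER, DRAWN)  # ways to choose 3 socks from 20

# P(Y = y): choose y green and (3 - y) non-green, divided by all possible selections.
pdf = {
    y: Fraction(comb(GREEN, y) * comb(OTHER, DRAWN - y), total_ways)
    for y in range(DRAWN + 1)
}

mean = sum(y * p for y, p in pdf.items())
mean_of_squares = sum(y**2 * p for y, p in pdf.items())
variance = mean_of_squares - mean**2
sd = sqrt(variance)

for y, p in pdf.items():
    print(y, p, round(float(p), 4))
print(mean, variance, round(sd, 4))      # 6/5 306/475 0.8026
```

The fractions print in lowest terms (for example, $\frac{1320}{6840}$ appears as `11/57`), but the decimal values match those in the table.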
A one-year, $\$1,000$ term life insurance policy is sold by a company for $\$50$. If the probability of survival is 0.963, what is the company's expected profit?
There are only two possible events: either the policyholder survives the year and the company keeps the $\$50$ premium, or the policyholder dies and the beneficiary collects $\$1,000$. In the second case the company has still collected the $\$50$ premium, so its net result is a loss of $\$950$, that is, a profit of $-\$950$. Let $X$ represent the profit of the company. We have the following pdf.

| $X = x$ | $P(x)$ | $x P(x)$ |
|---|---|---|
| $X = 50$ | $0.963$ | $48.15$ |
| $X = -950$ | $0.037$ | $-35.15$ |
| Totals | $1$ | $13.00$ |

In the long run, the company earns $\$13$ per policy sold.
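The same computation can be written as a short function, which makes it easy to see how the expected profit depends on the premium, the payout, and the survival probability. This is a minimal sketch; the function name `expected_profit` and its parameter names are hypothetical, and `Decimal` is used so the dollar amounts are not affected by floating-point rounding.

```python
from decimal import Decimal


def expected_profit(premium, payout, p_survive):
    """Expected company profit per policy for a one-year term policy.

    If the policyholder survives, the profit is the premium.
    If the policyholder dies, the profit is premium - payout (a loss).
    """
    p_death = 1 - p_survive
    return premium * p_survive + (premium - payout) * p_death


profit = expected_profit(Decimal("50"), Decimal("1000"), Decimal("0.963"))
print(profit)   # 13.000 (dollars per policy)
```

Algebraically the result simplifies to the premium minus the payout times the probability of death, here $50 - 1000(0.037) = 13$.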