Expectation of a Random Variable

Sampling from a discrete random variable: rolling a fair 6-sided die

One roll of a six-sided die can be modeled as a random variable \(Y_{D6}\) that takes one of six possible values in { 1,…,6 }, each with probability \(1/6\).
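The expectation of a discrete random variable is the probability-weighted average of its possible values. As a short worked example for the die (the summation index \(y\) is notation introduced here for illustration):

\[ \mathbb{E}[Y_{D6}] = \sum_{y=1}^{6} y \cdot \Pr[Y_{D6} = y] = \frac{1 + 2 + 3 + 4 + 5 + 6}{6} = 3.5 \]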

The R function sample takes arguments:

  • x - either a vector of elements from which to sample, or a positive integer \(n\), in which case the sample is drawn from the integers \(1:n\).
  • size - a non-negative integer which specifies the number of items to choose.
  • replace - either TRUE or FALSE; specifies whether to sample with replacement. The default is FALSE.
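The following is a minimal sketch of how these three arguments interact. The seed value and the variable names are arbitrary choices made for this illustration, not part of the original notes.

```r
set.seed(42)  # arbitrary seed, only so repeated runs give the same draws

# An integer x is shorthand for the vector 1:x.
one_roll <- sample(6, 1)

# With the default replace = FALSE no value can repeat,
# so sampling all six values yields a permutation of 1:6.
shuffled <- sample(1:6, 6)

# Requesting more draws than there are elements is an error
# unless replace = TRUE; here the error message is captured as a string.
too_many <- tryCatch(sample(1:6, 10), error = function(e) conditionMessage(e))
print(too_many)

# With replace = TRUE any number of draws is allowed.
ten_rolls <- sample(1:6, 10, replace = TRUE)
print(ten_rolls)
```

Because the default is sampling without replacement, repeated die rolls always need replace = TRUE.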

Simulate one roll of a fair six-sided die (“D6”):

sample(1:6, 1)
 [1] 3

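To simulate 10 independent rolls of a D6, replace must be set to TRUE. Without replacement, no value could appear twice, and asking for more than 6 draws would be an error: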

sample(1:6, 10, replace=TRUE)
 [1] 6 6 5 2 4 3 4 3 6 5

Simulate 100 values:

sample(1:6, 100, replace=TRUE)
[1] 4 3 3 6 1 2 1 3 2 2 6 1 1 5 2 6 2 4 1 6 2 5 6 3 4
    5 4 1 4 3 4 3 2 2 4 1 4 4 5 2 6 5 4 1 3 5 3 2 4 1
    3 1 2 2 6 6 4 6 5 1 1 2 5 4 5 1 5 1 5 2 5 1 5 4 6
    5 2 6 2 6 3 1 2 6 1 4 2 4 4 4 6 5 1 4 3 4 4 6 1 2

Exercise 1.1: Estimate the expectation of the discrete RV \(Y_{D6}\), one roll of a fair six-sided die, using samples of 100, 1000, and 10,000 draws, respectively.
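One possible approach, sketched here under the assumption that the sample mean is the intended estimator: draw the rolls with sample and average them with mean. The seed and the names sizes and estimates are illustrative choices. The exact expectation of a fair die is \((1 + 2 + \cdots + 6)/6 = 3.5\), so the estimates can be checked against that value.

```r
set.seed(123)  # arbitrary seed, only for repeatability

# Sample sizes named in the exercise.
sizes <- c(100, 1000, 10000)

# For each size, simulate that many rolls and take the sample mean.
estimates <- sapply(sizes, function(n) mean(sample(1:6, n, replace = TRUE)))
names(estimates) <- sizes

# Larger samples tend to give estimates closer to 3.5.
print(estimates)
```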

Sampling from a continuous random variable: the Normal distribution

The normal (Gaussian) distribution is a continuous probability distribution with parameters:

  • mean - the location parameter, usually written as \(\mu\)
  • variance - the square of the scale parameter, usually written as \(\sigma^2\). (Note: the scale parameter \(\sigma\) is the standard deviation; direct comparison of the mean and the standard deviation is more intuitive because they are on the same scale.)

Plots of the probability density function show a symmetric “bell curve” centered on the mean, whose width and height are determined by the standard deviation: a larger standard deviation gives a wider, flatter curve.

The R function rnorm generates a sequence of random draws from a normal distribution with a specified mean and standard deviation. Note that its sd argument is the standard deviation \(\sigma\), not the variance \(\sigma^2\).

Take one draw from a normal distribution with mean 5, standard deviation 10:

rnorm(1, mean=5, sd=10)
[1] 12.07875
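Because rnorm is parameterized by the standard deviation, a distribution specified by its variance needs a square root first. A minimal sketch, where the seed and the names sigma2 and draws are illustrative assumptions:

```r
set.seed(7)  # arbitrary seed, only for repeatability

# Suppose the distribution is given by its variance, sigma^2 = 100.
sigma2 <- 100

# rnorm expects the standard deviation, so pass sqrt(sigma2) = 10.
draws <- rnorm(10000, mean = 5, sd = sqrt(sigma2))

# The sample mean and sample standard deviation should be near 5 and 10.
print(mean(draws))
print(sd(draws))
```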

Take 100 draws:

rnorm(100, mean=5, sd=10)
  [1]  14.8462947   8.0735792   2.4734157   7.6774951  -0.2073813   4.9606262   3.6510407  12.8835495
  [9]   5.6449032  22.8583381   2.7765726   5.0189006 -10.6783155  -0.3498541   1.8920707  23.4656587
 [17]  13.0086797   1.6940446   7.6245823   0.1356268   3.6677676 -12.4893873  15.6051132   0.8198043
 [25]  29.8099277   1.5086770  17.8460684  -7.0311548 -11.9051549  17.6974229   4.9505833  -6.8014745
 [33]  -4.2759833  -1.5289839  -1.2261375   5.1040920  27.4643456  -4.3950494  -5.1456122  -3.7409858
 [41]  21.0609330  -0.3272468   0.5560233  10.1796741  12.7950104  22.9025000   1.2499491  -4.7077668
 [49]   8.9663284 -14.3782372 -16.8933651   6.3258832  18.7200482   7.1013480  -7.2355715  13.1030800
 [57]  -2.0957482   4.3703866   1.8779873  21.4730634   6.4950266 -13.1762963  13.1514351   9.0616678
 [65] -12.8618908   3.8085431   1.0981333   6.9481642  16.9777193  14.3057281  16.1671721   1.4471080
 [73]   2.5653629  17.6422668   5.1011498  17.5863671   9.5020089  17.9404970  18.2373002   5.0611401
 [81]   3.2828250  35.6020802   3.0981154   1.7111767  -5.2415036  19.8766368   9.5519978   1.6051318
 [89]   1.6079017  -6.4268732  15.4876133  19.2577546  15.7550100  15.0751943   4.1478186  10.5754150
 [97]   8.7662413  -6.8411017  16.2789227  -4.3084518

You can use R’s hist function to plot the distribution of the sample, for example hist(rnorm(100, mean=5, sd=10), xlim=c(-35, 45)). Does the result look like a bell curve?
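To make the comparison with the bell curve easier, one option is to draw the histogram on the density scale and overlay the theoretical density using dnorm and curve. This is a sketch only; the seed and the variable names are assumptions made for the illustration.

```r
set.seed(1)  # arbitrary seed, only for repeatability

draws <- rnorm(100, mean = 5, sd = 10)

# freq = FALSE rescales the bars so their total area is 1,
# which puts the histogram on the same scale as a density.
h <- hist(draws, freq = FALSE, xlim = c(-35, 45),
          main = "100 draws from Normal(mean = 5, sd = 10)")

# Overlay the theoretical normal density on the existing plot.
curve(dnorm(x, mean = 5, sd = 10), add = TRUE)
```

With only 100 draws the histogram is usually ragged; repeating the plot with a larger sample tends to follow the curve more closely.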

Exercise 1.2: Estimate the expectation of the continuous RV \(Y\), which has a normal distribution with mean 5 and standard deviation 10, using samples of 100, 1000, and 10,000 draws, respectively.

Variance of a Random Variable

The variance of a random variable is the expectation of the squared difference between the random variable and its mean:

\[ \mathrm{var}[Y] = \mathbb{E}[ { \left( Y - \mathbb{E}[Y] \right) }^2 ] \]
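The definition can be applied to a sample by replacing each expectation with a sample mean. The sketch below does this for simulated die rolls and compares the result with R's built-in var, which divides by \(n - 1\) rather than \(n\). The seed and variable names are illustrative assumptions. For a fair die, applying the definition exactly gives \(35/12 \approx 2.917\).

```r
set.seed(123)  # arbitrary seed, only for repeatability

n <- 10000
y <- sample(1:6, n, replace = TRUE)

# Plug-in estimate: mean squared difference from the sample mean.
var_def <- mean((y - mean(y))^2)

# Built-in sample variance, which uses the divisor n - 1.
var_builtin <- var(y)

print(var_def)
print(var_builtin)
print(35 / 12)  # exact variance of a fair six-sided die
```

For large \(n\) the two estimates are nearly identical, since they differ only by the factor \(n / (n - 1)\).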

Exercise 1.3: Estimate the variance of the discrete RV \(Y_{D6}\), one roll of a fair six-sided die, using samples of 100, 1000, and 10,000 draws, respectively.

Exercise 1.4: Estimate the variance of the continuous RV \(Y\), which has a normal distribution with mean 5 and standard deviation 10, using samples of 100, 1000, and 10,000 draws, respectively.