#### Expectation of a Random Variable

• The expectation of a discrete Random Variable $$Y$$ is: $\mathbb{E}[Y] = \sum_{y} y \, p_Y (y) \approx \frac{1}{M} \sum_{m=1}^M y^{(m)} = \bar{y}$ where each draw $$y^{(m)}$$, for $$m$$ from $$1$$ through $$M$$, is distributed according to $$p_Y (y)$$.

• The expectation of a continuous Random Variable $$Y$$ is: $\mathbb{E}[Y] = \int_{y} y \, {p}_{Y}(y) \, dy \approx \frac{1}{M} \sum_{m=1}^M y^{(m)} = \bar{y}$ where each draw $$y^{(m)}$$, for $$m$$ from $$1$$ through $$M$$, is distributed according to $$p_Y (y)$$.

##### Sampling from a discrete random variable: rolling a fair 6-sided die

One roll of a six-sided die can be modeled as a Random Variable $$Y_{D6}$$ which takes one of six possible values, $$\{1, \ldots, 6\}$$, each with probability $$1/6$$.
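
As a worked example of the discrete formula for the expectation, the exact value for the fair die follows directly from the definition, since every face has probability $$1/6$$:

$\mathbb{E}[Y_{D6}] = \sum_{y=1}^{6} y \cdot \frac{1}{6} = \frac{1 + 2 + 3 + 4 + 5 + 6}{6} = \frac{21}{6} = 3.5$

The average of a large number of simulated rolls should approach this value.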

The R function sample takes arguments:

• x - either a vector of elements from which to sample, or a positive integer $$n$$, in which case the sample is drawn from the vector of integers $$1:n$$.
• size - a non-negative integer which specifies the number of items to choose.
• replace - either TRUE or FALSE; specifies whether sampling is done with or without replacement. The default is FALSE.

Simulate one roll of a fair six-sided die (“D6”):

sample(1:6, 1)
[1] 3

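To simulate 10 independent rolls of a D6, replace must be set to TRUE. With the default, replace=FALSE, no more than six values can be drawn from 1:6 and the call fails with an error.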

sample(1:6, 10, replace=TRUE)
[1] 6 6 5 2 4 3 4 3 6 5
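
The two ways of specifying x, and the effect of replace, can be checked with a short sketch. The seed value 42 and the variable names are arbitrary choices made here for illustration; they do not come from the text above.

```r
# Fix the seed so the two calls below consume the same random numbers.
set.seed(42)
rolls_from_vector <- sample(1:6, 10, replace = TRUE)

set.seed(42)
rolls_from_integer <- sample(6, 10, replace = TRUE)

# An integer n for x is shorthand for the vector 1:n.
print(identical(rolls_from_vector, rolls_from_integer))

# Without replacement, at most length(x) items can be drawn,
# so asking for 10 rolls from 6 faces raises an error.
err <- tryCatch(
  sample(1:6, 10),
  error = function(e) conditionMessage(e)
)
print(err)

# Drawing all 6 values without replacement gives a random permutation.
perm <- sample(1:6, 6)
print(perm)
```

Sampling without replacement is the right model for shuffling a deck of cards, not for repeated die rolls, where each roll is independent of the others.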

Simulate 100 values:

sample(1:6, 100, replace=TRUE)
 [1] 4 3 3 6 1 2 1 3 2 2 6 1 1 5 2 6 2 4 1 6 2 5 6 3 4
[26] 5 4 1 4 3 4 3 2 2 4 1 4 4 5 2 6 5 4 1 3 5 3 2 4 1
[51] 3 1 2 2 6 6 4 6 5 1 1 2 5 4 5 1 5 1 5 2 5 1 5 4 6
[76] 5 2 6 2 6 3 1 2 6 1 4 2 4 4 4 6 5 1 4 3 4 4 6 1 2
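
A long vector of rolls is hard to read by eye. One way to summarize it, sketched here with an arbitrary seed and illustrative variable names, is to tabulate how often each face appears and compare the relative frequencies with the probability $$1/6$$:

```r
set.seed(123)  # arbitrary seed, for reproducibility only
rolls <- sample(1:6, 100, replace = TRUE)

# Count how often each face came up; factor() keeps faces with zero count.
counts <- table(factor(rolls, levels = 1:6))

# Relative frequencies; each should be roughly 1/6 (about 0.167).
freqs <- counts / length(rolls)
print(freqs)
```

With only 100 rolls the frequencies will typically deviate noticeably from $$1/6$$; the deviations shrink as the number of rolls grows.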

Exercise 1.1: Estimate the expectation of the discrete RV $$Y_{D6}$$, one roll of a fair six-sided die, using samples of 100, 1000, and 10,000 draws, respectively.

##### Sampling from a continuous random variable: the Normal distribution

The normal (Gaussian) distribution is a continuous probability distribution with parameters:

• mean - the location parameter, usually written as $$\mu$$
• variance - the square of the scale parameter, usually written as $$\sigma^2$$. (Note: the parameter $$\sigma$$ is the standard deviation; direct comparison of the mean and the standard deviation is more intuitive because they are on the same scale).

Plots of the probability density function show a symmetric “bell curve” centered on the mean, whose width, and therefore peak height, is determined by the standard deviation.
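
For reference, the density that produces the bell curve can be written in terms of the two parameters above:

$p_Y (y \mid \mu, \sigma) = \frac{1}{\sigma \sqrt{2 \pi}} \exp \left( -\frac{ {\left( y - \mu \right)}^2 }{2 \sigma^2} \right)$

The peak sits at $$y = \mu$$ and has height $$1 / (\sigma \sqrt{2 \pi})$$, so a larger $$\sigma$$ gives a wider and lower curve.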

The R function rnorm generates a sequence of random draws from a normal distribution with a specified mean and standard deviation. Note that its sd argument is the standard deviation $$\sigma$$, not the variance $$\sigma^2$$.

Take one draw from a normal distribution with mean 5, standard deviation 10:

rnorm(1, mean=5, sd=10)
[1] 12.07875

Take 100 draws:

rnorm(100, mean=5, sd=10)
[1]  14.8462947   8.0735792   2.4734157   7.6774951  -0.2073813   4.9606262   3.6510407  12.8835495
[9]   5.6449032  22.8583381   2.7765726   5.0189006 -10.6783155  -0.3498541   1.8920707  23.4656587
[17]  13.0086797   1.6940446   7.6245823   0.1356268   3.6677676 -12.4893873  15.6051132   0.8198043
[25]  29.8099277   1.5086770  17.8460684  -7.0311548 -11.9051549  17.6974229   4.9505833  -6.8014745
[33]  -4.2759833  -1.5289839  -1.2261375   5.1040920  27.4643456  -4.3950494  -5.1456122  -3.7409858
[41]  21.0609330  -0.3272468   0.5560233  10.1796741  12.7950104  22.9025000   1.2499491  -4.7077668
[49]   8.9663284 -14.3782372 -16.8933651   6.3258832  18.7200482   7.1013480  -7.2355715  13.1030800
[57]  -2.0957482   4.3703866   1.8779873  21.4730634   6.4950266 -13.1762963  13.1514351   9.0616678
[65] -12.8618908   3.8085431   1.0981333   6.9481642  16.9777193  14.3057281  16.1671721   1.4471080
[73]   2.5653629  17.6422668   5.1011498  17.5863671   9.5020089  17.9404970  18.2373002   5.0611401
[81]   3.2828250  35.6020802   3.0981154   1.7111767  -5.2415036  19.8766368   9.5519978   1.6051318
[89]   1.6079017  -6.4268732  15.4876133  19.2577546  15.7550100  15.0751943   4.1478186  10.5754150
[97]   8.7662413  -6.8411017  16.2789227  -4.3084518

You can use R’s hist function to plot the sample distribution: hist(rnorm(100, mean=5, sd=10), xlim=c(-35, 45)). Does this look like a bell curve?
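
To judge the shape more directly, the histogram can be drawn on the density scale and the theoretical curve laid over it. The following is a minimal sketch; the seed, the sample size of 1000, and the plot labels are choices made here for illustration.

```r
set.seed(2024)  # arbitrary seed, for reproducibility only
draws <- rnorm(1000, mean = 5, sd = 10)

# freq = FALSE rescales bar heights so the total bar area is 1,
# which puts the histogram on the same scale as a density.
h <- hist(draws, freq = FALSE, xlim = c(-35, 45),
          main = "1000 draws from a normal(5, 10)", xlab = "y")

# Overlay the theoretical density of the same normal distribution.
curve(dnorm(x, mean = 5, sd = 10), add = TRUE, lwd = 2)
```

Rerunning with 100 draws instead of 1000 shows how ragged the histogram of a small sample can be, even though the draws come from the same distribution.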

Exercise 1.2: Estimate the expectation of the continuous RV $$Y$$ which has a normal distribution with mean=5 and standard deviation=10, using samples of 100, 1000, and 10,000 draws, respectively.

#### Variance of a Random Variable

The Variance of a Random Variable is the expectation of the squared difference between the random variable and its mean:

$\mathrm{var}[Y] = \mathbb{E}[ { \left( Y - \mathbb{E}[Y] \right) }^2 ]$
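
Expanding the square gives an equivalent form that is often easier to work with. Writing $$\mu = \mathbb{E}[Y]$$ as a shorthand for this step only, and using the linearity of expectation:

$\mathrm{var}[Y] = \mathbb{E}[ Y^2 - 2 \mu Y + \mu^2 ] = \mathbb{E}[Y^2] - 2 \mu \, \mathbb{E}[Y] + \mu^2 = \mathbb{E}[Y^2] - { \left( \mathbb{E}[Y] \right) }^2$

For the fair six-sided die, $$\mathbb{E}[Y_{D6}^2] = (1 + 4 + 9 + 16 + 25 + 36)/6 = 91/6$$ and $$\mathbb{E}[Y_{D6}] = 7/2$$, so $$\mathrm{var}[Y_{D6}] = 91/6 - 49/4 = 35/12 \approx 2.92$$.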

• For a discrete Random Variable $$Y$$, the variance $$\mathrm{var}[Y]$$ is: $\mathrm{var}[Y] = \sum_y {\left( y - \mathbb{E}[Y] \right)}^2 p_Y (y) \approx \frac{1}{M} \sum_{m=1}^M {\left( y^{(m)} -\bar{y} \right) }^2$ where each draw $$y^{(m)}$$, for $$m$$ from $$1$$ through $$M$$, is distributed according to $$p_Y (y)$$ and $$\bar{y}$$ is the sample mean.

• For a continuous Random Variable $$Y$$, the variance $$\mathrm{var}[Y]$$ is: $\mathrm{var}[Y] = \int_y {\left( y - \mathbb{E}[Y] \right)}^2 p_Y (y) \, dy \approx \frac{1}{M} \sum_{m=1}^M {\left( y^{(m)} -\bar{y} \right) }^2$ where each draw $$y^{(m)}$$, for $$m$$ from $$1$$ through $$M$$, is distributed according to $$p_Y (y)$$ and $$\bar{y}$$ is the sample mean.
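
The sample-based approximations can be sketched on a simpler Random Variable than the ones in the exercises: a fair coin coded as 0 or 1, whose exact expectation is 0.5 and exact variance is 0.25. The coin example, the seed, and the variable names are assumptions introduced here for illustration. One detail to keep in mind is that R's built-in var function divides by $$M - 1$$, whereas the formulas above divide by $$M$$.

```r
set.seed(1)  # arbitrary seed, for reproducibility only
M <- 10000

# Fair coin coded as 0/1: exact expectation 0.5, exact variance 0.25.
y <- sample(0:1, M, replace = TRUE)

# Sample mean: (1/M) times the sum of the draws.
y_bar <- sum(y) / M

# Variance estimate with the 1/M divisor used in the formulas above.
var_hat <- sum((y - y_bar)^2) / M

# R's built-in var() divides by M - 1 instead of M.
var_builtin <- var(y)

print(c(mean = y_bar, var_hat = var_hat, var_builtin = var_builtin))
```

For large $$M$$ the two variance estimates are nearly identical; the difference only matters for small samples.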

Exercise 1.3: Estimate the variance of the discrete RV $$Y_{D6}$$, one roll of a fair six-sided die, using samples of 100, 1000, and 10,000 draws, respectively.

Exercise 1.4: Estimate the variance of the continuous RV $$Y$$ which has a normal distribution with mean=5 and standard deviation=10, using samples of 100, 1000, and 10,000 draws, respectively.