The expectation of a discrete Random Variable \(Y\) is: \[\mathbb{E}[Y] = \sum_{y} y \, p_Y (y) \approx \frac{1}{M} \sum_{m=1}^M y^{(m)} = \bar{y}\] where each draw \(y^{(m)}\), for \(m\) from \(1\) through \(M\), is distributed according to \(p_Y (y)\).
The expectation of a continuous Random Variable \(Y\) is: \[\mathbb{E}[Y] = \int_{y} y \, {p}_{Y}(y) \, dy \approx \frac{1}{M} \sum_{m=1}^M y^{(m)} = \bar{y}\] where each draw \(y^{(m)}\), for \(m\) from \(1\) through \(M\), is distributed according to \(p_Y (y)\).
One roll of a fair six-sided die can be modeled as a Random Variable \(Y_{D6}\) which takes one of six possible values in \(\{1, \ldots, 6\}\), each of which has probability \(1/6\).
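As a quick check of the discrete formula, the exact expectation of one roll of a fair six-sided die can be computed directly as a probability-weighted sum. The following is a minimal sketch in R, the language used in the rest of this section; the variable names are illustrative only.

```r
# Possible outcomes of one roll of a fair six-sided die
y <- 1:6

# Each outcome has probability 1/6
p <- rep(1/6, length(y))

# Exact expectation: the sum over all outcomes of y * p_Y(y)
expectation_d6 <- sum(y * p)
print(expectation_d6)
```

The result is 3.5, which is the value that sample means of simulated rolls should approach as the number of draws grows.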
The R function sample takes arguments:
- x: either a vector of elements from which to sample or an integer \(n\), in which case \(x\) is the vector of integers \(1:n\).
- size: a positive integer which specifies the number of items to choose.
- replace: either TRUE or FALSE, specifies whether to sample with or without replacement; the default is FALSE.
Simulate one roll of a fair six-sided die (“D6”):
sample(c(1:6), 1);
[1] 3
To simulate 10 independent rolls of a D6, replace should be set to TRUE:
sample(c(1:6), 10, replace=TRUE);
[1] 6 6 5 2 4 3 4 3 6 5
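Two details of sample described above can be checked directly: a single integer \(n\) as the first argument is shorthand for \(1:n\), and with the default replace=FALSE it is an error to ask for more items than the vector contains. The sketch below also uses set.seed, which does not appear in the original text, to make the draws reproducible; the seed value is arbitrary.

```r
# Fix the random seed so that repeated runs give the same draws
# (the seed value 1234 is arbitrary).
set.seed(1234)
rolls_a <- sample(1:6, 10, replace = TRUE)

# sample(6, ...) is shorthand for sample(1:6, ...): same seed, same draws
set.seed(1234)
rolls_b <- sample(6, 10, replace = TRUE)
print(identical(rolls_a, rolls_b))

# With the default replace = FALSE, drawing 10 items from 6 is an error;
# tryCatch captures the error message instead of stopping the script.
msg <- tryCatch(
  {
    sample(1:6, 10)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

Without replacement, the six faces would behave like cards dealt from a deck: each value could appear at most once, which is not how repeated die rolls work.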
Simulate 100 values:
sample(c(1:6), 100, replace=TRUE)
[1] 4 3 3 6 1 2 1 3 2 2 6 1 1 5 2 6 2 4 1 6 2 5 6 3 4
5 4 1 4 3 4 3 2 2 4 1 4 4 5 2 6 5 4 1 3 5 3 2 4 1
3 1 2 2 6 6 4 6 5 1 1 2 5 4 5 1 5 1 5 2 5 1 5 4 6
5 2 6 2 6 3 1 2 6 1 4 2 4 4 4 6 5 1 4 3 4 4 6 1 2
Exercise 1.1: Estimate the expectation of the discrete RV \(Y_{D6}\), one roll of a fair six-sided die, using samples of 100, 1000, and 10,000 draws, respectively.
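One possible way to approach this exercise is sketched below; it is not the only solution. The helper name estimate_d6_mean and the seed value are assumptions made for illustration. The estimate is the sample mean \(\bar{y}\) of \(M\) simulated rolls, following the approximation in the definition of expectation.

```r
set.seed(42)  # arbitrary seed, only for reproducibility

# Estimate E[Y_D6] by the sample mean of M simulated rolls.
# (estimate_d6_mean is a hypothetical helper, not a built-in R function.)
estimate_d6_mean <- function(M) {
  rolls <- sample(1:6, M, replace = TRUE)
  sum(rolls) / M  # same value as mean(rolls)
}

# One estimate per sample size
sample_sizes <- c(100, 1000, 10000)
estimates <- sapply(sample_sizes, estimate_d6_mean)
names(estimates) <- sample_sizes
print(estimates)
```

The estimates vary from run to run when the seed is changed, but they should tend to fall closer to the exact value of 3.5 as the number of draws increases.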
The normal (Gaussian) distribution is a continuous probability distribution with two parameters: the mean and the standard deviation.
Plots of the probability density function show a symmetric “bell curve” centered on the mean; the standard deviation determines its width and, inversely, its height.
The R function rnorm generates a sequence of random draws from a normal distribution with a specified mean and standard deviation.
Take one draw from a normal distribution with mean 5, standard deviation 10:
rnorm(1, mean=5, sd=10)
[1] 12.07875
Take 100 draws:
rnorm(100, mean=5, sd=10)
[1] 14.8462947 8.0735792 2.4734157 7.6774951 -0.2073813 4.9606262 3.6510407 12.8835495
[9] 5.6449032 22.8583381 2.7765726 5.0189006 -10.6783155 -0.3498541 1.8920707 23.4656587
[17] 13.0086797 1.6940446 7.6245823 0.1356268 3.6677676 -12.4893873 15.6051132 0.8198043
[25] 29.8099277 1.5086770 17.8460684 -7.0311548 -11.9051549 17.6974229 4.9505833 -6.8014745
[33] -4.2759833 -1.5289839 -1.2261375 5.1040920 27.4643456 -4.3950494 -5.1456122 -3.7409858
[41] 21.0609330 -0.3272468 0.5560233 10.1796741 12.7950104 22.9025000 1.2499491 -4.7077668
[49] 8.9663284 -14.3782372 -16.8933651 6.3258832 18.7200482 7.1013480 -7.2355715 13.1030800
[57] -2.0957482 4.3703866 1.8779873 21.4730634 6.4950266 -13.1762963 13.1514351 9.0616678
[65] -12.8618908 3.8085431 1.0981333 6.9481642 16.9777193 14.3057281 16.1671721 1.4471080
[73] 2.5653629 17.6422668 5.1011498 17.5863671 9.5020089 17.9404970 18.2373002 5.0611401
[81] 3.2828250 35.6020802 3.0981154 1.7111767 -5.2415036 19.8766368 9.5519978 1.6051318
[89] 1.6079017 -6.4268732 15.4876133 19.2577546 15.7550100 15.0751943 4.1478186 10.5754150
[97] 8.7662413 -6.8411017 16.2789227 -4.3084518
You can use R’s hist function to plot the sample distribution:
hist(rnorm(100, mean=5, sd=10), xlim=c(-35, 45))
Does this look like a bell curve?
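To make the comparison with a bell curve easier, the histogram can be drawn on the density scale and the theoretical density overlaid on it. This is a minimal sketch; the sample size of 1000, the seed, the number of breaks, and the plot labels are all arbitrary choices made for illustration.

```r
set.seed(7)  # arbitrary seed, only for reproducibility
draws <- rnorm(1000, mean = 5, sd = 10)

# freq = FALSE puts the histogram on the density scale,
# so that it is comparable with the probability density function.
hist(draws, freq = FALSE, breaks = 30,
     main = "1000 draws from a normal(5, 10)", xlab = "y")

# Overlay the theoretical density; curve() evaluates the expression over x.
curve(dnorm(x, mean = 5, sd = 10), add = TRUE, lwd = 2)
```

With only 100 draws the histogram is usually quite ragged; increasing the number of draws tends to bring its shape closer to the overlaid curve.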
Exercise 1.2: Estimate the expectation of the continuous RV \(Y\), which has a normal distribution with mean 5 and standard deviation 10, using samples of 100, 1000, and 10,000 draws, respectively.
The variance of a Random Variable is the expectation of the squared difference between the Random Variable and its mean:
\[ \mathrm{var}[Y] = \mathbb{E}[ { \left( Y - \mathbb{E}[Y] \right) }^2 ] \]
For a discrete Random Variable \(Y\), the variance \(\mathrm{var}[Y]\) is: \[ \mathrm{var}[Y] = \sum_y {\left( y - \mathbb{E}[Y] \right)}^2 p_Y (y) \approx \frac{1}{M} \sum_{m=1}^M {\left( y^{(m)} -\bar{y} \right) }^2 \] where each draw \(y^{(m)}\), for \(m\) from \(1\) through \(M\), is distributed according to \(p_Y (y)\) and \(\bar{y}\) is the sample mean.
For a continuous Random Variable \(Y\), the variance \(\mathrm{var}[Y]\) is: \[ \mathrm{var}[Y] = \int_y {\left( y - \mathbb{E}[Y] \right)}^2 p_Y (y) \, dy \approx \frac{1}{M} \sum_{m=1}^M {\left( y^{(m)} -\bar{y} \right) }^2 \] where each draw \(y^{(m)}\), for \(m\) from \(1\) through \(M\), is distributed according to \(p_Y (y)\) and \(\bar{y}\) is the sample mean.
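The sketch below first evaluates the discrete variance formula exactly for one roll of a fair six-sided die, then shows a sample-based estimate with the \(1/M\) divisor used in the formulas above. The helper name variance_1_over_m and the small example vector are assumptions made for illustration. One point to be aware of: R's built-in var function divides by \(M - 1\) rather than \(M\), so it does not reproduce the \(1/M\) formula exactly.

```r
# Exact variance of one roll of a fair six-sided die
y <- 1:6
p <- rep(1/6, length(y))
mean_d6 <- sum(y * p)                 # E[Y] = 3.5
var_d6 <- sum((y - mean_d6)^2 * p)    # 35/12, about 2.917
print(var_d6)

# Sample-based estimate with the 1/M divisor.
# (variance_1_over_m is a hypothetical helper, not a built-in R function.)
variance_1_over_m <- function(draws) {
  sum((draws - mean(draws))^2) / length(draws)
}

# R's built-in var() divides by M - 1 instead of M, so the two differ
# by a factor of (M - 1) / M, which approaches 1 as M grows.
draws <- c(2, 4, 4, 6)
print(variance_1_over_m(draws))  # 2
print(var(draws))                # 8/3, about 2.667
```

For the sample sizes used in the exercises the difference between the two divisors is small, but it is visible on short vectors like the one above.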
Exercise 1.3: Estimate the variance of the discrete RV \(Y_{D6}\), one roll of a fair six-sided die, using samples of 100, 1000, and 10,000 draws, respectively.
Exercise 1.4: Estimate the variance of the continuous RV \(Y\), which has a normal distribution with mean 5 and standard deviation 10, using samples of 100, 1000, and 10,000 draws, respectively.