Getting Started

Stan program hello.stan

In exercise 1.2, we created a data generating process which produces draws from a normally-distributed RV with mean 5.0 and standard deviation 10.0. The Stan model which fits this data is:

data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  y ~ normal(mu, sigma);
}

Two things to note:

RStan function stan

The RStan function stan fits a Stan model and returns the fitted result as a stanfit object.

Exercise 2.1 Hello World in Stan

  1. Create the file “hello.stan” in your current R working directory and copy the above program into it.

  2. In your R environment, create variables “N” and “y”, where N is the number of draws, and y is a vector of draw from a normally-distributed RV with mean 5.0 and standard deviation 10.0:

    N = 100
    y = rnorm(N, mean=5, sd=10)
  3. Run the stan command to fit the model to the data (variables “N” and “y”) and save the resulting stanfit object:

    stanfit = stan("hello.stan",data=c("N","y"))
  4. Print a summary of the stanfit object:

    print(stanfit)

    First, the summary reports the number of draws from the posterior saved into the stanfit object:

    Inference for Stan model: hello.
    4 chains, each with iter=2000; warmup=1000; thin=1; 
    post-warmup draws per chain=1000, total post-warmup draws=4000.

    Next, it reports on all the parameters and any top-level variables declared in the transformed parameters and generated quantities blocks:

         mean se_mean   sd    2.5%     25%     50%     75%   97.5% n_eff Rhat
    mu       5.48    0.02 1.09    3.24    4.78    5.50    6.21    7.58  2953    1
    sigma   10.76    0.01 0.77    9.43   10.23   10.71   11.22   12.42  3656    1
    lp__  -284.55    0.02 1.02 -287.21 -284.94 -284.24 -283.82 -283.57  1905    1

    Lastly, it reminds you what n_eff and Rhat mean:

    Samples were drawn using NUTS(diag_e) at Mon Oct 16 23:20:08 2017.
    For each parameter, n_eff is a crude measure of effective sample size,
    and Rhat is the potential scale reduction factor on split chains (at
    convergence, Rhat=1).

Exercise 2.2

In an R session:

Does Stan recover the mean and standard deviation?

How well does Stan recover the mean and standard deviation? How do the reported values for n_eff and Rhat change?