In this tutorial we do the following:
- Generate some fake data to play with
- Write code for a simple Stan model
- Fit the model using RStan
- Use ShinyStan for graphical posterior predictive checks
The model is a vanilla linear regression in its classical formulation: \(Y = \alpha + \beta X + \epsilon\).
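Equivalently, in the distributional notation used by the Stan program below (the priors shown are the ones we will place on the parameters), each observation is drawn as

\[
y_i \sim \mathrm{Normal}(\alpha + \beta x_i,\ \sigma), \qquad i = 1,\dots,N,
\]

with priors \(\alpha \sim \mathrm{Normal}(0, 5)\), \(\beta \sim \mathrm{Normal}(0, 5)\), and \(\sigma \sim \mathrm{Cauchy}^{+}(0, 2.5)\).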
Generate fake data (in R):
```r
N <- 100         # number of observations
alpha <- 2       # intercept
beta <- 0.5      # covariate coefficient
sigma <- 1       # residual standard deviation
x <- rnorm(N, 0, 1)
y <- rnorm(N, mean = alpha + x * beta, sd = sigma)
```
Plot fake data:
```r
# visual check
plot(x, y)
abline(lsfit(x, y), col = "blue")
```
Write the Stan program, `lm.stan`. It is the usual linear regression model:
```stan
data {
  int N;         // number of observations
  vector[N] x;   // single covariate
  vector[N] y;   // vector of N observations
}
parameters {
  real<lower=0> sigma;
  real alpha;
  real beta;
}
model {
  y ~ normal(alpha + x * beta, sigma);  // vectorized likelihood
  alpha ~ normal(0, 5);
  beta ~ normal(0, 5);
  sigma ~ cauchy(0, 2.5);
}
```
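For readers new to Stan, the vectorized sampling statement above is shorthand for looping over the observations. A sketch of the equivalent (but slower) non-vectorized form, shown only for illustration:

```stan
model {
  // equivalent non-vectorized likelihood (illustration only)
  for (n in 1:N)
    y[n] ~ normal(alpha + x[n] * beta, sigma);
  alpha ~ normal(0, 5);
  beta ~ normal(0, 5);
  sigma ~ cauchy(0, 2.5);
}
```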
For graphical posterior predictive checks we also need replications of the data. Add a `generated quantities` block that, for each posterior draw, simulates a replicated data set `y_rep` of the same length as `y`, using the values of `alpha`, `beta`, and `sigma` for that draw:

```stan
generated quantities {
  vector[N] y_rep;   // vector of same length as the data y
  for (n in 1:N)
    y_rep[n] = normal_rng(alpha + x[n] * beta, sigma);
}
```
Fit `lm.stan` with RStan, using RStan's `stan` function:

- The default sampler is NUTS.
- The default number of chains is 4.
- We specify 500 iterations per chain. By default, half of them are warmup, giving 250 post-warmup draws per chain and a total of 1000 post-warmup draws.
```r
library(rstan)
library(shinystan)

# Data must be passed as a named list matching the Stan data block
lm_fit <- stan(file = "lm.stan",
               data = list(N = N, x = x, y = y),
               iter = 500)

launch_shinystan(lm_fit)
```
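Before exploring the fit in ShinyStan, it can help to glance at the parameter summary and sketch a quick posterior predictive overlay by hand. The code below is a minimal sketch using the object and parameter names from the model above; ShinyStan provides the same kind of graphical checks interactively.

```r
# Quick summary / convergence check (Rhat, effective sample size)
print(lm_fit, pars = c("alpha", "beta", "sigma"))

# Manual graphical posterior predictive check:
# overlay densities of a few replicated data sets on the observed data
y_rep <- rstan::extract(lm_fit, pars = "y_rep")$y_rep  # draws x N matrix
plot(density(y), lwd = 2, main = "Posterior predictive check")
for (i in sample(nrow(y_rep), 20))
  lines(density(y_rep[i, ]), col = rgb(0, 0, 1, 0.2))
```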