We should know this by now, right? \(y \sim \mathcal{N}(X\beta + \alpha, \sigma)\)
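For concreteness, a toy data set for this model can be simulated in R along these lines. The sizes N = 5 and K = 3 and the parameter values are assumptions for illustration (chosen to match the shapes that appear in the fits below), not the exact script that produced the data:

```r
set.seed(1234)
N <- 5                              # number of observations (assumed)
K <- 3                              # number of predictors (assumed)
X <- matrix(rnorm(N * K), N, K)     # design matrix
alpha <- -0.11                      # illustrative "true" parameters
beta  <- c(-0.51, -0.91, -0.84)
sigma <- 4
# linear predictor plus Gaussian noise, exactly the model above
y <- as.vector(X %*% beta + alpha + rnorm(N, sd = sigma))
```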
data {
  int N;
  int K;
  matrix[N, K] X;
  real y[N];
}
parameters {
  real alpha;
  vector[K] beta;
  real<lower=0> sigma;
}
model {
  y ~ normal(X * beta + alpha, sigma);
}
Warning: There were 88 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help. See http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
Warning: There were 1318 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See http://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded
Warning: There were 4 chains where the estimated Bayesian Fraction of Missing Information was low. See http://mc-stan.org/misc/warnings.html#bfmi-low
Warning: Examine the pairs() plot to diagnose sampling problems
Inference for Stan model: model.
4 chains, each with iter=2000; warmup=1000; thin=1;
post-warmup draws per chain=1000, total post-warmup draws=4000.

                mean      se_mean           sd n_eff Rhat
alpha   -431333212.7 9.838594e+08 6.353719e+09    42 1.12
beta[1]  -12507614.2 1.919952e+07 2.422585e+08   159 1.03
beta[2]  -42691574.9 9.176587e+07 6.523436e+08    51 1.09
beta[3]  -12206733.3 1.800335e+07 2.702798e+08   225 1.01
sigma   3006709155.6 1.278141e+09 8.207271e+09    41 1.10
lp__           -76.8 4.740000e+00 1.219000e+01     7 1.41

Samples were drawn using NUTS(diag_e) at Tue Dec 5 22:40:25 2017.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
data {
  int N;
  int K;
  matrix[N, K] X;
  real y[N];
}
parameters {
  real alpha;
  vector[K] beta;
  real<lower=0> sigma;
}
model {
  y ~ normal(X * beta + alpha, sigma);
  sigma ~ normal(0, 20);
  alpha ~ normal(0, 20);
  beta ~ normal(0, 20);
}
Warning: There were 69 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help. See http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
Warning: Examine the pairs() plot to diagnose sampling problems
Inference for Stan model: model2.
4 chains, each with iter=2000; warmup=1000; thin=1;
post-warmup draws per chain=1000, total post-warmup draws=4000.

         mean se_mean   sd   2.5%    25%    50%   75% 97.5% n_eff Rhat
alpha    4.36    0.26 8.31 -14.61   0.49   4.97  8.49 20.59  1006 1.01
beta[1] -0.52    0.01 0.52  -1.65  -0.74  -0.51 -0.30  0.57  1367 1.00
beta[2]  0.91    0.04 1.30  -2.16   0.36   1.00  1.54  3.45  1033 1.00
beta[3]  0.25    0.03 0.92  -1.81  -0.14   0.29  0.66  2.08  1293 1.00
sigma   10.91    0.43 7.75   1.87   5.30   8.90 14.23 31.70   319 1.00
lp__   -11.27    0.17 2.92 -17.79 -13.09 -10.99 -9.16 -6.47   288 1.01

Samples were drawn using NUTS(diag_e) at Tue Dec 5 22:40:31 2017.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
pairs(fit2, pars = c("alpha", "beta[1]", "sigma"))
fit2.1 = sampling(model2, list(y = y, N = N, X = X), control = list(adapt_delta = 0.99), seed = 1234)
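Whether the higher adapt_delta actually got rid of the divergences can be checked directly from the sampler diagnostics rather than by rereading the warnings. A sketch using rstan's get_sampler_params on the fit2.1 object above:

```r
# one matrix of sampler diagnostics per chain, post-warmup only
sp <- get_sampler_params(fit2.1, inc_warmup = FALSE)
# total number of divergent transitions across all chains
sum(sapply(sp, function(x) sum(x[, "divergent__"])))
```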
Inference for Stan model: model2.
4 chains, each with iter=2000; warmup=1000; thin=1;
post-warmup draws per chain=1000, total post-warmup draws=4000.

         mean se_mean   sd   2.5%    25%    50%   75% 97.5% n_eff Rhat
alpha    4.57    0.26 8.33 -13.66   0.82   4.85  8.48 21.86  1052 1.01
beta[1] -0.51    0.01 0.51  -1.57  -0.73  -0.51 -0.30  0.50  1224 1.01
beta[2]  0.92    0.04 1.33  -2.18   0.34   0.98  1.52  3.56  1007 1.00
beta[3]  0.25    0.03 0.87  -1.59  -0.11   0.25  0.63  2.15  1188 1.01
sigma   10.48    0.35 7.50   2.44   5.14   8.18 13.69 30.37   466 1.00
lp__   -11.02    0.15 2.92 -17.52 -12.90 -10.70 -8.81 -6.44   379 1.00

Samples were drawn using NUTS(diag_e) at Tue Dec 5 22:40:36 2017.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
pairs(fit2.1, pars = c("alpha", "beta[1]", "sigma"))
data {
  int N;
  int K;
  matrix[N, K] X;
  real y[N];
}
parameters {
  real alpha;
  vector[K] beta;
  real<lower=0> sigma;
}
model {
  y ~ normal(X * beta + alpha, sigma);
  sigma ~ normal(0, 2);
  alpha ~ normal(0, 5);
  beta ~ normal(0, 5);
}
fit3 = sampling(model3, list(y = y, N = N, X = X), seed = 1234)
Inference for Stan model: model3.
4 chains, each with iter=2000; warmup=1000; thin=1;
post-warmup draws per chain=1000, total post-warmup draws=4000.

         mean se_mean   sd   2.5%    25%   50%   75% 97.5% n_eff Rhat
alpha    4.45    0.06 2.32  -0.38   2.98  4.56  5.96  8.72  1404 1.00
beta[1] -0.52    0.00 0.13  -0.80  -0.60 -0.52 -0.44 -0.27  1717 1.00
beta[2]  0.94    0.01 0.35   0.21   0.74  0.96  1.16  1.62  1432 1.00
beta[3]  0.26    0.01 0.24  -0.25   0.11  0.27  0.41  0.73  1721 1.00
sigma    3.14    0.03 0.97   1.64   2.41  3.00  3.72  5.40  1304 1.00
lp__    -9.67    0.06 1.90 -14.33 -10.69 -9.29 -8.28 -7.01   949 1.01

Samples were drawn using NUTS(diag_e) at Tue Dec 5 22:40:41 2017.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
pairs(fit3, pars = c("alpha", "beta[1]", "sigma"))
model {
  y ~ normal(X * beta + alpha, sigma);
  sigma ~ normal(0, 2);
  alpha ~ normal(0, 5);
  beta ~ normal(0, 5);
}
generated quantities {
  vector[N] y_ppc;
  for (n in 1:N)
    y_ppc[n] = normal_rng(X[n, ] * beta + alpha, sigma);
}
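Outside of shinystan, the posterior predictive draws can also be pulled out of the fit directly. A sketch, assuming the fitted object for this model is stored in fit3ppc as below:

```r
# extract the draws-by-N matrix of posterior predictive replicates
y_ppc <- rstan::extract(fit3ppc, pars = "y_ppc")$y_ppc
# 95% posterior predictive interval per observation, to compare against y
ppc_interval <- apply(y_ppc, 2, quantile, probs = c(0.025, 0.975))
```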
library(shinystan)
launch_shinystan(fit3ppc)
alpha
[1] -0.1102855
beta
[1] -0.5110095 -0.9111954 -0.8371717
sigma
[1] 4
Inference for Stan model: model3.
4 chains, each with iter=2000; warmup=1000; thin=1;
post-warmup draws per chain=1000, total post-warmup draws=4000.

         mean se_mean   sd   2.5%    25%   50%   75% 97.5% n_eff Rhat
alpha    4.45    0.06 2.32  -0.38   2.98  4.56  5.96  8.72  1404 1.00
beta[1] -0.52    0.00 0.13  -0.80  -0.60 -0.52 -0.44 -0.27  1717 1.00
beta[2]  0.94    0.01 0.35   0.21   0.74  0.96  1.16  1.62  1432 1.00
beta[3]  0.26    0.01 0.24  -0.25   0.11  0.27  0.41  0.73  1721 1.00
sigma    3.14    0.03 0.97   1.64   2.41  3.00  3.72  5.40  1304 1.00
lp__    -9.67    0.06 1.90 -14.33 -10.69 -9.29 -8.28 -7.01   949 1.01

Samples were drawn using NUTS(diag_e) at Tue Dec 5 22:40:41 2017.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
readLines("dist/y.txt")
[1] "15.72128 -2.101855 -11.15166 10.11426 -2.182092"
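Note that readLines returns the file contents as character strings. scan() parses whitespace-separated numbers straight into a numeric vector; shown here on the same contents via the text argument (with the file on disk this would be scan("dist/y.txt")):

```r
# scan() splits on whitespace and coerces each token to numeric
y <- scan(text = "15.72128 -2.101855 -11.15166 10.11426 -2.182092")
length(y)   # 5 observations
```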