Stan Development Team
CmdStanR: the R interface to CmdStan.
CmdStanR (cmdstanr package) is an interface to Stan (mc-stan.org) for R users. It provides the necessary objects and functions to compile a Stan program and run Stan's algorithms from R via CmdStan, the shell interface to Stan (mc-stan.org/users/interfaces/cmdstan).
The RStan interface (rstan package) is an in-memory interface to Stan and relies on R packages like Rcpp and inline to call C++ code from R. On the other hand, the CmdStanR interface does not directly call any C++ code from R, instead relying on the CmdStan interface behind the scenes for compilation, running algorithms, and writing results to output files.
Allows other developers to distribute R packages with pre-compiled Stan programs (like rstanarm) on CRAN. (Note: As of 2023, this can mostly be achieved with CmdStanR as well. See Developing using CmdStanR.)
Avoids use of R6 classes, which may result in more familiar syntax for many R users.
CRAN binaries available for Mac and Windows.
Compatible with latest versions of Stan. Keeping up with Stan releases
is complicated for RStan, often requiring non-trivial changes to the
rstan package and new CRAN releases of both rstan and
StanHeaders. With CmdStanR the latest improvements in Stan will be
available from R immediately after updating CmdStan using
cmdstanr::install_cmdstan()
.
Running Stan via external processes results in fewer unexpected crashes, especially in RStudio.
Less memory overhead.
More permissive license. RStan uses the GPL-3 license while the license for CmdStanR is BSD-3, which is a bit more permissive and is the same license used for CmdStan and the Stan C++ source code.
CmdStanR requires a working version of CmdStan. If
you already have CmdStan installed see cmdstan_model()
to get started,
otherwise see install_cmdstan()
to install CmdStan. The vignette
Getting started with CmdStanR
demonstrates the basic functionality of the package.
The CmdStanR website (mc-stan.org/cmdstanr) for online documentation and tutorials.
The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
# \dontrun{
library(cmdstanr)
library(posterior)
library(bayesplot)
color_scheme_set("brightblue")
# Set path to CmdStan
# (Note: if you installed CmdStan via install_cmdstan() with default settings
# then setting the path is unnecessary but the default below should still work.
# Otherwise use the `path` argument to specify the location of your
# CmdStan installation.)
set_cmdstan_path(path = NULL)
#> CmdStan path set to: /Users/jgabry/.cmdstan/cmdstan-2.33.1
# Create a CmdStanModel object from a Stan program,
# here using the example model that comes with CmdStan
file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.stan")
mod <- cmdstan_model(file)
mod$print()
#> data {
#> int<lower=0> N;
#> array[N] int<lower=0,upper=1> y;
#> }
#> parameters {
#> real<lower=0,upper=1> theta;
#> }
#> model {
#> theta ~ beta(1,1); // uniform prior on interval 0,1
#> y ~ bernoulli(theta);
#> }
# Data as a named list (like RStan)
stan_data <- list(N = 10, y = c(0,1,0,0,0,0,0,0,0,1))
# Run MCMC using the 'sample' method
fit_mcmc <- mod$sample(
data = stan_data,
seed = 123,
chains = 2,
parallel_chains = 2
)
#> Running MCMC with 2 parallel chains...
#>
#> Chain 1 Iteration: 1 / 2000 [ 0%] (Warmup)
#> Chain 1 Iteration: 100 / 2000 [ 5%] (Warmup)
#> Chain 1 Iteration: 200 / 2000 [ 10%] (Warmup)
#> Chain 1 Iteration: 300 / 2000 [ 15%] (Warmup)
#> Chain 1 Iteration: 400 / 2000 [ 20%] (Warmup)
#> Chain 1 Iteration: 500 / 2000 [ 25%] (Warmup)
#> Chain 1 Iteration: 600 / 2000 [ 30%] (Warmup)
#> Chain 1 Iteration: 700 / 2000 [ 35%] (Warmup)
#> Chain 1 Iteration: 800 / 2000 [ 40%] (Warmup)
#> Chain 1 Iteration: 900 / 2000 [ 45%] (Warmup)
#> Chain 1 Iteration: 1000 / 2000 [ 50%] (Warmup)
#> Chain 1 Iteration: 1001 / 2000 [ 50%] (Sampling)
#> Chain 1 Iteration: 1100 / 2000 [ 55%] (Sampling)
#> Chain 1 Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 1 Iteration: 1300 / 2000 [ 65%] (Sampling)
#> Chain 1 Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 1 Iteration: 1500 / 2000 [ 75%] (Sampling)
#> Chain 1 Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 1 Iteration: 1700 / 2000 [ 85%] (Sampling)
#> Chain 1 Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 1 Iteration: 1900 / 2000 [ 95%] (Sampling)
#> Chain 1 Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 2 Iteration: 1 / 2000 [ 0%] (Warmup)
#> Chain 2 Iteration: 100 / 2000 [ 5%] (Warmup)
#> Chain 2 Iteration: 200 / 2000 [ 10%] (Warmup)
#> Chain 2 Iteration: 300 / 2000 [ 15%] (Warmup)
#> Chain 2 Iteration: 400 / 2000 [ 20%] (Warmup)
#> Chain 2 Iteration: 500 / 2000 [ 25%] (Warmup)
#> Chain 2 Iteration: 600 / 2000 [ 30%] (Warmup)
#> Chain 2 Iteration: 700 / 2000 [ 35%] (Warmup)
#> Chain 2 Iteration: 800 / 2000 [ 40%] (Warmup)
#> Chain 2 Iteration: 900 / 2000 [ 45%] (Warmup)
#> Chain 2 Iteration: 1000 / 2000 [ 50%] (Warmup)
#> Chain 2 Iteration: 1001 / 2000 [ 50%] (Sampling)
#> Chain 2 Iteration: 1100 / 2000 [ 55%] (Sampling)
#> Chain 2 Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 2 Iteration: 1300 / 2000 [ 65%] (Sampling)
#> Chain 2 Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 2 Iteration: 1500 / 2000 [ 75%] (Sampling)
#> Chain 2 Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 2 Iteration: 1700 / 2000 [ 85%] (Sampling)
#> Chain 2 Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 2 Iteration: 1900 / 2000 [ 95%] (Sampling)
#> Chain 2 Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 1 finished in 0.0 seconds.
#> Chain 2 finished in 0.0 seconds.
#>
#> Both chains finished successfully.
#> Mean chain execution time: 0.0 seconds.
#> Total execution time: 0.2 seconds.
#>
# Use 'posterior' package for summaries
fit_mcmc$summary()
#> # A tibble: 2 × 10
#> variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 lp__ -7.30 -7.03 0.721 0.380 -8.82 -6.75 1.00 902. 1006.
#> 2 theta 0.247 0.233 0.122 0.129 0.0786 0.470 1.00 762. 712.
# Check sampling diagnostics
fit_mcmc$diagnostic_summary()
#> $num_divergent
#> [1] 0 0
#>
#> $num_max_treedepth
#> [1] 0 0
#>
#> $ebfmi
#> [1] 1.017555 1.250490
#>
# Get posterior draws
draws <- fit_mcmc$draws()
print(draws)
#> # A draws_array: 1000 iterations, 2 chains, and 2 variables
#> , , variable = lp__
#>
#> chain
#> iteration 1 2
#> 1 -6.8 -6.8
#> 2 -6.9 -6.8
#> 3 -7.0 -7.0
#> 4 -6.9 -7.1
#> 5 -6.7 -7.0
#>
#> , , variable = theta
#>
#> chain
#> iteration 1 2
#> 1 0.28 0.21
#> 2 0.19 0.20
#> 3 0.16 0.17
#> 4 0.20 0.36
#> 5 0.25 0.34
#>
#> # ... with 995 more iterations
# Convert to data frame using posterior::as_draws_df
as_draws_df(draws)
#> # A draws_df: 1000 iterations, 2 chains, and 2 variables
#> lp__ theta
#> 1 -6.8 0.28
#> 2 -6.9 0.19
#> 3 -7.0 0.16
#> 4 -6.9 0.20
#> 5 -6.7 0.25
#> 6 -7.1 0.36
#> 7 -9.0 0.55
#> 8 -7.2 0.15
#> 9 -6.8 0.23
#> 10 -7.5 0.42
#> # ... with 1990 more draws
#> # ... hidden reserved variables {'.chain', '.iteration', '.draw'}
# Plot posterior using bayesplot (ggplot2)
mcmc_hist(fit_mcmc$draws("theta"))
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
# For models fit using MCMC, if you like working with RStan's stanfit objects
# then you can create one with rstan::read_stan_csv()
# stanfit <- rstan::read_stan_csv(fit_mcmc$output_files())
# Run 'optimize' method to get a point estimate (default is Stan's LBFGS algorithm)
# and also demonstrate specifying data as a path to a file instead of a list
my_data_file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.data.json")
fit_optim <- mod$optimize(data = my_data_file, seed = 123)
#> Initial log joint probability = -9.51104
#> Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
#> 6 -5.00402 0.000103557 2.55661e-07 1 1 9
#> Optimization terminated normally:
#> Convergence detected: relative gradient magnitude is below tolerance
#> Finished in 0.1 seconds.
fit_optim$summary()
#> # A tibble: 2 × 2
#> variable estimate
#> <chr> <dbl>
#> 1 lp__ -5.00
#> 2 theta 0.2
# Run 'optimize' again with 'jacobian=TRUE' and then draw from Laplace approximation
# to the posterior
fit_optim <- mod$optimize(data = my_data_file, jacobian = TRUE)
#> Initial log joint probability = -18.9282
#> Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
#> 5 -6.74802 0.00032939 4.62604e-06 1 1 8
#> Optimization terminated normally:
#> Convergence detected: relative gradient magnitude is below tolerance
#> Finished in 0.1 seconds.
fit_laplace <- mod$laplace(data = my_data_file, mode = fit_optim, draws = 2000)
#> Calculating Hessian
#> Calculating inverse of Cholesky factor
#> Generating draws
#> iteration: 0
#> iteration: 100
#> iteration: 200
#> iteration: 300
#> iteration: 400
#> iteration: 500
#> iteration: 600
#> iteration: 700
#> iteration: 800
#> iteration: 900
#> iteration: 1000
#> iteration: 1100
#> iteration: 1200
#> iteration: 1300
#> iteration: 1400
#> iteration: 1500
#> iteration: 1600
#> iteration: 1700
#> iteration: 1800
#> iteration: 1900
#> Finished in 0.1 seconds.
fit_laplace$summary()
#> # A tibble: 3 × 7
#> variable mean median sd mad q5 q95
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 lp__ -7.23 -6.97 0.683 0.304 -8.57 -6.75
#> 2 lp_approx__ -0.496 -0.225 0.743 0.305 -1.91 -0.00217
#> 3 theta 0.264 0.247 0.120 0.119 0.0967 0.492
# Run 'variational' method to use ADVI to approximate posterior
fit_vb <- mod$variational(data = stan_data, seed = 123)
#> ------------------------------------------------------------
#> EXPERIMENTAL ALGORITHM:
#> This procedure has not been thoroughly tested and may be unstable
#> or buggy. The interface is subject to change.
#> ------------------------------------------------------------
#> Gradient evaluation took 6e-06 seconds
#> 1000 transitions using 10 leapfrog steps per transition would take 0.06 seconds.
#> Adjust your expectations accordingly!
#> Begin eta adaptation.
#> Iteration: 1 / 250 [ 0%] (Adaptation)
#> Iteration: 50 / 250 [ 20%] (Adaptation)
#> Iteration: 100 / 250 [ 40%] (Adaptation)
#> Iteration: 150 / 250 [ 60%] (Adaptation)
#> Iteration: 200 / 250 [ 80%] (Adaptation)
#> Success! Found best value [eta = 1] earlier than expected.
#> Begin stochastic gradient ascent.
#> iter ELBO delta_ELBO_mean delta_ELBO_med notes
#> 100 -6.262 1.000 1.000
#> 200 -6.263 0.500 1.000
#> 300 -6.307 0.336 0.007 MEDIAN ELBO CONVERGED
#> Drawing a sample of size 1000 from the approximate posterior...
#> COMPLETED.
#> Finished in 0.1 seconds.
fit_vb$summary()
#> # A tibble: 3 × 7
#> variable mean median sd mad q5 q95
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 lp__ -7.18 -6.94 0.588 0.259 -8.36 -6.75
#> 2 lp_approx__ -0.515 -0.221 0.692 0.303 -2.06 -0.00257
#> 3 theta 0.263 0.246 0.115 0.113 0.106 0.481
mcmc_hist(fit_vb$draws("theta"))
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
# Run 'pathfinder' method, a new alternative to the variational method
fit_pf <- mod$pathfinder(data = stan_data, seed = 123)
#> Path [1] :Initial log joint density = -11.008832
#> Path [1] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
#> 5 -6.748e+00 9.383e-04 1.391e-05 1.000e+00 1.000e+00 126 -6.264e+00 -6.264e+00
#> Path [1] :Best Iter: [3] ELBO (-6.195408) evaluations: (126)
#> Path [2] :Initial log joint density = -7.318450
#> Path [2] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
#> 4 -6.748e+00 5.414e-03 1.618e-04 1.000e+00 1.000e+00 101 -6.251e+00 -6.251e+00
#> Path [2] :Best Iter: [3] ELBO (-6.229174) evaluations: (101)
#> Path [3] :Initial log joint density = -12.374612
#> Path [3] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
#> 5 -6.748e+00 1.419e-03 2.837e-05 1.000e+00 1.000e+00 126 -6.199e+00 -6.199e+00
#> Path [3] :Best Iter: [5] ELBO (-6.199185) evaluations: (126)
#> Path [4] :Initial log joint density = -13.009824
#> Path [4] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
#> 5 -6.748e+00 1.677e-03 3.885e-05 1.000e+00 1.000e+00 126 -6.173e+00 -6.173e+00
#> Path [4] :Best Iter: [5] ELBO (-6.172860) evaluations: (126)
#> Total log probability function evaluations:4379
#> Finished in 0.1 seconds.
fit_pf$summary()
#> # A tibble: 3 × 7
#> variable mean median sd mad q5 q95
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 lp_approx__ -1.08 -0.728 0.886 0.304 -2.71 -0.511
#> 2 lp__ -7.26 -6.96 0.738 0.297 -8.72 -6.75
#> 3 theta 0.249 0.230 0.120 0.121 0.0854 0.471
mcmc_hist(fit_pf$draws("theta"))
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
# Run 'pathfinder' again with more paths, fewer draws per path,
# better covariance approximation, and fewer LBFGSs iterations
fit_pf <- mod$pathfinder(data = stan_data, num_paths=10, single_path_draws=40,
history_size=50, max_lbfgs_iters=100)
#> Path [1] :Initial log joint density = -12.637372
#> Path [1] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
#> 5 -6.748e+00 1.531e-03 3.268e-05 1.000e+00 1.000e+00 126 -6.214e+00 -6.214e+00
#> Path [1] :Best Iter: [5] ELBO (-6.213940) evaluations: (126)
#> Path [2] :Initial log joint density = -11.196746
#> Path [2] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
#> 5 -6.748e+00 9.877e-04 1.507e-05 1.000e+00 1.000e+00 126 -6.283e+00 -6.283e+00
#> Path [2] :Best Iter: [4] ELBO (-6.223987) evaluations: (126)
#> Path [3] :Initial log joint density = -10.337120
#> Path [3] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
#> 5 -6.748e+00 7.410e-04 9.634e-06 1.000e+00 1.000e+00 126 -6.236e+00 -6.236e+00
#> Path [3] :Best Iter: [2] ELBO (-6.179583) evaluations: (126)
#> Path [4] :Initial log joint density = -12.005851
#> Path [4] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
#> 5 -6.748e+00 1.253e-03 2.257e-05 1.000e+00 1.000e+00 126 -6.159e+00 -6.159e+00
#> Path [4] :Best Iter: [5] ELBO (-6.159209) evaluations: (126)
#> Path [5] :Initial log joint density = -11.377707
#> Path [5] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
#> 5 -6.748e+00 1.032e-03 1.614e-05 1.000e+00 1.000e+00 126 -6.206e+00 -6.206e+00
#> Path [5] :Best Iter: [2] ELBO (-6.202552) evaluations: (126)
#> Path [6] :Initial log joint density = -7.049866
#> Path [6] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
#> 4 -6.748e+00 2.694e-03 5.075e-05 1.000e+00 1.000e+00 101 -6.203e+00 -6.203e+00
#> Path [6] :Best Iter: [4] ELBO (-6.203218) evaluations: (101)
#> Path [7] :Initial log joint density = -9.219130
#> Path [7] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
#> 5 -6.748e+00 3.841e-04 3.455e-06 1.000e+00 1.000e+00 126 -6.185e+00 -6.185e+00
#> Path [7] :Best Iter: [5] ELBO (-6.185156) evaluations: (126)
#> Path [8] :Initial log joint density = -17.192426
#> Path [8] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
#> 5 -6.748e+00 1.354e-03 3.651e-05 1.000e+00 1.000e+00 126 -6.204e+00 -6.204e+00
#> Path [8] :Best Iter: [4] ELBO (-6.168586) evaluations: (126)
#> Path [9] :Initial log joint density = -9.491732
#> Path [9] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
#> 5 -6.748e+00 4.697e-04 4.731e-06 1.000e+00 1.000e+00 126 -6.247e+00 -6.247e+00
#> Path [9] :Best Iter: [3] ELBO (-6.220590) evaluations: (126)
#> Path [10] :Initial log joint density = -6.750036
#> Path [10] : Iter log prob ||dx|| ||grad|| alpha alpha0 # evals ELBO Best ELBO Notes
#> 3 -6.748e+00 3.322e-04 6.069e-06 1.000e+00 1.000e+00 76 -6.214e+00 -6.214e+00
#> Path [10] :Best Iter: [2] ELBO (-6.188430) evaluations: (76)
#> Total log probability function evaluations:1335
#> Finished in 0.1 seconds.
# Specifying initial values as a function
fit_mcmc_w_init_fun <- mod$sample(
data = stan_data,
seed = 123,
chains = 2,
refresh = 0,
init = function() list(theta = runif(1))
)
#> Running MCMC with 2 sequential chains...
#>
#> Chain 1 finished in 0.0 seconds.
#> Chain 2 finished in 0.0 seconds.
#>
#> Both chains finished successfully.
#> Mean chain execution time: 0.0 seconds.
#> Total execution time: 0.3 seconds.
#>
fit_mcmc_w_init_fun_2 <- mod$sample(
data = stan_data,
seed = 123,
chains = 2,
refresh = 0,
init = function(chain_id) {
# silly but demonstrates optional use of chain_id
list(theta = 1 / (chain_id + 1))
}
)
#> Running MCMC with 2 sequential chains...
#>
#> Chain 1 finished in 0.0 seconds.
#> Chain 2 finished in 0.0 seconds.
#>
#> Both chains finished successfully.
#> Mean chain execution time: 0.0 seconds.
#> Total execution time: 0.3 seconds.
#>
fit_mcmc_w_init_fun_2$init()
#> [[1]]
#> [[1]]$theta
#> [1] 0.5
#>
#>
#> [[2]]
#> [[2]]$theta
#> [1] 0.3333333
#>
#>
# Specifying initial values as a list of lists
fit_mcmc_w_init_list <- mod$sample(
data = stan_data,
seed = 123,
chains = 2,
refresh = 0,
init = list(
list(theta = 0.75), # chain 1
list(theta = 0.25) # chain 2
)
)
#> Running MCMC with 2 sequential chains...
#>
#> Chain 1 finished in 0.0 seconds.
#> Chain 2 finished in 0.0 seconds.
#>
#> Both chains finished successfully.
#> Mean chain execution time: 0.0 seconds.
#> Total execution time: 0.3 seconds.
#>
fit_optim_w_init_list <- mod$optimize(
data = stan_data,
seed = 123,
init = list(
list(theta = 0.75)
)
)
#> Initial log joint probability = -11.6657
#> Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
#> 6 -5.00402 0.000237915 9.55309e-07 1 1 9
#> Optimization terminated normally:
#> Convergence detected: relative gradient magnitude is below tolerance
#> Finished in 0.1 seconds.
fit_optim_w_init_list$init()
#> [[1]]
#> [[1]]$theta
#> [1] 0.75
#>
#>
# }