A CmdStanModel object is an R6 object created by the cmdstan_model() function. The object stores the path to the Stan program and, once it has been created, the path to the compiled executable, and provides methods for fitting the model using Stan's algorithms. See the Details section for the available methods.

Details

CmdStanModel objects have the following associated methods:

Method          Description
$code()         Return Stan program as a string.
$print()        Print readable version of Stan program.
$stan_file()    Return the file path to the Stan program.
$exe_file()     Return the file path to the compiled executable.
$compile()      Compile Stan program.
$sample()       Run CmdStan's "sample" method, return CmdStanMCMC object.
$optimize()     Run CmdStan's "optimize" method, return CmdStanMLE object.
$variational()  Run CmdStan's "variational" method, return CmdStanVB object.
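
In a typical workflow these methods are called directly on the object returned by cmdstan_model(). A minimal sketch (the file path and data list here are placeholders; see the Examples section below for a complete, runnable version using the bernoulli example model):

mod <- cmdstan_model("path/to/model.stan")          # compiles the program if needed
fit <- mod$sample(data = list(N = 2, y = c(0, 1)))  # returns a CmdStanMCMC object
fit$summary()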

See also

The CmdStanR website (mc-stan.org/cmdstanr) for online documentation and tutorials.

The Stan and CmdStan documentation (mc-stan.org/users/documentation).

Examples

# \dontrun{
# Set path to cmdstan
# (Note: if you installed CmdStan via install_cmdstan() with default settings
# then setting the path is unnecessary but the default below should still work.
# Otherwise use the `path` argument to specify the location of your
# CmdStan installation.)
set_cmdstan_path(path = NULL)
#> CmdStan path set to: /Users/jgabry/.cmdstanr/cmdstan-2.23.0
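# Optionally confirm which CmdStan installation and version will be used
# (cmdstan_path() and cmdstan_version() are cmdstanr helper functions)
cmdstan_path()
cmdstan_version()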
# Create a CmdStanModel object from a Stan program,
# here using the example model that comes with CmdStan
stan_program <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.stan")
mod <- cmdstan_model(stan_program)
#> Model executable is up to date!
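# The object records where the Stan program and the compiled executable live
# (the $stan_file() and $exe_file() accessor methods described above)
mod$stan_file()
mod$exe_file()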
mod$print()
#> data {
#>   int<lower=0> N;
#>   int<lower=0,upper=1> y[N];
#> }
#> parameters {
#>   real<lower=0,upper=1> theta;
#> }
#> model {
#>   theta ~ beta(1,1); // uniform prior on interval 0,1
#>   y ~ bernoulli(theta);
#> }
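# If the program is needed as a character object rather than printed output,
# the $code() method described above returns it; writeLines() displays it
writeLines(mod$code())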
# data as a named list (like RStan)
stan_data <- list(N = 10, y = c(0,1,0,0,0,0,0,0,0,1))

# run MCMC using the 'sample' method
fit_mcmc <- mod$sample(
  data = stan_data,
  seed = 123,
  chains = 2,
  cores = 2
)
#> Running MCMC with 2 chain(s) on 2 core(s)...
#> 
#> Running ./bernoulli 'id=1' random 'seed=123' data \
#>   'file=/var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/RtmpMwcfcG/standata-29297f346967.json' \
#>   output \
#>   'file=/var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/RtmpMwcfcG/bernoulli-202005121301-1-f42a6f.csv' \
#>   'method=sample' 'save_warmup=0' 'algorithm=hmc' 'engine=nuts' adapt \
#>   'engaged=1'
#> Running ./bernoulli 'id=2' random 'seed=124' data \
#>   'file=/var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/RtmpMwcfcG/standata-29297f346967.json' \
#>   output \
#>   'file=/var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/RtmpMwcfcG/bernoulli-202005121301-2-f42a6f.csv' \
#>   'method=sample' 'save_warmup=0' 'algorithm=hmc' 'engine=nuts' adapt \
#>   'engaged=1'
#> Chain 1 Iteration:    1 / 2000 [  0%]  (Warmup)
#> Chain 1 Iteration:  100 / 2000 [  5%]  (Warmup)
#> Chain 1 Iteration:  200 / 2000 [ 10%]  (Warmup)
#> Chain 1 Iteration:  300 / 2000 [ 15%]  (Warmup)
#> Chain 1 Iteration:  400 / 2000 [ 20%]  (Warmup)
#> Chain 1 Iteration:  500 / 2000 [ 25%]  (Warmup)
#> Chain 1 Iteration:  600 / 2000 [ 30%]  (Warmup)
#> Chain 1 Iteration:  700 / 2000 [ 35%]  (Warmup)
#> Chain 1 Iteration:  800 / 2000 [ 40%]  (Warmup)
#> Chain 1 Iteration:  900 / 2000 [ 45%]  (Warmup)
#> Chain 1 Iteration: 1000 / 2000 [ 50%]  (Warmup)
#> Chain 1 Iteration: 1001 / 2000 [ 50%]  (Sampling)
#> Chain 1 Iteration: 1100 / 2000 [ 55%]  (Sampling)
#> Chain 1 Iteration: 1200 / 2000 [ 60%]  (Sampling)
#> Chain 1 Iteration: 1300 / 2000 [ 65%]  (Sampling)
#> Chain 1 Iteration: 1400 / 2000 [ 70%]  (Sampling)
#> Chain 1 Iteration: 1500 / 2000 [ 75%]  (Sampling)
#> Chain 1 Iteration: 1600 / 2000 [ 80%]  (Sampling)
#> Chain 1 Iteration: 1700 / 2000 [ 85%]  (Sampling)
#> Chain 1 Iteration: 1800 / 2000 [ 90%]  (Sampling)
#> Chain 1 Iteration: 1900 / 2000 [ 95%]  (Sampling)
#> Chain 1 Iteration: 2000 / 2000 [100%]  (Sampling)
#> Chain 2 Iteration:    1 / 2000 [  0%]  (Warmup)
#> Chain 2 Iteration:  100 / 2000 [  5%]  (Warmup)
#> Chain 2 Iteration:  200 / 2000 [ 10%]  (Warmup)
#> Chain 2 Iteration:  300 / 2000 [ 15%]  (Warmup)
#> Chain 2 Iteration:  400 / 2000 [ 20%]  (Warmup)
#> Chain 2 Iteration:  500 / 2000 [ 25%]  (Warmup)
#> Chain 2 Iteration:  600 / 2000 [ 30%]  (Warmup)
#> Chain 2 Iteration:  700 / 2000 [ 35%]  (Warmup)
#> Chain 2 Iteration:  800 / 2000 [ 40%]  (Warmup)
#> Chain 2 Iteration:  900 / 2000 [ 45%]  (Warmup)
#> Chain 2 Iteration: 1000 / 2000 [ 50%]  (Warmup)
#> Chain 2 Iteration: 1001 / 2000 [ 50%]  (Sampling)
#> Chain 2 Iteration: 1100 / 2000 [ 55%]  (Sampling)
#> Chain 2 Iteration: 1200 / 2000 [ 60%]  (Sampling)
#> Chain 2 Iteration: 1300 / 2000 [ 65%]  (Sampling)
#> Chain 2 Iteration: 1400 / 2000 [ 70%]  (Sampling)
#> Chain 2 Iteration: 1500 / 2000 [ 75%]  (Sampling)
#> Chain 2 Iteration: 1600 / 2000 [ 80%]  (Sampling)
#> Chain 2 Iteration: 1700 / 2000 [ 85%]  (Sampling)
#> Chain 2 Iteration: 1800 / 2000 [ 90%]  (Sampling)
#> Chain 2 Iteration: 1900 / 2000 [ 95%]  (Sampling)
#> Chain 2 Iteration: 2000 / 2000 [100%]  (Sampling)
#> Chain 1 finished in 0.2 seconds.
#> Chain 2 finished in 0.2 seconds.
#> 
#> Both chains finished successfully.
#> Mean chain execution time: 0.2 seconds.
#> Total execution time: 0.2 seconds.
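# The 'sample' method also accepts arguments that control the sampler itself.
# A brief sketch (argument names follow the cmdstanr documentation; check the
# sample method's help page for the full list available in your version):
fit_mcmc_alt <- mod$sample(
  data = stan_data,
  seed = 123,
  chains = 2,
  cores = 2,
  refresh = 500,       # print a progress update every 500 iterations
  adapt_delta = 0.95   # higher target acceptance rate, smaller step sizes
)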
# Use 'posterior' package for summaries
fit_mcmc$summary()
#> # A tibble: 2 x 10
#>   variable   mean median    sd   mad      q5    q95  rhat ess_bulk ess_tail
#>   <chr>     <dbl>  <dbl> <dbl> <dbl>   <dbl>  <dbl> <dbl>    <dbl>    <dbl>
#> 1 lp__     -7.28  -7.00  0.739 0.342 -8.80   -6.75   1.00     815.     621.
#> 2 theta     0.254  0.235 0.123 0.123  0.0809  0.485  1.00     752.     589.
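# Posterior draws can also be extracted directly in the formats provided by
# the 'posterior' package (a brief sketch using the $draws() method)
draws <- fit_mcmc$draws("theta")         # iterations x chains x variables array
head(posterior::as_draws_df(draws))      # convert to a data-frame-like format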
# Call CmdStan's diagnose and stansummary utilities
fit_mcmc$cmdstan_diagnose()
#> Running bin/diagnose \
#>   /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/RtmpMwcfcG/bernoulli-202005121301-1-f42a6f.csv \
#>   /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/RtmpMwcfcG/bernoulli-202005121301-2-f42a6f.csv
#> Processing csv files: /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/RtmpMwcfcG/bernoulli-202005121301-1-f42a6f.csv, /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/RtmpMwcfcG/bernoulli-202005121301-2-f42a6f.csv
#> 
#> Checking sampler transitions treedepth.
#> Treedepth satisfactory for all transitions.
#> 
#> Checking sampler transitions for divergences.
#> No divergent transitions found.
#> 
#> Checking E-BFMI - sampler transitions HMC potential energy.
#> E-BFMI satisfactory for all transitions.
#> 
#> Effective sample size satisfactory.
#> 
#> Split R-hat values satisfactory all parameters.
#> 
#> Processing complete, no problems detected.
fit_mcmc$cmdstan_summary()
#> Running bin/stansummary \
#>   /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/RtmpMwcfcG/bernoulli-202005121301-1-f42a6f.csv \
#>   /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/RtmpMwcfcG/bernoulli-202005121301-2-f42a6f.csv
#> Inference for Stan model: bernoulli_model
#> 2 chains: each with iter=(1000,1000); warmup=(0,0); thin=(1,1); 2000 iterations saved.
#> 
#> Warmup took (0.0067, 0.0067) seconds, 0.013 seconds total
#> Sampling took (0.015, 0.013) seconds, 0.028 seconds total
#> 
#>                 Mean     MCSE  StdDev     5%    50%   95%    N_Eff  N_Eff/s    R_hat
#> lp__            -7.3  3.1e-02    0.74   -8.8   -7.0  -6.8  5.9e+02  2.1e+04  1.0e+00
#> accept_stat__   0.92  5.0e-03    0.14   0.61   0.97   1.0  7.3e+02  2.6e+04  1.0e+00
#> stepsize__       1.0  9.0e-02   0.090   0.93    1.1   1.1  1.0e+00  3.6e+01  2.6e+13
#> treedepth__      1.4  1.2e-02    0.52    1.0    1.0   2.0  1.9e+03  6.6e+04  1.0e+00
#> n_leapfrog__     2.6  4.0e-01     1.5    1.0    3.0   7.0  1.4e+01  5.0e+02  1.0e+00
#> divergent__     0.00      nan    0.00   0.00   0.00  0.00      nan      nan      nan
#> energy__         7.8  4.0e-02     1.0    6.8    7.4  10.0  6.9e+02  2.5e+04  1.0e+00
#> theta           0.25  4.5e-03    0.12  0.081   0.23  0.49  7.6e+02  2.7e+04  1.0e+00
#> 
#> Samples were drawn using hmc with nuts.
#> For each parameter, N_Eff is a crude measure of effective sample size,
#> and R_hat is the potential scale reduction factor on split chains (at
#> convergence, R_hat=1).
#> 
# For models fit using MCMC, if you like working with RStan's stanfit objects
# then you can create one with rstan::read_stan_csv()
if (require(rstan, quietly = TRUE)) {
  stanfit <- rstan::read_stan_csv(fit_mcmc$output_files())
  print(stanfit)
}
#> rstan (Version 2.19.3, GitRev: 2e1f913d3ca3)
#> For execution on a local, multicore CPU with excess RAM we recommend calling
#> options(mc.cores = parallel::detectCores()).
#> To avoid recompilation of unchanged Stan programs, we recommend calling
#> rstan_options(auto_write = TRUE)
#> Inference for Stan model: bernoulli-202005121301-1-f42a6f.
#> 2 chains, each with iter=2000; warmup=1000; thin=1;
#> post-warmup draws per chain=1000, total post-warmup draws=2000.
#> 
#>        mean se_mean   sd  2.5%   25%   50%   75% 97.5% n_eff Rhat
#> theta  0.25    0.00 0.12  0.06  0.16  0.23  0.33  0.53   754    1
#> lp__  -7.28    0.03 0.74 -9.50 -7.45 -7.00 -6.80 -6.75   586    1
#> 
#> Samples were drawn using NUTS(diag_e) at Tue May 12 13:01:41 2020.
#> For each parameter, n_eff is a crude measure of effective sample size,
#> and Rhat is the potential scale reduction factor on split chains (at
#> convergence, Rhat=1).
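# By default CmdStan's CSV output files live in a temporary directory (see the
# file paths printed above). To keep them, copy them somewhere permanent with
# the $save_output_files() method, e.g. (directory name is a placeholder):
# fit_mcmc$save_output_files(dir = "bernoulli-output")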
# Run 'optimize' method to get a point estimate (default is Stan's LBFGS algorithm)
# and also demonstrate specifying data as a path to a file instead of a list
my_data_file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.data.json")
fit_optim <- mod$optimize(data = my_data_file, seed = 123)
#> method = optimize
#>   optimize
#>     algorithm = lbfgs (Default)
#>       lbfgs
#>         init_alpha = 0.001 (Default)
#>         tol_obj = 9.9999999999999998e-13 (Default)
#>         tol_rel_obj = 10000 (Default)
#>         tol_grad = 1e-08 (Default)
#>         tol_rel_grad = 10000000 (Default)
#>         tol_param = 1e-08 (Default)
#>         history_size = 5 (Default)
#>     iter = 2000 (Default)
#>     save_iterations = 0 (Default)
#> id = 1
#> data
#>   file = /Users/jgabry/.cmdstanr/cmdstan-2.23.0/examples/bernoulli/bernoulli.data.json
#> init = 2 (Default)
#> random
#>   seed = 123
#> output
#>   file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/RtmpMwcfcG/bernoulli-202005121301-1-fd6411.csv
#>   diagnostic_file = (Default)
#>   refresh = 100 (Default)
#> 
#> Initial log joint probability = -9.51104
#>     Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes
#>        6      -5.00402   0.000103557   2.55661e-07           1           1        9
#> Optimization terminated normally:
#>   Convergence detected: relative gradient magnitude is below tolerance
#> Optimization method is experimental and the structure of returned object may change.
fit_optim$summary()
#> # A tibble: 2 x 2
#>   variable estimate
#>   <chr>       <dbl>
#> 1 lp__        -5.00
#> 2 theta         0.2
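# The point estimate is also available as a named vector via the $mle()
# method of CmdStanMLE objects
fit_optim$mle()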
# Run 'variational' to approximate the posterior (default is meanfield ADVI)
fit_vb <- mod$variational(data = stan_data, seed = 123)
#> method = variational
#>   variational
#>     algorithm = meanfield (Default)
#>       meanfield
#>     iter = 10000 (Default)
#>     grad_samples = 1 (Default)
#>     elbo_samples = 100 (Default)
#>     eta = 1 (Default)
#>     adapt
#>       engaged = 1 (Default)
#>       iter = 50 (Default)
#>     tol_rel_obj = 0.01 (Default)
#>     eval_elbo = 100 (Default)
#>     output_samples = 1000 (Default)
#> id = 1
#> data
#>   file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/RtmpMwcfcG/standata-292969b4cadc.json
#> init = 2 (Default)
#> random
#>   seed = 123
#> output
#>   file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/RtmpMwcfcG/bernoulli-202005121301-1-7267bb.csv
#>   diagnostic_file = (Default)
#>   refresh = 100 (Default)
#> 
#> ------------------------------------------------------------
#> EXPERIMENTAL ALGORITHM:
#> This procedure has not been thoroughly tested and may be unstable
#> or buggy. The interface is subject to change.
#> ------------------------------------------------------------
#> 
#> 
#> 
#> Gradient evaluation took 1.9e-05 seconds
#> 1000 transitions using 10 leapfrog steps per transition would take 0.19 seconds.
#> Adjust your expectations accordingly!
#> 
#> 
#> Begin eta adaptation.
#> Iteration:   1 / 250 [  0%]  (Adaptation)
#> Iteration:  50 / 250 [ 20%]  (Adaptation)
#> Iteration: 100 / 250 [ 40%]  (Adaptation)
#> Iteration: 150 / 250 [ 60%]  (Adaptation)
#> Iteration: 200 / 250 [ 80%]  (Adaptation)
#> Success! Found best value [eta = 1] earlier than expected.
#> 
#> Begin stochastic gradient ascent.
#>   iter       ELBO   delta_ELBO_mean   delta_ELBO_med   notes
#>    100     -6.262             1.000            1.000
#>    200     -6.263             0.500            1.000
#>    300     -6.307             0.336            0.007   MEDIAN ELBO CONVERGED
#> 
#> Drawing a sample of size 1000 from the approximate posterior...
#> COMPLETED.
#> Variational method is experimental and the structure of returned object may change.
fit_vb$summary()
#> # A tibble: 3 x 7
#>   variable      mean median    sd   mad     q5      q95
#>   <chr>        <dbl>  <dbl> <dbl> <dbl>  <dbl>    <dbl>
#> 1 lp__        -7.18  -6.94  0.588 0.259 -8.36  -6.75
#> 2 lp_approx__ -0.515 -0.221 0.692 0.303 -2.06  -0.00257
#> 3 theta        0.263  0.246 0.115 0.113  0.106  0.481
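# As with MCMC, draws from the approximate posterior are available via $draws(),
# and a full-rank (rather than meanfield) Gaussian approximation can be requested
# through the algorithm argument of the variational method
vb_draws <- fit_vb$draws("theta")
fit_vb_fullrank <- mod$variational(data = stan_data, seed = 123, algorithm = "fullrank")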
# }