Controlling Outputs

CSV File Outputs

CmdStan writes its output as a set of per-chain Stan CSV files. The filenames follow the template ‘<model_name>-<YYYYMMDDHHMMSS>_<chain_id>’ plus the file suffix ‘.csv’. CmdStanPy also captures the per-chain console and error messages.

In [1]: import os

In [2]: from cmdstanpy import CmdStanModel

In [3]: stan_file = os.path.join('users-guide', 'examples', 'bernoulli.stan')

In [4]: model = CmdStanModel(stan_file=stan_file)

In [5]: data_file = os.path.join('users-guide', 'examples', 'bernoulli.data.json')

In [6]: fit = model.sample(data=data_file)
INFO:cmdstanpy:CmdStan start processing
INFO:cmdstanpy:CmdStan done processing.

# printing the object reports sampler commands, output files
In [7]: print(fit)
CmdStanMCMC: model=bernoulli chains=4['method=sample', 'algorithm=hmc', 'adapt', 'engaged=1']
 csv_files:
	/tmp/tmpgvv426mx/bernoulliigzvjuln/bernoulli-20240326163037_1.csv
	/tmp/tmpgvv426mx/bernoulliigzvjuln/bernoulli-20240326163037_2.csv
	/tmp/tmpgvv426mx/bernoulliigzvjuln/bernoulli-20240326163037_3.csv
	/tmp/tmpgvv426mx/bernoulliigzvjuln/bernoulli-20240326163037_4.csv
 output_files:
	/tmp/tmpgvv426mx/bernoulliigzvjuln/bernoulli-20240326163037_0-stdout.txt
	/tmp/tmpgvv426mx/bernoulliigzvjuln/bernoulli-20240326163037_1-stdout.txt
	/tmp/tmpgvv426mx/bernoulliigzvjuln/bernoulli-20240326163037_2-stdout.txt
	/tmp/tmpgvv426mx/bernoulliigzvjuln/bernoulli-20240326163037_3-stdout.txt
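
The filename template can be taken apart again with a regular expression. The following is a minimal sketch, not part of CmdStanPy: the helper name parse_csv_name and the pattern are assumptions inferred from the filenames in the listing above.

```python
import re
from pathlib import Path

# Pattern inferred from the listing above (an assumption, not a CmdStanPy API):
# <model_name>-<YYYYMMDDHHMMSS>_<chain_id>.csv
CSV_NAME = re.compile(
    r"^(?P<model>.+)-(?P<timestamp>\d{14})_(?P<chain_id>\d+)\.csv$"
)


def parse_csv_name(path):
    """Split a per-chain CSV filename into model name, timestamp, and chain id."""
    match = CSV_NAME.match(Path(path).name)
    if match is None:
        raise ValueError(f"not a recognised Stan CSV filename: {path}")
    return match["model"], match["timestamp"], int(match["chain_id"])


parsed = parse_csv_name(
    "/tmp/tmpgvv426mx/bernoulliigzvjuln/bernoulli-20240326163037_1.csv"
)
print(parsed)  # ('bernoulli', '20240326163037', 1)
```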

The optional output_dir argument specifies the directory to which CmdStan writes its output files. If it is omitted, the files are written to a temporary directory, which is deleted when the current Python session ends.

In [8]: fit = model.sample(data=data_file, output_dir="./outputs/")
INFO:cmdstanpy:created output directory: /home/runner/work/cmdstanpy/cmdstanpy/docsrc/outputs
INFO:cmdstanpy:CmdStan start processing
INFO:cmdstanpy:CmdStan done processing.

In [9]: !ls outputs/
bernoulli-20240326163037_0-stdout.txt  bernoulli-20240326163037_2.csv
bernoulli-20240326163037_1-stdout.txt  bernoulli-20240326163037_3-stdout.txt
bernoulli-20240326163037_1.csv	       bernoulli-20240326163037_3.csv
bernoulli-20240326163037_2-stdout.txt  bernoulli-20240326163037_4.csv

Alternatively, the save_csvfiles() method of the fit object moves the CSV files to a specified directory.

In [10]: fit = model.sample(data=data_file)
INFO:cmdstanpy:CmdStan start processing
INFO:cmdstanpy:CmdStan done processing.

In [11]: fit.save_csvfiles(dir='some/path')

In [12]: !ls some/path
bernoulli-20240326163037_1.csv	bernoulli-20240326163037_3.csv
bernoulli-20240326163037_2.csv	bernoulli-20240326163037_4.csv
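
Because the timestamp is part of each filename, a directory that collects output from several runs can still be sorted out afterwards. The sketch below is one way to pick the CSV files of the most recent run using only the standard library. The helper latest_run_csvs and the demonstration files are hypothetical, and the sketch assumes that all files in the directory belong to the same model.

```python
import re
import tempfile
from collections import defaultdict
from pathlib import Path


def latest_run_csvs(output_dir):
    """Return the CSV files of the most recent run in output_dir, sorted by path."""
    runs = defaultdict(list)
    for path in Path(output_dir).glob("*.csv"):
        # Assumed filename layout: <model_name>-<YYYYMMDDHHMMSS>_<chain_id>.csv
        match = re.search(r"-(\d{14})_\d+\.csv$", path.name)
        if match:
            runs[match.group(1)].append(path)
    if not runs:
        return []
    # 14-digit timestamps sort chronologically when compared as strings.
    return sorted(runs[max(runs)])


# Demonstration on empty placeholder files in a temporary directory.
demo_dir = Path(tempfile.mkdtemp())
for name in (
    "bernoulli-20240326163037_1.csv",
    "bernoulli-20240326163037_2.csv",
    "bernoulli-20240326163038_1.csv",
    "bernoulli-20240326163038_2.csv",
    "bernoulli-20240326163038_0-stdout.txt",
):
    (demo_dir / name).touch()

latest_names = [p.name for p in latest_run_csvs(demo_dir)]
print(latest_names)
```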

Logging

You may notice that CmdStanPy can produce a lot of output while it is running:

In [13]: fit = model.sample(data=data_file, show_progress=False)
INFO:cmdstanpy:CmdStan start processing
INFO:cmdstanpy:Chain [1] start processing
INFO:cmdstanpy:Chain [2] start processing
INFO:cmdstanpy:Chain [3] start processing
INFO:cmdstanpy:Chain [4] start processing
INFO:cmdstanpy:Chain [1] done processing
INFO:cmdstanpy:Chain [4] done processing
INFO:cmdstanpy:Chain [3] done processing
INFO:cmdstanpy:Chain [2] done processing

This output is managed through Python's built-in logging module. For example, it can be disabled entirely:

In [14]: import logging

In [15]: cmdstanpy_logger = logging.getLogger("cmdstanpy")

In [16]: cmdstanpy_logger.disabled = True

# look, no output!
In [17]: fit = model.sample(data=data_file, show_progress=False)
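
Setting the disabled flag by hand leaves the logger silent until it is reset. If the silence is only wanted around a single call, a small context manager can restore the previous state automatically. This is a sketch built on the standard library only; the name silenced is an assumption and not part of CmdStanPy.

```python
import logging
from contextlib import contextmanager


@contextmanager
def silenced(logger_name="cmdstanpy"):
    """Temporarily disable the named logger, restoring its prior state on exit."""
    logger = logging.getLogger(logger_name)
    previous = logger.disabled
    logger.disabled = True
    try:
        yield logger
    finally:
        # Runs even if the body raises, so the logger is never left disabled.
        logger.disabled = previous


# Intended use (model and data_file as defined earlier in this guide):
#
#     with silenced():
#         fit = model.sample(data=data_file, show_progress=False)
```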

Alternatively, you can remove the logging handler that CmdStanPy installs by default and install your own for more fine-grained control. For example, the following code sends all logs, including the DEBUG logs that are hidden by default, to a file.

DEBUG logging is useful primarily to developers or when trying to hunt down an issue.

In [18]: cmdstanpy_logger.disabled = False

# remove all existing handlers
In [19]: cmdstanpy_logger.handlers = []

In [20]: cmdstanpy_logger.setLevel(logging.DEBUG)

In [21]: handler = logging.FileHandler('all.log')

In [22]: handler.setLevel(logging.DEBUG)

In [23]: handler.setFormatter(
   ....:     logging.Formatter(
   ....:         '%(asctime)s - %(name)s - %(levelname)s - %(message)s',
   ....:         "%H:%M:%S",
   ....:     )
   ....: )
   ....: 

In [24]: cmdstanpy_logger.addHandler(handler)
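
As a variation on the handler setup above, two handlers with different levels can be attached to the same logger, so that the file receives everything while the console shows only warnings and errors. The sketch below wraps this in a hypothetical helper, split_logging, and demonstrates it on a throwaway logger name and a temporary file, so the cmdstanpy logger configured above is left untouched.

```python
import logging
import os
import tempfile


def split_logging(logger_name, log_path):
    """Send DEBUG and above to a file, but only WARNING and above to the console."""
    logger = logging.getLogger(logger_name)
    logger.handlers = []            # drop any existing handlers
    logger.setLevel(logging.DEBUG)  # let every record reach the handlers

    file_handler = logging.FileHandler(log_path)
    file_handler.setLevel(logging.DEBUG)
    console_handler = logging.StreamHandler()
    console_handler.setLevel(logging.WARNING)

    formatter = logging.Formatter(
        "%(asctime)s - %(name)s - %(levelname)s - %(message)s",
        "%H:%M:%S",
    )
    for handler in (file_handler, console_handler):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger


# Demonstration: the logger name and the file location are placeholders.
log_path = os.path.join(tempfile.mkdtemp(), "all.log")
demo_logger = split_logging("cmdstanpy_demo", log_path)
demo_logger.debug("written to the file only")
demo_logger.warning("written to the file and the console")
```

Passing "cmdstanpy" as the logger name would apply the same split to CmdStanPy's own messages.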

Now, if we run the model and check the contents of the file, we will see every message that CmdStanPy logs.

In [25]: fit = model.sample(data=data_file, show_progress=False)

In [26]: with open('all.log', 'r') as logs:
   ....:     for line in logs:
   ....:         print(line.strip())
   ....: 
16:30:38 - cmdstanpy - DEBUG - cmd: /home/runner/work/cmdstanpy/cmdstanpy/docsrc/users-guide/examples/bernoulli info
cwd: None
16:30:38 - cmdstanpy - INFO - CmdStan start processing
16:30:38 - cmdstanpy - DEBUG - idx 0
16:30:38 - cmdstanpy - DEBUG - idx 1
16:30:38 - cmdstanpy - DEBUG - running CmdStan, num_threads: 1
16:30:38 - cmdstanpy - DEBUG - idx 2
16:30:38 - cmdstanpy - DEBUG - running CmdStan, num_threads: 1
16:30:38 - cmdstanpy - DEBUG - idx 3
16:30:38 - cmdstanpy - DEBUG - CmdStan args: ['/home/runner/work/cmdstanpy/cmdstanpy/docsrc/users-guide/examples/bernoulli', 'id=1', 'random', 'seed=23084', 'data', 'file=users-guide/examples/bernoulli.data.json', 'output', 'file=/tmp/tmpgvv426mx/bernoulli1wek20la/bernoulli-20240326163038_1.csv', 'method=sample', 'algorithm=hmc', 'adapt', 'engaged=1']
16:30:38 - cmdstanpy - DEBUG - running CmdStan, num_threads: 1
16:30:38 - cmdstanpy - DEBUG - CmdStan args: ['/home/runner/work/cmdstanpy/cmdstanpy/docsrc/users-guide/examples/bernoulli', 'id=2', 'random', 'seed=23084', 'data', 'file=users-guide/examples/bernoulli.data.json', 'output', 'file=/tmp/tmpgvv426mx/bernoulli1wek20la/bernoulli-20240326163038_2.csv', 'method=sample', 'algorithm=hmc', 'adapt', 'engaged=1']
16:30:38 - cmdstanpy - DEBUG - running CmdStan, num_threads: 1
16:30:38 - cmdstanpy - INFO - Chain [1] start processing
16:30:38 - cmdstanpy - DEBUG - CmdStan args: ['/home/runner/work/cmdstanpy/cmdstanpy/docsrc/users-guide/examples/bernoulli', 'id=3', 'random', 'seed=23084', 'data', 'file=users-guide/examples/bernoulli.data.json', 'output', 'file=/tmp/tmpgvv426mx/bernoulli1wek20la/bernoulli-20240326163038_3.csv', 'method=sample', 'algorithm=hmc', 'adapt', 'engaged=1']
16:30:38 - cmdstanpy - INFO - Chain [2] start processing
16:30:38 - cmdstanpy - DEBUG - CmdStan args: ['/home/runner/work/cmdstanpy/cmdstanpy/docsrc/users-guide/examples/bernoulli', 'id=4', 'random', 'seed=23084', 'data', 'file=users-guide/examples/bernoulli.data.json', 'output', 'file=/tmp/tmpgvv426mx/bernoulli1wek20la/bernoulli-20240326163038_4.csv', 'method=sample', 'algorithm=hmc', 'adapt', 'engaged=1']
16:30:38 - cmdstanpy - INFO - Chain [3] start processing
16:30:38 - cmdstanpy - INFO - Chain [4] start processing
16:30:38 - cmdstanpy - INFO - Chain [1] done processing
16:30:38 - cmdstanpy - INFO - Chain [2] done processing
16:30:38 - cmdstanpy - INFO - Chain [3] done processing
16:30:38 - cmdstanpy - INFO - Chain [4] done processing
16:30:38 - cmdstanpy - DEBUG - runset
RunSet: chains=4, chain_ids=[1, 2, 3, 4], num_processes=4
cmd (chain 1):
['/home/runner/work/cmdstanpy/cmdstanpy/docsrc/users-guide/examples/bernoulli', 'id=1', 'random', 'seed=23084', 'data', 'file=users-guide/examples/bernoulli.data.json', 'output', 'file=/tmp/tmpgvv426mx/bernoulli1wek20la/bernoulli-20240326163038_1.csv', 'method=sample', 'algorithm=hmc', 'adapt', 'engaged=1']
retcodes=[0, 0, 0, 0]
per-chain output files (showing chain 1 only):
csv_file:
/tmp/tmpgvv426mx/bernoulli1wek20la/bernoulli-20240326163038_1.csv
console_msgs (if any):
/tmp/tmpgvv426mx/bernoulli1wek20la/bernoulli-20240326163038_0-stdout.txt
16:30:38 - cmdstanpy - DEBUG - Chain 1 console:
method = sample (Default)
sample
num_samples = 1000 (Default)
num_warmup = 1000 (Default)
save_warmup = 0 (Default)
thin = 1 (Default)
adapt
engaged = 1 (Default)
gamma = 0.05 (Default)
delta = 0.8 (Default)
kappa = 0.75 (Default)
t0 = 10 (Default)
init_buffer = 75 (Default)
term_buffer = 50 (Default)
window = 25 (Default)
save_metric = 0 (Default)
algorithm = hmc (Default)
hmc
engine = nuts (Default)
nuts
max_depth = 10 (Default)
metric = diag_e (Default)
metric_file =  (Default)
stepsize = 1 (Default)
stepsize_jitter = 0 (Default)
num_chains = 1 (Default)
id = 1 (Default)
data
file = users-guide/examples/bernoulli.data.json
init = 2 (Default)
random
seed = 23084
output
file = /tmp/tmpgvv426mx/bernoulli1wek20la/bernoulli-20240326163038_1.csv
diagnostic_file =  (Default)
refresh = 100 (Default)
sig_figs = -1 (Default)
profile_file = profile.csv (Default)
save_cmdstan_config = 0 (Default)
num_threads = 1 (Default)


Gradient evaluation took 3e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.03 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  100 / 2000 [  5%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  300 / 2000 [ 15%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  500 / 2000 [ 25%]  (Warmup)
Iteration:  600 / 2000 [ 30%]  (Warmup)
Iteration:  700 / 2000 [ 35%]  (Warmup)
Iteration:  800 / 2000 [ 40%]  (Warmup)
Iteration:  900 / 2000 [ 45%]  (Warmup)
Iteration: 1000 / 2000 [ 50%]  (Warmup)
Iteration: 1001 / 2000 [ 50%]  (Sampling)
Iteration: 1100 / 2000 [ 55%]  (Sampling)
Iteration: 1200 / 2000 [ 60%]  (Sampling)
Iteration: 1300 / 2000 [ 65%]  (Sampling)
Iteration: 1400 / 2000 [ 70%]  (Sampling)
Iteration: 1500 / 2000 [ 75%]  (Sampling)
Iteration: 1600 / 2000 [ 80%]  (Sampling)
Iteration: 1700 / 2000 [ 85%]  (Sampling)
Iteration: 1800 / 2000 [ 90%]  (Sampling)
Iteration: 1900 / 2000 [ 95%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

Elapsed Time: 0.004 seconds (Warm-up)
0.013 seconds (Sampling)
0.017 seconds (Total)
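
A log file this verbose is easier to work with once it is split back into records. The following sketch assumes the formatter used above ('%(asctime)s - %(name)s - %(levelname)s - %(message)s' with '%H:%M:%S' timestamps) and treats every line without that prefix as a continuation of the previous message. The helper read_records is hypothetical, and the sample lines are copied from the output above.

```python
import re

# Prefix written by the formatter: "HH:MM:SS - name - LEVEL - message".
RECORD_START = re.compile(r"^(\d{2}:\d{2}:\d{2}) - (\S+) - ([A-Z]+) - (.*)$")


def read_records(lines):
    """Group log lines into (time, name, level, message) tuples.

    Lines without the timestamp prefix are appended to the message of
    the preceding record; such lines before the first record are dropped.
    """
    records = []
    for line in lines:
        line = line.rstrip("\n")
        match = RECORD_START.match(line)
        if match:
            records.append(list(match.groups()))
        elif records:
            records[-1][3] += "\n" + line
    return [tuple(record) for record in records]


sample = [
    "16:30:38 - cmdstanpy - DEBUG - cmd: /home/runner/work/cmdstanpy/cmdstanpy/docsrc/users-guide/examples/bernoulli info\n",
    "cwd: None\n",
    "16:30:38 - cmdstanpy - INFO - CmdStan start processing\n",
    "16:30:38 - cmdstanpy - DEBUG - idx 0\n",
]
info_only = [r for r in read_records(sample) if r[2] == "INFO"]
print(info_only)  # [('16:30:38', 'cmdstanpy', 'INFO', 'CmdStan start processing')]
```

To apply this to the file written earlier, pass the open file object itself, for example read_records(open('all.log')).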