Controlling Outputs
CSV File Outputs
Under the hood, CmdStan writes its output as a set of per-chain Stan CSV files. The filenames follow the template ‘<model_name>-<YYYYMMDDHHMMSS>_<chain_id>’ plus the file suffix ‘.csv’. CmdStanPy also captures the per-chain console and error messages.
In [1]: import os
In [2]: from cmdstanpy import CmdStanModel
In [3]: stan_file = os.path.join('users-guide', 'examples', 'bernoulli.stan')
In [4]: model = CmdStanModel(stan_file=stan_file)
In [5]: data_file = os.path.join('users-guide', 'examples', 'bernoulli.data.json')
In [6]: fit = model.sample(data=data_file)
INFO:cmdstanpy:CmdStan start processing
INFO:cmdstanpy:CmdStan done processing.
# printing the object reports sampler commands, output files
In [7]: print(fit)
CmdStanMCMC: model=bernoulli chains=4['method=sample', 'algorithm=hmc', 'adapt', 'engaged=1']
csv_files:
/tmp/tmprj9uktub/bernoulli_c76olso/bernoulli-20241211201410_1.csv
/tmp/tmprj9uktub/bernoulli_c76olso/bernoulli-20241211201410_2.csv
/tmp/tmprj9uktub/bernoulli_c76olso/bernoulli-20241211201410_3.csv
/tmp/tmprj9uktub/bernoulli_c76olso/bernoulli-20241211201410_4.csv
output_files:
/tmp/tmprj9uktub/bernoulli_c76olso/bernoulli-20241211201410_0-stdout.txt
/tmp/tmprj9uktub/bernoulli_c76olso/bernoulli-20241211201410_1-stdout.txt
/tmp/tmprj9uktub/bernoulli_c76olso/bernoulli-20241211201410_2-stdout.txt
/tmp/tmprj9uktub/bernoulli_c76olso/bernoulli-20241211201410_3-stdout.txt
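The filename template can be taken apart with a regular expression when the model name, timestamp, or chain id is needed later. The sketch below is not part of CmdStanPy: the pattern is inferred from the filenames printed above, and the helper name `parse_csv_name` is hypothetical.

```python
import re
from pathlib import Path

# Pattern inferred from filenames such as bernoulli-20241211201410_1.csv.
# This is an assumption about the naming scheme, not a CmdStanPy API.
CSV_NAME = re.compile(r"^(?P<model>.+)-(?P<stamp>\d{14})_(?P<chain>\d+)\.csv$")


def parse_csv_name(path):
    """Split a per-chain CSV filename into (model, timestamp, chain_id)."""
    match = CSV_NAME.match(Path(path).name)
    if match is None:
        raise ValueError(f"unexpected Stan CSV filename: {path}")
    return match["model"], match["stamp"], int(match["chain"])


model_name, stamp, chain_id = parse_csv_name(
    "/tmp/tmprj9uktub/bernoulli_c76olso/bernoulli-20241211201410_1.csv"
)
print(model_name, stamp, chain_id)
```

A filename that does not fit the pattern raises `ValueError` rather than returning a partial result, which makes a change in the naming scheme easy to notice.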
The output_dir argument is optional; it specifies the path to the output directory used by CmdStan. If it is omitted, the output files are written to a temporary directory, which is deleted when the current Python session terminates.
In [8]: fit = model.sample(data=data_file, output_dir="./outputs/")
INFO:cmdstanpy:created output directory: /home/runner/work/cmdstanpy/cmdstanpy/docsrc/outputs
INFO:cmdstanpy:CmdStan start processing
INFO:cmdstanpy:CmdStan done processing.
In [9]: !ls outputs/
bernoulli-20241211201410_0-stdout.txt bernoulli-20241211201410_2.csv
bernoulli-20241211201410_1-stdout.txt bernoulli-20241211201410_3-stdout.txt
bernoulli-20241211201410_1.csv bernoulli-20241211201410_3.csv
bernoulli-20241211201410_2-stdout.txt bernoulli-20241211201410_4.csv
Alternatively, the save_csvfiles() method moves the CSV files to a specified directory.
In [10]: fit = model.sample(data=data_file)
INFO:cmdstanpy:CmdStan start processing
INFO:cmdstanpy:CmdStan done processing.
In [11]: fit.save_csvfiles(dir='some/path')
In [12]: !ls some/path
bernoulli-20241211201411_1.csv bernoulli-20241211201411_3.csv
bernoulli-20241211201411_2.csv bernoulli-20241211201411_4.csv
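To make the effect of a move concrete, here is a standard-library sketch of the same idea. It is not CmdStanPy's implementation: the helper `move_csv_files` is hypothetical, and the files are empty placeholders created in a throwaway directory.

```python
import shutil
import tempfile
from pathlib import Path


def move_csv_files(src_dir, dest_dir):
    """Move every .csv file from src_dir into dest_dir; return the new paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    moved = []
    for csv_path in sorted(Path(src_dir).glob("*.csv")):
        target = dest / csv_path.name
        if target.exists():
            # Fail loudly instead of silently replacing earlier results.
            raise FileExistsError(f"refusing to overwrite {target}")
        shutil.move(str(csv_path), str(target))
        moved.append(target)
    return moved


# Demonstrate with placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "run"
    src.mkdir()
    for chain in (1, 2):
        (src / f"bernoulli-20241211201411_{chain}.csv").write_text("lp__\n")
    (src / "bernoulli-20241211201411_0-stdout.txt").write_text("console\n")

    moved = move_csv_files(src, Path(tmp) / "some" / "path")
    moved_names = [p.name for p in moved]
    left_behind = sorted(p.name for p in src.iterdir())
    print(moved_names, left_behind)
```

Only the CSV files are relocated in this sketch; the console output file stays in the source directory.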
Logging
You may notice CmdStanPy can produce a lot of output when it is running:
In [13]: fit = model.sample(data=data_file, show_progress=False)
INFO:cmdstanpy:CmdStan start processing
INFO:cmdstanpy:Chain [1] start processing
INFO:cmdstanpy:Chain [2] start processing
INFO:cmdstanpy:Chain [3] start processing
INFO:cmdstanpy:Chain [4] start processing
INFO:cmdstanpy:Chain [3] done processing
INFO:cmdstanpy:Chain [1] done processing
INFO:cmdstanpy:Chain [4] done processing
INFO:cmdstanpy:Chain [2] done processing
This output is managed through Python's built-in logging module. For example, it can be disabled entirely:
In [14]: import logging
In [15]: cmdstanpy_logger = logging.getLogger("cmdstanpy")
In [16]: cmdstanpy_logger.disabled = True
# look, no output!
In [17]: fit = model.sample(data=data_file, show_progress=False)
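Setting `disabled` by hand silences the logger until it is switched back on. If the silence should cover only a few calls, a small context manager can restore the previous state automatically. This is a sketch using only the standard library; `quiet_logger` is a hypothetical helper, not something CmdStanPy provides.

```python
import contextlib
import logging


@contextlib.contextmanager
def quiet_logger(name="cmdstanpy"):
    """Disable the named logger inside the with-block, then restore it."""
    logger = logging.getLogger(name)
    previous = logger.disabled
    logger.disabled = True
    try:
        yield logger
    finally:
        # Restore whatever state the logger had before, even on error.
        logger.disabled = previous


logger = logging.getLogger("cmdstanpy")
with quiet_logger():
    disabled_inside = logger.disabled
    # A call such as model.sample(data=data_file, show_progress=False)
    # would go here.
disabled_after = logger.disabled
print(disabled_inside, disabled_after)
```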
Alternatively, you can remove the logging handler that CmdStanPy installs by default and install your own for more fine-grained control. For example, the following code sends all logs, including the DEBUG logs that are hidden by default, to a file. DEBUG logging is mainly useful to developers or when hunting down an issue.
In [18]: cmdstanpy_logger.disabled = False
# remove all existing handlers
In [19]: cmdstanpy_logger.handlers = []
In [20]: cmdstanpy_logger.setLevel(logging.DEBUG)
In [21]: handler = logging.FileHandler('all.log')
In [22]: handler.setLevel(logging.DEBUG)
In [23]: handler.setFormatter(
....:     logging.Formatter(
....:         '%(asctime)s - %(name)s - %(levelname)s - %(message)s',
....:         "%H:%M:%S",
....:     )
....: )
....:
In [24]: cmdstanpy_logger.addHandler(handler)
Now, if we run the model and check the contents of the file, we will see every message CmdStanPy logs, including the DEBUG output.
In [25]: fit = model.sample(data=data_file, show_progress=False)
In [26]: with open('all.log','r') as logs:
....:     for line in logs.readlines():
....:         print(line.strip())
....:
20:14:11 - cmdstanpy - DEBUG - cmd: /home/runner/work/cmdstanpy/cmdstanpy/docsrc/users-guide/examples/bernoulli info
cwd: None
20:14:11 - cmdstanpy - INFO - CmdStan start processing
20:14:11 - cmdstanpy - DEBUG - idx 0
20:14:11 - cmdstanpy - DEBUG - idx 1
20:14:11 - cmdstanpy - DEBUG - running CmdStan, num_threads: 1
20:14:11 - cmdstanpy - DEBUG - idx 2
20:14:11 - cmdstanpy - DEBUG - running CmdStan, num_threads: 1
20:14:11 - cmdstanpy - DEBUG - idx 3
20:14:11 - cmdstanpy - DEBUG - CmdStan args: ['/home/runner/work/cmdstanpy/cmdstanpy/docsrc/users-guide/examples/bernoulli', 'id=1', 'random', 'seed=6352', 'data', 'file=users-guide/examples/bernoulli.data.json', 'output', 'file=/tmp/tmprj9uktub/bernoullil140l7_8/bernoulli-20241211201411_1.csv', 'method=sample', 'algorithm=hmc', 'adapt', 'engaged=1']
20:14:11 - cmdstanpy - DEBUG - running CmdStan, num_threads: 1
20:14:11 - cmdstanpy - DEBUG - CmdStan args: ['/home/runner/work/cmdstanpy/cmdstanpy/docsrc/users-guide/examples/bernoulli', 'id=2', 'random', 'seed=6352', 'data', 'file=users-guide/examples/bernoulli.data.json', 'output', 'file=/tmp/tmprj9uktub/bernoullil140l7_8/bernoulli-20241211201411_2.csv', 'method=sample', 'algorithm=hmc', 'adapt', 'engaged=1']
20:14:11 - cmdstanpy - DEBUG - running CmdStan, num_threads: 1
20:14:11 - cmdstanpy - INFO - Chain [1] start processing
20:14:11 - cmdstanpy - DEBUG - CmdStan args: ['/home/runner/work/cmdstanpy/cmdstanpy/docsrc/users-guide/examples/bernoulli', 'id=3', 'random', 'seed=6352', 'data', 'file=users-guide/examples/bernoulli.data.json', 'output', 'file=/tmp/tmprj9uktub/bernoullil140l7_8/bernoulli-20241211201411_3.csv', 'method=sample', 'algorithm=hmc', 'adapt', 'engaged=1']
20:14:11 - cmdstanpy - INFO - Chain [2] start processing
20:14:11 - cmdstanpy - DEBUG - CmdStan args: ['/home/runner/work/cmdstanpy/cmdstanpy/docsrc/users-guide/examples/bernoulli', 'id=4', 'random', 'seed=6352', 'data', 'file=users-guide/examples/bernoulli.data.json', 'output', 'file=/tmp/tmprj9uktub/bernoullil140l7_8/bernoulli-20241211201411_4.csv', 'method=sample', 'algorithm=hmc', 'adapt', 'engaged=1']
20:14:11 - cmdstanpy - INFO - Chain [3] start processing
20:14:11 - cmdstanpy - INFO - Chain [4] start processing
20:14:11 - cmdstanpy - INFO - Chain [1] done processing
20:14:11 - cmdstanpy - INFO - Chain [3] done processing
20:14:11 - cmdstanpy - INFO - Chain [2] done processing
20:14:11 - cmdstanpy - INFO - Chain [4] done processing
20:14:11 - cmdstanpy - DEBUG - runset
RunSet: chains=4, chain_ids=[1, 2, 3, 4], num_processes=4
cmd (chain 1):
['/home/runner/work/cmdstanpy/cmdstanpy/docsrc/users-guide/examples/bernoulli', 'id=1', 'random', 'seed=6352', 'data', 'file=users-guide/examples/bernoulli.data.json', 'output', 'file=/tmp/tmprj9uktub/bernoullil140l7_8/bernoulli-20241211201411_1.csv', 'method=sample', 'algorithm=hmc', 'adapt', 'engaged=1']
retcodes=[0, 0, 0, 0]
per-chain output files (showing chain 1 only):
csv_file:
/tmp/tmprj9uktub/bernoullil140l7_8/bernoulli-20241211201411_1.csv
console_msgs (if any):
/tmp/tmprj9uktub/bernoullil140l7_8/bernoulli-20241211201411_0-stdout.txt
20:14:11 - cmdstanpy - DEBUG - Chain 1 console:
method = sample (Default)
sample
num_samples = 1000 (Default)
num_warmup = 1000 (Default)
save_warmup = false (Default)
thin = 1 (Default)
adapt
engaged = true (Default)
gamma = 0.05 (Default)
delta = 0.8 (Default)
kappa = 0.75 (Default)
t0 = 10 (Default)
init_buffer = 75 (Default)
term_buffer = 50 (Default)
window = 25 (Default)
save_metric = false (Default)
algorithm = hmc (Default)
hmc
engine = nuts (Default)
nuts
max_depth = 10 (Default)
metric = diag_e (Default)
metric_file = (Default)
stepsize = 1 (Default)
stepsize_jitter = 0 (Default)
num_chains = 1 (Default)
id = 1 (Default)
data
file = users-guide/examples/bernoulli.data.json
init = 2 (Default)
random
seed = 6352
output
file = /tmp/tmprj9uktub/bernoullil140l7_8/bernoulli-20241211201411_1.csv
diagnostic_file = (Default)
refresh = 100 (Default)
sig_figs = -1 (Default)
profile_file = profile.csv (Default)
save_cmdstan_config = false (Default)
num_threads = 1 (Default)
Gradient evaluation took 3e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.03 seconds.
Adjust your expectations accordingly!
Iteration: 1 / 2000 [ 0%] (Warmup)
Iteration: 100 / 2000 [ 5%] (Warmup)
Iteration: 200 / 2000 [ 10%] (Warmup)
Iteration: 300 / 2000 [ 15%] (Warmup)
Iteration: 400 / 2000 [ 20%] (Warmup)
Iteration: 500 / 2000 [ 25%] (Warmup)
Iteration: 600 / 2000 [ 30%] (Warmup)
Iteration: 700 / 2000 [ 35%] (Warmup)
Iteration: 800 / 2000 [ 40%] (Warmup)
Iteration: 900 / 2000 [ 45%] (Warmup)
Iteration: 1000 / 2000 [ 50%] (Warmup)
Iteration: 1001 / 2000 [ 50%] (Sampling)
Iteration: 1100 / 2000 [ 55%] (Sampling)
Iteration: 1200 / 2000 [ 60%] (Sampling)
Iteration: 1300 / 2000 [ 65%] (Sampling)
Iteration: 1400 / 2000 [ 70%] (Sampling)
Iteration: 1500 / 2000 [ 75%] (Sampling)
Iteration: 1600 / 2000 [ 80%] (Sampling)
Iteration: 1700 / 2000 [ 85%] (Sampling)
Iteration: 1800 / 2000 [ 90%] (Sampling)
Iteration: 1900 / 2000 [ 95%] (Sampling)
Iteration: 2000 / 2000 [100%] (Sampling)
Elapsed Time: 0.005 seconds (Warm-up)
0.014 seconds (Sampling)
0.019 seconds (Total)
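A file like this is easier to skim once it is grouped by level. The sketch below assumes the formatter configured above ('%(asctime)s - %(name)s - %(levelname)s - %(message)s' with "%H:%M:%S") and treats any line without that prefix as a continuation of the previous record. The helper `parse_log` is hypothetical, and the sample lines are a shortened stand-in for all.log with a placeholder executable path.

```python
import re
from collections import Counter

# Matches lines written with the formatter used above:
# HH:MM:SS - logger name - LEVEL - message
LINE = re.compile(r"^(\d{2}:\d{2}:\d{2}) - (\S+) - ([A-Z]+) - (.*)$")


def parse_log(lines):
    """Group lines into records; unprefixed lines extend the previous record."""
    records = []
    for raw in lines:
        line = raw.rstrip("\n")
        match = LINE.match(line)
        if match:
            time, name, level, message = match.groups()
            records.append(
                {"time": time, "name": name, "level": level, "message": message}
            )
        elif records:
            records[-1]["message"] += "\n" + line
    return records


# Shortened stand-in for the contents of all.log (path is a placeholder).
sample = [
    "20:14:11 - cmdstanpy - DEBUG - cmd: /path/to/bernoulli info",
    "cwd: None",
    "20:14:11 - cmdstanpy - INFO - CmdStan start processing",
    "20:14:11 - cmdstanpy - DEBUG - idx 0",
    "20:14:11 - cmdstanpy - INFO - Chain [1] start processing",
]

records = parse_log(sample)
counts = Counter(record["level"] for record in records)
info_messages = [r["message"] for r in records if r["level"] == "INFO"]
print(counts)
print(info_messages)
```

To run it against the real file, pass an open file object, for example `parse_log(open('all.log'))`, since the function only needs an iterable of lines.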