12.1 Reading and transforming data
The steps for reading and transforming data are the same for sampling, optimization, and diagnostics.
Read data
The first step of execution is to read data into memory. Data may be read in from a file (in CmdStan) or from memory (in RStan and PyStan); see their respective manuals for details.[18]
All of the variables declared in the data block will be read. If a variable cannot be read, the program will halt with a message indicating which data variable is missing.
After each variable is read, if it has a declared constraint, the constraint is validated. For example, if a variable N is declared as int<lower=0>, after N is read, it will be tested to make sure it is greater than or equal to zero. If a variable violates its declared constraint, the program will halt with a warning message indicating which variable contains an illegal value, the value that was read, and the constraint that was declared.
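As a minimal sketch of the validation described above, the data block below declares N with a lower bound, as in the example in the text. The vector y and its bound are hypothetical additions for illustration only.

```stan
data {
  // Checked right after N is read: N must be >= 0.
  int<lower=0> N;

  // Hypothetical second variable: every element must be >= 0.
  vector<lower=0>[N] y;
}
```

Under these declarations, supplying N = -1 or a negative element of y would be expected to halt the program when that variable is read, and omitting either variable from the input would halt it with a message naming the missing variable.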
Define transformed data
After data is read into the model, the transformed data variable statements are executed in order to define the transformed data variables. As the statements execute, declared constraints on variables are not enforced.
Transformed data variables are initialized with real values set to NaN and integer values set to the smallest representable integer (a negative number with large absolute value).
After the statements are executed, all declared constraints on transformed data variables are validated. If the validation fails, execution halts and the variable’s name, value and constraints are displayed.
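The following sketch illustrates when validation of transformed data happens. The variable names x and ss are hypothetical, and the assignment of -1.0 is contrived purely to show that a constraint can be violated temporarily inside the block.

```stan
data {
  int<lower=0> N;
  vector[N] x;
}
transformed data {
  // ss starts out as NaN until it is assigned.
  real<lower=0> ss;

  // Violates lower=0, but constraints are not enforced mid-block.
  ss = -1.0;

  // Final value is a sum of squares, so it is >= 0.
  ss = dot_self(x);

  // The declared constraint on ss is validated only after the last
  // statement in this block has executed.
}
```

Because only the final value of ss is checked, this block should pass validation; if the last assignment were removed, the check at the end of the block would be expected to fail and report the name, value, and constraint of ss.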
[18] The C++ code underlying Stan is flexible enough to allow data to be read from memory or file. Calls from R, for instance, can be configured to read data from file or directly from R’s memory.