6.2 Variables
A variable by itself is a well-formed expression of the same type as
the variable. Variables in Stan consist of ASCII strings containing
only the basic lower-case and upper-case Roman letters, digits, and
the underscore (_) character. Variables must start with a
letter (a-z and A-Z) and may not end with two underscores (__).
Examples of legal variable identifiers are as follows.
a, a3, a_3, Sigma, my_cpp_style_variable, myCamelCaseVariable
Unlike in R and BUGS, variable identifiers in Stan may not contain a period character.
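The lexical rules above can be sketched as a small checker. The following Python sketch is not part of Stan. The name is_legal_identifier is hypothetical, and the function encodes only the three rules stated in this section (ASCII letters, digits, and underscore; a leading letter; no trailing double underscore). It does not check reserved names.

```python
import re

# Hypothetical helper: encodes only the lexical rules stated in this section.
# A leading ASCII letter followed by ASCII letters, digits, or underscores.
_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_legal_identifier(name: str) -> bool:
    """Return True if name satisfies the lexical rules for Stan variables."""
    if _IDENTIFIER.fullmatch(name) is None:
        return False
    # Identifiers may not end with two underscores.
    return not name.endswith("__")
```

Under these rules, names such as a_3 and myCamelCaseVariable pass, while a.b (contains a period), _a (does not start with a letter), and a__ (ends with two underscores) are rejected.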
Reserved Names
Stan reserves many strings for internal use, and these may not be used
as the name of a variable. An attempt to name a variable after a
reserved string results in the stanc translator halting with an
error message indicating which reserved name was used and its location
in the model code.
Model Name
The name of the model cannot be used as a variable within the model.
This is usually not a problem because the default in bin/stanc
is to append _model to the name of the file containing the
model specification. For example, if the model is in file
foo.stan, it would not be legal to have a variable named
foo_model when using the default model name through
bin/stanc. With user-specified model names, variables cannot
match the model name.
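The default naming rule can be illustrated with a short Python sketch. The helper names below are hypothetical, and the sketch assumes the default name is the file name without its extension followed by _model, as the foo.stan example suggests. It does not reproduce the actual stanc logic.

```python
from pathlib import Path


def default_model_name(stan_file: str) -> str:
    """Hypothetical helper: file name without its extension, plus "_model"."""
    return Path(stan_file).stem + "_model"


def clashes_with_model_name(variable: str, stan_file: str) -> bool:
    """True if the variable name equals the assumed default model name."""
    return variable == default_model_name(stan_file)
```

With this sketch, a model stored in foo.stan yields the name foo_model, so a variable called foo_model would clash while a variable called foo would not.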
User-Defined Function Names
The name of a user-defined function cannot be used as a variable name within the model.
Reserved Words from Stan Language
The following list contains reserved words for Stan’s programming language. Not all of these features are implemented in Stan yet, but the tokens are reserved for future use.
for, in, while, repeat, until, if, then, else,
true, false, target
Variables should not be named after types, either, and thus may not be any of the following.
int, real, vector, simplex, unit_vector, ordered,
positive_ordered, row_vector, matrix,
cholesky_factor_corr, cholesky_factor_cov,
corr_matrix, cov_matrix
The following block identifiers are reserved and cannot be used as variable names:
functions, model, data, parameters, quantities,
transformed, generated
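The three lists above can be combined into a simple lookup. The Python sketch below copies the words from this section verbatim. The function name reserved_category and the category labels are hypothetical and chosen only for illustration.

```python
# Word lists copied from this section of the manual.
KEYWORDS = {
    "for", "in", "while", "repeat", "until", "if", "then", "else",
    "true", "false", "target",
}
TYPE_NAMES = {
    "int", "real", "vector", "simplex", "unit_vector", "ordered",
    "positive_ordered", "row_vector", "matrix",
    "cholesky_factor_corr", "cholesky_factor_cov",
    "corr_matrix", "cov_matrix",
}
BLOCK_IDENTIFIERS = {
    "functions", "model", "data", "parameters", "quantities",
    "transformed", "generated",
}

# Hypothetical category labels, used only for reporting.
_CATEGORIES = (
    ("keyword", KEYWORDS),
    ("type", TYPE_NAMES),
    ("block identifier", BLOCK_IDENTIFIERS),
)


def reserved_category(name):
    """Return the label of the list containing name, or None if it is in none."""
    for label, words in _CATEGORIES:
        if name in words:
            return label
    return None
```

A name that returns None here may still be reserved for other reasons described in this section, such as a clash with a function name or a C++ keyword.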
Reserved Names from Stan Implementation
Some variable names are reserved because they are used within Stan’s C++ implementation. These are as follows.
var, fvar, STAN_MAJOR, STAN_MINOR, STAN_PATCH,
STAN_MATH_MAJOR, STAN_MATH_MINOR, STAN_MATH_PATCH
Reserved Function and Distribution Names
Variable names will conflict with the names of predefined functions
other than constants. Thus a variable may not be named logit
or add, but it may be named pi or e.
Variable names will also conflict with the names of distributions
suffixed with _lpdf, _lpmf, _lcdf, _lccdf, _cdf, and _ccdf,
such as normal_lcdf; this also holds for the deprecated
forms _log, _cdf_log, and _ccdf_log.
Using any of these variable names causes the stanc translator
to halt and report the name and location of the variable causing the
conflict.
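The suffix rule can be sketched in Python as follows. The suffix lists come from this section, but SAMPLE_DISTRIBUTIONS is a tiny illustrative sample and not Stan's full list of distributions, and the function name is hypothetical. The check is deliberately conservative: it flags every combination of a sample distribution and a suffix, whether or not Stan defines that particular function.

```python
# Suffixes listed in this section.
CURRENT_SUFFIXES = ("_lpdf", "_lpmf", "_lcdf", "_lccdf", "_cdf", "_ccdf")
DEPRECATED_SUFFIXES = ("_log", "_cdf_log", "_ccdf_log")

# Assumption: an illustrative sample only, not Stan's full distribution list.
SAMPLE_DISTRIBUTIONS = frozenset({"normal", "poisson"})


def conflicts_with_distribution(name, distributions=SAMPLE_DISTRIBUTIONS):
    """True if name is a known distribution name followed by a listed suffix."""
    for suffix in CURRENT_SUFFIXES + DEPRECATED_SUFFIXES:
        # Try every suffix, since "_cdf_log" also ends with "_log".
        if name.endswith(suffix) and name[: -len(suffix)] in distributions:
            return True
    return False
```

For instance, normal_lpdf and the deprecated normal_cdf_log are flagged, whereas mu_lpdf is not, because mu is not in the sample set of distributions.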
Reserved Names from C++
Finally, variable names, including the names of models, should not conflict with any of the C++ keywords.
alignas, alignof, and, and_eq, asm, auto, bitand, bitor, bool,
break, case, catch, char, char16_t, char32_t, class, compl,
const, constexpr, const_cast, continue, decltype, default,
delete, do, double, dynamic_cast, else, enum, explicit,
export, extern, false, float, for, friend, goto, if,
inline, int, long, mutable, namespace, new, noexcept,
not, not_eq, nullptr, operator, or, or_eq, private,
protected, public, register, reinterpret_cast, return,
short, signed, sizeof, static, static_assert, static_cast,
struct, switch, template, this, thread_local, throw, true,
try, typedef, typeid, typename, union, unsigned, using,
virtual, void, volatile, wchar_t, while, xor, xor_eq
Legal Characters
The legal characters for variable identifiers are given in the identifier characters table.
Identifier Characters Table. The alphanumeric characters and underscore in base ASCII are the only legal characters in Stan identifiers.
characters | ASCII code points |
---|---|
a – z | 97 – 122 |
A – Z | 65 – 90 |
0 – 9 | 48 – 57 |
_ | 95 |
Although not the most expressive character set, ASCII is the most portable and least prone to corruption through improper character encodings or decodings. Sticking to this range of ASCII makes Stan compatible with Latin-1 or UTF-8 encodings of these characters, which are byte-for-byte identical to ASCII.
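The byte-for-byte claim can be checked directly. The Python sketch below is only an illustration, and the helper name encodings_agree is hypothetical. It encodes a string as ASCII, Latin-1, and UTF-8 and reports whether all three byte sequences are identical.

```python
import string

# Every character that may appear in an identifier, per the table above.
IDENTIFIER_CHARS = (
    string.ascii_lowercase + string.ascii_uppercase + string.digits + "_"
)


def encodings_agree(text: str) -> bool:
    """True if ASCII, Latin-1, and UTF-8 encode text to the same bytes."""
    try:
        encoded = {text.encode(enc) for enc in ("ascii", "latin-1", "utf-8")}
    except UnicodeEncodeError:
        # At least one encoding cannot represent a character in text.
        return False
    return len(encoded) == 1
```

Calling encodings_agree(IDENTIFIER_CHARS) returns True, while a string containing a non-ASCII letter such as é returns False because ASCII cannot encode it.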
Comments Allow ASCII-Compatible Encoding
Within comments, Stan can work with any ASCII-compatible character encoding, such as ASCII itself, UTF-8, or Latin-1. It is up to user shells and editors to display such comments properly.