6.9 Function Application
Stan provides a range of built-in mathematical and statistical functions, which are documented in the built-in function documentation.
Expressions in Stan may consist of the name of a function followed by a sequence of zero or more argument expressions. For instance, log(2.0) is the expression of type real denoting the result of applying the natural logarithm to the value of the real literal 2.0.
Syntactically, function application has higher precedence than any of the other operators, so that y + log(x) is interpreted as y + (log(x)).
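As a minimal sketch of the two points above, the following program applies log to an argument and relies on the precedence rule. The variable names x, y, a, b, and c are illustrative assumptions, not names from the manual.

```stan
transformed data {
  real x = 2.0;
  real y = 1.0;
  real a = log(x);        // function name followed by one argument expression
  real b = y + log(x);    // parsed as y + (log(x))
  real c = y + (log(x));  // explicit parentheses; same value as b
  print("a = ", a, " b = ", b, " c = ", c);
}
```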
Type Signatures and Result Type Inference
Each function has a type signature, which determines the allowable types of its arguments and its return type. For instance, the function signature for the logarithm function can be expressed as
real log(real);
and the signature for the lmultiply function is
real lmultiply(real,real);
A function is uniquely determined by its name and its sequence of argument types. For instance, the following two functions are different functions.
real mean(real[]);
real mean(vector);
The first applies to a one-dimensional array of real values and the second to a vector.
The identity conditions for functions explicitly forbid having two functions with the same name and argument types but different return types. This restriction also makes it possible to infer the type of a function expression compositionally by examining only the types of its subexpressions.
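The signatures above can be exercised with a short sketch such as the following. The variable names are illustrative assumptions, and the array declaration uses the syntax of recent Stan releases; older releases declare the same array as real a[3].

```stan
transformed data {
  // Array syntax of recent Stan releases (assumption about the installed version).
  array[3] real a = {1.0, 2.0, 3.0};
  vector[3] v = [1.0, 2.0, 3.0]';

  real m_a = mean(a);             // argument is a real array: mean(real[])
  real m_v = mean(v);             // argument is a vector: mean(vector)
  real lm = lmultiply(2.0, 3.0);  // matches real lmultiply(real, real)
  print(m_a, " ", m_v, " ", lm);
}
```

In each call the result type is read off the selected signature, so all three variables can be declared as real without inspecting anything but the argument types.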
Constants
Constants in Stan are nothing more than nullary (no-argument) functions. For instance, the mathematical constants \(\pi\) and \(e\) are represented as nullary functions named pi() and e().
See the built-in constants section for a list of built-in constants.
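Because constants are function calls, the parentheses are required wherever they are used. A small sketch, with variable names chosen only for illustration:

```stan
transformed data {
  real radius = 1.5;
  real circumference = 2 * pi() * radius;  // pi() is a nullary function call
  real growth = e() ^ 2;                   // e() likewise takes no arguments
  print(circumference, " ", growth);
}
```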
Type Promotion and Function Resolution
Because integers may be promoted to real values, rules are needed to determine which function is called for a given sequence of argument types. The scheme employed by Stan is the same as that used by C++, which resolves a function call to the function requiring the minimum number of type promotions.
For example, consider a situation in which the following two function signatures have been registered for foo.
real foo(real,real);
int foo(int,int);
The use of foo in the expression foo(1.0,1.0) resolves to foo(real,real), and thus the expression foo(1.0,1.0) itself is assigned a type of real.
Because integers may be promoted to real values, the expression foo(1,1) could potentially match either foo(real,real) or foo(int,int). The former requires two type promotions and the latter requires none, so foo(1,1) is resolved to function foo(int,int) and is thus assigned the type int.
The expression foo(1,1.0) has argument types (int,real) and thus does not exactly match either function signature. By promoting the integer expression 1 to type real, it is able to match foo(real,real), and hence the type of the function expression foo(1,1.0) is real.
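The three cases above can be written out as a sketch. It assumes a Stan release that permits overloading user-defined functions; the function bodies are hypothetical and exist only so that the two foo signatures from the text can be declared.

```stan
functions {
  // Hypothetical overloads mirroring the signatures in the text.
  real foo(real x, real y) {
    return x + y;
  }
  int foo(int x, int y) {
    return x + y;
  }
}
transformed data {
  real a = foo(1.0, 1.0);  // exact match: foo(real, real)
  int b = foo(1, 1);       // zero promotions: foo(int, int)
  real c = foo(1, 1.0);    // one promotion (int to real): foo(real, real)
  print(a, " ", b, " ", c);
}
```

The declared type of each variable matches the return type of the signature that the call resolves to.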
In some cases (though not for any built-in Stan functions), a situation may arise in which the function referred to by an expression remains ambiguous. For example, consider a situation in which there are exactly two functions named bar with the following signatures.
real bar(real,int);
real bar(int,real);
With these signatures, the expressions bar(1.0,1) and bar(1,1.0) resolve to the first and second of the above functions, respectively. The expression bar(1.0,1.0) is illegal because real values may not be demoted to integers. The expression bar(1,1) is illegal for a different reason. If the first argument is promoted to a real value, it matches the first signature, whereas if the second argument is promoted to a real value, it matches the second signature. Both matches require exactly one promotion, so the function name bar is ambiguous.
If there is not a unique function requiring fewer promotions than all others, as with bar(1,1) given the two declarations above, the Stan compiler will flag the expression as illegal.
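The four bar calls discussed above can be sketched as follows, again assuming a Stan release that permits overloading user-defined functions. The function bodies are hypothetical, and the two illegal calls are left commented out so that the sketch still compiles.

```stan
functions {
  // Hypothetical overloads mirroring the signatures in the text.
  real bar(real x, int n) {
    return x + n;
  }
  real bar(int n, real x) {
    return n - x;
  }
}
transformed data {
  real a = bar(1.0, 1);  // exact match: bar(real, int)
  real b = bar(1, 1.0);  // exact match: bar(int, real)
  // real c = bar(1.0, 1.0);  // illegal: a real cannot be demoted to int
  // real d = bar(1, 1);      // illegal: each signature needs one promotion
  print(a, " ", b);
}
```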
Random-Number Generating Functions
For most of the distributions supported by Stan, there is a corresponding random-number generating function. These random-number generators are named by the distribution with the suffix _rng.
For example, a univariate normal random number can be generated by normal_rng(0,1); only the parameters of the distribution, here a location (0) and scale (1), are specified because the variate itself is what is generated.
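A minimal sketch of such a call is shown below. The variable name z is an illustrative assumption, and the call is placed in the generated quantities block, which is one of the places where Stan accepts random-number generators.

```stan
generated quantities {
  // Arguments are the location and scale only; the variate is the result.
  real z = normal_rng(0, 1);
}
```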
Random-Number Generator Locations
The use of random-number generating functions is restricted to the transformed data and generated quantities blocks; attempts to use them elsewhere will result in a parsing error with a diagnostic message. They may also be used in the bodies of user-defined functions whose names end in _rng.
This allows the random-number generating functions to be used for simulation in general, and for Bayesian posterior predictive checking in particular.
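The permitted locations can be illustrated with one sketch. The helper shifted_normal_rng and the variable names are hypothetical; the point is only that the _rng suffix on the user-defined function is what allows its body to call normal_rng.

```stan
functions {
  // Hypothetical helper; the name ends in _rng, so its body may call normal_rng.
  real shifted_normal_rng(real mu, real sigma, real shift) {
    return normal_rng(mu, sigma) + shift;
  }
}
transformed data {
  // Simulated once, before sampling begins.
  real y_sim = shifted_normal_rng(0, 1, 2.5);
}
generated quantities {
  // Simulated once per iteration.
  real y_draw = shifted_normal_rng(0, 1, 2.5);
}
```

Moving either call into the model block, or dropping the _rng suffix from the helper's name, would be expected to produce the diagnostic described above.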
Posterior Predictive Checking
Posterior predictive checks typically use the parameters of the model to generate simulated data (at the individual level and, for hierarchical models, optionally at the group level). The simulated data can then be compared to the actual data, informally using plots and formally by means of test statistics, in order to assess the suitability of the model. See Chapter 6 of (Gelman et al. 2013) for more information on posterior predictive checks.
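As a sketch of this workflow, the following program fits a simple normal model and generates one replicated data set per posterior draw. The model, the variable names, and the choice of the maximum as a test statistic are all illustrative assumptions rather than recommendations from the manual.

```stan
data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  y ~ normal(mu, sigma);
}
generated quantities {
  vector[N] y_rep;  // replicated data simulated from the current draw
  real max_rep;     // test statistic computed on the replicated data
  for (n in 1:N) {
    y_rep[n] = normal_rng(mu, sigma);
  }
  max_rep = max(y_rep);
}
```

The draws of y_rep can be plotted against y for an informal check, and the distribution of max_rep can be compared with the maximum of the observed y for a more formal one.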
References
Gelman, Andrew, John B. Carlin, Hal S. Stern, David B. Dunson, Aki Vehtari, and Donald B. Rubin. 2013. Bayesian Data Analysis. Third edition. London: Chapman & Hall/CRC Press.