6.9 Function application

Stan provides a range of built-in mathematical and statistical functions, which are documented in the built-in function documentation.

Expressions in Stan may consist of the name of a function followed by a parenthesized sequence of zero or more argument expressions. For instance, log(2.0) is the expression of type real denoting the result of applying the natural logarithm to the value of the real literal 2.0.

Syntactically, function application has higher precedence than any of the other operators, so that y + log(x) is interpreted as y + (log(x)).
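As a minimal sketch of the two points above, the following program fragment applies log to a literal and shows that the parenthesized and unparenthesized sums are the same expression. The variable names x, y, a, b, and c are illustrative assumptions, not part of the manual.

```stan
transformed data {
  real x = 2.0;
  real y = 1.5;
  real a = log(2.0);        // function application; the expression has type real
  real b = y + log(x);      // application binds tighter than +
  real c = y + (log(x));    // same value as b
}
```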

Type signatures and result type inference

Each function has a type signature which determines the allowable types of its arguments and its return type. For instance, the function signature for the logarithm function can be expressed as

real log(real);

and the signature for the lmultiply function is

real lmultiply(real,real);

A function is uniquely determined by its name and its sequence of argument types. For instance, the following two functions are different functions.

real mean(real[]);

real mean(vector);

The first applies to a one-dimensional array of real values and the second to a vector.

The identity conditions for functions explicitly forbid having two functions with the same name and argument types but different return types. This restriction also makes it possible to infer the type of a function expression compositionally by examining only the types of its subexpressions.

Constants

Constants in Stan are nothing more than nullary (no-argument) functions. For instance, the mathematical constants \(\pi\) and \(e\) are represented as nullary functions named pi() and e(). See the built-in constants section for a list of built-in constants.
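A short sketch of constants used as nullary function calls follows. The variable names are assumptions chosen for illustration; the empty parentheses are required, because pi and e on their own would be read as variable names.

```stan
transformed data {
  real r = 2.0;
  real circumference = 2 * pi() * r;  // pi() is a call with zero arguments
  real one = log(e());                // natural log of e, which is 1 up to rounding
}
```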

Type promotion and function resolution

Because of integer to real type promotion, rules must be established for which function is called given a sequence of argument types. The scheme employed by Stan is the same as that used by C++, which resolves a function call to the function requiring the minimum number of type promotions.

For example, consider a situation in which the following two function signatures have been registered for foo.

real foo(real,real);
int foo(int,int);

The use of foo in the expression foo(1.0,1.0) resolves to foo(real,real), and thus the expression foo(1.0,1.0) itself is assigned a type of real.

Because integers may be promoted to real values, the expression foo(1,1) could potentially match either foo(real,real) or foo(int,int). The former requires two type promotions and the latter requires none, so foo(1,1) is resolved to function foo(int,int) and is thus assigned the type int.

The expression foo(1,1.0) has argument types (int,real) and thus does not explicitly match either function signature. By promoting the integer expression 1 to type real, it is able to match foo(real,real), and hence the type of the function expression foo(1,1.0) is real.
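Promotion can be sketched with a user-defined function. This is a hedged illustration, assuming a hypothetical foo with only the real-valued signature; whether a second definition such as int foo(int, int) may be declared alongside it depends on the Stan version, so it is left out here. With only one signature available, every call below resolves to foo(real, real), including foo(1, 1), which would instead resolve to foo(int, int) if that signature were also registered.

```stan
functions {
  // hypothetical function; only the (real, real) signature is defined
  real foo(real x, real y) {
    return x + y;
  }
}
transformed data {
  real a = foo(1.0, 1.0);  // exact match, no promotions
  real b = foo(1, 1.0);    // first argument promoted from int to real
  real c = foo(1, 1);      // both arguments promoted from int to real
}
```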

In some cases (though not for any built-in Stan functions), a situation may arise in which the function referred to by an expression remains ambiguous. For example, consider a situation in which there are exactly two functions named bar with the following signatures.

real bar(real,int);
real bar(int,real);

With these signatures, the expressions bar(1.0,1) and bar(1,1.0) resolve to the first and second of the above functions, respectively. The expression bar(1.0,1.0) is illegal because real values may not be demoted to integers. The expression bar(1,1) is illegal for a different reason. If the first argument is promoted to a real value, it matches the first signature, whereas if the second argument is promoted to a real value, it matches the second signature. The problem is that these both require one promotion, so the function name bar is ambiguous. If there is not a unique function requiring fewer promotions than all others, as with bar(1,1) given the two declarations above, the Stan compiler will flag the expression as illegal.

Random-number generating functions

For most of the distributions supported by Stan, there is a corresponding random-number generating function. These random-number generators are named by the distribution with the suffix _rng. For example, a univariate normal random number can be generated by normal_rng(0,1); only the parameters of the distribution, here a location (0) and scale (1), are specified because the variate is generated.

Random-number generator locations

The use of random-number generating functions is restricted to the transformed data and generated quantities blocks; attempts to use them elsewhere will result in a parsing error with a diagnostic message. They may also be used in the bodies of user-defined functions whose names end in _rng.

This allows the random number generating functions to be used for simulation in general, and for Bayesian posterior predictive checking in particular.
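The permitted locations can be sketched in one small program. This is an illustration under stated assumptions rather than a recommended model: the helper shifted_normal_rng and the variables z, mu, and draw are hypothetical names introduced here.

```stan
functions {
  // hypothetical helper; its name ends in _rng, so its body may call normal_rng
  real shifted_normal_rng(real shift) {
    return shift + normal_rng(0, 1);
  }
}
transformed data {
  real z = normal_rng(0, 1);  // allowed: transformed data block
}
parameters {
  real mu;
}
model {
  mu ~ normal(z, 1);          // no _rng calls are allowed in this block
}
generated quantities {
  real draw = shifted_normal_rng(mu);  // allowed: generated quantities block
}
```

The value z is generated once, when the data are transformed, whereas draw is generated anew for each saved iteration.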

Posterior predictive checking

Posterior predictive checks typically use the parameters of the model to generate simulated data (at the individual and optionally at the group level for hierarchical models). The simulated data can then be compared to the actual data, informally using plots and formally by means of test statistics, in order to assess the suitability of the model; see Chapter 6 of (Gelman et al. 2013) for more information on posterior predictive checks.
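A minimal sketch of such a check for a simple normal model follows. The model, the variable names (y_rep, max_rep, max_gte), and the choice of the maximum as a test statistic are all assumptions made for illustration; a real check would use statistics suited to the model in question.

```stan
data {
  int<lower=1> N;
  vector[N] y;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  y ~ normal(mu, sigma);
}
generated quantities {
  vector[N] y_rep;                 // one simulated data set per draw
  real max_rep;                    // test statistic on the simulated data
  int<lower=0, upper=1> max_gte;   // 1 if simulated max >= observed max
  for (n in 1:N) {
    y_rep[n] = normal_rng(mu, sigma);
  }
  max_rep = max(y_rep);
  max_gte = max_rep >= max(y);
}
```

The posterior mean of max_gte estimates the probability that the simulated maximum is at least as large as the observed one, and the draws of y_rep can be plotted against y for an informal comparison.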

References

Gelman, Andrew, J. B. Carlin, Hal S. Stern, David B. Dunson, Aki Vehtari, and Donald B. Rubin. 2013. Bayesian Data Analysis. Third Edition. London: Chapman & Hall / CRC Press.