7.9 Statement blocks and local variable declarations
Just as parentheses may be used to group expressions, curly brackets may be used to group a sequence of zero or more statements into a statement block. At the beginning of each block, local variables may be declared that are scoped over the rest of the statements in the block.
Blocks in for loops
Blocks are often used to group a sequence of statements together to be used in the body of a for loop. Because the body of a for loop can be any statement, for loops with bodies consisting of a single statement can be written as follows.
for (n in 1:N)
  y[n] ~ normal(mu, sigma);
To put multiple statements inside the body of a for loop, a block is used, as in the following example.
for (n in 1:N) {
  lambda[n] ~ gamma(alpha, beta);
  y[n] ~ poisson(lambda[n]);
}
The open curly bracket ({) is the first character of the block and the close curly bracket (}) is the last character.
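For context, the loop above could sit inside a complete program along the following lines. This is only a sketch: the data and parameter declarations are assumptions added for illustration, the array declaration uses the array[N] syntax of recent Stan releases, and alpha and beta are left with implicit flat priors.

```stan
data {
  int<lower=0> N;
  array[N] int<lower=0> y;       // observed counts (assumed layout)
}
parameters {
  real<lower=0> alpha;           // gamma shape (no explicit prior in this sketch)
  real<lower=0> beta;            // gamma rate (no explicit prior in this sketch)
  vector<lower=0>[N] lambda;     // one Poisson rate per observation
}
model {
  // The block groups both statements into the single loop body.
  for (n in 1:N) {
    lambda[n] ~ gamma(alpha, beta);
    y[n] ~ poisson(lambda[n]);
  }
}
```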
Because whitespace is ignored in Stan, indentation does not determine what belongs to the loop body, and the following program will not compile.
for (n in 1:N)
  y[n] ~ normal(mu, sigma);
  z[n] ~ normal(mu, sigma); // ERROR!
The problem is that the body of the for loop is taken to be the statement directly following it, which is y[n] ~ normal(mu, sigma). This leaves the probability statement for z[n] hanging, as is clear from the following equivalent program.
for (n in 1:N) {
  y[n] ~ normal(mu, sigma);
}
z[n] ~ normal(mu, sigma); // ERROR!
Neither of these programs will compile. If the loop variable n was defined before the for loop, the for-loop declaration will raise an error. If the loop variable n was not defined before the for loop, then the use of the expression z[n] will raise an error.
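The likely intent, with both probability statements inside the loop, can be expressed by wrapping them in a block. The following is a minimal sketch; the data and parameter declarations are assumptions made up for illustration so that the fragment forms a complete program.

```stan
data {
  int<lower=0> N;
  vector[N] y;
  vector[N] z;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  // Both statements now belong to the loop body, so n is in scope for each.
  for (n in 1:N) {
    y[n] ~ normal(mu, sigma);
    z[n] ~ normal(mu, sigma);
  }
}
```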
Local variable declarations
A for loop has a statement as a body. It is often convenient in writing programs to be able to define a local variable that will be used temporarily and then forgotten. For instance, the for loop example of repeated assignment should use a local variable for maximum clarity and efficiency, as in the following example.
for (n in 1:N) {
  real theta;
  theta = inv_logit(alpha + x[n] * beta);
  y[n] ~ bernoulli(theta);
}
The local variable theta is declared here inside the for loop. The scope of a local variable is just the block in which it is defined. Thus theta is available for use inside the for loop, but not outside of it. As in other situations, Stan does not allow variable hiding. So it is illegal to declare a local variable theta if the variable theta is already defined in the scope of the for loop. For instance, the following is not legal.
for (m in 1:M) {
  real theta;
  for (n in 1:N) {
    real theta; // ERROR!
    theta = inv_logit(alpha + x[m, n] * beta);
    y[m, n] ~ bernoulli(theta);
    // ...
The compiler will flag the second declaration of theta with a message that it is already defined.
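One way to avoid the clash is to declare theta only once, in the innermost block where it is used. The sketch below assumes the outer declaration was not needed for anything else. The data and parameter declarations are hypothetical, and the array[M, N] syntax assumes a recent Stan release.

```stan
data {
  int<lower=0> M;
  int<lower=0> N;
  matrix[M, N] x;
  array[M, N] int<lower=0, upper=1> y;
}
parameters {
  real alpha;
  real beta;
}
model {
  for (m in 1:M) {
    for (n in 1:N) {
      // Single declaration; theta goes out of scope at the end of this block.
      real theta = inv_logit(alpha + x[m, n] * beta);
      y[m, n] ~ bernoulli(theta);
    }
  }
}
```

Because theta goes out of scope at the end of each iteration of the inner loop, a fresh theta is declared on every pass and nothing is hidden.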
No constraints on local variables
Local variables may not have constraints on their declaration. The only types that may be used are
int, real, vector[K], row_vector[K], matrix[M, N].
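The contrast with block-level declarations can be sketched as follows. The variable names w and w_sq and the model itself are assumptions chosen only to show where a constraint may and may not appear.

```stan
data {
  int<lower=1> K;
  vector<lower=0>[K] w;        // constraint allowed: w is a data variable
}
parameters {
  real mu;
}
model {
  // Local variable: it has a size but carries no <lower=...> bound.
  vector[K] w_sq = w .* w;
  mu ~ normal(sum(w_sq), 1);
}
```

Writing vector<lower=0>[K] w_sq in the model block would be rejected, since constraints are not permitted on local variable declarations.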
Blocks within blocks
A block is itself a statement, so anywhere a sequence of statements is allowed, one or more of the statements may be a block. For instance, in a for loop, it is legal to have the following.
for (m in 1:M) {
  {
    int n = 2 * m;
    sum += n;
  }
  for (n in 1:N) {
    sum += x[m, n];
  }
}
The variable declaration int n = 2 * m; is the first element of an embedded block and so has scope within that block. The for loop defines its own local block implicitly over the statement following it, in which the loop variable is defined. As far as Stan is concerned, these two uses of n are unrelated.
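A complete program built around the same nesting might look like the sketch below. Here the accumulator is named total rather than sum, to keep it visibly distinct from the built-in sum function. That name, the data declarations, and the placement in transformed data are all assumptions made for illustration.

```stan
data {
  int<lower=0> M;
  int<lower=0> N;
  matrix[M, N] x;
}
transformed data {
  real total = 0;
  for (m in 1:M) {
    {
      int n = 2 * m;        // n is local to this embedded block
      total += n;
    }
    for (n in 1:N) {        // a separate n: the loop variable
      total += x[m, n];
    }
  }
}
parameters {
  real mu;
}
model {
  mu ~ normal(total, 1);
}
```

The embedded block closes before the inner for loop begins, so the first n is already out of scope when the loop variable n is introduced and no hiding occurs.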