7.9 Statement blocks and local variable declarations

Just as parentheses may be used to group expressions, curly brackets may be used to group a sequence of zero or more statements into a statement block. At the beginning of each block, local variables may be declared that are scoped over the rest of the statements in the block.

Blocks in for loops

Blocks are often used to group a sequence of statements together to be used in the body of a for loop. Because the body of a for loop can be any statement, for loops with bodies consisting of a single statement can be written as follows.

for (n in 1:N)
  y[n] ~ normal(mu, sigma);

To put multiple statements inside the body of a for loop, a block is used, as in the following example.

for (n in 1:N) {
  lambda[n] ~ gamma(alpha, beta);
  y[n] ~ poisson(lambda[n]);
}

The open curly bracket ({) is the first character of the block and the close curly bracket (}) is the last character.

Because whitespace is ignored in Stan, indentation does not determine which statements belong to a loop body, and the following program will not compile.

for (n in 1:N)
  y[n] ~ normal(mu, sigma);
  z[n] ~ normal(mu, sigma); // ERROR!

The problem is that the body of the for loop is taken to be the statement directly following it, which is y[n] ~ normal(mu, sigma). This leaves the probability statement for z[n] hanging, as is clear from the following equivalent program.

for (n in 1:N) {
  y[n] ~ normal(mu, sigma);
}
z[n] ~ normal(mu, sigma); // ERROR!

Neither of these programs will compile. If the loop variable n was defined before the for loop, the for-loop declaration will raise an error. If the loop variable n was not defined before the for loop, then the use of the expression z[n] will raise an error.
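One way to repair the program is to wrap both statements in a block so that they form the loop body together. The following is a minimal sketch written as a complete program. The data and parameter declarations (N, y, z, mu, sigma) are assumptions added for illustration, because the fragments above do not show how these variables are declared.

data {
  int<lower=0> N;
  vector[N] y;   // assumed declaration, not shown in the fragment above
  vector[N] z;   // assumed declaration, not shown in the fragment above
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  for (n in 1:N) {
    // both statements are inside the block, so n is in scope for each
    y[n] ~ normal(mu, sigma);
    z[n] ~ normal(mu, sigma);
  }
}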

Local variable declarations

A for loop has a statement as a body. It is often convenient to define a local variable that is used temporarily and then discarded. For instance, the for loop example of repeated assignment is clearer and more efficient when written with a local variable, as in the following example.

for (n in 1:N) {
  real theta;
  theta = inv_logit(alpha + x[n] * beta);
  y[n] ~ bernoulli(theta);
}

The local variable theta is declared here inside the for loop. The scope of a local variable is just the block in which it is defined. Thus theta is available for use inside the for loop, but not outside of it. As in other situations, Stan does not allow variable hiding. So it is illegal to declare a local variable theta if the variable theta is already defined in the scope of the for loop. For instance, the following is not legal.

for (m in 1:M) {
  real theta;
  for (n in 1:N) {
    real theta; // ERROR!
    theta = inv_logit(alpha + x[m, n] * beta);
    y[m, n] ~ bernoulli(theta);
// ...

The compiler will flag the second declaration of theta with a message that it is already defined.
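A legal variant declares the temporary only once, in the innermost block that uses it. The sketch below is written as a complete program under some assumptions. The local name mu_mn, the declarations of M, N, x, y, alpha, beta, and sigma, and the normal likelihood are all illustrative choices. The normal likelihood in particular replaces the Bernoulli example above so that y can be declared as a matrix.

data {
  int<lower=0> M;
  int<lower=0> N;
  matrix[M, N] x;   // assumed predictor layout
  matrix[M, N] y;   // assumed outcome layout
}
parameters {
  real alpha;
  real beta;
  real<lower=0> sigma;
}
model {
  for (m in 1:M) {
    for (n in 1:N) {
      real mu_mn;   // single declaration; no outer variable of the same name
      mu_mn = alpha + x[m, n] * beta;
      y[m, n] ~ normal(mu_mn, sigma);
    }
  }
}

Because mu_mn is scoped to the inner block, it is created anew on each iteration and is not visible after the inner loop's closing bracket.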

No constraints on local variables

Local variables may not have constraints on their declaration. The only types that may be used are

int, real, vector[K], row_vector[K], and matrix[M, N].
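As a minimal sketch of the distinction, the program below declares a constrained parameter at the block level and an unconstrained local variable inside the model block. The variable names (N, y, mu, sigma, scale) and the factor of 2 are assumptions chosen only for illustration.

data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  real mu;
  real<lower=0> sigma;   // constraints are allowed on parameter declarations
}
model {
  real scale;            // legal: local variable declared without a constraint
  // real<lower=0> scale;   // not legal: constraint on a local variable
  scale = 2 * sigma;
  y ~ normal(mu, scale);
}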

Blocks within blocks

A block is itself a statement, so anywhere a sequence of statements is allowed, one or more of the statements may be a block. For instance, in a for loop, it is legal to have the following.

for (m in 1:M) {
  {
    int n = 2 * m;
    sum += n;
  }
  for (n in 1:N) {
    sum += x[m, n];
  }
}

The variable declaration int n = 2 * m; is the first element of an embedded block and so has scope only within that block. The for loop implicitly defines its own local block over the statement following it, and the loop variable n is defined within that block. As far as Stan is concerned, these two uses of n are unrelated.