23.9 White Space
Stan allows spaces between elements of a program. The whitespace
characters allowed in Stan programs include the space (ASCII 0x20),
line feed (ASCII 0x0A), carriage return (ASCII 0x0D), and tab
(ASCII 0x09). Stan treats all whitespace characters interchangeably,
with any sequence of whitespace characters being syntactically
equivalent to a single space character.
Nevertheless, effective use of whitespace is the key to good program
layout.
Line Breaks Between Statements and Declarations
Avoid placing multiple statements or declarations on the same line, as in the following example.
transformed parameters {
  real mu_centered; real sigma;
  mu_centered = (mu_raw - mean_mu_raw); sigma = pow(tau,-2);
}
These should be broken into four separate lines.
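One way to lay out the block on four lines is sketched below. It is a fragment rather than a complete program: it assumes that the first assignment is meant to target the declared variable mu_centered, and that mu_raw, mean_mu_raw, and tau are declared in earlier blocks.

```stan
transformed parameters {
  // one declaration per line
  real mu_centered;
  real sigma;
  // one statement per line
  // (mu_raw, mean_mu_raw, and tau are assumed to be declared earlier)
  mu_centered = (mu_raw - mean_mu_raw);
  sigma = pow(tau, -2);
}
```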
No Tabs
Stan programs should not contain tab characters. Tabs are legal and may be used anywhere other whitespace occurs, but using them to lay out a program is highly unportable because the number of spaces represented by a single tab character varies depending on which program is doing the rendering and how it is configured.
Two-Character Indents
Stan has standardized on two space characters of indentation, which is a common convention for C/C++ code. Another sensible choice is four spaces, which is the usual convention for Java and Python. Whichever you choose, be consistent.
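As an illustration, a minimal program with two-space indentation might look like the following. The model and its variable names (N, y, mu, sigma) are hypothetical and chosen only to show the layout; each nested block is indented two spaces further than the block that encloses it.

```stan
data {
  int<lower=0> N;       // number of observations
  vector[N] y;          // observed values
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  // loop body is indented two spaces inside the model block
  for (n in 1:N) {
    y[n] ~ normal(mu, sigma);
  }
}
```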
23.9.1 Space Between if, { and Condition
Use a space after each if. For instance, use if (x < y) ..., not if(x < y) ....
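A short sketch of this spacing in context follows. The function name larger is hypothetical and exists only to give the conditional somewhere to live; note the space between if and the condition, and between the condition and the opening brace.

```stan
functions {
  // hypothetical helper: returns the larger of two reals
  real larger(real x, real y) {
    if (x < y) {
      return y;
    } else {
      return x;
    }
  }
}
```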
No Space For Function Calls
There is no space between a function name and the arguments it applies
to. For instance, use normal(0,1), not normal (0,1).
Spaces Around Operators
There should be spaces around binary operators. For instance, use
y[1] = x, not y[1]=x, and use (x + y) * z, not (x+y)*z.
Breaking Expressions across Lines
Sometimes expressions are too long to fit on a single line. In that case, the recommended form is to break before an operator,[46] aligning the operators to indicate scoping. For example, use the following form (though not the content; inverting matrices is almost always a bad idea).
target += (y - mu)'
          * inv(Sigma)
          * (y - mu);
Here, the multiplication operator (*) is aligned to clearly
signal the multiplicands in the product.
For function arguments, break after a comma and line the next argument up underneath as follows.
y[n] ~ normal(alpha + beta * x + gamma * y,
              pow(tau,-0.5));
Optional Spaces after Commas
Optionally use spaces after commas in function arguments for clarity.
For example, normal(alpha * x[n] + beta,sigma) can also be
written as normal(alpha * x[n] + beta, sigma).
Unix Newlines
Wherever possible, Stan programs should use a single line feed character to separate lines. All of the Stan developers (so far, at least) work on Unix-like operating systems and using a standard newline makes the programs easier for us to read and share.
Platform Specificity of Newlines
Newlines are signaled in Unix-like operating systems such as Linux and
Mac OS X with a single line-feed (LF) character (ASCII code point
0x0A). Newlines are signaled in Windows using two characters,
a carriage return (CR) character (ASCII code point 0x0D)
followed by a line-feed (LF) character.
[46] Breaking before an operator is the usual convention in both typesetting and other programming languages. Neither R nor BUGS allows breaks before an operator because they allow newlines to signal the end of an expression or statement.