7.3 Increment log density
The basis of Stan’s execution is the evaluation of a log probability function (specifically, a log probability density function) for a given set of (real-valued) parameters; this function returns the log density of the posterior up to an additive constant. Data and transformed data are fixed before the log density is evaluated. The total log probability is initialized to zero. Next, any log Jacobian adjustments accrued by the variable constraints are added to the log density (the Jacobian adjustment may be skipped for optimization). Sampling and log probability increment statements in the model block may then add to the log density. A log probability increment statement directly increments the log density with the value of an expression as follows.[5]
target += -0.5 * y * y;
The keyword target here is actually not a variable, and may not be accessed as such (though see below on how to access the value of target through a special function).
In this example, the unnormalized log probability of a unit normal variable \(y\) is added to the total log probability. In the general case, the argument can be any expression.[6]
An entire Stan model can be implemented this way. For instance, the following model draws a single variable from a unit normal distribution.
parameters {
  real y;
}
model {
  target += -0.5 * y * y;
}
This model defines a log probability function
\[ \log p(y) = - \, \frac{y^2}{2} - \log Z \]
where \(Z\) is a normalizing constant that does not depend on \(y\). The constant \(Z\) is conventionally written this way because on the linear scale, \[ p(y) = \frac{1}{Z} \exp\left(-\frac{y^2}{2}\right), \] which is typically written without reference to \(Z\) as \[ p(y) \propto \exp\left(-\frac{y^2}{2}\right). \]
Stan only requires models to be defined up to a constant that does not depend on the parameters. This is convenient because often the normalizing constant \(Z\) is either time-consuming to compute or intractable to evaluate.
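For the unit normal example above, the constant happens to be available in closed form, which makes it easy to see exactly what the model omits. The following is a worked illustration using the standard Gaussian integral, not something Stan computes:
\[ Z = \int_{-\infty}^{\infty} \exp\left(-\frac{y^2}{2}\right) \, dy = \sqrt{2 \pi}, \qquad \log Z = \frac{1}{2} \log (2 \pi) \approx 0.919. \]
Adding or omitting this constant shifts \(\log p(y)\) by the same amount for every value of \(y\), so it does not change the distribution being sampled.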
Built-in distributions
The built-in distribution functions in Stan are all available in normalized and unnormalized form. The normalized forms include all of the terms in the log density, and the unnormalized forms drop terms that are not directly or indirectly a function of the model parameters.
For instance, the normal_lpdf function returns the log density of a normal distribution:
\[ \textsf{normal\_lpdf}(x | \mu, \sigma) = -\log \left( \sigma \sqrt{2 \pi} \right) -\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2 \]
The normal_lupdf function returns the log density of an unnormalized distribution. With the unnormalized version of the function, Stan does not define what the normalization constant will be, though usually as many terms as possible are dropped to make the calculation fast. For example, if sigma is a constant (such as data), the term \(-\log(\sigma \sqrt{2 \pi})\) does not depend on the parameters and can be dropped, in which case normal_lupdf would be equivalent to:
\[ \textsf{normal\_lupdf}(x | \mu, \sigma) = -\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2 \]
All functions ending in _lpdf have a corresponding _lupdf version which evaluates and returns the unnormalized density. The same is true for _lpmf and _lupmf.
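As a minimal sketch of how the two forms are used, the following program increments the log density with the unnormalized form and shows the alternatives as comments. The names N, y, mu, and sigma are assumptions chosen for illustration, and no priors are declared, so this is not a recommended model.

```stan
data {
  int<lower=0> N;        // number of observations (illustrative)
  vector[N] y;           // observed values (illustrative)
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  // Unnormalized form: terms that are constant in the parameters may be dropped.
  target += normal_lupdf(y | mu, sigma);

  // Normalized alternative: keeps every term of the log density.
  // target += normal_lpdf(y | mu, sigma);

  // Sampling-statement alternative, which also uses the unnormalized density.
  // y ~ normal(mu, sigma);
}
```

Because sigma is a parameter in this sketch, the \(-\log \sigma\) term still depends on the parameters and cannot be dropped; only the constant involving \(2 \pi\) can. All three statements define the same posterior, since they differ only by a constant.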
Relation to compound addition and assignment
Although the increment log density statement looks syntactically like compound addition and assignment (see the compound arithmetic/assignment section), it is treated as a primitive statement because target is not itself a variable. So, even though
target += lp;
is a legal statement, the corresponding long form is not legal.
// BAD, target is not a variable
target = target + lp;
Vectorization
The target += ... statement accepts an argument in place of ... for any expression type, including integers, reals, vectors, row vectors, matrices, and arrays of any dimensionality, including arrays of vectors and matrices. For container arguments, their sum will be added to the total log density.
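The summing behavior for containers can be sketched with a vector version of the unit normal model. The size N is an assumed data input introduced for illustration, and the commented loop shows the scalar form that would add the same total.

```stan
data {
  int<lower=0> N;        // number of variables (illustrative)
}
parameters {
  vector[N] y;
}
model {
  // The argument is a vector; its N elements are summed
  // and the sum is added to the total log density.
  target += -0.5 * square(y);

  // Scalar loop that adds the same total:
  // for (n in 1:N) {
  //   target += -0.5 * square(y[n]);
  // }
}
```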
Accessing the log density
To access accumulated log density up to the current execution point, the function target() may be used.
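One common use is debugging. The following sketch prints the accumulated value before and after an increment in the unit normal model; the message strings are arbitrary, and the statements print on every evaluation of the log density, so this is only practical for short diagnostic runs.

```stan
parameters {
  real y;
}
model {
  // Value accumulated before any statement in the model block runs.
  print("target before increment = ", target());

  target += -0.5 * y * y;

  // Value after adding the unnormalized unit normal log density.
  print("target after increment = ", target());
}
```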
[5] The current notation replaces two previous versions. Originally, a variable lp__ was directly exposed and manipulated; this is no longer allowed. The original statement syntax for target += u was increment_log_prob(u), but this form has been deprecated and will be removed in Stan 3.
[6] Writing this model with the expression -0.5 * y * y is more efficient than with the equivalent expression y * y / -2 because multiplication is more efficient than division; in both cases, the negation is rolled into the numeric literal (-0.5 and -2). Writing square(y) instead of y * y would be even more efficient because the derivatives can be precomputed, reducing the memory and number of operations required for automatic differentiation.