7.12 Reject Statements
The Stan reject statement provides a mechanism to report errors or problematic values encountered during program execution and either halt processing or reject iterations.
Like the print statement, the reject statement accepts any number of quoted string literals or Stan expressions as arguments.
Reject statements are typically embedded in a conditional statement in order to detect variables in illegal states. For example, the following code handles the case where the value of a variable x is negative.
if (x < 0) {
  reject("x must not be negative; found x=", x);
}
Behavior of Reject Statements
Reject statements have the same behavior as exceptions thrown by built-in Stan functions. For example, the normal_lpdf function raises an exception if the input scale is not positive and finite. The effect of a reject statement depends on the program block in which the rejection occurs.
In all cases of rejection, the interface accessing the Stan program should print the arguments to the reject statement.
Rejections in Functions
Rejections in user-defined functions are just passed to the calling function or program block. Reject statements can be used in functions to validate the function arguments, allowing user-defined functions to fully emulate built-in function behavior. It is better to find out earlier rather than later when there is a problem.
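As a minimal sketch of argument validation, the function below rejects non-positive inputs before computing a result. The function name log_ratio and its arguments are hypothetical and chosen only for illustration.

```stan
functions {
  // Hypothetical helper (not a built-in): log(x / y) for positive x and y.
  real log_ratio(real x, real y) {
    // Validate the arguments up front, as a built-in function would.
    if (x <= 0 || y <= 0) {
      reject("log_ratio: arguments must be positive; found x=", x,
             ", y=", y);
    }
    return log(x) - log(y);
  }
}
```

A rejection raised inside log_ratio is passed to whichever block calls it, so its effect follows the rules for that block.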
Fatal Exception Contexts
In both the transformed data block and generated quantities block, rejections are fatal. This is because if initialization fails or if generating output fails, there is no way to recover values.
Reject statements placed in the transformed data block can be used to validate both the data and transformed data (if any). This allows more complicated constraints to be enforced than can be specified with Stan’s constrained variable declarations.
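The following sketch shows one way such a check might look. It assumes hypothetical data variables K, counts, and total, and uses the array declaration syntax of recent Stan versions. The check ties several data variables together, which is awkward to express in a declaration.

```stan
data {
  int<lower=0> K;
  array[K] int<lower=0> counts;  // hypothetical per-category counts
  int<lower=0> total;            // hypothetical declared total
}
transformed data {
  // Cross-variable consistency check; a failure here is fatal.
  if (sum(counts) != total) {
    reject("counts must sum to total; found sum(counts)=", sum(counts),
           ", total=", total);
  }
}
```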
Recoverable Rejection Contexts
Rejections in the transformed parameters and model blocks are not in and of themselves instantly fatal. The result has the same effect as assigning a \(-\infty\) log probability, which causes rejection of the current proposal in MCMC samplers and adjustment of search parameters in optimization.
If the log probability function results in a rejection every time it is called, the containing application (MCMC sampler or optimization) should diagnose this problem and terminate with an appropriate error message. To aid in diagnosing problems, the message for each reject statement will be printed as a result of executing it.
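As an illustration of a recoverable rejection, the sketch below guards against numerical overflow in a transformed parameter. The variable names sigma and tau and the lognormal prior are assumptions made for this example only.

```stan
parameters {
  real<lower=0> sigma;
}
transformed parameters {
  real tau = inv_square(sigma);  // 1 / sigma^2 can overflow for tiny sigma
  // Report a numerical failure; the current proposal is then rejected.
  if (is_inf(tau) || is_nan(tau)) {
    reject("tau must be finite; found tau=", tau, " for sigma=", sigma);
  }
}
model {
  sigma ~ lognormal(0, 1);
}
```

Here the reject statement reports an error condition rather than defining the support of sigma, which is still declared through its lower bound.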
Rejection is not for Constraints
Rejection should be used for error handling, not for defining arbitrary constraints. Consider the following erroneous Stan program.
parameters {
  real a;
  real<lower=a> b;
  real<lower=a, upper=b> theta;
  ...
}
model {
  // **wrong** needs explicit truncation
  theta ~ normal(0, 1);
  ...
}
This program is wrong because the bounds on theta depend on parameters, so the truncation must be accounted for explicitly in the distribution statement. The correct statement is as follows.
theta ~ normal(0, 1) T[a, b];
The conceptual issue is that the prior does not integrate to one over the admissible parameter space. It integrates to one over all real numbers, but to something less than one over \([a, b]\). In simple univariate cases such as this one, the T[ , ] notation corrects for this by dividing by whatever the prior integrates to over \([a, b]\).
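To make the normalization concrete for the standard normal prior in this example, write \(\Phi\) for the standard normal cumulative distribution function (a symbol introduced here for illustration). The truncated density is then

\[
p(\theta \mid a, b) = \frac{\mathrm{Normal}(\theta \mid 0, 1)}{\Phi(b) - \Phi(a)}, \qquad a \le \theta \le b.
\]

Because a and b are themselves parameters, the denominator varies with them and cannot be dropped as a constant.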
This is exactly the same problem you would get by using reject statements to enforce complicated inequalities on multivariate functions. There, too, it is wrong to try to deal with truncation by rejecting values that violate the constraints.
if (theta < a || theta > b) {
  reject("theta not in (a, b)");
}
// still **wrong**, needs T[a,b]
theta ~ normal(0, 1);
In this case, the prior integrates to something less than one over the region of the parameter space where the complicated inequalities are satisfied. But we don’t generally know what value the prior integrates to, so we can’t increment the log probability function to compensate.
Even if this adjustment to a proper probability model may seem minor in particular models where the amount of truncated posterior density is negligible or constant, we can’t sample from that truncated posterior efficiently. Programs need to use one-to-one mappings that guarantee the constraints are satisfied and only use reject statements to raise errors or help with debugging.
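As a minimal sketch of the one-to-one-mapping approach, suppose two hypothetical parameters x and y (not from the example above) must satisfy x > 0, y > 0, and x + y < 1. Declaring the upper bound of y in terms of x lets Stan's constraining transforms guarantee the inequality, so no reject statement is needed.

```stan
parameters {
  real<lower=0, upper=1> x;
  // The bound on y depends on x, so x + y < 1 holds by construction.
  real<lower=0, upper=1 - x> y;
}
model {
  // Priors and likelihood for the hypothetical model would go here.
}
```

Every point the sampler visits then satisfies the constraint, rather than being discarded after the fact.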