# Proportionality Constants

When evaluating a likelihood or prior as part of the log density computation in MCMC, variational inference, or optimization, it is usually only necessary to compute the functions up to a proportionality constant (or, equivalently, to compute log densities up to an additive constant). In MCMC this follows from the fact that the distribution being sampled does not need to be normalized, so the normalization constant can be ignored. Similarly, the distribution does not need to be normalized to perform variational inference or optimization. The advantage of working with unnormalized distributions is that they can be considerably cheaper to compute.

There are three different syntaxes to build the model in Stan. Which one to use depends on whether the proportionality constants are needed. If performance is not a problem, it is always safe to use the normalized densities.

The distribution statement (`~`) and log density increment statement (`target +=`) with `_lupdf()` use unnormalized densities for \(x\) (dropping proportionality constants):

```
x ~ normal(0, 1);
target += normal_lupdf(x | 0, 1); // the 'u' is for unnormalized
```

The log density increment statement (`target +=`) with `_lpdf()` uses the full normalized density for \(x\) (dropping no constants):

`target += normal_lpdf(x | 0, 1);`

For discrete distributions, the `target +=` syntax uses `_lupmf` and `_lpmf` instead:

```
y ~ bernoulli(0.5);
target += bernoulli_lupmf(y | 0.5);
target += bernoulli_lpmf(y | 0.5);
```

## Dropping Proportionality Constants

If a density \(p(\theta)\) can be factored into \(K g(\theta)\), where \(K\) collects all the factors that are not a function of \(\theta\) and \(g(\theta)\) collects all the factors that are a function of \(\theta\), then \(g(\theta)\) is said to be proportional to \(p(\theta)\) up to a constant.
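As a worked illustration, take the normal density used later in this section, with \(x\) in the role of \(\theta\) and with \(\mu\) and \(\sigma\) treated as fixed. Under that assumption the factorization can be written as:

\[ p(x) = \underbrace{\frac{1}{\sigma \sqrt{2 \pi}}}_{K} \, \underbrace{\exp \left( -\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2 \right)}_{g(x)}. \]

Taking logs turns the multiplicative constant \(K\) into the additive constant \(\log K = -\log \left( \sigma \sqrt{2 \pi} \right)\), which is why log densities are said to be known up to an additive constant.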

The advantage of all this is that \(K\) is sometimes expensive to compute, and if it does not depend on the quantities being sampled (or optimized, or approximated with variational inference), there is no need to compute it because it will not affect the results.
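A minimal sketch of why the constant does not affect sampling, written in Python rather than Stan so that it can be run on its own. It uses a simple random-walk Metropolis sampler, which is an illustrative stand-in and not the algorithm Stan uses. The function names and the value of `LOG_K` are hypothetical choices for this example. Because the sampler only uses differences of log densities, adding a constant to the log density should leave the draws unchanged.

```python
import math
import random

# Hypothetical additive constant log K; any fixed value works for this illustration.
LOG_K = 3.7


def log_g(theta):
    """Unnormalized log density: only the term that depends on theta."""
    return -0.5 * theta ** 2


def log_p(theta):
    """Same log density with the constant log K added back in."""
    return LOG_K + log_g(theta)


def metropolis(log_density, n_steps=2000, step_size=1.0, seed=1234):
    """Random-walk Metropolis; only differences of log_density are used."""
    rng = random.Random(seed)
    theta = 0.0
    draws = []
    for _ in range(n_steps):
        proposal = theta + rng.gauss(0.0, step_size)
        # The constant log K cancels in this difference.
        log_accept_ratio = log_density(proposal) - log_density(theta)
        # 1 - random() lies in (0, 1], which avoids log(0).
        if math.log(1.0 - rng.random()) < log_accept_ratio:
            theta = proposal
        draws.append(theta)
    return draws


draws_with_constant = metropolis(log_p)
draws_without_constant = metropolis(log_g)
chains_match = draws_with_constant == draws_without_constant
```

With the same seed, both runs see the same proposals and the same uniform draws, so any difference between the chains could only come from the constant changing an accept or reject decision.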

Stan takes advantage of the proportionality constant fact with the `~` syntax. Take for instance the normal data model:

```
data {
  real mu;
  real<lower=0.0> sigma;
}
parameters {
  real x;
}
model {
  x ~ normal(mu, sigma);
}
```

Syntactically, this is just shorthand for the equivalent model that replaces the `~` syntax with a `target +=` statement and a `normal_lupdf` function call:

```
data {
  real mu;
  real<lower=0.0> sigma;
}
parameters {
  real x;
}
model {
  target += normal_lupdf(x | mu, sigma);
}
```

The function `normal_lupdf` is only guaranteed to return the log density of the normal distribution up to an additive constant, which corresponds to a proportionality constant on the density to be sampled. The constant itself is not defined. The full log density of the statement here is:

\[ \textsf{normal\_lpdf}(x | \mu, \sigma) = -\log \left( \sigma \sqrt{2 \pi} \right) -\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2. \]

Now because the density here is only a function of \(x\), the additive terms in the log density that are not a function of \(x\) can be dropped. In this case it is enough to know only the quadratic term:

\[ \textsf{normal\_lupdf}(x | \mu, \sigma) = -\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2. \]
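The two formulas above can be checked numerically with a short Python sketch. The Python functions `normal_lpdf` and `normal_lupdf` below are hypothetical stand-ins that simply transcribe the two equations; they are not Stan's implementations, and the values of `mu`, `sigma`, and the test points are arbitrary. The sketch computes the gap between the two log densities at several values of \(x\), which should always equal the dropped term \(-\log \left( \sigma \sqrt{2 \pi} \right)\).

```python
import math


def normal_lpdf(x, mu, sigma):
    """Transcription of the full normal log density given above."""
    return -math.log(sigma * math.sqrt(2.0 * math.pi)) - 0.5 * ((x - mu) / sigma) ** 2


def normal_lupdf(x, mu, sigma):
    """Transcription of the log density with only the quadratic term kept."""
    return -0.5 * ((x - mu) / sigma) ** 2


# Arbitrary illustrative values; mu and sigma are held fixed, as in the data block.
mu, sigma = 1.5, 2.0
dropped_constant = -math.log(sigma * math.sqrt(2.0 * math.pi))

# The gap between the two versions should not depend on x.
gaps = [
    normal_lpdf(x, mu, sigma) - normal_lupdf(x, mu, sigma)
    for x in (-3.0, 0.0, 1.5, 10.0)
]
gap_is_constant = all(math.isclose(gap, dropped_constant) for gap in gaps)
```

Because the gap is the same at every \(x\), dropping it shifts the log density without changing its shape as a function of \(x\).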

## Keeping Proportionality Constants

If the proportionality constants are needed for a normal log density, the function `normal_lpdf` can be used. If it is ever unclear whether the normalization is necessary, it is always safe to include it. Only use the `~` or `target += normal_lupdf` syntaxes if it is absolutely clear that the proportionality constants are not necessary.

## User-defined Distributions

When a custom `_lpdf` or `_lpmf` function is defined, the compiler will automatically make available a `_lupdf` or `_lupmf` version of the function. It is only possible to define custom distributions in the normalized form in Stan. Any attempt to define an unnormalized distribution directly will result in an error.

The difference in the normalized and unnormalized versions of custom probability functions is how probability functions are treated inside these functions. Any internal unnormalized probability function call will be replaced with its normalized equivalent if the normalized version of the parent custom distribution is called.

The following code demonstrates the different behaviors:

```
functions {
  real custom1_lpdf(real x) {
    return normal_lupdf(x | 0.0, 1.0);
  }
  real custom2_lpdf(real x) {
    return normal_lpdf(x | 0.0, 1.0);
  }
}
parameters {
  real mu;
}
model {
  mu ~ custom1();              // Normalization constants dropped
  target += custom1_lupdf(mu); // Normalization constants dropped
  target += custom1_lpdf(mu);  // Normalization constants kept

  mu ~ custom2();              // Normalization constants kept
  target += custom2_lupdf(mu); // Normalization constants kept
  target += custom2_lpdf(mu);  // Normalization constants kept
}
```

## Limitations on Using `_lupdf` and `_lupmf` Functions

To avoid ambiguities in how the normalization constants work, functions ending in `_lupdf` and `_lupmf` can only be used in the model block or user-defined probability functions (functions ending in `_lpdf` or `_lpmf`).