1.5 Logistic and probit regression
For binary outcomes, either of the closely related logistic or probit regression models may be used. These generalized linear models vary only in the link function they use to map linear predictions in \((-\infty,\infty)\) to probability values in \((0,1)\). Their respective link functions, the logistic function and the standard normal cumulative distribution function, are both sigmoid functions (i.e., they are both S-shaped).
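In both cases, the probability of a success can be written in the same generalized linear form, with only the inverse link function changing (a schematic summary using the notation of the single-predictor model below): \[ \Pr[y_n = 1] = g^{-1}(\alpha + \beta x_n), \] where \(g^{-1}\) is the inverse logit function for logistic regression and \(\Phi\) for probit regression.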
A logistic regression model with one predictor and an intercept is coded as follows.
data {
  int<lower=0> N;
  vector[N] x;
  array[N] int<lower=0, upper=1> y;
}
parameters {
  real alpha;
  real beta;
}
model {
  y ~ bernoulli_logit(alpha + beta * x);
}
The noise parameter is built into the Bernoulli formulation here rather than specified directly.
Logistic regression is a kind of generalized linear model with binary outcomes and the log odds (logit) link function, defined by \[ \operatorname{logit}(v) = \log \left( \frac{v}{1-v} \right). \]
The inverse of the link function appears in the model: \[ \operatorname{logit}^{-1}(u) = \texttt{inv\_logit}(u) = \frac{1}{1 + \exp(-u)}. \]
The model formulation above uses the logit-parameterized version of the Bernoulli distribution, which is defined by \[ \texttt{bernoulli\_logit}\left(y \mid \alpha \right) = \texttt{bernoulli}\left(y \mid \operatorname{logit}^{-1}(\alpha)\right). \]
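Written out explicitly for \(y \in \{0, 1\}\), this is just the usual Bernoulli probability mass function evaluated at the inverse-logit-transformed argument: \[ \texttt{bernoulli\_logit}\left(y \mid \alpha\right) = \operatorname{logit}^{-1}(\alpha)^{y} \left(1 - \operatorname{logit}^{-1}(\alpha)\right)^{1-y}. \]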
The formulation is also vectorized in the sense that alpha and beta are scalars and x is a vector, so that alpha + beta * x is a vector. The vectorized formulation is equivalent to the less efficient version
for (n in 1:N) {
  y[n] ~ bernoulli_logit(alpha + beta * x[n]);
}
Expanding out the Bernoulli logit, the model is equivalent to the more explicit, but less efficient and less arithmetically stable version
for (n in 1:N) {
  y[n] ~ bernoulli(inv_logit(alpha + beta * x[n]));
}
Other link functions may be used in the same way. For example, probit regression uses the standard normal cumulative distribution function, which is typically written as
\[ \Phi(x) = \int_{-\infty}^x \textsf{normal}\left(y \mid 0,1 \right) \,\textrm{d}y. \]
The standard normal cumulative distribution function \(\Phi\) is implemented in Stan as the function Phi. The probit regression model may be coded in Stan by replacing the logistic model's sampling statement with the following.
y[n] ~ bernoulli(Phi(alpha + beta * x[n]));
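Assembled into a complete program (a minimal sketch that reuses the data and parameter declarations from the logistic model above), the probit model may be written as follows.
data {
  int<lower=0> N;                      // number of observations
  vector[N] x;                         // predictor
  array[N] int<lower=0, upper=1> y;    // binary outcomes
}
parameters {
  real alpha;
  real beta;
}
model {
  // probit link: Pr[y[n] = 1] = Phi(alpha + beta * x[n])
  for (n in 1:N) {
    y[n] ~ bernoulli(Phi(alpha + beta * x[n]));
  }
}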
A fast approximation to the standard normal cumulative distribution function \(\Phi\) is implemented in Stan as the function Phi_approx. The approximate probit regression model may be coded with the following.
y[n] ~ bernoulli(Phi_approx(alpha + beta * x[n]));
The Phi_approx function is a rescaled version of the inverse logit function, so while the scale roughly matches that of \(\Phi\), the tails do not match.
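A sketch of the relationship, using the standard cubic logistic approximation to \(\Phi\) on which Phi_approx is based: \[ \texttt{Phi\_approx}(x) \approx \operatorname{logit}^{-1}\left(0.07056\,x^3 + 1.5976\,x\right). \]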