2.1 Autoregressive Models

A first-order autoregressive model (AR(1)) with normal noise takes each point \(y_n\) in a sequence \(y\) to be generated according to

\[ y_n \sim \mathsf{normal}(\alpha + \beta y_{n-1}, \sigma). \]

That is, the expected value of \(y_n\) is \(\alpha + \beta y_{n-1}\), with noise scaled as \(\sigma\).

AR(1) Models

With improper flat priors on the regression coefficients \(\alpha\) and \(\beta\) and on the positively-constrained noise scale (\(\sigma\)), the Stan program for the AR(1) model is as follows.7

data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  real alpha;
  real beta;
  real<lower=0> sigma;
}
model {
  for (n in 2:N)
    y[n] ~ normal(alpha + beta * y[n-1], sigma);
}

The first observed data point, y[1], is not modeled here because there is nothing to condition on; instead, it acts to condition y[2]. This model also uses an improper prior for sigma, but there is no obstacle to adding an informative prior if information is available on the scale of the changes in y over time, or a weakly informative prior to help guide inference if rough knowledge of the scale of y is available.

Slicing for Efficiency

Although perhaps a bit more difficult to read, a much more efficient way to write the above model is by slicing the vectors, with the model above being replaced with the one-liner

model {
  y[2:N] ~ normal(alpha + beta * y[1:(N - 1)], sigma);
}

The left-hand side slicing operation pulls out the last \(N-1\) elements and the right-hand side version pulls out the first \(N-1\).

Extensions to the AR(1) Model

Proper priors of a range of different families may be added for the regression coefficients and noise scale. The normal noise model can be changed to a Student-\(t\) distribution or any other distribution with unbounded support. The model could also be made hierarchical if multiple series of observations are available.

To enforce the estimation of a stationary AR(1) process, the slope coefficient beta may be constrained with bounds as follows.

real<lower=-1,upper=1> beta;

In practice, such a constraint is not recommended. If the data are not well fit by a stationary model it is best to know this. Stationary parameter estimates can be encouraged with a prior favoring values of beta near zero.

AR(2) Models

Extending the order of the model is also straightforward. For example, an AR(2) model could be coded with the second-order coefficient gamma and the following model statement.

for (n in 3:N)
  y[n] ~ normal(alpha + beta*y[n-1] + gamma*y[n-2], sigma);

AR(\(K\)) Models

A general model where the order is itself given as data can be coded by putting the coefficients in an array and computing the linear predictor in a loop.

data {
  int<lower=0> K;
  int<lower=0> N;
  real y[N];
}
parameters {
  real alpha;
  real beta[K];
  real sigma;
}
model {
  for (n in (K+1):N) {
    real mu = alpha;
    for (k in 1:K)
      mu += beta[k] * y[n-k];
    y[n] ~ normal(mu, sigma);
  }
}

ARCH(1) Models

Econometric and financial time-series models usually assume heteroscedasticity: hey allow the scale of the noise terms defining the series to vary over time. The simplest such model is the autoregressive conditional heteroscedasticity (ARCH) model Engle (1982). Unlike the autoregressive model AR(1), which modeled the mean of the series as varying over time but left the noise term fixed, the ARCH(1) model takes the scale of the noise terms to vary over time but leaves the mean term fixed. Models could be defined where both the mean and scale vary over time; the econometrics literature presents a wide range of time-series modeling choices.

The ARCH(1) model is typically presented as the following sequence of equations, where \(r_t\) is the observed return at time point \(t\) and \(\mu\), \(\alpha_0\), and \(\alpha_1\) are unknown regression coefficient parameters.

\[ \begin{array}{rcl} r_t & = & \mu + a_t \\[2pt] a_t & = & \sigma_t \epsilon_t \\[2pt] \epsilon_t & \sim & \mathsf{normal}(0,1) \\[2pt] \sigma^2_t & = & \alpha_0 + \alpha_1 a_{t-1}^2 \end{array} \]

In order to ensure the noise terms \(\sigma^2_t\) are positive, the scale coefficients are constrained to be positive, \(\alpha_0, \alpha_1 > 0\). To ensure stationarity of the time series, the slope is constrained to to be less than one, i.e., \(\alpha_1 < 1\).8

The ARCH(1) model may be coded directly in Stan as follows.

data {
  int<lower=0> T;   // number of time points
  real r[T];        // return at time t
}
parameters {
  real mu;                       // average return
  real<lower=0> alpha0;          // noise intercept
  real<lower=0,upper=1> alpha1;  // noise slope
}
model {
  for (t in 2:T)
    r[t] ~ normal(mu, sqrt(alpha0 + alpha1 * pow(r[t-1] - mu,2)));
}

The loop in the model is defined so that the return at time \(t=1\) is not modeled; the model in the next section shows how to model the return at \(t=1\). The model can be vectorized to be more efficient; the model in the next section provides an example.

References

Engle, Robert F. 1982. “Autoregressive Conditional Heteroscedasticity with Estimates of Variance of United Kingdom Inflation.” Econometrica 50: 987–1008.


  1. The intercept in this model is \(\alpha / (1 - \beta)\). An alternative parameterization in terms of an intercept \(\gamma\) suggested Mark Scheuerell on GitHub is \(y_n \sim \mathsf{normal}(\alpha + \gamma \cdot (y_{n-1} - \alpha), \sigma)\).

  2. In practice, it can be useful to remove the constraint to test whether a non-stationary set of coefficients provides a better fit to the data. It can also be useful to add a trend term to the model, because an unfitted trend will manifest as non-stationarity.