28 Hidden Markov Models
An elementary first-order hidden Markov model (HMM) is a probabilistic model over \(N\) observations, \(y_n\), and \(N\) hidden states, \(x_n\), which is fully defined by the conditional distributions \(p(y_n \mid x_n, \phi)\) and \(p(x_n \mid x_{n - 1}, \phi)\). Here we make the dependency on additional model parameters, \(\phi\), explicit. When \(x\) is continuous, the user can encode these distributions directly in Stan and use Markov chain Monte Carlo to integrate \(x\) out.
When each state \(x_n\) takes a value in a discrete and finite set, say \(\{1, 2, \ldots, K\}\), we can take advantage of the dependency structure to marginalize \(x\) out and compute \(p(y \mid \phi)\). We start by defining the conditional observational distribution, stored in a \(K \times N\) matrix \(\omega\) with \[ \omega_{kn} = p(y_n \mid x_n = k, \phi). \] Next, we introduce the \(K \times K\) transition matrix, \(\Gamma\), with \[ \Gamma_{ij} = p(x_n = j \mid x_{n - 1} = i, \phi). \] Each row of \(\Gamma\) defines a probability distribution and must therefore be a simplex (i.e., its components must be nonnegative and add to 1). Currently, Stan only supports stationary transitions, where a single transition matrix is used for all transitions. Finally, we define the initial state \(K\)-vector \(\rho\), with \[ \rho_k = p(x_0 = k \mid \phi). \]
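To make the marginalization concrete, the following is a minimal Python sketch of the forward algorithm in log space, written with the standard library only. It is an illustration of the technique, not Stan's implementation. The function names `safe_log`, `logsumexp`, and `hmm_log_marginal` are hypothetical, and the sketch assumes that \(\rho\) is the distribution of the hidden state paired with the first column of \(\omega\); if your indexing places the initial state one step before the first observation, one extra transition has to be applied first.

```python
import math


def safe_log(p):
    """Return log(p), mapping p == 0 to -inf instead of raising."""
    return math.log(p) if p > 0.0 else -math.inf


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def hmm_log_marginal(log_omega, Gamma, rho):
    """Return log p(y | phi) with the hidden states summed out.

    log_omega: K x N nested list, log_omega[k][n] = log p(y_n | x_n = k, phi)
    Gamma:     K x K nested list, Gamma[i][j] = p(x_n = j | x_{n-1} = i, phi)
    rho:       length-K list; ASSUMED to be the distribution of the hidden
               state paired with the first column of log_omega.
    """
    K = len(rho)
    N = len(log_omega[0])
    # Forward variable for the first observation: log(rho_k * omega_{k,1}).
    log_alpha = [safe_log(rho[k]) + log_omega[k][0] for k in range(K)]
    # Propagate through the remaining observations, one transition each.
    for n in range(1, N):
        log_alpha = [
            logsumexp([log_alpha[i] + safe_log(Gamma[i][j]) for i in range(K)])
            + log_omega[j][n]
            for j in range(K)
        ]
    # Summing the final forward variable over states gives p(y | phi).
    return logsumexp(log_alpha)
```

The recursion costs \(O(N K^2)\) operations, whereas summing over every hidden path directly would cost \(O(K^N)\). Working with \(\log \omega\) rather than \(\omega\) avoids underflow when \(N\) is large.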
The Stan functions that support this type of model are special in that the user does not explicitly pass \(y\) and \(\phi\) as arguments. Instead, the user passes \(\log \omega\), \(\Gamma\), and \(\rho\), which in turn depend on \(y\) and \(\phi\).
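As an illustration of how the three inputs depend on \(y\) and \(\phi\), here is a small Python sketch that assembles \(\log \omega\), \(\Gamma\), and \(\rho\) for a hypothetical two-state model with Gaussian emissions. The emission family, the helper names `normal_lpdf` and `build_log_omega`, and all numeric values are assumptions made for the example; in a Stan program the same quantities would be built in the model itself.

```python
import math


def normal_lpdf(y, mu, sigma):
    """Log density of Normal(mu, sigma) evaluated at y."""
    z = (y - mu) / sigma
    return -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * z * z


def build_log_omega(y, mu, sigma):
    """Return a K x N nested list with [k][n] = log p(y_n | x_n = k, phi).

    Assumes state k emits Normal(mu[k], sigma); this choice is illustrative.
    """
    return [[normal_lpdf(y_n, mu[k], sigma) for y_n in y] for k in range(len(mu))]


# Hypothetical data y and parameters phi = (mu, sigma, Gamma, rho).
y = [0.1, -0.2, 3.1]
mu = [0.0, 3.0]
sigma = 1.0

log_omega = build_log_omega(y, mu, sigma)  # K x N, depends on y and phi
Gamma = [[0.9, 0.1], [0.2, 0.8]]           # K x K, each row is a simplex
rho = [0.5, 0.5]                           # length K, a simplex
```

Note that \(y\) enters only through `log_omega`, while `Gamma` and `rho` depend only on the parameters.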