13.4 Measurement Error Models
Statistical models or differential equations may be used to estimate the parameters and/or initial state of a dynamic system given noisy measurements of the system state at a finite number of time points.
For instance, suppose the simple harmonic oscillator has a parameter value of \(\theta = 0.15\) and initial state \(y(t=0) = (1,0)\). Now suppose the system is observed at 10 time points, say \(t=1, 2, ..., 10\), where each measurement of \(y(t)\) has independent \(\textsf{normal}(0, 0.1)\) error in both dimensions (\(y_1(t)\) and \(y_2(t)\)). A plot of such measurements is shown in the simple harmonic oscillator trajectory plots.
Trajectory of the simple harmonic oscillator given parameter \(\theta=0.15\) and initial condition \(y(t=0) = (1,0)\) with additional independent \(\textsf{normal}(0,0.1)\) measurement error in both dimensions.
Simulating Noisy Measurements
The data used to make this plot is derived from the Stan model to simulate noisy observations given below.
functions {
real[] sho(real t,
real[] y,
real[] theta,
real[] x_r,
int[] x_i) {
real dydt[2];
dydt[1] = y[2];
dydt[2] = -y[1] - theta[1] * y[2];
return dydt;
}
}
data {
int<lower=1> T;
real y0[2];
real t0;
real ts[T];
real theta[1];
}
transformed data {
real x_r[0];
int x_i[0];
}
model {
}
generated quantities {
real y_hat[T,2] = integrate_ode_rk45(sho, y0, t0, ts, theta, x_r, x_i);
// add measurement error
for (t in 1:T) {
y_hat[t, 1] += normal_rng(0, 0.1);
y_hat[t, 2] += normal_rng(0, 0.1);
}
}
The system of differential equations is coded as a function. The system parameters theta
and initial state y0
are read in as data along with the initial time t0
and observation times ts
. The generated quantities block is used to solve the ODE for the specified times and then add random measurement error, producing observations y_hat
. Because the system is not stiff, the rk45
solver is used. Note that when the system is linear, it is a good idea to try using the matrix exponential function, as the method can, depending on the system at hand, yield more efficient results.
This program illustrates the way in which the ODE solver is called in a Stan program,
y_hat = integrate_ode_rk45(sho, y0, t0, ts, theta, x_r, x_i);
This assigns the solutions to the system defined by function sho
, given initial state y0
, initial time t0
, requested solution times ts
, parameters theta
, real data x
, and integer data x_int
. The call explicitly specifies the Runge-Kutta solver (for non-stiff systems).
Here, the ODE solver is called in the generated quantities block to provide a \(10 \times 2\) array of solutions y_hat
to which measurement error is added using the normal pseudo-random number generating function normal_rng
. The number of rows in the solution array is the same as the size of ts
, the requested solution times.
Data versus Parameters
Unlike other functions, the integration functions for ODEs are limited as to the origins of variables in their arguments. In particular, real data x
, and integer data x_int
must be expressions that only involve data or transformed data variables. The initial state y
, the initial time t0
, the requested solution times ts
, or the parameters theta
may involve parameters.
Estimating System Parameters and Initial State
Stan provides statistical inference for unknown initial states and/or parameters. The ODE solver will be used deterministically to produce predictions, much like the linear predictor does in a generalized linear model. These states will then be observed with measurement error.
functions {
real[] sho(real t,
real[] y,
real[] theta,
real[] x_r,
int[] x_i) {
real dydt[2];
dydt[1] = y[2];
dydt[2] = -y[1] - theta[1] * y[2];
return dydt;
}
}
data {
int<lower=1> T;
real y[T,2];
real t0;
real ts[T];
}
transformed data {
real x_r[0];
int x_i[0];
}
parameters {
real y0[2];
vector<lower=0>[2] sigma;
real theta[1];
}
model {
real y_hat[T,2];
sigma ~ cauchy(0, 2.5);
theta ~ std_normal();
y0 ~ std_normal();
y_hat = integrate_ode_rk45(sho, y0, t0, ts, theta, x_r, x_i);
for (t in 1:T)
y[t] ~ normal(y_hat[t], sigma);
}
This Stan program allows estimates of unknown initial conditions y0
and system parameter theta
for the simple harmonic oscillator with independent normal measurement error.
A Stan program that can be used to estimate both the initial state and parameter value for the simple harmonic oscillator given noisy observations is given above. Compared to the program for simulation, the program to estimate parameters uses the integrate_ode
function in the model block rather than the generated quantities block. There are Cauchy priors on the measurement error scales sigma
and standard normal priors on the components of parameter array theta
and initial state parameter array y0
. The solutions to the ODE are then assigned to an array y_hat
, which is then used as the location in the observation noise model as follows.
y_hat = integrate_ode_rk45(sho, y0, t0, ts, theta, x_r, x_i);
for (t in 1:T)
y[t] ~ normal(y_hat[t], sigma);
As with other regression-like models, it’s easy to change the noise model to be robust (e.g., Student-t distributed), to be correlated in the state variables (e.g., with a multivariate normal distribution), or both (e.g., with a multivariate Student-t distribution).
In this simple model with independent noise scales of 0.10, 10 observed data points for times \(t = 1,\dotsc, 10\) is sufficient to reliably estimate the ODE parameter, initial state, and noise scales.