```{r, include = FALSE}
knitr::opts_chunk$set(dev.args = list(pointsize = 18), warning = FALSE)
print_blocks <- function(stan_file, blocks = c()) {
lines <- readLines(stan_file)
if(is.null(blocks)) {
cat(lines, sep = "\n")
} else {
starts <- grep("^[[:alpha:]].*\\{", lines)
ends <- sapply(starts, function(x) grep("^}", lines[-(1:x)])[1] + x)
abbrev <- ! sapply(lines[starts], function(x) x %in% paste(blocks, "{"))
to_print <- list()
for(i in 1:length(starts)) {
if(abbrev[i]) {
to_print[[i]] <- paste(lines[starts[i]], "... }")
} else {
to_print[[i]] <- lines[starts[i]:ends[i]]
}
}
cat(unlist(to_print), sep = "\n")
}
}
```
## Outline
1. IRT with **edstan**
2. Constructing basic IRT models with **Stan**
3. Extending IRT models with **Stan**
- It can contain markdown
- like this list
## Part 1
1. **IRT with edstan**
+ Overview of **edstan**
+ Example analysis with the Rasch model
+ Fitting other **edstan** models
2. Constructing basic IRT models with **Stan**
3. Extending IRT models with **Stan**
## The purpose of **edstan**
The purpose is lower the start up costs of doing IRT with **Stan**.
**edstan** assists by providing
- *.stan* files for several common models
- **R** functions to help fit and interpret the models
## Installing and loading **edstan**
**rstan** should be installed first. Then **edstan** can be installed from Github.
```{r, eval=FALSE}
install.packages("devtools")
devtools::install_github("danielcfurr/edstan")
```
Then both should be loaded.
```{r, message=FALSE, warning=FALSE}
# Load rstan and edstan
library(rstan)
library(edstan)
rstan_options(auto_write = TRUE) # Optional
options(mc.cores = parallel::detectCores()) # Optional
```
## **edstan** models
| Model | **Stan** file |
|--------------------------------------------------------------------------------------------------|-------------------------|
| [Rasch](http://mc-stan.org/documentation/case-studies/rasch_latent_reg.html) | *rasch_latent_reg.stan* |
| [Partial credit](http://mc-stan.org/documentation/case-studies/pcm_latent_reg.html) | *pcm_latent_reg.stan* |
| [Rating scale](http://mc-stan.org/documentation/case-studies/2pl_latent_reg.html) | *rsm_latent_reg.stan* |
| [Two-parameter logistic](http://mc-stan.org/documentation/case-studies/rsm_latent_reg.html) | *2pl_latent_reg.stan* |
| [Generalized partial credit](http://mc-stan.org/documentation/case-studies/gpcm_latent_reg.html) | *gpcm_latent_reg.stan* |
| [Generalized rating scale](http://mc-stan.org/documentation/case-studies/grsm_latent_reg.html) | *grsm_latent_reg.stan* |
- These *.stan* files are packaged with edstan.
- Each incorporates an optional latent regression.
- Each is documented as a case study on the Stan website (follow links).
## **edstan** functions (1/5)
| Function | Purpose |
|----------------------|----------------------------------------|
| `irt_data()` | Prepares data for fitting |
| `labelled_integer()` | Create vector of consecutive integers |
| `irt_stan()` | Wrapper for running MCMC |
| `print_irt_stan()` | Show table of output |
## **edstan** functions (2/5)
`irt_data()` returns a data list in the correct format for the **edstan** models.
For wide-form data, it is used as follows:
`irt_data(response_matrix, W)`
- `response_matrix` is a matrix of scored responses.
- `W` (optional) is a matrix of person covariates.
For long-form data, it is used as follows:
`irt_data(y, ii, jj, W)`
- `y` is the scored response vector.
- `ii` is an index for items.
- `jj` is an index for persons.
- `response_matrix` and `W` both contain one row per person.
- `W` should include an intercept term.
- `NA` is permitted in `response_matrix`
- `y`, `ii`, and `jj` are each $N$ elements long, minus missing responses
- `NA` is permitted in `y`, `ii`, and `jj`, but elements in vectors may be omitted
## **edstan** functions (3/5)
`labelled_integer()` returns a vector of consecutive integers. It helps in creating indices like `ii` and `jj`.
It is used as follows:
`labelled_integer(x)`
- `x` is a vector of identifiers.
`x` can be a numeric, character, or factor variable.
The elements of the returned vector are labelled with their original values.
- Useful because ID variables in **Stan**/**edstan** must be consecutive integers.
- Because the original values are stored as labels, we can easily see the link between the original and modified values.
## **edstan** functions (4/5)
`irt_stan()` is a wrapper for `stan()` and returns a `stanfit` object.
It is used as follows:
`irt_stan(data_list, model, ...)`
- `data_list` is the result of `irt_data()`.
- `model` (optional) is the file name for one of the edstan models.
- `...` allows arguments to be passed to `stan()`, such as:
- `chains`: the number of MCMC chains.
- `iter`: the number of iterations per chain.
- Function will choose a model if not provided.
- Mostly helps by finding the location of the installed **edstan** files.
## **edstan** functions (5/5)
`print_irt_stan()` provides summaries of parameter posteriors. It is an alternative to `print()` for `stanfit` objects.
It is used as follows:
`print_irt_stan(fit, data_list, probs)`
- `fit` is the fitted model from `irt_stan()`.
- `data_list` is the formatted data list from `irt_stan()`.
- `probs` (optional) are the quantiles to include for the posteriors.
- A replacement for the **rstan** `print()` method suitable for the **edstan** models.
- Organizes parameter summaries by item.
## Example dataset: Verbal agression (1/3)
24 items, for example:
- "A bus fails to stop for me. I would want to curse."
- "I miss a train because a clerk gave me faulty information. I would scold."
- "The grocery store closes just as I am about to enter. I would shout."
Polytomous responses:
- 2 = "Yes"
- 1 = "Perhaps"
- 0 = "No"
## Example dataset: Verbal agression (2/3)
24 items, for example:
- "A bus fails to stop for me. I would want to curse."
- "I miss a train because a clerk gave me faulty information. I would scold."
- "The grocery store closes just as I am about to enter. I would shout."
Uses a $3 \times 2 \times 2$ item design:
- Behavior: *curse* versus *scold* versus *shout*.
- Mode: *do* versus *want*.
- Blame: *other* versus *self*.
## Example dataset: Verbal agression (3/3)
- $I = 24$ items and $J = 316$ persons.
- Original and dichotomized responses (`poly`, `dich`).
- Two person covariates (`anger`, `male`).
- Four item covariates (`do`, `other`, `scold`, `shout`).
```{r}
# View first few rows
head(aggression)
```
## Latent regression Rasch model (1/5)
$$
\Pr(y_{ij} = 1 | \theta_j, \beta_i) =
\mathrm{logit}^{-1} [ \theta_j - \beta_i ]
$$
$$
\theta_j | w_j \sim \mathrm{N}(w_j' \lambda, \sigma^2)
$$
- $y_{ij} = 1$ if person $j$ responded to item $i$ correctly.
- $w_{j}$ is the vector of covariates for person $j$.
- $\beta_i$ is the difficulty for item $i$.
- $\theta_j$ is the ability for person $j$.
- $\lambda$ is the vector of latent regression coefficients.
- $\sigma^2$ is the variance for the ability distribution.
## Latent regression Rasch model (2/5)
A latent regression may be included (or not) with any of the **edstan** models. To do this, create a matrix of covariates and pass it to `irt_data()`.
```{r}
# Assemble matrix of person covariates
covars <- aggression[, c("person", "anger", "male")]
covars <- unique(covars)
covars <- covars[order(covars$person), ]
W <- cbind(intercept = 1, covars[, -1])
head(W)
```
Summary of R code:
1. Select ID variable and person-related covariates.
2. Reduce to one row per person (because dataset is long-form).
3. Sort on person ID (ID = row number).
4. Remove ID variable and add intercept.
## Latent regression Rasch model (3/5)
The data list is assembled and then the model is fit.
```{r fit_rasch, cache=TRUE, result="hide", message = FALSE}
# Make the data list
data_dich <- irt_data(y = aggression$dich,
ii = labelled_integer(aggression$description),
jj = aggression$person,
W = W)
# Fit the latent regression Rasch model
fit_rasch <- irt_stan(data_dich, model = "rasch_latent_reg.stan",
iter = 200, chains = 4)
```
## Latent regression Rasch model (4/5)
```{r}
# View summary of parameter posteriors
print_irt_stan(fit_rasch, data_dich, probs = c(.025, .975))
```
----
...
```{r, echo=FALSE}
out <- capture.output(
print_irt_stan(fit_rasch, data_dich, probs = c(.025, .975))
)
cat(out[84:103], sep = "\n")
```
## Latent regression Rasch model (5/5)
Check all parameters for convergence.
```{r, message=FALSE, warning=FALSE, fig.height=4}
# Check convergence (rhat)
stan_rhat(fit_rasch)
```
## Two-parameter logistic model
The 2PL may be fit by requesting a different **Stan** model.
```{r, eval=FALSE}
# Fit the latent regression 2PL model
fit_2pl <- irt_stan(data_dich, model = "2pl_latent_reg.stan",
iter = 200, chains = 4)
```
The latent regression may be omitted by excluding `W` when using `irt_data()`.
```{r, eval=FALSE}
# Make the data list without poviding W
data_noreg <- irt_data(y = aggression$dich,
ii = labelled_integer(aggression$description),
jj = aggression$person)
# Fit the 2PL without latent regression
fit_noreg <- irt_stan(data_noreg, model = "2pl_latent_reg.stan",
iter = 200, chains = 4)
```
## Polytomous models (1/2)
To fit the polytomous models, assemble the data list using the original responses.
```{r, eval=FALSE}
data_poly <- irt_data(y = aggression$poly,
ii = labelled_integer(aggression$description),
jj = aggression$person,
W = W)
```
## Polytomous models (2/2)
Then pass the new data list to `irt_stan()` and specify the desired model.
```{r, eval=FALSE}
fit_rsm <- irt_stan(data_poly, model = "rsm_latent_reg.stan",
iter = 300, chains = 4)
fit_pcm <- irt_stan(data_poly, model = "pcm_latent_reg.stan",
iter = 300, chains = 4)
fit_gpcm <- irt_stan(data_poly, model = "gpcm_latent_reg.stan",
iter = 300, chains = 4)
fit_grsm <- irt_stan(data_poly, model = "grsm_latent_reg.stan",
iter = 300, chains = 4)
```
## Part 2
1. IRT with **edstan**
2. **Constructing basic IRT models with Stan**
- Simple Rasch model
- Latent regression Rasch model
- Latent regression 2PL model
3. Extending IRT models with **Stan**
## Simple Rasch model (1/3)
$$
\Pr(y_{ij} = 1 | \theta_j, \beta_i) =
\mathrm{logit}^{-1} [ \theta_j - \beta_i ]
$$
$$
\theta_j \sim \mathrm{N}(0, \sigma^2)
$$
Priors:
$$\beta_i \sim \mathrm{N}(0, 25)$$
$$\sigma \sim \mathrm{Exp}(.1)$$
## Simple Rasch model (2/3)
```{r, echo=FALSE, comment=""}
print_blocks("rasch_priors.stan", c("data"))
```
## Simple Rasch model (3/3)
```{r, echo=FALSE, comment=""}
print_blocks("rasch_priors.stan", c("parameters", "model"))
```
## Adding latent regression (1/3)
$$
\Pr(y_{ij} = 1 | \theta_j, \beta_i) =
\mathrm{logit}^{-1} [ \theta_j - \beta_i ]
$$
$$
\theta_j | w_j \sim \mathrm{N}(w_j' \lambda, \sigma^2)
$$
Priors:
$$\beta_1 \cdots \beta_{(I-1)} \sim \mathrm{N}(0, 25)$$
$$\sigma \sim \mathrm{Exp}(.1)$$
$$\lambda \sim \mathrm{unif}(-\infty, \infty)$$
Constraint:
$$\beta_I = -\sum_{i=1}^{(I-1)} \beta_i$$
- This is the model used in edstan.
- $W$ can include only an intercept, so is a generalization of the Rasch model.
- Item difficulties are constrained to have a mean of zero.
- Priors on $\lambda$ are uniform because the range for regression coefficients will vary greatly between applications.
## Adding latent regression (2/3)
**edstan** file: *rasch_latent_reg.stan*
```{r, echo=FALSE, comment=""}
rasch_file <- file.path(system.file("extdata", package = "edstan"),
"rasch_latent_reg.stan")
print_blocks(rasch_file, c("data"))
```
- Data block now includes `W` and `K`, previously ignored.
## Adding latent regression (3/3)
```{r, echo=FALSE, comment=""}
print_blocks(rasch_file, c("parameters", "transformed parameters", "model"))
```
- `beta_free` is made up of the unconstrained betas.
- `beta` is `beta_free` with the last, constrained beta tacked on the end.
- Priors are placed on `beta_free`, while the likelihood is calculated on `beta`.
## 2PL with latent regression (1/3)
$$
\Pr(y_{ij} = 1 | \theta_j, \alpha_i, \beta_i) =
\mathrm{logit}^{-1} [ \alpha_i (\theta_j + w_j'\lambda - \beta_i) ]
$$
$$
\theta_j \sim \mathrm{N}(0, 1)
$$
Priors:
$$\beta_1 \cdots \beta_{(I-1)} \sim \mathrm{N}(0, 25)$$
$$\alpha_i \sim \mathrm{log~N}(1, 1)$$
$$\lambda \sim \mathrm{unif}(-\infty, \infty)$$
Constraint:
$$\beta_I = -\sum_{i=1}^{(I-1)} \beta_i$$
- Uses the non-centered characterization for ability, which I have found to work better with 2PL.
- Otherwise, we are just adding in $\alpha_i$ and its prior.
## 2PL with latent regression (2/3)
**edstan** file: *2pl_latent_reg.stan*
```{r, echo=FALSE, comment=""}
twopl_file <- file.path(system.file("extdata", package = "edstan"),
"2PL_latent_reg.stan")
print_blocks(twopl_file, c("data"))
```
- Data block is the same as for the previous model.
## 2PL with latent regression (3/3)
```{r, echo=FALSE, comment=""}
print_blocks(twopl_file, c("parameters", "transformed parameters", "model"))
```
- Lower bound for `alpha` is set to zero.
- `beta` and `beta_free` are set up the same way as before.
- `mu` is created in the model block to allow for indexing `mu[jj]`.
- `.*` in the likelihood performs element-wise rather than vector multiplication.
## Fit the models directly with **rstan**
The Rasch and 2PL models may be fit with `stan()` rather than `irt_stan()`, yielding equivalent results.
```{r, eval = FALSE}
# Assuming the .stan files are in the working directory:
# Fit Rasch latent reg model
rasch_fit <- stan("rasch_latent_reg.stan", data = data_dich,
iter = 200, chains = 4)
# Fit 2PL latent reg model
twopl_fit <- stan("2pl_latent_reg.stan", data = data_dich,
iter = 200, chains = 4)
```
## 2PL tutorial
See also an in-depth
[tutorial](http://mc-stan.org/documentation/case-studies/tutorial_twopl.html)
for the 2PL model.
## Part 3
1. IRT with **edstan**
2. Constructing basic IRT models with **Stan**
3. **Extending IRT models with Stan**
- Multilevel Rasch model
- Random item Rasch model
- Linear logistic test model with error
## Multilevel Rasch model (1/3)
$$
\Pr(y_{ij} = 1 | \xi_s, \zeta_{sj}, \beta_i) =
\mathrm{logit}^{-1} [ \xi_s + \zeta_{sj} - \beta_i ]
$$
$$
\zeta_{sj} \sim \mathrm{N}(0, \sigma_1^2)
$$
$$
\xi_s \sim \mathrm{N}(0, \sigma_2^2)
$$
- $\xi_s$ is the school-level ability for school $s$.
- $\zeta_{sj}$ is the person-level ability for person $j$ in school $s$.
## Multilevel Rasch model (2/3)
```{r, echo=FALSE, comment=""}
print_blocks("multilevel_rasch_simple.stan", c("data"))
```
## Multilevel Rasch model (3/3)
```{r, echo=FALSE, comment=""}
print_blocks("multilevel_rasch_simple.stan", c("parameters", "model"))
```
## Random item Rasch model (1/4)
$$
\Pr(y_{ij} = 1 | \theta_j, \beta_i) =
\mathrm{logit}^{-1} [ \theta_{j} - \beta_i ]
$$
$$
\theta_j \sim \mathrm{N}(\mu, \sigma^2)
$$
$$
\beta_i \sim \mathrm{N}(0, \tau^2)
$$
## Random item Rasch model (2/4)
```{r, echo=FALSE, comment=""}
print_blocks("random_item.stan", c("data"))
```
## Random item Rasch model (3/4)
```{r, echo=FALSE, comment=""}
print_blocks("random_item.stan", c("parameters", "model"))
```
## Random item Rasch model (4/4)
See also a
[case study](http://mc-stan.org/documentation/case-studies/hierarchical_2pl.html)
for a 2PL random item model that includes a correlation between the discrimination and difficulty parameters.
## LLTM-E (1/9)
Linear logistic test model with error:
$$
\Pr(y_{ij} = 1 | \theta_j, \delta_i) =
\mathrm{logit}^{-1} [ \theta_{j} - \delta_i ]
$$
$$
\delta_i \equiv x_i'\beta + \epsilon_i
$$
$$
\theta_j \sim \mathrm{N}(0, \sigma^2)
$$
$$
\epsilon_i \sim \mathrm{N}(0, \tau^2)
$$
- $x_i$ is a row from item covariate matrix $X$.
- $\beta$ is a vector of regression coefficients.
- $\epsilon_i$ is the residual item difficulty.
- $\delta_i$ is the composite item difficulty.
## LLTM-E (2/9)
**Stan** file: *lltme.stan*
```{r, echo=FALSE, comment=""}
print_blocks("lltme.stan", c("data"))
```
## LLTM-E (3/9)
```{r, echo=FALSE, comment=""}
print_blocks("lltme.stan",
c("parameters", "transformed parameters", "model"))
```
## LLTM-E (4/9)
```{r, echo=FALSE, comment=""}
print_blocks("lltme.stan", c("generated quantities"))
```
## LLTM-E (5/9)
Let's revisit the verbal aggression data.
```{r}
# View first few rows
head(aggression)
```
## LLTM-E (6/9)
Make $X$, the matrix of item covariates.
```{r}
# Assemble a matrix of item covariates
item_covars <- aggression[, c("item", "do", "other", "scold", "shout")]
item_covars <- unique(item_covars)
item_covars <- item_covars[order(item_covars$item), ]
X <- cbind(intercept = 1, item_covars[, -1])
head(X)
```
## LLTM-E (7/9)
Make the full data list and fit the model.
```{r lltme_fit, cache=TRUE, result="hide", message = FALSE}
# Make the data list
data_lltme <- list(I = max(aggression$item),
J = max(aggression$person),
N = nrow(aggression),
ii = aggression$item,
jj = aggression$person,
y = aggression$dich,
K = ncol(X),
X = X)
# Fit the LLTM-E with Stan
fit_lltme <- stan("lltme.stan", data = data_lltme,
iter = 200, chains = 4)
```
## LLTM-E (8/9)
```{r lltm_converge, cache=TRUE, message=FALSE, warning=FALSE}
# Check convergence (rhat)
stan_rhat(fit_lltme)
```
## LLTM-E (9/9)
```{r lltm_print, cache=TRUE, message=FALSE, warning=FALSE}
# View a summary of parameter posteriors
print(fit_lltme, pars = c("beta", "tau", "rsq", "sigma"))
```