`Stan`

Case StudyThis case study uses **Stan** to fit the Dyadic Item Response Theory (dIRT) model proposed by (Gin et al. 2019) to measure interactions between pairs of individuals when the responses to items represent the actions/behaviors/perceptions of an individual (called the ‘actor’) made within the context of a dyad formed with another individual (called the ‘partner’). The dIRT model is fit using **Stan** (version 2.18.1) in **R** via the **rstan** package. Additionally, the necessary packages and options that will be used in subsequent **R** code are presented below.

```
# Rstan Package and Options
library(rstan)
rstan_options(auto_write = TRUE)
# Other Required Packages
library(tidyverse)
```

`## Warning: package 'tidyverse' was built under R version 3.5.3`

`## Warning: package 'dplyr' was built under R version 3.5.3`

`## Warning: package 'forcats' was built under R version 3.5.3`

In a social setting where groups of individuals interact, it is likely that the behavior of individual \(a \in \{1, 2, \ldots, n\}\) (called the actor) in group \(g\) is affected not only by his/her own latent traits, but also those of the individuals he/she interacts with. There could also be a ``unique’’ component attributable to the specific composition of the group that could affect the actor’s behavior above and beyond the effects at the individual level. Gin et al. (2019) proposed extending any IRT model to deal with such a setting by replacing the latent trait \(\theta_a\) of individual \(a\), with a composite latent trait \(\theta_{a, g}\) of individual \(a\) in the context of group \(g\) of size \(n\): \[\begin{equation} \label{generallinear} \theta_{a,g} \equiv \alpha_a + \sum_{\substack{j = 1\\j \ne a}}^n \beta_j + \sum_{k \in K} \gamma_{a, g(k)}. \end{equation}\] For the last term, the index set is defined as \(K := \{A\subseteq \{1, 2, \ldots, n\}\setminus\{a\} \mid |A| \ge 1\}\) (i.e., the set of all subsets of \(\{1, 2, \ldots, n\}\setminus\{a\}\) except the empty set).

In particular, if (i) the individuals interact only in pairs, and (ii) the Partial Credit Model is appropriate to describe the underlying item response patterns, the behavior \(y_{a,p,i}\) of the actor \(a\) when paired with the partner \(p\) in response to item \(i\) with \(m_i\) categories, can be modeled as \[\begin{equation} \label{dirt} \log \left( \frac{\Pr(y_{a, p, i}=j \mid \theta_{a, p}, \delta_{i,j})}{\Pr(y_{a, p, i}=j-1 \mid \theta_{a, p}, \delta_{i,j})} \right) = \theta_{a, p} - \delta_{i, j} \equiv (\alpha_a + \beta_p + \gamma_{a,p}) - \delta_{i, j}, \end{equation}\] subject to the constraint that \(\sum_{j=0}^{m_i-1} \Pr(y_{a, p, i}=j \mid \theta_{a, p}, \delta_{i, j}) = 1\), where \(j \in \{1, 2 \ldots, m_i-1\}\). Here, \(\delta_{i, j}\) are item step difficulties with \(\delta_{i,0} = 0\).

Alternatively, the Partial Credit Model can also be formulated as \[\begin{align} \label{dirt2} \Pr(y_{a, p, i}=j \mid \theta_{a, p}, \delta_{i,j}) &= \frac{\exp\left( \sum_{k=0}^j(\theta_{a, p} - \delta_{i, k})\right)}{\sum_{l=0}^m \exp\left( \sum_{k=0}^l(\theta_{a, p} - \delta_{i, k})\right)}\\ &= \frac{\exp\left( \sum_{k=0}^j((\alpha_a + \beta_p + \gamma_{a,p}) - \delta_{i, k})\right)}{\sum_{l=0}^m \exp\left( \sum_{k=0}^l((\alpha_a + \beta_p + \gamma_{a,p}) - \delta_{i, k})\right)}. \end{align}\] where \(\sum_{k=0}^0(\theta_{a, p} - \delta_{i, k}) = 0\).

In this setting, it is reasonable to assume that the latent trait of individual \(a\) when playing the role of an actor, \(\alpha_a\), is correlated with the latent trait of the same individual playing the role of is a partner, \(\beta_a\). Furthermore, it may also be resonable to assume that the latent trait of the dyad, \(\gamma_{a, p}\) when individual \(a\) is the actor and individual \(p\) is the partner may be correlated with the latent trait of the dyad when their roles are reversed, \(\gamma_{p, a}\). And so, we assume that the latent variables \(\alpha_a\), \(\beta_a\), \(\gamma_{a, p}\), and \(\gamma_{p, a}\) have bivariate normal distributions as follows: \[\begin{align} \label{hyperparameters} \left[ \begin{array}{c} \alpha_a \\ \beta_a \\ \end{array} \right] &\sim \text{N}\left( \left[\begin{array}{c} \mu_\alpha \\ \mu_\beta \\ \end{array} \right] , \left[ \begin{array}{cc} \sigma_\alpha^2 & \rho_{\alpha\beta}\sigma_\alpha\sigma_\beta \\ \rho_{\alpha\beta}\sigma_\alpha\sigma_\beta & \sigma_\beta^2 \\ \end{array} \right] \right), \nonumber \\ \left[ \begin{array}{c} \gamma_{a, p} \\ \gamma_{p, a} \\ \end{array} \right] &\sim \text{N}\left( \left[ \begin{array}{c} \mu_\gamma \\ \mu_\gamma \\ \end{array} \right] , \left[ \begin{array}{cc} \sigma_\gamma^2 & \rho_{\gamma}\sigma_\gamma^2 \\ \rho_{\gamma}\sigma_\gamma^2 & \sigma_\gamma^2 \\ \end{array} \right] \right), \end{align}\] where \(\sigma_\alpha^2\), \(\sigma_\beta^2\), and \(\sigma_\gamma^2\) are the variances of the latent traits \(\alpha\), \(\beta\) and \(\gamma\), \(\rho_{\alpha\beta}\) and \(\rho_\gamma\) are the correlations, and \(\mu\)’s are the corresponding expectations of the latent traits. For identification, we typically set the expectations of the latent variables to zero (i.e., \((\mu_\alpha = \mu_\beta = \mu_\gamma = 0)\)), and allow the item step difficulties \(\delta_{i, j}\) to be unconstrained (i.e., we anchor on latent trait scores instead of item difficulties) with the exception that \(\delta_{i,0} = 0\).

As the Dyadic Partial Credit Model is an extension of the Partial Credit Model, we adapt the *Stan* code from Furr (2017). For ease of exposition, we assume that all items have the same number of response categories (\(m_i = m\) for all \(i\)), although it is straightforward to extend the code below to deal with the case where \(m_i\) is not constant as per Furr (2017).

```
functions {
real pcminteract(int x, real alpha, real beta, real gamma, vector delta) {
vector[rows(delta) + 1] unsummed;
vector[rows(delta) + 1] probs;
unsummed = append_row(rep_vector(0.0, 1), alpha + beta + gamma - delta);
probs = softmax(cumulative_sum(unsummed));
return categorical_lpmf(x+1 | probs);
}
}
data {
int<lower = 1> I; // # items
int<lower = 1> A; // # actors (or partners)
int<lower = 1> U; // # undirected pairs
int<lower = 1> N; // # responses
int<lower = 1, upper = A> aa[N]; // size N array to index actors for each response
int<lower = 1, upper = A> pp[N]; // size N array to index partners for each response
int<lower = 1, upper = I> ii[N]; // size N array to index items for each response
int<lower = 0> x[N]; // size N array for responses; x = 0, 1 ... m_i
int<lower = 1, upper = U> dd[N]; // size N array to index undirected pairs for each response
int<lower = 1, upper = 2> mm[N]; // size N array taking values 1 or 2 for two directed pairs
// involving the same two individuals
}
transformed data {
int M; // # parameters per item (same for all items)
M = max(x);
}
parameters {
vector[M] delta[I]; // length m vector for each item i
vector[2] AB[A]; // size 2 vector of alpha and beta for each person;
vector[2] GG[U]; // size 2 vector of gammas for each undirected pair;
real<lower = 0> sigmaA; // real sd of alpha
real<lower = 0> sigmaB; // real sd of beta
real<lower = 0> sigmaG; // real sd of gamma
real<lower = -1, upper = 1> rhoAB; // real cor between alpha and beta (within person)
real<lower = -1, upper = 1> rhoG; // real cor between gammas (within pair)
}
transformed parameters {
cov_matrix[2] SigmaAB; // 2x2 covariance matrix of alpha and beta
cov_matrix[2] SigmaG; // 2x2 covariance matrix of gammas
SigmaAB[1, 1] = sigmaA^2;
SigmaAB[2, 2] = sigmaB^2;
SigmaAB[1, 2] = rhoAB * sigmaA * sigmaB;
SigmaAB[2, 1] = rhoAB * sigmaA * sigmaB;
SigmaG[1, 1] = sigmaG^2;
SigmaG[2, 2] = sigmaG^2;
SigmaG[1, 2] = rhoG * sigmaG^2;
SigmaG[2, 1] = rhoG * sigmaG^2;
}
model {
AB ~ multi_normal(rep_vector(0.0, 2), SigmaAB);
GG ~ multi_normal(rep_vector(0.0, 2), SigmaG);
for (n in 1:N){
target += pcminteract(x[n], AB[aa[n],1], AB[pp[n],2], GG[dd[n], mm[n]], delta[ii[n]]);
}
}
```

The function block in the code above declares a user-specified function `pcminteract`

which takes in (i) an integer \(x\) for a response to an item, (ii) a real value \(\alpha\) for the latent trait of the actor, (iii) a real value \(\beta\) for the latent trait of the partner, (iv) a real value \(\gamma\) for the unique latent interaction of the pair, and (v) a vector \(\delta\) for the \(m-1\) non-zero item step parameters for the items. The `unsummed`

vector in the code has elements \((\alpha_a + \beta_p + \gamma_{a,p}) - \delta_{i, k}\) for \(k = 0, \ldots, m-1\), and when acted on by the `cumulative_sum`

function, produces a vector whose with \(l^{\text{th}}\) element is \(\left( \sum_{k=0}^l((\alpha_a + \beta_p + \gamma_{a,p}) - \delta_{i, k})\right)\). With this vector as input, the `softmax`

function outputs a vector of the same length, whose entries are given by the equation above. Hence, for each item response \(y_{a, p, i}\), the function `pcminteract`

outputs its associated likelihood contribution.

In the data block, we first declare integers for the number of items, actors, directed and undirected pairs, and the total number of responses. Next, we define unidimensional arrays `aa`

, `pp`

, and `ii`

, each of size \(N\), to index the associated actor, partner, and item for each of the \(N\) responses. That is, when we input the data in **R**, we would have a vector of length \(N\) for all responses, a vector of length \(N\) enumerating which of the \(A\) actors was responding, etc. Next, we also have a unidimensional array `x`

of size \(N\) collecting all the responses to be analyzed. Additionally, because the latent trait \(\gamma_{a, p}\) of the pair is correlated with the latent trait \(\gamma_{p,a}\) of the same pair when the roles of actor and partner are reversed, we need an indexing system to link these two directed pairs. As such, we define unidimensional arrays, `dd`

and `mm`

each of length \(N\). `dd`

takes the values \(1\) to \(U\) where \(U\) is the number of pairs ignoring their roles as either actor or partner (also called undirected pair), and `mm`

takes values 1 and 2 to (arbitarily) distinugish the two directed pairs corresponding to each directed pair. The transformed data block generates a variable \(M\) which represents the number of step difficulty parameters per item.

In the parameters block, for each of \(I\) items, we declare a vector `delta`

of \(M\) parameters to be estimated. For each of \(A\) individuals, we then declare a two-dimensional vector `AB`

that contains the latent traits of the individuals both as an actor (element 1) and as a partner (element 2). This is needed because each individual’s latent trait as an actor is correlated with his/her own latent trait as a partner. For each of \(U\) undirected pairs, we also declare a two-dimensional vector `GG`

that contains the latent traits of each of the two associated directed pairs. Finally, we declare real values for the standard deviations and correlations of the latent traits.

In order to properly specify bivariate priors for each individual and for each undirected pair, we construct two \(2\times 2\) matrices comprising the variances and covariances in the transformed parameters block.

Lastly, in the model block, we specify the Gaussian priors on the latent variables both at the individual and pair levels, and the sum of the log likelihood contributions.

We make use of a speed-dating dataset (Fisman et al. 2006, 2008) to examine the mutual attractiveness ratings of two individuals in a dyad to look for evidence of interactions that cannot be explained solely by the individuals’ attractiveness or rating preferences. The data, consisting of 551 individuals, interacting in 4,119 undirected pairs, were collected at 21 separate researcher-organized speed-dating sessions, over a period of 2 years, with 10-44 heterosexual students from graduate and professional schools at Columbia University in each session. During these sessions, attended by nearly an equal number of male and female participants, all members of one gender would meet and interact with every member of the opposite gender for 5 minutes each. At the end of the 5 minute session, participants would rate their partner based on five attractiveness criteria on a scale of 1 to 10 on a form attached to a clipboard that they were provided with.

After cleaning and formatting the publicly available dataset, we collect information into one file `df.complete`

that captures individual partner ratings from each actor according to the five criteria. From the `df.complete`

dataset, the columns labeled `actor`

, `partner`

, and `item`

identify the actor, partner, and the item to which the actor has responded. The item responses are in column `x`

and were recoded to take values from 0 to 4. Additionally, the columns `unique.pair`

and `selector`

index the undirected pair as well as an arbitary selection variable that takes the value 1 or 2 to distinguish the pair when the actor and partner switch roles. For example, as seen below, row 171 of the `df.complete`

file shows that actor 4 rated partner 15 on item 1 with a score of 2. In contrast, actor 15 rated partner 4 on item 1 with a score of 3. Both individuals 4 and 15 are part of the undirected pair labeled 35, and the selector variable differentiates the situation when individual 4 is the actor from when he/she is the partner.

```
# explore datasets
load("df.complete.Rdata")
df.complete[df.complete$unique.pair==35, ]
```

```
## actor partner item x unique.pair selector male decision pair
## 171 4 15 1 2 35 1 0 0 35
## 172 4 15 2 3 35 1 0 0 35
## 173 4 15 3 4 35 1 0 0 35
## 174 4 15 4 3 35 1 0 0 35
## 175 4 15 5 3 35 1 0 0 35
## 715 15 4 1 3 35 2 1 1 144
## 716 15 4 2 3 35 2 1 1 144
## 717 15 4 3 3 35 2 1 1 144
## 718 15 4 4 4 35 2 1 1 144
## 719 15 4 5 4 35 2 1 1 144
## match.pair dpair
## 171 144 5557
## 172 144 5557
## 173 144 5557
## 174 144 5557
## 175 144 5557
## 715 35 5558
## 716 35 5558
## 717 35 5558
## 718 35 5558
## 719 35 5558
```

As described in the previous section, we need to ensure that all entires in the data block of the *Stan* model are provided. We begin by entering the number of items \(I\), individuals \(A\), undirected pairs \(U\), and the total number of responses \(N\). Next, we enter indices for the actors, partners and items for each response, as vectors of length \(N\) (one element for each response). We then enter the actual responses. Lastly, we enter an index to identify the undirected pair, and an index taking either 1 or 2 to index the directionality of that pair for each response. This is fed into *Stan* as a list, although it is sufficient to include them as objects in the *Stan* enviroment.

For reproducibility, we set the seed to 349 both inside and outside the `stan`

command and use 4 chains with 2000 iterations (including warmup) each.

```
# stan model
I <- max(df.complete$item)
A <- max(df.complete$a)
U <- max(df.complete$unique.pair)
N <- nrow(df.complete)
data <- list(I = I,
A = A,
U = U,
N = N,
aa = as.numeric(df.complete$actor),
pp = as.numeric(df.complete$partner),
ii = as.numeric(df.complete$item),
x = as.numeric(df.complete$x),
dd = as.numeric(df.complete$unique.pair),
mm = as.numeric(df.complete$selector))
set.seed(349)
samples <- stan(model_code=model,
data=data,
iter=2000,
chains=4,
seed = 349)
```

The estimates of the standard deviation and correlation paramaters can be obtained using the `summary`

method. Here, we specify the parameters of interest using the `pars`

option, and also specify suitable quantiles to obtain credible intervals. Although *Stan* provides chain-specific summaries of the estimates, it is usually sufficient to look at the overall estimates given under the header `$summary`

. Here, we are provided with the point estimate, the Monte-Carlo standard error, the estimated standard error, as well as the \(95\%\) credible interval for each of the parameters of interest. The convergence of the chains is evaluated for each parameter using the effective sample size (`n_eff`

) and the \(\hat{R}\) statistic (`Rhat`

).

```
pcm_estimated_values <- summary(samples,
pars = c("sigmaA",
"sigmaB",
"sigmaG",
"rhoAB",
"rhoG"),
probs = c(.025, .975))
pcm_estimated_values
```

```
## $summary
## mean se_mean sd 2.5% 97.5% n_eff
## sigmaA 1.05174762 0.0008655761 0.03768598 0.98255285 1.1275805 1895.6107
## sigmaB 0.70580694 0.0006877849 0.02662534 0.65565742 0.7596038 1498.5973
## sigmaG 0.88533763 0.0006510104 0.01526100 0.85575823 0.9151220 549.5283
## rhoAB 0.03461021 0.0012101041 0.04974198 -0.06309679 0.1323778 1689.6648
## rhoG 0.34738810 0.0008608882 0.02624560 0.29740100 0.3995651 929.4362
## Rhat
## sigmaA 1.002897
## sigmaB 1.000298
## sigmaG 1.005546
## rhoAB 1.002974
## rhoG 1.006515
##
## $c_summary
## , , chains = chain:1
##
## stats
## parameter mean sd 2.5% 97.5%
## sigmaA 1.04943368 0.03661615 0.98288581 1.1232771
## sigmaB 0.70534949 0.02548315 0.65849985 0.7564466
## sigmaG 0.88538737 0.01448910 0.85811324 0.9145936
## rhoAB 0.03733387 0.04785601 -0.05720353 0.1297648
## rhoG 0.34735910 0.02745884 0.29900039 0.4031490
##
## , , chains = chain:2
##
## stats
## parameter mean sd 2.5% 97.5%
## sigmaA 1.04897503 0.03745938 0.97947683 1.1257274
## sigmaB 0.70622441 0.02610795 0.65597712 0.7597755
## sigmaG 0.88318652 0.01513011 0.85550113 0.9135456
## rhoAB 0.02948156 0.05149576 -0.06785402 0.1299124
## rhoG 0.34922475 0.02549863 0.30005796 0.3995651
##
## , , chains = chain:3
##
## stats
## parameter mean sd 2.5% 97.5%
## sigmaA 1.05210451 0.03772080 0.98099539 1.1267277
## sigmaB 0.70520166 0.02657518 0.65360367 0.7585040
## sigmaG 0.88656288 0.01596351 0.85587483 0.9182331
## rhoAB 0.03298757 0.04803654 -0.05798885 0.1290833
## rhoG 0.34418605 0.02637612 0.29385790 0.3961792
##
## , , chains = chain:4
##
## stats
## parameter mean sd 2.5% 97.5%
## sigmaA 1.05647727 0.03850736 0.98526966 1.1314866
## sigmaB 0.70645218 0.02827258 0.65335602 0.7617349
## sigmaG 0.88621373 0.01522165 0.85504865 0.9147353
## rhoAB 0.03863785 0.05101311 -0.06058071 0.1359205
## rhoG 0.34878249 0.02533654 0.29935542 0.3961572
```

Gin et al. (2019) extend the Dyadic Partial Credit Model to (i) account for individual or dyadic covariates that may affect the latent traits (e.g., the gender of the actor), and (ii) to embed the model in a distal outcome regression model. Using the example in the previous section, we show how to incorporate the gender of the actor into the model, as well as how to embed the basic dIRT model into the distal outcome regression of the indicator of an actor electing to see the partner again on the latent traits from the dIRT model using `Stan`

.

```
functions {
real pcminteract(int x, real alpha, real beta, real gamma, vector delta) {
vector[rows(delta) + 1] unsummed;
vector[rows(delta) + 1] probs;
unsummed = append_row(rep_vector(0.0, 1), alpha + beta + gamma - delta);
probs = softmax(cumulative_sum(unsummed));
return categorical_lpmf(x+1 | probs);
}
}
data {
int<lower = 1> I; // # items
int<lower = 1> A; // # actors (or partners)
int<lower = 1> U; // # undirected pairs
int<lower = 1> N; // # responses
int<lower = 1> D; // # decisions
int<lower = 1> B; // # distal regression parameters
int<lower = 1, upper = A> aa[N]; // size N array to index actors for each response
int<lower = 1, upper = A> pp[N]; // size N array to index partners for each response
int<lower = 1, upper = I> ii[N]; // size N array to index items for each response
int<lower = 0> x[N]; // size N array for responses; x = 0, 1 ... m_i
int<lower = 1, upper = U> dd[N]; // size N array to index undirected pairs for each response
int<lower = 1, upper = 2> mm[N]; // size N array taking values 1 or 2 for two directed pairs
// involving the same two individuals
int<lower = 0, upper = 1> gg[N]; // size N array to index gender for each response
int<lower = 1, upper = A> aaa[D]; // size D array to index actors for each decision
int<lower = 1, upper = A> ppp[D]; // size D array to index partners for each decision
int<lower = 1, upper = U> ddd[D]; // size D array to index undirected pairs for each decision
int<lower = 1, upper = 2> mmm[D]; // size D array taking values 1 or 2 for two directed pairs
// involving the same two individuals
int<lower = 0, upper = 1> zzz[D]; // size D array for decisions
}
transformed data {
int M; // # parameters per item (same for all items)
M = max(x);
}
parameters {
vector[M] delta[I]; // length m vector for each item i
vector[2] AB[A]; // size 2 vector of alpha and beta for each person;
vector[2] GG[U]; // size 2 vector of gammas for each undirected pair;
real<lower = 0> sigmaA; // real sd of alpha
real<lower = 0> sigmaB; // real sd of beta
real<lower = 0> sigmaG; // real sd of gamma
real<lower = -1, upper = 1> rhoAB; // real cor between alpha and beta (within person)
real<lower = -1, upper = 1> rhoG; // real cor between gammas (within pair)
real mu; // real value of mean of theta for males
real beta[B]; // B-dimensional array of real valued of beta (distal regression parameters)
}
transformed parameters {
cov_matrix[2] SigmaAB; // 2x2 covariance matrix of alpha and beta
cov_matrix[2] SigmaG; // 2x2 covariance matrix of gammas
SigmaAB[1, 1] = sigmaA^2;
SigmaAB[2, 2] = sigmaB^2;
SigmaAB[1, 2] = rhoAB * sigmaA * sigmaB;
SigmaAB[2, 1] = rhoAB * sigmaA * sigmaB;
SigmaG[1, 1] = sigmaG^2;
SigmaG[2, 2] = sigmaG^2;
SigmaG[1, 2] = rhoG * sigmaG^2;
SigmaG[2, 1] = rhoG * sigmaG^2;
}
model {
AB ~ multi_normal(rep_vector(0.0, 2), SigmaAB);
GG ~ multi_normal(rep_vector(0.0, 2), SigmaG);
for (n in 1:N){
target += pcminteract(x[n], AB[aa[n],1] - mu*gg[n], AB[pp[n],2], GG[dd[n], mm[n]], delta[ii[n]]);
}
for (d in 1:D){
//distal logistic regression
target += bernoulli_logit_lpmf(zzz[d] | (beta[1]
+ beta[2]*AB[aaa[d],1]
+ beta[3]*AB[ppp[d],1]
+ beta[4]*AB[aaa[d],2]
+ beta[5]*AB[ppp[d],2]
+ beta[6]*GG[ddd[d], mmm[d]]
+ beta[7]*GG[ddd[d], (3-mmm[d])]));
}
}
```

Firstly, suppose there is a difference in the mean of the latent traits of males versus females. If we assume now that the latent traits \(\alpha\) and \(\beta\) are drawn from a bivarate normal distribution given by \[\begin{align}
\left[ \begin{array}{c}
\alpha_a \\
\beta_a \\
\end{array} \right]
&\sim
\text{N}\left(
\left[\begin{array}{c}
\mu 1_{\text{male}} \\
0 \\
\end{array} \right]
,
\left[ \begin{array}{cc}
\sigma_\alpha^2 & \rho_{\alpha\beta}\sigma_\alpha\sigma_\beta \\
\rho_{\alpha\beta}\sigma_\alpha\sigma_\beta & \sigma_\beta^2 \\
\end{array} \right]
\right),
\end{align}\] where \(1_{\text{male}}\) is an indicator for individual \(a\) being male, and \(\mu\) is difference between the mean latent trait between males and females, then we can modify the `Stan`

code in the previous section by first defining a unidimensional array `gg`

of size \(N\) to index the gender of the actor in the data block. Then, in the parameters block, we declare an additional real parameter `mu`

. Since drawing the latent traits \(\alpha\) and \(\beta\) from a bivariate normal with mean \((\mu 1_{\text{male}}, 0)^\intercal\) is equivalent to drawing the transformed latent trait vector \((\alpha - \mu 1_{\text{male}}, \beta)^\intercal\) from a bivariate normal distribution with the same covariance matrix but with mean \((0, 0)^\intercal\), we can incorporate the `mu`

parameter in the second argument of the `pcminteract`

function in the model block.

Next, suppose we have access to a distal outcome for each directed pair, and we are interested in the effect of the latent traits of each dyad and the latent traits of the individuals comprising the dyad on the distal outcome. That is, for each dyad \((a, p)\), we are interested in estimated the model

\[\begin{equation} \log \left( \frac{\pi_{a, p}}{1-\pi_{a, p}}\right) = b_0 + b_1\alpha_a + b_2\alpha_p + b_3\beta_a + b_4\beta_p + b_5\gamma_{a, p} + b_6\gamma_{p, a}, \end{equation}\]

where \(\pi_{a, p}\) is the probability that the distal outcome takes the value of 1, and \(\alpha_a\), \(\alpha_p\), \(\beta_a\), \(\beta_p\), \(\gamma_{a,p}\), and \(\gamma_{p,a}\) are the latent traits in the dIRT model. The additional data we need to provide could take the following form where each row represents a directed pair and the last column labeled `decision’ contains the distal outcome.

```
# explore datasets
load("dpair.specific.Rdata")
dpair.specific[dpair.specific$unique.pair==35, ]
```

```
## # A tibble: 2 x 6
## actor partner unique.pair selector male decision
## <dbl> <dbl> <dbl> <dbl> <int> <int>
## 1 4 15 35 1 0 0
## 2 15 4 35 2 1 1
```

We may be interested in joinly estimating the dIRT model toegther with this distal outcome regression model. To do so, we would need to declare the number of distal outcomes `D`

(which is equal to the number of directed pairs) as well as the number of regression coefficients `B`

in the data block. Next, we declare unidimensional arrays `aaa`

, `ppp`

, `ddd`

and `mmm`

, each of size \(D\), to index the associated actor, partner, and directed pair for each of the \(D\) responses. Note that these are entirely analagous to the arrays `aa`

, `pp`

, `dd`

, and `mm`

but are indices for the distal outcome instead of the responses. We also declare an array of size \(D\) to capture the distal outcomes.

In the parameters block, we declare an array of length `B`

for each regression coefficient in the distal outcome regression model.

Lastly, in the model block, we specifiy the distal outcome regression model contribution to the loglikelihood. Note that this contribution is in addition to that from the dIRT model. The total contribution is captured by the term `target`

which loops over all responses followed by all distal outcomes. In this example, since we are running a distal outcome logistic regression, we make use of the built-in `bernoulli_logit_lpmf`

function.

Fisman, Raymond, Sheena S. Iyengar, Emir Kamenica, and Itamar Simonson. 2006. “Gender Differences in Mate Selection: Evidence from a Speed Dating Experiment.” *The Quarterly Journal of Economics* 121: 673–97.

———. 2008. “Racial Preferences in Dating.” *The Review of Economic Studies* 75: 117–32.

Furr, Daniel C. 2017. “Partial Credit and Generalized Partial Credit Models with Latent Regression.” *Stan Case Studies*. Retrieved from https://mc-stan.org/users/documentation/case-studies/pcm_and_gpcm.html.

Gin, Brian, Nicholas Sim, Anders Skrondal, and Sophia Rabe-Hesketh. 2019. “A Dyadic IRT Model.” *arXiv E-Prints*, June, arXiv:1906.01100. http://arxiv.org/abs/1906.01100.