- What is hBayesDM?
- User Support (New in December, 2016)
- Latest Version
- Motivation
- Why hierarchical Bayesian analysis (HBA)?
- Prerequisites
- List of tasks and computational models implemented in hBayesDM
- How to install hBayesDM
- How to use hBayesDM
- To-do list
- Citation
- Contact information
- Suggested reading
- Other Useful Links
- References

hBayesDM (**h**ierarchical **B**ayesian modeling of **D**ecision-**M**aking tasks) is a user-friendly R package for the hierarchical Bayesian analysis of computational models of decision-making tasks.

Users can ask questions and make suggestions through our mailing list or GitHub.

0.3.0 (Jan 14, 2017)

**What’s new**

Version 0.3.0

- Made several changes following the guidelines for R packages providing interfaces to Stan.

- Stan models are precompiled and models will run immediately when called.

- The default number of chains is set to 4.

- Set the default value of `adapt_delta` to 0.95 to reduce the potential for divergences.

- The `printFit` function uses LOOIC by default. Users can select WAIC or both (LOOIC & WAIC) if needed.

Version 0.2.3.3

- Add help files.

- Add a function (`rhat`) for checking Rhat values.

- Update the link to the tutorial website.

Version 0.2.3.2

- Use wide normal distributions for unbounded parameters.

- Automatic removal of rows (trials) containing NAs.

Version 0.2.3.1

- Add a new function (`plotInd`) for plotting individual parameters. See its help file (`?plotInd`) for more details.

Computational modeling provides a quantitative framework for investigating latent neurocognitive processes (e.g., learning rate, reward sensitivity) and interactions among multiple decision-making systems. Parameters of a computational model reflect psychologically meaningful individual differences; thus, obtaining accurate parameter estimates is critical to interpreting modeling results. Hierarchical Bayesian analysis (HBA) is regarded as the gold standard for parameter estimation, especially when the amount of information from each participant is small (see “Why hierarchical Bayesian analysis?” below). However, many researchers interested in HBA find the approach too technical and challenging to implement.

We introduce a free R package, **hBayesDM**, which offers HBA of various computational models on an array of decision-making tasks (see below for a list of tasks and models currently available). *Users can perform HBA of various computational models with a single line of code.* Example datasets are also available. With hBayesDM, we hope anyone with minimal knowledge of programming can take advantage of advanced computational modeling and HBA. It is our expectation that hBayesDM will contribute to the dissemination of these computational tools and enable researchers in related fields to easily characterize latent neurocognitive processes within their study populations.

Most computational models do not have closed-form solutions, so parameter values must be estimated. Traditionally, parameters are estimated at the individual level with maximum likelihood estimation (MLE), yielding point estimates for each individual separately. However, individual MLE estimates are often noisy, especially when the amount of data is insufficient. A group-level analysis (e.g., group-level MLE), which estimates a single set of parameters for the whole group of individuals, may generate more reliable estimates but inevitably ignores individual differences.

HBA and other hierarchical approaches (e.g., Huys et al., 2011) allow for individual differences while pooling information across individuals. Both individual and group parameter estimates (i.e., posterior distributions) are estimated simultaneously in a mutually constraining fashion. Consequently, individual parameter estimates tend to be more stable and reliable because commonalities among individuals are captured and informed by the group tendencies (e.g., Ahn et al., 2011). HBA also finds full posterior distributions instead of point estimates (thus providing rich information about the parameters). HBA also makes it easy to do group comparisons in a Bayesian fashion (e.g., comparing clinical and non-clinical groups, see an example below).

HBA is a branch of Bayesian statistics, and the conceptual framework of Bayesian data analysis is clearly described in Chapter 2 of John Kruschke’s book (J. Kruschke, 2014). In Bayesian statistics, we assume prior beliefs (i.e., prior distributions) for model parameters and update the priors into posterior distributions given the data (e.g., trial-by-trial choices and outcomes) using Bayes’ rule. Note that the prior distributions we use for model parameters are vague (e.g., flat) or weakly informative priors, so they play a minimal role in the posterior distribution.
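In symbols, the updating described above is just Bayes’ rule: the posterior over parameters \(\theta\) given data \(D\) is the likelihood times the prior, normalized by the marginal likelihood (and proportional to the numerator alone):

\[
p(\theta \mid D) \;=\; \frac{p(D \mid \theta)\, p(\theta)}{p(D)} \;\propto\; p(D \mid \theta)\, p(\theta)
\]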

For Bayesian updating, we use the Stan software package (http://mc-stan.org/), which implements a very efficient Markov Chain Monte Carlo (MCMC) algorithm called Hamiltonian Monte Carlo (HMC). HMC is known to be effective and works well even for large, complex models. See the Stan reference manual (http://mc-stan.org/documentation/) and Chapter 14 of J. Kruschke (2014) for a comprehensive description of HMC and Stan. What is MCMC, and why should we use it? Remember, we need to update our priors into posterior distributions in order to make inferences about model parameters. Simply put, MCMC is a way of approximating a posterior distribution by drawing a large number of samples from it. MCMC algorithms are used when posterior distributions cannot be derived analytically, or when MCMC is more efficient than evaluating the whole grid of the parameter space (i.e., grid search). To learn more about the basic foundations of MCMC, we recommend Chapter 7 of J. Kruschke (2014).

Detailed specification of the Bayesian models is not yet available in text (stay tuned for our tutorial paper, whose citation is listed below). In the meantime, users can go over our Stan code to check how we implement each computational model (e.g., `PathTo_gng_m1 = system.file("stan/gng_m1.stan", package="hBayesDM")`). We made strong efforts to optimize the Stan code through reparameterization (e.g., the Matt trick) and vectorization.
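As an illustration of the reparameterization mentioned above, a non-centered (“Matt trick”) parameterization of a subject-level parameter might look like the following sketch. This is a hypothetical fragment written for illustration, not code taken verbatim from the package’s Stan files; the names (`ep`, `mu_pr`, `sigma`, `N`) are assumptions:

```stan
data {
  int<lower=1> N;             // number of subjects
}
parameters {
  real mu_pr;                 // group-level mean (unconstrained scale)
  real<lower=0> sigma;        // group-level SD
  vector[N] ep_pr;            // subject-level "raw" (standardized) parameters
}
transformed parameters {
  vector<lower=0, upper=1>[N] ep;  // e.g., a learning rate in [0, 1]
  for (i in 1:N)
    // Matt trick: shift and scale a standard-normal draw, then map it to
    // [0, 1], instead of sampling ep directly from normal(mu_pr, sigma)
    ep[i] = Phi_approx(mu_pr + sigma * ep_pr[i]);
}
model {
  mu_pr ~ normal(0, 1);
  sigma ~ cauchy(0, 5);
  ep_pr ~ normal(0, 1);       // standard-normal "raw" parameters
}
```

Sampling the standardized `ep_pr` and transforming it deterministically reduces the correlation between group- and subject-level parameters, which helps HMC explore the posterior efficiently.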

- R version 3.2.0 or later is required. R is freely available from http://www.r-project.org/.
- **The latest Stan (RStan 2.14.1 or later)** is required. Detailed instructions for installing RStan are available at https://github.com/stan-dev/rstan/wiki/RStan-Getting-Started/.
- RStudio (https://www.rstudio.com/products/RStudio/) is not required but strongly recommended.

**Note**: Additional R packages (ggplot2, loo, mail, modeest) will be installed (if not installed yet) during the installation of hBayesDM.

Task (alphabetical order) | Model name | hBayesDM function | References (see below for full citations)
---|---|---|---
Delay Discounting Task | Constant-Sensitivity (CS) model | dd_cs | Ebert & Prelec (2007)
Iowa Gambling Task (IGT) | Prospect Valence Learning-DecayRI | igt_pvl_decay | Ahn et al. (2011; 2014)
Iowa Gambling Task (IGT) | Prospect Valence Learning-Delta | igt_pvl_delta | Ahn et al. (2011; 2014)
Orthogonalized Go/Nogo Task | RW + noise | gng_m1 | Guitart-Masip et al. (2012)
Probabilistic Reversal Learning (PRL) Task | Experience-Weighted Attraction | prl_ewa | Ouden et al. (2013)
Risk-Aversion Task | Prospect Theory (PT) | ra_prospect | Sokol-Hessner et al. (2009)
Two-Armed Bandit (Experience-based) Task | Rescorla-Wagner (delta) model | bandit2arm_delta | Erev et al. (2010)
Ultimatum Game | Ideal Bayesian observer model | ug_bayes | Xiang et al. (2013)

The hBayesDM package is now available on CRAN! There are two ways to install hBayesDM, both of which are described below. *Make sure to install RStan prior to installing hBayesDM.* Typically, RStan can be installed just by typing `install.packages("rstan", dependencies = TRUE)`. **On Windows, you need to install Rtools first in order to install RStan.** How can you tell whether RStan is correctly installed? Check whether you can fit the ‘Eight Schools’ model without a problem. Consult the RStan Getting Started page (linked above) if you experience difficulty installing RStan.
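As a quick check that RStan works, you can fit the ‘Eight Schools’ model mentioned above; the sketch below is adapted from the RStan Getting Started guide (the variable names `schools_code`, `schools_dat`, and `fit` are our own):

```r
library(rstan)

# Stan model for the 'Eight Schools' hierarchical example
schools_code <- "
data {
  int<lower=0> J;          // number of schools
  real y[J];               // estimated treatment effects
  real<lower=0> sigma[J];  // s.e. of effect estimates
}
parameters {
  real mu;
  real<lower=0> tau;
  real eta[J];
}
transformed parameters {
  real theta[J];
  for (j in 1:J)
    theta[j] = mu + tau * eta[j];
}
model {
  eta ~ normal(0, 1);
  y ~ normal(theta, sigma);
}
"

schools_dat <- list(J = 8,
                    y = c(28, 8, -3, 7, -1, 1, 18, 12),
                    sigma = c(15, 10, 16, 11, 9, 11, 10, 4))

fit <- stan(model_code = schools_code, data = schools_dat)
print(fit)   # if this runs and prints a summary, RStan is working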

Use the following call:

`install.packages("hBayesDM", dependencies=TRUE)`

Install the package from GitHub:

```
if (packageVersion("devtools") < 1.6) { # install the devtools package
install.packages("devtools")
}
devtools::install_github("CCS-Lab/hBayesDM")
```

- Download a copy from **here** to a directory (e.g., “~/Downloads”).
- Open R(Studio) and set the working directory to the downloaded folder (e.g., `setwd("~/Downloads")`).
- Install the package from the downloaded file:

`install.packages(pkgs="hBayesDM_0.3.0.tar.gz", dependencies=TRUE, repos=NULL)`

First, open RStudio (or just R) and load the package:

`library(hBayesDM)`

Four steps of doing HBA with hBayesDM are illustrated below. As an example, four models of the orthogonalized Go/Nogo task (Guitart-Masip et al., 2012; Cavanagh et al., 2013) are fit and compared with the hBayesDM package.

**1. Prepare the data**

- For fitting a model with hBayesDM, all subjects’ data should be combined into a single text (*.txt) file. Look at the sample dataset and the help file (e.g., `?gng_m1`) for each task and carefully follow the instructions.
- Subjects’ data must contain variables that are consistent with the column names specified in the help file, though extra variables are allowed in practice.
- It is okay if the number of trials differs across subjects, but there should be no N/A data. If some trials contain N/A data (e.g., `choice=NA` in trial #10), remove those trials first.
- Sample data are available **here**, although users can fit a model with sample data without downloading them separately by using one of the function arguments. Once the hBayesDM package is installed, sample data can also be retrieved from the package folder. Note that the file name of the sample (example) data for a given task is **taskName_exampleData.txt** (e.g., dd_exampleData.txt, igt_exampleData.txt, gng_exampleData.txt, etc.). See each model’s help file (e.g., `?gng_m1`) to check the required data columns and their labels.

`dataPath = system.file("extdata/gng_exampleData.txt", package="hBayesDM")`

If you download the sample data to “~/Downloads”, you may specify the path to the data file like this:

`dataPath = "~/Downloads/gng_exampleData.txt"`
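Once `dataPath` is set, it can help to glance at the file and confirm it matches the format described in the help file. A minimal sketch (the column names in the comment are those we expect for the Go/Nogo sample data per `?gng_m1`; check the help file for your task):

```r
gng_data <- read.table(dataPath, header = TRUE)
head(gng_data)   # expect columns such as subjID, cue, keyPressed, outcome
str(gng_data)    # check column types and the number of rows per subject
```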

**2. Fit candidate models**

Below, the `gng_m1` model is fit with its sample data. The command indicates that four MCMC chains are run and four cores are used for parallel computing. If you enter “example” as the argument for `data`, hBayesDM will use the sample data for the task. Note that you can save the output to a file (see the `saveDir` argument) or send an email when fitting is complete (see the `email` argument). You can also assign your own initial values (see the `inits` argument; e.g., `inits=c(0.1, 0.2, 1.0)`):

`output1 = gng_m1(data="example", niter=2000, nwarmup=1000, nchain=4, ncore=4)`

This is the same as the command below, because the default numbers of total iterations (including warmup), warmup iterations, and chains are 2,000, 1,000, and 4 for `gng` models:

`output1 = gng_m1("example", ncore=4)`

Executing the command will generate messages like those below in the R console. It will take approximately 2-3 minutes (with the `gng_m1` model and “example” data) for the model fitting to complete. Note that you may get warning messages about “numerical problems” or about a certain number of “divergent transitions after warmup”. When we checked our models with the example datasets, warning messages appeared mostly at the beginning of the warmup period, and there were very few divergent transitions after warmup. In such cases, you can ignore the warnings. Also see Appendix D of the Stan Reference Manual.

```
Details:
# of chains = 4
# of cores used = 4
# of MCMC samples (per chain) = 2000
# of burn-in samples = 1000
# of subjects = 10
# of (max) trials per subject = 240
************************************
** Building a model. Please wait. **
************************************
starting worker pid=75130 on localhost:11950 at 08:25:48.905
starting worker pid=75138 on localhost:11950 at 08:25:49.101
SAMPLING FOR MODEL 'gng_m1' NOW (CHAIN 1).
Chain 1, Iteration: 1 / 2000 [ 0%] (Warmup)
SAMPLING FOR MODEL 'gng_m1' NOW (CHAIN 2).
...
```

When model fitting is complete, you will see the message below, and the results are stored in `output1`.

```
************************************
**** Model fitting is complete! ****
************************************
```

`output1`, an hBayesDM object, is a list with the following elements (class: “hBayesDM”):

- `model`: Name of the fitted model (i.e., `output1$model` is ‘gng_m1’).
- `allIndPars`: Summary of individual subjects’ parameters (default: *mean*). Users can also choose to use the *median* or *mode* (e.g., `output1 = gng_m1("example", indPars="mode")`).
- `parVals`: Posterior samples of all parameters, extracted by `rstan::extract(rstan_object, permuted=T)`. **Note that hyper (group) mean parameters are indicated by** `mu_PARAMETER` (e.g., `mu_xi`, `mu_ep`, `mu_rho`).
- `fit`: RStan object (i.e., `fit = stan(file='gng_m1.stan', ...)`).
- `rawdata`: Raw trial-by-trial data used for modeling. Raw data are provided in the output so that users can easily access them and compare trial-by-trial model-based regressors (e.g., prediction errors) with choice data.
- `modelRegressor` (optional): Trial-by-trial model-based regressors, such as prediction errors, the values of the chosen option, etc. For each model, we pre-select appropriate model-based regressors. Currently (version 0.2.3.3), this feature is available only for the orthogonalized Go/Nogo task.
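As a sketch of how the posterior samples in `parVals` can be used (assuming a fitted `output1` from `gng_m1`, whose hyper mean learning rate is `mu_ep`):

```r
pars <- output1$parVals
mean(pars$mu_ep)                        # posterior mean of the group learning rate
quantile(pars$mu_ep, c(0.025, 0.975))   # 95% credible interval
hist(pars$mu_ep, breaks = 50,
     main = "Posterior of mu_ep")       # quick look at the posterior
```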

```
> output1$allIndPars
xi ep rho subjID
1 0.03688558 0.1397615 5.902901 1
2 0.02934812 0.1653435 6.066120 2
3 0.04467025 0.1268796 5.898099 3
4 0.02103926 0.1499842 6.185020 4
5 0.02620808 0.1498962 6.081908 5
...
```

```
> output1$fit
Inference for Stan model: gng_m1.
4 chains, each with iter=2000; warmup=1000; thin=1;
post-warmup draws per chain=1000, total post-warmup draws=4000.
mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
mu_xi 0.03 0.00 0.02 0.00 0.02 0.03 0.05 0.08 2316 1.00
mu_ep 0.15 0.00 0.02 0.11 0.13 0.15 0.16 0.19 4402 1.00
mu_rho 5.97 0.01 0.72 4.76 5.45 5.89 6.40 7.61 3821 1.00
sigma[1] 0.54 0.06 1.02 0.02 0.18 0.35 0.61 1.99 318 1.01
sigma[2] 0.12 0.00 0.08 0.01 0.05 0.10 0.16 0.31 2620 1.00
sigma[3] 0.12 0.00 0.09 0.01 0.05 0.10 0.16 0.33 2402 1.00
...
```

\(\hat{R}\) (`Rhat`) is an index of the convergence of the chains. \(\hat{R}\) values close to 1.00 indicate that the MCMC chains have converged to their stationary target distributions. When we checked the MCMC performance of our models on sample data, \(\hat{R}\) values were 1.00 for most parameters, or at most 1.04.
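To check \(\hat{R}\) values numerically, you can use the package’s `rhat` function (added in version 0.2.3.3). A minimal sketch; see `?rhat` for the exact arguments:

```r
rhat(output1)   # summarize Rhat values of all parameters in the fitted model
```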

**3. Plot model parameters**

Make sure to visually diagnose MCMC performance (i.e., visually check whether the MCMC samples are well mixed and have converged to stationary distributions). For the diagnosis or visualization of hyper (group) parameters, you can use `plot.hBayesDM` or just `plot`, which searches for an extension function matching the class name. The class of any hBayesDM output is `hBayesDM`.

Let’s first visually diagnose MCMC performance of hyper parameters with trace plots:

`plot(output1, type="trace", fontSize=11) # traceplot of hyper parameters. Set font size 11.`
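Beyond trace plots, the same `plot` method can show posterior distributions, and `plotInd` (added in version 0.2.3.1) plots individual subjects’ parameters. A minimal sketch; the argument forms below are assumptions, so check the help files (e.g., `?plotInd`) for the exact signatures:

```r
plot(output1)            # posterior distributions of hyper (group) parameters
plotInd(output1, "ep")   # individual subjects' posteriors for the 'ep' parameter
```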