Multiple Indexing and Range Indexing

Stan allows multiple indexes to be provided for containers (i.e., arrays, vectors, and matrices) in a single position, using either an array of integer indexes or range bounds. In many cases, there are functions that provide similar behavior.

Allowing multiple indexes supports inline vectorization of models. For instance, consider the data model for a varying-slope, varying-intercept hierarchical linear regression, which could be coded as

for (n in 1:N) {
  y[n] ~ normal(alpha[ii[n]] + beta[ii[n]] * x[n], sigma);
}

With multiple indexing, this can be coded in one line, leading to more efficient vectorized code.

y ~ normal(alpha[ii] + rows_dot_product(beta[ii], x), sigma);

This latter version is faster than the loop version; it is equivalent in speed to the clunky assignment to a local variable.

{
  vector[N] mu;
  for (n in 1:N) {
    mu[n] = alpha[ii[n]] + beta[ii[n]] * x[n];
  }
  y ~ normal(mu, sigma);
}

The boost in speed compared to the original version is because the single call to the normal log density in the distribution statement will be much more memory efficient than the original version.

Multiple indexing

The following is the simplest concrete example of multiple indexing with an array of integers; the ellipses stand for code defining the variables as indicated in the comments.

array[3] int c;
// ... define: c == (5, 9, 7)
array[4] int idxs;
// ... define: idxs == (3, 3, 1, 2)
array[4] int d;
d = c[idxs];    // result: d == (7, 7, 5, 9)

In general, the multiple indexed expression c[idxs] is defined as follows, assuming idxs is of size K.

c[idxs] = ( c[idxs[1]], c[idxs[2]], ..., c[idxs[K]] )

Thus c[idxs] is of the same size as idxs, which is K in this example.

Multiple indexing can also be used with multi-dimensional arrays. For example, consider the following.

array[2, 3] int c;
// ... define: c = ((1, 3, 5), ((7, 11, 13))
array[4] int idxs;
// ... define: idxs = (2, 2, 1, 2)
array[4, 3] int d
d = c[idxs];    // result: d = ((7, 11, 13), (7, 11, 13),
                //              (1, 3, 5), (7, 11, 13))

That is, putting an index in the first position acts exactly the same way as defined above. The fact that the values are themselves arrays makes no difference—the result is still defined by c[idxs][j] == c[idxs[j]].

Multiple indexing may also be used in the second position of a multi-dimensional array. Continuing the above example, consider a single index in the first position and a multiple index in the second.

array[4] int e;
e = c[2, idxs]; // result:  c[2] = (7, 11, 13)
                // result:  e = (11, 11, 7, 11)

The single index is applied, the one-dimensional result is determined, then the multiple index is applied to the result. That is, c[2,idxs] evaluates to the same value as c[2][idxs].

Multiple indexing can apply to more than one position of a multi-dimensional array. For instance, consider the following

array[2, 3] int c;
// ... define: c = ((1, 3, 5), (7, 11, 13))
array[3] int idxs1;
// ... define: idxs1 = (2, 2, 1)
array[2] int idxs2;
// ... define: idxs2 = (1, 3)
array[3, 2] int d;
d = c[idxs1, idxs2];  // result: d = ((7, 13), (7, 13), (1, 5))

With multiple indexes, we no longer have c[idxs1, idxs2] being the same as c[idxs1][idxs2]. Rather, the entry d[i, j] after executing the above is given by

d[i, j] == c[idxs1, idxs2][i, j] = c[idxs1[i], idxs2[j]]

This example illustrates the operation of multiple indexing in the general case: a multiple index like idxs1 converts an index i used on the result (here, c[idxs1, idxs2]) to index idxs1[i] in the variable being indexed (here, c). In contrast, a single index just returns the value at that index, thus reducing dimensionality by one in the result.

Slicing with range indexes

Slicing returns a contiguous slice of a one-dimensional array, a contiguous sub-block of a two-dimensional array, and so on. Semantically, it is just a special form of multiple indexing.

Lower and upper bound indexes

For instance, consider supplying an upper and lower bound for an index.

array[7] int c;
// ...
array[4] int d;
d = c[3:6];  // result: d == (c[3], c[4], c[5], c[6])

The range index 3:6 behaves semantically just like the multiple index (3, 4, 5, 6). In terms of implementation, the sliced upper and/or lower bounded indices are faster and use less memory because they do not explicitly create a multiple index, but rather use a direct loop. They are also easier to read, so should be preferred over multiple indexes where applicable.

Lower or upper bound indexes

It is also possible to supply just a lower bound, or just an upper bound. Writing c[3:] is just shorthand for c[3:size(c)]. Writing c[:5] is just shorthand for c[1:5].

Full range indexes

Finally, it is possible to write a range index that covers the entire range of an array, either by including just the range symbol (:) as the index or leaving the index position empty. In both cases, c[] and c[:] are equal to c[1:size(c)], which in turn is just equal to c.

Slicing functions

Stan provides head and tail functions that pull out prefixes or suffixes of vectors, row vectors, and one-dimensional arrays. In each case, the return type is the same as the argument type. For example,

vector[M] a = ...;
vector[N] b = head(a, N);

assigns b to be a vector equivalent to the first N elements of the vector a. The function tail works the same way for suffixes, with

array[M] a = ...;
array[N] b = tail(a, N);

Finally, there is a segment function, which specifies a first element and number of elements. For example,

array[15] a = ...;
array[3] b = segment(a, 5, 3);

will set b to be equal to { a[5], a[6], a[7] }, so that it starts at element 5 of a and includes a total of 3 elements.

Multiple indexing on the left of assignments

Multiple expressions may be used on the left-hand side of an assignment statement, where they work exactly the same way as on the right-hand side in terms of picking out entries of a container. For example, consider the following.

array[3] int a;
array[2] int c;
array[2] int idxs;
// ... define: a == (1, 2, 3);  c == (5, 9)
               //         idxs = (3,2)
a[idxs] = c;   // result: a == (1, 9, 5)

The result above can be worked out by noting that the assignment sets a[idxs[1]] (a[3]) to c[1] (5) and a[idxs[2]] (a[2]) to c[2] (9).

The same principle applies when there are many multiple indexes, as in the following example.

array[5, 7] int a;
array[2, 2] int c;
// ...
a[2:3, 5:6] = c;  // result: a[2, 5] == c[1, 1];  a[2, 6] == c[1, 2]
                  //         a[3, 5] == c[2, 1];  a[3, 6] == c[2, 2]

As in the one-dimensional case, the right-hand side is written into the slice, block, or general chunk picked out by the left-hand side.

Usage on the left-hand side allows the full generality of multiple indexing, with single indexes reducing dimensionality and multiple indexes maintaining dimensionality while rearranging, slicing, or blocking. For example, it is valid to assign to a segment of a row of an array as follows.

array[10, 13] int a;
array[2] int c;
// ...
a[4, 2:3] = c;  // result:  a[4, 2] == c[1];  a[4, 3] == c[2]

Assign-by-value and aliasing

Aliasing issues arise when there are references to the same data structure on the right-hand and left-hand side of an assignment. For example, consider the array a in the following code fragment.

array[3] int a;
// ... define: a == (5, 6, 7)
a[2:3] = a[1:2];
// ... result: a == (5, 5, 6)

The reason the value of a after the assignment is $(5, 5, 6)$ rather than $(5, 5, 5)$ is that Stan behaves as if the right-hand side expression is evaluated to a fresh copy. As another example, consider the following.

array[3] int a;
array[3] int idxs;
// ... define idxs = (2, 1, 3)
a[idxs] = a;

In this case, it is evident why the right-hand side needs to be copied before the assignment.

It is tempting (but wrong) to think of the assignment a[2:3] = a[1:2] as executing the following assignments.

// ... define: a = (5, 6, 7)
a[2] = a[1];      // result: a = (5, 5, 7)
a[3] = a[2];      // result: a = (5, 5, 5)!

This produces a different result than executing the assignment because a[2]’s value changes before it is used.

Multiple indexes with vectors and matrices

Multiple indexes can be supplied to vectors and matrices as well as arrays of vectors and matrices.

Vectors

Vectors and row vectors behave exactly the same way as arrays with multiple indexes. If v is a vector, then v[3] is a scalar real value, whereas v[2:4] is a vector of size 3 containing the elements v[2], v[3], and v[4].

The only subtlety with vectors is in inferring the return type when there are multiple indexes. For example, consider the following minimal example.

array[3] vector[5] v;
array[7] int idxs;
// ...
vector[7] u;
u = v[2, idxs];

array[7] real w;
w = v[idxs, 2];

The key is understanding that a single index always reduces dimensionality, whereas a multiple index never does. The dimensions with multiple indexes (and unindexed dimensions) determine the indexed expression’s type. In the example above, because v is an array of vectors, v[2, idxs] reduces the array dimension but doesn’t reduce the vector dimension, so the result is a vector. In contrast, v[idxs, 2] does not reduce the array dimension, but does reduce the vector dimension (to a scalar), so the result type for w is an array of reals. In both cases, the size of the multiple index (here, 7) determines the size of the result.

Matrices

Matrices are a bit trickier because they have two dimensions, but the underlying principle of type inference is the same—multiple indexes leave dimensions in place, whereas single indexes reduce them. The following code shows how this works for multiple indexing of matrices.

matrix[5, 7] m;
// ...
row_vector[3] rv;
rv = m[4, 3:5];    // result is 1 x 3
// ...
vector[4] v;
v = m[2:5, 3];     // result is 3 x 1
// ...
matrix[3, 4] m2;
m2 = m[1:3, 2:5];  // result is 3 x 4

The key is realizing that any position with a multiple index or bounded index remains in play in the result, whereas any dimension with a single index is replaced with 1 in the resulting dimensions. Then the type of the result can be read off of the resulting dimensionality as indicated in the comments above.

Matrices with one multiple index

If matrices receive a single multiple index, the result is a matrix. So if m is a matrix, so is m[2:4]. In contrast, supplying a single index, m[3], produces a row vector result. That is, m[3] produces the same result as m[3, ] or m[3, 1:cols(m)].

Arrays of vectors or matrices

With arrays of matrices, vectors, and row vectors, the basic access rules remain exactly the same: single indexes reduce dimensionality and multiple indexes redirect indexes. For example, consider the following example.

array[5, 7] matrix[3, 4] m;
// ...
array[2] matrix[3, 4] a;
a = m[1, 2:3];  // knock off first array dimension
a = m[3:4, 5];  // knock off second array dimension

In both assignments, the multiple index knocks off an array dimension, but it’s different in both cases. In the first case, a[i] == m[1, i + 1], whereas in the second case, a[i] == m[i + 2, 5].

Continuing the previous example, consider the following.

// ...
vector[2] b;
b = a[1, 3, 2:3, 2];

Here, the two array dimensions are reduced as is the column dimension of the matrix, leaving only a row dimension index, hence the result is a vector. In this case, b[j] == a[1, 3, 1 + j, 2].

This last example illustrates an important point: if there is a lower-bounded index, such as 2:3, with lower bound 2, then the lower bound minus one is added to the index, as seen in the 1 + j expression above.

Continuing further, consider continuing with the following.

// ...
array[2] row_vector[3] c;
c = a[4:5, 3, 1, 2: ];

Here, the first array dimension is reduced, leaving a single array dimension, and the row index of the matrix is reduced, leaving a row vector. For indexing, the values are given by c[i, j] == a[i + 3, 3, 1, j + 1]

Block, row, and column extraction for matrices

Matrix slicing can also be performed using the block function. For example,

matrix[20, 20] a = ...;
matrix[3, 2] b = block(a, 5, 9, 3, 2);

will set b equal to the submatrix of a starting at index [5, 9] and extending 3 rows and 2 columns. Thus block(a, 5, 9, 3, 2) is equivalent to b[5:7, 9:10].

The sub_col function extracts a slice of a column of a matrix as a vector. For example,

matrix[10, 10] a = ...;
vector b = sub_col(a, 2, 3, 5);

will set b equal to the vector a[2:6, 3], taking the element starting at [2, 3], then extending for a total of 5 rows. The function sub_row works the same way for extracting a slice of a row as a row vector. For example, sub_row(a, 2, 3, 5) is equal to the row vector a[2, 3:7], which also starts at position [2, 3] then extends for a total of 5 columns.

Matrices with parameters and constants

Suppose you have a $3 x 3$ matrix and know that two entries are zero but the others are parameters. Such a situation arises in missing data situations and in problems with fixed structural parameters.

Suppose a $3 \times 3$ matrix is known to be zero at indexes $[1, 2]$ and $[1, 3]$ . The indexes for parameters are included in a “melted” data-frame or database format.

transformed data {
  array[7, 2] int<lower=1, upper=3> idxs
    = { {1, 1},
        {2, 1}, {2, 2}, {2, 3},
        {3, 1}, {3, 2}, {3, 3} };
  // ...

The seven remaining parameters are declared as a vector.

parameters {
  vector[7] A_raw;
  // ...
}

Then the full matrix A is constructed in the model block as a local variable.

model {
  matrix[3, 3] A;
  for (i in 1:7) {
    A[idxs[i, 1], idxs[i, 2]] = A_raw[i];
  }
  A[1, 2] = 0;
  A[1, 3] = 0;
  // ...
}

This may seem like overkill in this setting, but in more general settings, the matrix size, vector size, and the idxs array will be too large to code directly. Similar techniques can be used to build up matrices with ad-hoc constraints, such as a handful of entries known to be positive.