## 5.12 Special matrix functions

### 5.12.1 Softmax

The softmax function[^1] maps $$y \in \mathbb{R}^K$$ to the $$K$$-simplex by

$$\text{softmax}(y) = \frac{\exp(y)}{\sum_{k=1}^K \exp(y_k)},$$

where $$\exp(y)$$ is the componentwise exponentiation of $$y$$. Softmax is usually calculated on the log scale,

$$\begin{eqnarray*} \log \text{softmax}(y) & = & y - \log \sum_{k=1}^K \exp(y_k) \\[4pt] & = & y - \mathrm{log\_sum\_exp}(y), \end{eqnarray*}$$

where subtracting the scalar $$\mathrm{log\_sum\_exp}(y)$$ from the vector $$y$$ subtracts it from each component of $$y$$.
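For example, for $$y = (1, 2, 3)^\top$$,

$$\text{softmax}(y) = \frac{(e^1, e^2, e^3)^\top}{e^1 + e^2 + e^3} \approx (0.090, 0.245, 0.665)^\top,$$

and $$\mathrm{log\_sum\_exp}(y) = \log\left(e^1 + e^2 + e^3\right) \approx 3.408$$, so $$\log \text{softmax}(y) \approx (-2.408, -1.408, -0.408)^\top$$.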

Stan provides the following functions for softmax and its log.

vector softmax(vector x)
The softmax of x

vector log_softmax(vector x)
The natural logarithm of the softmax of x
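A typical use of softmax is converting an unconstrained linear predictor into the simplex parameter of a categorical distribution. The following is a minimal sketch of a multi-logit regression; the data and parameter names (`K`, `N`, `D`, `x`, `y`, `beta`) are illustrative and not part of the function reference.

```stan
data {
  int<lower=2> K;               // number of outcome categories
  int<lower=0> N;               // number of observations
  int<lower=1> D;               // number of predictors
  int<lower=1, upper=K> y[N];   // observed categories
  row_vector[D] x[N];           // predictor rows
}
parameters {
  matrix[K, D] beta;            // one coefficient row per category
}
model {
  to_vector(beta) ~ normal(0, 5);
  for (n in 1:N)
    y[n] ~ categorical(softmax(beta * x[n]'));  // softmax yields a simplex
}
```

In practice this model is more stably written with `categorical_logit`, which operates directly on the unnormalized log scale, in the same way that `log_softmax` avoids exponentiating large components of its argument.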

### 5.12.2 Cumulative sums

The cumulative sum of a sequence $$x_1,\ldots,x_N$$ is the sequence $$y_1,\ldots,y_N$$, where $y_n = \sum_{m = 1}^{n} x_m.$
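For example, the cumulative sum of $$(1, 2, 3)$$ is $$(1, 3, 6)$$.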

real[] cumulative_sum(real[] x)
The cumulative sum of x

vector cumulative_sum(vector v)
The cumulative sum of v

row_vector cumulative_sum(row_vector rv)
The cumulative sum of rv
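One common use of cumulative sums is constructing a strictly increasing sequence from a starting point plus positive increments. The following is a minimal sketch; the names `c1`, `delta`, and `c` are assumed for illustration.

```stan
data {
  int<lower=2> K;                 // length of the sequence
}
parameters {
  real c1;                        // first element of the sequence
  vector<lower=0>[K - 1] delta;   // positive increments between elements
}
transformed parameters {
  vector[K] c;                    // strictly increasing sequence
  c[1] = c1;
  c[2:K] = c1 + cumulative_sum(delta);  // c[k] = c1 + delta[1] + ... + delta[k-1]
}
model {
  c1 ~ normal(0, 5);
  delta ~ exponential(1);
}
```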

[^1]: The softmax function is so called because in the limit as $$y_n \rightarrow \infty$$ with $$y_m$$ for $$m \neq n$$ held constant, the result tends toward the “one-hot” vector $$\theta$$ with $$\theta_n = 1$$ and $$\theta_m = 0$$ for $$m \neq n$$, thus providing a “soft” version of the maximum function.