17.4 Log sum of exponentials

This is an old version, view current version.

Working on the log scale, multiplication is converted to addition, $\log (a \cdot b) = \log a + \log b.$ Thus sequences of multiplication operations can remain on the log scale. But what about addition? Given $\log a$ and $\log b$ , how do we get $\log (a + b)$ ? Working out the algebra, $\log (a + b) = \log (\exp(\log a) + \exp(\log b)).$

17.4.1 Log-sum-exp function

The nested log of sum of exponentials is so common, it has its own name, “log-sum-exp”, $\textrm{log-sum-exp}(u, v) = \log (\exp(u) + \exp(v)).$ so that $\log (a + b) = \textrm{log-sum-exp}(\log a, \log b).$

Although it appears this might overflow as soon as exponentiation is introduced, evaluation does not proceed by evaluating the terms as written. Instead, with a little algebra, the terms are rearranged into a stable form, $\textrm{log-sum-exp}(u, v) = \max(u, v) + \log\big( \exp(u - \max(u, v)) + \exp(v - \max(u, v)) \big).$

Because the terms inside the exponentiations are $u - \max(u, v)$ and $v - \max(u, v)$ , one will be zero and the other will be negative. Because the operation is symmetric, it may be assumed without loss of generality that $u \geq v$ , so that $\textrm{log-sum-exp}(u, v) = u + \log\big(1 + \exp(v - u)\big).$

Although the inner term may itself be evaluated using the built-in function log1p, there is only limited gain because $\exp(v - u)$ is only near zero when $u$ is much larger than $v$ , meaning the final result is likely to round to $u$ anyway.

To conclude, when evaluating $\log (a + b)$ given $\log a$ and $\log b$ , and assuming $\log a > \log b$ , return

$\log (a + b) = \log a + \textrm{log1p}\big(\exp(\log b - \log a)\big).$

17.4.2 Applying log-sum-exp to a sequence

The log sum of exponentials function may be generalized to sequences in the obvious way, so that if $v = v_1, \ldots, v_N$ , then $\begin{eqnarray*} \textrm{log-sum-exp}(v) & = & \log \sum_{n = 1}^N \exp(v_n) \\[4pt] & = & \max(v) + \log \sum_{n = 1}^N \exp(v_n - \max(v)). \end{eqnarray*}$ The exponent cannot overflow because its argument is either zero or negative. This form makes it easy to calculate $\log (u_1 + \cdots + u_N)$ given only $\log u_n$ .

17.4.3 Calculating means with log-sum-exp

An immediate application is to computing the mean of a vector $u$ entirely on the log scale. That is, given $\log u$ and returning $\log \textrm{mean}(u)$ . $\begin{eqnarray*} \log \left( \frac{1}{N} \sum_{n = 1}^N u_n \right) & = & \log \frac{1}{N} + \log \sum_{n = 1}^N \exp(\log u_n) \\[4pt] & = & -\log N + \textrm{log-sum-exp}(\log u). \end{eqnarray*}$ where $\log u = (\log u_1, \ldots, \log u_N)$ is understood elementwise.