# Calculating the Log Sum of Exponentials

November 21, 2008

The so-called “log sum of exponentials” is a functional form commonly encountered in dynamic discrete choice models in economics. It arises when the choice-specific error terms have a type 1 extreme value distribution–a very common assumption. In these problems there is typically a vector $v$ of length $J$ which represents the non-stochastic payoff associated with each of $J$ choices. In addition to ${v}_{j}$, the payoff associated with choice $j$ has a random component ${\epsilon }_{j}$. If these components are independent and identically distributed across $j$ and follow the type 1 extreme value distribution, then the expected payoff from choosing optimally is $𝔼\left[\mathrm{max}\left\{{v}_{1}+{\epsilon }_{1},\dots ,{v}_{J}+{\epsilon }_{J}\right\}\right]=\mathrm{ln}\left[\mathrm{exp}\left({v}_{1}\right)+\dots +\mathrm{exp}\left({v}_{J}\right)\right].$ We need to numerically evaluate the expression on the right hand side for many different vectors $v$.

Special care is required to calculate this expression in compiled languages such as Fortran or C to avoid numerical problems. The function needs to work for $v$ with very large and very small components. A large ${v}_{j}$ can cause overflow due to the exponentiation. Similarly, when the ${v}_{j}$ are large (in absolute value) and negative, the exponential terms vanish. Taking the logarithm of a very small number can result in underflow.

A simple transformation can avoid both of these problems. Consider the case where $v=\left(a,b\right)$. Note that we can write $\mathrm{exp}\left(a\right)+\mathrm{exp}\left(b\right)=\left[\mathrm{exp}\left(a-c\right)+\mathrm{exp}\left(b-c\right)\right]\mathrm{exp}\left(c\right)$ for any $c$. Furthermore, we have $\mathrm{ln}\left[\left(\mathrm{exp}\left(a-c\right)+\mathrm{exp}\left(b-c\right)\right)\mathrm{exp}\left(c\right)\right]=\mathrm{ln}\left[\mathrm{exp}\left(a-c\right)+\mathrm{exp}\left(b-c\right)\right]+c.$ We can choose $c$ in a way that reduces the possibility of overflow. Underflow is also possible when taking the logarithm of a number close to zero, since $\mathrm{ln}\left(x\right)\to -\infty$ as $x\to 0$. Thus, we also need to account for large negative elements of $v$.

The following code is a Fortran implementation of a function which makes the adjustments described above. In order to prevent numerical overflow, the function shifts the vector v by a constant c, applies exp and log, and then adjusts the result to account for the shift.

function log_sum_exp(v) result(e)
real, dimension(:), intent(in) :: v   ! Input vector
real                           :: e   ! Result is log(sum(exp(v)))
real                           :: c   ! Shift constant

! Choose c to be the element of v that is largest in absolute value.
if ( maxval(abs(v)) > maxval(v) ) then
c = minval(v)
else
c = maxval(v)
end if

e = log(sum(exp(v-c))) + c
end function log_sum_exp


Note that this function is still not completely robust to very poorly scaled vectors v and so over- or underflow is still possible (just much less likely).