Calculating the Log Sum of Exponentials

November 21, 2008

The so-called “log sum of exponentials” is a functional form commonly encountered in dynamic discrete choice models in economics. It arises when the choice-specific error terms have a type 1 extreme value distribution–a very common assumption. In these problems there is typically a vector $v$ of length $J$ which represents the non-stochastic payoff associated with each of $J$ choices. In addition to $v_{j}$ , the payoff associated with choice $j$ has a random component $ε_{j}$ . If these components are independent and identically distributed across $j$ and follow the type 1 extreme value distribution, then the expected payoff from choosing optimally is $𝔼 [\max {v_{1} + ε_{1}, \dots, v_{J} + ε_{J}}] = \ln [\exp (v_{1}) + \dots + \exp (v_{J})] .$ We need to numerically evaluate the expression on the right hand side for many different vectors $v$ .

Special care is required to calculate this expression in compiled languages such as Fortran or C to avoid numerical problems. The function needs to work for $v$ with very large and very small components. A large $v_{j}$ can cause overflow due to the exponentiation. Similarly, when the $v_{j}$ are large (in absolute value) and negative, the exponential terms vanish. Taking the logarithm of a very small number can result in underflow.

A simple transformation can avoid both of these problems. Consider the case where $v = (a, b)$ . Note that we can write $\exp (a) + \exp (b) = [\exp (a - c) + \exp (b - c)] \exp (c)$ for any $c$ . Furthermore, we have $\ln [(\exp (a - c) + \exp (b - c)) \exp (c)] = \ln [\exp (a - c) + \exp (b - c)] + c .$ We can choose $c$ in a way that reduces the possibility of overflow. Underflow is also possible when taking the logarithm of a number close to zero, since $\ln (x) \to - ∞$ as $x \to 0$ . Thus, we also need to account for large negative elements of $v$ .

The following code is a Fortran implementation of a function which makes the adjustments described above. In order to prevent numerical overflow, the function shifts the vector v by a constant c, applies exp and log, and then adjusts the result to account for the shift.

function log_sum_exp(v) result(e)
  real, dimension(:), intent(in) :: v   ! Input vector
  real                           :: e   ! Result is log(sum(exp(v)))
  real                           :: c   ! Shift constant

  ! Choose c to be the element of v that is largest in absolute value.
  if ( maxval(abs(v)) > maxval(v) ) then
     c = minval(v)
  else
     c = maxval(v)
  end if

  e = log(sum(exp(v-c))) + c
end function log_sum_exp

Note that this function is still not completely robust to very poorly scaled vectors v and so over- or underflow is still possible (just much less likely).