Calculating the Log Sum of Exponentials

November 21, 2008

The so-called “log sum of exponentials” is a functional form commonly encountered in dynamic discrete choice models in economics. It arises when the choice-specific error terms have a type 1 extreme value distribution–a very common assumption. In these problems there is typically a vector v of length J which represents the non-stochastic payoff associated with each of J choices. In addition to v j, the payoff associated with choice j has a random component ε j. If these components are independent and identically distributed across j and follow the type 1 extreme value distribution, then the expected payoff from choosing optimally is 𝔼[max{v 1+ε 1,,v J+ε J}]=ln[exp(v 1)++exp(v J)]. We need to numerically evaluate the expression on the right hand side for many different vectors v.

Special care is required to calculate this expression in compiled languages such as Fortran or C to avoid numerical problems. The function needs to work for v with very large and very small components. A large v j can cause overflow due to the exponentiation. Similarly, when the v j are large (in absolute value) and negative, the exponential terms vanish. Taking the logarithm of a very small number can result in underflow.

A simple transformation can avoid both of these problems. Consider the case where v=(a,b). Note that we can write exp(a)+exp(b)=[exp(ac)+exp(bc)]exp(c) for any c. Furthermore, we have ln[(exp(ac)+exp(bc))exp(c)]=ln[exp(ac)+exp(bc)]+c. We can choose c in a way that reduces the possibility of overflow. Underflow is also possible when taking the logarithm of a number close to zero, since ln(x) as x0. Thus, we also need to account for large negative elements of v.

The following code is a Fortran implementation of a function which makes the adjustments described above. In order to prevent numerical overflow, the function shifts the vector v by a constant c, applies exp and log, and then adjusts the result to account for the shift.

function log_sum_exp(v) result(e)
  real, dimension(:), intent(in) :: v   ! Input vector
  real                           :: e   ! Result is log(sum(exp(v)))
  real                           :: c   ! Shift constant

  ! Choose c to be the element of v that is largest in absolute value.
  if ( maxval(abs(v)) > maxval(v) ) then
     c = minval(v)
     c = maxval(v)
  end if

  e = log(sum(exp(v-c))) + c
end function log_sum_exp

Note that this function is still not completely robust to very poorly scaled vectors v and so over- or underflow is still possible (just much less likely).