Maximum Simulated Likelihood

Given a sample of observations $\{y_i : i = 1, \dots, N\}$, the log-likelihood function for an unknown parameter $\theta$ is
$$\ell_N(\theta) \equiv \sum_{i=1}^N \ln f(\theta \mid y_i).$$
Let $\tilde{f}(\theta \mid y_i, \omega)$ be an unbiased simulator such that $E_\omega[\tilde{f}(\theta \mid y, \omega) \mid y] = f(\theta \mid y)$, where $\omega$ is a vector of $R$ simulated random variates. Then, the maximum simulated likelihood (MSL) estimator for $\theta$ is
$$\tilde{\theta}_{\mathrm{MSL}} \equiv \arg\max_\theta \tilde{\ell}_N(\theta)$$
where $\tilde{\ell}_N(\theta) \equiv \sum_{i=1}^N \ln \tilde{f}(\theta \mid y_i, \omega_i)$ for some sequence of simulations $\{\omega_i\}$.
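As a concrete illustration, here is a minimal sketch of MSL in Python. It assumes a hypothetical toy model $y_i = \theta + b_i + e_i$ with $b_i, e_i \sim N(0,1)$, so the latent $b_i$ can be integrated out by averaging over $R$ draws, giving an unbiased simulator of the marginal density. The model, variable names, and sample sizes are illustrative choices, not from the source.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

rng = np.random.default_rng(0)

# Hypothetical toy model: y_i = theta + b_i + e_i with b_i, e_i ~ N(0, 1),
# so the true marginal density is f(theta | y_i) = N(y_i; theta, 2).
N, R, theta_true = 500, 200, 1.0
y = theta_true + rng.normal(size=N) + rng.normal(size=N)

# One fixed vector of draws per observation: omega_i = (b_i1, ..., b_iR).
# Holding the draws fixed across evaluations of theta makes the simulated
# objective a deterministic, smooth function of theta.
omega = rng.normal(size=(N, R))

def simulated_loglik(theta):
    # Unbiased simulator: f_tilde(theta | y_i, omega_i)
    #   = (1/R) * sum_r phi(y_i - theta - b_ir)
    f_tilde = norm.pdf(y[:, None] - theta - omega).mean(axis=1)
    return np.sum(np.log(f_tilde))

res = minimize_scalar(lambda t: -simulated_loglik(t), bounds=(-5, 5), method="bounded")
print("MSL estimate:", res.x)  # should land near theta_true = 1.0
```

Note the design choice of drawing $\omega_i$ once and reusing it at every trial value of $\theta$: redrawing inside the optimizer would make the objective a different random function at each evaluation.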

There are two points which deserve special attention. First, the estimator is conditional on the particular sequence of simulations $\{\omega_i\}$ used; a different sequence yields a different estimate. Second, even though the simulator of $f$ is unbiased, the resulting MSL estimate is biased. That is, $E_\omega[\tilde{f}(\theta \mid y, \omega) \mid y] = f(\theta \mid y)$ does not imply $E[\arg\max_\theta \tilde{\ell}(\theta)] = \arg\max_\theta \ell(\theta)$. Unbiased simulation of the log-likelihood function itself is generally infeasible because the natural log is nonlinear: by Jensen's inequality, $E_\omega[\ln \tilde{f}] \le \ln E_\omega[\tilde{f}] = \ln f$, even though the likelihood itself can usually be simulated without bias.
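A quick numerical illustration of this downward bias, using a generic unbiased simulator with known mean (the distribution of the draws and all constants are assumptions made purely for the demonstration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy unbiased simulator: f_tilde is the mean of R iid Uniform(0, 2f) draws,
# so E[f_tilde] = f exactly, yet E[ln f_tilde] < ln f by Jensen's inequality.
f_true, R, reps = 0.5, 10, 100_000
draws = rng.uniform(0.0, 2.0 * f_true, size=(reps, R))
f_tilde = draws.mean(axis=1)

print("E[f_tilde]    ~", f_tilde.mean())          # ~ 0.5 (unbiased)
print("E[ln f_tilde] ~", np.log(f_tilde).mean())  # < ln(0.5) (biased down)
print("ln f          =", np.log(f_true))
```

Increasing $R$ shrinks the gap, which is the intuition behind the consistency argument below.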

Consistency

All is not lost because, even though our estimate is biased, we can still obtain an estimator whose probability limit is the same as that of the MLE. This requires that the sample average of the simulated log-likelihood converge to the sample average log-likelihood. This can be accomplished by increasing the number of simulations, and thus decreasing the simulation error, at a sufficiently fast rate relative to the sample size. We have the following lemma (see Newey and McFadden, 1994):

Lemma. Suppose the following:

  1. $\theta \in \Theta \subset \mathbb{R}^K$ and $\Theta$ is compact,
  2. $Q_0(\theta)$ and $Q_N(\theta)$ are continuous in $\theta$,
  3. $\theta_0 \equiv \arg\max_{\theta \in \Theta} Q_0(\theta)$ is unique,
  4. $\hat{\theta}_N \equiv \arg\max_{\theta \in \Theta} Q_N(\theta)$, and
  5. $Q_N(\theta) \to Q_0(\theta)$ in probability uniformly in $\theta$ as $N \to \infty$.

Then, $\hat{\theta}_N \to \theta_0$ in probability.

Now, suppose that $f$ satisfies the conditions of this lemma. In particular, suppose that the observations $y_i$ are iid, that $\theta$ is identified, and that $f(\theta \mid y)$ is continuous in $\theta$ over some compact set $\Theta$. Finally, assume that $E[\sup_{\theta \in \Theta} |\ln f(\theta \mid y)|]$ is finite.

Now, given a sequence of simulators $\omega_{ir}$, iid across $r$, the MSL estimator defined as
$$\tilde{\theta}_{\mathrm{MSL}} \equiv \arg\max_\theta \frac{1}{N} \sum_{i=1}^N \ln \tilde{f}(\theta \mid y_i, \omega_i)$$
is consistent if $R \to \infty$ as $N \to \infty$. For a proof, refer to Hajivassiliou and Ruud (1994, p. 2417).
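The following sketch, reusing the hypothetical toy model from the earlier example, illustrates this: as $R$ grows with $N$, the MSL estimate settles near the truth. The growth schedule for $R$ is arbitrary, chosen only for illustration.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

rng = np.random.default_rng(2)

def msl_estimate(N, R, theta_true=1.0):
    # Same toy model as above: y_i = theta + b_i + e_i with b_i, e_i ~ N(0, 1).
    y = theta_true + rng.normal(size=N) + rng.normal(size=N)
    omega = rng.normal(size=(N, R))
    neg_ll = lambda t: -np.sum(np.log(norm.pdf(y[:, None] - t - omega).mean(axis=1)))
    return minimize_scalar(neg_ll, bounds=(-5, 5), method="bounded").x

# Let R grow with N; the estimates should settle near theta_true = 1.0.
for N, R in [(100, 10), (400, 40), (1600, 160)]:
    print(f"N={N:5d}, R={R:4d}: theta_MSL = {msl_estimate(N, R):.3f}")
```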

Asymptotic Normality

Suppose that $\tilde{f}$ is differentiable in $\theta$. Then we can form a Taylor expansion of $\nabla_\theta \tilde{\ell}(\theta)$ around $\theta_0$:
$$\nabla_\theta \tilde{\ell}(\hat{\theta}_{\mathrm{MSL}}) = \nabla_\theta \tilde{\ell}(\theta_0) + \nabla^2_\theta \tilde{\ell}(\bar{\theta})(\hat{\theta}_{\mathrm{MSL}} - \theta_0)$$
for some $\bar{\theta}$ lying on the line segment between $\hat{\theta}_{\mathrm{MSL}}$ and $\theta_0$. By definition, the left-hand side equals zero, and after multiplying by $\sqrt{N}$ and rearranging we find
$$\sqrt{N}(\hat{\theta}_{\mathrm{MSL}} - \theta_0) = -\left[\frac{1}{N} \nabla^2_\theta \tilde{\ell}(\bar{\theta})\right]^{-1} \frac{1}{\sqrt{N}} \nabla_\theta \tilde{\ell}(\theta_0).$$

Now, the consistency of $\hat{\theta}_{\mathrm{MSL}}$ implies consistency of $\bar{\theta}$, and so
$$\frac{1}{N} \nabla^2_\theta \tilde{\ell}(\bar{\theta}) \xrightarrow{p} E[\nabla^2_\theta \ln f(\theta_0 \mid y)].$$
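As a quick numerical check under the same toy model used in the earlier sketches (where the true marginal is $N(\theta, 2)$, so $\nabla^2_\theta \ln f = -1/2$ for every $y$), the average simulated Hessian is close to this limit for moderate $N$ and $R$; all constants are illustrative.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)

# Same toy model: the true marginal is N(theta, 2), whose log-density has
# second derivative -1/2 in theta for every y.
theta0, N, R = 1.0, 2000, 200
y = theta0 + rng.normal(size=N) + rng.normal(size=N)
b = rng.normal(size=(N, R))
z = y[:, None] - theta0 - b

phi  = norm.pdf(z)
f_t  = phi.mean(axis=1)                  # f_tilde
f_t1 = (z * phi).mean(axis=1)            # d/dtheta f_tilde
f_t2 = ((z**2 - 1) * phi).mean(axis=1)   # d^2/dtheta^2 f_tilde
hess = f_t2 / f_t - (f_t1 / f_t) ** 2    # d^2/dtheta^2 ln f_tilde

print("average simulated Hessian:", hess.mean())  # ~ -0.5
```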

As for the gradient term, we have
$$\frac{1}{\sqrt{N}} \nabla_\theta \tilde{\ell}(\theta_0) = \frac{1}{\sqrt{N}} \sum_{i=1}^N \frac{\nabla_\theta \tilde{f}(\theta_0 \mid y_i, \omega_i)}{\tilde{f}(\theta_0 \mid y_i, \omega_i)}.$$
Ideally, to prove asymptotic normality we would like this to converge to some mean-zero normal distribution. However, the expectations of the individual terms in this summation are nonzero, so we cannot apply a central limit theorem directly. We can rewrite this term as follows:
$$\frac{1}{\sqrt{N}} \nabla_\theta \tilde{\ell}(\theta_0) = \frac{1}{\sqrt{N}} \nabla_\theta \ell(\theta_0) + A_N + B_N$$
with

$$A_N = \frac{1}{\sqrt{N}} \sum_{i=1}^N \left\{ \nabla_\theta \ln \tilde{f} - E_\omega[\nabla_\theta \ln \tilde{f}] \right\} \quad \text{and} \quad B_N = \frac{1}{\sqrt{N}} \sum_{i=1}^N \left\{ E_\omega[\nabla_\theta \ln \tilde{f}] - \nabla_\theta \ln f \right\}.$$

The term $A_N$ represents pure simulation noise and has expectation zero. The term $B_N$ represents the simulation bias. Proposition 4 of Hajivassiliou and Ruud (1994, p. 2418) shows that if $R$ grows fast enough relative to $N$, specifically if $R/\sqrt{N} \to \infty$, then the simulation bias is asymptotically negligible. Finally, Proposition 5 (p. 2419) shows that $\hat{\theta}_{\mathrm{MSL}}$ is in fact asymptotically efficient, attaining the same limiting distribution as the MLE.
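To see the bias term $B_N$ at work, the following sketch (again under the hypothetical toy model from the earlier examples, where the true score at $\theta_0$ is $(y - \theta_0)/2$) approximates $E_\omega[\nabla_\theta \ln \tilde{f}] - \nabla_\theta \ln f$ for a single observation by averaging over many independent simulator replications; the bias shrinks as $R$ grows.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)

# Toy model again: the true score at theta_0 is (y - theta_0) / 2.
theta0, y = 1.0, 1.7          # a single, arbitrary observation
true_score = (y - theta0) / 2

for R in [5, 50, 500]:
    reps = 10_000             # independent replications of the simulator
    b = rng.normal(size=(reps, R))
    z = y - theta0 - b
    f_tilde = norm.pdf(z).mean(axis=1)
    grad_f_tilde = (z * norm.pdf(z)).mean(axis=1)  # since d/dtheta phi(z) = z * phi(z)
    sim_score = grad_f_tilde / f_tilde
    print(f"R={R:4d}: score bias ~ {sim_score.mean() - true_score:+.4f}")
```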


References

Hajivassiliou, V. A., and P. A. Ruud (1994). "Classical Estimation Methods for LDV Models Using Simulation." In R. F. Engle and D. L. McFadden (eds.), Handbook of Econometrics, Vol. 4. Amsterdam: Elsevier.

Newey, W. K., and D. McFadden (1994). "Large Sample Estimation and Hypothesis Testing." In R. F. Engle and D. L. McFadden (eds.), Handbook of Econometrics, Vol. 4. Amsterdam: Elsevier.