Pesendorfer and Schmidt-Dengler (2008)

Asymptotic Least Squares Estimators for Dynamic Games

These notes are based on the following article:

Pesendorfer, Martin and Philipp Schmidt-Dengler (2008). Asymptotic Least Squares Estimators for Dynamic Games. Review of Economic Studies 75, 901–928.

Presentation by Jason Blevins, Duke University Applied Microeconomics Reading Group, June 11, 2008.

Outline

Consider the class of asymptotic least squares estimators for dynamic games.
Estimation is based on equilibrium conditions.
Discuss identification and provide sufficient conditions for exact identification.
Characterize the efficient asymptotic least squares estimator.
Several well-known estimators are members of this class.
Monte Carlo experiments.

Framework

Dynamic games in discrete time with $t = 1, \dots, ∞$ .
$N$ players, $K + 1$ actions, $L$ states per player, common discount factor $β$ .
States:
- $s_{i, t} \in S_{i} = \{1, \dots, L\}$
- $ε_{i, t} \sim F (ε ∣ s_{i, t}, s_{- i, t})$ on $ℝ^{K}$ .
- Let $S = S_{1} \times \dots \times S_{N}$ .
The payoff shocks $ε_{i, t}$ are private information, independent across players and time, and independent of the actions of other players.
Actions $a_{i, t} \in A_{i} = \{0, 1, \dots, K\}$ are chosen simultaneously. Let $A = A_{1} \times \dots \times A_{N}$ .
State transitions follow the probabilities $g (a_{t}, s_{t}, s_{t + 1})$ . Let $G$ denote the $m_{a} m_{s} \times m_{s}$ matrix of these probabilities where $m_{s} = \# S = L^{N}$ and $m_{a} = \# A = (K + 1)^{N}$ .
Period payoffs are given by $π_{i} (a_{t}, s_{t}) + \sum_{k = 1}^{K} ε_{i, t, k} 1 \{a_{i, t} = k\}$ .

Equilibrium Characterization

The continuation value net of payoff shocks under $a_{i}$ with beliefs $σ_{i}$ is $u_{i} (a_{i}; σ_{i}, θ) = \sum_{a_{- i}} σ_{i} (a_{- i} ∣ s) [π_{i} (a_{- i}, a_{i}, s) + β \sum_{s'} g (a_{- i}, a_{i}, s, s') V_{i} (s'; σ_{i})] .$ It is optimal to choose $a_{i}$ under the beliefs $σ_{i}$ if $u_{i} (a_{i}; σ_{i}, θ) + ε_{i, a_{i}} \geq u_{i} (a_{i}'; σ_{i}, θ) + ε_{i, a_{i}'} \forall a_{i}' \in A_{i} .$

Ex ante, in expectation we have $p (a_{i} ∣ s, σ_{i}) = Ψ_{i} (a_{i}, s, σ_{i}; θ) = \int 1 \{u_{i} (a_{i}; σ_{i}, θ) - u_{i} (k; σ_{i}, θ) \geq ε_{i, k} - ε_{i, a_{i}} \ \forall k \neq a_{i}\} dF .$ In matrix notation we have an $(N \cdot K \cdot m_{s}) \times 1$ system $p = Ψ (σ; θ) .$

Equilibrium Properties

In equilibrium, beliefs are consistent and we have the fixed point problem

$p = Ψ (p; θ) . \quad (1)$

Thus, finding an equilibrium is a fixed point problem on $[0, 1]^{N \cdot K \cdot m_{s}}$ .
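To make the fixed point problem concrete, here is a minimal sketch that iterates $p \leftarrow Ψ (p; θ)$ until it stops changing. Everything specific in it is an assumption made for illustration and not taken from the paper: a static special case ($β = 0$), two players, one non-default action ($K = 1$), logistic shocks, and a hypothetical payoff index `theta[0] + theta[1] * x - theta[2] * p_rival`. Simple iteration is not guaranteed to converge in general and finds at most one equilibrium.

```python
import numpy as np

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

def psi(p, theta, x):
    """Best-response map for a hypothetical static two-player game.

    p[i, s] is the probability that player i chooses action 1 in state s.
    """
    # p[::-1] swaps the two rows, so row i holds the rival's probabilities.
    return logistic(theta[0] + theta[1] * x - theta[2] * p[::-1])

def solve_fixed_point(theta, x, tol=1e-12, max_iter=1000):
    p = np.full(x.shape, 0.5)  # start from uniform beliefs
    for _ in range(max_iter):
        p_new = psi(p, theta, x)
        if np.max(np.abs(p_new - p)) < tol:
            return p_new
        p = p_new
    raise RuntimeError("fixed-point iteration did not converge")

# Illustrative parameter values and state-dependent payoff shifters (assumed).
theta = np.array([0.5, 1.0, 1.0])
x = np.array([[-1.0, 0.0, 1.0, 2.0],    # player 1
              [0.5, -0.5, 1.5, 0.0]])   # player 2

p_star = solve_fixed_point(theta, x)
# Residual of the equilibrium condition p = Psi(p; theta).
residual = np.max(np.abs(p_star - psi(p_star, theta, x)))
```

With these values the map is a contraction (the logistic slope is at most 1/4 and the interaction coefficient is 1), which is why plain iteration works here.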

Proposition: In any Markov perfect equilibrium, the probability vector $p$ satisfies (1). Conversely, any $p$ that satisfies (1) can be extended to a Markov perfect equilibrium.

Theorem: A Markov perfect equilibrium exists.

We have the same results under symmetric equilibria: existence and necessary and sufficient conditions. Symmetry reduces the number of equations in (1) and thus the computational complexity.

Identification

The model is identified if there exists a unique set of model primitives $(Π_{1}, \dots, Π_{N}, F, β, g)$ that generates any particular set of choice and state transition probabilities.

Time series data $\{a_{t}, s_{t}\}_{t = 1}^{T}$ .
Suppose the data allow us to characterize $p (a ∣ s)$ and $g (a, s, s')$ .
Fix $β$ and $F$ .
There are $m_{a} \cdot m_{s} \cdot N$ remaining unknowns in $(Π_{1}, \dots, Π_{N})$ .

Proposition: Suppose $F$ and $β$ are given. Then at most $K \cdot m_{s} \cdot N$ parameters can be identified.

There are only $K \cdot m_{s} \cdot N$ equations in the equilibrium conditions but $m_{a} \cdot m_{s} \cdot N$ parameters. We need at least $(m_{a} \cdot m_{s} - K \cdot m_{s}) \cdot N$ restrictions in order to identify all parameters.
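The counting argument can be checked with a few lines of arithmetic. The function name and the example values $N = 2$, $K = 1$, $L = 2$ are chosen here for illustration only.

```python
def identification_counts(N, K, L):
    """Count unknown payoffs, equilibrium equations, and required restrictions."""
    m_s = L ** N            # number of joint states
    m_a = (K + 1) ** N      # number of joint action profiles
    unknowns = m_a * m_s * N
    equations = K * m_s * N
    return {
        "m_s": m_s,
        "m_a": m_a,
        "unknowns": unknowns,
        "equations": equations,
        "restrictions_needed": unknowns - equations,
    }

counts = identification_counts(N=2, K=1, L=2)
# m_s = 4, m_a = 4: 32 unknown payoffs, 8 equations, 24 restrictions needed.
```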

Identification: A Linear Representation

There is some ${\bar{ε}}_{i}^{a_{i}} (s)$ that makes player $i$ indifferent between actions $a_{i}$ and $0$ : $\begin{aligned} \sum_{a_{- i} \in A_{- i}} p (a_{- i} ∣ s) [π_{i} (a_{- i}, a_{i}, s) + β \sum_{s' \in S} g (a_{- i}, a_{i}, s, s') V_{i} (s'; p)] + {\bar{ε}}_{i}^{a_{i}} (s) \\ = \sum_{a_{- i} \in A_{- i}} p (a_{- i} ∣ s) [π_{i} (a_{- i}, 0, s) + β \sum_{s' \in S} g (a_{- i}, 0, s, s') V_{i} (s'; p)] \end{aligned}$

From before, $V_{i} (σ_{i}) = [I - β σ_{i} G]^{- 1} [σ_{i} Π_{i} + D_{i} (σ_{i})]$ . Thus, we have a linear system of equations for player $i$ : $X_{i} (p, g, β) Π_{i} + Y_{i} (p, g, β) = 0$ where $X_{i}$ is a $(K \cdot m_{s}) \times (m_{a} \cdot m_{s})$ matrix and $Y_{i}$ is a $(K \cdot m_{s}) \times 1$ vector, both of which depend on the choice probabilities, transition probabilities, and $β$ .
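The value function formula is a linear solve once $σ_{i}$ is written as a matrix. The sketch below assumes a particular ordering of the $(a, s)$ pairs (index `s * m_a + a`), which is a convention chosen here and not one stated in the notes, and it treats $D_{i} (σ_{i})$ as a given input vector. The random inputs are placeholders.

```python
import numpy as np

def sigma_matrix(P):
    """Build the m_s x (m_s * m_a) matrix sigma from P[s, a] = Pr(a | s).

    Assumed ordering: column s * m_a + a corresponds to the pair (a, s).
    """
    m_s, m_a = P.shape
    sigma = np.zeros((m_s, m_s * m_a))
    for s in range(m_s):
        sigma[s, s * m_a:(s + 1) * m_a] = P[s]
    return sigma

def value_function(sigma, G, Pi, D, beta):
    """Solve [I - beta * sigma * G] V = sigma * Pi + D for V."""
    m_s = sigma.shape[0]
    return np.linalg.solve(np.eye(m_s) - beta * sigma @ G, sigma @ Pi + D)

# Placeholder inputs: 3 states, 4 joint actions.
rng = np.random.default_rng(0)
m_s, m_a, beta = 3, 4, 0.9
P = rng.dirichlet(np.ones(m_a), size=m_s)          # rows sum to one
G = rng.dirichlet(np.ones(m_s), size=m_s * m_a)    # (m_a * m_s) x m_s transitions
Pi = rng.normal(size=m_s * m_a)                    # period payoffs
D = rng.normal(size=m_s)                           # expected payoff shocks

sigma = sigma_matrix(P)
V = value_function(sigma, G, Pi, D, beta)
```

Using `np.linalg.solve` avoids forming the inverse explicitly; the result satisfies $V = σ Π + D + β σ G V$.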

Identification: Linear Restrictions

Consider player $i$ . Let $R_{i}$ be a $(m_{a} \cdot m_{s} - K \cdot m_{s}) \times (m_{a} \cdot m_{s})$ matrix of restrictions and let $r_{i}$ be a $(m_{a} \cdot m_{s} - K \cdot m_{s}) \times 1$ -dimensional vector such that $R_{i} Π_{i} = r_{i}$ .

We can now form an augmented linear system of $m_{a} \cdot m_{s}$ equations in $m_{a} \cdot m_{s}$ unknowns (hence the order condition is satisfied): $[\begin{matrix} X_{i} \\ R_{i} \end{matrix}] Π_{i} + [\begin{matrix} Y_{i} \\ - r_{i} \end{matrix}] = {\bar{X}}_{i} Π_{i} + {\bar{Y}}_{i} = 0 ,$ where the sign on $r_{i}$ follows from writing the restriction $R_{i} Π_{i} = r_{i}$ as $R_{i} Π_{i} - r_{i} = 0$ .

Proposition: Consider any player $i$ and suppose that $F$ and $β$ are given. If $rank ({\bar{X}}_{i}) = m_{a} \cdot m_{s}$ , then $Π_{i}$ is exactly identified.
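A sketch of how the rank condition might be checked and the payoffs recovered numerically. The function name and the toy dimensions (6 unknowns, 2 equilibrium equations, 4 restrictions) are hypothetical, and the inputs are simulated so that the true $Π_{i}$ is known.

```python
import numpy as np

def identify_payoffs(X, Y, R, r):
    """Solve X Pi + Y = 0 together with R Pi = r, after checking the rank condition."""
    X_bar = np.vstack([X, R])
    n_unknowns = X_bar.shape[1]
    if np.linalg.matrix_rank(X_bar) < n_unknowns:
        raise ValueError("rank condition fails: Pi_i is not identified")
    rhs = np.concatenate([-Y, r])   # X Pi = -Y and R Pi = r
    return np.linalg.solve(X_bar, rhs)

# Simulated example: restrictions pin down the last four payoff elements.
rng = np.random.default_rng(0)
Pi_true = rng.normal(size=6)
X = rng.normal(size=(2, 6))
Y = -X @ Pi_true
R = np.eye(6)[2:]
r = R @ Pi_true

Pi_hat = identify_payoffs(X, Y, R, r)
```

If the restrictions are redundant (for example, the same row appears twice in `R`), the stacked matrix loses rank and the function raises instead of returning a spurious solution.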

Example: Consider the following restrictions: $\begin{aligned} π_{i} (a_{i}, a_{- i}, s_{i}, s_{- i}) & = π_{i} (a_{i}, a_{- i}, s_{i}, s_{- i}') & \forall a \in A, (s_{i}, s_{- i}) \in S, (s_{i}, s_{- i}') \in S \\ π_{i} (0, a_{- i}, s_{i}) & = r_{i} (a_{- i}, s_{i}) & \forall a_{- i} \in A_{- i}, s_{i} \in S_{i} \end{aligned}$ The first is an exclusion restriction while the second is an exogeneity restriction (e.g., payoffs for inactive firms are known to be zero). If $L \geq K + 1$ , then these restrictions ensure identification (provided that the rank condition holds).

Asymptotic Least Squares Estimators

Let $θ = (θ_{π}, θ_{F}, β, θ_{g}) \in Θ \subset ℝ^{q}$ be the parameters of interest.

There are also $H \leq (N \cdot K \cdot m_{s}) + (m_{a} \cdot m_{s} \cdot m_{s})$ auxiliary parameters $p (θ)$ and $g (θ)$ , related to $θ$ through the $N \cdot K \cdot m_{s}$ equations

$h (p, g, θ) = p - Ψ (p, g, θ) = 0 . \quad (2)$

Asymptotic least squares estimators (Gourieroux and Monfort, 1995, Section 9.1) proceed in two steps:

Estimate the auxiliary parameters $p$ and $g$ .
Estimate the parameters of interest by weighted least squares, with (2) as the estimating equations.
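For the first step, one natural choice is a frequency estimator. The sketch below is an illustration under assumptions of my own: joint actions and joint states are coded as integers, the function estimates the joint distribution $p (a ∣ s)$ (player-level choice probabilities would follow by summing over rivals' actions), and cells that are never visited are reported as `nan` rather than filled in.

```python
import numpy as np

def frequency_estimates(actions, states, m_a, m_s):
    """Frequency estimates of p(a | s) and g(a, s, s') from one time series.

    actions[t] and states[t] are integer codes of the joint action and state.
    Returns p_hat[s, a] and g_hat[s, a, s_next].
    """
    actions = np.asarray(actions)
    states = np.asarray(states)
    n_sa = np.zeros((m_s, m_a))
    n_sas = np.zeros((m_s, m_a, m_s))
    np.add.at(n_sa, (states, actions), 1)
    # Transitions use periods t = 1, ..., T - 1 paired with the next state.
    np.add.at(n_sas, (states[:-1], actions[:-1], states[1:]), 1)

    n_s = n_sa.sum(axis=1, keepdims=True)
    p_hat = np.divide(n_sa, n_s, out=np.full_like(n_sa, np.nan), where=n_s > 0)
    n_trans = n_sas.sum(axis=2, keepdims=True)
    g_hat = np.divide(n_sas, n_trans, out=np.full_like(n_sas, np.nan), where=n_trans > 0)
    return p_hat, g_hat

# Tiny made-up series with two states and two joint actions.
states = [0, 0, 1, 1, 0]
actions = [1, 0, 1, 1, 0]
p_hat, g_hat = frequency_estimates(actions, states, m_a=2, m_s=2)
```

Reshaping `g_hat` to `(m_s * m_a, m_s)` gives a matrix with the same shape as $G$, up to the ordering of the $(a, s)$ pairs.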

Asymptotic Least Squares Estimators

Assume that consistent and asymptotically normal estimators of $p$ and $g$ are available such that as $T \to ∞$ , $\begin{matrix} ({\hat{p}}_{T}, {\hat{g}}_{T}) ⟶ (p (θ_{0}), g (θ_{0})) \quad \text{a.s.}, \\ \sqrt{T} [({\hat{p}}_{T}, {\hat{g}}_{T}) - (p (θ_{0}), g (θ_{0}))] \overset{d}{⟶} Normal (0, Σ (θ_{0})) . \end{matrix}$

The estimation principle involves choosing $θ$ in order to satisfy the constraints $h ({\hat{p}}_{T}, {\hat{g}}_{T}, θ) = {\hat{p}}_{T} - Ψ ({\hat{p}}_{T}, {\hat{g}}_{T}, θ) = 0 .$

Let $W_{T}$ be a symmetric positive-definite weight matrix of dimension $(N \cdot K \cdot m_{s}) \times (N \cdot K \cdot m_{s})$ . The asymptotic least squares estimator corresponding to $W_{T}$ is defined as ${\tilde{θ}}_{T} (W_{T}) = \arg \min_{θ} [{\hat{p}}_{T} - Ψ ({\hat{p}}_{T}, {\hat{g}}_{T}, θ)]^{⊤} W_{T} [{\hat{p}}_{T} - Ψ ({\hat{p}}_{T}, {\hat{g}}_{T}, θ)] .$
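The second step can be sketched as a small nonlinear least squares problem. The example below is illustrative only: it uses a hypothetical static two-player logit game (no $g$ and no $β$), feeds in noise-free choice probabilities in place of ${\hat{p}}_{T}$, and minimizes the quadratic form with a Gauss-Newton iteration with step halving, which is an optimizer chosen here for convenience and not one prescribed by the paper.

```python
import numpy as np

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

def psi(p, theta, x):
    """Hypothetical best-response probabilities, stacked into a vector."""
    return logistic(theta[0] + theta[1] * x - theta[2] * p[::-1]).ravel()

def psi_jacobian(p, theta, x):
    """Derivative of psi with respect to theta (rows: equations, cols: parameters)."""
    q = psi(p, theta, x)
    regressors = np.column_stack([np.ones(x.size), x.ravel(), -p[::-1].ravel()])
    return (q * (1.0 - q))[:, None] * regressors

def als_estimate(p_hat, x, W, theta0, max_iter=100, tol=1e-12):
    """Minimize [p_hat - Psi]' W [p_hat - Psi] over theta by damped Gauss-Newton."""
    def objective(theta):
        h = p_hat.ravel() - psi(p_hat, theta, x)
        return h @ W @ h

    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        h = p_hat.ravel() - psi(p_hat, theta, x)
        J = psi_jacobian(p_hat, theta, x)
        step = np.linalg.solve(J.T @ W @ J, J.T @ W @ h)
        t = 1.0
        # Halve the step until the objective no longer increases.
        while objective(theta + t * step) > objective(theta) and t > 1e-8:
            t *= 0.5
        theta = theta + t * step
        if np.max(np.abs(t * step)) < tol:
            break
    return theta

# Generate noise-free "estimates" from an equilibrium at assumed true parameters.
theta_true = np.array([0.5, 1.0, 1.0])
x = np.array([[-1.0, 0.0, 1.0, 2.0],
              [0.5, -0.5, 1.5, 0.0]])
p = np.full(x.shape, 0.5)
for _ in range(200):
    p = psi(p, theta_true, x).reshape(x.shape)
p_hat = p

W = np.eye(p_hat.size)  # identity weights
theta_hat = als_estimate(p_hat, x, W, theta0=np.zeros(3))
```

Because the inputs are exact equilibrium probabilities, the estimating equations hold exactly at `theta_true` and the minimizer should coincide with it; with sampled data the choice of `W` would matter.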

Asymptotic Least Squares Estimators: Assumptions

$Θ$ is a compact set.
$θ_{0}$ lies in the interior of $Θ$ .
As $T \to ∞$ , $W_{T} \to W_{0}$ a.s. where $W_{0}$ is a non-stochastic positive definite matrix.
If $θ$ satisfies ${[p (θ_{0}) - Ψ (p (θ_{0}), g (θ_{0}), θ)]}^{⊤} W_{0} [p (θ_{0}) - Ψ (p (θ_{0}), g (θ_{0}), θ)] = 0$ , then $θ = θ_{0}$ .
The functions $π$ , $g$ , and $F$ are twice continuously differentiable in $θ$ .
The matrix ${[\nabla_{θ} Ψ (p (θ_{0}), g (θ_{0}), θ_{0})]}^{⊤} W_{0} [\nabla_{θ} Ψ (p (θ_{0}), g (θ_{0}), θ_{0})]$ is nonsingular.

Asymptotic Least Squares Estimators: Properties

Proposition: Given the assumptions above, the asymptotic least squares estimator ${\tilde{θ}}_{T} (W_{T})$ exists, ${\tilde{θ}}_{T} (W_{T}) \overset{a . s .}{\to} θ_{0}$ , and as $T \to ∞$ , $\sqrt{T} ({\tilde{θ}}_{T} (W_{T}) - θ_{0}) \overset{d}{\to} Normal (0, Ω (θ_{0}))$ where $Ω (θ_{0}) = {(\nabla_{θ} Ψ^{⊤} W_{0} \nabla_{θ^{⊤}} Ψ)}^{- 1} \nabla_{θ} Ψ^{⊤} W_{0} [(\begin{matrix} I & 0 \end{matrix}) - \nabla_{(p, g)^{⊤}} Ψ] Σ \cdot {[(\begin{matrix} I & 0 \end{matrix}) - \nabla_{(p, g)^{⊤}} Ψ]}^{⊤} W_{0} \nabla_{θ^{⊤}} Ψ {(\nabla_{θ} Ψ^{⊤} W_{0} \nabla_{θ^{⊤}} Ψ)}^{- 1} .$ Here $0$ is the $(N \cdot K \cdot m_{s}) \times (m_{a} \cdot m_{s} \cdot m_{s})$ zero matrix and the various matrices are evaluated at $θ_{0}$ , $p (θ_{0})$ , and $g (θ_{0})$ .

Efficient Asymptotic Least Squares

Proposition: Under the maintained assumptions, the best asymptotic least squares estimators exist. They correspond to sequences of matrices $W_{T}^{*}$ converging to $W_{0}^{*} = {([(\begin{matrix} I & 0 \end{matrix}) - \nabla_{(p, g)^{⊤}} Ψ] Σ [(\begin{matrix} I & 0 \end{matrix}) - \nabla_{(p, g)^{⊤}} Ψ]^{⊤})}^{- 1} .$ Their asymptotic covariance matrices are ${(\nabla_{θ} Ψ^{⊤} {([(\begin{matrix} I & 0 \end{matrix}) - \nabla_{(p, g)^{⊤}} Ψ] Σ [(\begin{matrix} I & 0 \end{matrix}) - \nabla_{(p, g)^{⊤}} Ψ]^{⊤})}^{- 1} \nabla_{θ^{⊤}} Ψ)}^{- 1} .$

Here, $0$ denotes a $(N \cdot K \cdot m_{s}) \times (m_{a} \cdot m_{s} \cdot m_{s})$ matrix of zeros.
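The sandwich formula and the efficient weight matrix are easy to compare numerically. In the sketch below the Jacobians and $Σ$ are random placeholders and the function names are hypothetical; the point is only to show that plugging $W_{0}^{*}$ into the general covariance formula collapses it to the efficient form, and that another weight matrix (the identity here) gives a covariance that is no smaller.

```python
import numpy as np

def selection_minus_jacobian(J_pg, n_aux):
    """Form [(I 0) - d Psi / d (p, g)'] with conformable identity and zero blocks."""
    n_p = J_pg.shape[0]
    return np.hstack([np.eye(n_p), np.zeros((n_p, n_aux - n_p))]) - J_pg

def als_covariance(J_theta, J_pg, Sigma, W):
    """Asymptotic covariance of the ALS estimator for a given weight matrix W."""
    A = selection_minus_jacobian(J_pg, Sigma.shape[0])
    bread = np.linalg.inv(J_theta.T @ W @ J_theta)
    meat = J_theta.T @ W @ A @ Sigma @ A.T @ W @ J_theta
    return bread @ meat @ bread

def efficient_weight(J_pg, Sigma):
    """Efficient weight matrix: inverse of A Sigma A'."""
    A = selection_minus_jacobian(J_pg, Sigma.shape[0])
    return np.linalg.inv(A @ Sigma @ A.T)

# Placeholder dimensions: 4 equations, 3 extra auxiliary parameters, 2 parameters.
rng = np.random.default_rng(0)
J_theta = rng.normal(size=(4, 2))          # d Psi / d theta'
J_pg = 0.1 * rng.normal(size=(4, 7))       # d Psi / d (p, g)'
B = rng.normal(size=(7, 7))
Sigma = B @ B.T + np.eye(7)                # positive definite covariance of (p_hat, g_hat)

W_star = efficient_weight(J_pg, Sigma)
Omega_eff = als_covariance(J_theta, J_pg, Sigma, W_star)
Omega_id = als_covariance(J_theta, J_pg, Sigma, np.eye(4))
```

In practice the Jacobians and $Σ$ would be replaced by consistent estimates, which is where small-sample noise in the weight matrix enters.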

Asymptotic Least Squares: Moment Estimator

The moment estimator proposed by Hotz and Miller (1993) is an asymptotic least squares estimator with a particular weight matrix.

Let $T_{is}$ denote the set of observations for individual $i$ in state $s$ and let $α_{is} = (α_{1}, \dots, α_{K})$ be a vector of indicators for each choice (with zero omitted).

The moment condition is $E [Z \otimes (α_{is} - Ψ_{is} ({\hat{p}}_{T}, {\hat{g}}_{T}, θ))] = 0$ where $Z$ is a $J \times 1$ -dimensional vector of instruments.

Suppose $Z_{t} = Z_{is}$ . Then the corresponding sample analog becomes $\frac{1}{NT} \sum_{\overset{1 \leq i \leq N}{s \in S}} \sum_{t \in T_{is}} Z_{t} \otimes (α_{t} - Ψ_{is} ({\hat{p}}_{T}, {\hat{g}}_{T}, θ)) = \frac{1}{NT} \sum_{\overset{1 \leq i \leq N}{s \in S}} n_{is} [Z_{is} \otimes ({\hat{p}}_{is} - Ψ_{is} ({\hat{p}}_{T}, {\hat{g}}_{T}, θ))] .$

Thus, the moment estimator in this case is an asymptotic least squares estimator with estimating equation ${\hat{p}}_{T} - Ψ ({\hat{p}}_{T}, {\hat{g}}_{T}, θ) = 0$ .

Asymptotic Least Squares: Pseudo Maximum Likelihood

The pseudo maximum likelihood estimator of Aguirregabiria and Mira (2002, 2007) is also an asymptotic least squares estimator.

The partial pseudo log-likelihood, conditional on the estimates ${\hat{g}}_{T}$ , is $ℓ = \sum_{s \in S} \sum_{i = 1}^{N} \sum_{k \in A_{i}} n_{kis} \ln Ψ_{kis} ({\hat{p}}_{T}, {\hat{g}}_{T}, θ) .$

The first order condition is $\frac{\partial ℓ}{\partial θ} = (\nabla_{θ} Ψ^{⊤}) Σ_{p}^{- 1} (Ψ) [\hat{p} - Ψ ({\hat{p}}_{T}, {\hat{g}}_{T}, θ)]$ where $Σ_{p}^{- 1} (Ψ)$ is the inverse covariance matrix of the choice probabilities.

This is equivalent to the first order condition of the asymptotic least squares estimator with weight matrix $W_{T}^{ml} \overset{p}{\to} Σ_{p}^{- 1}$ .
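
A worked special case may help to see where the weight matrix comes from. Assume, for illustration only, binary choice ( $K = 1$ ), write $Ψ_{is}$ for the model probability of action 1, $n_{is}$ for the number of observations of player $i$ in state $s$ , and ${\hat{p}}_{is} = n_{1 is} / n_{is}$ for the frequency estimate. Then $ℓ = \sum_{s \in S} \sum_{i = 1}^{N} n_{is} [{\hat{p}}_{is} \ln Ψ_{is} + (1 - {\hat{p}}_{is}) \ln (1 - Ψ_{is})]$ and, since ${\hat{p}}_{is} / Ψ_{is} - (1 - {\hat{p}}_{is}) / (1 - Ψ_{is}) = ({\hat{p}}_{is} - Ψ_{is}) / [Ψ_{is} (1 - Ψ_{is})]$ , $\frac{\partial ℓ}{\partial θ} = \sum_{s \in S} \sum_{i = 1}^{N} \nabla_{θ} Ψ_{is} \frac{n_{is}}{Ψ_{is} (1 - Ψ_{is})} ({\hat{p}}_{is} - Ψ_{is}) .$ Each estimating equation is weighted by the inverse of the binomial variance $Ψ_{is} (1 - Ψ_{is}) / n_{is}$ of ${\hat{p}}_{is}$ , which is the diagonal form that $Σ_{p}^{- 1} (Ψ)$ takes in this case, up to how the sample size is normalized.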

Monte Carlo Study

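Compare four estimators: efficient asymptotic least squares (LS-E), pseudo maximum likelihood (PML), asymptotic least squares with an identity weight matrix (LS-I), and $k$-step iterated pseudo maximum likelihood (k-PML).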
A simple two-player, two-action, two-state game with five equilibria.
Three equilibria are used for experiments with various sample sizes.
LS-E estimator performs best overall (in eight of 12 experiments).
LS-E performs poorly with the smallest sample size ( $T = 100$ ).
PML ranks second (by MSE) in seven of 12 specifications.
PML performs better than LS-E for $T = 100$ and worse for larger sample sizes. This may be because the covariance matrix of $({\hat{p}}_{T}, {\hat{g}}_{T})$ is estimated better than the efficient weight matrix for small $T$ .
PML may be less computationally burdensome for large state spaces.