Auxiliary Particle Filter
These notes are based on the following articles:
Pitt, Michael K., and Neil Shephard (1999): “Filtering via Simulation: Auxiliary Particle Filters,” Journal of the American Statistical Association, 94, 590–599.
Pitt, Michael K., and Neil Shephard (2001): “Auxiliary Variable Based Particle Filters,” in Sequential Monte Carlo Methods in Practice, ed. by A. Doucet, N. de Freitas, and N. Gordon. New York: Springer-Verlag.
Introduction
Consider a time series $y_t$, $t = 1, \dots, T$, that is independent conditional on an unobserved state $\alpha_t$, which is assumed to be a Markov process. We wish to perform on-line filtering to learn about the unobserved state given the currently available information by estimating the density $f(\alpha_t \mid Y_t)$ for $Y_t = (y_1, \dots, y_t)$. The measurement density $f(y_t \mid \alpha_t)$ and transition density $f(\alpha_{t+1} \mid \alpha_t)$ implicitly depend on a finite vector of parameters. The initial distribution of the state is $f(\alpha_0)$.
Suppose we know the filtering distribution $f(\alpha_t \mid Y_t)$ at time $t$ and we receive a new observation $y_{t+1}$ for period $t+1$. We can obtain the updated filtering density in two steps. First, we use the transition density to obtain the prediction density $f(\alpha_{t+1} \mid Y_t)$ from $f(\alpha_t \mid Y_t)$ as
$$f(\alpha_{t+1} \mid Y_t) = \int f(\alpha_{t+1} \mid \alpha_t) \, f(\alpha_t \mid Y_t) \, d\alpha_t.$$
Then, we obtain the new filtering density by using Bayes' Theorem:
$$f(\alpha_{t+1} \mid Y_{t+1}) \propto f(y_{t+1} \mid \alpha_{t+1}) \, f(\alpha_{t+1} \mid Y_t).$$
Hence, filtering essentially involves applying the recursive relationship
$$f(\alpha_{t+1} \mid Y_{t+1}) \propto f(y_{t+1} \mid \alpha_{t+1}) \int f(\alpha_{t+1} \mid \alpha_t) \, f(\alpha_t \mid Y_t) \, d\alpha_t. \tag{eq:filtering}$$
If the support of $\alpha_t$ is known and finite, then the above integral is simply a weighted sum over the points in the support. In other cases, numerical methods might need to be used.
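To make the recursion concrete, here is a minimal sketch for the finite-support case, where the integral reduces to a weighted sum. The two-state chain, the Gaussian measurement density, and all numbers are illustrative assumptions rather than anything from the original articles.

```python
import numpy as np

# Transition matrix for a hypothetical two-state chain:
# P[i, j] = f(alpha_{t+1} = j | alpha_t = i).
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
means = np.array([-1.0, 1.0])  # state-dependent mean of the (assumed) Gaussian measurement

def measurement_density(y, mean, sigma=1.0):
    # f(y_{t+1} | alpha_{t+1}): Gaussian, chosen purely for illustration.
    return np.exp(-0.5 * ((y - mean) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def filter_step(pi_t, y_next):
    # Prediction: f(alpha_{t+1} | Y_t) as a weighted sum over the finite support.
    pred = pi_t @ P
    # Bayes update: multiply by the measurement density and normalize.
    post = measurement_density(y_next, means) * pred
    return post / post.sum()

pi = np.array([0.5, 0.5])   # initial state distribution
for y in [0.3, -1.2, 0.8]:  # made-up observations
    pi = filter_step(pi, y)
print(pi)                   # current filtering distribution
```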
Particle Filters
Particle filters are a class of simulation-based filters that recursively approximate the distribution of $\alpha_t \mid Y_t$ using a collection of particles $\alpha_t^1, \dots, \alpha_t^M$ with probability masses $\pi_1, \dots, \pi_M$. The particles are thought of as a sample from $f(\alpha_t \mid Y_t)$. In this article, the weights are taken to be equal: $\pi_j = 1/M$ for all $j$. As $M \to \infty$, we want the approximation to become better. Thus, we can approximate the true filtering density (eq:filtering) by an empirical one:
$$\widehat{f}(\alpha_{t+1} \mid Y_{t+1}) \propto f(y_{t+1} \mid \alpha_{t+1}) \sum_{j=1}^{M} f(\alpha_{t+1} \mid \alpha_t^j) \, \pi_j. \tag{eq:empirical_filtering}$$
Then, a new sample of particles can be generated from this empirical density and the procedure can continue recursively. A particle filter is said to be fully adapted if it generates independent and identically distributed samples from (eq:empirical_filtering). It is useful to think of (eq:empirical_filtering) as a posterior density which is the product of a prior, $\widehat{f}(\alpha_{t+1} \mid Y_t) = \sum_{j=1}^{M} f(\alpha_{t+1} \mid \alpha_t^j) \, \pi_j$, and a likelihood, $f(y_{t+1} \mid \alpha_{t+1})$.
Assuming that we can evaluate $f(y_{t+1} \mid \alpha_{t+1})$ up to a constant of proportionality, we can sample from (eq:empirical_filtering) by first obtaining a draw $\alpha_t^j$ with probability $\pi_j$ and then drawing from $f(\alpha_{t+1} \mid \alpha_t^j)$. The authors describe three of the possible methods for doing this. The most commonly used is the sampling/importance resampling (SIR) method of Rubin (1987). The first particle filter, independently proposed by several authors, was based on SIR. In particular, Gordon, Salmond, and Smith (1993) suggested it for non-Gaussian, nonlinear state space models and Kitagawa (1996) for time series models. The other two methods, acceptance sampling and MCMC methods, are discussed in the article but not in these notes.
Sampling/importance resampling (SIR)
Given a set of draws $\alpha_t^1, \dots, \alpha_t^M$, the SIR method first takes $R$ proposal draws $\alpha_{t+1}^1, \dots, \alpha_{t+1}^R$ from the empirical prediction density $\widehat{f}(\alpha_{t+1} \mid Y_t)$ and assigns a weight to each draw, $\pi_j \propto w_j$, where $w_j = f(y_{t+1} \mid \alpha_{t+1}^j)$ and $\sum_{j=1}^{R} \pi_j = 1$. This weighted sample converges to a sample from the empirical filtering distribution as $R \to \infty$. To generate a random sample of size $M$, a resampling step is introduced where the draws are resampled with weights $\pi_j$ to produce a uniformly weighted sample.
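As a sketch of one SIR step, consider a simple assumed model with a Gaussian AR(1) transition and a Gaussian measurement density; the parameter values and names below are illustrative, not taken from the articles.

```python
import numpy as np

rng = np.random.default_rng(0)
phi, sigma_state, sigma_obs = 0.9, 1.0, 1.0  # assumed AR(1) model parameters

def sir_step(particles, y_next, R):
    M = len(particles)
    # Propose R draws from the empirical prediction density: pick an old
    # particle uniformly (pi_j = 1/M) and push it through the transition.
    idx = rng.integers(0, M, size=R)
    proposals = phi * particles[idx] + sigma_state * rng.standard_normal(R)
    # Weight each proposal by the measurement density f(y_{t+1} | alpha_{t+1});
    # the Gaussian normalizing constant cancels after normalization.
    w = np.exp(-0.5 * ((y_next - proposals) / sigma_obs) ** 2)
    # Resample M times with the normalized weights to get an equally
    # weighted sample from the empirical filtering density.
    return rng.choice(proposals, size=M, p=w / w.sum())

particles = rng.standard_normal(1000)               # sample from f(alpha_0)
particles = sir_step(particles, y_next=0.5, R=2000)
```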
Adaption
Basically, the SIR particle filter above produces proposal draws of $\alpha_{t+1}$ without taking into account the new information, the value of $y_{t+1}$. A particle filter is said to be adapted if it makes proposal draws taking into account this new information. An adapted version of the algorithm would look something like:
Draw $\alpha_{t+1}^j \sim g(\alpha_{t+1} \mid Y_{t+1})$ for $j = 1, \dots, R$.
Evaluate the weights
$$w_j = \frac{f(y_{t+1} \mid \alpha_{t+1}^j) \sum_{k=1}^{M} f(\alpha_{t+1}^j \mid \alpha_t^k) \, \pi_k}{g(\alpha_{t+1}^j \mid Y_{t+1})}, \qquad j = 1, \dots, R.$$
Resample with weights proportional to $w_j$ to obtain a sample of size $M$.
This algorithm allows for proposals to come from a general density $g(\alpha_{t+1} \mid Y_{t+1})$ which depends on $y_{t+1}$, as opposed to the standard SIR particle filter where the proposal density does not depend on $y_{t+1}$. To understand how the importance weights above were derived, consider the importance sampler of $\widehat{f}(\alpha_{t+1} \mid Y_{t+1})$ with the importance sampling density $g(\alpha_{t+1} \mid Y_{t+1})$. We would first take draws from $g(\alpha_{t+1} \mid Y_{t+1})$ and then weight by $\widehat{f}(\alpha_{t+1} \mid Y_{t+1}) / g(\alpha_{t+1} \mid Y_{t+1})$. But from Bayes' Theorem,
$$\widehat{f}(\alpha_{t+1} \mid Y_{t+1}) \propto f(y_{t+1} \mid \alpha_{t+1}) \sum_{k=1}^{M} f(\alpha_{t+1} \mid \alpha_t^k) \, \pi_k.$$
Hence, after dividing by $g(\alpha_{t+1} \mid Y_{t+1})$, we have the importance weights shown above.
This illustrates the difficulty of adapting the standard particle filter. To obtain a single new particle we must evaluate $M + 1$ densities: the proposal $g(\alpha_{t+1}^j \mid Y_{t+1})$ as well as $f(\alpha_{t+1}^j \mid \alpha_t^k)$ for each $k = 1, \dots, M$.
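The sketch below (same illustrative Gaussian AR(1) model as before; the proposal $g$ is a hypothetical choice) shows where the cost comes from: the mixture prior in the numerator of the weight requires a pass over all $M$ old particles for every single proposal.

```python
import numpy as np

rng = np.random.default_rng(1)
phi, sigma_state, sigma_obs = 0.9, 1.0, 1.0  # assumed AR(1) model parameters

def normal_pdf(x, mean, sd):
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def adapted_weights(old_particles, proposals, g_density, y_next):
    w = np.empty(len(proposals))
    for j, a in enumerate(proposals):
        # Mixture prior (1/M) * sum_k f(a | alpha_t^k): M density evaluations.
        prior = normal_pdf(a, phi * old_particles, sigma_state).mean()
        w[j] = normal_pdf(y_next, a, sigma_obs) * prior / g_density(a)
    return w / w.sum()

# Hypothetical adapted proposal: center the draws on the new observation.
y_next = 0.5
old = rng.standard_normal(500)
proposals = y_next + rng.standard_normal(1000)
pi = adapted_weights(old, proposals, lambda a: normal_pdf(a, y_next, 1.0), y_next)
```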
Auxiliary Particle Filters
The authors extend standard particle filtering methods by including an auxiliary variable which allows the particle filter to be adapted in a more efficient way. They introduce a variable $k$, which is an index to the mixture in (eq:empirical_filtering), and filter in a higher dimension. This auxiliary variable is introduced only to aid in simulation. With this additional variable, the filtering density we wish to approximate becomes
$$\widehat{f}(\alpha_{t+1}, k \mid Y_{t+1}) \propto f(y_{t+1} \mid \alpha_{t+1}) \, f(\alpha_{t+1} \mid \alpha_t^k) \, \pi_k \tag{eq:filter_density_aux}$$
for $k = 1, \dots, M$. Now, if we can sample from $\widehat{f}(\alpha_{t+1}, k \mid Y_{t+1})$, then we can discard the sampled values of $k$ and be left with a sample from the original filtering density (eq:empirical_filtering).
To sample from (eq:filter_density_aux) using SIR, we make $R$ proposal draws $(\alpha_{t+1}^j, k^j)$ from some proposal density $g(\alpha_{t+1}, k \mid Y_{t+1})$ and calculate the weights
$$w_j = \frac{f(y_{t+1} \mid \alpha_{t+1}^j) \, f(\alpha_{t+1}^j \mid \alpha_t^{k^j}) \, \pi_{k^j}}{g(\alpha_{t+1}^j, k^j \mid Y_{t+1})}$$
for $j = 1, \dots, R$.
The choice of $g(\alpha_{t+1}, k \mid Y_{t+1})$ is left completely to the researcher. The authors propose a generic choice of $g$ which can be applied in many situations and go on to provide more examples in specific models where the structure of the model informs the choice of $g$. Here, I present only the generic $g$ in terms of the SIR algorithm. The density (eq:filter_density_aux) can be approximated by
$$g(\alpha_{t+1}, k \mid Y_{t+1}) \propto f(y_{t+1} \mid \mu_{t+1}^k) \, f(\alpha_{t+1} \mid \alpha_t^k) \, \pi_k,$$
where $\mu_{t+1}^k$ is some value with a high probability of occurrence, for example, the mean or mode of the distribution of $\alpha_{t+1} \mid \alpha_t^k$. This choice is made for convenience since
$$g(k \mid Y_{t+1}) = \int g(\alpha_{t+1}, k \mid Y_{t+1}) \, d\alpha_{t+1} \propto f(y_{t+1} \mid \mu_{t+1}^k) \, \pi_k.$$
Hence, we can draw from $g(\alpha_{t+1}, k \mid Y_{t+1})$ by first drawing values of $k$ with probabilities $g(k \mid Y_{t+1})$ and then drawing from the transition probabilities $f(\alpha_{t+1} \mid \alpha_t^k)$. The weights $g(k \mid Y_{t+1})$ are called first stage weights. Then, after sampling $R$ times from $g(\alpha_{t+1}, k \mid Y_{t+1})$, we form the weights
$$w_j = \frac{f(y_{t+1} \mid \alpha_{t+1}^j)}{f(y_{t+1} \mid \mu_{t+1}^{k^j})} \tag{eq:aux_weights}$$
for $j = 1, \dots, R$. We could also resample $M$ times from this distribution.
Auxiliary Particle Filter Algorithm
The following algorithm is based on the generic choice of $g(\alpha_{t+1}, k \mid Y_{t+1})$ from the discussion above. Other choices are possible and may be more efficient for some model specifications. A small worked sketch in code follows the steps below.
Initialize the algorithm with a uniformly weighted sample $\alpha_0^1, \dots, \alpha_0^M$ from the initial distribution $f(\alpha_0)$.
Given draws $\alpha_t^1, \dots, \alpha_t^M$ from $f(\alpha_t \mid Y_t)$, determine $\mu_{t+1}^k$ and the first stage weights $g(k \mid Y_{t+1}) \propto f(y_{t+1} \mid \mu_{t+1}^k) \, \pi_k$ for each $k = 1, \dots, M$.
For $j = 1, \dots, R$, draw $k^j$ from the indices $\{1, \dots, M\}$ with weights $g(k \mid Y_{t+1})$ and then draw $\alpha_{t+1}^j$ from the transition density $f(\alpha_{t+1} \mid \alpha_t^{k^j})$.
Form the weights according to (eq:aux_weights).
Resample $M$ times from these draws with weights proportional to $w_j$ if desired.
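To make the steps concrete, here is a minimal Python sketch of one step of this algorithm. It assumes, purely for illustration, a Gaussian AR(1) transition, so the conditional mean $\mu_{t+1}^k = \phi \alpha_t^k$ is available in closed form, and a Gaussian measurement density; none of these modeling choices come from the articles.

```python
import numpy as np

rng = np.random.default_rng(2)
phi, sigma_state, sigma_obs = 0.9, 1.0, 1.0  # assumed AR(1) model parameters

def normal_pdf(x, mean, sd):
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def apf_step(particles, y_next, R):
    M = len(particles)
    # Step 2: mu_{t+1}^k is the conditional mean of the (assumed) AR(1)
    # transition; first stage weights proportional to f(y_{t+1} | mu^k) * pi_k.
    mu = phi * particles
    first_stage = normal_pdf(y_next, mu, sigma_obs)  # pi_k = 1/M cancels
    first_stage /= first_stage.sum()
    # Step 3: draw indices k^j, then draw from the transition density.
    k = rng.choice(M, size=R, p=first_stage)
    proposals = phi * particles[k] + sigma_state * rng.standard_normal(R)
    # Step 4: second stage weights (eq:aux_weights).
    w = normal_pdf(y_next, proposals, sigma_obs) / normal_pdf(y_next, mu[k], sigma_obs)
    # Step 5: resample M times to return an equally weighted sample.
    return rng.choice(proposals, size=M, p=w / w.sum())

particles = rng.standard_normal(1000)               # sample from f(alpha_0)
particles = apf_step(particles, y_next=0.5, R=2000)
```

Because the first stage weights already direct simulation effort toward particles compatible with $y_{t+1}$, the second stage weights in this sketch stay close to one when $f(y_{t+1} \mid \mu_{t+1}^k)$ approximates the true measurement density well.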