Iterated filtering


Iterated filtering algorithms are a tool for maximum likelihood inference on partially observed dynamical systems. Stochastic perturbations to the unknown parameters are used to explore the parameter space. Applying sequential Monte Carlo (the particle filter) to this extended model selects the parameter values that are more consistent with the data. Appropriately constructed procedures, iterating with successively diminished perturbations, converge to the maximum likelihood estimate.[1][2][3]

Iterated filtering methods have so far been used most extensively to study infectious disease transmission dynamics. Case studies include cholera,[4][5] Ebola virus,[6] influenza,[7][8][9][10] malaria,[11][12][13] HIV,[14] pertussis,[15][16] poliovirus[17] and measles.[5][18] Other proposed application areas include ecological dynamics[19][20] and finance.[21][22]

The parameter perturbations play several different roles. Firstly, they smooth out the likelihood surface, enabling the algorithm to overcome small-scale features of the likelihood during early stages of the global search. Secondly, Monte Carlo variation allows the search to escape from local maxima of the likelihood. Thirdly, the iterated filtering update uses the perturbed parameter values to construct an approximation to the derivative of the log likelihood, even though this quantity is not typically available in closed form. Fourthly, the parameter perturbations help to overcome numerical difficulties that can arise during sequential Monte Carlo.

Overview

The data are a time series [math]\displaystyle{ y_1,\dots,y_N }[/math] collected at times [math]\displaystyle{ t_1 \lt t_2 \lt \dots \lt t_N }[/math]. The dynamic system is modeled by a Markov process [math]\displaystyle{ X(t) }[/math] which is generated by a function [math]\displaystyle{ f(x,s,t,\theta,W) }[/math] in the sense that

[math]\displaystyle{ X(t^{}_n)=f(X(t^{}_{n-1}),t^{}_{n-1},t^{}_n,\theta,W) }[/math]

where [math]\displaystyle{ \theta }[/math] is a vector of unknown parameters and [math]\displaystyle{ W }[/math] is some random quantity that is drawn independently each time [math]\displaystyle{ f(\cdot) }[/math] is evaluated. An initial condition [math]\displaystyle{ X(t_0) }[/math] at some time [math]\displaystyle{ t_0\lt t_1 }[/math] is specified by an initialization function, [math]\displaystyle{ X(t_0)=h(\theta) }[/math]. A measurement density [math]\displaystyle{ g(y_n|X_n,t_n,\theta) }[/math], where [math]\displaystyle{ X_n=X(t_n) }[/math], completes the specification of a partially observed Markov process. Two algorithms are presented here: a basic iterated filtering algorithm (IF1)[1][2] and an iterated filtering algorithm implementing an iterated, perturbed Bayes map (IF2).[3][23]
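As an illustration of the three ingredients h, f and g, the following Python sketch specifies a toy partially observed Markov process and simulates from it. The model (a random walk with drift, observed with Gaussian noise), the parameter layout (mu, sigma, tau) and the helper name simulate are assumptions made for this example only; they do not come from the cited references.

```python
import numpy as np

def h(theta):
    """Initialization function X(t_0) = h(theta); the toy model starts at 0."""
    return 0.0

def f(x, s, t, theta, w):
    """Advance the state from time s to time t using a fresh random draw w."""
    mu, sigma = theta[0], theta[1]
    return x + mu * (t - s) + sigma * np.sqrt(t - s) * w

def g(y, x, t, theta):
    """Measurement density g(y | x, t, theta): Normal with mean x and s.d. tau."""
    tau = theta[2]
    return np.exp(-0.5 * ((y - x) / tau) ** 2) / (tau * np.sqrt(2.0 * np.pi))

def simulate(theta, times, t0, rng):
    """Simulate latent states X(t_n) and observations y_n at the given times."""
    x, s = h(theta), t0
    states, obs = [], []
    for t in times:
        x = f(x, s, t, theta, rng.standard_normal())      # W drawn independently
        states.append(x)
        obs.append(x + theta[2] * rng.standard_normal())  # y_n drawn from g
        s = t
    return np.array(states), np.array(obs)

rng = np.random.default_rng(0)
theta = np.array([0.5, 1.0, 1.0])   # (mu, sigma, tau), illustrative values
times = np.arange(1.0, 11.0)        # t_1 < ... < t_N with N = 10
states, y = simulate(theta, times, t0=0.0, rng=rng)
```

Only the observations y would be available to an inference algorithm; the latent states are returned here for inspection.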

Procedure: Iterated filtering (IF1)

Input: A partially observed Markov model specified as above; Monte Carlo sample size [math]\displaystyle{ J }[/math]; number of iterations [math]\displaystyle{ M }[/math]; cooling parameters [math]\displaystyle{ 0\lt a\lt 1 }[/math] and [math]\displaystyle{ b }[/math]; covariance matrix [math]\displaystyle{ \Phi }[/math]; initial parameter vector [math]\displaystyle{ \theta^{(1)} }[/math]
for [math]\displaystyle{ m^{}_{}=1 }[/math] to [math]\displaystyle{ M^{}_{} }[/math]
    draw [math]\displaystyle{ \Theta_F(t^{}_0,j)\sim \mathrm{Normal}(\theta^{(m)},b a^{m-1} \Phi) }[/math] for [math]\displaystyle{ j=1,\dots, J }[/math]
    set [math]\displaystyle{ X_F(t^{}_0,j)=h\big(\Theta_F(t^{}_0,j)\big) }[/math] for [math]\displaystyle{ j=1,\dots, J }[/math]
    set [math]\displaystyle{ \bar\theta(t^{}_0)=\theta^{(m)} }[/math]
    for [math]\displaystyle{ n^{}_{}=1 }[/math] to [math]\displaystyle{ N^{}_{} }[/math]
        draw [math]\displaystyle{ \Theta_P(t^{}_n,j)\sim \mathrm{Normal}(\Theta_F(t^{}_{n-1},j), a^{m-1} \Phi) }[/math] for [math]\displaystyle{ j=1,\dots, J }[/math]
        set [math]\displaystyle{ X_P(t^{}_n,j)=f(X_F(t^{}_{n-1},j),t^{}_{n-1},t_n,\Theta_P(t_{n},j),W) }[/math] for [math]\displaystyle{ j=1,\dots, J }[/math]
        set [math]\displaystyle{ w(n,j) = g(y_n|X_P(t^{}_n,j),t^{}_n,\Theta_P(t_{n},j)) }[/math] for [math]\displaystyle{ j=1,\dots, J }[/math]
        draw [math]\displaystyle{ k^{}_1,\dots,k^{}_J }[/math] such that [math]\displaystyle{ P(k^{}_j=i)=w(n,i)\big/{\sum}_\ell w(n,\ell) }[/math]
        set [math]\displaystyle{ X_F(t^{}_n,j)=X_P(t^{}_n,k^{}_j) }[/math] and [math]\displaystyle{ \Theta_F(t^{}_n,j)=\Theta_P(t^{}_n,k^{}_j) }[/math] for [math]\displaystyle{ j=1,\dots, J }[/math]
        set [math]\displaystyle{ \bar\theta_i^{}(t_n^{}) }[/math] to the sample mean of [math]\displaystyle{ \{\Theta_{F,i}^{}(t^{}_{n},j),j=1,\dots,J\} }[/math], where the vector [math]\displaystyle{ \Theta^{}_F }[/math] has components [math]\displaystyle{ \{\Theta^{}_{F,i}\} }[/math]
        set [math]\displaystyle{ V_i^{}(t_n^{}) }[/math] to the sample variance of [math]\displaystyle{ \{\Theta_{P,i}^{}(t^{}_{n},j),j=1,\dots,J\} }[/math]
    set [math]\displaystyle{ \theta_i^{(m+1)}= \theta_i^{(m)}+V_i(t_{1})\sum_{n=1}^N V_i^{-1}(t_{n})(\bar\theta_i(t_n)-\bar\theta_i(t_{n-1})) }[/math]
Output: Maximum likelihood estimate [math]\displaystyle{ \hat\theta=\theta^{(M+1)} }[/math]
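The IF1 procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the toy model (a random walk with unknown drift, unit process noise and unit measurement noise), the function names h, f, log_g and if1, and the tuning values are all hypothetical. The covariance matrix is assumed diagonal and passed as phi_diag, weights are handled on the log scale for numerical stability, and resampling is plain multinomial resampling.

```python
import numpy as np

def h(theta):
    # Initialization function X(t_0) = h(theta); the toy model starts at 0.
    return np.zeros(theta.shape[0])

def f(x, s, t, theta, w):
    # Toy process: random walk with drift theta[:, 0] and unit noise scale.
    return x + theta[:, 0] * (t - s) + np.sqrt(t - s) * w

def log_g(y, x, t, theta):
    # Log measurement density for y ~ Normal(x, 1).
    return -0.5 * (y - x) ** 2 - 0.5 * np.log(2.0 * np.pi)

def if1(y, times, t0, theta1, phi_diag, a, b, J, M, rng):
    theta = np.asarray(theta1, dtype=float).copy()
    phi_diag = np.asarray(phi_diag, dtype=float)
    for m in range(1, M + 1):
        sd = np.sqrt(a ** (m - 1) * phi_diag)          # cooled perturbation s.d.
        # Initial draw has variance b * a^(m-1) * Phi around theta^(m).
        theta_F = theta + np.sqrt(b) * sd * rng.standard_normal((J, theta.size))
        x_F = h(theta_F)
        theta_bar_prev = theta.copy()                  # theta_bar(t_0) = theta^(m)
        step = np.zeros_like(theta)
        V1, s = None, t0
        for n in range(len(y)):
            theta_P = theta_F + sd * rng.standard_normal(theta_F.shape)
            x_P = f(x_F, s, times[n], theta_P, rng.standard_normal(J))
            logw = log_g(y[n], x_P, times[n], theta_P)
            w = np.exp(logw - logw.max())              # stabilised weights
            k = rng.choice(J, size=J, p=w / w.sum())   # multinomial resampling
            x_F, theta_F = x_P[k], theta_P[k]
            theta_bar = theta_F.mean(axis=0)           # filtering mean
            V = theta_P.var(axis=0, ddof=1)            # prediction variance
            if n == 0:
                V1 = V
            step += (theta_bar - theta_bar_prev) / V
            theta_bar_prev, s = theta_bar, times[n]
        theta = theta + V1 * step                      # IF1 update
    return theta

rng = np.random.default_rng(1)
times = np.arange(1.0, 51.0)
x_true = np.cumsum(0.5 + rng.standard_normal(50))      # true drift is 0.5
y = x_true + rng.standard_normal(50)
theta_hat = if1(y, times, 0.0, [0.0], [0.04], a=0.9, b=4.0, J=500, M=20, rng=rng)
```

The result is a Monte Carlo estimate, so repeated runs with different seeds give slightly different values of theta_hat.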

Variations

  1. For IF1, parameters which enter the model only in the specification of the initial condition, [math]\displaystyle{ X(t_0) }[/math], warrant some special algorithmic attention since information about them in the data may be concentrated in a small part of the time series.[1]
  2. Theoretically, any distribution with the requisite mean and variance could be used in place of the normal distribution. It is standard to use the normal distribution and to reparameterise to remove constraints on the possible values of the parameters.
  3. Modifications to the IF1 algorithm have been proposed to give superior asymptotic performance.[24][25]
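The reparameterisation mentioned in the second variation can be illustrated with a short Python sketch. It assumes a hypothetical model with a positive rate and a probability between 0 and 1; the log and logit transformations and the function names are choices made for this example. Normal perturbations are applied on the unconstrained scale, so the back-transformed values always respect the constraints.

```python
import numpy as np

def to_unconstrained(rate, prob):
    """Map rate > 0 and 0 < prob < 1 to the whole real line (log and logit)."""
    return np.array([np.log(rate), np.log(prob / (1.0 - prob))])

def to_constrained(z):
    """Inverse map: exp for the rate, logistic function for the probability."""
    return np.exp(z[0]), 1.0 / (1.0 + np.exp(-z[1]))

rng = np.random.default_rng(0)
z = to_unconstrained(rate=2.0, prob=0.3)
# Normal perturbations on the unconstrained scale, one row per particle.
z_perturbed = z + 0.5 * rng.standard_normal((1000, 2))
rates, probs = to_constrained(z_perturbed.T)
```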

Procedure: Iterated filtering (IF2)

Input: A partially observed Markov model specified as above; Monte Carlo sample size [math]\displaystyle{ J }[/math]; number of iterations [math]\displaystyle{ M }[/math]; cooling parameter [math]\displaystyle{ 0\lt a\lt 1 }[/math]; covariance matrix [math]\displaystyle{ \Phi }[/math]; initial parameter vectors [math]\displaystyle{ \{\Theta_j, j=1,\dots,J\} }[/math]
for [math]\displaystyle{ m^{}_{}=1 }[/math] to [math]\displaystyle{ M^{}_{} }[/math]
    draw [math]\displaystyle{ \Theta_F(t^{}_0,j) \sim \mathrm{Normal}(\Theta_j, a^{m-1} \Phi) }[/math] for [math]\displaystyle{ j=1,\dots, J }[/math]
    set [math]\displaystyle{ X_F(t^{}_0,j)=h\big(\Theta_F(t^{}_0,j)\big) }[/math] for [math]\displaystyle{ j=1,\dots, J }[/math]
    for [math]\displaystyle{ n^{}_{}=1 }[/math] to [math]\displaystyle{ N^{}_{} }[/math]
        draw [math]\displaystyle{ \Theta_P(t^{}_n,j)\sim \mathrm{Normal}(\Theta_F(t^{}_{n-1},j), a^{m-1} \Phi) }[/math] for [math]\displaystyle{ j=1,\dots, J }[/math]
        set [math]\displaystyle{ X_P(t^{}_n,j)=f(X_F(t^{}_{n-1},j),t^{}_{n-1},t_n,\Theta_P(t_{n},j),W) }[/math] for [math]\displaystyle{ j=1,\dots, J }[/math]
        set [math]\displaystyle{ w(n,j) = g(y_n|X_P(t^{}_n,j),t^{}_n,\Theta_P(t_{n},j)) }[/math] for [math]\displaystyle{ j=1,\dots, J }[/math]
        draw [math]\displaystyle{ k^{}_1,\dots,k^{}_J }[/math] such that [math]\displaystyle{ P(k^{}_j=i)=w(n,i)\big/{\sum}_\ell w(n,\ell) }[/math]
        set [math]\displaystyle{ X_F(t^{}_n,j)=X_P(t^{}_n,k^{}_j) }[/math] and [math]\displaystyle{ \Theta_F(t^{}_n,j)=\Theta_P(t^{}_n,k^{}_j) }[/math] for [math]\displaystyle{ j=1,\dots, J }[/math]
    set [math]\displaystyle{ \Theta_j=\Theta_F(t^{}_N,j) }[/math] for [math]\displaystyle{ j=1,\dots, J }[/math]
Output: Parameter vectors approximating the maximum likelihood estimate, [math]\displaystyle{ \{\Theta_j, j=1,\dots, J \} }[/math]
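A minimal Python sketch of the IF2 procedure is given below, assuming the same kind of toy model as might be used to test any particle method: a random walk with unknown drift, unit process noise and unit measurement noise. The function names h, f, log_g and if2, the diagonal covariance phi_diag, the log-scale weights and the tuning values are assumptions made for this illustration and are not taken from the cited references.

```python
import numpy as np

def h(theta):
    # Initialization function X(t_0) = h(theta); the toy model starts at 0.
    return np.zeros(theta.shape[0])

def f(x, s, t, theta, w):
    # Toy process: random walk with drift theta[:, 0] and unit noise scale.
    return x + theta[:, 0] * (t - s) + np.sqrt(t - s) * w

def log_g(y, x, t, theta):
    # Log measurement density for y ~ Normal(x, 1).
    return -0.5 * (y - x) ** 2 - 0.5 * np.log(2.0 * np.pi)

def if2(y, times, t0, swarm, phi_diag, a, M, rng):
    theta = np.asarray(swarm, dtype=float).copy()      # shape (J, p)
    phi_diag = np.asarray(phi_diag, dtype=float)
    J = theta.shape[0]
    for m in range(1, M + 1):
        sd = np.sqrt(a ** (m - 1) * phi_diag)          # cooled perturbation s.d.
        theta_F = theta + sd * rng.standard_normal(theta.shape)
        x_F = h(theta_F)
        s = t0
        for n in range(len(y)):
            # Perturb parameters, propagate states, weight by the data.
            theta_P = theta_F + sd * rng.standard_normal(theta.shape)
            x_P = f(x_F, s, times[n], theta_P, rng.standard_normal(J))
            logw = log_g(y[n], x_P, times[n], theta_P)
            w = np.exp(logw - logw.max())              # stabilised weights
            k = rng.choice(J, size=J, p=w / w.sum())   # multinomial resampling
            x_F, theta_F = x_P[k], theta_P[k]
            s = times[n]
        theta = theta_F                                # swarm for next iteration
    return theta

rng = np.random.default_rng(2)
times = np.arange(1.0, 51.0)
x_true = np.cumsum(0.5 + rng.standard_normal(50))      # true drift is 0.5
y = x_true + rng.standard_normal(50)
swarm0 = rng.uniform(-5.0, 5.0, size=(500, 1))         # dispersed starting swarm
swarm = if2(y, times, 0.0, swarm0, [0.04], a=0.9, M=30, rng=rng)
estimate = swarm.mean(axis=0)                          # one possible point summary
```

Unlike IF1, no derivative approximation is formed: the parameter swarm itself is carried from one iteration to the next, and taking its mean at the end is only one way to summarise it.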

Software

"pomp: statistical inference for partially-observed Markov processes" : R package.

References

  1. Ionides, E. L.; Breto, C.; King, A. A. (2006). "Inference for nonlinear dynamical systems". Proceedings of the National Academy of Sciences of the USA 103 (49): 18438–18443. doi:10.1073/pnas.0603181103. PMID 17121996. Bibcode: 2006PNAS..10318438I.
  2. Ionides, E. L.; Bhadra, A.; Atchadé, Y.; King, A. A. (2011). "Iterated filtering". Annals of Statistics 39 (3): 1776–1802. doi:10.1214/11-AOS886.
  3. Ionides, E. L.; Nguyen, D.; Atchadé, Y.; Stoev, S.; King, A. A. (2015). "Inference for dynamic and latent variable models via iterated, perturbed Bayes maps". Proceedings of the National Academy of Sciences of the USA 112 (3): 719–724. doi:10.1073/pnas.1410597112. PMID 25568084. Bibcode: 2015PNAS..112..719I.
  4. King, A. A.; Ionides, E. L.; Pascual, M.; Bouma, M. J. (2008). "Inapparent infections and cholera dynamics". Nature 454 (7206): 877–880. doi:10.1038/nature07084. PMID 18704085. Bibcode: 2008Natur.454..877K. https://deepblue.lib.umich.edu/bitstream/2027.42/62519/1/nature07084.pdf.
  5. Breto, C.; He, D.; Ionides, E. L.; King, A. A. (2009). "Time series analysis via mechanistic models". Annals of Applied Statistics 3: 319–348. doi:10.1214/08-AOAS201.
  6. "Avoidable errors in the modelling of outbreaks of emerging pathogens, with special reference to Ebola". Proceedings of the Royal Society B 282 (1806): 20150347. 2015. doi:10.1098/rspb.2015.0347. PMID 25833863. 
  7. He, D.; Dushoff, J.; Day, T.; Ma, J.; Earn, D. (2011). "Mechanistic modelling of the three waves of the 1918 influenza pandemic". Theoretical Ecology 4 (2): 1–6. doi:10.1007/s12080-011-0123-3.
  8. Camacho, A.; Ballesteros, S.; Graham, A. L.; Carrat, R.; Ratmann, O.; Cazelles, B. (2011). "Explaining rapid reinfections in multiple-wave influenza outbreaks: Tristan da Cunha 1971 epidemic as a case study". Proceedings of the Royal Society B 278 (1725): 3635–3643. doi:10.1098/rspb.2011.0300. PMID 21525058.
  9. Earn, D.; He, D.; Loeb, M. B.; Fonseca, K.; Lee, B. E.; Dushoff, J. (2012). "Effects of School Closure on Incidence of Pandemic Influenza in Alberta, Canada". Annals of Internal Medicine 156 (3): 173–181. doi:10.7326/0003-4819-156-3-201202070-00005. PMID 22312137. 
  10. Shrestha, S.; Foxman, B.; Weinberger, D. M.; Steiner, C.; Viboud, C.; Rohani, P. (2013). "Identifying the interaction between influenza and pneumococcal pneumonia using incidence data". Science Translational Medicine 5 (191): 191ra84. doi:10.1126/scitranslmed.3005982. PMID 23803706. 
  11. Laneri, K.; Bhadra, A.; Ionides, E. L.; Bouma, M.; Dhiman, R. C.; Yadav, R. S.; Pascual, M. (2010). "Forcing versus feedback: Epidemic malaria and monsoon rains in NW India". PLOS Computational Biology 6 (9): e1000898. doi:10.1371/journal.pcbi.1000898. PMID 20824122. Bibcode: 2010PLSCB...6E0898L.
  12. Bhadra, A.; Ionides, E. L.; Laneri, K.; Bouma, M.; Dhiman, R. C.; Pascual, M. (2011). "Malaria in Northwest India: Data analysis via partially observed stochastic differential equation models driven by Lévy noise". Journal of the American Statistical Association 106 (494): 440–451. doi:10.1198/jasa.2011.ap10323.
  13. Roy, M.; Bouma, M. J.; Ionides, E. L.; Dhiman, R. C.; Pascual, M. (2013). "The potential elimination of Plasmodium vivax malaria by relapse treatment: Insights from a transmission model and surveillance data from NW India". PLOS Neglected Tropical Diseases 7 (1): e1979. doi:10.1371/journal.pntd.0001979. PMID 23326611. 
  14. Zhou, J.; Han, L.; Liu, S. (2013). "Nonlinear mixed-effects state space models with applications to HIV dynamics". Statistics and Probability Letters 83 (5): 1448–1456. doi:10.1016/j.spl.2013.01.032. 
  15. Lavine, J.; Rohani, P. (2012). "Resolving pertussis immunity and vaccine effectiveness using incidence time series". Expert Review of Vaccines 11 (11): 1319–1329. doi:10.1586/ERV.12.109. PMID 23249232. 
  16. Blackwood, J. C.; Cummings, D. A. T.; Broutin, H.; Iamsirithaworn, S.; Rohani, P. (2013). "Deciphering the impacts of vaccination and immunity on pertussis epidemiology in Thailand". Proceedings of the National Academy of Sciences of the USA 110 (23): 9595–9600. doi:10.1073/pnas.1220908110. PMID 23690587. Bibcode: 2013PNAS..110.9595B.
  17. Blake, I. M.; Martin, R.; Goel, A.; Khetsuriani, N.; Everts, J.; Wolff, C.; Wassilak, S.; Aylward, R. B. et al. (2014). "The role of older children and adults in wild poliovirus transmission". Proceedings of the National Academy of Sciences of the USA 111 (29): 10604–10609. doi:10.1073/pnas.1323688111. PMID 25002465. Bibcode: 2014PNAS..11110604B.
  18. He, D.; Ionides, E. L.; King, A. A. (2010). "Plug-and-play inference for disease dynamics: measles in large and small towns as a case study". Journal of the Royal Society Interface 7 (43): 271–283. doi:10.1098/rsif.2009.0151. PMID 19535416. 
  19. Ionides, E. L. (2011). "Discussion on "Feature Matching in Time Series Modeling" by Y. Xia and H. Tong". Statistical Science 26: 49–52. doi:10.1214/11-STS345C.
  20. Blackwood, J. C.; Streicker, D. G.; Altizer, S.; Rohani, P. (2013). "Resolving the roles of immunity, pathogenesis, and immigration for rabies persistence in vampire bat". Proceedings of the National Academy of Sciences of the USA 110 (51): 20837–20842. doi:10.1073/pnas.1308817110. PMID 24297874. Bibcode: 2013PNAS..11020837B.
  21. Bhadra, A. (2010). "Discussion of "Particle Markov chain Monte Carlo methods" by C. Andrieu, A. Doucet and R. Holenstein". Journal of the Royal Statistical Society, Series B 72 (3): 314–315. doi:10.1111/j.1467-9868.2009.00736.x. 
  22. Breto, C. (2014). "On idiosyncratic stochasticity of financial leverage effects". Statistics and Probability Letters 91: 20–26. doi:10.1016/j.spl.2014.04.003. 
  23. Lindstrom, E.; Ionides, E. L.; Frydendall, J.; Madsen, H. (2012). "Efficient Iterated Filtering". System Identification 45 (16): 1785–1790. doi:10.3182/20120711-3-BE-2027.00300. 
  24. Lindstrom, E. (2013). "Tuned iterated filtering". Statistics and Probability Letters 83 (9): 2077–2080. doi:10.1016/j.spl.2013.05.019. 
  25. Doucet, A.; Jacob, P. E.; Rubenthaler, S. (2013). "Derivative-Free Estimation of the Score Vector and Observed Information Matrix with Application to State-Space Models". arXiv:1304.5768 [stat.ME].