Software:PyMC

PyMC
Original author(s): PyMC Development Team
Initial release: May 4, 2013
Repository: https://github.com/pymc-devs/pymc
Written in: Python
Operating system: Unix-like, Mac OS X, Microsoft Windows
Platform: Intel x86 – 32-bit, x64
Type: Statistical package
License: Apache License, Version 2.0
Website: www.pymc.io

PyMC (formerly known as PyMC3) is a probabilistic programming language written in Python. It can be used for Bayesian statistical modeling and probabilistic machine learning.

PyMC performs inference using advanced Markov chain Monte Carlo algorithms, variational fitting algorithms, or both.[1][2][3][4][5] It is a from-scratch rewrite of PyMC2, the previous version of the software.[6] Unlike PyMC2, which used Fortran extensions to perform computations, PyMC relies on PyTensor, a Python library for defining, optimizing, and efficiently evaluating mathematical expressions involving multi-dimensional arrays. Since version 3.8, PyMC has relied on ArviZ for plotting, diagnostics, and statistical checks. PyMC and Stan are the two most popular probabilistic programming tools.[7] PyMC is an open-source project, developed by the community and fiscally sponsored by NumFOCUS.[8]

PyMC has been used to solve inference problems in several scientific domains, including astronomy,[9][10] epidemiology,[11][12] molecular biology,[13] crystallography,[14][15] chemistry,[16] ecology[17][18] and psychology.[19] Previous versions of PyMC were also used widely, for example in climate science,[20] public health,[21] neuroscience,[22] and parasitology.[23][24]

After the Theano developers announced plans to discontinue development in 2017,[25] the PyMC team evaluated TensorFlow Probability as a computational backend,[26] but decided in 2020 to fork Theano under the name Aesara.[27] Large parts of the Theano codebase have been refactored, and compilation through JAX[28] and Numba was added. The PyMC team has released the revised computational backend under the name PyTensor and continues the development of PyMC.[29]

Inference engines

PyMC implements non-gradient-based and gradient-based Markov chain Monte Carlo (MCMC) algorithms for Bayesian inference and stochastic, gradient-based variational Bayesian methods for approximate Bayesian inference.

  • MCMC-based algorithms:
    • No-U-Turn sampler[30] (NUTS), a variant of Hamiltonian Monte Carlo and PyMC's default engine for continuous variables
    • Metropolis–Hastings, PyMC's default engine for discrete variables
    • Sequential Monte Carlo for static posteriors
    • Sequential Monte Carlo for approximate Bayesian computation
  • Variational inference algorithms:
    • Black-box Variational Inference[31]
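The idea behind the Metropolis family of samplers listed above can be illustrated with a minimal sketch in plain Python. This is not PyMC's implementation and does not use the PyMC API; it is a random-walk Metropolis sampler (the special case of Metropolis–Hastings with a symmetric proposal) for the mean of normally distributed data. The function names, the Normal(0, 10) prior, the known noise standard deviation, the step size, and the data values are all assumptions chosen for illustration.

```python
import math
import random


def log_posterior(mu, data, prior_sd=10.0, noise_sd=1.0):
    """Unnormalized log posterior for the mean of Normal data.

    Assumes a Normal(0, prior_sd) prior on mu and a known noise_sd.
    """
    log_prior = -0.5 * (mu / prior_sd) ** 2
    log_likelihood = sum(-0.5 * ((x - mu) / noise_sd) ** 2 for x in data)
    return log_prior + log_likelihood


def metropolis(data, n_samples=5000, step=0.5, start=0.0, seed=0):
    """Random-walk Metropolis sampler; returns (samples, acceptance_rate)."""
    rng = random.Random(seed)
    current = start
    current_lp = log_posterior(current, data)
    samples = []
    accepted = 0
    for _ in range(n_samples):
        # Symmetric Gaussian proposal, so the Hastings correction cancels.
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal, data)
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        log_u = math.log(1.0 - rng.random())
        if log_u < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
            accepted += 1
        # A rejected proposal repeats the current state in the chain.
        samples.append(current)
    return samples, accepted / n_samples


data = [1.8, 2.1, 2.4, 1.9, 2.3]  # hypothetical observations
samples, acceptance_rate = metropolis(data)
kept = samples[1000:]  # discard warm-up draws
posterior_mean = sum(kept) / len(kept)
print(f"posterior mean ~ {posterior_mean:.2f}, acceptance rate {acceptance_rate:.2f}")
```

For this conjugate model the exact posterior mean is 10.5 / 5.01, roughly 2.10, so the sampled estimate can be checked against it. Gradient-based samplers such as NUTS replace the random-walk proposal with trajectories guided by the gradient of the log posterior, which typically explores continuous parameter spaces more efficiently.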

See also

  • Stan, a probabilistic programming language for statistical inference written in C++
  • ArviZ, a Python library for exploratory analysis of Bayesian models

References

  1. Abril-Pla O, Andreani V, Carroll C, Dong L, Fonnesbeck CJ, Kochurov M, Kumar R, Lao J, Luhmann CC, Martin OA, Osthege M, Vieira R, Wiecki T, Zinkov R. (2023) PyMC: a modern, and comprehensive probabilistic programming framework in Python. PeerJ Comput. Sci. 9:e1516 doi:10.7717/peerj-cs.1516
  2. Salvatier J, Wiecki TV, Fonnesbeck C. (2016) Probabilistic programming in Python using PyMC3. PeerJ Computer Science 2:e55 doi:10.7717/peerj-cs.55
  3. Martin, Osvaldo (2016). Bayesian Analysis with Python. Packt Publishing Ltd. pp. 31–60. ISBN 9781785889851. https://books.google.com/books?id=t6PcDgAAQBAJ&q=%22PyMC3%22. Retrieved 16 September 2017.
  4. Martin, Osvaldo; Kumar, Ravin; Lao, Junpeng (2021). Bayesian Modeling and Computation in Python. CRC-press. pp. 1–420. ISBN 9780367894368. https://www.routledge.com/Bayesian-Modeling-and-Computation-in-Python/Martin-Kumar-Lao/p/book/9780367894368. Retrieved 7 July 2022.
  5. Davidson-Pilon, Cameron (2015-09-30). Bayesian Methods for Hackers: Probabilistic Programming and Bayesian Inference. Addison-Wesley Professional. ISBN 9780133902921. https://books.google.com/books?id=rMKiCgAAQBAJ.
  6. "documentation". http://docs.pymc.io.
  7. "The Algorithms Behind Probabilistic Programming". http://blog.fastforwardlabs.com/2017/01/30/the-algorithms-behind-probabilistic-programming.html. 
  8. "NumFOCUS Announces New Fiscally Sponsored Project: PyMC3". NumFOCUS | Open Code = Better Science. http://www.numfocus.org/blog/numfocus-announces-new-fiscally-sponsored-project-pymc3. 
  9. Greiner, J.; Burgess, J. M.; Savchenko, V.; Yu, H.-F. (2016). "On the Fermi-GBM Event 0.4 s after GW150914". The Astrophysical Journal Letters 827 (2): L38. doi:10.3847/2041-8205/827/2/L38. ISSN 2041-8205. Bibcode: 2016ApJ...827L..38G. http://stacks.iop.org/2041-8205/827/i=2/a=L38.
  10. Hilbe, Joseph M.; Souza, Rafael S. de; Ishida, Emille E. O. (2017-04-30). Bayesian Models for Astrophysical Data: Using R, JAGS, Python, and Stan. Cambridge University Press. ISBN 9781108210744. https://books.google.com/books?id=7D2wDgAAQBAJ&q=PyMC3&pg=PA161.
  11. Brauner, Jan M.; Mindermann, Sören; Sharma, Mrinank; Johnston, David; Salvatier, John; Gavenčiak, Tom; Stephenson, Anna B.; Leech, Gavin et al. (2020-12-15). "Inferring the effectiveness of government interventions against COVID-19". Science 371 (6531): eabd9338. doi:10.1126/science.abd9338. PMID 33323424. 
  12. Systrom, Kevin; Vladek, Thomas; Krieger, Mike. "Rt.live Github repository". https://github.com/rtcovidlive/covid-model. 
  13. Wagner, Stacey D.; Struck, Adam J.; Gupta, Riti; Farnsworth, Dylan R.; Mahady, Amy E.; Eichinger, Katy; Thornton, Charles A.; Wang, Eric T. et al. (2016-09-28). "Dose-Dependent Regulation of Alternative Splicing by MBNL Proteins Reveals Biomarkers for Myotonic Dystrophy". PLOS Genetics 12 (9): e1006316. doi:10.1371/journal.pgen.1006316. ISSN 1553-7404. PMID 27681373. 
  14. Sharma, Amit; Johansson, Linda; Dunevall, Elin; Wahlgren, Weixiao Y.; Neutze, Richard; Katona, Gergely (2017-03-01). "Asymmetry in serial femtosecond crystallography data". Acta Crystallographica Section A 73 (2): 93–101. doi:10.1107/s2053273316018696. ISSN 2053-2733. PMID 28248658.
  15. Katona, Gergely; Garcia-Bonete, Maria-Jose; Lundholm, Ida (2016-05-01). "Estimating the difference between structure-factor amplitudes using multivariate Bayesian inference". Acta Crystallographica Section A 72 (3): 406–411. doi:10.1107/S2053273316003430. ISSN 2053-2733. PMID 27126118.
  16. Garay, Pablo G.; Martin, Osvaldo A.; Scheraga, Harold A.; Vila, Jorge A. (2016-07-21). "Detection of methylation, acetylation and glycosylation of protein residues by monitoring 13C chemical-shift changes: A quantum-chemical study". PeerJ 4: e2253. doi:10.7717/peerj.2253. ISSN 2167-8359. PMID 27547559.
  17. Wang, Yan; Huang, Hong; Huang, Lida; Ristic, Branko (2017). "Evaluation of Bayesian source estimation methods with Prairie Grass observations and Gaussian plume model: A comparison of likelihood functions and distance measures". Atmospheric Environment 152: 519–530. doi:10.1016/j.atmosenv.2017.01.014. Bibcode: 2017AtmEn.152..519W.
  18. MacNeil, M. Aaron; Chong-Seng, Karen M.; Pratchett, Deborah J.; Thompson, Cassandra A.; Messmer, Vanessa; Pratchett, Morgan S. (2017-03-14). "Age and Growth of An Outbreaking Acanthaster cf. solaris Population within the Great Barrier Reef". Diversity 9 (1): 18. doi:10.3390/d9010018.
  19. Tünnermann, Jan; Scharlau, Ingrid (2016). "Peripheral Visual Cues: Their Fate in Processing and Effects on Attention and Temporal-Order Perception". Frontiers in Psychology 7: 1442. doi:10.3389/fpsyg.2016.01442. ISSN 1664-1078. PMID 27766086.
  20. Graham, Nicholas A. J.; Jennings, Simon; MacNeil, M. Aaron; Mouillot, David; Wilson, Shaun K. (2015). "Predicting climate-driven regime shifts versus rebound potential in coral reefs". Nature 518 (7537): 94–97. doi:10.1038/nature14140. PMID 25607371. Bibcode: 2015Natur.518...94G.
  21. Mascarenhas, Maya N.; Flaxman, Seth R.; Boerma, Ties; Vanderpoel, Sheryl; Stevens, Gretchen A. (2012-12-18). "National, Regional, and Global Trends in Infertility Prevalence Since 1990: A Systematic Analysis of 277 Health Surveys". PLOS Medicine 9 (12): e1001356. doi:10.1371/journal.pmed.1001356. ISSN 1549-1676. PMID 23271957. 
  22. Cavanagh, James F; Wiecki, Thomas V; Cohen, Michael X; Figueroa, Christina M; Samanta, Johan; Sherman, Scott J; Frank, Michael J (2011). "Subthalamic nucleus stimulation reverses mediofrontal influence over decision threshold". Nature Neuroscience 14 (11): 1462–1467. doi:10.1038/nn.2925. PMID 21946325. 
  23. Gething, Peter W.; Elyazar, Iqbal R. F.; Moyes, Catherine L.; Smith, David L.; Battle, Katherine E.; Guerra, Carlos A.; Patil, Anand P.; Tatem, Andrew J. et al. (2012-09-06). "A Long Neglected World Malaria Map: Plasmodium vivax Endemicity in 2010". PLOS Neglected Tropical Diseases 6 (9): e1814. doi:10.1371/journal.pntd.0001814. ISSN 1935-2735. PMID 22970336. 
  24. Pullan, Rachel L.; Smith, Jennifer L.; Jasrasaria, Rashmi; Brooker, Simon J. (2014-01-21). "Global numbers of infection and disease burden of soil transmitted helminth infections in 2010". Parasites & Vectors 7: 37. doi:10.1186/1756-3305-7-37. ISSN 1756-3305. PMID 24447578. 
  25. Lamblin, Pascal (28 September 2017). "MILA and the future of Theano". theano-users (Mailing list). Retrieved 28 September 2017.
  26. Developers, PyMC (2018-05-17). "Theano, TensorFlow and the Future of PyMC". https://medium.com/@pymc_devs/theano-tensorflow-and-the-future-of-pymc-6c9987bb19d5. 
  27. "The Future of PyMC3, or: Theano is Dead, Long Live Theano". 27 October 2020. https://pymc-devs.medium.com/the-future-of-pymc3-or-theano-is-dead-long-live-theano-d8005f8a0e9b. 
  28. Bradbury, James; Frostig, Roy; Hawkins, Peter; Johnson, Matthew James; Leary, Chris; Maclaurin, Dougal; Necula, George; Paszke, Adam et al. "JAX". https://github.com/google/jax.
  29. "PyMC Timeline". https://github.com/pymc-devs/pymc3/wiki/Timeline. 
  30. Hoffman, Matthew D.; Gelman, Andrew (April 2014). "The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo". Journal of Machine Learning Research 15: 1593–1623. http://jmlr.org/papers/v15/hoffman14a.html.
  31. Kucukelbir, Alp; Ranganath, Rajesh; Blei, David M. (June 2015). Automatic Variational Inference in Stan. arXiv:1506.03431. Bibcode: 2015arXiv150603431K.

External links

  • PyMC website
  • PyMC source, a Git repository hosted on GitHub
  • PyTensor is a Python library for defining, optimizing, and efficiently evaluating mathematical expressions involving multi-dimensional arrays.