Bayesian Analysis with Python - Second Edition

Bayesian modeling with PyMC3 and exploratory analysis of Bayesian models with ArviZ

Key Features

  • A step-by-step guide to conducting Bayesian data analysis using PyMC3 and ArviZ
  • A modern, practical, and computational approach to Bayesian statistical modeling
  • A tutorial on Bayesian analysis and best practices, with sample problems and practice exercises

Book Description

The second edition of Bayesian Analysis with Python is an introduction to the main concepts of applied Bayesian inference and its practical implementation in Python using PyMC3, a state-of-the-art probabilistic programming library, and ArviZ, a new library for exploratory analysis of Bayesian models.

The main concepts of Bayesian statistics are covered using a practical and computational approach. Synthetic and real data sets are used to introduce several types of models, such as generalized linear models for regression and classification, mixture models, hierarchical models, and Gaussian processes, among others.

By the end of the book, you will have a working knowledge of probabilistic modeling and be able to design and implement Bayesian models for your own data science problems. After reading the book, you will be better prepared to delve into more advanced material or specialized statistical modeling, should you need to.

What you will learn

  • Build probabilistic models using the Python library PyMC3 (a minimal sketch follows this list)
  • Analyze probabilistic models with the help of ArviZ
  • Acquire the skills required to sanity-check models and modify them if necessary
  • Understand the advantages and caveats of hierarchical models
  • Find out how different models can be used to answer different data analysis questions
  • Compare models and choose between alternatives
  • Discover how different models are unified from a probabilistic perspective
  • Think probabilistically and benefit from the flexibility of the Bayesian framework
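
The first two points describe a single workflow: specify a model in PyMC3, sample from its posterior, and explore the result with ArviZ. Below is a minimal sketch of that workflow, patterned on the coin-flipping example from the first chapter; the data array, variable names, and prior are illustrative assumptions, not code taken from the book:

    import numpy as np
    import pymc3 as pm
    import arviz as az

    # Hypothetical data: ten coin tosses, 1 = heads, 0 = tails
    data = np.array([1, 0, 0, 1, 0, 0, 1, 0, 1, 0])

    with pm.Model() as coin_model:
        # Beta(1, 1) prior over the probability of heads
        theta = pm.Beta("theta", alpha=1.0, beta=1.0)
        # Bernoulli likelihood for the observed tosses
        y = pm.Bernoulli("y", p=theta, observed=data)
        # "Push the inference button": draw posterior samples
        trace = pm.sample(1000)
        # Exploratory analysis of the posterior with ArviZ
        print(az.summary(trace))
        az.plot_posterior(trace)

The same pattern, a model block, a call to pm.sample, and ArviZ summaries and plots, recurs throughout the book, from simple linear regression to mixture models and Gaussian processes.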

Who this book is for

If you are a student, data scientist, researcher, or developer looking to get started with Bayesian data analysis and probabilistic programming, this book is for you. The book is introductory, so no previous statistical knowledge is required, although some experience with Python and NumPy is expected.

Table of contents

  1. Title Page
  2. Copyright and Credits
    1. Bayesian Analysis with Python Second Edition
  3. Dedication
  4. About Packt
    1. Why subscribe?
    2. Packt.com
  5. Foreword
  6. Contributors
    1. About the author
    2. About the reviewer
    3. Packt is searching for authors like you
  7. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
      1. Download the example code files
      2. Download the color images
      3. Conventions used
    4. Get in touch
      1. Reviews
  8. Thinking Probabilistically
    1. Statistics, models, and this book's approach
      1. Working with data
      2. Bayesian modeling
    2. Probability theory
      1. Interpreting probabilities
      2. Defining probabilities
        1. Probability distributions
        2. Independently and identically distributed variables
        3. Bayes' theorem
    3. Single-parameter inference
      1. The coin-flipping problem
        1. The general model
        2. Choosing the likelihood
        3. Choosing the prior
        4. Getting the posterior
        5. Computing and plotting the posterior
        6. The influence of the prior and how to choose one
    4. Communicating a Bayesian analysis
      1. Model notation and visualization
      2. Summarizing the posterior
        1. Highest-posterior density
    5. Posterior predictive checks
    6. Summary
    7. Exercises
  9. Programming Probabilistically
    1. Probabilistic programming
    2. PyMC3 primer
      1. Flipping coins the PyMC3 way
        1. Model specification
        2. Pushing the inference button
    3. Summarizing the posterior
      1. Posterior-based decisions
        1. ROPE
        2. Loss functions
    4. Gaussians all the way down
      1. Gaussian inferences
      2. Robust inferences
        1. Student's t-distribution
    5. Groups comparison
      1. Cohen's d
      2. Probability of superiority
      3. The tips dataset
    6. Hierarchical models
      1. Shrinkage
      2. One more example
    7. Summary
    8. Exercises
  10. Modeling with Linear Regression
    1. Simple linear regression
      1. The machine learning connection
      2. The core of the linear regression models
      3. Linear models and high autocorrelation
        1. Modifying the data before running
      4. Interpreting and visualizing the posterior
      5. Pearson correlation coefficient
        1. Pearson coefficient from a multivariate Gaussian
    2. Robust linear regression
    3. Hierarchical linear regression
      1. Correlation, causation, and the messiness of life
    4. Polynomial regression
      1. Interpreting the parameters of a polynomial regression
      2. Polynomial regression – the ultimate model?
    5. Multiple linear regression
      1. Confounding variables and redundant variables
      2. Multicollinearity or when the correlation is too high
      3. Masking effect variables
      4. Adding interactions
    6. Variable variance
    7. Summary
    8. Exercises
  11. Generalizing Linear Models
    1. Generalized linear models
    2. Logistic regression
      1. The logistic model
      2. The Iris dataset
        1. The logistic model applied to the Iris dataset
    3. Multiple logistic regression
      1. The boundary decision
      2. Implementing the model
      3. Interpreting the coefficients of a logistic regression
      4. Dealing with correlated variables
      5. Dealing with unbalanced classes
      6. Softmax regression
      7. Discriminative and generative models
    4. Poisson regression
      1. Poisson distribution
      2. The zero-inflated Poisson model
      3. Poisson regression and ZIP regression
    5. Robust logistic regression
    6. The GLM module
    7. Summary
    8. Exercises
  12. Model Comparison
    1. Posterior predictive checks
    2. Occam's razor – simplicity and accuracy
      1. Too many parameters lead to overfitting
      2. Too few parameters lead to underfitting
      3. The balance between simplicity and accuracy
      4. Predictive accuracy measures
        1. Cross-validation
    3. Information criteria
      1. Log-likelihood and deviance
      2. Akaike information criterion
      3. Widely applicable information criterion
      4. Pareto smoothed importance sampling leave-one-out cross-validation
      5. Other information criteria
      6. Model comparison with PyMC3
        1. A note on the reliability of WAIC and LOO computations
      7. Model averaging
    4. Bayes factors
      1. Some remarks
        1. Computing Bayes factors
        2. Common problems when computing Bayes factors
        3. Using Sequential Monte Carlo to compute Bayes factors
      2. Bayes factors and information criteria
    5. Regularizing priors
    6. WAIC in depth
      1. Entropy
      2. Kullback-Leibler divergence
    7. Summary
    8. Exercises
  13. Mixture Models
    1. Mixture models
    2. Finite mixture models
      1. The categorical distribution
      2. The Dirichlet distribution
      3. Non-identifiability of mixture models
      4. How to choose K
      5. Mixture models and clustering
    3. Non-finite mixture model
      1. Dirichlet process
    4. Continuous mixtures
      1. Beta-binomial and negative binomial
      2. The Student's t-distribution
    5. Summary
    6. Exercises
  14. Gaussian Processes
    1. Linear models and non-linear data
    2. Modeling functions
      1. Multivariate Gaussians and functions
      2. Covariance functions and kernels
      3. Gaussian processes
    3. Gaussian process regression
    4. Regression with spatial autocorrelation
    5. Gaussian process classification
    6. Cox processes
      1. The coal-mining disasters
      2. The redwood dataset
    7. Summary
    8. Exercises
  15. Inference Engines
    1. Inference engines
    2. Non-Markovian methods
      1. Grid computing
      2. Quadratic method
      3. Variational methods
        1. Automatic differentiation variational inference
    3. Markovian methods
      1. Monte Carlo
      2. Markov chain
      3. Metropolis-Hastings
      4. Hamiltonian Monte Carlo
      5. Sequential Monte Carlo
    4. Diagnosing the samples
      1. Convergence
      2. Monte Carlo error
      3. Autocorrelation
      4. Effective sample sizes
      5. Divergences
        1. Non-centered parameterization
    5. Summary
    6. Exercises
  16. Where To Go Next?
  17. Other Books You May Enjoy
    1. Leave a review - let other readers know what you think

Product information

  • Title: Bayesian Analysis with Python - Second Edition
  • Author(s): Osvaldo Martin
  • Release date: December 2018
  • Publisher(s): Packt Publishing
  • ISBN: 9781789341652