Learning Bayesian Models with R

Name: Learning Bayesian Models with R
Author: Hari Manassery Koduvely
ISBN: 9781783987603

October 2015

Beginner to intermediate

168 pages

4h 11m

English

Packt Publishing

Table of Contents
Learning Bayesian Models with R
Credits
About the Author
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers, and more
Why subscribe?
Free access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for

Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. Introducing the Probability Theory
Probability distributions
Conditional probability
Bayesian theorem
Marginal distribution
Expectations and covariance
Binomial distribution
Beta distribution
Gamma distribution
Dirichlet distribution
Wishart distribution
Exercises
References
Summary
2. The R Environment
Setting up the R environment and packages
Installing R and RStudio
Your first R program
Managing data in R
Data Types in R
Data structures in R
Importing data into R
Slicing and dicing datasets
Vectorized operations
Writing R programs
Control structures
Functions
Scoping rules
Loop functions
lapply
sapply
mapply
apply
tapply
Data visualization
High-level plotting functions
Low-level plotting commands
Interactive graphics functions
Sampling
Random uniform sampling from an interval
Sampling from normal distribution
Exercises
References
Summary
3. Introducing Bayesian Inference
Bayesian view of uncertainty
Choosing the right prior distribution
Non-informative priors
Subjective priors
Conjugate priors
Hierarchical priors
Estimation of posterior distribution
Maximum a posteriori estimation
Laplace approximation
Monte Carlo simulations
The Metropolis-Hasting algorithm
R packages for the Metropolis-Hasting algorithm
Gibbs sampling
R packages for Gibbs sampling
Variational approximation
Prediction of future observations
Exercises
References
Summary
4. Machine Learning Using Bayesian Inference
Why Bayesian inference for machine learning?
Model overfitting and bias-variance tradeoff
Selecting models of optimum complexity
Subset selection
Model regularization
Bayesian averaging
An overview of common machine learning tasks
References
Summary
5. Bayesian Regression Models
Generalized linear regression
The arm package
The Energy efficiency dataset
Regression of energy efficiency with building parameters
Ordinary regression
Bayesian regression
Simulation of the posterior distribution
Exercises
References
Summary
6. Bayesian Classification Models
Performance metrics for classification
The Naïve Bayes classifier
Text processing using the tm package
Model training and prediction
The Bayesian logistic regression model
The BayesLogit R package
The dataset
Preparation of the training and testing datasets
Using the Bayesian logistic model
Exercises
References
Summary
7. Bayesian Models for Unsupervised Learning
Bayesian mixture models
The bgmm package for Bayesian mixture models
Topic modeling using Bayesian inference
Latent Dirichlet allocation
R packages for LDA
The topicmodels package
The lda package
Exercises
References
Summary
8. Bayesian Neural Networks
Two-layer neural networks
Bayesian treatment of neural networks
The brnn R package
Deep belief networks and deep learning
Restricted Boltzmann machines
Deep belief networks
The darch R package
Other deep learning packages in R
Exercises
References
Summary
9. Bayesian Modeling at Big Data Scale
Distributed computing using Hadoop
RHadoop for using Hadoop from R
Spark – in-memory distributed computing
SparkR
Linear regression using SparkR
Computing clusters on the cloud
Amazon Web Services
Creating and running computing instances on AWS
Installing R and RStudio
Running Spark on EC2
Microsoft Azure
IBM Bluemix
Other R packages for large scale machine learning
The parallel R package
The foreach R package
Exercises
References
Summary
Index

Content preview from Learning Bayesian Models with R

Spark – in-memory distributed computing

One issue with Hadoop is that the output of every MapReduce operation is written to the hard disk. A large data processing job therefore involves many disk reads and writes, which makes processing in Hadoop very slow. Network latency, the time required to shuffle data between nodes, adds to the problem. Another disadvantage is that one cannot make real-time queries against files stored in HDFS. For machine learning problems, MapReduce does not persist data in memory across the iterations of the training phase, so each iteration has to read its input from disk again. All this makes Hadoop a less than ideal platform for machine learning.
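
The cost of not persisting data between iterations can be illustrated with a toy sketch. It is written in plain Python, is not taken from the book, and uses neither Hadoop nor Spark: the file `data.csv`, the helper names, and the one-parameter model `y = w * x` are all hypothetical, chosen only to count how often an iterative training loop goes back to disk.

```python
import os
import tempfile

def write_dataset(path, n=1000):
    # Hypothetical toy dataset following y = 3x, one "x,y" pair per line.
    with open(path, "w") as f:
        for i in range(n):
            x = i / n
            f.write(f"{x},{3.0 * x}\n")

def read_dataset(path, counter):
    # Each call stands in for one full pass over files stored on disk.
    counter["disk_reads"] += 1
    with open(path) as f:
        return [tuple(map(float, line.split(","))) for line in f]

def gradient_step(data, w, lr=0.5):
    # One least-squares gradient descent step for the model y = w * x.
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return w - lr * grad

def fit_rereading(path, iterations, counter):
    # Disk-bound style: every iteration starts again from the file.
    w = 0.0
    for _ in range(iterations):
        w = gradient_step(read_dataset(path, counter), w)
    return w

def fit_cached(path, iterations, counter):
    # In-memory style: read once, then iterate over the cached data.
    data = read_dataset(path, counter)
    w = 0.0
    for _ in range(iterations):
        w = gradient_step(data, w)
    return w

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.csv")
    write_dataset(path)
    reread_counter = {"disk_reads": 0}
    cached_counter = {"disk_reads": 0}
    w_reread = fit_rereading(path, 50, reread_counter)
    w_cached = fit_cached(path, 50, cached_counter)
    print(reread_counter["disk_reads"], cached_counter["disk_reads"])
```

Both loops arrive at the same estimate of `w`, but the first performs one full read per iteration while the second reads the data only once. On a real cluster each of those extra passes would also involve disk writes and network shuffles, which this single-machine sketch does not model.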

A solution to this problem was invented ...
