Book Description
Master the art of predictive modeling
About This Book
 Load, wrangle, and analyze your data using the world's most powerful statistical programming language
 Familiarize yourself with the most common data mining tools of R, such as k-means, hierarchical regression, linear regression, Naïve Bayes, decision trees, text mining, and so on
 Learn important concepts, such as the bias-variance tradeoff and overfitting, which are pervasive in predictive modeling
Who This Book Is For
If you work with data and want to become an expert in predictive analysis and modeling, then this Learning Path will serve you well. It is intended for budding and seasoned practitioners of predictive modeling alike. Basic knowledge of R is helpful, although it is not required to put this Learning Path to good use.
What You Will Learn
 Get to know the basics of R's syntax and major data structures
 Write functions, load data, and install packages
 Use different data sources in R and know how to interface with databases, and request and load JSON and XML
 Identify the challenges and apply your knowledge about data analysis in R to imperfect real-world data
 Predict the future with reasonably simple algorithms
 Understand key data visualization and predictive analytic skills using R
 Understand the language of models and the predictive modeling process
In Detail
Predictive analytics is a field that uses data to build models that predict a future outcome of interest. It can be applied to a range of business strategies and has been a key player in search advertising and recommendation engines.
The power and domain-specificity of R allow the user to express complex analytics easily, quickly, and succinctly. R offers a free and open source environment that is perfect for both learning and deploying predictive modeling solutions in the real world. This Learning Path will provide you with all the steps you need to master the art of predictive modeling with R.
We start with an introduction to data analysis with R, and then gradually you'll get your feet wet with predictive modeling. You will get to grips with the fundamentals of applied statistics and build on this knowledge to perform sophisticated and powerful analytics. You will be able to solve the difficulties of performing data analysis in practice and find solutions for working with "messy data" and large data, communicating results, and facilitating reproducibility. You will then perform key predictive analytics tasks using R, such as training and testing predictive models for classification and regression tasks, scoring new data sets, and so on. By the end of this Learning Path, you will have explored and tested the most popular modeling techniques in use on real-world data sets and mastered a diverse range of techniques in predictive analytics.
This Learning Path combines some of the best that Packt has to offer in one complete, curated package. It includes content from the following Packt products:
 Data Analysis with R, Tony Fischetti
 Learning Predictive Analytics with R, Eric Mayor
 Mastering Predictive Analytics with R, Rui Miguel Forte
Style and approach
Learn data analysis using engaging examples and fun exercises, and with a gentle and friendly but comprehensive "learn-by-doing" approach. This is a practical course, which analyzes compelling data about life, health, and death with the help of tutorials. It offers you a useful way of interpreting the data that's specific to this course, but that can also be applied to any other data. This course is designed to be both a guide and a reference for moving beyond the basics of predictive modeling.
Table of Contents

R: Predictive Analysis
 Table of Contents
 R: Predictive Analysis
 Credits
 Preface

1. Module 1
 1. RefresheR
 2. The Shape of Data
 3. Describing Relationships
 4. Probability
 5. Using Data to Reason About the World
 6. Testing Hypotheses
 7. Bayesian Methods
 8. Predicting Continuous Variables
 9. Predicting Categorical Variables
 10. Sources of Data
 11. Dealing with Messy Data
 12. Dealing with Large Data
 13. Reproducibility and Best Practices

2. Module 2
 1. Setting GNU R for Predictive Modeling
 2. Visualizing and Manipulating Data Using R
 3. Data Visualization with Lattice
 4. Cluster Analysis
 5. Agglomerative Clustering Using hclust()
 6. Dimensionality Reduction with Principal Component Analysis
 7. Exploring Association Rules with Apriori
 8. Probability Distributions, Covariance, and Correlation
 9. Linear Regression
 10. Classification with k-Nearest Neighbors and Naïve Bayes
 11. Classification Trees
 12. Multilevel Analyses
 13. Text Analytics with R
 14. Cross-validation and Bootstrapping Using Caret and Exporting Predictive Models Using PMML

A. Exercises and Solutions

Exercises
 Chapter 1 – Setting GNU R for Predictive Modeling
 Chapter 2 – Visualizing and Manipulating Data Using R
 Chapter 3 – Data Visualization with Lattice
 Chapter 4 – Cluster Analysis
 Chapter 5 – Agglomerative Clustering Using hclust()
 Chapter 6 – Dimensionality Reduction with Principal Component Analysis
 Chapter 7 – Exploring Association Rules with Apriori
 Chapter 8 – Probability Distributions, Covariance, and Correlation
 Chapter 9 – Linear Regression
 Chapter 10 – Classification with k-Nearest Neighbors and Naïve Bayes
 Chapter 11 – Classification Trees
 Chapter 12 – Multilevel Analyses
 Chapter 13 – Text Analytics with R

Solutions
 Chapter 1 – Setting GNU R for Predictive Modeling
 Chapter 2 – Visualizing and Manipulating Data Using R
 Chapter 3 – Data Visualization with Lattice
 Chapter 4 – Cluster Analysis
 Chapter 5 – Agglomerative Clustering Using hclust()
 Chapter 6 – Dimensionality Reduction with Principal Component Analysis
 Chapter 7 – Exploring Association Rules with Apriori
 Chapter 8 – Probability Distributions, Covariance, and Correlation
 Chapter 9 – Linear Regression
 Chapter 10 – Classification with k-Nearest Neighbors and Naïve Bayes
 Chapter 11 – Classification Trees
 Chapter 12 – Multilevel Analyses
 Chapter 13 – Text Analytics with R

B. Further Reading and References
 Preface
 Chapter 1 – Setting GNU R for Predictive Modeling
 Chapter 2 – Visualizing and Manipulating Data Using R
 Chapter 3 – Data Visualization with Lattice
 Chapter 4 – Cluster Analysis
 Chapter 5 – Agglomerative Clustering Using hclust()
 Chapter 6 – Dimensionality Reduction with Principal Component Analysis
 Chapter 7 – Exploring Association Rules with Apriori
 Chapter 8 – Probability Distributions, Covariance, and Correlation
 Chapter 9 – Linear Regression
 Chapter 10 – Classification with k-Nearest Neighbors and Naïve Bayes
 Chapter 11 – Classification Trees
 Chapter 12 – Multilevel Analyses
 Chapter 13 – Text Analytics with R
 Chapter 14 – Cross-validation and Bootstrapping Using Caret and Exporting Predictive Models Using PMML

3. Module 3

1. Gearing Up for Predictive Modeling
 Models
 Types of models
 The process of predictive modeling
 Performance metrics
 Summary
 2. Linear Regression
 3. Logistic Regression
 4. Neural Networks
 5. Support Vector Machines
 6. Tree-based Methods
 7. Ensemble Methods
 8. Probabilistic Graphical Models
 9. Time Series Analysis
 10. Topic Modeling
 11. Recommendation Systems

 A. Bibliography
 Index
Product Information
 Title: R: Predictive Analysis
 Author(s): Tony Fischetti, Eric Mayor, Rui Miguel Forte
 Release date: March 2017
 Publisher(s): Packt Publishing
 ISBN: 9781788290371