Book description
Over 80 recipes to help you breeze through your data analysis projects using R
About This Book
 Analyze your data using popular R packages such as ggplot2, with ready-to-use and customizable recipes
 Find meaningful insights from your data and generate dynamic reports
 A hands-on guide to help you put your data analysis skills in R to practical use
Who This Book Is For
This book is for data scientists, analysts and even enthusiasts who want to learn and implement the various data analysis techniques using R in a practical way. Those looking for quick, handy solutions to common tasks and challenges in data analysis will find this book to be very useful. Basic knowledge of statistics and R programming is assumed.
What You Will Learn
 Acquire, format and visualize your data using R
 Use R to perform exploratory data analysis
 Get an introduction to machine learning techniques such as classification and regression
 Get started with social network analysis
 Generate dynamic reporting with Shiny
 Get started with geospatial analysis
 Handle large data with R using Spark and MongoDB
 Build recommendation systems: collaborative filtering, content-based, and hybrid
 Learn from real-world dataset examples: fraud detection and image recognition
In Detail
Data analytics with R has emerged as a very important focus for organizations of all kinds. R enables even those with only an intuitive grasp of the underlying concepts, without a deep mathematical background, to unleash powerful and detailed examinations of their data.
This book will show you how to put your data analysis skills in R to practical use, with recipes that cater to both basic and advanced data analysis tasks. From acquiring your data and preparing it for analysis to the more complex data analysis techniques, the book will show you how to implement each technique effectively. You will also visualize your data using popular R packages such as ggplot2 and gain hidden insights from it. Starting with basic data analysis concepts, from handling your data to creating basic plots, you will go on to master more advanced techniques such as performing cluster analysis and generating effective analysis reports and visualizations. Throughout the book, you will get to know the common problems and obstacles you might encounter while implementing each of the data analysis techniques in R, along with ways to overcome them.
By the end of this book, you will have all the knowledge you need to become an expert in data analysis with R, and put your skills to the test in real-world scenarios.
Style and Approach
 Hands-on recipes to walk through data science challenges using R
 Your one-stop solution for common and not-so-common pain points encountered while working through real-world problems
 Addressing your common and not-so-common pain points, this is a book that you must have on the shelf
Table of contents
 Preface

Acquire and Prepare the Ingredients - Your Data
 Introduction
 Working with data
 Reading data from CSV files
 Reading XML data
 Reading JSON data
 Reading data from fixed-width formatted files
 Reading data from R files and R libraries
 Removing cases with missing values
 Replacing missing values with the mean
 Removing duplicate cases
 Rescaling a variable to a specified min-max range
 Normalizing or standardizing data in a data frame
 Binning numerical data
 Creating dummies for categorical variables
 Handling missing data
 Correcting data
 Imputing data
 Detecting outliers

What's in There - Exploratory Data Analysis
 Introduction
 Creating standard data summaries
 Extracting a subset of a dataset
 Splitting a dataset
 Creating random data partitions
 Generating standard plots, such as histograms, boxplots, and scatterplots
 Generating multiple plots on a grid
 Creating plots with the lattice package
 Creating charts that facilitate comparisons
 Creating charts that help to visualize possible causality

Where Does It Belong? Classification
 Introduction
 Generating error/classification confusion matrices
 Principal Component Analysis
 Generating receiver operating characteristic charts
 Building, plotting, and evaluating with classification trees
 Using random forest models for classification
 Classifying using the support vector machine approach
 Classifying using the Naive Bayes approach
 Classifying using the KNN approach
 Using neural networks for classification
 Classifying using linear discriminant function analysis
 Classifying using logistic regression
 Text classification for sentiment analysis

Give Me a Number - Regression
 Introduction
 Computing the root-mean-square error
 Building KNN models for regression
 Performing linear regression
 Performing variable selection in linear regression
 Building regression trees
 Building random forest models for regression
 Using neural networks for regression
 Performing k-fold cross-validation
 Performing leave-one-out cross-validation to limit overfitting

Can You Simplify That? Data Reduction Techniques
 Introduction
 Performing cluster analysis using hierarchical clustering
 Performing cluster analysis using partitioning clustering
 Image segmentation using mini-batch K-means
 Partitioning around medoids
 Clustering large application
 Performing cluster validation
 Performing advanced clustering
 Model-based clustering with the EM algorithm
 Reducing dimensionality with principal component analysis

Lessons from History - Time Series Analysis
 Introduction
 Exploring finance datasets
 Creating and examining date objects
 Operating on date objects
 Performing preliminary analyses on time series data
 Using time series objects
 Decomposing time series
 Filtering time series data
 Smoothing and forecasting using the Holt-Winters method
 Building an automated ARIMA model

How Does It Look? - Advanced Data Visualization
 Introduction
 Creating scatter plots
 Creating line graphs
 Creating bar graphs
 Making distribution plots
 Creating mosaic graphs
 Making treemaps
 Plotting a correlation matrix
 Creating heatmaps
 Plotting network graphs
 Labeling and legends
 Coloring and themes
 Creating multivariate plots
 Creating 3D graphs and animation
 Selecting a graphics device

This May Also Interest You - Building Recommendations

It's All About Your Connections - Social Network Analysis
 Introduction
 Downloading social network data using public APIs
 Creating adjacency matrices and edge lists

Plotting social network data
 Getting ready
 How to do it...
 How it works...

There's more...
 Specifying plotting preferences
 Plotting directed graphs
 Creating a graph object with weights
 Extracting the network as an adjacency matrix from the graph object
 Extracting an adjacency matrix with weights
 Extracting an edge list from a graph object
 Creating a bipartite network graph
 Generating projections of a bipartite network
 Computing important network metrics
 Cluster analysis
 Force layout
 YiFan Hu layout

Put Your Best Foot Forward - Document and Present Your Analysis

Work Smarter, Not Harder  Efficient and Elegant R Code
 Introduction
 Exploiting vectorized operations
 Processing entire rows or columns using the apply function
 Applying a function to all elements of a collection with lapply and sapply
 Applying functions to subsets of a vector
 Using the split-apply-combine strategy with plyr
 Slicing, dicing, and combining data with data tables

Where in the World? Geospatial Analysis
 Introduction
 Downloading and plotting a Google map of an area
 Overlaying data on the downloaded Google map
 Importing ESRI shape files to R
 Using the sp package to plot geographic data
 Getting maps from the maps package
 Creating spatial data frames from regular data frames containing spatial and other data
 Creating spatial data frames by combining regular data frames with spatial objects
 Adding variables to an existing spatial data frame
 Spatial data analysis with R and QGIS

Playing Nice - Connecting to Other Systems
 Introduction
 Using Java objects in R
 Using JRI to call R functions from Java
 Using Rserve to call R functions from Java
 Executing R scripts from Java
 Using the xlsx package to connect to Excel
 Reading data from relational databases - MySQL
 Reading data from NoSQL databases - MongoDB
 Working with in-memory data processing with Apache Spark
Product information
 Title: R Data Analysis Cookbook - Second Edition
 Author(s):
 Release date: September 2017
 Publisher(s): Packt Publishing
 ISBN: 9781787124479