Video Description
Learn R and get comfortable with data science
In Detail
Excited by the endless possibilities offered by the fields of data science and data analysis? Let R set you on your way!
Data scientists, statisticians and analysts use R for statistical analysis, data visualization and predictive modeling. R gives aspiring analysts and data scientists the ability to represent complex sets of data in an impressive way.
Make yourself comfortable in R and get deep into data science using R with this Learning Path.
Prerequisites: Requires no programming knowledge; we cover the basics of R too!
Resources: Code downloads and errata:
PATH PRODUCTS
This path navigates across the following products (in sequential order):
Introduction to R Programming (3h 46m)
Learning Data Mining with R (2h 17m)
Learning R for Data Visualization (1h 59m)
R for Data Science Solutions (5h 32m)
Table of Contents

Chapter 1 : Introduction to R Programming
 The Course Overview 00:04:54
 Installing R 00:03:46
 Installing RStudio 00:04:36
 Installing Packages 00:04:50
 Data Types and Data Structures 00:03:05
 Vectors 00:05:44
 Random Numbers, Rounding, and Binning 00:04:00
 Missing Values 00:02:47
 The which() Operator 00:03:11
 Lists 00:04:35
 Set Operations 00:02:09
 Sampling and Sorting 00:02:52
 Check Conditions 00:02:17
 For Loops 00:02:34
 Dataframes 00:08:30
 Importing and Exporting Data 00:06:30
 Matrices and Frequency Tables 00:03:41
 Merging Dataframes 00:02:26
 Aggregation 00:02:48
 Melting and Cross Tabulations with dcast() 00:03:58
 Dates 00:05:35
 String Manipulation 00:05:14
 Functions 00:05:34
 Debugging and Error Handling 00:04:30
 Fast Loops with apply() 00:04:27
 Fast Loops with sapply(), lapply() and vapply() 00:02:00
 Creating and Customizing an R Plot 00:07:03
 Drawing Plots with 2 Y Axes 00:02:23
 Multiplots and Custom Layouts 00:03:08
 Creating Basic Graph Types 00:04:47
 Univariate Analysis 00:06:16
 Normal Distribution, Central Limit Theorem, and Confidence Intervals 00:05:32
 Correlation and Covariance 00:03:03
 Chisq Statistic 00:04:42
 ANOVA 00:04:54
 Statistical Tests 00:05:14
 Project 1 – Data Munging and Summarizing 00:11:31
 Project 2 – Visualization with Base Graphics 00:05:42
 Project 3 – Statistical Inference 00:03:50
 Pipes with Magrittr 00:05:21
 The 7 Data Manipulation Verbs 00:05:19
 Aggregation and Special Functions 00:03:36
 Two Table Verbs 00:02:43
 Working With Databases 00:05:30
 Understanding Basics, Filter, and Select 00:07:34
 Understanding Syntax, Creating and Updating Columns 00:04:06
 Aggregating Data, .N, and .I 00:04:21
 data.table 00:04:17
 Fast Loops with set(), Keys, and Joins 00:09:13

Chapter 2 : Getting Started with R for Data Science
 The Course Overview 00:04:15
 What is R? 00:02:34
 The Structure of the Language 00:03:52
 Data Structures within R 00:05:57
 Writing a Simple Program in R 00:04:33
 The Structure of a DataFrame 00:05:35
 Creating a DataFrame from a CSV File 00:02:41
 Creating a DataFrame from a Zip File 00:03:03
 Creating a DataFrame from a Database 00:06:55
 The Tools Available for Cleaning Data 00:06:50
 Dealing with Null Values 00:04:03
 Standardizing Date Formats 00:03:13
 Blending Multiple DataFrames 00:04:22
 What Is a Codebook and Why Create One? 00:03:50
 Creating the Codebook Using Standard R API Functionality 00:02:28
 Manually Creating a Custom Codebook 00:03:32
 Introduction to Data Mining and Analysis 00:03:49
 The Tools and Techniques for Creating the Story 00:03:32
 Regression Analysis with R 00:02:24
 Clustering Data with R 00:03:20
 Classifying Data with R 00:04:01
 Data Visualization Tools 00:03:09
 Creating Static Visualization Plots 00:03:47
 Creating Interactive Plots 00:02:01
 Publishing the Graphics 00:02:06
 What's Next? 00:03:12

Chapter 3 : Learning Data Mining with R
 The Course Overview 00:03:31
 Getting Started with R 00:05:06
 Data Preparation and Data Cleansing 00:04:10
 The Basic Concepts of R 00:05:46
 Data Frames and Data Manipulation 00:05:29
 Data Points and Distances in a Multidimensional Vector Space 00:03:59
 An Algorithmic Approach to Find Hidden Patterns in Data 00:06:24
 A Real-world Life Science Example 00:04:29
 Example – Using a Single Line of Code in R 00:04:00
 R Data Types 00:05:44
 R Functions and Indexing 00:04:15
 S3 Versus S4 – Object-oriented Programming in R 00:04:45
 Market Basket Analysis 00:03:01
 Introduction to Graphs 00:02:09
 Different Association Types 00:05:27
 The Apriori Algorithm 00:06:38
 The Eclat Algorithm 00:03:54
 The FP-Growth Algorithm 00:03:48
 Mathematical Foundations 00:06:01
 The Naive Bayes Classifier 00:03:50
 Spam Classification with Naïve Bayes 00:03:33
 Support Vector Machines 00:04:29
 K-nearest Neighbors 00:03:21
 Hierarchical Clustering 00:05:45
 Distribution-based Clustering 00:06:55
 Density-based Clustering 00:03:12
 Using DBSCAN to Cluster Flowers Based on Spatial Properties 00:02:25
 Introduction to Neural Networks and Deep Learning 00:06:09
 Using the H2O Deep Learning Framework 00:02:28
 Real-time Cloud-based IoT Sensor Data Analysis 00:06:17

Chapter 4 : Learning R for Data Visualization
 The Course Overview 00:05:32
 Preview of R Plotting Functionalities 00:03:16
 Introducing the Dataset 00:03:21
 Loading Tables and CSV Files 00:04:41
 Loading Excel Files 00:03:33
 Exporting Data 00:04:19
 Creating Histograms 00:05:01
 The Importance of Box Plots 00:03:44
 Plotting Bar Charts 00:02:43
 Plotting Multiple Variables – Scatterplots 00:03:07
 Dealing with Time – Time-series Plots 00:02:38
 Handling Uncertainty 00:04:15
 Changing Theme 00:03:07
 Changing Colors 00:03:20
 Modifying Axis and Labels 00:02:40
 Adding Supplementary Elements 00:04:08
 Adding Text Inside and Outside of the Plot 00:05:02
 Multiplots 00:03:59
 Exporting Plots as Images 00:03:24
 Adjusting the Page Size 00:02:33
 Getting Started with Interactive Plotting 00:02:44
 Creating Interactive Histograms and Box Plots 00:04:55
 Plotting Interactive Bar Charts 00:03:12
 Creating Interactive Scatterplots 00:02:58
 Developing Interactive Time-series Plots 00:03:47
 Getting Started with Shiny 00:04:09
 Creating a Simple Website 00:04:52
 File Input 00:03:09
 Conditional Panels – UI 00:03:45
 Conditional Panels – Servers 00:05:31
 Deploying the Site 00:05:38

Chapter 5 : R for Data Science Solutions
 R Functions and Arguments 00:06:25
 Understanding Environments 00:02:59
 Working with Lexical Scoping 00:02:49
 Understanding Closure 00:02:17
 Performing Lazy Evaluation 00:01:56
 Creating Infix Operators 00:02:51
 Using the Replacement Function 00:02:17
 Handling Errors in a Function 00:04:31
 The Debugging Function 00:04:05
 Downloading Open Data 00:02:15
 Reading and Writing CSV Files 00:01:13
 Scanning Text Files 00:02:21
 Working with Excel Files 00:01:56
 Reading Data from Databases 00:04:04
 Scraping Web Data 00:05:17
 Renaming the Data Variable 00:02:27
 Converting Data Types 00:04:03
 Working with Date Format 00:02:36
 Adding New Records 00:02:55
 Filtering Data 00:02:09
 Dropping Data 00:03:29
 Merging and Sorting Data 00:01:42
 Reshaping Data 00:04:00
 Detecting Missing Data 00:02:42
 Imputing Missing Data 00:03:15
 Enhancing a data.frame with a data.table 00:04:50
 Managing Data with data.table 00:01:40
 Performing Fast Aggregation with data.table 00:01:14
 Merging Large Datasets with a data.table 00:01:54
 Subsetting and Slicing Data with dplyr 00:02:11
 Sampling Data with dplyr 00:04:14
 Selecting Columns with dplyr 00:02:10
 Chaining Operations in dplyr 00:02:41
 Arranging Rows with dplyr 00:02:09
 Eliminating Duplicated Rows with dplyr 00:01:26
 Adding New Columns with dplyr 00:02:40
 Summarizing Data with dplyr 00:02:10
 Merging Data with dplyr 00:01:22
 Creating Basic Plots with ggplot2 00:04:15
 Changing Aesthetics Mapping 00:03:09
 Introducing Geometric Objects 00:03:13
 Performing Transformations 00:03:27
 Adjusting Scales 00:02:16
 Faceting 00:02:07
 Adjusting Themes 00:01:33
 Combining Plots 00:02:04
 Creating Maps 00:04:39
 Creating R Markdown Reports 00:02:47
 Learning the Markdown Syntax 00:03:14
 Embedding R Code Chunks 00:02:19
 Creating Interactive Graphics with ggvis 00:02:39
 Understanding Basic Syntax and Grammar 00:01:57
 Controlling Axes and Legends and Using Scales 00:02:55
 Adding Interactivity to a ggvis Plot 00:03:41
 Creating an R Shiny Document 00:02:16
 Publishing an R Shiny Report 00:02:29
 Generating Random Samples 00:02:52
 Understanding Uniform Distributions 00:01:39
 Generating Binomial Random Variates 00:02:30
 Generating Poisson Random Variates 00:02:06
 Sampling from a Normal Distribution 00:04:08
 Sampling from a Chi-Squared Distribution 00:02:00
 Understanding Student's t Distribution 00:02:11
 Sampling from a Dataset 00:01:52
 Simulating the Stochastic Process 00:02:29
 Getting Confidence Intervals 00:05:54
 Performing Z-tests 00:03:12
 Performing Student's t-Tests 00:02:15
 Conducting Exact Binomial Tests 00:02:09
 Performing Kolmogorov-Smirnov Tests 00:02:17
 Working with Pearson's Chi-Squared Tests 00:01:40
 Understanding the Wilcoxon Rank Sum and Signed Rank Tests 00:01:48
 Conducting One-way ANOVA 00:02:39
 Performing Two-way ANOVA 00:03:02
 Transforming Data into Transactions 00:05:12
 Displaying Transactions and Associations 00:03:03
 Mining Associations with the Apriori Rule 00:04:19
 Pruning Redundant Rules 00:02:15
 Visualizing Association Rules 00:02:36
 Mining Frequent Itemsets with Eclat 00:03:08
 Creating Transactions with Temporal Information 00:02:53
 Mining Frequent Sequential Patterns with cSPADE 00:02:42
 Creating Time Series Data 00:05:12
 Plotting a Time Series Object 00:02:26
 Decomposing Time Series 00:02:11
 Smoothing Time Series 00:05:21
 Forecasting Time Series 00:02:31
 Selecting an ARIMA Model 00:03:19
 Creating an ARIMA Model 00:02:20
 Forecasting with an ARIMA Model 00:02:11
 Predicting Stock Prices with an ARIMA Model 00:04:24
 Fitting a Linear Regression Model with lm 00:05:35
 Summarizing Linear Model Fits 00:02:14
 Using Linear Regression to Predict Unknown Values 00:01:38
 Measuring the Performance of the Regression Model 00:03:46
 Performing a Multiple Regression Analysis 00:02:54
 Selecting the Best-Fitted Regression Model with Stepwise Regression 00:03:57
 Applying the Gaussian Model for Generalized Linear Regression 00:03:23
 Performing a Logistic Regression Analysis 00:04:17
 Building a Classification Model with Recursive Partitioning Trees 00:02:42
 Visualizing a Recursive Partitioning Tree 00:02:19
 Measuring Model Performance with a Confusion Matrix 00:04:31
 Measuring Prediction Performance Using ROCR 00:03:59
 Clustering Data with Hierarchical Clustering 00:06:10
 Cutting a Tree into Clusters 00:01:51
 Clustering Data with the k-means Method 00:01:20
 Clustering Data with the Density-Based Method 00:02:54
 Extracting Silhouette Information from Clustering 00:01:45
 Comparing Clustering Methods 00:02:09
 Recognizing Digits Using the Density-Based Clustering Method 00:03:12
 Grouping Similar Text Documents with the k-means Clustering Method 00:01:50
 Performing Dimension Reduction with Principal Component Analysis (PCA) 00:02:12
 Determining the Number of Principal Components Using a Scree Plot 00:01:52
 Determining the Number of Principal Components Using the Kaiser Method 00:02:15
 Visualizing Multivariate Data Using a biplot 00:02:51
Product Information
 Title: Learning Path: Data Science with R
 Author(s):
 Release date: November 2016
 Publisher(s): Packt Publishing
 ISBN: 9781787289192