Learning Path: R: Complete Machine Learning and Deep Learning Solutions

Develop and implement machine learning and deep learning algorithms using R in real-world scenarios.

Video Description

Unlock the hidden layers of data with R

In Detail

R is one of the leading technologies in the field of data science. Are you looking to gain in-depth knowledge of machine learning and deep learning? If so, this Learning Path is for you. Starting at a basic level, it teaches you how to develop and implement machine learning and deep learning algorithms using R in real-world scenarios.

The Learning Path begins by covering some basic R concepts to refresh your knowledge before diving into advanced techniques. You will start by setting up the environment and then perform data ETL in R. You will then learn important machine learning topics, including classification, regression, clustering, association rule mining, and dimensionality reduction. Next, you will learn the basics of deep learning and artificial neural networks (ANNs) and move on to recurrent neural networks (RNNs) and convolutional neural networks (CNNs). Finally, you will learn about applications of deep learning in various fields and the practical side of scalability, high-performance computing (HPC), and feature engineering.

By the end of the Learning Path, you will have a solid knowledge of these algorithms and techniques and be able to implement them efficiently in your data science projects.

Prerequisites: Basic knowledge of R would be beneficial. A background in linear algebra and statistics is expected.

Resources: Code downloads and errata:

  • Mastering R Programming

  • R Machine Learning Solutions

  • Deep Learning with R

  • PATH PRODUCTS

    This path navigates across the following products (in sequential order):

  • Mastering R Programming (5h 12m)

  • R Machine Learning Solutions (8h 20m)

  • Deep Learning with R (4h 4m)

  • Table of Contents

    1. Chapter 1: Mastering R Programming
      1. The Course Overview 00:07:45
      2. Performing Univariate Analysis 00:05:22
      3. Bivariate Analysis – Correlation, Chi-Sq Test, and ANOVA 00:05:43
      4. Detecting and Treating Outliers 00:03:21
      5. Treating Missing Values with `mice` 00:03:59
      6. Building Linear Regressors 00:07:35
      7. Interpreting Regression Results and Interaction Terms 00:05:19
      8. Performing Residual Analysis and Extracting Extreme Observations with Cook's Distance 00:03:25
      9. Extracting Better Models with Best Subsets, Stepwise Regression, and ANOVA 00:04:39
      10. Validating Model Performance on New Data with k-Fold Cross Validation 00:02:29
      11. Building Non-Linear Regressors with Splines and GAMs 00:05:20
      12. Building Logistic Regressors, Evaluation Metrics, and ROC Curve 00:12:38
      13. Understanding the Concept and Building Naive Bayes Classifier 00:09:24
      14. Building k-Nearest Neighbors Classifier 00:07:01
      15. Building Tree-Based Models Using rpart, ctree, and C5.0 00:06:33
      16. Building Predictive Models with the caret Package 00:08:11
      17. Selecting Important Features with RFE, varImp, and Boruta 00:05:19
      18. Building Classifiers with Support Vector Machines 00:08:04
      19. Understanding Bagging and Building Random Forest Classifier 00:05:07
      20. Implementing Stochastic Gradient Boosting with GBM 00:05:18
      21. Regularization with Ridge, Lasso, and Elastic Net 00:08:53
      22. Building Classifiers and Regressors with XGBoost 00:10:10
      23. Dimensionality Reduction with Principal Component Analysis 00:05:05
      24. Clustering with k-means and Principal Components 00:03:16
      25. Determining Optimum Number of Clusters 00:05:25
      26. Understanding and Implementing Hierarchical Clustering 00:02:36
      27. Clustering with Affinity Propagation 00:05:25
      28. Building Recommendation Engines 00:09:01
      29. Understanding the Components of a Time Series, and the xts Package 00:05:42
      30. Stationarity, De-Trend, and De-Seasonalize 00:04:07
      31. Understanding the Significance of Lags, ACF, PACF, and CCF 00:03:49
      32. Forecasting with Moving Average and Exponential Smoothing 00:02:25
      33. Forecasting with Double Exponential Smoothing and Holt-Winters 00:03:23
      34. Forecasting with ARIMA Modelling 00:05:26
      35. Scraping Web Pages and Processing Texts 00:09:24
      36. Corpus, TDM, TF-IDF, and Word Cloud 00:09:07
      37. Cosine Similarity and Latent Semantic Analysis 00:07:20
      38. Extracting Topics with Latent Dirichlet Allocation 00:05:07
      39. Sentiment Scoring with tidytext and syuzhet 00:04:23
      40. Classifying Texts with RTextTools 00:03:57
      41. Building a Basic ggplot2 and Customizing the Aesthetics and Themes 00:07:18
      42. Manipulating Legends, Adding Text, and Annotation 00:03:31
      43. Drawing Multiple Plots with Faceting and Changing Layouts 00:03:18
      44. Creating Bar Charts, Boxplots, Time Series, and Ribbon Plots 00:05:25
      45. ggplot2 Extensions and ggplotly 00:03:11
      46. Implementing Best Practices to Speed Up R Code 00:05:47
      47. Implementing Parallel Computing with doParallel and foreach 00:04:22
      48. Writing Readable and Fast R Code with Pipes and dplyr 00:05:40
      49. Writing Super Fast R Code with Minimal Keystrokes Using data.table 00:06:38
      50. Interfacing C++ in R with Rcpp 00:11:09
      51. Understanding the Structure of an R Package 00:05:02
      52. Build, Document, and Host an R Package on GitHub 00:07:10
      53. Performing Important Checks Before Submitting to CRAN 00:04:06
      54. Submitting an R Package to CRAN 00:03:11
    2. Chapter 2: R Machine Learning Solutions
      1. The Course Overview 00:04:38
      2. Downloading and Installing R 00:06:10
      3. Downloading and Installing RStudio 00:03:10
      4. Installing and Loading Packages 00:05:46
      5. Reading and Writing Data 00:05:54
      6. Using R to Manipulate Data 00:05:47
      7. Applying Basic Statistics 00:04:47
      8. Visualizing Data 00:03:33
      9. Getting a Dataset for Machine Learning 00:02:39
      10. Reading a Titanic Dataset from a CSV File 00:08:36
      11. Converting Types on Character Variables 00:03:05
      12. Detecting Missing Values 00:03:19
      13. Imputing Missing Values 00:04:31
      14. Exploring and Visualizing Data 00:04:25
      15. Predicting Passenger Survival with a Decision Tree 00:03:59
      16. Validating the Power of Prediction with a Confusion Matrix 00:02:08
      17. Assessing Performance with the ROC Curve 00:02:33
      18. Understanding Data Sampling in R 00:03:31
      19. Operating a Probability Distribution in R 00:05:42
      20. Working with Univariate Descriptive Statistics in R 00:05:10
      21. Performing Correlations and Multivariate Analysis 00:03:02
      22. Operating Linear Regression and Multivariate Analysis 00:03:25
      23. Conducting an Exact Binomial Test 00:03:48
      24. Performing Student's t-test 00:03:13
      25. Performing the Kolmogorov-Smirnov Test 00:04:43
      26. Understanding the Wilcoxon Rank Sum and Signed Rank Test 00:02:04
      27. Working with Pearson's Chi-Squared Test 00:05:09
      28. Conducting a One-Way ANOVA 00:04:16
      29. Performing a Two-Way ANOVA 00:04:02
      30. Fitting a Linear Regression Model with lm 00:04:53
      31. Summarizing Linear Model Fits 00:05:21
      32. Using Linear Regression to Predict Unknown Values 00:02:51
      33. Generating a Diagnostic Plot of a Fitted Model 00:03:58
      34. Fitting a Polynomial Regression Model with lm 00:02:16
      35. Fitting a Robust Linear Regression Model with rlm 00:02:16
      36. Studying a Case of Linear Regression on SLID Data 00:06:39
      37. Reducing Dimensions with SVD 00:02:11
      38. Applying the Poisson Model for Generalized Linear Regression 00:01:34
      39. Applying the Binomial Model for Generalized Linear Regression 00:02:02
      40. Fitting a Generalized Additive Model to Data 00:03:14
      41. Visualizing a Generalized Additive Model 00:01:27
      42. Diagnosing a Generalized Additive Model 00:03:38
      43. Preparing the Training and Testing Datasets 00:03:45
      44. Building a Classification Model with Recursive Partitioning Trees 00:06:10
      45. Visualizing a Recursive Partitioning Tree 00:03:04
      46. Measuring the Prediction Performance of a Recursive Partitioning Tree 00:02:49
      47. Pruning a Recursive Partitioning Tree 00:02:38
      48. Building a Classification Model with a Conditional Inference Tree 00:01:56
      49. Visualizing a Conditional Inference Tree 00:02:38
      50. Measuring the Prediction Performance of a Conditional Inference Tree 00:02:10
      51. Classifying Data with the K-Nearest Neighbor Classifier 00:05:31
      52. Classifying Data with Logistic Regression 00:04:38
      53. Classifying Data with the Naïve Bayes Classifier 00:06:16
      54. Classifying Data with a Support Vector Machine 00:05:58
      55. Choosing the Cost of an SVM 00:02:57
      56. Visualizing an SVM Fit 00:03:33
      57. Predicting Labels Based on a Model Trained by an SVM 00:03:49
      58. Tuning an SVM 00:02:48
      59. Training a Neural Network with neuralnet 00:04:08
      60. Visualizing a Neural Network Trained by neuralnet 00:02:22
      61. Predicting Labels Based on a Model Trained by neuralnet 00:03:07
      62. Training a Neural Network with nnet 00:02:46
      63. Predicting Labels Based on a Model Trained by nnet 00:02:49
      64. Estimating Model Performance with k-fold Cross Validation 00:03:42
      65. Performing Cross Validation with the e1071 Package 00:03:22
      66. Performing Cross Validation with the caret Package 00:02:59
      67. Ranking the Variable Importance with the caret Package 00:02:21
      68. Ranking the Variable Importance with the rminer Package 00:02:30
      69. Finding Highly Correlated Features with the caret Package 00:02:13
      70. Selecting Features Using the caret Package 00:04:59
      71. Measuring the Performance of the Regression Model 00:03:58
      72. Measuring Prediction Performance with a Confusion Matrix 00:02:07
      73. Measuring Prediction Performance Using ROCR 00:02:46
      74. Comparing ROC Curves Using the caret Package 00:03:44
      75. Measuring Performance Differences between Models with the caret Package 00:03:41
      76. Classifying Data with the Bagging Method 00:07:53
      77. Performing Cross Validation with the Bagging Method 00:01:56
      78. Classifying Data with the Boosting Method 00:06:05
      79. Performing Cross Validation with the Boosting Method 00:02:06
      80. Classifying Data with Gradient Boosting 00:07:10
      81. Calculating the Margins of a Classifier 00:05:30
      82. Calculating the Error Evolution of the Ensemble Method 00:02:19
      83. Classifying Data with Random Forest 00:07:02
      84. Estimating the Prediction Errors of Different Classifiers 00:04:35
      85. Clustering Data with Hierarchical Clustering 00:08:40
      86. Cutting Trees into Clusters 00:03:30
      87. Clustering Data with the k-Means Method 00:04:10
      88. Drawing a Bivariate Cluster Plot 00:03:32
      89. Comparing Clustering Methods 00:04:16
      90. Extracting Silhouette Information from Clustering 00:02:40
      91. Obtaining the Optimum Number of Clusters for k-Means 00:02:49
      92. Clustering Data with the Density-Based Method 00:06:42
      93. Clustering Data with the Model-Based Method 00:04:38
      94. Visualizing a Dissimilarity Matrix 00:03:24
      95. Validating Clusters Externally 00:04:12
      96. Transforming Data into Transactions 00:03:35
      97. Displaying Transactions and Associations 00:02:14
      98. Mining Associations with the Apriori Rule 00:07:24
      99. Pruning Redundant Rules 00:02:26
      100. Visualizing Association Rules 00:05:07
      101. Mining Frequent Itemsets with Eclat 00:03:36
      102. Creating Transactions with Temporal Information 00:02:41
      103. Mining Frequent Sequential Patterns with cSPADE 00:04:16
      104. Performing Feature Selection with FSelector 00:07:38
      105. Performing Dimension Reduction with PCA 00:07:19
      106. Determining the Number of Principal Components Using the Scree Test 00:03:34
      107. Determining the Number of Principal Components Using the Kaiser Method 00:02:05
      108. Visualizing Multivariate Data Using biplot 00:03:17
      109. Performing Dimension Reduction with MDS 00:05:38
      110. Reducing Dimensions with SVD 00:03:19
      111. Compressing Images with SVD 00:03:05
      112. Performing Nonlinear Dimension Reduction with ISOMAP 00:04:34
      113. Performing Nonlinear Dimension Reduction with Local Linear Embedding 00:04:55
      114. Preparing the RHadoop Environment 00:05:36
      115. Installing rmr2 00:03:53
      116. Installing rhdfs 00:04:15
      117. Operating HDFS with rhdfs 00:05:47
      118. Implementing a Word Count Problem with RHadoop 00:05:27
      119. Comparing the Performance between an R MapReduce Program and a Standard R Program 00:05:03
      120. Testing and Debugging the rmr2 Program 00:03:49
      121. Installing plyrmr 00:03:12
      122. Manipulating Data with plyrmr 00:03:52
      123. Conducting Machine Learning with RHadoop 00:04:39
      124. Configuring RHadoop Clusters on Amazon EMR 00:05:28
    3. Chapter 3: Deep Learning with R
      1. The Course Overview 00:05:22
      2. Fundamental Concepts in Deep Learning 00:07:43
      3. Introduction to Artificial Neural Networks 00:07:58
      4. Classification with Two-Layer Artificial Neural Networks 00:08:03
      5. Probabilistic Predictions with Two-Layer ANNs 00:06:33
      6. Introduction to Multi-hidden-layer Architectures 00:04:31
      7. Tuning ANN Hyper-Parameters and Best Practices 00:06:12
      8. Neural Network Architectures 00:04:58
      9. Neural Network Architectures Continued 00:08:02
      10. The Learning Process 00:05:36
      11. Optimization Algorithms and Stochastic Gradient Descent 00:08:11
      12. Backpropagation 00:06:44
      13. Hyper-Parameter Optimization 00:07:18
      14. Introduction to Convolutional Neural Networks 00:09:57
      15. Introduction to Convolutional Neural Networks Continued 00:10:36
      16. CNNs in R 00:10:41
      17. Classifying Real-World Images with Pre-Trained Models 00:08:29
      18. Introduction to Recurrent Neural Networks 00:11:58
      19. Introduction to Long Short-Term Memory 00:08:08
      20. RNNs in R 00:08:55
      21. Use-Case – Learning How to Spell English Words from Scratch 00:06:35
      22. Introduction to Unsupervised and Reinforcement Learning 00:06:45
      23. Autoencoders 00:04:57
      24. Restricted Boltzmann Machines and Deep Belief Networks 00:07:45
      25. Reinforcement Learning with ANNs 00:07:23
      26. Use-Case – Anomaly Detection through Denoising Autoencoders 00:06:53
      27. Deep Learning for Computer Vision 00:07:20
      28. Deep Learning for Natural Language Processing 00:06:05
      29. Deep Learning for Audio Signal Processing 00:05:02
      30. Deep Learning for Complex Multimodal Tasks 00:04:32
      31. Other Important Applications of Deep Learning 00:05:24
      32. Debugging Deep Learning Systems 00:05:56
      33. GPU and Multi-GPU Computing for Deep Learning 00:04:57
      34. A Complete Comparison of Every DL Package in R 00:04:41
      35. Research Directions and Open Questions 00:04:48