
Advanced Machine Learning with R

by Cory Lesmeister, Dr. Sunil Kumar Chinnamgari

May 2019

Intermediate to advanced

664 pages

15h 41m

English

Packt Publishing


Data preparation

Next, we create our training and test data using a 70/30 split. We then apply the standard feature exploration introduced in Chapter 1, Preparing and Understanding Data, with these tasks in mind:

Eliminate low variance features
Identify and remove linear dependencies
Explore highly correlated features
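As a rough illustration of these three tasks, here is a minimal sketch on made-up data. The preview does not show which functions the chapter uses, so the choice of caret's nearZeroVar(), findLinearCombos(), and findCorrelation() is an assumption, as are the toy data frame, its column names, and the 0.9 correlation cutoff.

```r
# Assumes the caret package is installed.
set.seed(1492)
n <- 200

# Hypothetical toy features (names are illustrative, not the book's data)
toy <- data.frame(a = rnorm(n), b = rnorm(n), constant = rep(1, n))
toy$a_plus_b <- toy$a + toy$b                 # exact linear combination of a and b
toy$a_noisy  <- toy$a + rnorm(n, sd = 0.01)   # nearly a copy of a

# 1. Flag zero- and near-zero-variance columns, then drop them
nzv <- caret::nearZeroVar(toy)
nzv_names <- names(toy)[nzv]
kept <- toy[, setdiff(names(toy), nzv_names), drop = FALSE]

# 2. Flag columns that are linear combinations of other columns, then drop them
combos <- caret::findLinearCombos(as.matrix(kept))
lin_names <- names(kept)[combos$remove]
kept <- kept[, setdiff(names(kept), lin_names), drop = FALSE]

# 3. Flag one column from each pair whose absolute correlation exceeds the cutoff
high_cor <- caret::findCorrelation(cor(kept), cutoff = 0.9)
cor_names <- names(kept)[high_cor]
```

Each step only flags candidates (nzv_names, lin_names, cor_names). Whether to drop a flagged feature, especially a highly correlated one, is still a judgment call that depends on the model being fit.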

First, we convert the numeric outcome into a factor, which is then used to create a stratified data index:

> y_factor <- as.factor(y)
> set.seed(1492)
> index <- caret::createDataPartition(y_factor, p = 0.7, list = FALSE)

Using the index, we create train/test input features and labels:

> train <- x[index, ]
> train_y <- y_factor[index]
> test <- x[-index, ]
> test_y ...
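To see what the stratified index buys us, the split can be tried on synthetic data. This is only a sketch: the x and y below are hypothetical stand-ins for the chapter's data, and because the last assignment in the listing above is truncated, the held-out labels are stored under the illustrative name test_labels rather than guessing at the original line.

```r
# Assumes the caret package is installed.
set.seed(1492)

# Hypothetical stand-ins for the chapter's x and y (not the book's data)
x <- data.frame(f1 = rnorm(1000), f2 = runif(1000))
y <- rbinom(1000, size = 1, prob = 0.2)   # imbalanced 0/1 outcome

# Stratified 70/30 index: rows are sampled within each level of the factor
y_factor <- as.factor(y)
index <- caret::createDataPartition(y_factor, p = 0.7, list = FALSE)

train <- x[index, ]
train_y <- y_factor[index]
test <- x[-index, ]
test_labels <- y_factor[-index]   # illustrative name; the source line is truncated

# Compare the class proportions in the two partitions
train_props <- prop.table(table(train_y))
test_props <- prop.table(table(test_labels))
print(rbind(train = train_props, test = test_props))
```

Because the sampling happens within each class, the share of each outcome level should be nearly the same in both partitions, which a plain random split does not guarantee for an imbalanced outcome.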


ISBN: 9781838641771