Skip to Content
R Deep Learning Essentials - Second Edition
book

R Deep Learning Essentials - Second Edition

by Mark Hodnett, Joshua F. Wiley
August 2018
Intermediate to advanced
378 pages
9h 9m
English
Packt Publishing
Content preview from R Deep Learning Essentials - Second Edition

Data leakage

Data leakage is where a feature used to train the model has values that could not exist if the model was used in production. It occurs most frequently in time series data. For example, in our churn use case in Chapter 4Training Deep Prediction Modelsthere were a number of categorical variables in the data that indicated customer segmentation. A data modeler may assume that these are good predictor variables, but it is not known how and when these variables were set. They could be based on customer' spend, which means that if they are used in the prediction algorithm, there is a circular reference—an external process calculates the segment based on the spend and then this variable is used to predict spend!

When extracting ...

Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.
Start your free trial

You might also like

R Deep Learning Cookbook

R Deep Learning Cookbook

PKS Prakash, Achyutuni Sri Krishna Rao
Hands-On Deep Learning with R

Hands-On Deep Learning with R

Rodger Devine, Michael Pawlus
R: Unleash Machine Learning Techniques

R: Unleash Machine Learning Techniques

Raghav Bali, Dipanjan Sarkar, Brett Lantz, Cory Lesmeister
Deep Learning with R Cookbook

Deep Learning with R Cookbook

Swarna Gupta, Rehan Ali Ansari, Dipayan Sarkar

Publisher Resources

ISBN: 9781788992893Supplemental Content