R Deep Learning Essentials - Second Edition

by Mark Hodnett, Joshua F. Wiley

August 2018

Intermediate to advanced

378 pages

9h 9m

English

Packt Publishing

Data leakage

Data leakage occurs when a feature used to train the model has values that could not exist if the model were used in production. It occurs most frequently in time series data. For example, in our churn use case in Chapter 4, Training Deep Prediction Models, there were a number of categorical variables in the data that indicated customer segmentation. A data modeler may assume that these are good predictor variables, but it is not known how and when these variables were set. They could be based on customer spend, in which case using them in the prediction algorithm creates a circular reference: an external process calculates the segment based on the spend, and then this variable is used to predict spend!
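One common guard against this kind of leakage is to split the data at a cutoff date, so that features come only from records before the cutoff and the target only from records after it. The following is a minimal sketch in base R, not the book's own code: the `transactions` data frame, its column names, the cutoff date, and the spend threshold are all made up for illustration.

```r
# Hypothetical transaction log: one row per purchase (all values are invented).
transactions <- data.frame(
  customer_id = c(1, 1, 1, 2, 2, 3, 3, 3),
  date = as.Date(c("2018-01-05", "2018-02-10", "2018-04-02",
                   "2018-01-20", "2018-03-15",
                   "2018-02-01", "2018-03-20", "2018-04-10")),
  spend = c(20, 35, 50, 10, 15, 40, 5, 60)
)

# Assumed cutoff: everything before it is "known at prediction time".
cutoff <- as.Date("2018-03-01")

history <- transactions[transactions$date < cutoff, ]
future  <- transactions[transactions$date >= cutoff, ]

# Features are built only from the history window.
features <- aggregate(spend ~ customer_id, data = history, FUN = sum)
names(features)[2] <- "spend_before_cutoff"

# The target is built only from the future window.
target <- aggregate(spend ~ customer_id, data = future, FUN = sum)
names(target)[2] <- "spend_after_cutoff"

# Customers with no purchases after the cutoff get a target of 0.
model_data <- merge(features, target, by = "customer_id", all.x = TRUE)
model_data$spend_after_cutoff[is.na(model_data$spend_after_cutoff)] <- 0

# For contrast, a leaky segment: it is derived from ALL spend,
# including the post-cutoff spend that the model is meant to predict.
total_spend <- aggregate(spend ~ customer_id, data = transactions, FUN = sum)
leaky_segment <- ifelse(total_spend$spend > 50, "high", "low")
```

Here `spend_before_cutoff` is safe to use as a predictor because it could be computed on the cutoff date, whereas `leaky_segment` could not: its value depends on purchases that had not yet happened. When a variable arrives precomputed from another system, as with the segmentation variables above, the same question applies: could its value have been known at the time the prediction is made?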

When extracting ...

Publisher Resources

ISBN: 9781788992893
