"Garbage in, garbage out" applies to more than just manufacturing. Dirty data can doom your predictive analytics project from the very start! In this video, Matt North shows you how to identify flaws such as statistical outliers and missing values, so that you can improve the usefulness and reliability of your results.
Using RapidMiner, Matt starts by importing a data set and examining it to confirm that it imported correctly, with the right data types. You will learn how to quickly identify outliers and missing values, and how to correct those problems by applying filters to your data import. Business and data analysts who use data for predictive modeling will find these techniques useful. A basic understanding of statistics and of data organization and representation will help you get the most out of this video.
- Learn how to identify and handle missing values on data imports in RapidMiner.
- Learn how to identify and handle statistical outliers in RapidMiner.
- Understand techniques for evaluating data quality.
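
The video does this work inside RapidMiner's interface. For readers who want to see the underlying ideas outside that tool, here is a minimal Python sketch of the two checks listed above. It is not taken from the video: the sample values, the function names, and the choice of the 1.5 × IQR rule for flagging outliers are all assumptions made for illustration.

```python
import statistics

# Hypothetical sample column; None marks a missing value.
ages = [34, 29, None, 41, 38, 250, 33, None, 36]


def find_missing(values):
    """Return the positions of missing (None) entries."""
    return [i for i, v in enumerate(values) if v is None]


def find_outliers(values, k=1.5):
    """Return present values outside [Q1 - k*IQR, Q3 + k*IQR].

    The 1.5 * IQR fence is one common convention, assumed here;
    other definitions of "outlier" are equally valid.
    """
    present = [v for v in values if v is not None]
    q1, _, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in present if v < low or v > high]


def clean(values, k=1.5):
    """Drop missing entries and flagged outliers, keeping original order."""
    outliers = set(find_outliers(values, k))
    return [v for v in values if v is not None and v not in outliers]


print("missing at:", find_missing(ages))
print("outliers:", find_outliers(ages))
print("cleaned:", clean(ages))
```

Dropping rows is only one way to handle flagged values. Depending on the data, replacing a missing value with a typical one or investigating an outlier before removing it may be more appropriate.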
Other videos in this series:
Does Correlation Prove Causation in Predictive Analytics?
How Do I Choose the Correct Predictive Model for My Organizational Questions?
- Title: How Can I Clean My Data for Use in a Predictive Model?
- Release date: May 2017
- Publisher(s): Infinite Skills
- ISBN: 9781491990872