5. Data Munging with Hadoop
If you torture the data long enough, it will confess.
Ronald Coase, Economist
In This Chapter:
What data quality is, the types of data quality issues that arise, and how to address them with Hadoop
The importance of feature generation, various types of features, and how to generate features for your model with Hadoop
Feature selection and dimensionality reduction and their importance in addressing the ...