Preface
Most people imagine data science to be focused on advanced math and machine learning techniques. In reality, most data scientists spend a significant share of their time (70%–80%) on a variety of tasks often called “data munging,” including data cleansing and normalization, aggregation, sampling, transformation, and other forms of feature generation.
These activities are often considered low-value or “grunt work,” but they are actually interesting and sometimes require machine learning to accomplish. The resulting set of skills is a complex mishmash of normal data cleansing and extraction techniques that most data analysts or software engineers will recognize and more advanced skills that would normally be seen ...