Python Data Science Essentials

by Alberto Boschetti
April 2015
Content level: Beginner
258 pages
5h 48m
English
Packt Publishing

Chapter 3. The Data Science Pipeline

So far, we have explored how to load data into Python and process it into a dataset in the form of a two-dimensional NumPy array of numeric values. We are now ready to immerse ourselves fully in data science: extracting meaning from data and building potential data products. This chapter and the next one, on machine learning, are the most challenging sections of the entire book.
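As a reminder of what such a dataset looks like, here is a minimal sketch. The values are made up for illustration and are not taken from the book: each row is one observation and each column is one numeric feature.

```python
import numpy as np

# Hypothetical toy dataset: 4 observations (rows) by 3 numeric features (columns).
X = np.array([
    [5.1, 3.5, 1.4],
    [4.9, 3.0, 1.4],
    [6.2, 2.9, 4.3],
    [5.9, 3.0, 5.1],
])

# A two-dimensional array exposes its layout through ndim and shape.
n_samples, n_features = X.shape
print(X.ndim, n_samples, n_features)
```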

In this chapter, you will learn how to:

  • Briefly explore data and create new features
  • Reduce the dimensionality of data
  • Spot and treat outliers
  • Choose the score or loss metrics best suited to your project
  • Apply the scientific method and effectively test the performance of your machine learning hypotheses
  • Select the best feature ...
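
To give a rough feel for how the steps listed above fit together, the following is a minimal NumPy-only sketch. It is not the book's code: the synthetic data, the 3-standard-deviation outlier rule, the SVD-based projection, the 75/25 holdout split, and the helper name `holdout_mse` are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration): 200 rows, 3 numeric features.
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=200)

# Create a new feature: the product of the first two columns.
X_new = np.column_stack([X, X[:, 0] * X[:, 1]])

# Spot and treat outliers: keep rows whose standardized values
# all lie within 3 standard deviations of the column mean.
z = (X_new - X_new.mean(axis=0)) / X_new.std(axis=0)
keep = (np.abs(z) < 3).all(axis=1)
X_clean, y_clean = X_new[keep], y[keep]

# Reduce dimensionality: project the centered data onto its first two
# principal directions, obtained from an SVD. For brevity this uses all
# rows; in a real project the projection would be fit on training rows only.
X_centered = X_clean - X_clean.mean(axis=0)
_, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
X_reduced = X_centered @ Vt[:2].T


def holdout_mse(features, target, n_train):
    """Fit least squares on the first n_train rows; return MSE on the rest."""
    A = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(A[:n_train], target[:n_train], rcond=None)
    residuals = A[n_train:] @ coef - target[n_train:]
    return float(np.mean(residuals ** 2))


# Test the hypothesis on held-out data, using mean squared error as the loss.
n_train = int(0.75 * len(X_clean))
mse_full = holdout_mse(X_clean, y_clean, n_train)
mse_reduced = holdout_mse(X_reduced, y_clean, n_train)
print(X_clean.shape, X_reduced.shape, mse_full, mse_reduced)
```

Comparing `mse_full` with `mse_reduced` on the held-out rows is one simple way to check whether a reduced representation still carries the signal the model needs.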

ISBN: 9781785280429