Python: Real-World Data Science

by Dusty Phillips, Fabrizio Romano, Phuong Vo.T.H, Martin Czygan, Robert Layton, Sebastian Raschka
June 2016
Content level: Beginner to intermediate
1255 pages
29h 1m
English
Packt Publishing

Preprocessing using pipelines

When taking measurements of real-world objects, we often get features with very different ranges. For instance, if we are measuring the qualities of an animal, we might have several features, as follows:

  • Number of legs: This ranges from 0 to 8 for most animals, while some have many more!
  • Weight: This ranges from only a few micrograms all the way to a blue whale, with a weight of 190,000 kilograms!
  • Number of hearts: This can be between zero and five, as in the case of the earthworm.

For a mathematically based algorithm to compare these features, the differences in scale, range, and units can be difficult to interpret. If we used the above features in many algorithms, the weight would probably ...
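
As a rough illustration of how differing ranges can affect a comparison, here is a minimal sketch that assumes a Euclidean distance between animals and a simple min-max rescaling of each feature to the range 0 to 1. The animal values, the helper names (euclidean, min_max_scale), and the choice of min-max scaling are all assumptions made for this example, not something taken from the text above.

```python
import math

# Hypothetical animals: (number of legs, weight in kg, number of hearts).
# These values are illustrative only, not measured data.
animals = {
    "spider": (8, 0.001, 1),
    "dog": (4, 30.0, 1),
    "horse": (4, 500.0, 1),
}


def euclidean(a, b):
    """Euclidean distance between two equal-length feature tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def min_max_scale(rows):
    """Rescale each feature (column) to the range [0, 1].

    A constant column has no spread, so it is mapped to 0.0.
    """
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        tuple(
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(row, lows, highs)
        )
        for row in rows
    ]


names = list(animals)
raw = [animals[n] for n in names]
scaled = dict(zip(names, min_max_scale(raw)))

# Distances on the raw features: the weight column has by far the
# largest numeric spread, so it contributes most of each distance.
raw_dog_spider = euclidean(animals["dog"], animals["spider"])
raw_dog_horse = euclidean(animals["dog"], animals["horse"])

# Distances after rescaling: every feature now spans at most 0 to 1.
scaled_dog_spider = euclidean(scaled["dog"], scaled["spider"])
scaled_dog_horse = euclidean(scaled["dog"], scaled["horse"])

print("raw:   ", raw_dog_spider, raw_dog_horse)
print("scaled:", scaled_dog_spider, scaled_dog_horse)
```

Printing both pairs of distances side by side lets you see how much of each raw distance comes from the weight column alone, and how the comparison changes once every feature shares the same range. Other rescaling schemes would serve equally well for this purpose.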

Publisher Resources

ISBN: 9781786465160