Chapter 3. Finding a Needle in a Haystack

Analyzing a dataset to find patterns is as much an art as it is a science. A dataset can carry many metrics, and the task is to find the needle in this haystack. For us, the needle is an insight in the data that we weren't aware of earlier. For instance, an insight could be that people who buy a particular brand of milk also tend to buy cereal of another brand. The retail store can then place those products near each other.
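As a rough illustration of the milk-and-cereal insight, the sketch below counts how often pairs of products appear in the same basket. The baskets, the product names, and the pair_counts helper are all made up for this example; a real analysis would work on actual transaction data and would likely use a dedicated association-rule technique rather than raw counts.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each set is one shopper's basket (made-up data).
baskets = [
    {"milk_brand_a", "cereal_brand_b", "bread"},
    {"milk_brand_a", "cereal_brand_b"},
    {"milk_brand_a", "eggs"},
    {"cereal_brand_b", "bread"},
]


def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of products."""
    counts = Counter()
    for basket in baskets:
        # Sort the items so each pair has a single canonical ordering.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


counts = pair_counts(baskets)

# Show the pairs that are bought together most often.
print(counts.most_common(3))
```

Pairs with high counts are candidates for shelving near each other, although raw counts alone do not account for how popular each product is on its own.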

Before you analyze a dataset, you should have a detailed understanding of both the data and the domain it is associated with. If it's a simple dataset that can be understood very easily, then the analysis ...
