NumPy and pandas

NumPy is a fairly low-level array-manipulation library, and many other Python libraries for numerical and scientific computing are built on top of it.

One of these libraries is pandas, a high-level data-manipulation library. When you explore a dataset, you usually calculate descriptive statistics, group records by a certain characteristic, and merge tables. The pandas library provides convenient functions for each of these operations.
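As a minimal sketch of those three operations, the snippet below uses two small toy tables. The table names, column names, and values are hypothetical and invented purely for illustration; only `describe`, `groupby`, and `merge` are actual pandas calls.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only.
patients = pd.DataFrame({
    "patient_id": [1, 2, 3, 4],
    "group": ["a", "a", "b", "b"],
    "bmi": [22.0, 28.0, 31.0, 25.0],
})
visits = pd.DataFrame({
    "patient_id": [1, 2, 3, 4],
    "visits": [2, 5, 1, 3],
})

# Descriptive statistics (count, mean, std, min, quartiles, max) for one column.
summary = patients["bmi"].describe()

# Group by a characteristic and average within each group.
group_means = patients.groupby("group")["bmi"].mean()

# Merge the two tables on their shared key column.
merged = patients.merge(visits, on="patient_id")

print(summary)
print(group_means)
print(merged)
```

Each call returns a new pandas object (a Series or DataFrame), so the results can be chained or inspected further without modifying the original tables.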

Let's use the diabetes dataset from sklearn.datasets in this example. Each feature column in this dataset is standardized: it is centered to zero mean and scaled to unit L2 norm.
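One way to check the standardization claim is to load the data into a pandas DataFrame and compute the per-column mean and L2 norm. This is a sketch that assumes scikit-learn is installed and that `load_diabetes` is called with its default settings; the variable names are arbitrary.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_diabetes

# Load the bundled dataset and wrap the feature matrix in a DataFrame.
data = load_diabetes()
df = pd.DataFrame(data.data, columns=data.feature_names)

# Per-column mean and L2 norm (square root of the sum of squares).
col_means = df.mean()
col_norms = np.sqrt((df ** 2).sum())

print(df.shape)                     # (rows, features)
print(np.allclose(col_means, 0.0))  # True if every column has zero mean
print(np.allclose(col_norms, 1.0))  # True if every column has unit L2 norm
```

Using `np.allclose` rather than exact equality allows for the small floating-point error that remains after centering and scaling.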

The dataset contains 442 records with 10 features: age, sex, body mass index, average blood pressure, and six ...
