Skip to Content
Python Data Science Handbook, 2nd Edition
book

Python Data Science Handbook, 2nd Edition

by Jake VanderPlas
December 2022
Beginner to intermediate
588 pages
13h 43m
English
O'Reilly Media, Inc.
Content preview from Python Data Science Handbook, 2nd Edition

Part II. Introduction to NumPy

This part of the book, along with Part III, outlines techniques for effectively loading, storing, and manipulating in-memory data in Python. The topic is very broad: datasets can come from a wide range of sources and in a wide range of formats, including collections of documents, collections of images, collections of sound clips, collections of numerical measurements, or nearly anything else. Despite this apparent heterogeneity, many datasets can be represented fundamentally as arrays of numbers.

For example, images—particularly digital images—can be thought of as simply two-dimensional arrays of numbers representing pixel brightness across the area. Sound clips can be thought of as one-dimensional arrays of intensity versus time. Text can be converted in various ways into numerical representations, such as binary digits representing the frequency of certain words or pairs of words. No matter what the data is, the first step in making it analyzable will be to transform it into arrays of numbers. (We will discuss some specific examples of this process in Chapter 40.)

For this reason, efficient storage and manipulation of numerical arrays is absolutely fundamental to the process of doing data science. We’ll now take a look at the specialized tools that Python has for handling such numerical arrays: the NumPy package and the Pandas package (discussed in Part III).

This part of the book will cover NumPy in detail. NumPy (short for Numerical Python) provides ...

Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.
Start your free trial

You might also like

Python Data Science Handbook

Python Data Science Handbook

Jake VanderPlas

Publisher Resources

ISBN: 9781098121211Errata PageSupplemental Content