Python Data Science Handbook, 2nd Edition

by Jake VanderPlas

Released December 2022

Publisher(s): O'Reilly Media, Inc.

ISBN: 9781098121228

Book description

Python is a first-class tool for many researchers, primarily because of its libraries for storing, manipulating, and gaining insight from data. Several resources exist for individual pieces of this data science stack, but only with the new edition of Python Data Science Handbook do you get them all: IPython, NumPy, pandas, Matplotlib, scikit-learn, and other related tools.

Working scientists and data crunchers familiar with reading and writing Python code will find the second edition of this comprehensive desk reference ideal for tackling day-to-day issues: manipulating, transforming, and cleaning data; visualizing different types of data; and using data to build statistical or machine learning models. Quite simply, this is the must-have reference for scientific computing in Python.

With this handbook, you'll learn how:

IPython and Jupyter provide computational environments for scientists using Python
NumPy includes the ndarray for efficient storage and manipulation of dense data arrays
Pandas contains the DataFrame for efficient storage and manipulation of labeled/columnar data
Matplotlib includes capabilities for a flexible range of data visualizations
Scikit-learn helps you build efficient and clean Python implementations of the most important and established machine learning algorithms