Python Data Science Handbook, 2nd Edition

by Jake VanderPlas
Released December 2022
Publisher(s): O'Reilly Media, Inc.
ISBN: 9781098121204

Book description

Python is a first-class tool for many researchers, primarily because of its libraries for storing, manipulating, and gaining insight from data. Several resources exist for individual pieces of this data science stack, but only with the new edition of Python Data Science Handbook do you get them all: IPython, NumPy, Pandas, Matplotlib, Scikit-Learn, and other related tools.

In this second edition, working scientists and data crunchers familiar with reading and writing Python code will find this comprehensive desk reference ideal for tackling day-to-day issues: manipulating, transforming, and cleaning data; visualizing different types of data; and using data to build statistical or machine learning models. Quite simply, this is the must-have reference for scientific computing in Python.

With this handbook, you'll learn how:

  • IPython and Jupyter provide computational environments for scientists using Python
  • NumPy includes the ndarray for efficient storage and manipulation of dense data arrays
  • Pandas contains the DataFrame for efficient storage and manipulation of labeled/columnar data
  • Matplotlib includes capabilities for a flexible range of data visualizations
  • Scikit-Learn helps you build efficient and clean Python implementations of the most important and established machine learning algorithms

Table of contents

  1. Preface
    1. What Is Data Science?
    2. Who Is This Book For?
    3. Why Python?
    4. Outline of the Book
    5. Using Code Examples
    6. Installation Considerations
  2. 1. IPython: Beyond Normal Python
    1. Shell or Notebook?
      1. Launching the IPython Shell
      2. Launching the Jupyter Notebook
    2. Help and Documentation in IPython
      1. Accessing Documentation with ?
      2. Accessing Source Code with ??
      3. Exploring Modules with Tab-Completion
    3. Keyboard Shortcuts in the IPython Shell
      1. Navigation Shortcuts
      2. Text Entry Shortcuts
      3. Command History Shortcuts
      4. Miscellaneous Shortcuts
    4. IPython Magic Commands
      1. Pasting Code Blocks: %paste and %cpaste
      2. Running External Code: %run
      3. Timing Code Execution: %timeit
      4. Help on Magic Functions: ?, %magic, and %lsmagic
    5. Input and Output History
      1. IPython’s In and Out Objects
      2. Underscore Shortcuts and Previous Outputs
      3. Suppressing Output
      4. Related Magic Commands
    6. IPython and Shell Commands
      1. Quick Introduction to the Shell
      2. Shell Commands in IPython
      3. Passing Values to and from the Shell
    7. Shell-Related Magic Commands
    8. Errors and Debugging
      1. Controlling Exceptions: %xmode
      2. Debugging: When Reading Tracebacks Is Not Enough
    9. Profiling and Timing Code
      1. Timing Code Snippets: %timeit and %time
      2. Profiling Full Scripts: %prun
      3. Line-by-Line Profiling with %lprun
      4. Profiling Memory Use: %memit and %mprun
    10. More IPython Resources
      1. Web Resources
      2. Books
  3. 2. Introduction to NumPy
    1. Reminder about Built-In Documentation
    2. Understanding Data Types in Python
      1. A Python Integer Is More Than Just an Integer
      2. A Python List Is More Than Just a List
      3. Fixed-Type Arrays in Python
      4. Creating Arrays from Python Lists
      5. Creating Arrays from Scratch
      6. NumPy Standard Data Types
    3. The Basics of NumPy Arrays
      1. NumPy Array Attributes
      2. Array Indexing: Accessing Single Elements
      3. Array Slicing: Accessing Subarrays
      4. Reshaping of Arrays
      5. Array Concatenation and Splitting
    4. Computation on NumPy Arrays: Universal Functions
      1. The Slowness of Loops
      2. Introducing Ufuncs
      3. Exploring NumPy’s Ufuncs
      4. Advanced Ufunc Features
      5. Ufuncs: Learning More
    5. Aggregations: Min, Max, and Everything In Between
      1. Summing the Values in an Array
      2. Minimum and Maximum
      3. Example: What Is the Average Height of US Presidents?
    6. Computation on Arrays: Broadcasting
      1. Introducing Broadcasting
      2. Rules of Broadcasting
      3. Broadcasting in Practice
    7. Comparisons, Masks, and Boolean Logic
      1. Example: Counting Rainy Days
      2. Comparison Operators as Ufuncs
      3. Working with Boolean Arrays
      4. Boolean Arrays as Masks
      5. Aside: Using the Keywords and/or Versus the Operators &/|
    8. Fancy Indexing
      1. Exploring Fancy Indexing
      2. Combined Indexing
      3. Example: Selecting Random Points
      4. Modifying Values with Fancy Indexing
      5. Example: Binning Data
    9. Sorting Arrays
      1. Fast Sorting in NumPy: np.sort and np.argsort
      2. Partial Sorts: Partitioning
      3. Example: k-Nearest Neighbors
      4. Aside: Big-O Notation
    10. Structured Data: NumPy’s Structured Arrays
      1. Exploring Structured Array Creation
      2. More Advanced Compound Types
      3. RecordArrays: Structured Arrays with a Twist
      4. On to Pandas
  4. About the Author
