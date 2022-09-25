
Book description
Get the definitive handbook for manipulating, processing, cleaning, and crunching datasets in Python. Updated for Python 3.10 and pandas 1.4, the third edition of this hands-on guide is packed with practical case studies that show you how to solve a broad set of data analysis problems effectively. You'll learn the latest versions of pandas, NumPy, and Jupyter in the process.
Written by Wes McKinney, the creator of the Python pandas project, this book is a practical, modern introduction to data science tools in Python. It's ideal for analysts new to Python and for Python programmers new to data science and scientific computing. Data files and related material are available on GitHub.
- Use the Jupyter notebook and IPython shell for exploratory computing
- Learn basic and advanced features in NumPy
- Get started with data analysis tools in the pandas library
- Use flexible tools to load, clean, transform, merge, and reshape data
- Create informative visualizations with matplotlib
- Apply the pandas groupby facility to slice, dice, and summarize datasets
- Analyze and manipulate regular and irregular time series data
- Learn how to solve real-world data analysis problems with thorough, detailed examples
Table of contents
- Preface
  - New for the Third Edition
  - Conventions Used in This Book
  - Using Code Examples
  - O’Reilly Safari
  - How to Contact Us
  - Acknowledgments
- Preliminaries
  - 1.1 What Is This Book About?
  - 1.2 Why Python for Data Analysis?
  - 1.3 Essential Python Libraries
  - 1.4 Installation and Setup
  - 1.5 Community and Conferences
  - 1.6 Navigating This Book
- Python Language Basics, IPython, and Jupyter Notebooks
  - 2.1 The Python Interpreter
  - 2.2 IPython Basics
  - 2.3 Python Language Basics
  - 2.4 Conclusion
- Built-in Data Structures, Functions, and Files
  - 3.1 Data Structures and Sequences
  - 3.2 Functions
  - 3.3 Files and the Operating System
  - 3.4 Conclusion
- NumPy Basics: Arrays and Vectorized Computation
  - 4.1 The NumPy ndarray: A Multidimensional Array Object
  - 4.2 Universal Functions: Fast Element-Wise Array Functions
  - 4.3 Array-Oriented Programming with Arrays
  - 4.4 File Input and Output with Arrays
  - 4.5 Linear Algebra
  - 4.6 Pseudorandom Number Generation
  - 4.7 Example: Random Walks
  - 4.8 Conclusion
- Getting Started with pandas
  - 5.1 Introduction to pandas Data Structures
  - 5.2 Essential Functionality
  - 5.3 Summarizing and Computing Descriptive Statistics
  - 5.4 Conclusion
- Data Loading, Storage, and File Formats
  - 6.1 Reading and Writing Data in Text Format
  - 6.2 Binary Data Formats
  - 6.3 Interacting with Web APIs
  - 6.4 Interacting with Databases
  - 6.5 Conclusion
- Data Cleaning and Preparation
  - 7.1 Handling Missing Data
  - 7.2 Data Transformation
  - 7.3 String Manipulation
  - 7.4 Conclusion
- Data Wrangling: Join, Combine, and Reshape
  - 8.1 Hierarchical Indexing
  - 8.2 Combining and Merging Datasets
  - 8.3 Reshaping and Pivoting
  - 8.4 Conclusion
- Plotting and Visualization
  - 9.1 A Brief matplotlib API Primer
  - 9.2 Plotting with pandas and seaborn
  - 9.3 Other Python Visualization Tools
  - 9.4 Conclusion
- Data Aggregation and Group Operations
  - 10.1 How to Think About Group Operations
  - 10.2 Data Aggregation
  - 10.3 Apply: General split-apply-combine
  - 10.4 Pivot Tables and Cross-Tabulation
  - 10.5 Conclusion
- Time Series
  - 11.1 Date and Time Data Types and Tools
  - 11.2 Time Series Basics
  - 11.3 Date Ranges, Frequencies, and Shifting
  - 11.4 Time Zone Handling
  - 11.5 Periods and Period Arithmetic
  - 11.6 Resampling and Frequency Conversion
  - 11.7 Moving Window Functions
  - 11.8 Conclusion
- Advanced pandas
  - 12.1 Categorical Data
  - 12.2 Advanced GroupBy Use
  - 12.3 Extension Arrays
  - 12.4 Techniques for Method Chaining
  - 12.5 Conclusion
- Introduction to Modeling Libraries in Python
  - 13.1 Interfacing Between pandas and Model Code
  - 13.2 Creating Model Descriptions with Patsy
  - 13.3 Introduction to statsmodels
  - 13.4 Introduction to scikit-learn
  - 13.5 Continuing Your Education
- Data Analysis Examples
  - 14.1 1.USA.gov Data from Bitly
  - 14.2 MovieLens 1M Dataset
  - 14.3 US Baby Names 1880–2010
  - 14.4 USDA Food Database
  - 14.5 2012 Federal Election Commission Database
  - 14.6 Conclusion
- Advanced NumPy
  - A.1 ndarray Object Internals
  - A.2 Advanced Array Manipulation
  - A.3 Broadcasting
  - A.4 Advanced ufunc Usage
  - A.5 Structured and Record Arrays
  - A.6 More About Sorting
  - A.7 Writing Fast NumPy Functions with Numba
  - A.8 Advanced Array Input and Output
  - A.9 Performance Tips
- More on the IPython System
  - B.1 Using the Command History
  - B.2 Interacting with the Operating System
  - B.3 Software Development Tools
  - B.4 Tips for Productive Code Development Using IPython
  - B.5 Advanced IPython Features
  - B.6 Conclusion
- Index
- About the Author
Product information
- Title: Python for Data Analysis, 3rd Edition
- Author(s): Wes McKinney
- Release date: September 2022
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781098104030