Book description
Over 95 hands-on recipes to leverage the power of pandas for efficient scientific computation and data analysis
About This Book
 Use the power of pandas to solve the most complex scientific computing problems with ease
 Leverage fast, robust data structures in pandas to gain useful insights from your data
 Practical, easy-to-implement recipes for quick solutions to common data problems using pandas
Who This Book Is For
This book is for data scientists, analysts, and Python developers who wish to explore data analysis and scientific computing in a practical, hands-on manner. The recipes included in this book are suitable for both novice and advanced users, and contain helpful tips, tricks, and caveats wherever necessary. Some understanding of pandas will be helpful, but is not mandatory.
What You Will Learn
 Master the fundamentals of pandas to quickly begin exploring any dataset
 Isolate any subset of data by properly selecting and querying the data
 Split data into independent groups before applying aggregations and transformations to each group
 Restructure data into tidy form to make data analysis and visualization easier
 Prepare messy real-world datasets for machine learning
 Combine and merge data from different sources through pandas' SQL-like operations
 Utilize pandas' unparalleled time series functionality
 Create beautiful and insightful visualizations through pandas' direct hooks to Matplotlib and Seaborn
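As a rough illustration of three of the skills listed above (selecting a subset, grouping before aggregating, and restructuring into tidy form), here is a minimal sketch. The `scores` table and its column names are made-up toy data for this example and are not taken from the book's datasets.

```python
import pandas as pd

# Hypothetical toy data (an assumption for illustration only).
scores = pd.DataFrame({
    "state": ["TX", "TX", "CA", "CA"],
    "school": ["A", "B", "C", "D"],
    "math": [500, 540, 600, 560],
    "verbal": [480, 520, 580, 600],
})

# Isolate a subset of rows with a boolean condition.
high_math = scores[scores["math"] > 520]

# Split the data into groups by state, then aggregate each group.
state_means = scores.groupby("state")[["math", "verbal"]].mean()

# Restructure the wide score columns into tidy (long) form:
# one row per school and subject.
tidy = scores.melt(
    id_vars=["state", "school"],
    value_vars=["math", "verbal"],
    var_name="subject",
    value_name="score",
)
```

Here `high_math` keeps only the schools whose math score exceeds 520, `state_means` holds one row per state, and `tidy` has one row for each school and subject pair.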
In Detail
This book will provide you with unique, idiomatic, and fun recipes for both fundamental and advanced data manipulation tasks with pandas. Some recipes focus on achieving a deeper understanding of basic principles, or comparing and contrasting two similar operations. Other recipes will dive deep into a particular dataset, uncovering new and unexpected insights along the way.
The pandas library is massive, and it’s common for frequent users to be unaware of many of its more impressive features. The official pandas documentation, while thorough, does not contain many useful examples of how to piece together multiple commands like one would do during an actual analysis. This book guides you, as if you were looking over the shoulder of an expert, through practical situations that you are highly likely to encounter.
Many advanced recipes combine several different features across the pandas library to generate results.
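To give a sense of what combining several features can look like, the following is a small sketch that chains a merge, a derived column, a grouped aggregation, and a sort. The `flights` and `airlines` tables, their columns, and the 15-minute on-time threshold are assumptions invented for this example, not material from the book.

```python
import pandas as pd

# Hypothetical toy tables (assumptions for illustration only).
flights = pd.DataFrame({
    "airline": ["AA", "AA", "UA", "UA", "UA"],
    "delay_min": [5, 25, 0, 40, 10],
})
airlines = pd.DataFrame({
    "airline": ["AA", "UA"],
    "name": ["American", "United"],
})

summary = (
    flights
    .merge(airlines, on="airline", how="left")          # SQL-like join
    .assign(on_time=lambda df: df["delay_min"] < 15)    # derived boolean column
    .groupby("name")["on_time"]                         # split by airline name
    .mean()                                             # share of on-time flights
    .sort_values(ascending=False)                       # best airline first
)
```

The result is a Series indexed by airline name, holding the fraction of flights that met the assumed on-time threshold.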
Style and approach
The author relies on his vast experience teaching pandas in a professional setting to deliver very detailed explanations for each line of code in all of the recipes. All code and dataset explanations exist in Jupyter Notebooks, an excellent interface for exploring data.
Table of contents
 Title Page
 Copyright
 Credits
 About the Author
 Acknowledgement
 About the Reviewers
 www.PacktPub.com
 Customer Feedback
 Preface

Pandas Foundations
 Introduction
 Dissecting the anatomy of a DataFrame
 Accessing the main DataFrame components
 Understanding data types
 Selecting a single column of data as a Series
 Calling Series methods
 Working with operators on a Series
 Chaining Series methods together
 Making the index meaningful
 Renaming row and column names
 Creating and deleting columns

Essential DataFrame Operations
 Introduction
 Selecting multiple DataFrame columns
 Selecting columns with methods
 Ordering column names sensibly
 Operating on the entire DataFrame
 Chaining DataFrame methods together
 Working with operators on a DataFrame
 Comparing missing values
 Transposing the direction of a DataFrame operation
 Determining college campus diversity

Beginning Data Analysis

Selecting Subsets of Data

Boolean Indexing
 Introduction
 Calculating boolean statistics
 Constructing multiple boolean conditions
 Filtering with boolean indexing
 Replicating boolean indexing with index selection
 Selecting with unique and sorted indexes
 Gaining perspective on stock prices
 Translating SQL WHERE clauses
 Determining the normality of stock market returns
 Improving readability of boolean indexing with the query method
 Preserving Series with the where method
 Masking DataFrame rows
 Selecting with booleans, integer location, and labels

Index Alignment

Grouping for Aggregation, Filtration, and Transformation
 Introduction
 Defining an aggregation
 Grouping and aggregating with multiple columns and functions
 Removing the MultiIndex after grouping
 Customizing an aggregation function
 Customizing aggregating functions with *args and **kwargs
 Examining the groupby object
 Filtering for states with a minority majority
 Transforming through a weight loss bet
 Calculating weighted mean SAT scores per state with apply
 Grouping by continuous variables
 Counting the total number of flights between cities
 Finding the longest streak of on-time flights

Restructuring Data into a Tidy Form
 Introduction
 Tidying variable values as column names with stack
 Tidying variable values as column names with melt
 Stacking multiple groups of variables simultaneously
 Inverting stacked data
 Unstacking after a groupby aggregation
 Replicating pivot_table with a groupby aggregation
 Renaming axis levels for easy reshaping
 Tidying when multiple variables are stored as column names
 Tidying when multiple variables are stored as column values
 Tidying when two or more values are stored in the same cell
 Tidying when variables are stored in column names and values
 Tidying when multiple observational units are stored in the same table

Combining Pandas Objects

Time Series Analysis
 Introduction
 Understanding the difference between Python and pandas date tools
 Slicing time series intelligently
 Using methods that only work with a DatetimeIndex
 Counting the number of weekly crimes
 Aggregating weekly crime and traffic accidents separately
 Measuring crime by weekday and year
 Grouping with anonymous functions with a DatetimeIndex
 Grouping by a Timestamp and another column
 Finding the last time crime was 20% lower with merge_asof

Visualization with Matplotlib, Pandas, and Seaborn
 Introduction
 Getting started with matplotlib
 Visualizing data with matplotlib
 Plotting basics with pandas
 Visualizing the flights dataset
 Stacking area charts to discover emerging trends
 Understanding the differences between seaborn and pandas
 Doing multivariate analysis with seaborn Grids
 Uncovering Simpson's paradox in the diamonds dataset with seaborn
Product information
 Title: Pandas Cookbook
 Author(s):
 Release date: October 2017
 Publisher(s): Packt Publishing
 ISBN: 9781784393878