Mastering Matplotlib

Name: Mastering Matplotlib
Author: Duncan M. McGreggor
ISBN: 9781783987542

June 2015

Beginner to intermediate

292 pages

6h 16m

English

Packt Publishing

Table of Contents
Mastering matplotlib
Credits
About the Author
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers, and more
Why subscribe?
Free access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for

Conventions
Reader feedback
Customer support
Downloading the example code
Downloading the color images of this book
Errata
Piracy
Questions
1. Getting Up to Speed
A brief historical overview of matplotlib
What's new in matplotlib 1.4
The intermediate matplotlib user
Prerequisites for this book
Python 3
Coding style
Installing matplotlib
Using IPython Notebooks with matplotlib
Advanced plots – a preview
Setting up the interactive backend
Joint plots with Seaborn
Scatter plot matrix graphs with Pandas
Summary
2. The matplotlib Architecture
The original design goals
The current matplotlib architecture
The backend layer
FigureCanvasBase
RendererBase
Event
Visualizing the backend layer
The artist layer
Primitives
Containers
Collections
A view of the artist layer
The scripting layer
The supporting components of the matplotlib stack
matplotlib modules
Exploring the filesystem
Exploring imports visually
ModuleFinder
ModGrapher
The execution flow
An overview of the script
An interactive session
The matplotlib architecture as it relates to this book
Summary
3. matplotlib APIs and Integrations
The procedural pylab API
The pyplot scripting API
The matplotlib object-oriented API
Equations
Helper classes
The Plotter class
Running the jobs
matplotlib in other frameworks
An important note on IPython
Summary
4. Event Handling and Interactive Plots
Event loops in matplotlib
Event-based systems
The event loop
GUI toolkit main loops
IPython Notebook event loops
matplotlib event loops
Event handling
Mouse events
Keyboard events
Axes and figure events
Object picking
Compound event handling
The navigation toolbar
Specialized events
Interactive panning and zooming
Summary
5. High-level Plotting and Data Analysis
High-level plotting
Historical background
matplotlib
NetworkX
Pandas
The grammar of graphics
Bokeh
The ŷhat ggplot
New styles in matplotlib
Seaborn
Data analysis
Pandas, SciPy, and Seaborn
Examining and shaping a dataset
Analysis of temperature
Analysis of precipitation
Summary
6. Customization and Configuration
Customization
Creating a custom style
Subplots
Revisiting Pandas
Individual plots
Bringing everything together
Further explorations in customization
Configuration
The run control for matplotlib
File and directory locations
Using the matplotlibrc file
Updating the settings dynamically
Options in IPython
Summary
7. Deploying matplotlib in Cloud Environments
Making a use case for matplotlib in the Cloud
The data source
Defining a workflow
Choosing technologies
Configuration management
Types of deployment
An example – AWS and Docker
Getting set up locally
Requirements
Dockerfiles and the Docker images
Extending a Docker image
Building a new image
Preparing for deployment
Getting the setup on AWS
Pushing the source data to S3
Creating a host server on EC2
Using Docker on EC2
Reading and writing with S3
Running the task
Environment variables and Docker
Changes to the Python module
Execution
Summary
8. matplotlib and Big Data
Big data
Working with large data sources
An example problem
Big data on the filesystem
NumPy's memmap function
HDF5 and PyTables
Distributed data
MapReduce
Open source options
An example – working with data on EMR
Visualizing large data
Finding the limits of matplotlib
Agg rendering with matplotlibrc
Decimation
Additional techniques
Other visualization tools
Summary
9. Clustering for matplotlib
Clustering and parallel programming
The custom ZeroMQ cluster
Estimating the value of π
Creating the ZeroMQ components
Working with the results
Clustering with IPython
Getting started
The direct view
The load-balanced view
The parallel magic functions
An example – estimating the value of π
More clustering
Summary
Index

Content preview from Mastering Matplotlib

Working with large data sources

Most of the data that users feed into matplotlib when generating plots comes from NumPy. NumPy is one of the fastest ways of processing numerical and array-based data in Python (if not the fastest), so this makes sense. However, by default, NumPy works on data held entirely in memory. If the dataset that you want to plot is larger than the total RAM available on your system, performance is going to plummet.
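
As a rough illustration of that limit, the sketch below estimates how many bytes an array would occupy and compares the figure with a given amount of RAM. It is not taken from the book: the helper names `array_nbytes` and `fits_in_memory` and the 4 GiB figure are assumptions made for the example, and the check ignores memory used by the interpreter, by other processes, and by any temporary copies made while plotting.

```python
import math

import numpy as np


def array_nbytes(shape, dtype=np.float64):
    """Bytes NumPy needs to hold an array of this shape and dtype in memory."""
    return math.prod(shape) * np.dtype(dtype).itemsize


def fits_in_memory(shape, dtype, total_ram_bytes):
    """True if the array alone would fit in the given amount of RAM."""
    return array_nbytes(shape, dtype) <= total_ram_bytes


# Hypothetical figures: one billion float64 samples on a machine with 4 GiB of RAM.
# On a real system the total could be read with psutil.virtual_memory().total.
samples = (1_000_000_000,)
ram = 4 * 1024**3

print(array_nbytes(samples, np.float64))         # size of the array in bytes
print(fits_in_memory(samples, np.float64, ram))  # whether it fits in 4 GiB
```

Running a check like this before loading a file gives an early warning that the data will have to be read in pieces or kept on disk rather than loaded in one call.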

In the following section, we're going to take a look at an example that illustrates this limitation. But first, let's get our notebook set up, as follows:

In [1]: import matplotlib
        matplotlib.use('nbagg')
        %matplotlib inline

Here are the modules that we are going to use:

In [2]: import glob, io, math, os
        import psutil
        import ...
