Data Visualization with Python

Understand, explore, and effectively present data using the powerful data visualization techniques of Python programming.

Key Features

  • Study key visualization tools and techniques with real-world data
  • Explore industry-standard plotting libraries, including Matplotlib and Seaborn
  • Breathe life into your visuals with exciting widgets and animations using Bokeh

Book Description

Data Visualization with Python reviews the spectrum of data visualization and its importance. Designed for beginners, it introduces basic statistics by having you compute the mean, median, and variance of a set of numbers.

In the first few chapters, you'll take a quick tour of key NumPy and pandas techniques, including indexing, slicing, iterating, filtering, and grouping. The book keeps pace with your learning needs, introducing you to various visualization libraries. As you work through the chapters on Matplotlib and Seaborn, you'll discover how to create visualizations more easily. After these lessons, you can explore advanced visualization techniques such as geoplots and interactive plots.

You'll learn how to make sense of geospatial data, create interactive visualizations that can be integrated into any webpage, and build beautiful visualizations from any dataset. What's more, you'll study how to plot geospatial data on a map using a choropleth plot and learn the basics of Bokeh, extending plots by adding widgets and animating the display of information.

By the end of this book, you'll be able to put your learning into practice with an engaging activity in which you work with a new dataset to create an insightful capstone visualization.

What you will learn

  • Understand and use various plot types with Python
  • Explore and work with different plotting libraries
  • Learn to create effective visualizations
  • Improve your Python data wrangling skills
  • Hone your skill set by using tools like Matplotlib, Seaborn, and Bokeh
  • Reinforce your knowledge of various data formats and representations

Who this book is for

Data Visualization with Python is designed for developers and scientists who want to get into data science or use data visualizations to enrich their personal and professional projects. You do not need any prior experience in data analytics or visualization; however, some knowledge of Python and familiarity with high-school-level mathematics will help. Even though this is a beginner-level course on data visualization, experienced developers will be able to improve their Python skills by working with real-world data.

Table of Contents

  1. Preface
    1. About the Book
      1. About the Author
      2. Objectives
      3. Audience
      4. Approach
      5. Minimum Hardware Requirements
      6. Software Requirements
      7. Conventions
      8. Installation and Setup
      9. Working with JupyterLab and Jupyter Notebook
      10. Importing Python Libraries
      11. Installing the Code Bundle
      12. Additional Resources
  2. Chapter 1
  3. The Importance of Data Visualization and Data Exploration
    1. Introduction
      1. Introduction to Data Visualization
      2. The Importance of Data Visualization
      3. Data Wrangling
      4. Tools and Libraries for Visualization
    2. Overview of Statistics
      1. Measures of Central Tendency
      2. Measures of Dispersion
      3. Correlation
      4. Types of Data
      5. Summary Statistics
    3. NumPy
      1. Exercise 1: Loading a Sample Dataset and Calculating the Mean
      2. Activity 1: Using NumPy to Compute the Mean, Median, Variance, and Standard Deviation for the Given Numbers
      3. Basic NumPy Operations
      4. Activity 2: Indexing, Slicing, Splitting, and Iterating
      5. Advanced NumPy Operations
      6. Activity 3: Filtering, Sorting, Combining, and Reshaping
    4. pandas
      1. Advantages of pandas over NumPy
      2. Disadvantages of pandas
      3. Exercise 2: Loading a Sample Dataset and Calculating the Mean
      4. Activity 4: Using pandas to Compute the Mean, Median, and Variance for the Given Numbers
      5. Basic Operations of pandas
      6. Series
      7. Activity 5: Indexing, Slicing, and Iterating Using pandas
      8. Advanced pandas Operations
      9. Activity 6: Filtering, Sorting, and Reshaping
    5. Summary
  4. Chapter 2
  5. All You Need to Know About Plots
    1. Introduction
    2. Comparison Plots
      1. Line Chart
      2. Bar Chart
      3. Radar Chart
      4. Activity 7: Employee Skill Comparison
    3. Relation Plots
      1. Scatter Plot
      2. Bubble Plot
      3. Correlogram
      4. Heatmap
      5. Activity 8: Road Accidents Occurring over Two Decades
    4. Composition Plots
      1. Pie Chart
      2. Stacked Bar Chart
      3. Stacked Area Chart
      4. Activity 9: Smartphone Sales Units
      5. Venn Diagram
    5. Distribution Plots
      1. Histogram
      2. Density Plot
      3. Box Plot
      4. Violin Plot
      5. Activity 10: Frequency of Trains during Different Time Intervals
    6. Geo Plots
      1. Dot Map
      2. Choropleth Map
      3. Connection Map
    7. What Makes a Good Visualization?
      1. Activity 11: Identifying the Ideal Visualization
    8. Summary
  6. Chapter 3
  7. A Deep Dive into Matplotlib
    1. Introduction
    2. Overview of Plots in Matplotlib
    3. Pyplot Basics
      1. Creating Figures
      2. Closing Figures
      3. Format Strings
      4. Plotting
      5. Plotting Using pandas DataFrames
      6. Displaying Figures
      7. Saving Figures
      8. Exercise 3: Creating a Simple Visualization
    4. Basic Text and Legend Functions
      1. Labels
      2. Titles
      3. Text
      4. Annotations
      5. Legends
      6. Activity 12: Visualizing Stock Trends by Using a Line Plot
    5. Basic Plots
      1. Bar Chart
      2. Activity 13: Creating a Bar Plot for Movie Comparison
      3. Pie Chart
      4. Exercise 4: Creating a Pie Chart for Water Usage
      5. Stacked Bar Chart
      6. Activity 14: Creating a Stacked Bar Plot to Visualize Restaurant Performance
      7. Stacked Area Chart
      8. Activity 15: Comparing Smartphone Sales Units Using a Stacked Area Chart
      9. Histogram
      10. Box Plot
      11. Activity 16: Using a Histogram and a Box Plot to Visualize the Intelligence Quotient
      12. Scatter Plot
      13. Activity 17: Using a Scatter Plot to Visualize Correlation Between Various Animals
      14. Bubble Plot
    6. Layouts
      1. Subplots
      2. Tight Layout
      3. Radar Charts
      4. Exercise 5: Working on Radar Charts
      5. GridSpec
      6. Activity 18: Creating a Scatter Plot with Marginal Histograms
    7. Images
      1. Basic Image Operations
      2. Activity 19: Plotting Multiple Images in a Grid
    8. Writing Mathematical Expressions
    9. Summary
  8. Chapter 4
  9. Simplifying Visualizations Using Seaborn
    1. Introduction
      1. Advantages of Seaborn
    2. Controlling Figure Aesthetics
      1. Seaborn Figure Styles
      2. Removing Axes Spines
      3. Contexts
      4. Activity 20: Comparing IQ Scores for Different Test Groups by Using a Box Plot
    3. Color Palettes
      1. Categorical Color Palettes
      2. Sequential Color Palettes
      3. Diverging Color Palettes
      4. Activity 21: Using Heatmaps to Find Patterns in Flight Passengers' Data
    4. Interesting Plots in Seaborn
      1. Bar Plots
      2. Activity 22: Movie Comparison Revisited
      3. Kernel Density Estimation
      4. Plotting Bivariate Distributions
      5. Visualizing Pairwise Relationships
      6. Violin Plots
      7. Activity 23: Comparing IQ Scores for Different Test Groups by Using a Violin Plot
    5. Multi-Plots in Seaborn
      1. FacetGrid
      2. Activity 24: Top 30 YouTube Channels
    6. Regression Plots
      1. Activity 25: Linear Regression
    7. Squarify
      1. Activity 26: Water Usage Revisited
    8. Summary
  10. Chapter 5
  11. Plotting Geospatial Data
    1. Introduction
      1. The Design Principles of Geoplotlib
      2. Geospatial Visualizations
      3. Exercise 6: Visualizing Simple Geospatial Data
      4. Activity 27: Plotting Geospatial Data on a Map
      5. Exercise 7: Choropleth Plot with GeoJSON Data
    2. Tile Providers
      1. Exercise 8: Visually Comparing Different Tile Providers
    3. Custom Layers
      1. Activity 28: Working with Custom Layers
    4. Summary
  12. Chapter 6
  13. Making Things Interactive with Bokeh
    1. Introduction
      1. Concepts of Bokeh
      2. Interfaces in Bokeh
      3. Output
      4. Bokeh Server
      5. Presentation
      6. Integrating
      7. Exercise 9: Plotting with Bokeh
      8. Exercise 10: Comparing the Plotting and Models Interfaces
    2. Adding Widgets
      1. Exercise 11: Basic Interactivity Widgets
      2. Activity 29: Extending Plots with Widgets
    3. Summary
  14. Chapter 7
  15. Combining What We Have Learned
    1. Introduction
      1. Activity 30: Implementing Matplotlib and Seaborn on New York City Database
      2. Bokeh
      3. Activity 31: Visualizing Bokeh Stock Prices
      4. Geoplotlib
      5. Activity 32: Analyzing Airbnb Data with Geoplotlib
    2. Summary
  16. Appendix
    1. Chapter 1: The Importance of Data Visualization and Data Exploration
      1. Activity 1: Using NumPy to Compute the Mean, Median, Variance, and Standard Deviation for the Given Numbers
      2. Activity 2: Indexing, Slicing, Splitting, and Iterating
      3. Activity 3: Filtering, Sorting, Combining, and Reshaping
      4. Activity 4: Using pandas to Compute the Mean, Median, and Variance for the Given Numbers
      5. Activity 5: Indexing, Slicing, and Iterating Using pandas
      6. Activity 6: Filtering, Sorting, and Reshaping
    2. Chapter 2: All You Need to Know about Plots
      1. Activity 7: Employee Skill Comparison
      2. Activity 8: Road Accidents Occurring over Two Decades
      3. Activity 9: Smartphone Sales Units
      4. Activity 10: Frequency of Trains during Different Time Intervals
      5. Activity 11: Identifying the Ideal Visualization
    3. Chapter 3: A Deep Dive into Matplotlib
      1. Activity 12: Visualizing Stock Trends by Using a Line Plot
      2. Activity 13: Creating a Bar Plot for Movie Comparison
      3. Activity 14: Creating a Stacked Bar Plot to Visualize Restaurant Performance
      4. Activity 15: Comparing Smartphone Sales Units Using a Stacked Area Chart
      5. Activity 16: Using a Histogram and a Box Plot to Visualize the Intelligence Quotient
      6. Activity 17: Using a Scatter Plot to Visualize Correlation between Various Animals
      7. Activity 18: Creating a Scatter Plot with Marginal Histograms
      8. Activity 19: Plotting Multiple Images in a Grid
    4. Chapter 4: Simplifying Visualizations Using Seaborn
      1. Activity 20: Comparing IQ Scores for Different Test Groups by Using a Box Plot
      2. Activity 21: Using Heatmaps to Find Patterns in Flight Passengers' Data
      3. Activity 22: Movie Comparison Revisited
      4. Activity 23: Comparing IQ Scores for Different Test Groups by Using a Violin Plot
      5. Activity 24: Top 30 YouTube Channels
      6. Activity 25: Linear Regression
      7. Activity 26: Water Usage Revisited
    5. Chapter 5: Plotting Geospatial Data
      1. Activity 27: Plotting Geospatial Data on a Map
      2. Activity 28: Working with Custom Layers
    6. Chapter 6: Making Things Interactive with Bokeh
      1. Activity 29: Extending Plots with Widgets
    7. Chapter 7: Combining What We Have Learned
      1. Activity 30: Implementing Matplotlib and Seaborn on New York City Database
      2. Activity 31: Bokeh Stock Prices Visualization
      3. Activity 32: Analyzing Airbnb Data with Geoplotlib