The Data Visualization Workshop

Book description

Explore a modern approach to visualizing data with Python and transform large real-world datasets into expressive visual graphics using this beginner-friendly workshop

Key Features

  • Discover the essential tools and methods of data visualization
  • Learn to use standard Python plotting libraries such as Matplotlib and Seaborn
  • Gain insights into the visualization techniques of big companies

Book Description

Do you want to transform data into captivating images? Do you want to make it easy for your audience to process and understand the patterns, trends, and relationships hidden within your data?

The Data Visualization Workshop will guide you through the world of data visualization and help you to unlock simple secrets for transforming data into meaningful visuals with the help of exciting exercises and activities.

Starting with an introduction to data visualization, this book first shows you how to prepare raw data for visualization using NumPy and pandas operations. As you progress, you'll use different plot types, such as comparison and distribution plots, to identify relationships and similarities between datasets. You'll then work through practical exercises to simplify the process of creating visualizations using Python plotting libraries such as Matplotlib and Seaborn. If you've ever wondered how popular companies like Uber and Airbnb use geoplotlib for geographical visualizations, this book walks you through the process. Finally, you'll use the Bokeh library to create dynamic visualizations that can be integrated into any web page.

By the end of this workshop, you'll have learned how to present engaging mission-critical insights by creating impactful visualizations with real-world data.

What you will learn

  • Understand the importance of data visualization in data science
  • Implement NumPy and pandas operations on real-life datasets
  • Create captivating data visualizations using plotting libraries
  • Use advanced techniques to plot geospatial data on a map
  • Integrate interactive visualizations into a web page
  • Visualize stock prices with Bokeh and analyze Airbnb data with geoplotlib

Who this book is for

The Data Visualization Workshop is for beginners who want to learn data visualization, as well as developers and data scientists who are looking to enrich their practical data science skills. Prior knowledge of data analytics, data science, and visualization is not mandatory. Knowledge of Python basics and high-school-level math will help you grasp the concepts covered in this data visualization book more quickly and effectively.

Table of contents

  1. The Data Visualization Workshop
  2. Preface
    1. About the Book
      1. Audience
      2. About the Chapters
      3. Conventions
      4. Code Presentation
      5. Setting up Your Environment
      6. Installing Python
        1. Installing Python on Windows
        2. Installing Python on Linux
        3. Installing Python on macOS
      7. Installing Libraries
      8. Working with JupyterLab and Jupyter Notebook
      9. Importing Python Libraries
      10. Accessing the Code Files
  3. 1. The Importance of Data Visualization and Data Exploration
    1. Introduction
      1. Introduction to Data Visualization
      2. The Importance of Data Visualization
      3. Data Wrangling
      4. Tools and Libraries for Visualization
    2. Overview of Statistics
      1. Measures of Central Tendency
      2. Measures of Dispersion
      3. Correlation
      4. Types of Data
      5. Summary Statistics
    3. NumPy
      1. Exercise 1.01: Loading a Sample Dataset and Calculating the Mean Using NumPy
      2. Activity 1.01: Using NumPy to Compute the Mean, Median, Variance, and Standard Deviation of a Dataset
      3. Basic NumPy Operations
        1. Indexing
        2. Slicing
        3. Splitting
        4. Iterating
      4. Exercise 1.02: Indexing, Slicing, Splitting, and Iterating
      5. Advanced NumPy Operations
        1. Filtering
        2. Sorting
        3. Combining
        4. Reshaping
      6. Exercise 1.03: Filtering, Sorting, Combining, and Reshaping
    4. pandas
      1. Advantages of pandas over NumPy
      2. Disadvantages of pandas
      3. Exercise 1.04: Loading a Sample Dataset and Calculating the Mean Using pandas
      4. Exercise 1.05: Using pandas to Compute the Mean, Median, and Variance of a Dataset
      5. Basic Operations of pandas
        1. Indexing
        2. Slicing
        3. Iterating
        4. Series
      6. Exercise 1.06: Indexing, Slicing, and Iterating Using pandas
      7. Advanced pandas Operations
        1. Filtering
        2. Sorting
        3. Reshaping
      8. Exercise 1.07: Filtering, Sorting, and Reshaping
      9. Activity 1.02: Forest Fire Size and Temperature Analysis
    5. Summary
  4. 2. All You Need to Know about Plots
    1. Introduction
    2. Comparison Plots
      1. Line Chart
        1. Uses
        2. Example
        3. Design Practices
      2. Bar Chart
        1. Use
        2. Don’ts of Bar Charts
        3. Examples
        4. Design Practices
      3. Radar Chart
        1. Uses
        2. Examples
        3. Design Practices
      4. Activity 2.01: Employee Skill Comparison
    3. Relation Plots
      1. Scatter Plot
        1. Uses
        2. Examples
        3. Design Practices
        4. Variants: Scatter Plots with Marginal Histograms
        5. Examples
      2. Bubble Plot
        1. Use
        2. Example
        3. Design Practices
      3. Correlogram
        1. Examples
        2. Design Practices
      4. Heatmap
        1. Use
        2. Examples
        3. Design Practice
      5. Activity 2.02: Road Accidents Occurring over Two Decades
    4. Composition Plots
      1. Pie Chart
        1. Use
        2. Examples
        3. Design Practices
        4. Variants: Donut Chart
        5. Design Practice
      2. Stacked Bar Chart
        1. Use
        2. Examples
        3. Design Practices
      3. Stacked Area Chart
        1. Use
        2. Examples
        3. Design Practice
      4. Activity 2.03: Smartphone Sales Units
      5. Venn Diagram
        1. Use
        2. Example
        3. Design Practice
    5. Distribution Plots
      1. Histogram
        1. Use
        2. Example
        3. Design Practice
      2. Density Plot
        1. Use
        2. Example
        3. Design Practice
      3. Box Plot
        1. Use
        2. Examples
      4. Violin Plot
        1. Use
        2. Examples
        3. Design Practice
      5. Activity 2.04: Frequency of Trains during Different Time Intervals
    6. Geoplots
      1. Dot Map
        1. Use
        2. Example
        3. Design Practices
      2. Choropleth Map
        1. Use
        2. Example
        3. Design Practices
      3. Connection Map
        1. Use
        2. Examples
        3. Design Practices
    7. What Makes a Good Visualization?
      1. Common Design Practices
      2. Activity 2.05: Analyzing Visualizations
      3. Activity 2.06: Choosing a Suitable Visualization
    8. Summary
  5. 3. A Deep Dive into Matplotlib
    1. Introduction
    2. Overview of Plots in Matplotlib
    3. Pyplot Basics
      1. Creating Figures
      2. Closing Figures
      3. Format Strings
      4. Plotting
      5. Plotting Using pandas DataFrames
      6. Ticks
      7. Displaying Figures
      8. Saving Figures
      9. Exercise 3.01: Creating a Simple Visualization
    4. Basic Text and Legend Functions
      1. Labels
      2. Titles
      3. Text
      4. Annotations
      5. Legends
      6. Activity 3.01: Visualizing Stock Trends by Using a Line Plot
    5. Basic Plots
      1. Bar Chart
      2. Activity 3.02: Creating a Bar Plot for Movie Comparison
      3. Pie Chart
      4. Exercise 3.02: Creating a Pie Chart for Water Usage
      5. Stacked Bar Chart
      6. Activity 3.03: Creating a Stacked Bar Plot to Visualize Restaurant Performance
      7. Stacked Area Chart
      8. Activity 3.04: Comparing Smartphone Sales Units Using a Stacked Area Chart
      9. Histogram
      10. Box Plot
      11. Activity 3.05: Using a Histogram and a Box Plot to Visualize Intelligence Quotient
      12. Scatter Plot
      13. Exercise 3.03: Using a Scatter Plot to Visualize Correlation between Various Animals
      14. Bubble Plot
    6. Layouts
      1. Subplots
      2. Tight Layout
      3. Radar Charts
      4. Exercise 3.04: Working on Radar Charts
      5. GridSpec
      6. Activity 3.06: Creating a Scatter Plot with Marginal Histograms
    7. Images
      1. Basic Image Operations
      2. Activity 3.07: Plotting Multiple Images in a Grid
    8. Writing Mathematical Expressions
    9. Summary
  6. 4. Simplifying Visualizations Using Seaborn
    1. Introduction
      1. Advantages of Seaborn
    2. Controlling Figure Aesthetics
      1. Seaborn Figure Styles
      2. Removing Axes Spines
      3. Controlling the Scale of Plot Elements
      4. Exercise 4.01: Comparing IQ Scores for Different Test Groups by Using a Box Plot
    3. Color Palettes
      1. Categorical Color Palettes
      2. Sequential Color Palettes
      3. Diverging Color Palettes
      4. Exercise 4.02: Surface Temperature Analysis
      5. Activity 4.01: Using Heatmaps to Find Patterns in Flight Passengers' Data
    4. Advanced Plots in Seaborn
      1. Bar Plots
      2. Activity 4.02: Movie Comparison Revisited
      3. Kernel Density Estimation
      4. Plotting Bivariate Distributions
      5. Visualizing Pairwise Relationships
      6. Violin Plots
      7. Activity 4.03: Comparing IQ Scores for Different Test Groups by Using a Violin Plot
    5. Multi-Plots in Seaborn
      1. FacetGrid
      2. Activity 4.04: Visualizing the Top 30 Music YouTube Channels Using Seaborn's FacetGrid
    6. Regression Plots
      1. Activity 4.05: Linear Regression for Animal Attribute Relations
    7. Squarify
      1. Exercise 4.03: Water Usage Revisited
      2. Activity 4.06: Visualizing the Impact of Education on Annual Salary and Weekly Working Hours
    8. Summary
  7. 5. Plotting Geospatial Data
    1. Introduction
      1. The Design Principles of geoplotlib
    2. Geospatial Visualizations
      1. Voronoi Tessellation
      2. Delaunay Triangulation
      3. Choropleth Plot
      4. Exercise 5.01: Plotting Poaching Density Using Dot Density and Histograms
      5. Activity 5.01: Plotting Geospatial Data on a Map
      6. The GeoJSON Format
      7. Exercise 5.02: Creating a Choropleth Plot with GeoJSON Data
    3. Tile Providers
      1. Exercise 5.03: Visually Comparing Different Tile Providers
    4. Custom Layers
      1. Exercise 5.04: Plotting the Movement of an Aircraft with a Custom Layer
      2. Activity 5.02: Visualizing City Density by the First Letter Using an Interactive Custom Layer
    5. Summary
  8. 6. Making Things Interactive with Bokeh
    1. Introduction
      1. Concepts of Bokeh
      2. Interfaces in Bokeh
      3. Output
      4. Bokeh Server
      5. Presentation
      6. Integrating
    2. Basic Plotting
      1. Exercise 6.01: Plotting with Bokeh
      2. Exercise 6.02: Comparing the Plotting and Models Interfaces
      3. Activity 6.01: Plotting Mean Car Prices of Manufacturers
    3. Adding Widgets
      1. Exercise 6.03: Building a Simple Plot Using Basic Interactivity Widgets
      2. Exercise 6.04: Plotting Stock Price Data in Tabs
      3. Activity 6.02: Extending Plots with Widgets
    4. Summary
  9. 7. Combining What We Have Learned
    1. Introduction
      1. Activity 7.01: Implementing Matplotlib and Seaborn on the New York City Database
      2. Bokeh
      3. Activity 7.02: Visualizing Stock Prices with Bokeh
      4. Geoplotlib
      5. Activity 7.03: Analyzing Airbnb Data with geoplotlib
    2. Summary
  10. Appendix
    1. 1. The Importance of Data Visualization and Data Exploration
      1. Activity 1.01: Using NumPy to Compute the Mean, Median, Variance, and Standard Deviation of a Dataset
      2. Activity 1.02: Forest Fire Size and Temperature Analysis
    2. 2. All You Need to Know about Plots
      1. Activity 2.01: Employee Skill Comparison
      2. Activity 2.02: Road Accidents Occurring over Two Decades
      3. Activity 2.03: Smartphone Sales Units
      4. Activity 2.04: Frequency of Trains during Different Time Intervals
      5. Activity 2.05: Analyzing Visualizations
        1. First Visualization
        2. Second Visualization
      6. Activity 2.06: Choosing a Suitable Visualization
    3. 3. A Deep Dive into Matplotlib
      1. Activity 3.01: Visualizing Stock Trends by Using a Line Plot
      2. Activity 3.02: Creating a Bar Plot for Movie Comparison
      3. Activity 3.03: Creating a Stacked Bar Plot to Visualize Restaurant Performance
      4. Activity 3.04: Comparing Smartphone Sales Units Using a Stacked Area Chart
      5. Activity 3.05: Using a Histogram and a Box Plot to Visualize Intelligence Quotient
      6. Activity 3.06: Creating a Scatter Plot with Marginal Histograms
      7. Activity 3.07: Plotting Multiple Images in a Grid
    4. 4. Simplifying Visualizations Using Seaborn
      1. Activity 4.01: Using Heatmaps to Find Patterns in Flight Passengers' Data
      2. Activity 4.02: Movie Comparison Revisited
      3. Activity 4.03: Comparing IQ Scores for Different Test Groups by Using a Violin Plot
      4. Activity 4.04: Visualizing the Top 30 Music YouTube Channels Using Seaborn's FacetGrid
      5. Activity 4.05: Linear Regression for Animal Attribute Relations
      6. Activity 4.06: Visualizing the Impact of Education on Annual Salary and Weekly Working Hours
    5. 5. Plotting Geospatial Data
      1. Activity 5.01: Plotting Geospatial Data on a Map
      2. Activity 5.02: Visualizing City Density by the First Letter Using an Interactive Custom Layer
    6. 6. Making Things Interactive with Bokeh
      1. Activity 6.01: Plotting Mean Car Prices of Manufacturers
      2. Activity 6.02: Extending Plots with Widgets
    7. 7. Combining What We Have Learned
      1. Activity 7.01: Implementing Matplotlib and Seaborn on the New York City Database
      2. Activity 7.02: Visualizing Stock Prices with Bokeh
      3. Activity 7.03: Analyzing Airbnb Data with geoplotlib

Product information

  • Title: The Data Visualization Workshop
  • Author(s): Mario Dobler, Tim Großmann
  • Release date: July 2020
  • Publisher(s): Packt Publishing
  • ISBN: 9781800568846