Fundamentals of Data Visualization

Book Description

Effective visualization is the best way to communicate information from the increasingly large and complex datasets in the natural and social sciences. But with the increasing power of visualization software today, scientists, engineers, and business analysts often have to navigate a bewildering array of visualization choices and options.

This practical book takes you through many commonly encountered visualization problems, and it provides guidelines on how to turn large datasets into clear and compelling figures. What visualization type is best for the story you want to tell? How do you make informative figures that are visually pleasing? Author Claus O. Wilke teaches you the elements most critical to successful data visualization.

  • Explore the basic concepts of color as a tool to highlight, distinguish, or represent a value
  • Understand the importance of redundant coding to ensure you provide key information in multiple ways
  • Use the book’s visualizations directory, a graphical guide to commonly used types of data visualizations
  • Get extensive examples of good and bad figures
  • Learn how to use figures in a document or report and how to employ them effectively to tell a compelling story

Table of Contents

  1. Preface
    1. Thoughts on Graphing Software and Figure-Preparation Pipelines
    2. Conventions Used in This Book
    3. Using Code Examples
    4. O’Reilly Online Learning
    5. How to Contact Us
    6. Acknowledgments
  2. 1. Introduction
    1. Ugly, Bad, and Wrong Figures
  3. I. From Data to Visualization
  4. 2. Visualizing Data: Mapping Data onto Aesthetics
    1. Aesthetics and Types of Data
    2. Scales Map Data Values onto Aesthetics
  5. 3. Coordinate Systems and Axes
    1. Cartesian Coordinates
    2. Nonlinear Axes
    3. Coordinate Systems with Curved Axes
  6. 4. Color Scales
    1. Color as a Tool to Distinguish
    2. Color to Represent Data Values
    3. Color as a Tool to Highlight
  7. 5. Directory of Visualizations
    1. Amounts
    2. Distributions
    3. Proportions
    4. x–y relationships
    5. Geospatial Data
    6. Uncertainty
  8. 6. Visualizing Amounts
    1. Bar Plots
    2. Grouped and Stacked Bars
    3. Dot Plots and Heatmaps
  9. 7. Visualizing Distributions: Histograms and Density Plots
    1. Visualizing a Single Distribution
    2. Visualizing Multiple Distributions at the Same Time
  10. 8. Visualizing Distributions: Empirical Cumulative Distribution Functions and Q-Q Plots
    1. Empirical Cumulative Distribution Functions
    2. Highly Skewed Distributions
    3. Quantile-Quantile Plots
  11. 9. Visualizing Many Distributions at Once
    1. Visualizing Distributions Along the Vertical Axis
    2. Visualizing Distributions Along the Horizontal Axis
  12. 10. Visualizing Proportions
    1. A Case for Pie Charts
    2. A Case for Side-by-Side Bars
    3. A Case for Stacked Bars and Stacked Densities
    4. Visualizing Proportions Separately as Parts of the Total
  13. 11. Visualizing Nested Proportions
    1. Nested Proportions Gone Wrong
    2. Mosaic Plots and Treemaps
    3. Nested Pies
    4. Parallel Sets
  14. 12. Visualizing Associations Among Two or More Quantitative Variables
    1. Scatterplots
    2. Correlograms
    3. Dimension Reduction
    4. Paired Data
  15. 13. Visualizing Time Series and Other Functions of an Independent Variable
    1. Individual Time Series
    2. Multiple Time Series and Dose–Response Curves
    3. Time Series of Two or More Response Variables
  16. 14. Visualizing Trends
    1. Smoothing
    2. Showing Trends with a Defined Functional Form
    3. Detrending and Time-Series Decomposition
  17. 15. Visualizing Geospatial Data
    1. Projections
    2. Layers
    3. Choropleth Mapping
    4. Cartograms
  18. 16. Visualizing Uncertainty
    1. Framing Probabilities as Frequencies
    2. Visualizing the Uncertainty of Point Estimates
    3. Visualizing the Uncertainty of Curve Fits
    4. Hypothetical Outcome Plots
  19. II. Principles of Figure Design
  20. 17. The Principle of Proportional Ink
    1. Visualizations Along Linear Axes
    2. Visualizations Along Logarithmic Axes
    3. Direct Area Visualizations
  21. 18. Handling Overlapping Points
    1. Partial Transparency and Jittering
    2. 2D Histograms
    3. Contour Lines
  22. 19. Common Pitfalls of Color Use
    1. Encoding Too Much or Irrelevant Information
    2. Using Nonmonotonic Color Scales to Encode Data Values
    3. Not Designing for Color-Vision Deficiency
  23. 20. Redundant Coding
    1. Designing Legends with Redundant Coding
    2. Designing Figures Without Legends
  24. 21. Multipanel Figures
    1. Small Multiples
    2. Compound Figures
  25. 22. Titles, Captions, and Tables
    1. Figure Titles and Captions
    2. Axis and Legend Titles
    3. Tables
  26. 23. Balance the Data and the Context
    1. Providing the Appropriate Amount of Context
    2. Background Grids
    3. Paired Data
    4. Summary
  27. 24. Use Larger Axis Labels
  28. 25. Avoid Line Drawings
  29. 26. Don’t Go 3D
    1. Avoid Gratuitous 3D
    2. Avoid 3D Position Scales
    3. Appropriate Use of 3D Visualizations
  30. III. Miscellaneous Topics
  31. 27. Understanding the Most Commonly Used Image File Formats
    1. Bitmap and Vector Graphics
    2. Lossless and Lossy Compression of Bitmap Graphics
    3. Converting Between Image Formats
  32. 28. Choosing the Right Visualization Software
    1. Reproducibility and Repeatability
    2. Data Exploration Versus Data Presentation
    3. Separation of Content and Design
  33. 29. Telling a Story and Making a Point
    1. What Is a Story?
    2. Make a Figure for the Generals
    3. Build Up Toward Complex Figures
    4. Make Your Figures Memorable
    5. Be Consistent but Don’t Be Repetitive
  34. Annotated Bibliography
    1. Thinking About Data and Visualization
    2. Programming Books
    3. Statistics Texts
    4. Historical Texts
    5. Books on Broadly Related Topics
  35. Technical Notes
  36. References
  37. Index