Chapter 3. Find and Question Your Data

In the early stages of a visualization project, we often start with two interrelated issues: Where can I find reliable data? And after finding something, What does this data truly represent? If you leap too quickly into constructing charts and maps without thinking deeply about these two issues, you run the risk of creating meaningless or, perhaps worse, misleading visualizations.

This chapter breaks down both of these broad issues in “Guiding Questions for Your Search”, “Public and Private Data”, “Mask or Aggregate Sensitive Data”, “Open Data Repositories”, “Source Your Data”, and “Recognize Bad Data”. Finally, once you’ve found some files, we propose some ways to question and acknowledge the limitations of your data in “Question Your Data”.

Information does not magically appear out of thin air. Instead, people collect and publish data, with explicit or implicit purposes, within the social contexts and power structures of their times. As data visualization advocates, we strongly favor evidence-based reasoning over less-informed alternatives. We caution against embracing so-called data objectivity, however, since numbers and other forms of data are not neutral. Therefore, when working with data, pause to inquire more deeply about Whose stories are told? and Whose perspectives remain unspoken? Only by asking these types of questions, according to Data Feminism authors Catherine D’Ignazio and Lauren Klein, will we “start to see how privilege ...
