Visualizing Data

by Ben Fry
December 2007
Beginner to intermediate
382 pages
10h 29m
English
O'Reilly Media, Inc.
Chapter 6. Scatterplot Maps

In this chapter, we cover the seven steps as laid out in Chapter 1 and apply them to the question, “How do zip codes relate to geography?” (The background for this project was introduced in Chapter 1.)

Preprocessing

Data is always dirty, and once you’ve found your data set, you’ll need to clean it up. As in the previous chapter, we’ll go through the steps of acquiring and parsing in detail. None of this is rocket science, but again, it’s meant to familiarize you with the various formats in which you’ll find data, and alert you to some of the common issues you’ll encounter along the way. If you just want to start playing with locations and maps, you can download the finished zips.tsv file from the book web site (http://benfry.com/writing/zipdecode/zips.tsv) and jump ahead to the next section.
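One common issue with zip code data is that the codes look like numbers but are really labels: reading them as integers silently drops leading zeros. The following is a minimal sketch in Python (the book itself uses Processing) of reading tab-separated text while keeping every field as a string. The function name `parse_tsv`, the sample rows, and the column layout are assumptions made for illustration; they are not taken from the actual zips.tsv file.

```python
def parse_tsv(text):
    """Split tab-separated text into rows of string fields.

    Blank lines and lines starting with '#' are skipped. Treating '#'
    as a comment marker is an assumption, not a documented property
    of zips.tsv.
    """
    rows = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        # Keep every field as a string so that codes with leading
        # zeros are preserved exactly as written.
        rows.append([field.strip() for field in line.split("\t")])
    return rows


# Hypothetical sample: code, two coordinates, place name (made-up values).
SAMPLE = (
    "# hypothetical sample, not real data\n"
    "00001\t0.25\t0.75\tExample Town\n"
    "\n"
    "99999\t0.5\t0.5\tAnother Place\n"
)

rows = parse_tsv(SAMPLE)
for code, x, y, name in rows:
    # Convert only the fields that are genuinely numeric.
    print(code, float(x), float(y), name)
```

Converting a field to a number only at the point where arithmetic is needed keeps the identifier column intact and makes malformed values fail loudly instead of being silently altered.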

Data from the U.S. Census Bureau (Acquire)

The acronym ZIP stands for Zone Improvement Plan, a 1963 initiative to simplify the delivery of mail in the United States. Personal correspondence, once the majority of all mail, was rapidly being overtaken by business mail, which by the 1960s accounted for 80% of the post. Faced with an ever-increasing amount of mail to process, the U.S. Postal Service introduced the zip system to specify the geographic area of the mail’s destination more accurately. The U.S. Postal Service’s web site features a lengthier history of the system at http://www.usps.com/history.

Versions of the zip code database are available from a variety of sources. ...

ISBN: 9780596514556