Census Information

The U.S. Census Bureau provides demographic data from recent surveys at both the county and city levels. The QuickFacts website (e.g., http://quickfacts.census.gov/qfd/states/06/0649670.html) displays a number of interesting demographic variables for each city. Unfortunately, city-level data is not available in an easily downloadable format, but we were able to use scripting methods (like those we used for the sales data) to collect the demographic information and convert it into CSV format. In addition, the definition of a city differed slightly between the census data and the sales data, so we could match only 46 of the 58 cities. The census data didn't cover some of the cities we chose because their populations were below a cutoff, and some of what the housing data calls "cities" are actually neighborhoods within larger cities, as we noted earlier with respect to San Jose.
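The matching step can be sketched in a few lines of Python. This is only an illustration, not the script we used: the function names (normalize, match_cities), the input shapes, and the sample records are all hypothetical, and the only cleanup assumed is lowercasing and collapsing whitespace. Cities with no census counterpart are returned separately so they can be inspected by hand.

```python
def normalize(name):
    """Lowercase a city name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def match_cities(sales_cities, census_rows):
    """Split sales cities into those with a census row and those without.

    sales_cities: iterable of city names from the sales data.
    census_rows: dict mapping a census place name to its demographic row.
    Returns (matched, unmatched), where matched maps each sales city name
    to its census row and unmatched lists the sales cities left over.
    """
    census_by_key = {normalize(name): row for name, row in census_rows.items()}
    matched, unmatched = {}, []
    for city in sales_cities:
        row = census_by_key.get(normalize(city))
        if row is None:
            unmatched.append(city)
        else:
            matched[city] = row
    return matched, unmatched


if __name__ == "__main__":
    # Placeholder records for illustration only, not real census values.
    sales = ["Oakland", "san  jose", "Hypothetical Heights"]
    census = {"Oakland": {"id": 1}, "San Jose": {"id": 2}}
    matched, unmatched = match_cities(sales, census)
    print(len(matched), "matched;", "unmatched:", unmatched)
```

Exact matching after normalization is deliberately conservative: a neighborhood that the sales data labels as a city simply lands in the unmatched list rather than being paired with the wrong census record.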

A glance at the demographic variables revealed that the most affected cities have a high percentage of babies and children, bigger households, fewer bachelor's degrees, and longer commutes. Most significantly, these cities also have lower average incomes, which is probably the factor that drives many of the other relationships. Figure 18-11 includes three scatterplots that illustrate the relationship between the drop in home prices and each of three variables: income, percentage of college graduates, and commute time. The correlation between price drop and commute time is weak, but note that all of the cities ...
