Why Is Spatial Data Analysis So Hard?

Spatial (that is, geographic) datasets are notoriously difficult to analyze. In other fields of human endeavor, most of the datasets one wants to analyze are naturally made up of numbers. What is the history of the stock market’s ups and downs? Numbers. What are the grade statistics for the sophomore class? Numbers. How many parts per million of carbon monoxide can different types of air-breathing animals safely tolerate? Numbers. But for most of human history, the chief way of storing spatial data has been the map, whether paper, Mylar, or computer image.

Numbers and text are composed of nicely behaved discrete symbols. Each symbol may be represented by a bit of ink or by a few pixels on a computer screen that fit neatly into a square roughly an eighth of an inch on a side. And, in English, there aren’t very many different symbols: 10 digits, 26 uppercase letters, another 26 lowercase, and a bunch of special symbols, for a total of at most 256. Maps use symbols also, but they are not nearly so well behaved. For example, symbolizing a road may result in a wavy line 2 feet long.
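The symbol count can be checked with a few lines of Python. This is only a sketch: it assumes that the ceiling of 256 refers to the number of values a single 8-bit byte can hold, which the text does not state explicitly.

```python
import string

# Count the well-behaved symbols the text lists for English.
digits = len(string.digits)            # 0-9
upper = len(string.ascii_uppercase)    # A-Z
lower = len(string.ascii_lowercase)    # a-z
alphanumeric = digits + upper + lower

# Assumption: "a maximum of 256" means the values of one 8-bit byte.
byte_values = 2 ** 8

print(alphanumeric)                # 62
print(byte_values - alphanumeric)  # 194 codes left for special symbols
```

Under that assumption, digits and letters use up only 62 of the available codes. A road drawn as a wavy line has no comparably small alphabet.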

As discussed in Chapter 2, maps are difficult to analyze and hard to compare. The map has also been the primary way of both storing and displaying spatial data, an idea we discussed earlier. One of the major advantages of a computer-based GIS is that it separates the storage function from the display function.
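A minimal sketch may help show what separating storage from display looks like in practice. It is plain Python, not ArcGIS code. The road coordinates and the function names `length` and `to_svg_path` are hypothetical, chosen only for illustration.

```python
import math

# Storage: a hypothetical road kept as coordinate pairs (in meters),
# independent of how it will eventually be drawn.
road = [(0.0, 0.0), (30.0, 40.0), (30.0, 100.0)]


def length(coords):
    """Analysis: sum the straight-line distances between consecutive vertices."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))


def to_svg_path(coords, scale=1.0):
    """Display: turn the same stored coordinates into an SVG path string."""
    points = " L ".join(f"{x * scale:g} {y * scale:g}" for x, y in coords)
    return f"M {points}"


print(length(road))                  # 110.0
print(to_svg_path(road, scale=0.1))  # M 0 0 L 3 4 L 3 10
```

The length is computed from the stored numbers rather than measured off a drawing. Changing the display scale changes only the rendered path and leaves the stored coordinates untouched.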

A physical ...

Get Introducing Geographic Information Systems with ArcGIS: A Workbook Approach to Learning GIS, 3rd Edition now with the O’Reilly learning platform.
