7.5 Implementing Cluster Analysis: Earthquakes

In the Bigger Data chapter, we worked with real data describing 501 earthquakes that occurred during a month in late 2018. From the raw data alone, it can be difficult to see any pattern or similarity among the earthquakes. However, if we extend the cluster analysis technique from the previous section, we might discover some interesting results.

7.5.1 File Processing

Our first problem is to process and store the contents of the data file so that our clustering algorithm can use them. Recall that in the earthquakes.csv file, the first line contains titles that identify each data item, like this:

Each succeeding line of the file describes one earthquake. The line ...
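
As a rough illustration of the file layout described above (one title line followed by one line per earthquake), here is a minimal sketch that uses Python's standard csv module. The function name read_quakes, the key_columns parameter, the column titles, and the sample values are all assumptions made for this example; they are not taken from the actual earthquakes.csv file, and this is not necessarily the approach the chapter goes on to develop.

```python
import csv
import io

# Hypothetical stand-in for earthquakes.csv: the column titles and values
# below are invented for illustration and are not the book's actual data.
SAMPLE_CSV = """time,latitude,longitude,depth,mag
t1,10.0,20.0,5.0,4.5
t2,-30.5,150.25,12.0,5.1
"""


def read_quakes(csv_file, key_columns=("latitude", "longitude")):
    """Return (titles, data): the header titles and a dictionary that maps
    a 1-based record number to a tuple of floats taken from key_columns."""
    reader = csv.DictReader(csv_file)  # treats the first line as the titles
    data = {}
    for key, row in enumerate(reader, start=1):
        # Keep only the columns of interest, converted from text to numbers.
        data[key] = tuple(float(row[col]) for col in key_columns)
    return reader.fieldnames, data


titles, quakes = read_quakes(io.StringIO(SAMPLE_CSV))
print(titles)
print(quakes)
```

To try the same function on a real file, one could pass an open file object instead of the in-memory sample, for example with open("earthquakes.csv", newline="") as f: followed by read_quakes(f), after adjusting key_columns to match the titles that actually appear on the file's first line.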
