Chapter 5

From Maps to Regression

“Even before you understand them, your brain is drawn to maps.”

Ken Jennings, author and Jeopardy! champ

You have been learning some basics about security data and how to pull meaning from IP addresses. As briefly discussed in Chapter 4, IP addresses can be associated with geographic data if you look them up using a geolocation service. But what is the value in doing that? How much can you learn by associating a latitude and longitude with your data? The answer depends on what the IP address represents and how deep you are willing to go. To illustrate the value of mapping the virtual world onto the physical one, this chapter begins with a list of over 800,000 latitude/longitude pairs shared by our friends at Symantec. The location data comes from client IP addresses infected with the ZeroAccess rootkit, collected over a 24-hour period in July 2013.

Now that you know these are locations of hosts with ZeroAccess, you could ask a series of questions:

  • How is ZeroAccess distributed across geographic areas, and is there any significance to this distribution?
  • What types of clients are more likely to be infected with ZeroAccess? Do things like education and income affect the rate of infection?
  • Are ZeroAccess infections the result of alien visitors?

Obviously, this chapter homes in on that last question. It is the most important one and worthy of some serious research (anyone have some spare grant money?). But seriously, our purpose ...

