Chapter 1. Two important technologies: Spark and graphs

This chapter covers

  • Why Spark has become the leading Big Data processing system
  • What makes graphs a unique way of modeling connected data
  • How GraphX makes Spark a leading platform for graph analytics

It’s well known that we are generating more data than ever. But it’s not just the individual data points that matter; the connections between them matter too. Extracting information from such connected datasets can yield insights in many areas, such as detecting fraud, analyzing bioinformatics data, and ranking pages on the web.

Graphs provide a powerful way to represent and exploit these connections. Graphs represent networks of data points as vertices and encode connections through ...
