Chapter 4. GraphX Basics

This chapter covers

  • The basic GraphX classes
  • The basic GraphX operations, based on Map/Reduce and Pregel
  • Serialization to disk
  • Stock graph generation

Now that we have covered the fundamentals of Spark and of graphs in general, we can put them together with GraphX. In this chapter you’ll use both the basic GraphX API and the alternative, and often better-performing, Pregel API. You’ll also read and write graphs, and for those times when you don’t have graph data handy, generate random graphs.

4.1. Vertex and edge classes

As discussed in chapter 3, Resilient Distributed Datasets (RDDs) are the fundamental building blocks of Spark programs, providing for both flexible, high-performance, data-parallel processing ...
