Joining graph datasets

In addition to the mapping and filtering operations covered previously, GraphX provides APIs for joining RDDs with graphs. This is useful when we want to add extra information to the vertex attributes of a graph, or to merge the vertex attributes of two related graphs. These tasks can be accomplished using the following join operators.

joinVertices

The following is the method signature of the first operator, joinVertices:

def joinVertices[U](table: RDD[(VertexId, U)])(map: (VertexId, VD, U) => VD): Graph[VD, ED]

It is invoked on a Graph[VD, ED] object and requires two inputs, which are passed as curried parameters. First, joinVertices joins a graph's vertex attributes with an input vertex RDD table of type ...
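
To make the curried call concrete, here is a minimal sketch of joinVertices on a small graph. The data, the object name JoinVerticesSketch, and the helper names buildGraph and joined are all hypothetical and chosen only for illustration; the sketch also assumes a local Spark installation with GraphX on the classpath.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

object JoinVerticesSketch {

  // Hypothetical example data: vertices carry a name, edges carry a label.
  def buildGraph(sc: SparkContext): Graph[String, String] = {
    val vertices: RDD[(VertexId, String)] = sc.parallelize(Seq(
      (1L, "alice"), (2L, "bob"), (3L, "carol")))
    val edges: RDD[Edge[String]] = sc.parallelize(Seq(
      Edge(1L, 2L, "follows"), Edge(2L, 3L, "follows")))
    Graph(vertices, edges)
  }

  // Joins the graph with an RDD of titles and returns the new vertex attributes.
  def joined(sc: SparkContext): Map[VertexId, String] = {
    val graph = buildGraph(sc)

    // Extra information keyed by vertex ID; vertex 3 deliberately has no entry.
    val titles: RDD[(VertexId, String)] = sc.parallelize(Seq(
      (1L, "Dr."), (2L, "Prof.")))

    // First parameter list: the RDD to join with.
    // Second parameter list: a function (VertexId, VD, U) => VD.
    // It must return VD (String here), so the vertex attribute type is unchanged.
    val result: Graph[String, String] =
      graph.joinVertices(titles)((id, name, title) => s"$title $name")

    result.vertices.collect().toMap
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("joinVertices-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try joined(sc).toSeq.sortBy(_._1).foreach(println)
    finally sc.stop()
  }
}
```

Run locally, this should print (1,Dr. alice), (2,Prof. bob), and (3,carol). Vertex 3 has no matching entry in the RDD, so it keeps its original attribute.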
