Apache Spark Graph Processing by Rindra Ramamonjison

Joining graph datasets

Besides the mapping and filtering operations covered earlier, GraphX provides APIs for joining RDD datasets with graphs. This is useful when we want to add extra information to a graph's vertex attributes, or to merge the vertex attributes of two related graphs. The following join operators accomplish these tasks.

joinVertices

The following is the method signature of the first operator, joinVertices:

def joinVertices[U](table: RDD[(VertexId, U)])(mapFunc: (VertexId, VD, U) => VD): Graph[VD, ED]

It is invoked on a Graph[VD, ED] object and requires two inputs, which are passed as curried parameters. First, joinVertices joins a graph's vertex attributes with an input vertex RDD table of type ...
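To make the curried call shape concrete, here is a minimal sketch of joinVertices on a tiny graph. The vertex names, the titles table, the object name JoinVerticesSketch, and the helper joinTitles are all hypothetical and chosen only for illustration; the sketch assumes Spark and GraphX are on the classpath and runs against a local master.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

object JoinVerticesSketch {

  // Hypothetical helper: builds a small graph, joins a table of titles
  // onto its vertices, and returns the resulting vertex attributes.
  def joinTitles(sc: SparkContext): Array[(VertexId, String)] = {
    // Illustrative graph: vertex attribute (VD) is a name, edge attribute (ED) is a label.
    val vertices: RDD[(VertexId, String)] = sc.parallelize(Seq(
      (1L, "Alice"), (2L, "Bob"), (3L, "Carol")))
    val edges: RDD[Edge[String]] = sc.parallelize(Seq(
      Edge(1L, 2L, "follows"), Edge(2L, 3L, "follows")))
    val graph: Graph[String, String] = Graph(vertices, edges)

    // Illustrative input table of type RDD[(VertexId, U)] with U = String.
    // Vertex 3 deliberately has no entry.
    val titles: RDD[(VertexId, String)] = sc.parallelize(Seq(
      (1L, "Dr."), (2L, "Prof.")))

    // First parameter list: the table to join.
    // Second parameter list: a function that combines the vertex ID, the
    // current attribute, and the matching table value into a new attribute
    // of the same type VD.
    val joined: Graph[String, String] =
      graph.joinVertices(titles)((_, name, title) => s"$title $name")

    joined.vertices.collect().sortBy(_._1)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("JoinVerticesSketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try joinTitles(sc).foreach(println)
    finally sc.stop()
  }
}
```

In this sketch, vertices 1 and 2 get the attributes "Dr. Alice" and "Prof. Bob", while vertex 3 keeps "Carol": joinVertices leaves a vertex's attribute unchanged when the table has no entry for it. Because the function must return the original attribute type VD, the result is still a Graph[String, String].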
