Algorithms

Now we dive into the most interesting part of GraphX: the algorithms, and the graph-parallel computation APIs used to implement more algorithms. The following table shows a bird's-eye view of the algorithms:

Type: Graph-parallel computation
GraphX method/example: The method is aggregateMessages() and the function is Pregel(). Refer to https://issues.apache.org/jira/browse/SPARK-5062 for examples.

Type: PageRank
GraphX method/example: The method is pageRank(). Example uses include finding the influential papers in a citation network or the influencers in a retweet network. You can specifically check out the following:

staticPageRank(): This runs a fixed number of iterations (numIter), whereas pageRank() iterates until the ranks converge within a tolerance (tol); compare the two parameters (tol versus numIter)

personalizedPageRank(): This is a variation of PageRank that ...
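To make the methods above concrete, here is a minimal sketch that calls aggregateMessages() and the three PageRank variants on a tiny graph. The object name GraphXAlgorithmsSketch, the helpers citationGraph and citationCounts, and the four-paper citation data are hypothetical and exist only for illustration. The sketch assumes the spark-sql and spark-graphx libraries are on the classpath and that Spark runs in local mode.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.sql.SparkSession

object GraphXAlgorithmsSketch {

  // Hypothetical toy citation network: an edge a -> b means "paper a cites paper b".
  def citationGraph(sc: SparkContext): Graph[String, Int] = {
    val papers = sc.parallelize(Seq(
      (1L, "paper-1"), (2L, "paper-2"), (3L, "paper-3"), (4L, "paper-4")))
    val citations = sc.parallelize(Seq(
      Edge(1L, 3L, 1), Edge(2L, 3L, 1), Edge(4L, 3L, 1), Edge(3L, 1L, 1)))
    Graph(papers, citations)
  }

  // aggregateMessages(): every edge sends the message 1 to its destination,
  // and the messages arriving at a vertex are summed, giving a citation count.
  // Vertices that receive no message are absent from the result.
  def citationCounts(graph: Graph[String, Int]): Map[VertexId, Int] =
    graph.aggregateMessages[Int](ctx => ctx.sendToDst(1), _ + _).collect().toMap

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("graphx-algorithms-sketch")
      .getOrCreate()
    val graph = citationGraph(spark.sparkContext)

    println(citationCounts(graph))

    // pageRank(tol): iterate until the ranks change by less than tol.
    val dynamicRanks = graph.pageRank(0.001).vertices
    // staticPageRank(numIter): run exactly numIter iterations.
    val staticRanks = graph.staticPageRank(10).vertices
    // personalizedPageRank(src, tol): ranks computed relative to source vertex 1.
    val personalizedRanks = graph.personalizedPageRank(1L, 0.001).vertices

    // Print the three scores side by side for each vertex.
    dynamicRanks.join(staticRanks).join(personalizedRanks)
      .collect().sortBy(_._1).foreach {
        case (id, ((dyn, stat), pers)) =>
          println(f"$id%d dynamic=$dyn%.3f static=$stat%.3f personalized=$pers%.3f")
      }

    spark.stop()
  }
}
```

In this toy graph, paper 3 is cited three times and paper 1 once, so citationCounts returns entries only for those two vertices. The rank values themselves depend on the chosen tol and numIter, so it is worth running both variants and comparing them rather than assuming they agree.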
