Getting started with GraphX

GraphX ships with Spark, so no additional software needs to be installed to get started. This section introduces how to create and explore graphs using a simple family relationship graph. The same family graph is used in all operations within this section.

Basic operations of GraphX

GraphX does not provide a Python API yet, so let's use spark-shell, the Scala shell, to work with GraphX interactively. First, let's create the input data (a vertex file and an edge file) needed for our GraphX operations and store it on HDFS.
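To give a feel for what a family graph looks like in GraphX before any files are involved, here is a minimal sketch that builds one from in-memory collections instead of HDFS files. The names, ages, and relationship labels are hypothetical and chosen only for illustration; they are not the chapter's data set. The calls used (`Graph(...)`, `Edge`, `vertices`, `edges`, `triplets`) are part of the standard GraphX Scala API.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

// Inside spark-shell this returns the shell's existing context;
// outside the shell it creates a local one.
val ctx = SparkContext.getOrCreate(
  new SparkConf().setAppName("family-graph-sketch").setMaster("local[*]"))

// Hypothetical vertices: (vertex id, (name, age)).
val vertices: RDD[(VertexId, (String, Int))] = ctx.parallelize(Seq(
  (1L, ("Alice", 45)),
  (2L, ("Bob", 47)),
  (3L, ("Carol", 20)),
  (4L, ("Dave", 17))
))

// Hypothetical edges: Edge(source id, destination id, relationship).
val edges: RDD[Edge[String]] = ctx.parallelize(Seq(
  Edge(1L, 2L, "spouse"),
  Edge(1L, 3L, "mother"),
  Edge(2L, 3L, "father"),
  Edge(1L, 4L, "mother"),
  Edge(2L, 4L, "father")
))

// Assemble the property graph from the two RDDs.
val graph: Graph[(String, Int), String] = Graph(vertices, edges)

// Basic exploration: sizes, then one line per relationship.
println(s"vertices: ${graph.vertices.count()}, edges: ${graph.edges.count()}")
graph.triplets.collect().foreach { t =>
  println(s"${t.srcAttr._1} is the ${t.attr} of ${t.dstAttr._1}")
}
```

The snippet can be pasted into spark-shell as is. Loading the same structure from vertex and edge files on HDFS would replace the two `parallelize` calls with reads from the appropriate paths, which depend on the environment.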

Note

All programs in this chapter are executed on a CDH 5.8 VM. For other environments, file paths might change, but the concepts are the same in any ...
