Time for action – representing the graph
Let's define a textual representation of the graph that we'll use in the following examples.
Create the following as graph.txt
:
12,3,40C 21,4 31,5,6 41,2 53,6 63,5 76
What just happened?
We defined a file structure that will represent our graph, based somewhat on the adjacency list approach. We assumed that each node has a unique ID and the file structure has four fields, as follows:
- The node ID
- A comma-separated list of neighbors
- The distance from the start node
- The node status
In the initial representation, only the starting node has values for the third and fourth columns: its distance from itself is 0 and its status is "C", which we'll explain later.
Our graph is directional—more formally referred to as a directed ...
Get Hadoop: Data Processing and Modelling now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.