Chapter 30. Graph Analytics
The previous chapter covered some conventional unsupervised techniques. This chapter is going to dive into a more specialized toolset: graph processing. Graphs are data structures composed of nodes, or vertices, which are arbitrary objects, and edges that define relationships between these nodes. Graph analytics is the process of analyzing these relationships. An example graph might be your friend group. In the context of graph analytics, each vertex or node would represent a person, and each edge would represent a relationship. Figure 30-1 shows a sample graph.
This particular graph is undirected, in that the edges do not have a specified “start” and “end” vertex. There are also directed graphs that specify a start and end. Figure 30-2 shows a directed graph where the edges are directional.
Edges and vertices in graphs can also have data associated with them. In our friend example, the weight of the edge might represent the intimacy between different friends; acquaintances would have low-weight edges between them, while married individuals would have edges with large weights. We could set this value by looking at communication frequency between nodes and weighting ...
Get Spark: The Definitive Guide now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.