Centrality algorithms are used to understand the roles of particular nodes in a graph and their impact on that network. They’re useful because they identify the most important nodes and help us understand group dynamics such as credibility, accessibility, the speed at which things spread, and bridges between groups. Although many of these algorithms were invented for social network analysis, they have since found uses in a variety of industries and fields.
We’ll cover the following algorithms:
Degree Centrality as a baseline metric of connectedness
Closeness Centrality for measuring how central a node is to the group, including two variations for disconnected groups
Betweenness Centrality for finding control points, including an alternative for approximation
PageRank for understanding overall influence, including a popular option for personalization
Different centrality algorithms can produce significantly different results because each was created to measure something different. When you see suboptimal answers, it’s best to check that the algorithm you’ve used is aligned with its intended purpose.
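To make that point concrete, here is a minimal sketch in plain Python that ranks the nodes of a small, made-up graph by two different measures. The graph, the node names, and the helper functions are all hypothetical, chosen only for illustration, and closeness is computed with one common normalized formulation (the number of other reachable nodes divided by the sum of shortest-path distances to them). This is not the Spark or Neo4j implementation.

```python
from collections import deque

# Hypothetical toy graph: three small groups, each gathered around a hub,
# with a single "Bridge" node linking the hubs together.
EDGES = [
    ("Bridge", "HubA"), ("Bridge", "HubB"), ("Bridge", "HubC"),
    ("HubA", "a1"), ("HubA", "a2"), ("HubA", "a3"),
    ("HubB", "b1"), ("HubB", "b2"),
    ("HubC", "c1"), ("HubC", "c2"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of node pairs."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def degree_centrality(adjacency):
    """Count the direct relationships of each node."""
    return {node: len(neighbors) for node, neighbors in adjacency.items()}


def closeness_centrality(adjacency):
    """Normalized closeness: reachable nodes / sum of distances to them."""
    scores = {}
    for source in adjacency:
        # Breadth-first search gives shortest-path lengths in an
        # unweighted graph.
        distances = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for neighbor in adjacency[current]:
                if neighbor not in distances:
                    distances[neighbor] = distances[current] + 1
                    queue.append(neighbor)
        total = sum(distances.values())
        scores[source] = (len(distances) - 1) / total if total else 0.0
    return scores


adjacency = build_adjacency(EDGES)
degree = degree_centrality(adjacency)
closeness = closeness_centrality(adjacency)

top_by_degree = max(degree, key=degree.get)
top_by_closeness = max(closeness, key=closeness.get)

print("Highest degree:   ", top_by_degree, degree[top_by_degree])
print("Highest closeness:", top_by_closeness, round(closeness[top_by_closeness], 3))
```

In this toy graph the node with the most direct connections (HubA) is not the node that sits closest to everyone else (Bridge), so the "most important" node depends on which question the chosen measure was designed to answer.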
We’ll explain how these algorithms work and show examples in Spark and Neo4j. Where an algorithm is unavailable on one platform or where the differences are unimportant, we’ll provide just one platform example.