13 Using Graph Partitioning to Calculate PageRank in a Changing Network

PageRank was first defined by S. Brin and L. Page in 1998 in order to rank home pages on the Internet by ranking pages according to the stationary distribution of a random walk on the web graph. While the original way to calculate PageRank is fast, due to the huge size and growth of the web there have been many attempts at improving upon the calculation speed of PageRank through various means. In this chapter, we will look at a slightly different but equally important problem, namely how to improve the calculation of PageRank in a changing network where PageRank of an earlier stage of the network is available. In particular, we consider two types of changes in the graph, the change in rank after changing the personalization vector used in calculating PageRank as well as added or removed edges between different strongly connected components (SCCs) in the network.

13.1. Introduction

PageRank was initially developed by S. Brin and L. Page to rank home pages on the Internet for the search giant Google (Brin and Page 1998). Since then, PageRank or similar methods have been used for a variety of application in many types of networks such as evaluating trust in P2P networks (Kamvar 2003) or Lazy PageRank used in clustering of networks (Chung and Tsiatas 2012).

There have been a large amount of works aimed at speeding up the calculation of PageRank, both in the form of approximations such as aggregating of web ...

Get Data Analysis and Applications 2 now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.