3 PageRank and Perturbed Markov Chains

PageRank is a widely used hyperlink-based algorithm for estimating the relative importance of nodes in networks. Since many real-world networks are large sparse networks, efficient calculation of PageRank is complicated. Moreover, we need to overcome dangling effects in some cases as well as slow convergence of the transition matrix. Primitivity adjustment with a damping (perturbation) parameter is one of the essential procedures known to ensure convergence of the transition matrix. If the perturbation parameter is not small enough, the transition matrix loses information due to the shift of information to the teleportation matrix. We formulate the PageRank problem as a first- and second-order Markov chains perturbation problem. Using numerical experiments, we compare convergence rates for different values of perturbation parameter on different graph structures and investigate the difference in ranks for the two problems.

3.1. Introduction

PageRank is a link-based criterion that captures the importance of web pages and provides a ranking of the pages in Google, subject to a user’s search query (Brin and Page 1998; Langville and Meyer 2012; Avrachenkov et al. 2013; Engström and Silvestrov 2014, 2016a,b, 2017; Engström 2016; Biganda et al. 2017). The theory behind PageRank is built from the Perron–Frobenius theory (Berman and Plemmons 1994) and Markov chains (Norris 1998), whose matrix of transition probabilities, as defined in the literature, ...

Get Applied Modeling Techniques and Data Analysis 1 now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.