# Google Page Rank and the impact of the Second EigenValue of the Google Matrix on Power Iteration Convergence in R

(Sandipan Dey, 2 Jan 2017)

• In this article, the basic concepts of the Google Page Rank Algorithm will be discussed along with a few examples.
• Web net can be represented as a directed graph, with nodes represented by web pages and edges represented by the links between them.
• Probabilistic point of view: Since the importance of a web page is measured by its popularity (how many incoming links it has), the importance of page i can be viewed as the probability that a random surfer on the Internet that opens a browser to any page and starts following hyperlinks, visits the page i.
• The process can be modeled as a random walk on the web graph.
• A smaller, but positive percentage of the time, the surfer will dump the current page and choose arbitrarily a different page from the web and “teleport” there. The damping factor p reflects the probability.
• The PageRank vector for a web graph with transition matrix A, and damping factor p, is the unique probabilistic eigenvector of the matrix M (as shown in the following figure), corresponding to the dominant eigenvalue 1.
• By Perron-Frobenius theorem, if the underlying web-graph is connected and aperiodic, then the power-iteration algorithm used to compute the page ranks is guaranteed to converge to a steady state vector, which is precisely the vector with the page ranks of all the nodes.

• The next figure and animations show an example problem for computing page ranks that appeared in the final exam of the same edX (CornellX) course Networks, Crowds and Markets.

• The following animation shows the pageranks computed using the power-iteration algorithm without a damping factor. As can be seen, the nodes with larger sizes have higher pageranks.

• The next animation shows the pageranks computed using the power-iteration algorithm with a damping factor 0.2.

• The next animations show the pageranks computed for an Erdos-Renyi random graph with n=100 nodes, with probability p=1/n using the power-iteration algorithm and also directly as a dominant eigenvector of the matrix M. As can be seen, the nodes with ids in larger fonts have higher pageranks.

• The next animation shows how fast the page ranks computed by the power-iteration algorithm converges to the true page ranks (computed directly as normalized dominant eigenvector) for the same E-R random graph.

• The power-iteration algorithm convergence rate is directly proportional to the ratio of the dominant eigenvalues λ2/λ1 of the matrix M. Since λ1=1, the page rank computation algorithm converges faster if the second eigenvalue is smaller.
• The next figures show the result of an experiment done starting with an Erdos-Renyi random graph with n=100 nodes, with probability p=1/2 and then iteratively removing 10 edges randomly from the graph and running the power iteration algorithm and recording the # iterations to converge with an error at most 1e-5, also noting the λ2/λ1 each time.
• As can be seen, with ratio of the dominant eigenvalues (or λ2) decreases as we go on removing the edges and as expected, #iterations taken to converge (with the same accuracy threshold) decreases (which means a faster convergence) as the second eigenvalue becomes smaller.