The Effect of Inbound Links
It has already been shown that each additional inbound link for
a web page always increases that page's PageRank. Taking a look at the PageRank algorithm, which is given
by
PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))
one may assume that an additional inbound link from page X increases the PageRank of
page A by
d x PR(X) / C(X)
where PR(X) is the PageRank of page X and C(X) is the total number of its outbound
links. But page A usually links to other pages itself, so these pages receive a PageRank benefit as well. If these
pages link back to page A, page A gains an even higher PageRank benefit from its additional inbound
link.
The individual effects of additional inbound links can be illustrated by an
example.
We consider a website consisting of four pages A, B, C and D which are linked to each
other in a circle (A links to B, B to C, C to D and D back to A). Without external inbound links to any of these
pages, each of them has a PageRank of 1. We now add a page X to our example, for which we presume a constant
PageRank PR(X) of 10. Further, page X links to page A by its only outbound link. Setting the damping factor d to
0.5, we get the following equations for the PageRank values of the individual pages of our site:
PR(A) = 0.5 + 0.5 (PR(X) + PR(D)) = 5.5 + 0.5 PR(D)
PR(B) = 0.5 + 0.5 PR(A)
PR(C) = 0.5 + 0.5 PR(B)
PR(D) = 0.5 + 0.5 PR(C)
Since each page has exactly one outbound link, every divisor C(T) equals 1 and can be
left out of the equations. Solving them gives us the following PageRank values:
PR(A) = 19/3 = 6.33
PR(B) = 11/3 = 3.67
PR(C) = 7/3 = 2.33
PR(D) = 5/3 = 1.67
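We see that the initial effect of the additional inbound link for page A, which is
given by
d x PR(X) / C(X) = 0.5 x 10 / 1 = 5
is passed on by the links on our site: page B gains 2.67, page C 1.33 and page D 0.67, and the share that returns
from page D lifts the gain of page A itself to 5.33.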
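The four equations can also be solved numerically. The following Python sketch is not part of the original calculation; it simply iterates the equations, starting from a PageRank of 1 for each page, until the values settle. The function name solve_ring and its parameters are illustrative assumptions.

```python
def solve_ring(d, pr_x=10.0, iterations=200):
    """Iterate the equations for the ring A -> B -> C -> D -> A with X -> A.

    Every page has exactly one outbound link, so no division by C(...) is needed.
    """
    a = b = c = dd = 1.0  # PageRank of each page before X is added
    for _ in range(iterations):
        # All four values are updated at once from the previous round.
        a, b, c, dd = (
            (1 - d) + d * (pr_x + dd),
            (1 - d) + d * a,
            (1 - d) + d * b,
            (1 - d) + d * c,
        )
    return {"A": a, "B": b, "C": c, "D": dd}


ranks = solve_ring(0.5)
for page, value in ranks.items():
    print(f"PR({page}) = {value:.2f}")
```

Because d is smaller than 1, the error shrinks with every round, so the loop should arrive at the values 6.33, 3.67, 2.33 and 1.67 given above.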
The Influence of the Damping Factor
The degree of PageRank propagation from one page to another by
a link is primarily determined by the damping factor d. If we set d to 0.75 we get the following equations for our
above example:
PR(A) = 0.25 + 0.75 (PR(X) + PR(D)) = 7.75 + 0.75 PR(D)
PR(B) = 0.25 + 0.75 PR(A)
PR(C) = 0.25 + 0.75 PR(B)
PR(D) = 0.25 + 0.75 PR(C)
Solving these equations gives us the following PageRank values:
PR(A) = 419/35 = 11.97
PR(B) = 323/35 = 9.23
PR(C) = 251/35 = 7.17
PR(D) = 197/35 = 5.63
First of all, we see that there is a significantly higher initial effect of the
additional inbound link for page A, which is given by
d x PR(X) / C(X) = 0.75 x 10 / 1 = 7.5
This initial effect is then propagated even more strongly by the links on our site. In
this way, the PageRank of page A is almost twice as high at a damping factor of 0.75 as it is at a damping factor
of 0.5. At a damping factor of 0.5 the PageRank of page A is almost four times as high as the PageRank of page D,
while at a damping factor of 0.75 it is only a little more than twice as high. So, the higher the damping factor,
the larger the effect of an additional inbound link on the PageRank of the page that receives the link, and the
more evenly PageRank is distributed over the other pages of a site.
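To compare the two damping factors directly, the equations can be solved exactly by substitution. This is a minimal sketch, assuming the same ring A -> B -> C -> D -> A with the single link from X to A. The helper name exact_ranks is hypothetical, and Python's fractions module is used only so that the results appear as exact fractions.

```python
from fractions import Fraction


def exact_ranks(d, pr_x=10):
    """Solve the four ring equations exactly for a damping factor d."""
    d = Fraction(d)
    k = 1 - d
    # Write PR(D) as offset + slope * PR(A) by substituting B, then C, then D.
    offset, slope = Fraction(0), Fraction(1)
    for _ in range(3):
        offset, slope = k + d * offset, d * slope
    # PR(A) = k + d * (pr_x + offset + slope * PR(A)), solved for PR(A).
    a = (k + d * (pr_x + offset)) / (1 - d * slope)
    b = k + d * a
    c = k + d * b
    dd = k + d * c
    return a, b, c, dd


for d in (Fraction(1, 2), Fraction(3, 4)):
    a, b, c, dd = exact_ranks(d)
    print(f"d = {float(d)}: PR(A) = {a}, PR(D) = {dd}, PR(A)/PR(D) = {float(a / dd):.2f}")
```

Under these assumptions the ratio PR(A)/PR(D) comes out as 3.80 for d = 0.5 and about 2.13 for d = 0.75, which is the flattening effect described above.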
The Actual Effect of Additional Inbound Links
At a damping factor of 0.5, the accumulated PageRank of all
pages of our site is given by
PR(A) + PR(B) + PR(C) + PR(D) = 14
Hence, when a page with a PageRank of 10 links to one page of our example site by its
only outbound link, the accumulated PageRank of all pages of the site increases by 10. (Before the link was added,
each page had a PageRank of 1, for a total of 4.) At a damping factor of 0.75 the accumulated PageRank of all pages
of the site is given by
PR(A) + PR(B) + PR(C) + PR(D) = 34
This time the accumulated PageRank increases by 30. In general, the accumulated PageRank of all
pages of a site increases by
(d / (1-d)) x (PR(X) / C(X))
where X is a page additionally linking to one page of the site, PR(X) is its PageRank
and C(X) its number of outbound links. The formula presented above is only valid if the additional link points to
a page within a closed system of pages, for instance a website without outbound links to other sites. To the
extent that the website has links pointing to external pages, the surplus for the site itself diminishes
accordingly, because a part of the additional PageRank is propagated to external pages.
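Both statements can be checked on our example site. The sketch below is an illustration under stated assumptions, not a general PageRank implementation: it reuses the four-page ring, takes x_share as PR(X)/C(X), and models a "leak" by giving page D a hypothetical second outbound link to an external page (c_d = 2). The names site_total and gain are invented for this example.

```python
def site_total(d, x_share, c_d=1, iterations=500):
    """Accumulated PageRank of A, B, C and D in the ring example.

    x_share is PR(X) / C(X); c_d is the number of outbound links on page D
    (1 = closed system, 2 = one extra link leaving the site).
    """
    a = b = c = dd = 1.0
    for _ in range(iterations):
        a, b, c, dd = (
            (1 - d) + d * (x_share + dd / c_d),
            (1 - d) + d * a,
            (1 - d) + d * b,
            (1 - d) + d * c,
        )
    return a + b + c + dd


def gain(d, x_share, c_d=1):
    """Increase of the site's accumulated PageRank caused by the link from X."""
    return site_total(d, x_share, c_d) - site_total(d, 0.0, c_d)


for d in (0.5, 0.75):
    predicted = d / (1 - d) * 10
    print(f"d = {d}: closed = {gain(d, 10):.2f}, predicted = {predicted:.2f}, "
          f"with external link on D = {gain(d, 10, c_d=2):.2f}")
```

For the closed ring the measured gain should match (d / (1-d)) x (PR(X) / C(X)), that is 10 and 30. With the assumed external link on page D the gain stays below that value, because part of the additional PageRank leaves the site.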
The justification of the above formula is given by Raph Levien and is based on the
Random Surfer Model. The walk length of the random surfer follows an exponential distribution with a mean of
d/(1-d). When the random surfer follows a link into a closed system of web pages, he visits on average d/(1-d)
pages within that closed system. Accordingly, d/(1-d) times the PageRank of the linking page, divided by the number
of its outbound links, is distributed to the closed system.
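The mean walk length can be checked with a few lines of Python. The sketch assumes one particular reading of the Random Surfer Model, namely that the surfer follows another link with probability d and stops with probability 1-d at every step, so that the number of links followed is k with probability (1-d) x d^k. Both function names are hypothetical.

```python
import random


def expected_walk_length(d, terms=2000):
    """Mean number of links followed, summed term by term from the distribution."""
    return sum(k * (1 - d) * d**k for k in range(terms))


def simulated_walk_length(d, walks=100_000, seed=0):
    """Average number of links followed over many simulated random surfers."""
    rng = random.Random(seed)
    total = 0
    for _ in range(walks):
        # Keep following links while the surfer does not get bored.
        while rng.random() < d:
            total += 1
    return total / walks


d = 0.85
print(f"formula d/(1-d):  {d / (1 - d):.3f}")
print(f"summed series:    {expected_walk_length(d):.3f}")
print(f"simulated surfer: {simulated_walk_length(d):.3f}")
```

The summed series should agree with d/(1-d), and the simulated average should come close to it, with small deviations caused by the random sampling.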
For the actual PageRank calculations at Google, Lawrence Page and Sergey Brin claim
to usually set the damping factor d to 0.85. Thereby, the boost for a closed system of web pages by an additional
link from page X is given by
(0.85 / 0.15) x (PR(X) / C(X)) = 5.67 x (PR(X) / C(X))
So, inbound links have a far larger effect than one may assume.
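How strongly this factor depends on the damping factor can be seen by evaluating d/(1-d) for the three values used in this section. The function name closed_system_multiplier is an assumption made for this small illustration.

```python
def closed_system_multiplier(d):
    """Factor d / (1 - d) that is applied to PR(X) / C(X) for a closed system."""
    if not 0 <= d < 1:
        raise ValueError("damping factor must satisfy 0 <= d < 1")
    return d / (1 - d)


for d in (0.5, 0.75, 0.85):
    print(f"d = {d}: multiplier = {closed_system_multiplier(d):.2f}")
```

The multiplier grows from 1 at d = 0.5 to 3 at d = 0.75 and to about 5.67 at d = 0.85, and it keeps growing quickly as d approaches 1.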
The PageRank-1 Rule
Users of the Google Toolbar often notice that pages with a
certain Toolbar PageRank have an inbound link from a page with a Toolbar PageRank which is higher by one. Some take
this observation as a reason to doubt that the PageRank algorithm presented here reflects the actual ranking
methods of the Google search engine. It shall be shown, however, that the PageRank-1 rule complies with the
PageRank algorithm.
In essence, the PageRank-1 rule confirms the fundamental principle of PageRank: web
pages are important themselves if other important web pages link to them. It is not necessary for a page to have
many inbound links to rank well. A single link from a high ranking page is sufficient.
To show the actual consistency of the PageRank-1 rule with the PageRank algorithm,
several factors have to be taken into consideration. First of all, the Toolbar PageRank is a logarithmically scaled
version of real PageRank values. If the Toolbar PageRank of one page is one higher than the Toolbar PageRank of
another page, then its real PageRank can be higher by a factor equal to the logarithmic base used for scaling
Toolbar PageRank. If the logarithmic base is 6 and the Toolbar PageRank of a linking page is 5, then the real
PageRank of the page which receives the link can be 6 times smaller and that page still gets a Toolbar PageRank
of 4.
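The logarithmic scaling can be made concrete with a small sketch. The exact mapping between real and Toolbar PageRank is not given here, so the function below is purely hypothetical: it assumes that Toolbar level n starts at a real PageRank of base^n, uses the base of 6 from the example, and caps the result at 10.

```python
def toolbar_pagerank(real_pr, base=6, top=10):
    """Hypothetical Toolbar PageRank: level n starts at a real PageRank of base**n."""
    level, threshold = 0, base
    # Step up one level each time the real value reaches the next power of base.
    while level < top and real_pr >= threshold:
        level += 1
        threshold *= base
    return level


low_five = 6**5           # lowest real PageRank that maps to level 5
print(toolbar_pagerank(low_five))        # 5
print(toolbar_pagerank(low_five // 6))   # 4: six times smaller, still level 4
print(toolbar_pagerank(low_five - 1))    # 4: just below the level 5 threshold
```

Under this assumed mapping, each Toolbar level spans real PageRank values that differ by up to a factor of 6, which is why a value 6 times smaller than a "low" 5 still counts as a 4.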
However, the number of outbound links on the linking page counteracts the effect of the
logarithmic base, because the PageRank propagated from one page to another is divided by the number of outbound
links on the linking page. But it has already been shown that the PageRank benefit from a link is higher than the
PageRank algorithm's term d(PR(Ti)/C(Ti)) suggests. The reason is that the PageRank benefit for one page is further
distributed to other pages within the site. If those pages link back, as usually happens, the PageRank benefit for
the page which initially received the link is accordingly higher. If we assume that, at a high damping factor, the
logarithmic base for PageRank scaling is 6 and a page receives a PageRank benefit which is twice as high as
the PageRank of the linking page divided by the number of its outbound links, the linking page could have up to
12 outbound links (6 x 2) and the Toolbar PageRank of the page receiving the link would still be at most one lower
than the Toolbar PageRank of the linking page.
A number of 12 outbound links admittedly seems relatively small. But normally, if a
page has an external inbound link, this is not the only one for that page. Most likely other pages link to that
page and propagate PageRank to it. And if there are examples where a page receives a single link from another page
and the PageRanks of both pages comply with the PageRank-1 rule although the linking page has many outbound links,
this is first of all an indication that the linking page's Toolbar PageRank is at the upper end of its range. The
linking page could be a "high" 5 and the page receiving the link could be a "low" 4. In this way, the linking page
could have up to 72 outbound links (12 x 6). This number rises accordingly if we assume a higher logarithmic base
for the scaling of Toolbar PageRank.
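The figures of 12 and 72 outbound links follow from simple arithmetic, which the sketch below reproduces. It rests on the assumptions made in the text (a logarithmic base of 6 and a benefit twice as high as PR(X)/C(X)) plus two simplifications of our own: Toolbar level n is taken to start at a real PageRank of base^n, and the receiving page is assumed to have no other source of PageRank. The function name max_outbound_links is hypothetical.

```python
def max_outbound_links(linker_pr, receiver_toolbar, base=6, benefit_factor=2):
    """Largest C(X) for which the receiving page still reaches the given Toolbar level.

    The receiver is assumed to gain benefit_factor * linker_pr / C(X) and to need
    a real PageRank of at least base**receiver_toolbar.
    """
    needed = base**receiver_toolbar
    return benefit_factor * linker_pr // needed


low_five = 6**5        # linking page at the bottom of Toolbar level 5
high_five = 6**6 - 1   # linking page just below Toolbar level 6

print(max_outbound_links(low_five, 4))    # 12
print(max_outbound_links(high_five, 4))   # 71
```

For a linking page at the very top of level 5 the limit approaches 72 without quite reaching it, because a real PageRank of 6^6 would already belong to level 6.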