The Effect of Outbound Links
Since PageRank is based on the linking structure of the whole web, it is inescapable that if the inbound links of a page influence its PageRank, its outbound links also have some impact. To illustrate the effects of outbound links, we take a look at a simple example.
We consider a web consisting of two websites, each having two web pages. One site consists of pages A and B, the other consists of pages C and D. Initially, the two pages of each site link solely to each other. It is obvious that each page then has a PageRank of one. Now we add a link which points from page A to page C. Page A then has two outbound links, so each of them passes on only half of PR(A). At a damping factor of 0.75, we therefore get the following equations for the single pages' PageRank values:
PR(A) = 0.25 + 0.75 PR(B)
PR(B) = 0.25 + 0.375 PR(A)
PR(C) = 0.25 + 0.75 PR(D) + 0.375 PR(A)
PR(D) = 0.25 + 0.75 PR(C)
Solving the equations gives us
the following PageRank values for the first site:
PR(A) = 14/23
PR(B) = 11/23
We therefore get an accumulated
PageRank of 25/23 for the first site. The PageRank values of
the second site are given by
PR(C) = 35/23
PR(D) = 32/23
So, the accumulated PageRank of the second site is 67/23. The total PageRank for both sites is 92/23 = 4. Hence, adding a link has no effect on the total PageRank of the web. Additionally, the PageRank benefit for one site equals the PageRank loss of the other: the first site drops from 2 to 25/23, a loss of 21/23, while the second rises from 2 to 67/23, a gain of 21/23.
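The values above can be checked numerically. The following is a minimal sketch, assuming the non-normalised form PR(p) = (1 - d) + d x (sum of PR(q) / C(q) over all pages q linking to p) that the equations in this example use. The function name pagerank and the fixed iteration count are illustrative choices, not part of the original text.

```python
from fractions import Fraction

def pagerank(links, d=0.75, iterations=200):
    """Repeatedly apply PR(p) = (1 - d) + d * sum(PR(q) / C(q)) for all q linking to p."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        new = {page: 1.0 - d for page in links}
        for page, targets in links.items():
            for target in targets:
                # Each outbound link passes on an equal share of the page's PageRank.
                new[target] += d * pr[page] / len(targets)
        pr = new
    return pr

# Two sites (A, B) and (C, D), plus the added link from A to C.
links = {"A": ["B", "C"], "B": ["A"], "C": ["D"], "D": ["C"]}
pr = pagerank(links)

for page in sorted(pr):
    # Show each value as the nearest fraction with a small denominator.
    print(page, Fraction(pr[page]).limit_denominator(100))
print("total:", round(sum(pr.values()), 6))
```

With a damping factor of 0.75 the iteration error shrinks by at least a quarter per step, so 200 steps are far more than this small example needs.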
The Actual Effect of Outbound Links
As has already been shown, the PageRank benefit that a closed system of web pages gains from an additional inbound link is given by
(d / (1-d)) x (PR(X) / C(X))
where X is the linking page, PR(X) is its PageRank and C(X) is the number of its outbound links. Hence, this value also represents the PageRank loss of a formerly closed system of web pages when a page X within this system now links to an external page.
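As a quick check, the formula can be applied to the two-site example from the previous section, where page A (PageRank 14/23, two outbound links) links out of the site consisting of A and B. This is a small sketch using exact fractions; the variable names are chosen here for illustration.

```python
from fractions import Fraction

d = Fraction(3, 4)        # damping factor 0.75
pr_x = Fraction(14, 23)   # PR(A) after the link to C has been added
c_x = 2                   # A links to B and to C

# (d / (1 - d)) x (PR(X) / C(X))
loss = d / (1 - d) * pr_x / c_x

# Compare with the example: the site of A and B had a PageRank of 2 before
# the link was added and has 14/23 + 11/23 afterwards.
site_before = Fraction(2)
site_after = Fraction(14, 23) + Fraction(11, 23)

print(loss, site_before - site_after)
```

Both expressions evaluate to 21/23, which is also the amount gained by the site consisting of C and D.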
The formula above is only valid if the page which receives the link from the formerly closed system of pages does not link back to that system, since the system otherwise regains some of the lost PageRank. Of course, this effect may also occur when the link back does not come directly from the page that receives the link, but from another page which has an inbound link from that page. However, this effect may be disregarded because of the damping factor if there are enough other web pages in between. The formula is also only valid if the linking page has no other external outbound links. If it does, the loss of PageRank of the site in question is smaller, and the pages already receiving a link from that page lose PageRank accordingly.
Even if the actual PageRank values for the pages of an existing web site were known, it would not be possible to calculate to what extent an added outbound link diminishes the PageRank of the site, since the formula presented above refers to the state after the link has been added.
Intuitive Justification of the Effect of Outbound Links
The intuitive justification for the loss of PageRank caused by an additional external outbound link, according to the Random Surfer Model, is that adding an external outbound link to a page makes the surfer less likely to follow an internal link on that page. So, the probability of the surfer reaching other pages within the site diminishes. If those other pages of the site link back to the page to which the external outbound link has been added, this page's PageRank will decrease as well.
We can conclude that external outbound links diminish the total PageRank of a site and probably also the PageRank of each single page of the site. But, since links between web sites are the foundation of PageRank and indispensable for its functioning, it is possible that outbound links have positive effects within other parts of Google's ranking criteria. After all, relevant outbound links contribute to the quality of a web page, and a webmaster who points to other pages integrates their content in some way into his own site.
Dangling Links
An important aspect of outbound links is their absence from web pages. When a web page has no outbound links, its PageRank cannot be distributed to other pages. Lawrence Page and Sergey Brin characterise links to such pages as dangling links.
The effect of dangling links
shall be illustrated by a small example website. We take a look
at a site consisting of three pages A, B and C. In our example,
the pages A and B link to each other. Additionally, page A
links to page C. Page C itself has no outbound links to other
pages. At a damping factor of 0.75, we get the following
equations for the single pages' PageRank values:
PR(A) = 0.25 + 0.75 PR(B)
PR(B) = 0.25 + 0.375 PR(A)
PR(C) = 0.25 + 0.375 PR(A)
Solving the equations gives us
the following PageRank values:
PR(A) = 14/23
PR(B) = 11/23
PR(C) = 11/23
So, the accumulated PageRank of all three pages is 36/23, which is just over half the value of 3 that we could have expected if page C had linked to one of the other pages. According to Page and Brin, the number of dangling links in Google's index is fairly high. One reason for this is that many linked pages are not indexed by Google, for example because indexing is disallowed by a robots.txt file. Additionally, Google now indexes several file types and not only HTML. PDF or Word files do not really have outbound links and, hence, dangling links could have a major impact on PageRank.
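The three equations of the dangling-link example can be solved by substitution. The sketch below does this with exact fractions; it assumes the same damping factor of 0.75, and the variable names are illustrative only.

```python
from fractions import Fraction

d = Fraction(3, 4)   # damping factor 0.75
base = 1 - d         # the constant term 0.25

# Substituting PR(B) = base + (d / 2) PR(A) into PR(A) = base + d PR(B)
# gives PR(A) (1 - d^2 / 2) = base + d * base.
pr_a = (base + d * base) / (1 - d * d / 2)
pr_b = base + d / 2 * pr_a
pr_c = base + d / 2 * pr_a   # C receives the same share from A as B does

total = pr_a + pr_b + pr_c
print(pr_a, pr_b, pr_c, total)
```

The total of 36/23 falls short of 3 because the PageRank that reaches page C is not passed on to any other page.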
In order to protect PageRank from the negative effects of dangling links, pages without outbound links have to be removed from the database until the PageRank values are computed. According to Page and Brin, the number of outbound links on pages with dangling links is thereby normalised. As shown in our illustration, removing one page can cause new dangling links and, hence, removing pages has to be an iterative process. After the PageRank calculation is finished, PageRank can be assigned to the formerly removed pages based on the PageRank algorithm. For this, as many iterations are needed as for removing the pages. Regarding our illustration, page C could be processed before page B. At that point, page B has no PageRank yet and so page C will not receive any either. Then, page B receives PageRank from page A, and during the second iteration page C also gets its PageRank.
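The iterative removal can be sketched as follows. The graph used here is a hypothetical one, assumed only for illustration: pages X and A link to each other, A also links to B, B links to C, and C has no outbound links. The function name strip_dangling is likewise an assumption and not taken from Page and Brin.

```python
def strip_dangling(links):
    """Remove pages without outbound links until none are left.

    Returns the remaining graph and the list of removal rounds.
    """
    graph = {page: set(targets) for page, targets in links.items()}
    rounds = []
    while True:
        dangling = sorted(page for page, targets in graph.items() if not targets)
        if not dangling:
            break
        rounds.append(dangling)
        for page in dangling:
            del graph[page]
        # Drop links that pointed to the removed pages; this may leave
        # further pages without outbound links for the next round.
        for targets in graph.values():
            targets.difference_update(dangling)
    return graph, rounds

# Hypothetical graph: X <-> A, A -> B, B -> C, C has no outbound links.
links = {"X": ["A"], "A": ["X", "B"], "B": ["C"], "C": []}
core, rounds = strip_dangling(links)
print(core)
print(rounds)
```

In this sketch page C is removed in the first round and page B, which has become dangling as a result, in the second. Assigning PageRank afterwards would proceed in the reverse order, B first and then C.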
Regarding our example website for dangling links, removing page C from the database results in pages A and B each having a PageRank of 1. After the calculations, page C is assigned a PageRank of 0.25 + 0.375 PR(A) = 0.625. So, the accumulated PageRank does not equal the number of pages, but at least the pages which have outbound links are not harmed by the dangling links problem.
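The arithmetic of this step can be written out in a few lines. This is only a sketch of the calculation described above, with illustrative variable names.

```python
d = 0.75

# With page C removed, A and B link only to each other, and a PageRank
# of 1 for both satisfies PR = (1 - d) + d * PR.
pr_a = pr_b = 1.0
assert abs(pr_a - ((1 - d) + d * pr_b)) < 1e-12

# Page C is assigned its PageRank afterwards. Page A has two outbound
# links again, so C receives half of A's damped PageRank.
pr_c = (1 - d) + d * pr_a / 2

total = pr_a + pr_b + pr_c
print(pr_c, total)
```

The total of 2.625 is below the number of pages, but pages A and B keep the full PageRank of 1 that they had before page C was linked.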
Because dangling links are removed from the database, they do not have any negative effects on the PageRank of the rest of the web. Since links to PDF files are dangling links, they do not diminish the PageRank of the linking page or site. So, PDF files can be a good means of search engine optimisation for Google.