A simple explanation of PageRank.


A ThatsIT Solutions Tutorial

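For those who have maths degrees, here is the formula.

PR(A) = (1-d) + d(PR(t1)/C(t1) + ... + PR(tn)/C(tn))

Here t1 to tn are the pages that link to page A, C(t) is the number of outgoing links on page t, and d is the damping factor, which is 0.85 in the examples that follow.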
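
As a rough illustration, the formula can be written as a small Python function. The function name, the `inbound` argument and the default d = 0.85 (taken from the 85% figure used later in this tutorial) are assumptions made for this sketch, not part of the published algorithm.

```python
def page_rank(inbound, d=0.85):
    """Apply the formula once for a single page A.

    inbound is a list of (PR(t), C(t)) pairs, one pair for each
    page t that links to A. C(t) is the number of outgoing links on t.
    """
    return (1 - d) + d * sum(pr / links_out for pr, links_out in inbound)


# Hypothetical example: A is linked from t1 (PR 1, 1 outgoing link)
# and from t2 (PR 1, 2 outgoing links).
print(page_rank([(1.0, 1), (1.0, 2)]))  # about 1.425
```

A page with no incoming links at all still receives the (1-d) part, which is 0.15 here.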

Toolbar PR    PR value used here
0             0 to 8
1             9 to 64
2             65 to 512
3             513 to 4096
4             4097 to 32768
5             32769 to 262144

For those who would like a simpler explanation, read on.

First you must understand that the PR you see on the Google toolbar is not the same as the numbers we are using here. The toolbar PR is a logarithmic scale that is believed to be based on the number 8, so each toolbar step covers a range 8 times larger than the one before, as in the table above.
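
To make the logarithmic scale concrete, here is a minimal sketch that turns a PR value into a toolbar value using the ranges in the table above. The base of 8 is only the commonly believed figure, the function name is made up for this example, and continuing the pattern past toolbar PR 5 (up to the toolbar maximum of 10) is an assumption.

```python
def toolbar_pr(real_pr, base=8, max_toolbar=10):
    """Map a PR value onto the toolbar scale (assumed to be base 8)."""
    toolbar = 0
    upper = base  # top of the range for toolbar PR 0
    # Each step up the toolbar scale covers a range `base` times larger.
    while real_pr > upper and toolbar < max_toolbar:
        upper *= base
        toolbar += 1
    return toolbar


for value in (8, 9, 64, 65, 4097, 262144):
    print(value, "->", toolbar_pr(value))
```

Integer thresholds are used rather than a logarithm function so that values sitting exactly on a boundary, such as 64 or 512, land in the range the table gives them.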

The following is a simple explanation of PageRank according to the algorithm published in 1998 by Lawrence Page and Sergey Brin, the founders of Google.

There have been numerous changes since that time, but tests have shown that the main premise still holds true.

Some of the obvious changes to be aware of are…

  • A link higher in the page carries more weight than one lower in the page.
  • Links in the main content area of a page carry more weight than links in other sections; links in the footer carry the least.
  • Links also carry relevance:
    • Relevance of the linking page to the linked page.
    • Relevance of the link text.

The explanation

Total PageRank

All pages have PR. We are going to start each of our pages off with a PR of 1; this starting number can be anything.

So the example 3-page website seen below has a total PR of 3, that's 1 per page. We can double our total PR by adding 3 more pages; we will then have a total PR of 6, but still 1 per page. So we are getting nowhere.

[Image: pagerank explained]

How to manipulate PageRank.

Depending on our link structure we can manipulate PR and push the PR onto certain pages we want to rank at the expense of other pages in the site.

PageRank flow

First we need to understand the flow of PageRank. When one page links to another, it passes some of its PR to the linked page: it passes on 85% and keeps 15%.

Even linking

Here we see that the home page A passes its PR to page B, B passes 85% to C, and C passes 85% back to A. In this scenario all is equal, and every page ends up with a PR of 1, the same as it started with.

[Image: simple pagerank]

If all pages linked to all other pages, as in the first image, the result would be the same: all is equal.

In order to manipulate PR we need to have some sort of unevenness in the linking structure.

I will take this time to explain another complexity in this calculation. In this simple example we get a true answer after doing the calculations once because of the equality of the linking structure, but in more complex scenarios we cannot get a true calculation with one iteration. We cannot know page A’s true PR until we calculate the PR of B and C, and we cannot calculate the PR of B or C until we calculate the PR of the other pages either. The way we get around this is to do the calculation over and over until we get to a point where the results change by only a small fraction. Each time we do the calculations the change in PR is smaller and smaller. After about 40 iterations we have a change of less than 1 millionth, and that is close enough.
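
The repeated calculation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `links` dictionary format, the function name, the tolerance of one millionth and the choice to update every page from the previous pass are mine, and the number of passes needed depends on the link structure and on how the updates are ordered.

```python
def pagerank(links, d=0.85, tol=1e-6, max_iter=1000):
    """Iterate the PR formula until no page changes by more than tol.

    links maps each page to the list of pages it links to. Every page
    must appear as a key, even if its list of links is empty.
    """
    ranks = {page: 1.0 for page in links}  # start every page at PR 1
    for iteration in range(1, max_iter + 1):
        new_ranks = {}
        for page in links:
            # Sum PR(t)/C(t) over every page t that links to this page.
            inbound = sum(
                ranks[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            new_ranks[page] = (1 - d) + d * inbound
        # Largest change in any page's PR during this pass.
        change = max(abs(new_ranks[p] - ranks[p]) for p in links)
        ranks = new_ranks
        if change < tol:
            return ranks, iteration
    return ranks, max_iter


# The even linking example: A links to B, B to C, and C back to A.
ring = {"A": ["B"], "B": ["C"], "C": ["A"]}
ranks, iterations = pagerank(ring)
print(ranks, iterations)  # every page stays at about 1.0 after a single pass
```

Passing in a less even structure makes the function take many more passes before the change drops below the tolerance.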

Uneven linking

In this next example we have a flat linking structure. Here we have 2 links from page A, so only 50% of the available 85% of PR is passed to each linked page.

[Image: flat linking structure]

After doing our calculations we find that we have a PR of about 1.46 for A, and only about 0.77 for both B and C.

You can now see that we are able to make one page rank at the expense of the others. Now let's go back to the total PR of the website: the more pages we have, the more we can increase the rank of one page, or a small number of pages, at the expense of the others.

If we had a website with 20 pages, following this linking structure we would have a PR for page A of 9.27, and the rest would have 0.56 each. So adding pages to our site can help us make a small set of pages rank higher.
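
The 20-page figures can be checked without iterating. The sketch below assumes that page A links to every other page and that each of those pages links only back to A. That structure is my reading of the example, and it does reproduce the 9.27 and 0.56 quoted above. With n other pages, the two equations A = (1-d) + d·n·B and B = (1-d) + d·A/n solve to A = (1 + d·n)/(1 + d). The function name is made up for this example.

```python
def hub_and_spoke_pr(total_pages, d=0.85):
    """PR of hub page A and of each other page in a flat structure.

    Assumes A links to every other page and every other page links
    only back to A.
    """
    spokes = total_pages - 1
    hub = (1 + d * spokes) / (1 + d)      # solved form of the two equations
    spoke = (1 - d) + d * hub / spokes    # each spoke gets an equal share from A
    return hub, spoke


for n in (3, 20):
    hub, spoke = hub_and_spoke_pr(n)
    print(f"{n} pages: A = {hub:.2f}, others = {spoke:.2f}")
# 3 pages: A = 1.46, others = 0.77
# 20 pages: A = 9.27, others = 0.56
```

In both cases the values still add up to the number of pages, so the total PR of the site has only been moved around, not increased.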

Now what would happen if, in the 3-page example, we had a link to page A from an external page with a PR of 10 (remember, this is not Google toolbar PR)? We would now have a PR of 6.5 for A and 3.25 for each of the other pages.

Some points to consider.
  • In these examples only page A received more than one link. On a site with more pages we can get more pages to rank, but you need to keep a good balance of ranking pages to supporting pages.
  • Once we consider that an external link coming into the site can increase the average PR per page above 1, adding a new page with a PR of 1 will decrease our average PR per page. This will still help the ranking pages rank higher, but the ratio of pages that can rank above the average gets smaller and the number of pages that must rank below the average gets larger. It is very hard to work out how many high-ranking pages you can have, but as a rule of thumb, keep your ranking pages to 10% or less of the total number of pages; a site with more than 10,000 pages may need to lower this number further.
  • All pages must be in the index to be included in the calculations.
  • All pages must have an incoming and an outgoing link in order to be included in the calculations.
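
The last point is easy to check for your own site if you have a list of its internal links. The sketch below is one possible way to do it; the `links` dictionary format, the function name and the sample site are all hypothetical.

```python
def pages_missing_links(links):
    """Return (pages with no incoming link, pages with no outgoing link).

    links maps each page to the list of pages it links to.
    """
    has_incoming = {target for targets in links.values() for target in targets}
    all_pages = set(links) | has_incoming
    no_incoming = all_pages - has_incoming
    # A page with an empty list, or no entry at all, has no outgoing links.
    no_outgoing = {page for page in all_pages if not links.get(page)}
    return sorted(no_incoming), sorted(no_outgoing)


# Hypothetical site: C links to nothing and nothing links to D.
site = {"A": ["B", "C"], "B": ["A"], "C": [], "D": ["A"]}
no_in, no_out = pages_missing_links(site)
print(no_in)   # ['D']
print(no_out)  # ['C']
```

Any page that shows up in either list needs a link added before it can take part in the calculations described here.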

Keep it natural

I hope that was informative, but remember, you need to have a linking structure that your users can use to navigate around the site. You will not be able to have the optimum linking structure and still be user friendly. My advice is to treat this information as you would treat a golf lesson. If you try too hard to do what you have just learnt, your swing will be all stiff and unnatural. You should try to swing naturally, with the new knowledge in the back of your head.