PageRank Algorithm

Oh, the mighty PageRank algorithm. It’s a near-mythical mathematical formula that nearly every SEO specialist and webmaster has tried to figure out or manipulate.

PageRank is named after Lawrence Page, one of the founders of Google. The original patent was filed on January 9th, 1998, and granted by the United States Patent Office on September 4th, 2001. Many people say it was named after Larry Page, but the patent says Lawrence, so we’ll go with that one. The patent number is 6,285,999. If you want to look it up, go to http://patft.uspto.gov/netahtml/PTO/srchnum.htm

What does the patent say?
The abstract for the patent states: “A method assigns importance ranks to nodes in a linked database, such as any database of documents containing citations, the world wide web or any other hypermedia database. The rank assigned to a document is calculated from the ranks of documents citing it. In addition, the rank of a document is calculated from a constant representing the probability that a browser through the database will randomly jump to the document. The method is particularly useful in enhancing the performance of search engine results for hypermedia databases, such as the world wide web, whose documents have a large variation in quality.”

So what does this mean?
Google treats the web as a database of linked documents, like any other hypermedia database. The rank of a page or document is assigned by looking at the quality (that is, the rank) of the pages linking to it. The rank is also calculated using the probability that someone will land on that page or document from wherever they happen to be. The purpose of all this is to make a better search engine.

That’s my take on it, at least. The part where it says “In addition, the rank of a document is calculated from a constant representing the probability that a browser through the database will randomly jump to the document.” is almost cryptic, and I may be totally off in my assumption. It may actually mean that the rank of the page is affected not by the position of the links on a page, but by the CTR, or click-through rate, which is possibly a better measure.
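To make the abstract a little more concrete, here is a minimal Python sketch of the idea it describes: a page’s rank is built from the ranks of the pages linking to it, plus a constant chance of a random jump. It is not Google’s actual implementation. The published PageRank paper calls that constant the “damping factor” and commonly cites 0.85 for it; the function name, the `toy_web` graph and the iteration count are my own assumptions, and the sketch assumes every linked page also appears as a key in the dictionary.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate a rank for each page.

    `links` maps each page to the list of pages it links to.
    Assumption: every linked page also appears as a key in `links`.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal ranks

    for _ in range(iterations):
        # Every page starts each round with its "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page splits its damped rank evenly over its links.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no links spreads its rank over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical three-page web: A and B both link to C, and C links back to A.
toy_web = {"A": ["C"], "B": ["C"], "C": ["A"]}
scores = pagerank(toy_web)
for page, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this toy web, C comes out on top because two pages link to it, A comes second because the well-ranked C links to it, and B, which nobody links to, is left with only the random-jump share.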

Example 2

Now here is where it gets a bit tricky. While PageRank is largely, if not entirely, based on links, most pages have more than one link on them. So a PageRank 4 page linking to 5 pages would hypothetically give 0.2, or 1/5, of its “link juice” to each page. But when figuring out PageRank you cannot assume that a PR 4 page linking to only one other page would automatically make that other page a PR 4 page. It is more likely that a PR 4 page linking to only one page would make that page a PR 1, so there are some dampening effects going on here. If it weren’t like this, then anyone with a PR 4 page could string together 100 websites, each one linking to just one other, to make 100 PR 4 sites.
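The two effects described here, the even split across links and the dampening along a chain, can be sketched in a few lines of Python. This is a simplified illustration, not the real calculation: the scores are made-up raw numbers rather than 0 to 10 toolbar values, the 0.85 dampening value is the one commonly cited for PageRank, and both function names are hypothetical.

```python
def share_per_link(score, outlink_count):
    """Split a page's score evenly across its outgoing links."""
    return score / outlink_count


def chain_scores(start_score, length, damping=0.85):
    """Scores along a chain in which each page links only to the next.

    Simplification: each hop passes on only the damped part of the
    previous score and ignores every other source of rank.
    """
    scores = [start_score]
    for _ in range(length - 1):
        scores.append(scores[-1] * damping)
    return scores


# A page with a raw score of 1.0 and 5 links passes 1/5 through each link.
print(share_per_link(1.0, 5))

# Stringing pages together loses a little score at every hop.
for hop, score in enumerate(chain_scores(1.0, 5)):
    print(hop, round(score, 3))
```

Because every hop multiplies the score by a number below 1, a long string of sites cannot simply copy the first page’s score all the way down the chain.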

Curve

I believe the algorithm follows a curve of some sort, but I am not sure. In developing this site, PageStat.com, I attempted to create a similar curve.

I’m not a mathematician, but I’m into stats and data. Data is a way of painting a picture of the world using numbers. Whenever you look at the “Doppler Radar” on the news and see the darker or lighter colors and hues that represent rain or snow, those are created using numbers.

So looking at a curve like this you can tell a few things, and all of them are true of what we see in PageRank. At the beginning of the curve, on the far bottom left, there is very little slope. Just like in life, the steeper the slope, the harder it is to get up. It takes more fuel (quality backlinks) to get to PR 5 than it does to get to PR 1. If your site somehow makes it to PR 9, you can see that the curve gradually reduces its slope before hitting the ceiling at the very top right of the page. There are only a few PR 10 sites out there. Not much room up there, right?
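The shape described here (flat at the bottom left, steep in the middle, flattening again under a ceiling) is what a logistic S-curve looks like. Below is a minimal Python sketch of such a curve. It is purely my own illustration: it is neither Google’s formula nor the exact curve used on PageStat.com, the midpoint and steepness values are invented, and it does not say which quantity belongs on which axis.

```python
import math


def s_curve(x, midpoint=5.0, steepness=1.0, ceiling=10.0):
    """Height of a logistic S-curve at horizontal position x.

    midpoint, steepness and ceiling are illustrative assumptions.
    """
    return ceiling / (1.0 + math.exp(-steepness * (x - midpoint)))


# Print the curve at whole-number positions from 0 to 10.
for x in range(11):
    print(x, round(s_curve(x), 2))
```

The printed values barely move near 0, change fastest around the midpoint, and creep toward 10 without ever reaching it.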

If you hop over to Wikipedia, they have a PageRank page as well, so if you need even more info, have a look there. If you are interested in PageRank updates, check out our PR Updates post for the updates in 2010.

If you are interested in finding out the PR, as well as many other stats like Alexa Rank, Compete Rank, and tweets about a site, go to http://pagestat.com and search for your site.