Monday, January 03, 2005

Is the Page Rank System Fair, Part 2

In a previous post I sketched out some of the background of the Page Rank system used by Google. In particular I compared it to the "citation" system used by academics to gauge the importance and relative status of academic researchers and university professors.

The primary purpose of this series of posts is to demonstrate that "Power Linking" (or what I am now calling "Multi Linking") is not cheating. Multi-linking, the process of strategically setting out to create multiple links pointing at target sites, might be considered "cheating" if one were to take what I call "an overly moralistic view" of the Page Rank system. In these two articles I explain what I mean by "an overly moralistic view", and why it is not acceptable.

The truth is, I do not spend a lot of time reading forum posts by SEO professionals. But when I do, I get the general impression that many of them take a "path of least resistance" approach to Google because they are very concerned about getting "penalized". On the face of it this would appear to be a prudent course of action. If Google penalization actually happens — e.g., the page rank of one's web site gets removed or downgraded or ranking for specific keywords gets downgraded — because of some tactic deemed to be against the rules, then it would be best to steer clear of such tactics. Especially when you are doing SEO on behalf of clients who trust your judgement.

Some similarities with the athletes-on-steroids debate

In some posts I've read, this argument tends to move from a discussion of strategy to one of ethics, more or less analogous to discussions about steroid use by athletes. In other words, it sounds like a "moral" or "ethical" argument.

From a strictly strategic point of view one would expect athletes like Mark McGwire, Ben Johnson, Carl Lewis, Barry Bonds (and a whole host of others) to make the decision about steroid use strictly from a consideration of its benefits vs risks. But of course there is an ethical dimension that cannot be ignored, and in the minds of most onlookers this dimension is actually paramount. Winners who break the rules are cheaters. End of discussion.

In fact there are at least two layers of ethical considerations going on. On one layer there are explicit rules against certain behaviour. And on another layer there are implicit rules of behaviour that one accepts as a member of the fraternity of athletes. Both of these have an important bearing on what is acceptable ("ethical") and what is not — on deciding who is and who is not a "cheater".

Understanding the dimensions of ethical behaviour is never as easy as the fundamentalists among us pretend. It is easy to say "rules are rules", but the fact of the matter is that rules are usually applied and interpreted in a dynamic context where circumstances are constantly changing. So it is not clear who exactly is abiding by the rules in such a context, because it is not always clear exactly what the rules are. Traffic speed limit laws are another good example of this. Usually police are free to exercise their discretion when applying the rules. Most of us think this is a good thing.

Charlie Francis (Ben Johnson's coach) was about as cold-blooded about this as one can be. He assumed that since virtually all the competitive sprinters of the era were using "performance-enhancing drugs", then it was strategically unwise for an aspiring sprinter not to use them as well. (Whether he was right or wrong in his facts is not really the question here.)

The moralist replies "I have never taken these drugs. I am clean. I will not cheat." And then we find out that he or she was lying, or, if not using these specific drugs, was using some other practice that was equally against the "spirit" of the rules. The rules had just not caught up with them yet. Or we find out that the associations charged with policing the rules were letting their athletes get away with infractions so they would appear to be clean.

So was Charlie Francis right or wrong? Only the fundamentalist who is too dogmatic or too lazy to think the matter through would cling to a simplistic answer.

What does this have to do with Page Rank and SEO?

The relevance of this analogy to Search Engine Optimization techniques will be lost on many of its practitioners. That is because they are too close to the activity to see that SEO is itself an activity meant to skew outcomes in the desired direction — never mind blatant techniques like keyword or link spamming. SEO is already manipulation of text and content in order to speak clearly to the Search Engines.

In other words, SEO involves what we might call "soft" manipulation. The SEO practitioner encourages us to write our text in ways we would not normally do — unnaturally. Soft manipulation is assumed to be acceptable while other "harder" types of manipulation are condemned.

So how do we distinguish between acceptable levels of manipulation and unacceptable ones? Presumably by looking at both the letter and the spirit of the rules and seeing if a certain practice falls within them. And probably understanding the "spirit" of the rules is even more important than focusing on what we think is the letter of the law, since the people at Google play their cards very close to the vest. They do not say: "Here are the rules we are now using", because that would result in even more overt attempts by SEO experts and webmasters to manipulate them.

For the moment, then, let us assume a set of "rules" actually exists. In an important sense the famous Google "algorithms" are a set of rules. But it is not really the algorithms we are after, is it? The algorithms are notoriously changeable, and are set up to operate in the service of some more fundamental principles or assumptions. So it must be the set of assumptions behind the algorithms that determines what is "right" and what is "wrong".

It is these general assumptions we are after. They will look something like this:

1. Pages should be ranked according to the "relevance" of their content.
2. "Relevance" is determined quantitatively by the emphasis within a given document on certain terms (keywords and key phrases) related to the subject matter.
3. "Relevance" is determined qualitatively by the extent to which others with similar interests refer to or "cite" a given document, calculated by the frequency with which they "link" to it.
4. Some links will be more valuable than others in determining relevance. Links from important, authoritative sites will be more valuable than links from unimportant, non-authoritative sites.

This is essentially the way the Page Rank system was formulated by the founders of Google while still post-graduate students at Stanford. (Find a link to the original document here.)
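To make assumptions 3 and 4 a little more concrete, here is a minimal sketch of the link-counting idea in Python. It is my own illustration, not Google's actual code, and it leaves out everything to do with keywords (assumptions 1 and 2). The page names are made up, the function name page_rank is hypothetical, and the damping value of 0.85 is the one suggested in the original Stanford paper. Each page passes a share of its own score to the pages it links to, so a link from a well-linked page is worth more than a link from an obscure one.

```python
def page_rank(links, damping=0.85, iterations=50):
    """Simplified Page Rank by repeated redistribution of scores.

    links maps each page name to the list of pages it links to.
    Returns a dict of page name -> score; the scores sum to 1.
    """
    # Collect every page that appears as a source or as a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with every page equally important.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline score regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical mini-web: target_a and target_b each have exactly one
# inbound link, but target_a's link comes from a page that is itself
# linked to by three others.
links = {
    "fan1": ["authority"],
    "fan2": ["authority"],
    "fan3": ["authority"],
    "authority": ["target_a"],
    "minor": ["target_b"],
    "target_a": [],
    "target_b": [],
}

ranks = page_rank(links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page:10s} {score:.4f}")
```

In this toy example target_a ends up with a higher score than target_b even though each has only one inbound link, because target_a's link comes from the better-connected page. That is assumption 4 at work, and it is also why the question of who links to whom matters so much in what follows.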

Where do these rules come from?

Given the significant influence these rules have on the development of the web, and on the economic well-being of millions of people who use it to make a living, it seems fair to ask some questions about these rules. Unless we think everyone should be an obedient little Google slave, we surely are justified in asking "Where do these rules come from?", "Why these rules rather than some others?", "Are they fair?", and "What assurance do we have that Google is applying them even-handedly?"

I will take up these questions in my next post.

-- Rick Hendershot
Internet Marketing and Web Traffic Generation
Linknet Small Business Marketing Resource Library