My Recommendations for Google to Help Them Fight Web Spam in Their Index

[This Is an Important Post for You; See the Conclusion]
Web spam is rising by the day. Even Google's inimitable algorithm for ranking the worthiest content seems to fail at times. As a result, we miss a lot of great information simply because it lacks backlinks or the funds for advertising, and we get a lot of spammy links on the first few search result pages instead. With the world's most popular search engine in their hands, Google is an information hub. It has been given the ultimate power by God to make or break any business today. That's why businesses find it more and more important to spend a lot of money on this search engine.

As Uncle Ben says, in one of my favorite quotes, "With Great Power Comes Great Responsibility." Google has an enormous responsibility on their shoulders. Helping them improve their index and remove spam content from it is, in turn, a responsibility for every individual. Here are my recommendations to Google and their web spam captain, Matt Cutts, to help remove spam from their index.

1. Google Should Start Following People, Not Websites

There are websites like the NYTimes, and there are websites like Hubpages. The difference is that the first is moderated by a select community of individuals, who are the absolute best in the news reporting field, and the second is unmoderated and almost 90 per cent full of spam.

Currently, Google rates a page's importance based on the website on which it appears. That's why new posts on PR6 (PageRank 6) websites get indexed quickly, while PR0 websites have to wait weeks.

My recommendation is that Google should start to follow the individuals who post a particular article or piece of content, rather than the websites on which it is posted.

With Google becoming more and more ubiquitous, and fewer and fewer people without a Google account, Google should be well able to track a person's identity and web activities. There is quite a bit of paranoia floating around about Google's information-hungry (or supposedly so) nature, which nonetheless doesn't affect me, by virtue of my crappy bank balance and lack of a credit card! Still, I believe they should use this information to reduce spam content.

A website like Hubpages can have all sorts of content, just like Blogger, WordPress, Squidoo, etc. They are unmoderated and available to everyone. Naturally, companies and individuals looking to promote their businesses may use these resources for spamming. On the other hand, if you start following an honorable person who is a trusted authority, you will be tracking only good-quality content.

For instance, a journalist from the NY Times may start a blog on Blogger and write high-quality content. But this blog may have too few links to rank well for any related search terms. If Google tracked content based on the identity of the person who created it, this resource would get the popularity it deserves.

As everyone well knows, self-hosted blogs and websites are owned by individuals. Therefore, Google should not assume that they contain quality content unless the identity of the website's owner is confirmed.

2. Track Moderated Websites Differently From Unmoderated Ones

As in the above example, unmoderated websites, on which people can post anything, should be valued below moderated websites. For instance, a site like the BBC has only useful content, while Squidoo can have useful as well as crappy content.

The degree of moderation should decide the degree of importance. The BBC is a highly experienced, select group of individuals, while Associated Content (AC) is a wider group (though still moderated). While the BBC doesn't have any spam content, there is likely to be a slight amount of spam on AC, and a greater amount on Squidoo. [One more suggestion: Please rank Encarta and Encyclopedia Britannica higher than Wikipedia.]

3. People Reporting Spam/Contributing Well to the Community

Google has a spam reporting facility. I have no idea how much use Google makes of it. I occasionally report spam I find on the Net (such as keyword stuffing, hidden text, hidden redirects, cloaked pages, etc.). I report only genuine spam, but there may be thousands of people who deliberately report their competitors' URLs as spam, just to see if they can thwart their businesses.

Google should start a hidden point-based system (like the one on Yahoo Answers) that rewards every individual who genuinely reports a spam URL with a point. This way, Google can weed out the individuals who try to manipulate the system by reporting competitors. Google can also narrow the field down to a few highly trusted individuals whose spam reports get attended to faster.
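Here is a rough sketch, in Python, of how such a point system might work. The class name, the thresholds, and the idea of deducting a point for a rejected report are all my own assumptions for illustration; nothing here reflects how Google actually handles spam reports.

```python
from collections import defaultdict


class ReporterLedger:
    """Hypothetical hidden point ledger for people who report spam."""

    def __init__(self, trusted_at=5, blocked_at=-3):
        self.points = defaultdict(int)   # reporter -> current points
        self.trusted_at = trusted_at     # assumed threshold for "high trust"
        self.blocked_at = blocked_at     # assumed threshold for removal

    def record(self, reporter, confirmed_spam):
        # One point for a genuine report; losing a point for a rejected
        # report is an assumption added here, not part of the proposal.
        self.points[reporter] += 1 if confirmed_spam else -1

    def status(self, reporter):
        score = self.points[reporter]
        if score >= self.trusted_at:
            return "trusted"
        if score <= self.blocked_at:
            return "blocked"
        return "normal"

    def review_queue(self, reports):
        """Order pending (reporter, url) pairs for review.

        Reports from blocked reporters are dropped; the rest are sorted
        so that reporters with more points are attended to first.
        """
        kept = [r for r in reports if self.status(r[0]) != "blocked"]
        return sorted(kept, key=lambda r: self.points[r[0]], reverse=True)
```

Someone who keeps reporting competitors would sink below the blocked threshold and drop out of the queue, while a consistently accurate reporter would rise to the front of it.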

4. News Stories Vs. News Reviews

News stories are one thing; news reviews are quite another. The former only report the latest event, while the latter analyze the story further and offer valuable information to the public. As the news breaks, Google should give importance to the popular news media sites, which give the basic information. But as more and more articles from responsible people spring up, analyzing the news story from various angles, Google should expand their index to rank these posts higher.

5. Image Search; Video Search

I believe Google can evaluate the quality of a particular image or video. For a recent search query about an actress, I found a lot of web pages in the results displaying the same images, or images of lower quality. I believe Google should rank respectable websites that have more images, in higher quality, higher in image search. The same goes for video search.

6. Don't Depend Thoroughly on Search Engine Optimization

We know well that Google is built upon search engine optimization. A select group of people knows the importance and tactics of SEO, and they help rank websites higher in search results. But there are a huge number of individuals on the Web who manage highly respectable websites and blogs, rich with information, without knowing shit from shinola about SEO. Though Google ranks a website based on almost 200 factors, it still misses some of these very best resources.

When users search on Google and go directly to the tenth search engine results page (SERP), they may find a quite useful resource that is stuck there because its owner doesn't have the expertise or revenue to push it to the top. It happened to me just yesterday: a website I found to be a very good resource was ranked on the tenth page, while the crappiest websites ranked on the first and second pages.

Therefore, Google should employ personnel solely to find the best resources on the Web that are not yet indexed or ranked. This will not only help improve the index but also reduce spam. People's trust in Google will increase as a result.

7. Don't Disclose the Most Important SEO Factors

I sincerely hope that you have closely guarded secrets, Google. If you give great weight to any unknown factor while ranking a page (and I hope that is so), then keep it very secret.

Google bombing happened in the first place because Google published information about how it ranks sites. When it keeps its secrets closely guarded, I believe people will be unsure what to do and what not to do.

Maybe Google should reduce the importance of links and other factors like keyword placement. Rank websites based on factors not really known to most (it is legally possible, I believe), and then you may well be able to reduce bombing incidents.

8. Selective Moderation of Comments

I believe not all comments should be devalued with nofollow attributes. Comments from honorable people who really add to a discussion should be given value, and those that are pure spam should be devalued even if they are dofollow. Google should even think about starting a comment rating system that lets the community rate comments based on the value they add. Otherwise, it should think about acquiring a comment-based social interaction service like SezWho.
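To make the idea concrete, here is a minimal sketch of how community ratings might decide how much value the links in a comment pass on. The 1-to-5 rating scale, the minimum number of votes, and the function name are assumptions of mine, not anything Google or SezWho is known to use.

```python
def comment_link_weight(ratings, min_votes=3):
    """Return a 0.0-1.0 weight for the links in one comment.

    `ratings` is a list of community ratings on an assumed 1-5 scale.
    The weight ignores whether the link is marked nofollow or dofollow:
    only the community's judgment of the comment counts.
    """
    if len(ratings) < min_votes:
        return 0.0  # too few votes to trust: treat the links as worthless
    average = sum(ratings) / len(ratings)
    # An average of 3 or below earns nothing; an average of 5 earns full value.
    return max(0.0, (average - 3) / 2)
```

A comment that everyone rates 5 gets full weight, while a comment rated 1 or 2 gets none, whatever attribute its links carry.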

9. Give Importance to Synonyms and Related Words

When a page is made for search engines, it looks like this:

Microsoft Office Purchasing: purchase Microsoft office suite, purchase Microsoft office with our purchasing Microsoft office system is very easy. So, go ahead and purchase your copy of Microsoft office software now!

On the other hand, if a page is made for users, it looks like this:

Purchasing Microsoft Office: Go to Microsoft website here, and look for Microsoft Store. Click the Office link on the top. Now, you will see different versions of Office (Standard, Professional, Small Business, Ultimate, Home/Student). Choose the appropriate one and click Add to Cart.

If you analyze both versions closely, you will see that in the first version, "Microsoft Office" appears everywhere. In the second, it appears in only one or two places. But the second version has quite a few related terms, like the names of the MS Office editions, the site where it is sold, etc. The second is perfectly useful for a reader, while the first doesn't add any value and is pure spam. This shows the importance of related words and synonyms.

Right now, Google ranks a page based on the keywords appearing on it, not based on related words and synonyms. With Google's impressive dictionary, it should be able to find synonyms and related words quite easily. It should use them to rank a page.
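The difference between the two sample pages can be measured with a few lines of Python. This is only a sketch: the two functions and the hand-picked list of related terms are my own assumptions, and a real search engine would draw related words from a far larger dictionary.

```python
import re

# The two sample pages from above.
STUFFED = ("Microsoft Office Purchasing: purchase Microsoft office suite, "
           "purchase Microsoft office with our purchasing Microsoft office "
           "system is very easy. So, go ahead and purchase your copy of "
           "Microsoft office software now!")
HELPFUL = ("Purchasing Microsoft Office: Go to Microsoft website here, and "
           "look for Microsoft Store. Click the Office link on the top. Now, "
           "you will see different versions of Office (Standard, "
           "Professional, Small Business, Ultimate, Home/Student). Choose "
           "the appropriate one and click Add to Cart.")

# Hypothetical hand-picked related terms for "Microsoft Office".
RELATED = ["standard", "professional", "store", "cart", "student"]


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def keyword_density(text, keyword):
    """Fraction of the page's words taken up by the exact keyword phrase."""
    words, phrase = tokenize(text), tokenize(keyword)
    n = len(phrase)
    if not words or n == 0:
        return 0.0
    hits = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == phrase)
    return hits * n / len(words)


def related_coverage(text, related_terms):
    """Fraction of the related terms that appear at least once on the page."""
    if not related_terms:
        return 0.0
    words = set(tokenize(text))
    return sum(1 for t in related_terms if t.lower() in words) / len(related_terms)
```

Calling `keyword_density(STUFFED, "Microsoft Office")` gives a much higher figure than the same call on `HELPFUL`, while `related_coverage` with `RELATED` comes out the other way round. A ranking that rewards coverage and penalizes very high density would favor the page written for users.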

Conclusion

Those were my recommendations to the ubiquitous Google for bettering their search service. I sincerely hope that these suggestions somehow reach Google. You can help me with this. If you love Google and want to make their index better, then write your ideas to me or post them to your blog (if you have one) and let me know. You will get a link from this post if your suggestions are really good.

Also, make sure that you digg this post and share it so that these ideas reach Google quickly.

Copyright © Lenin Nair 2008
