A small amount of duplicate content may not affect your blog’s search engine rankings. However, it can easily get out of control and start to hit your rankings badly, even without your knowledge. Here are ways to curb it.
What Is Duplicate Content
Do you have a blog in which you post regularly? Do any two different URLs in that blog show the same content? If so, that is duplicate content. On self-hosted blogs, features such as print preview pages, monthly archive pages, and category pages can all cause it. In such cases, search engines normally rank one of the pages lower. In extreme cases, however, when a blog has many pages with the same content, the blog itself can be penalized.
Google puts it this way:
"In the rare cases in which Google perceives that duplicate content may be shown with intent to manipulate our rankings and deceive our users, we'll also make appropriate adjustments in the indexing and ranking of the sites involved. As a result, the ranking of the site may suffer, or the site might be removed entirely from the Google index, in which case it will no longer appear in search results."
This indicates how serious duplicate content can be.
How It Comes into Play
Duplicate content in Blogger blogs can be the result of having different URLs point to the same content. Recently, I found that this was happening with the Recent Comments widget in Blogger. If you enable it in the sidebar, the new comments will receive a special URL in this form: http://cutewriting.blogspot.com/2008/07/story-of-theme-redesign-and-blogger-w3c.html?showComment=1224840480000#c7659187557945906929
This ‘showComment’ link may get indexed by Google, and it then becomes a duplicate of the original blog post URL (which ends at "w3c.html"). This can be serious for blog posts with many comments, as each of these comment links can be indexed as a separate URL by Google.
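To make the relationship between the two URLs concrete, here is a minimal Python sketch that reduces a comment link to the post URL it duplicates. The function name canonical_post_url is my own and not part of Blogger or Google; the sketch only assumes that the showComment parameter and the #c fragment are the parts that differ.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_post_url(url: str) -> str:
    """Drop the showComment parameter and the fragment from a post URL."""
    parts = urlsplit(url)
    # Keep every query parameter except showComment.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "showComment"
    ]
    # An empty fragment removes the "#c..." comment anchor.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


comment_url = (
    "http://cutewriting.blogspot.com/2008/07/"
    "story-of-theme-redesign-and-blogger-w3c.html"
    "?showComment=1224840480000#c7659187557945906929"
)
print(canonical_post_url(comment_url))
```

Every comment link on a post collapses to the same address this way, which is why many comments can mean many duplicates of one page.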
Another source of duplicate content is the monthly archive pages, which contain all posts made in a particular month. In Blogger, these archive pages are not disallowed with a Robots.txt directive, so they get indexed by search engines and cause duplicate content problems.
The category (label) pages on Blogger do not cause a duplicate content penalty, as they are already disallowed through Blogger’s Robots.txt file. You can verify this in Webmaster Tools.
How to Fight Duplicate Content
Here is all you have to do to find duplicate content:
- Go to Google and search for site: followed immediately by your blog’s URL (no space after the colon).
- See how many URLs Google has indexed. If there are many more than the number of posts you have published, duplication is very likely.
- Send a removal request at Google Webmaster Tools for all unnecessary URLs (like the showComment URL above). Don’t request removal of any normal post URL.
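The check above can be sketched in Python, assuming you have copied the indexed URLs from the site: search into a list by hand. The example.blogspot.com addresses and the removal_candidates helper are made up for illustration.

```python
from urllib.parse import parse_qs, urlsplit


def removal_candidates(indexed_urls):
    """Return the indexed URLs that carry a showComment parameter."""
    return [
        url
        for url in indexed_urls
        if "showComment" in parse_qs(urlsplit(url).query)
    ]


# Hypothetical list of URLs copied from a site: search.
indexed = [
    "http://example.blogspot.com/2008/07/first-post.html",
    "http://example.blogspot.com/2008/07/first-post.html?showComment=1224840480000",
    "http://example.blogspot.com/2008/08/second-post.html",
]
published_posts = 2

# More indexed URLs than published posts hints at duplication.
extra = len(indexed) - published_posts
print("Extra indexed URLs:", extra)
for url in removal_candidates(indexed):
    print("Consider requesting removal of:", url)
```

The sketch only flags showComment links; normal post URLs are never returned, in line with the warning not to remove them.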
Removal Request Screenshots
Before requesting removal, make sure you meet the criteria here. Here are the screenshots of requesting the removal of a URL from Webmaster Tools.
1. The red link is the duplicate comment link
2. Starting a removal request
Clicking the New Removal Request button under Google Webmaster Tools->Tools->Remove URLs opens this window.
3. Put in the URL and add it
4. Once done, submit the request
When using the Recent Comments widget, take this precaution:
- Go to Layout->Edit HTML
- Search for “Recent Comments” or whichever title you have given to the recent comments widget.
- Look for expr:href='data:i.alternate.href'. Right after it, add the rel="nofollow" attribute, as in the image.
- Once done, save the template. Now the recent comments widget will be automatically nofollowed.
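The edit in these steps is normally made by hand in the template editor. Purely as an illustration of what changes, here is a Python sketch that applies the same edit to a snippet of template text. The sample anchor tag is hypothetical and your widget’s markup may differ; only the expr:href attribute is taken from the steps above.

```python
import re

HREF_ATTR = "expr:href='data:i.alternate.href'"


def add_nofollow(widget_markup: str) -> str:
    """Add rel='nofollow' after each matching href attribute that lacks one."""
    # The negative lookahead skips links that already have a rel attribute.
    pattern = re.escape(HREF_ATTR) + r"(?!\s+rel=)"
    return re.sub(pattern, HREF_ATTR + " rel='nofollow'", widget_markup)


# Hypothetical fragment of the Recent Comments widget markup.
sample = "<a expr:href='data:i.alternate.href'><data:i.title/></a>"
print(add_nofollow(sample))
```

Pass in only the Recent Comments part of the template, since the function changes every matching link in the text it receives. Running it twice leaves the result unchanged.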
[A very important update to this post: Adding Nofollow to post page timestamp]
However, make sure these URLs are not linked to from anywhere on the web. Such links will cause them to get indexed and noticed by search engines.
We cannot edit the Blogger Robots.txt file, so the monthly archive pages may be included in the index automatically. To prevent this, make sure you don’t link to the monthly archive pages from anywhere on your blog. If you find links to these pages on another site, ask its owner to remove them or add the nofollow attribute to them. If you self-host your blog, disallow all duplicate pages for search bots through a Robots.txt directive.
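For a self-hosted blog, it can help to test the Robots.txt rules before relying on them. The sketch below uses Python’s standard urllib.robotparser module; the /archives/ and /print/ paths and the blog.example.com address are assumptions for illustration, so substitute whatever paths your own blog uses for archive and print pages.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for a self-hosted blog; adjust the paths to your setup.
ROBOTS_TXT = """\
User-agent: *
Disallow: /archives/
Disallow: /print/
"""


def is_blocked(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given rules forbid the agent from fetching the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


print(is_blocked(ROBOTS_TXT, "http://blog.example.com/archives/2008/07/"))
print(is_blocked(ROBOTS_TXT, "http://blog.example.com/2008/07/some-post.html"))
```

The first URL is an archive page and should come back as blocked, while the normal post URL should not. If a post URL shows up as blocked, the rule is too broad.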
Look at my sidebar, where each month’s archive is shown as a monthly post recap page (for better indexing). This is a normal post page with links to all of that month’s posts.
If you have a design like my WP Premium here, the monthly archives are placed within a JavaScript widget on the sidebar, which search engines will not find. Just make sure, however, that nobody links to these archive pages. If you find they are still indexed, request a removal at Webmaster Tools.
Copyright © Lenin Nair 2008
It is a nice post, and I did the comment nofollow right now.
Thank you for the tip.
I had a question.
If we show some important links on the sidebar, like recent posts or important posts, will it be a duplication issue?
Request you to clarify.
Hi, Suresh, thanks for the comment.
You can of course show your related posts or featured posts on the sidebar. That will not be counted as duplicate content. Just make sure that you don't link to any comment link or the monthly archive page link. If you want, you can always make these links NoFollow.
Lenin
Big thanks to all of you!
Excellent article! Duplicate content issues are very important to avoid in terms of on-page SEO.
Thanks, Barry. I had recently commented on your blog. You have a great resource as well.
Hi Lenin, thanks for the tip! Just did it on my blog. Time to remove all that duplicate content from search engines! =)
Another useful tip: find the code (b:if cond='data:blog.pageType == "archive"') at the top of your template and add a meta robots noindex right below it. That will prevent search engines from indexing your archives. Hope it helps! ;)
Hi presidente, thanks for the comment. Your tip definitely looks workable. Thanks for it.
The request for deleting URLs from Google using Webmaster Tools has failed.
It said there is a third-party owner (blogspot) and the URLs can be removed only by it!
What to do?
I have about 430 duplicate URLs because I have comments.
Blogger URLs can't be removed, as you can't edit the robots.txt file or the robots meta tag. You can remove the comments, right?
I really enjoyed reading your blog because it has the content I expected here. Thanks
This post really helped me a lot,
I was worried about the tons of duplicate content on my blogs,
But now I can relax and my peace of mind is back.
thanks
Thank you for this nice post, very useful, but now I have a problem.
I have a duplicate html?commentPage=2 URL. Can you help me please?