How to Remove Duplicate Content From Your Blogger Blog Posts to Avoid a SERP Penalty

A small amount of duplicate content may not affect your blog’s search engine rankings. However, it can sometimes get out of control and hurt your rankings badly, even without your knowledge. Here are ways to curb it.

What Is Duplicate Content

Do you have a blog in which you post regularly? Do any two different URLs in that blog have the same content? If so, that is duplication. In the case of self-hosted blogs, various features like print preview pages, monthly archive pages, and category pages can cause duplicate content. In such cases, search engines normally rank one of the pages lower. However, in extreme cases, when your blog has many pages with the same content, the blog can be penalized.

As Google puts it:

In the rare cases in which Google perceives that duplicate content may be shown with intent to manipulate our rankings and deceive our users, we'll also make appropriate adjustments in the indexing and ranking of the sites involved. As a result, the ranking of the site may suffer, or the site might be removed entirely from the Google index, in which case it will no longer appear in search results.

This shows how serious duplicate content can be.

How It Comes Into Play

Duplicate content in Blogger blogs can be the result of having different URLs point to the same content. Recently, I found that this was happening with the Recent Comments widget in Blogger. If you enable it in the sidebar, the new comments will receive a special URL in this form: http://cutewriting.blogspot.com/2008/07/story-of-theme-redesign-and-blogger-w3c.html?showComment=1224840480000#c7659187557945906929

This ‘showComment’ link may get indexed by Google, creating a duplicate of the original blog post URL (which ends at "w3c.html"). This can be serious for blog posts with many comments, as each of these comments may be indexed by Google as a separate URL.
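To make the relationship between the two URLs concrete, here is a minimal Python sketch that reduces a ‘showComment’ link to the original post URL. The function name canonical_post_url is my own for illustration; it is not part of Blogger or any Google tool.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_post_url(url: str) -> str:
    """Drop the showComment parameter and the #fragment from a post URL."""
    parts = urlsplit(url)
    # Keep every query parameter except showComment.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "showComment"
    ]
    # An empty fragment argument removes the trailing #c... comment anchor.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


comment_url = (
    "http://cutewriting.blogspot.com/2008/07/"
    "story-of-theme-redesign-and-blogger-w3c.html"
    "?showComment=1224840480000#c7659187557945906929"
)
print(canonical_post_url(comment_url))
```

Every comment link on a post collapses to the same address this way, which is why search engines treat them as copies of one page.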

Another source of duplicate content is the monthly archive pages, which contain all posts made in a particular month. In Blogger, these archive pages are not disallowed with a Robots.txt directive, so they get indexed by search engines and cause duplicate content problems.

The category (label) pages on Blogger do not cause a duplicate content penalty, as they are already disallowed through Blogger’s Robots.txt file. You can verify this in Webmaster Tools.

How to Fight Duplicate Content

All you have to do to find duplicate content is this:

1. Go to Google and search for site: followed by your blog’s URL (with no space after the colon).
2. Now, see how many URLs are indexed by Google. If it is noticeably more than the number of posts you have published, there is very likely duplication.
3. Send a removal request at Google Webmaster Tools for all unnecessary URLs (like the showComment URL above). Don’t request to remove any normal post URL.
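If you copy the indexed URLs from the search results into a list, the check above can be sketched in Python. This is only an illustration under my own assumptions: the example.blogspot.com URLs, the published_posts count, and the function name split_indexed_urls are all hypothetical, and you would have to collect the URL list by hand.

```python
from urllib.parse import parse_qs, urlsplit


def split_indexed_urls(indexed_urls):
    """Return (normal_urls, removal_candidates) from a list of indexed URLs."""
    normal, candidates = [], []
    for url in indexed_urls:
        query = parse_qs(urlsplit(url).query)
        # A showComment parameter marks a comment link, not a normal post URL.
        if "showComment" in query:
            candidates.append(url)
        else:
            normal.append(url)
    return normal, candidates


# Hypothetical data: URLs copied by hand from a site: search.
indexed = [
    "http://example.blogspot.com/2008/07/first-post.html",
    "http://example.blogspot.com/2008/07/first-post.html?showComment=1224840480000",
    "http://example.blogspot.com/2008/08/second-post.html",
]
published_posts = 2  # hypothetical number of posts on the blog

normal, candidates = split_indexed_urls(indexed)
if len(indexed) > published_posts:
    print("More URLs indexed than posts published; candidates for removal:")
    for url in candidates:
        print(" ", url)
```

Only the URLs in the candidates list would go into a removal request; the normal post URLs stay untouched.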

Removal Request Screenshots

Before requesting removal, make sure you meet Google’s removal criteria. Here are screenshots of requesting a URL removal from Webmaster Tools.

1. The red link is the duplicate comment link
Google showing duplicate Blogger comment link
2. Starting a removal request
On clicking the New Removal Request button under Google Webmaster Tools->Tools->Remove URLs, you will see this window.
Starting a removal request at Google Webmaster Tools
3. Put in the URL and add it
Google Webmaster Tools URL removal request adding URLs
4. Once done, submit the request
Google Webmaster Tools URL removal request submitted

When using the Recent Comments widget, take this precaution.

  • Go to Layout->Edit HTML
  • Search for “Recent Comments” or whichever title you have given to the recent comments widget.
  • Look for expr:href='data:i.alternate.href'. Right after it, add the rel="nofollow" attribute, as in the image.
  • Once done, save the template. Now the recent comments widget will be automatically nofollowed.
Recent Comments widget nofollowing
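
The edit in the steps above can also be sketched as a small Python text replacement. Treat this as an illustration only: the sample widget line is a hypothetical stand-in (your template's actual Recent Comments markup may look different), and the function name add_nofollow is my own.

```python
import re

TARGET = "expr:href='data:i.alternate.href'"


def add_nofollow(template: str) -> str:
    """Append rel="nofollow" after each TARGET attribute that lacks it."""
    # The lookahead skips links that already have a rel attribute right after.
    pattern = re.escape(TARGET) + r"(?!\s+rel=)"
    return re.sub(pattern, lambda match: match.group(0) + ' rel="nofollow"', template)


# Hypothetical line from the Recent Comments widget section of a template.
widget_line = "<a expr:href='data:i.alternate.href'><data:i.title/></a>"
print(add_nofollow(widget_line))
```

The function changes every matching link in the text it is given, so pass it only the Recent Comments widget section and not the whole template. Running it twice does not add the attribute a second time.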

[A very important update to this post: Adding Nofollow to post page timestamp]

However, make sure these URLs are not linked to from anywhere on the web. Such links will cause them to get noticed and indexed by search engines.

We cannot edit the Blogger Robots.txt file, so the monthly archive pages may be automatically included in the index. To prevent this, make sure you don’t link to the monthly archive pages from anywhere on your blog. If you find links to these pages elsewhere, ask the site owner to remove them or add the nofollow attribute to them. If you are self-hosting your blog, disallow all duplicate pages from search bots through a Robots.txt directive.
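
For the self-hosted case, here is a rough Python sketch of how such a directive could be checked before you upload it. The /archive/ and /print/ paths and the example.com URLs are assumptions for illustration; use the paths your own blog software generates for its duplicate pages.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical Robots.txt for a self-hosted blog; the paths are assumptions.
ROBOTS_TXT = """\
User-agent: *
Disallow: /archive/
Disallow: /print/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url: str, agent: str = "Googlebot") -> bool:
    """Report whether the rules above let the given bot fetch the URL."""
    return parser.can_fetch(agent, url)


print(is_crawlable("http://example.com/2008/07/my-post.html"))  # normal post page
print(is_crawlable("http://example.com/archive/2008/07/"))      # monthly archive page
```

The standard library parser only tells you how a well-behaved bot would read the rules. It does not remove pages that are already in the index.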

Look at my sidebar, where each month’s archive is shown as a monthly post recap page (for better indexing): a post page with links to all of that month’s posts.

If you have a design like my WP Premium here, the monthly archives are placed within a JavaScript widget on the sidebar, which will not be found by search engines. Just make sure, however, that nobody links to these archive pages. If you find they are still indexed, try to request a removal at Webmaster Tools.

Copyright © Lenin Nair 2008

12 Opinions:

  1. It is a nice post and I made the comment links nofollow right now.
    Thank you for the tip.

    I had a question.

    If we show some important links on the sidebar, like recent posts or important posts, will it be a duplication issue?
    Request you to clarify.

  2. Hi, Suresh, thanks for the comment.

    You can of course show your related posts or featured posts on the sidebar. That will not be counted as duplicate content. Just make sure that you don't link to any comment link or the monthly archive page link. If you want, you can always make these links NoFollow.

    Lenin

  3. Excellent article! Duplicate content issues are very important to avoid in terms of on-page SEO.

  4. Thanks, Barry. I had recently commented on your blog. You got a great resource as well.

  5. Hi Lenin, thanks for the tip! Just did it on my blog. Time to remove all that duplicate content from search engines! =)

    Another useful tip: find the code (b:if cond='data:blog.pageType == "archive"') on the top of your template and add a meta robots noindex right below it. That will prevent search engines from indexing your archives. Hope it helps! ;)

  6. Hi presidente, thanks for the comment. Your tip definitely looks workable. Thanks for it.

  7. The request to delete URLs from Google using Webmaster Tools has failed.
    It said there is a third-party owner (blogspot) and the URLs can be removed only by it!

    What to do?
    I have about 430 duplicate URLs because I have comments.

  8. Blogger URLs can't be removed, as you can't edit the robots.txt file or the robots meta tag. You can remove comments, right?

  9. I really enjoyed reading your blog because you have the content I expected here. Thanks.

  10. This post really helped me a lot.
    I was worried about the tons of duplicate content on my blogs,
    but now I can relax and my peace of mind is back.
    Thanks.

  11. Thank you for this nice post, very useful, but now I have a problem.
    I have a duplicate html?commentPage=2 URL. Can you help me, please?

