Thursday, October 16, 2014

Dealing With Onsite Duplicate Content Issues

The problem of duplicate content arises when more than one version of a page is indexed by a search engine. Duplication can be onsite or offsite: onsite duplication is when the same content appears on multiple pages within a website, and offsite duplication is when the content on your website is similar to content on another site.

Duplicate content within the same site makes it difficult for a search engine to decide which page to rank. 

Here are some of the most common onsite duplicate content issues and how to fix them:

Duplicate Content Issues

  • Crawl rate can decrease, because Googlebot is busy crawling unnecessary similar pages
  • When the wrong page ranks, the result is a poor user experience
  • New websites may face delays in rankings
  • Search engines don't know which page to index
  • Search engines fail to determine which page to rank for a search query

The Cause of Duplicate Content Issues

URL parameters such as click-tracking and certain analytics codes can cause duplicate content issues, because the same page becomes reachable at several different URLs. Google offers advice here for URLs containing specific parameters.
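One way to keep tracking parameters from multiplying URLs is to strip them wherever your own code builds or records links. The following is a minimal Python sketch, not a definitive implementation: the function name and the contents of TRACKING_PARAMS are assumptions, so adjust the list to the parameters your site actually uses.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumed list of parameters that never change what the page shows.
TRACKING_PARAMS = {
    "utm_source", "utm_medium", "utm_campaign",
    "utm_term", "utm_content", "gclid",
}

def strip_tracking_params(url: str) -> str:
    """Return the URL with known tracking parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )
```

Parameters that do change the content, such as a color or size filter, are left untouched, so only the tracking variants collapse to one URL.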

Printer-friendly versions of pages can also cause duplicate content issues when both the regular and the printable version get indexed.

Identical product descriptions for similar products, either within your site or across multiple sites selling the same products, are a problem mostly faced by e-commerce sites that use generic product descriptions, i.e. the manufacturer-supplied copy. Since the descriptions come from the same source, they remain 100 percent identical.
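To find out how many products on your own site share the same copy, you can fingerprint each description and group the matches. This is a rough Python sketch; the product dictionary, the SKU keys, and both function names are hypothetical, and it only catches descriptions that are identical after whitespace and case are normalized.

```python
import hashlib
import re
from collections import defaultdict

def fingerprint(text: str) -> str:
    """Hash a description after collapsing whitespace and lowercasing."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def find_identical_descriptions(products: dict) -> list:
    """Return groups of SKUs whose descriptions are identical."""
    groups = defaultdict(list)
    for sku, description in products.items():
        groups[fingerprint(description)].append(sku)
    return [sorted(skus) for skus in groups.values() if len(skus) > 1]
```

Each returned group is a candidate for rewriting with unique copy.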

Another factor that causes duplicate content issues is the session ID. The problem arises when each user visiting a website is assigned a different session ID and that ID is stored in the URL, so every visit creates a new URL for the same content.
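If you have a list of crawled URLs, you can check whether session IDs are creating duplicates by grouping URLs that differ only in the session parameter. A small Python sketch follows; the parameter names in SESSION_PARAMS and the function name are assumptions, and it only handles session IDs passed in the query string.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumed names of query parameters that carry a session ID.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid", "jsessionid"}

def group_session_duplicates(urls):
    """Map each session-free URL to the crawled URLs that collapse onto it."""
    groups = defaultdict(list)
    for url in urls:
        parts = urlsplit(url)
        kept = [
            (key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key.lower() not in SESSION_PARAMS
        ]
        key_url = urlunsplit(
            (parts.scheme, parts.netloc, parts.path, urlencode(kept), "")
        )
        groups[key_url].append(url)
    # Keep only the pages that appear under more than one URL.
    return {key: found for key, found in groups.items() if len(found) > 1}
```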
 
Using separate URLs or domains for the mobile version of a website, such as the "m." subdomain approach, can also cause problems.

Duplicate content can also arise when both www and non-www versions of a page are available and the same content is served on both. 
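Whichever hostname you prefer, the fix starts with rewriting every URL to that one form. Here is a minimal Python sketch, assuming you want a single helper (the name to_preferred_host and the prefer_www flag are illustrative) that you could use when generating links or redirect targets. It ignores any username or password in the URL.

```python
from urllib.parse import urlsplit, urlunsplit

def to_preferred_host(url: str, prefer_www: bool = True) -> str:
    """Rewrite a URL so it uses either the www or the non-www hostname."""
    parts = urlsplit(url)
    host = parts.hostname or ""          # hostname is lowercased by urlsplit
    bare = host[4:] if host.startswith("www.") else host
    new_host = "www." + bare if prefer_www else bare
    if parts.port:
        new_host += f":{parts.port}"
    return urlunsplit(
        (parts.scheme, new_host, parts.path, parts.query, parts.fragment)
    )
```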

Other causes of duplicate content can include scraping and content syndication; paginating comments; similar content on a post page, home page, and archives page; or a site architecture in which there are multiple paths to the same page. 

Matt Cutts offers some great advice as to what e-commerce sites can do to prevent the problem of duplicate content here.

Solving the Problem of Duplicate Content

Redirecting Duplicate Content: Set up a 301 redirect from the page with copied content to the one with the original content. Make sure you redirect all the old duplicate content URLs to the proper canonical URLs. 
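How you set up a 301 redirect depends on your server. As an illustration only, here is a small Python WSGI sketch that answers requests for duplicate paths with a permanent redirect to the canonical path. The paths in REDIRECTS are hypothetical placeholders.

```python
# Hypothetical mapping of duplicate paths to their canonical paths.
REDIRECTS = {
    "/products/blue-widget-print": "/products/blue-widget",
    "/index.html": "/",
}

def app(environ, start_response):
    """WSGI app: 301-redirect known duplicates, serve everything else."""
    path = environ.get("PATH_INFO", "/")
    target = REDIRECTS.get(path)
    if target is not None:
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"canonical page\n"]
```

To try it locally, you can serve the app with the standard library: wsgiref.simple_server.make_server("", 8000, app).serve_forever().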

Use a "rel=canonical" Tag: A "rel=canonical" tag tells search engines which version of the page you want shown in the search results. The canonical tag goes in the <head> section of a Web page.
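To audit pages, it helps to read back which canonical URL each one declares. The Python sketch below shows what the tag looks like in a sample page and extracts its href with the standard library parser. The sample markup, the example.com URL, and the names CanonicalFinder and find_canonical are all illustrative.

```python
from html.parser import HTMLParser

# Sample page with a canonical tag in its <head> (illustrative URL).
PAGE = """
<html>
  <head>
    <title>Blue Widget</title>
    <link rel="canonical" href="https://www.example.com/products/blue-widget">
  </head>
  <body>...</body>
</html>
"""

class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")

def find_canonical(html: str):
    """Return the declared canonical URL, or None if the page has none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical
```

Running find_canonical over every duplicate variant of a page should return the same URL each time; a variant that returns None or a different URL needs attention.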

Use Meta Tags: Use a robots meta tag with a "noindex" value to tell search engines which pages you do not want indexed.
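The tag itself is a single line in the page's head section. If your templates are generated from code, a small helper like the Python sketch below can produce it. The function name robots_meta and its flags are assumptions for illustration.

```python
def robots_meta(index: bool = True, follow: bool = True) -> str:
    """Build a robots meta tag from two yes/no choices."""
    directives = [
        "index" if index else "noindex",
        "follow" if follow else "nofollow",
    ]
    content = ", ".join(directives)
    return f'<meta name="robots" content="{content}">'

# Example: keep a printer-friendly page out of the index
# while still letting crawlers follow its links.
PRINT_PAGE_TAG = robots_meta(index=False)
```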

Syndicate Carefully: If you syndicate your content on other sites, be careful. Make sure each site to which your content is syndicated links back to your site. You can also ask them to use "nofollow."

If you have multiple pages that are similar, expand the pages to contain unique content or consolidate them into a single page. 

The Same URL for Mobile Sites: If the mobile version of your site causes duplicate content issues, switching to a responsive design, or otherwise serving mobile and desktop content from the same URL, will solve the problem.

Check Guest Posts for Duplication: Before you accept guest posts, check them for duplicated content. Plagiarized content can bring serious penalties, even for reputable websites.
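A dedicated plagiarism checker is the practical tool here, but the underlying idea can be sketched in a few lines of Python: split both texts into overlapping word sequences ("shingles") and measure how many they share. The function names and the shingle size of three words are assumptions, and this compares only two texts you already have; it does not search the Web.

```python
import re

def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word sequences of the given size."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def jaccard_similarity(text_a: str, text_b: str, size: int = 3) -> float:
    """Share of shingles the two texts have in common, from 0.0 to 1.0."""
    set_a, set_b = shingles(text_a, size), shingles(text_b, size)
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)
```

A score near 1.0 means the guest post is almost word-for-word the same as the text it was compared against; where to set the cutoff is a judgment call.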

Tell Google How to Index Your Site: Google allows you to decide which page should be crawled and which should not. You can also inform Google how you would like it to index your pages. 
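One common way to tell crawlers which pages to skip is a robots.txt file. Before publishing rules, you can check them with Python's standard robots.txt parser, as in this sketch. The rules in ROBOTS_TXT and the /print/ path are hypothetical. Note that robots.txt controls crawling, not indexing.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep crawlers out of printer-friendly pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
"""

def is_crawlable(url: str, user_agent: str = "Googlebot") -> bool:
    """Check a URL against the rules above for the given crawler."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)
```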

Be Consistent With Your Internal Linking Strategy: Pick one URL format for each page (for example, always with or always without www) and use it for every internal link to avoid confusion.

