URL canonicalization is an SEO practice for telling search engines which URL should represent a page. Optimizing your website helps search engines index and rank your pages, and a poorly optimized website can lose ranking as a result. One of the issues many sites face today is duplicate content.

What is duplicate content?

Duplicate content is content that appears, unchanged, at more than one URL. Although Google does not treat duplicate content as a penalty in itself, it can still affect the way search engines crawl, index, and rank your website.

There are a few issues that can be present when duplicate content exists. First, search engines won’t know which versions to include in or exclude from their indices. Second, link metrics such as domain authority, link equity, anchor text, and trust can be spread across the duplicates instead of accruing to a single page. Lastly, search engines won’t know which version(s) to rank for query results. As a result, the website can suffer ranking and traffic losses.

Let’s say your website’s URL is www.fuzzybearslippers.com. When you type this into the browser, you are taken to the site’s homepage. This isn’t the only URL that leads there; other variations include:

fuzzybearslippers.com

http://www.fuzzybearslippers.com

https://www.fuzzybearslippers.com

https://www.fuzzybearslippers.com/index.html

All the URLs above can be typed in and will send you to the site’s homepage. We know that, despite the different URL variations, it is all the same site. However, search engines may look at these URLs and consider them to be separate pages, which results in the pages being treated as duplicate content.

How to fix duplicate content

One effective solution for duplicate content is URL canonicalization. URL canonicalization applies to web content that is reachable at more than one URL, and it is the process of picking the best URL for the webpage and signaling that choice to search engines.
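As a rough illustration of "picking the best URL", the sketch below maps the homepage variants listed earlier to a single form. It assumes the preferred version is the HTTPS, www-prefixed URL and that index.html is the default document; those preferences, the constant names, and the canonicalize function are hypothetical choices for this example, not a standard.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed preferences for this sketch; a real site would choose its own.
PREFERRED_SCHEME = "https"
PREFERRED_HOST = "www.fuzzybearslippers.com"
INDEX_FILES = ("index.html", "index.htm")  # hypothetical default documents


def canonicalize(url):
    """Map a URL variant of the example site to one canonical form."""
    # A bare "fuzzybearslippers.com" has no scheme, so urlsplit would read
    # it as a path; prefixing "//" makes the host parse as a network location.
    if "://" not in url:
        url = "//" + url
    parts = urlsplit(url)

    path = parts.path or "/"
    # Treat ".../index.html" as the directory itself.
    for name in INDEX_FILES:
        if path.endswith("/" + name):
            path = path[: -len(name)]
            break

    # Scheme and host are forced to the preferred values. The query string
    # is kept; the fragment is dropped because it is never sent to a server.
    return urlunsplit((PREFERRED_SCHEME, PREFERRED_HOST, path, parts.query, ""))


variants = [
    "fuzzybearslippers.com",
    "http://www.fuzzybearslippers.com",
    "https://www.fuzzybearslippers.com",
    "https://www.fuzzybearslippers.com/index.html",
]
canonical_forms = {canonicalize(u) for u in variants}
```

All four variants collapse to the same string here, which is the URL you would then declare with one of the methods described in the rest of this article. The function only makes sense for URLs that belong to the example site, since it overwrites the host unconditionally.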

When and how to use URL canonicalization

Canonicalization should be used when:

  • Two or more URLs display the same content
  • A website is reachable at both an HTTP and an HTTPS URL
  • A mobile site displays the same content on a different URL or subdomain
  • Content is syndicated to other sites

There are different ways to implement canonicalization on your website:

1. Rel=canonical <link> tag

This method places a rel=canonical <link> tag in the HTML head of each duplicate page, pointing to the URL that you want to be the authoritative source for a piece of content. The authoritative page can carry the same tag, pointing to its own URL. This method can map any number of duplicate pages. However, keep in mind that it may add to the size of the page, it only works for HTML pages, and the mapping can be difficult to maintain on larger sites or on sites where URLs change often.

<head>
    <link rel="canonical" href="https://www.fuzzybearslippers.com">
</head>

2. Rel=canonical HTTP header

Another way to declare a canonical URL is to send a rel=canonical Link header in the HTTP response for the page. Unlike the <link> tag, this method works for non-HTML files such as PDFs, and it does not increase the page size. It is also helpful when a website serves the same file from multiple URLs.
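To make the header format concrete, here is a minimal sketch that builds the header as a name and value pair. The helper name canonical_link_header and the PDF path are hypothetical; how the header is actually attached to a response depends on your web server or framework.

```python
def canonical_link_header(canonical_url):
    """Return the (name, value) pair for a rel=canonical HTTP header."""
    # Link header syntax: the target URL goes inside angle brackets,
    # followed by the relation type as a quoted parameter.
    return ("Link", '<{}>; rel="canonical"'.format(canonical_url))


# Hypothetical PDF that is served from several URLs on the example site.
name, value = canonical_link_header(
    "https://www.fuzzybearslippers.com/downloads/size-guide.pdf"
)
header_line = "{}: {}".format(name, value)
```

The resulting header_line is the text a client would see in the response headers for every URL that serves that file.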

3. Sitemap

Sitemaps allow you to specify your canonical pages: list only the canonical URLs in the sitemap. This makes it easier to map and maintain canonical URLs on larger sites or sites where the URLs change often. However, Googlebot must still determine the associated duplicates for any canonicals that you declare in the sitemap, and a sitemap is a less powerful signal to Googlebot than the rel=canonical mapping technique.
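A sitemap that lists only canonical URLs could be generated along these lines. This is a sketch using Python's standard library; the build_sitemap helper and the /slippers/ URL are hypothetical, and a real sitemap would usually be written to a file such as sitemap.xml and may include further optional elements.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(canonical_urls):
    """Return sitemap XML (as a string) listing only canonical URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in canonical_urls:
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Only the preferred version of each page goes in; duplicates are left out.
sitemap_xml = build_sitemap([
    "https://www.fuzzybearslippers.com/",
    "https://www.fuzzybearslippers.com/slippers/",  # hypothetical page
])
```

Because the list is built from one place, updating a canonical URL means changing a single entry rather than editing the head of every duplicate page.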

4. 301 redirects

A 301 redirect is a permanent redirect that forwards one URL to another. It sends visitors and search engines to a different, preferred URL than the one requested. Keep in mind that this method should only be used when deprecating a duplicate page.
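Redirects are normally configured in the web server, but the behavior can be sketched as a small WSGI application. The example below assumes the preferred version is the HTTPS, www-prefixed host; the name canonical_redirect_app and the placeholder page body are hypothetical. Behind a proxy that terminates HTTPS, the scheme seen by the application may differ from the one the visitor used, so treat this as an illustration rather than a drop-in configuration.

```python
# Assumed preferred scheme and host for this sketch.
CANONICAL_SCHEME = "https"
CANONICAL_HOST = "www.fuzzybearslippers.com"


def canonical_redirect_app(environ, start_response):
    """WSGI app: 301-redirect requests that arrive on a non-preferred URL."""
    scheme = environ.get("wsgi.url_scheme", "http")
    host = environ.get("HTTP_HOST", "")
    path = environ.get("PATH_INFO", "") or "/"
    query = environ.get("QUERY_STRING", "")

    if scheme != CANONICAL_SCHEME or host != CANONICAL_HOST:
        # Keep the path and query so each duplicate URL lands on its
        # matching preferred URL rather than always on the homepage.
        location = "{}://{}{}".format(CANONICAL_SCHEME, CANONICAL_HOST, path)
        if query:
            location += "?" + query
        start_response(
            "301 Moved Permanently",
            [("Location", location), ("Content-Length", "0")],
        )
        return [b""]

    # Placeholder response for requests that already use the preferred URL.
    body = b"Fuzzy Bear Slippers\n"
    start_response(
        "200 OK",
        [("Content-Type", "text/plain; charset=utf-8"),
         ("Content-Length", str(len(body)))],
    )
    return [body]
```

For a quick local experiment, the app can be passed to make_server from Python's wsgiref.simple_server module. A request for http://fuzzybearslippers.com/slippers would then receive a 301 response whose Location header is https://www.fuzzybearslippers.com/slippers.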