17 Apr, 2014
What does Canonicalization mean?

In computer science

Canonicalization is a process for converting data that has more than one possible representation into a “standard”, “normal”, or canonical form. This can be done to compare different representations for equivalence, to count the number of distinct data structures, to improve the efficiency of various algorithms by eliminating repeated calculations, or to make it possible to impose a meaningful sorting order.
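To make the idea concrete, here is a minimal Python sketch that reduces several spellings of a URL to one form so they can be compared and counted. The function name `canonicalize` and its rules (prefer the "www." host, use "/" for an empty path, drop the fragment) are assumptions chosen for illustration, not a standard; a real site would pick its own rules.

```python
from urllib.parse import urlsplit, urlunsplit


def canonicalize(url: str) -> str:
    """Return one illustrative canonical form of an http(s) URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Assumption: the "www." variant is the preferred host for this site.
    if not host.startswith("www."):
        host = "www." + host
    # Keep an explicit port if one was given.
    netloc = f"{host}:{parts.port}" if parts.port else host
    # Assumption: an empty path and "/" mean the same page.
    path = parts.path or "/"
    # The fragment is dropped because it is never sent to the server.
    return urlunsplit((scheme, netloc, path, parts.query, ""))


urls = [
    "http://example.com",
    "http://www.example.com/",
    "HTTP://WWW.EXAMPLE.COM/#top",
]
# Comparing for equivalence and counting distinct pages both become trivial
# once every URL is in the same form.
distinct = {canonicalize(u) for u in urls}
print(len(distinct))
```

With these rules all three spellings collapse into a single entry, which is exactly the "compare for equivalence" and "count distinct items" use described above.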


In the search engine optimization world, canonicalization is the process of picking the best URL among the available choices. Many of you may not know that the same page can often be reached through several different URLs, for example www.example.com and example.com/.







These are totally different URLs. Because they are different URLs, the server could even return different content for each of them, which would lead to incoherence within the site's architecture.

So what are its repercussions?

Link Authority Dilution

Without canonicalized URLs, crawlers have a difficult time deciding which URLs to index and which to leave out. Every time a crawler confronts that choice, it might pick a different URL. As a result, your link authority gets diluted across the different URLs.

Traffic Dilution

If you haven’t canonicalized your multiple URLs yet, your traffic gets split between the www and non-www pages. Half of your traffic might land on www.example.com and half on example.com/.

PageRank Dilution

If you haven’t canonicalized your multiple URLs yet, your traffic gets diluted and PageRank follows. It’s even possible that www.example.com has a PageRank of 6 while example.com/ has a PageRank of 5.

Duplicate title and description tags

Do you often see duplicate title and description errors in Google Webmaster Tools and can’t fathom why? It might be a canonicalization issue. Since the crawler treats every non-canonicalized URL as a different page with the same content, it starts reporting duplication errors. A duplication error can arise in a number of ways, and it can prove harmful to your website’s performance in the search engines.
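If you want to spot this yourself, a rough sketch like the one below can help. It assumes you already have a list of (URL, title, description) tuples from your own crawl or export; the `pages` data and the function name `find_duplicates` are made up for illustration.

```python
from collections import defaultdict


def find_duplicates(pages):
    """Group URLs that share the same (title, description) pair."""
    groups = defaultdict(list)
    for url, title, description in pages:
        groups[(title.strip(), description.strip())].append(url)
    # Only groups with more than one URL are potential duplicates.
    return {key: urls for key, urls in groups.items() if len(urls) > 1}


# Hypothetical crawl data, not taken from a real site.
pages = [
    ("http://www.example.com/", "Home", "Welcome"),
    ("http://example.com/", "Home", "Welcome"),
    ("http://www.example.com/about", "About", "Who we are"),
]

for (title, _description), urls in find_duplicates(pages).items():
    print(title, urls)
```

Any group that contains both a www and a non-www URL is a strong hint that the duplication comes from missing canonicalization rather than from genuinely copied content.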


Since some URLs are returned with www and some without it, the site lacks a uniform URL structure.

Yes! Yes! It’s bad. How to fix it?

A canonicalization issue can be solved by adding a <link> element with the rel="canonical" attribute to the <head> section of the non-canonical version of each HTML page.

To specify a canonical link to the page http://www.example.com/product.php?item=swedish-fish, create a <link> element as follows:

<link rel="canonical" href="http://www.example.com/product.php?item=swedish-fish"/>

Copy this link into the head section of all non-canonical versions of the page, such as http://www.example.com/product.php?item=swedish-fish&sort=price.
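If your pages are generated by code, the element can be built automatically. The sketch below is one possible approach in Python: it assumes that only the `item` parameter identifies the page and that everything else (such as `sort`) is presentation only. The names `CONTENT_PARAMS`, `canonical_url` and `canonical_link_tag` are hypothetical helpers, not part of any library.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: only these query parameters change what the page is about.
CONTENT_PARAMS = {"item"}


def canonical_url(url, content_params=CONTENT_PARAMS):
    """Drop query parameters that do not identify the page."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key in content_params
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url):
    """Build the <link> element to place in the page's <head>."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}"/>'


print(canonical_link_tag(
    "http://www.example.com/product.php?item=swedish-fish&sort=price"
))
```

Using an allow-list of parameters, rather than a list of parameters to strip, means a new tracking or sorting parameter does not silently create a new "canonical" URL.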

If you publish content on both http://www.example.com/product.php?item=swedish-fish and https://www.example.com/product.php?item=swedish-fish, you can specify the canonical version of the page. Create the <link> element:

<link rel="canonical" href="http://www.example.com/product.php?item=swedish-fish"/>

Add this link to the <head> section of https://www.example.com/product.php?item=swedish-fish.

Indicate the canonical version of a URL by responding with the Link rel="canonical" HTTP header

Adding rel="canonical" to the head section of a page is useful for HTML content, but it can’t be used for PDFs and other file types indexed by Google Web Search. In these cases you can indicate a canonical URL by responding with the Link rel="canonical" HTTP header, like this (note that to use this option, you’ll need to be able to configure your server):

Link: <http://www.example.com/downloads/white-paper.pdf>; rel="canonical"
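How you set the header depends on your server. As one illustration, here is a minimal Python WSGI sketch that attaches the header to a PDF response. The `app` handler, the `canonical_link_header` helper and the placeholder body are assumptions made for this example; most sites would configure the header in the web server itself.

```python
def canonical_link_header(canonical_url):
    """Return the (name, value) pair for a rel="canonical" Link header."""
    return ("Link", f'<{canonical_url}>; rel="canonical"')


def app(environ, start_response):
    """Hypothetical WSGI handler that serves a PDF with a canonical header."""
    # Placeholder bytes standing in for the real file contents.
    body = b"%PDF-1.4 placeholder"
    headers = [
        ("Content-Type", "application/pdf"),
        ("Content-Length", str(len(body))),
        canonical_link_header("http://www.example.com/downloads/white-paper.pdf"),
    ]
    start_response("200 OK", headers)
    return [body]
```

To try it locally, the standard library's `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` will serve the handler, and the response headers can then be inspected with any HTTP client.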

