Welcome back to the SEO Essentials Series of the Content Champion Podcast. In this episode, I'm discussing canonicalization with highly experienced SEO, Tom Peary. Yes we know, don't let your eyes glaze over quite yet - as getting this right could save your organic traffic.
Tom is the Head of SEO here at Content Champion, and also specialises in optimizing Google Shopping feeds. Between the two of us, we have over 30 years' experience in the content marketing and SEO industries.
Tom joins me today to explain what canonicalization is and why it’s an essential part of your SEO strategy. He also tells us how using canonical tags can help boost your search engine ranking, and explains how to fix and prevent canonical errors.
Listen To The Canonicalization Show
Show Notes
- What are canonical tags and canonicalized URLs?
- How using canonicalization tags can help improve your website ranking
- How to identify canonical problems on your website
- How to prevent canonicalization problems
- Using htaccess commands and 301 redirects to resolve canonical website issues
- How long does it take to reap the benefits of correcting canonical errors?
Resources Mentioned In This Episode
Google Search Console - Webmaster Tools
Read the transcript
Loz James: I'm Loz James, and this is the Content Champion podcast, the content, marketing, and SEO show where you can learn actionable techniques from real world examples.
Hi guys, welcome to the show. This is our first SEO Essentials programme. We're going to be talking ... Me and Tom Peary, our head of SEO here, are going to be talking about those foundational elements of SEO on-page, technical, everything related to that, that you need to get in order before you do your in house SEO or content marketing, or start to use the services of a company like Content Champion. So, this time, we're going to be talking about canonicalization, and, look, me and Tom have got about 30 years SEO and content marketing between us, so you're in safe hands. Let's introduce Tom. Hi, Tom.
Tom Peary: Hi, Loz.
Loz James: So, what is canonicalization?
Tom Peary: OK, so a canonical tag is really just a way of telling a search engine about a page on your website. If you've got a piece of content on a page that is then duplicated elsewhere, the rel=canonical tag just, literally, shows Google, this is the main page, this is the master URL, we're not intentionally trying to duplicate content across the site. That way, it just sorts out any problems with duplicate content.
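To make the tag Tom describes concrete, here is a minimal Python sketch that reads a page's HTML and pulls out whatever rel=canonical link it declares. The sample page, the example.com URL and the names `CanonicalFinder` and `find_canonicals` are illustrative assumptions, not anything used on the show.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        # rel can hold several space-separated tokens, so split before comparing.
        if "canonical" in attr_map.get("rel", "").lower().split():
            self.canonicals.append(attr_map.get("href", ""))


def find_canonicals(html):
    """Return the list of canonical URLs declared in the given HTML text."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


# Hypothetical duplicate page that points back at its master URL.
page = """
<html><head>
<title>Blue Widgets</title>
<link rel="canonical" href="https://example.com/blue-widgets/">
</head><body>...</body></html>
"""
print(find_canonicals(page))
```

An empty list means the page declares no master URL at all, and more than one entry is usually worth a closer look.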
Loz James: Okay. Can you give us some examples of what that looks like on a small business site?
Tom Peary: Yeah, sure. I think one of the easiest ways is, if you take, for example, the www. extension on your domain. You don't have to have that as your domain set up. You'll see many sites where it's just http then the domain, or https. You can actually use anything. I've seen it before where it's ww2, four ws. I've seen that in examples where people have penalties on the ww and have changed it, but what it really does is, if you don't have the ww or the non-ww as a canonicalized URL, you could have both of them indexed, and then, effectively, your home page, for example, could have two pages and then be a hundred percent duplicate content.
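As a rough illustration of the ww versus non-ww point, the Python sketch below maps every scheme and www variant of a URL onto one master form. The function names, the example.com domain and the choice of https without www as the master are assumptions made for the example; a real site would pick its own master URL.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_www(host):
    """Drop a leading "www." so both host variants compare equal."""
    return host[4:] if host.startswith("www.") else host


def canonical_url(url, master_host="example.com", master_scheme="https"):
    """Map any www/non-www, http/https variant of a URL onto one master form."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if strip_www(host) != strip_www(master_host):
        return url  # a different domain or subdomain: leave it untouched
    path = parts.path or "/"
    return urlunsplit((master_scheme, master_host, path, parts.query, ""))


# Four addresses that could each be indexed as a separate home page.
variants = [
    "http://example.com",
    "http://www.example.com/",
    "https://www.example.com",
    "https://example.com/",
]
print({canonical_url(v) for v in variants})
```

All four variants collapse to a single address, which is the outcome a canonicalized home page is meant to give a search engine.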
Loz James: Okay. And what does that mean, because there isn't really a duplicate content penalty, as we know, but it just means that your site won't rank as well, because there are two versions of it competing in search.
Tom Peary: Yeah, it doesn't look great for a search engine, in particular, because it sees that you really haven't done your work to make the site good for the user. Automatically with content issues, you'll generally see it when there's more than 30% ... This is the kind of ratio we look at, is more than 30% of duplicate content on any page, you'll see that that page doesn't rank as well as it did if it's recently been duplicated. So, it's really important that, before you do any SEO, is to look at the foundation of your site. Is there anything wrong? Is there anything immediately preventing my site from ranking well in organic search. So, canonicalized URLs, I think, should be the first thing you look at, and it's a first thing we look at on any audit of any site.
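One way to put a number on how much of a page is duplicated is to compare overlapping runs of words between two pages. The Python sketch below does that. It is an illustrative assumption about how such a ratio could be estimated, not how Google or any tool mentioned in this episode measures it, and the five-word window is an arbitrary choice.

```python
import re


def shingles(text, size=5):
    """Return the set of overlapping word sequences of length `size`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def duplicate_ratio(page_text, other_text, size=5):
    """Share of page_text's shingles that also occur in other_text (0.0 to 1.0)."""
    page = shingles(page_text, size)
    if not page:
        return 0.0  # too short to compare
    return len(page & shingles(other_text, size)) / len(page)


# Hypothetical category copy repeated on page two of a paginated listing.
page_one = "Our blue widgets are hand finished and ship worldwide within two working days"
page_two = "Our blue widgets are hand finished and ship worldwide from our new warehouse"

ratio = duplicate_ratio(page_two, page_one)
print(f"{ratio:.0%} of page two also appears on page one")
if ratio > 0.30:  # the rule-of-thumb ratio Tom mentions
    print("Worth checking for a canonical problem")
```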
Loz James: With the specific example of some of the client's sites we've worked on, and with their own eCommerce store, when it gets to things like pagination on eCommerce stores, that can be a problem can't it? If you've got canonicalized content on different category and product pages.
Tom Peary: Yeah, definitely, we see this all the time. I mean, when we set up our eCommerce store, it was one thing we made sure that wasn't going to be a problem from the get go, but we do see it a lot on eCommerce sites in particular. So, paginated pages, where you've got page one, two, three and four, and, say, on that main first page for a category page, you've got 600 words on content, and then on each page afterwards, two, three, up to however many pages are, that content's duplicated, it really cannibalises itself in terms of the value of that content that you've written, or had a copywriter write, so it's really important, from day one, that you look at this, because this could be holding your site back from day one.
Loz James: Okay. And let's be clear on this as well, if you start building links in house, or you employ an agency to build links, and your site's got a canonical duplicate content problem like this, it doesn't matter, you could build links all day, if you don't sort this out, your site's not going to rank very well, is it?
Tom Peary: Yeah, you're just really wasting your time. I mean, if you don't sort these foundation issues out, you're making it much harder, you're going to have to do a lot more link building, and it's really not ever going to have the success what it should do by all your off-page SEO, so getting these things right early on is going to make a big difference to how your site ranks in the very near future.
Loz James: So, we're going to go onto what we can do about it in a minute, and using canonical tags, and everything like that, but we know it's bad, how can we find it? What tools and methods can we use to find out if we've got a canonical problem?
Tom Peary: There's a load of ways that you can find it. If you don't have access to some of the more specialised SEO tools that we use, you can use things such as Google Search Console, which used to be Webmaster Tools. You can go in there and it will give you HTML suggestions on duplicate metadata. If you see a page ... This is always a good one, I find, that'll flag things up immediately if you don't have access to these more sophisticated, powerful tools that we use: if you see page titles that are duplicated three or four times, or even twice, have a look into those and see what they are. Most of the time, those are paginated pages, you can see that. Another great way is just in Google.
Open Google now, as a test, and put site colon, your domain, with no spaces (site:yourdomain.com). Try it without the ww first, and see how many page results it comes up with. Now, it may be slightly different if you've got sub domains on there, so if you've got blog.yourdomain.com, or if you've got extra files on there that maybe aren't in the same directory, but, generally, if you don't see the same number of pages in the indexation, so 100 pages without ww, and then you've got 450 with ww, you want to look at that.
Is it because the same pages, say your About page, is duplicated twice, and that's a really easy way to see quite quickly. Like I say, there are other tools out there. Things like Copyscape as well, it's a free tool. You can buy the licence, but you do get to use it for so many pages as a free licence, and check in there. And then use Siteliner. Siteliner will check on page content.
So, Copyscape, which we'll go into on another podcast, will talk about page content on other sites, duplicate content, but, for your own site, use Siteliner. It's free to use and it will tell you the amount of duplicate content and then it'll show you the sources, and that's really a great way of immediately seeing what's wrong.
And then just export all your sitemap, so save your sitemap, and then, in a spreadsheet form, just create another column that says duplicated, add an X, and go through it. That's a really easy way of quickly identifying all that duplicate content because of non-canonicalized URLs.
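The sitemap spreadsheet idea can be roughed out in a few lines of Python. This sketch assumes a standard sitemaps.org XML file and treats two URLs as the same page when they differ only by scheme, a leading www. or a trailing slash. That matching rule, the function names and the sample URLs are assumptions for illustration, and duplicate content at unrelated URLs would still need a manual pass.

```python
import csv
import io
import xml.etree.ElementTree as ET
from collections import defaultdict
from urllib.parse import urlsplit

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def page_key(url):
    """Collapse scheme, "www." and trailing-slash differences into one key."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return host, parts.path.rstrip("/") or "/", parts.query


def sitemap_to_csv(sitemap_xml):
    """Return CSV text with one row per sitemap URL and an X for duplicates."""
    root = ET.fromstring(sitemap_xml)
    urls = [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]
    groups = defaultdict(list)
    for url in urls:
        groups[page_key(url)].append(url)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["url", "duplicated"])
    for url in urls:
        writer.writerow([url, "X" if len(groups[page_key(url)]) > 1 else ""])
    return out.getvalue()


# Hypothetical sitemap in which the About page appears under two addresses.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/about/</loc></url>
  <url><loc>https://www.example.com/about</loc></url>
  <url><loc>https://example.com/contact/</loc></url>
</urlset>"""
print(sitemap_to_csv(sample))
```

The resulting text can be saved as a .csv file and opened in any spreadsheet, with the X column as the starting point for the manual review Tom describes.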
Loz James: Okay. Another great tool that I've got to make a shoutout for is Sitebulb. That's more advanced. It finds out a tonne of stuff about what's wrong with your technical SEO, but it's really good in this particular instance.
Tom Peary: If you've got a huge site, you may need a more advanced tool, because, if you've got thousands of pages, doing this manually might be a little bit more difficult.
Loz James: You're listening to the Content Champion podcast. Available at contentchampion.com and on iTunes. So, we know that canonicalization problems are bad. We know they cause duplicate content issues on the page. We also, now, know what sort of tools we need to use to collate a great big list of all these duplicate pages. Then what do we do about it? How do we stop this from happening? Obviously, it might be as simple as going through each individual page and making sure the content is unique, but what else can we do to stop this happening?
Tom Peary: Okay. It's a good question, because, depending on the site you use, the platform, if it's WordPress and say, for example, you write a really good blog post, it's got 2000 words on, and then you use the tag feature, so the post tags. In some older builds of WordPress, you'll find that it'll duplicate with those tags a post and it's duplicate.
I mean, the later versions of WordPress and certain plugins allow you to put canonical URL support in there, so it automatically adds the tag to show the originating source, but, really, what you want to be doing is speaking with your developer, if you have one, and saying to them, "Right, why has this duplicate content happened? We're not aware of it appearing on ..." The great example that I started with is the ww and the non-ww. Speak to your developer and say, "Look, we need this to redirect, so it's the same." You can't access non-ww and ww. Have a master URL domain, and then make sure they all feed into that.
That's the first thing I would do, is speak to your developer, if you have one, or if you know how to do it yourself, get that fixed, and then as Google crawls it, you'll be able to see the changes. You'll see peaks in your organic traffic. So, that's the first thing that I would do.
Loz James: And you can use htaccess commands and 301 redirects as well, so tell us about those.
Tom Peary: Htaccess is great, because it can do it in a mass form, very simply. Now, you do have to be careful with a htaccess code. I wouldn't recommend a beginner going and editing the htaccess code, because a htaccess file is how your ... Effectively, your hosting server, how that connects to the world wide web, so if you're messing with that too much and you don't know what you're doing, you could cause problems, so speak to a developer.
But the htaccess file is fantastic for making huge changes, so if you've got hundreds or thousands of pages you need to affect, that is the best way to do it, in my opinion, from server level. The 301 redirect thing, if you've got the odd URL, where for some reason it's a duplicate. It could be an old page, it could just be a duplicate post somebody's left on, a 301 is perfect for that.
You don't really need the skillset to be able to go and change the htaccess file, so 301s work differently. And, especially with a lot of sites like WordPress, Drupal, they come with plugins that'll do this for you, so you can just write a 301 from the duplicate URL and point it to the new one.
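For anyone preparing a batch of 301s to hand to a developer, the Python sketch below turns a list of duplicate paths into `Redirect 301` lines, the redirect directive provided by Apache's mod_alias module. It assumes an Apache server with mod_alias enabled and .htaccess overrides allowed. The paths, the example.com domain and the `redirect_lines` helper are hypothetical, and, as Tom says, the output should be reviewed before it goes anywhere near a live htaccess file.

```python
def redirect_lines(redirects):
    """Render one Apache mod_alias `Redirect 301` line per duplicate path."""
    lines = []
    for old_path, target_url in sorted(redirects.items()):
        # Keep the sketch simple: reject anything that would need quoting.
        if not old_path.startswith("/") or " " in old_path or " " in target_url:
            raise ValueError(f"unsupported redirect: {old_path!r} -> {target_url!r}")
        lines.append(f"Redirect 301 {old_path} {target_url}")
    return lines


# Hypothetical duplicates mapped to the master URLs they should point at.
duplicates = {
    "/old-about": "https://example.com/about/",
    "/blog/widgets-copy": "https://example.com/blog/widgets/",
}
print("\n".join(redirect_lines(duplicates)))
```

One thing to keep in mind is that `Redirect` matches any request path that begins with the old path, so a short entry such as /blog would also catch everything underneath it.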
Loz James: Okay. And we touched on this earlier as well. Obviously, you can go into Webmaster Console and take ownership of different variations of your domain, and then specify which one you want to use, and it'll default to that, can't you?
Tom Peary: Yeah, you can, yeah. You can do that as well.
Loz James: Okay. So, we've been through what canonicalization is. We know how to find the errors, the problems. We know how to deal with them. You've mentioned, then, looking at pages, seeing a spike in organic traffic. How long does it take to see the benefits of doing this kind of correction work?
Tom Peary: It is pretty quick, yeah. It's not like when you've had a link penalty, or a manual action, perhaps, and you're waiting months, or even years in some cases. It's quick. If you start changing these problems, depending on how many you have and how many changes you have to make, as soon as Google starts indexing these and then updating its database, it will start to creep up. It's normally within a few weeks.
All the ones we've done where we've taken a site and it's got loads of duplicate content issues, as soon as we sort these out, you'll see a pretty rapid increase within a few weeks. There's no specific time set, Google doesn't give one, but we've noticed, generally, within a few weeks, there's pages started to reindex, and Google sees that you've fixed that problem, you'll see a jump.
Loz James: And just finally, this can lead to massive increases in traffic just dealing with canonicalization problems, can't it?
Tom Peary: Oh, yeah, huge. We've dealt with so many times, Loz, where we've had clients that don't know why their traffic has dropped from 20, 30 thousand visitors a day down to a thousand visitors or less and they think it's a link penalty, or something else, or their site's just dropped in rankings. There's generally a reason, and so many times we've seen where it is this duplicate content issue. We had, once, one where a site was completely de-indexed, because the whole site was duplicated.
Not just a few pages, we're talking, I think, it was 30 plus thousand pages at the time, and it was completely de-indexed. We got that fixed, and it came back, and the traffic shot up, and then they saw record sales, so it is really a big deal. This can be overlooked and just a glancing look, "Oh, maybe I should check into this." This should be the first thing that you're doing, because you can change this today and make a huge change in the future.
Loz James: Okay. Well, that's great. What I'm going to do is put some links below this podcast, on the Content Champion blog, contentchampion.com, so you can see some of the resources related to htaccess files and 301 redirects and all that type of thing. Next time, the sister subject of this is duplicate content, so in the next show, we're going to talk about duplicate content on your own website and across the web, and how we can find that and how we can deal with that, because the two are related, but for now, that's been canonicalization. Thanks very much for listening and we'll see you next time.
You've been listening to the Content Champion podcast. Actionable SEO and content marketing techniques based on real world examples. Until next time, thanks for being here.
Subscribe, Rate & Review the Content Champion Podcast
Thanks for tuning into the Content Champion Podcast - the content marketing and SEO show bringing you actionable techniques from real world experience. If you enjoyed today’s episode, head over to iTunes or Stitcher to leave us a rating and review. Subscribe to the Content Champion Podcast so you never miss an episode and don’t forget to share your favorite episodes on Facebook and Twitter.