A few months ago I started to take on more Search Engine Optimization (SEO) responsibilities at work. I'm a software engineer at an online English-Spanish dictionary called SpanishDict, and a large percentage of our traffic comes from users searching for translations on Google. Recognizing that, we've recently invested a lot of time in optimizing the page titles and descriptions that appear on Google, and in ensuring that the SpanishDict links that Spanish speakers see on Google are just as effective as those that English speakers see.
SpanishDict has many different types of pages, but for this post, I'll focus on our translate pages, as these are the SpanishDict pages that Google searchers see most often. A translate page contains a dictionary entry for a particular word.
An English speaker querying Google for "caminar" should see a link to http://www.spanishdict.com/translate/caminar with the title "Caminar | Spanish to English Translation - SpanishDict".
A Spanish speaker interested in translating "caminar" should instead see a link to http://www.spanishdict.com/traductor/caminar with the title "Caminar en inglés | Traductor de español a inglés - SpanishDict".
Why We Use Hreflang Tags on SpanishDict
Although these are two different pages on our site, the traductor page is identical to the translate page except for some pieces (like the page's meta title) that have been translated into Spanish. The same dictionary entry for "caminar" is shown on both pages.
Because the traductor and translate pages have duplicate content served in different languages, years ago we added hreflang tags to the <head> of these pages, as Google recommends. These tags tell Google which of our links are for Spanish speakers and which are for English speakers.
This worked well until sometime last year, when we started seeing a steady decline in the number of hreflang tags reported by Google in the International Targeting section of Search Console. Despite having millions of pages with hreflang tags, by the end of 2016, Google reported a count of only 5,500.
Troubled and Confused
This decline was both troubling and confusing. It was troubling because when a Spanish speaker searched Google for "caminar en inglés", they saw a link to http://www.spanishdict.com/translate/caminar with an English title, which hurt the click-through rate of our links. It was confusing because we hadn't made any change to the way we serve hreflang tags on our pages, and we knew our implementation was correct.
In the <head> of both http://www.spanishdict.com/translate/caminar and http://www.spanishdict.com/traductor/caminar were the tags:
<link rel="alternate" hreflang="en" href="http://www.spanishdict.com/translate/caminar"/>
and
<link rel="alternate" hreflang="es" href="http://www.spanishdict.com/traductor/caminar"/>
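As a rough illustration of how a pair of tags like these could be generated for any word, here is a minimal Python sketch. It is not SpanishDict's actual backend code: the names BASE_URL, PATH_BY_LANG, and hreflang_link_tags are hypothetical, and the URL pattern is assumed from the two example URLs in this post.

```python
from html import escape
from urllib.parse import quote

# Assumption: URL layout inferred from the two example URLs in this post.
BASE_URL = "http://www.spanishdict.com"
PATH_BY_LANG = {"en": "translate", "es": "traductor"}


def hreflang_link_tags(word):
    """Return one <link rel="alternate"> tag per language for a word."""
    tags = []
    for lang, path in PATH_BY_LANG.items():
        # Percent-encode the word so accented characters are URL-safe.
        href = f"{BASE_URL}/{path}/{quote(word)}"
        tags.append(
            f'<link rel="alternate" hreflang="{lang}" '
            f'href="{escape(href, quote=True)}"/>'
        )
    return tags


for tag in hreflang_link_tags("caminar"):
    print(tag)
```

The important detail is that both pages carry the same full set of tags, each page listing itself and its counterpart.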
In an effort to get our tags recognized, we read just about every available post on hreflang tags. We scoured the Google Webmaster Forums. We used page validation tools that indicated our tags were correct. We viewed our page source as Google sees it and confirmed that the tags were present. We even moved our tags to the top of the <head> in case that somehow made them more visible to the Google crawler. None of this remedied the situation. We were at a loss, so we turned to alternate methods of conveying hreflang information to Google.
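One of the checks mentioned above, confirming that the tags are present in the page source, can be approximated with Python's standard-library HTML parser. This is only a sketch and not one of the validation tools we used. The HreflangCollector class and find_hreflang_tags function are hypothetical names, and the check only shows that the tags exist in the markup, not that Google's crawler honors them.

```python
from html.parser import HTMLParser


class HreflangCollector(HTMLParser):
    """Collects (hreflang, href) pairs from <link rel="alternate"> tags."""

    def __init__(self):
        super().__init__()
        self.alternates = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <link .../> are also routed here.
        if tag != "link":
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "alternate" in rel_values and attr.get("hreflang") and attr.get("href"):
            self.alternates.append((attr["hreflang"], attr["href"]))


def find_hreflang_tags(html):
    """Return every (hreflang, href) pair found in an HTML string."""
    collector = HreflangCollector()
    collector.feed(html)
    return collector.alternates


page = (
    "<html><head>"
    '<link rel="alternate" hreflang="en" '
    'href="http://www.spanishdict.com/translate/caminar"/>'
    '<link rel="alternate" hreflang="es" '
    'href="http://www.spanishdict.com/traductor/caminar"/>'
    "</head><body></body></html>"
)
print(find_hreflang_tags(page))
```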
Other Ways to Serve Hreflang Tags
In addition to adding tags directly on the page, Google allows you to include this information in your sitemap or to send it in the page's HTTP response header.
Putting hreflang information in our sitemap would not have been a good approach for a few reasons. Our sitemap has millions of entries and we are constantly adding new pages to SpanishDict. We didn't want to have to frequently regenerate our sitemap to make sure that Google would have hreflang information for our latest pages. Our backend also runs through a non-trivial amount of logic to determine whether to provide hreflang tags for a particular URL.
Because of our reluctance to use the sitemap approach, we turned to HTTP headers. We initially had reservations about doing this because everything we had read indicated that sending tags in the response header is more of a specialized approach for pages that serve non-HTML files like PDFs.
Nevertheless, we gave it a shot, removing the HTML tags from our pages and adding a couple of lines of code to send the hreflang tags in the HTTP header. Now when you visit http://www.spanishdict.com/translate/caminar, the server returns hreflang information in the Link header:
Link: <http://www.spanishdict.com/translate/caminar>; rel="alternate"; hreflang="en", <http://www.spanishdict.com/traductor/caminar>; rel="alternate"; hreflang="es"
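For readers who want to try the same approach, here is a minimal Python sketch of how a header value in this format could be assembled. It is not the code we deployed, and the function name hreflang_link_header is hypothetical. How the value gets attached to the response depends on your web framework.

```python
def hreflang_link_header(alternates):
    """Build a Link header value from (language, url) pairs.

    Each alternate becomes <url>; rel="alternate"; hreflang="lang",
    and the entries are joined with commas into a single header value.
    """
    parts = [
        f'<{url}>; rel="alternate"; hreflang="{lang}"'
        for lang, url in alternates
    ]
    return ", ".join(parts)


header_value = hreflang_link_header([
    ("en", "http://www.spanishdict.com/translate/caminar"),
    ("es", "http://www.spanishdict.com/traductor/caminar"),
])
print("Link: " + header_value)
```

As with the HTML tags, both the translate and traductor responses should send the same full list of alternates.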
So Did It Work?
When we deployed this change on February 6, 2017, Google counted about 5,000 SpanishDict pages with hreflang tags. Four days later, Google recognized 60,000 tags, and that number has been increasing ever since. Today the number is 2.1 million!
How Did This Affect Spanish Speaker Pageviews?
Making this change had an immediate, positive effect on the total number of SpanishDict pages viewed by Spanish speakers.
In the chart of total daily Spanish speaker pageviews, the red line shows 2016 and the blue line shows 2017. The 2017 pageviews track the 2016 pageviews extremely closely until February 6th, the day we started sending hreflang tags in the HTTP header. On February 5th, Spanish speaker pageviews were up 5% year over year (YoY). By February 15th, they were up 30% YoY. Hreflang tags really matter!
Looking only at Spanish speaker pageviews that came directly from Google tells the same story: they jumped from up 100% YoY to up over 200% YoY.
As expected, there was no noticeable change to our English speaker traffic.
So Why Couldn't Google See Our Tags?
Unfortunately, we don't know. Because Google's crawler is such a black box, we have no idea whether it was a change on their side or on our side that caused the tags to no longer be recognized. Even if we knew the problem was due to some change we had made, performing a git bisect would be impractical for many reasons.
My hunch is that a script, potentially some advertisement code, on our site was preventing Google from "seeing" the tags. If you've ever noticed anything similar or have any ideas about what might have been going on, please let me know in the comments!