Jonathan Leger – SEO And Internet Marketing Blog

May 24, 2012

Anchor Text Optimization – How Much Is Too Much?

In my last blog post I talked about how to rank in Google after their Penguin update. One of the things I pointed out is that Google is penalizing sites for "anchor text over-optimization." That is, if you get too many links with the exact keywords you want to rank for in the anchor text, you might get penalized.

Notice that I said you might get penalized. That's because there are some circumstances in which sites aren't being penalized. More on that in a bit.

After reading about Google going after over-optimized sites, I decided to do some data mining and see if I could figure out exactly how much exact-match anchor text is too much. Here's what I did:

1. I gathered 1,500 keywords from 30 very diverse markets. Everything from business to technology to transportation to food and beverage to chemicals. A wide range of markets.

2. I ran those keywords through Google to get all of the top-level domains ranking on the first page for the keywords. I removed all ranking inner page urls from the data set.

3. I checked the link profiles of all of those domains to see what percentage of their links had anchor text matching the keywords they were ranking for.

Before we continue, let me explain why I removed the inner page urls from the data set. As I also pointed out in my previous blog post about Penguin, Google is favoring authority ("big box") sites in their search results more highly than ever. That means that an authority site can get an inner page ranked with very few backlinks or none at all. Wikipedia is a big example, as is Amazon.com. They're all over the search results after Penguin, as are other authority sites.

Because those authority sites would skew the results of my data mining (since they would typically have tiny exact-match anchor text percentages), and because I'm pretty sure most of my blog readers don't run huge authority sites, I only logged the data for the top-level domains whose home pages were ranking for the keywords. That is, if mydomain.com was on page one, I kept it, but if mydomain.com/innerpage.html was ranking, I dropped it. So the results you're seeing reflect sites that are more typical of what the average webmaster would be able to achieve.
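As a rough illustration of that filtering rule (not the actual tooling used for this post), here's a minimal Python sketch. The function name `is_home_page` is made up for the example, and it assumes every URL includes its scheme (http:// or https://).

```python
from urllib.parse import urlparse

def is_home_page(url):
    """Return True if the URL points at a site's home page, not an inner page.

    Assumption: the URL includes its scheme, e.g. "http://mydomain.com/".
    """
    parsed = urlparse(url)
    # A home page has an empty or root path and no query string.
    return parsed.path in ("", "/") and not parsed.query

# Hypothetical page-one results for one keyword.
results = [
    "http://mydomain.com/",
    "http://mydomain.com/innerpage.html",
    "http://en.wikipedia.org/wiki/Plumbing",
]
kept = [url for url in results if is_home_page(url)]
```

Only the first URL survives the filter, which matches the keep/drop rule described above.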

Okay, onto the numbers. Some of what I learned was quite enlightening.

(For the purposes of this blog post, I'll refer to the percentage of links whose anchor text exactly matches the keywords the site is ranking for as its EMA. Also, all of the queries were done via Google.com, so the ranking sites generally favor the USA. Lastly, all of the linking data was gathered using KeywordCanine.com.)
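To make the EMA definition concrete, here's a small Python sketch of how the percentage could be computed from a list of anchor texts. The function name is hypothetical, and the matching rule (case-insensitive, with extra whitespace collapsed) is my assumption for the example; it isn't necessarily how KeywordCanine.com counts a match.

```python
def exact_match_anchor_pct(anchors, keywords):
    """Percentage of anchor texts that exactly match the keywords (the EMA).

    Assumption: matching ignores case and collapses repeated whitespace.
    """
    if not anchors:
        return 0.0
    target = " ".join(keywords.lower().split())
    matches = sum(1 for a in anchors if " ".join(a.lower().split()) == target)
    return 100.0 * matches / len(anchors)

# Hypothetical link profile: two of the four anchors match exactly.
anchors = ["work from home", "Work From Home", "click here", "http://example.com"]
ema = exact_match_anchor_pct(anchors, "work from home")
```

With this sample profile the EMA works out to 50%.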

 
1. The Average EMA Is Pretty Low

It probably won't come as much of a surprise for me to tell you that the average EMA for a site is pretty low -- just 10% across all of the markets. So, on average, only 10% of the links to a ranking site contain the exact keywords the site is ranking for. But don't take that as the standard to aim for, because it varies a lot across markets.

Here's the full list of the markets, their average EMA, and the Maximum EMA found for any site that ranks for its keywords in that market. For the Maximum EMA, only sites with at least 50 unique domains linking to them were considered.

Average EMA across all markets: 10%

Market Avg EMA Max EMA
business 9% 97%
home based business 17% 57%
work from home 14% 53%
b2b 4% 22%
business management 5% 25%
b2b ecommerce 18% 18%
management 9% 49%
office management 4% 8%
online stores 36% 36%
semiconductors 4% 15%
software 5% 60%
startup 9% 33%
technology 6% 80%
transportation 3% 26%
virtual server hosting 8% 11%
web design 18% 98%
web hosting 8% 62%
food and beverage 8% 8%
office supplies 2% 36%
biotechnology 19% 42%
email hosting 15% 26%
office backup 6% 10%
plumbing 29% 92%
agriculture 4% 44%
pharmaceutical 9% 60%
human resources 12% 40%
internet security 1% 16%
chemicals 6% 93%
ecommerce 4% 41%
aerospace 6% 54%
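
For anyone who wants to build a similar summary from their own data, here's a hedged Python sketch of the aggregation. The row layout (market, EMA, number of linking domains) and the function name are assumptions for the example. As described above, the 50-domain cutoff applies only to the Maximum EMA, not to the average.

```python
from collections import defaultdict

def summarize_markets(rows, min_domains=50):
    """Return {market: (average EMA, max EMA)} for an iterable of
    (market, ema_pct, linking_domains) rows.

    The max only considers sites with at least `min_domains` linking
    domains; it is None if no site in the market qualifies.
    """
    all_emas = defaultdict(list)
    eligible = defaultdict(list)
    for market, ema, domains in rows:
        all_emas[market].append(ema)
        if domains >= min_domains:
            eligible[market].append(ema)
    summary = {}
    for market, values in all_emas.items():
        avg = sum(values) / len(values)
        max_ema = max(eligible[market]) if eligible[market] else None
        summary[market] = (avg, max_ema)
    return summary

# Hypothetical rows, not the data behind the table above.
rows = [("plumbing", 10.0, 200), ("plumbing", 92.0, 60), ("plumbing", 100.0, 5)]
summary = summarize_markets(rows)
```

In this made-up sample the site with a 100% EMA has only 5 linking domains, so it counts toward the average but not toward the maximum.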

From these figures you can see that the average EMAs are really low, but that there are usually sites with a much higher EMA than the average ranking on page one for their keywords. So who's getting away with having a higher EMA, and how are they doing it? Read on.

 
2. The Get Out Of Jail Free Card - Exact Match Domain Names

In pretty much every market I tested, Google is ranking one or more exact-match domains (EMDs). What I mean by "exact match domain" is a domain whose name is made up of all of the terms in the keywords it's ranking for. That is, if workfromhomeonline.us is ranking for the keywords "work from home online", that's an exact match domain for the keywords. That site is ranking for those keywords, by the way, despite having a 53% EMA.

However (and this discovery goes against much of what's said about exact match domains), domain names that have dashes in between the terms also appear to be benefiting from this exception to the over-optimization penalty. That is, work-from-home.biz is ranking on page one for "work from home" despite having a 70% EMA.

Those two examples also bring to light something else that goes against common SEO knowledge -- Google is not penalizing EMDs even if they aren't one of the "big four": .com, .net, .org and .edu. The "lesser" domain name extensions are also being exempted: .info, .biz, .us, .ie, .ro, etc. They all appear to be averting punishment despite having very high EMAs.

Here's a list of some of the exact-match domains ranking in Google in the markets I tested. Each of the ones in the list has a 40% EMA or higher:

Keywords Rank* URL EMA
internet business expert 6 http://internetbusinessexpert.co/ 97%
work from home 4 http://www.work-from-home.biz/ 70%
work from home online 2 http://workfromhomeonline.us/ 53%
work from home india 6 http://www.workfromhomeindia.biz/ 50%
work from home ideas 6 http://www.workfromhomeideas.us/ 67%
b2b strategy 5 http://b2b-strategy.ro/ 42%
business management solutions 6 http://www.businessmanagementsolutions.ie/ 100%
pain management 10 http://painmanagement.com/ 72%
music management 7 http://www.musicmanagement.com/ 49%
music management software 8 http://www.music-management-software.com/ 100%
network management software 3 http://www.networkmanagementsoftware.com/ 40%
obsolete semiconductors 2 http://www.obsoletesemiconductors.net/ 100%
technology new 8 http://www.technologynew.org/ 86%
computer technology 10 http://www.computertechnology.com/ 67%
la crosse technology 1 http://www.lacrossetechnology.com/ 80%
technology books 4 http://www.technologybooks.com/ 55%
creative technology 3 http://www.creativetechnology.com/ 58%
web design company 1 http://www.webdesigncompany.net/ 88%
affordable web design 5 http://www.affordablewebdesign.net/ 53%
new web design 1 http://newwebdesign.com/ 57%
top web hosting 8 http://topwebhosting.com/ 50%
web hosting review 6 http://www.web-hosting-review.net/ 41%
web hosting canada 2 http://webhostingcanada.org/ 51%
food beverage canada 2 http://www.foodbeveragecanada.com/ 48%
biotechnology companies 4 http://www.biotechnologycompanies.net/ 50%
exchange email hosting 6 http://www.exchangeemailhosting.com/ 100%
plumbing contractors 4 http://www.plumbingcontractors.com/ 56%
plumbing how to 6 http://www.plumbinghowto.biz/ 100%
home improvements maryland 10 http://homeimprovementsmaryland.net/ 100%
pharmaceutical jobs 6 http://www.pharmaceutical-jobs.com/ 100%
pharmaceutical jobs 8 http://pharmaceutical-jobs.org/ 60%
pharmaceutical technology 3 http://www.pharmaceutical-technology.com/ 48%
computer internet security 8 http://computerinternetsecurity.org/ 92%
pharmaceutical chemicals 3 http://www.pharmaceuticalchemicals.com/ 100%
ecommerce marketing 6 http://www.ecommerce-marketing.com/ 67%
bigelow aerospace 1 http://bigelowaerospace.com/ 54%
plumbing supplies 9 http://www.plumbingsupply.com/ 92%

* Due to Google's localization, the rank you see the site at may be different than what's shown in the table.

Just look at those EMAs! Google is clearly letting those domains get past any anchor text over-optimization penalty. Also notice the last entry ("plumbing supplies"). Google is giving a pass to plumbingsupply.com even though its exact match anchor text is for "plumbing supplies", not "plumbing supply." So it seems Google is also exempting exact match domains which it determines have a variation of the keywords in the domain (in this case "supply" instead of "supplies").
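
The strict version of the EMD definition used in this section is easy to sketch in Python. This is only an illustration with a hypothetical function name: it treats dashes as optional, ignores the extension, and assumes the site lives on the bare domain or on www. It deliberately does not handle keyword variations, so the plumbingsupply.com case would need extra logic (stemming, for instance) that I'm not attempting here.

```python
from urllib.parse import urlparse

def is_exact_match_domain(url, keywords):
    """Return True if the domain name is made up of exactly the keyword terms.

    Assumptions: the URL includes its scheme, and the site is on the bare
    domain or the www subdomain. Dashes between terms are ignored, and the
    extension (.com, .biz, .us, ...) is not considered at all.
    """
    host = urlparse(url).hostname or ""
    if host.startswith("www."):
        host = host[4:]
    name = host.split(".")[0]  # the label before the first dot
    return name.replace("-", "") == "".join(keywords.lower().split())

dashed = is_exact_match_domain("http://www.work-from-home.biz/", "work from home")
plain = is_exact_match_domain("http://workfromhomeonline.us/", "work from home online")
variant = is_exact_match_domain("http://www.plumbingsupply.com/", "plumbing supplies")
```

Here `dashed` and `plain` are True, while `variant` is False because "supply" and "supplies" are different strings.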

So why is Google letting these guys get by without a penalty? It makes sense, really.

If Google penalized sites with a high EMA even if their domain name was an exact match for the keywords, they would end up dropping all kinds of brand-name sites out of the rankings. Think about it: what anchor text do most people use when linking out to a brand-name site (like Ford or Adobe or Amazon, etc.)? Their name, of course! So Google has to give those sites a pass despite having a high EMA. It just makes sense that Google won't penalize you for links with anchor text matching the keywords in your domain name. That's often the site's brand, and it will naturally have a much higher EMA for those keywords.

Is Google giving a pass to all EMDs with very high EMAs? The data can't answer that question. But clearly they are giving a pass to a lot of them.

 
3. The Ranking Exact-Match Domains Have A Lot Fewer Links

Another important point about the ranking exact-match domains versus all of the other ranking top-level domains: they have a lot fewer links aimed at them. In fact, on average the EMDs only have about 15% as many links as the other ranking sites.

One stand-out example is pharmaceutical-jobs.com, which is ranking on page one for (of course) "pharmaceutical jobs." It only has about two dozen external domains linking to the entire site. The other top ranking results typically have many hundreds or thousands of domains linking to them. Clearly Google is highly favoring the EMD in this case.

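Here's a breakdown, by market, of the average number of unique external domains linking to all of the ranking sites versus the average for the ranking EMD sites. The last column shows the EMD average as a percentage of the overall average: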

Market Avg Links Avg EMD Links EMD Links As % Of Avg
business 1,901 1,556 82%
home based business 156 225 44% more
work from home 204 85 42%
b2b 289 23 8%
business management 70 1 1%
management 350 79 22%
office management 138 6 4%
semiconductors 644 2 0%
software 1,438 417 29%
startup 135 64 47%
technology 414 115 28%
transportation 1,166 121 10%
web design 435 475 9% more
web hosting 2,949 159 5%
food and beverage 430 40 9%
office supplies 371 40 11%
biotechnology 72 2 3%
email hosting 143 278 94% more
office backup 70 70 100%
plumbing 502 5 1%
agriculture 1,232 595 48%
home improvements 970 1 0%
pharmaceutical 243 135 55%
human resources 260 68 26%
internet security 930 11 1%
chemicals 273 35 13%
ecommerce 2,835 745 26%
aerospace 316 137 43%
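
The percentage column is just the EMD average divided by the overall average. Here's a short Python sketch with a hypothetical function name, using the rounded figures from a few rows of the table; because those averages are already rounded, a recomputed value can be off by a point from what's shown.

```python
def emd_link_share(avg_links, avg_emd_links):
    """EMD average as a whole-number percentage of the overall average.

    A result above 100 means the EMDs averaged more links than the
    ranking sites in that market as a whole.
    """
    return round(100 * avg_emd_links / avg_links)

business = emd_link_share(1901, 1556)
b2b = emd_link_share(289, 23)
office_backup = emd_link_share(70, 70)
```

For those three rows the sketch gives 82, 8 and 100, in line with the table.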

 
Some Non-EMD Sites Are Also Getting Away With It

The data also shows other sites with very high EMA values getting a pass from Google even if they don't have an exact-match domain name and aren't a brand. Why Google is giving those sites a pass isn't clear. For example, addonchat.com is ranking for "chat software" with a 60% EMA, and plimun.com is ranking for "web design" with a whopping 98% EMA. If I figure out why Google is letting these sites get away with that, I'll definitely be blogging about that, too.

 
The Take-Away

So what can you take away from all of this data and these numbers? In short, exact-match domain names are your friend! They can be ranked with a lot fewer links and apparently have a much better chance of not getting penalized for anchor-text over-optimization. This includes the "lesser" domains (.info, .biz, etc.), as well as domain names with the keywords separated by dashes (e.g. work-from-home.biz).

Also, if you're not using EMDs, it's important to diversify your anchor text a lot. How much is "a lot" really depends on your market. So do the research. Check out the link profiles of other ranking sites in your market and see what their anchor text looks like.
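
If you can export a competitor's backlink anchors from whatever link tool you use, a few lines of Python will show how their anchor text is distributed. This is a minimal sketch: the function name and the sample anchors are made up, and anchors are compared case-insensitively as an assumption.

```python
from collections import Counter

def anchor_distribution(anchors, top=5):
    """Return the `top` most common anchor texts as (text, percent) pairs.

    Assumption: anchors that differ only in case or spacing are the same.
    """
    normalized = [" ".join(a.lower().split()) for a in anchors]
    total = len(normalized)
    counts = Counter(normalized)
    return [(text, 100.0 * n / total) for text, n in counts.most_common(top)]

# Hypothetical anchors exported for one competing site.
anchors = ["Web Design", "web design", "click here", "Acme Studio"]
distribution = anchor_distribution(anchors)
```

The first entry in `distribution` is the site's most-used anchor text and its share of all links, which you can compare against the other ranking sites in your market.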

If you have any questions or comments, or would like to suggest other post-Penguin ranking factors for me to dig into in a blog post, please leave a comment below. Your feedback is always welcome!
