Jonathan Leger – SEO And Internet Marketing Blog

14 Dec 2009

How big is Google? Why should you care?

Remember way back when Google used to show the number of pages they have indexed on their home page? Remember the war between Yahoo and Google where they competed to get the most pages indexed? It seemed that the operators of the big engines felt that if their index was bigger, they were the better search engine. It was fun to watch at the time, but eventually those numbers quietly disappeared.

However, when the Cuil search engine came out last year, its creators made the bold claim that it was the largest search engine in the world. Its home page, as of this blog post, proclaims its index to be 127 billion pages. Interestingly, just three days before Cuil officially launched on July 28, 2008, Google made the statement that their search engine was aware of one trillion URLs, but added that they "don't index every one of those trillion pages." For a moment I wondered whether the number wars were going to start up again. They didn't.

But is it true? Does Cuil index more pages than Google? And why should you care either way?

First, let's see if it's true, then we'll talk about the implications for you as a webmaster.

Here's a simple way to find out: search for a single-word term in Cuil and Google and see how many pages come back in the result counts. For optimal results, the word should be extremely common -- likely to appear on just about every single (English) content page indexed by both engines.

For example, the word "the". I searched for "the" in Cuil and Google (and, for fun, Yahoo and Bing). Here are the numbers I got back, sorted with the engines having the most results first.

Results for "the"

Engine    Results
Cuil      89,042,476,840
Yahoo     32,700,000,000
Google    14,850,000,000
Bing       6,700,000,000

Rather dramatic differences! Based on these results, Cuil does seem to index a lot more pages than Google and the other major search engines (at least pages written in English).

Remember, though: Google made it clear that they don't index every page they are aware of. In fact, assuming that Cuil actually indexes most of the publicly available content on the web, that means that Google is choosing not to index more than 80% of pages which contain the word "the" (which it's safe to say appears on pretty much all content written in English).
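The "more than 80%" figure can be checked with a few lines of arithmetic. The sketch below uses the result counts reported above and treats Cuil's count as a stand-in for the whole English web. That is this post's working assumption, not a measured fact, and result counts from any engine are only estimates. The function name share_of_baseline is made up for illustration.

```python
# Result counts for the query "the", as reported in this post.
counts = {
    "Cuil": 89_042_476_840,
    "Yahoo": 32_700_000_000,
    "Google": 14_850_000_000,
    "Bing": 6_700_000_000,
}


def share_of_baseline(counts, baseline="Cuil"):
    """Return each engine's count as a fraction of the baseline engine's count."""
    base = counts[baseline]
    return {engine: n / base for engine, n in counts.items()}


shares = share_of_baseline(counts)
for engine, share in shares.items():
    # "missing" is the fraction of the baseline's pages this engine does not report.
    print(f"{engine:<7} reports {share:6.1%} of Cuil's count, {1 - share:6.1%} missing")
```

Under that assumption Google reports roughly a sixth of Cuil's count, which is where the 80%-plus gap comes from.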

What causes Google to filter a page from its index? The previously referenced blog post on Google's blog says that "many [pages] are similar to each other, or represent auto-generated content ... that isn't very useful to searchers."

Google is notorious for making vague statements that are understood by just about nobody. So what's the real truth? Let's dissect their statement a bit and find out.

Google says that pages which are "similar to each other" aren't necessarily indexed. These kinds of statements from Google have really caused a lot of misunderstanding and the dissemination of misinformation by self-proclaimed gurus of search engine optimization, who often claim that your page will get penalized if it's a duplicate of some other page.

We can prove from Google's own results that the engine does, in fact, index duplicate content. How? It's easy: Hop over to EzineArticles.com and grab the title of the most published article in any given category, then search for that title in Google using the "intitle:" operator.

For example, the most published article in the last 60 days in the Finance category right now is titled "Same Day Loans - When You Are Running Out of Options!"

Go to Google and search for this:

intitle:"Same Day Loans - When You Are Running Out of Options!"

Right now 7 results show up. When I click the link at the bottom of the results to show duplicates as well, I get 87 results. That means Google has 87 copies of the same article in its index. Clearly, the fact that the content is the same doesn't prevent Google from adding a page to its index.
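If you want to repeat this check for several article titles, building the search URLs by hand gets tedious. Here is a minimal sketch using only Python's standard library. The helper names are made up for illustration, and the filter=0 parameter is an assumption: it has commonly been used as the URL equivalent of clicking the "show duplicates" link, but Google does not document it and it may stop working.

```python
from urllib.parse import urlencode


def intitle_query(title):
    """Wrap an article title in Google's intitle: operator with exact-match quotes."""
    return f'intitle:"{title}"'


def google_search_url(query, show_duplicates=False):
    """Build a Google search URL for the given query string."""
    params = {"q": query}
    if show_duplicates:
        # Assumption: filter=0 asks Google to include the results it normally omits.
        params["filter"] = "0"
    return "https://www.google.com/search?" + urlencode(params)


title = "Same Day Loans - When You Are Running Out of Options!"
print(google_search_url(intitle_query(title)))
print(google_search_url(intitle_query(title), show_duplicates=True))
```

Open both URLs in a browser and compare the result counts. The script only builds the links, it does not fetch or parse anything.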

Reading Google's own words about duplicate content in their support material gives the impression that when they refer to duplicate content, they're mostly talking about content on the same site. They also state that "duplicate content on a site is not grounds for action on that site unless it appears that the intent of the duplicate content is to be deceptive and manipulate search engine results."

So Google appears to be indicating that similar content is not indexed if it's perceived to be for search engine manipulation.

That may be their goal, but it's not the case in actuality. Content is very often created and syndicated for the purpose of building links that get a page ranked in Google. That practice works very well, too. Perhaps Google takes action if the duplication is egregious enough, but unless you're doing large-scale duplication and distribution you generally have nothing to worry about.

The second part of Google's statement indicated that a page may not be indexed if it's "auto-generated content ... that isn't very useful to searchers." An example they give is a calendar script that would create an infinite number of pages if the search engine crawler kept following all of the links for all of the dates going forward in time. Google was not specifically talking about page content that's generated by software.

Again, we can prove that Google indexes software-generated content by using their own index. I went to Google's shopping page and clicked on one of the "recently found" links (in this case, it was "bicycle trailers"). The first result was from Amazon.com, which has an API that allows people to use Amazon product information on their own sites (e.g., you can create Amazon-like product pages using software).

I searched at Google for two sentences found in the product description (with quotes):

"The Burley Solo trailer is the top of the line when it comes to a single child trailer. With its newly designed reclining seat and full suspension axel your child will be riding in comfort."

That search returned 10 pages, and when I clicked the link for showing filtered results, 46 pages. So obviously Google has no problem indexing that kind of content, either.

So what is it, then, that will prevent Google from indexing a page? The answer is simple, and it's one that you'll probably never hear from Google.

Sometimes I wonder why Google bothers putting up support materials when they never give you a real answer to anything. They want to be vague to prevent people from manipulating the results, but guess what? People manipulate it all the time anyway!

So here it is, the real answer to why Google's index is 80% smaller than Cuil's, and why so many pages go unindexed:

No links = No Indexing

That is, if a page is a duplicate of some other page on some other site, but the page has no links to it, Google will crawl the page -- and even put it in their index for a while -- but after a few days or weeks the page will usually be removed from their index.

I say "usually", because if the site that the duplicate page appears on has enough links to it overall, then the page will stay indexed even if there aren't any links directly to it. That's why sites like EzineArticles.com, for instance, have 4,690,000 pages in Google's index even though many of those pages don't have any external links to them -- the site as a whole has enough links for Google to feel it's worth keeping anything that appears on that site indexed.

That makes sense, right? Why muck up the index with massive amounts of duplicate content that isn't important enough for anybody to link to it?
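To make the rule of thumb concrete, here is a toy model of it in Python. This is only a sketch of the heuristic as I've described it for duplicate pages, not Google's actual algorithm. The class name, the function name and especially the threshold of 1000 site links are hypothetical, since nobody outside Google knows what "enough links" means in numbers.

```python
from dataclasses import dataclass

# Hypothetical cut-off: this post gives no number for "enough links".
SITE_LINK_THRESHOLD = 1000


@dataclass
class DuplicatePage:
    url: str
    direct_links: int  # external links pointing straight at this page
    site_links: int    # external links pointing at the hosting site overall


def likely_to_stay_indexed(page, site_threshold=SITE_LINK_THRESHOLD):
    """Apply the rule of thumb: direct links, or a well-linked host site, keep a page indexed."""
    if page.direct_links > 0:
        return True
    return page.site_links >= site_threshold


pages = [
    DuplicatePage("lesser-directory.example/my-article", direct_links=0, site_links=40),
    DuplicatePage("lesser-directory.example/bookmarked-article", direct_links=3, site_links=40),
    DuplicatePage("big-directory.example/my-article", direct_links=0, site_links=250_000),
]
for page in pages:
    print(page.url, "->", "stays" if likely_to_stay_indexed(page) else "likely dropped")
```

The first page is the one to worry about: no links of its own and a host site without much link weight behind it.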

What all of this means for you as a webmaster is simple: if you're going to distribute articles or other duplicate content in order to build links to your web site and rank better in Google, you need to make sure that the content you distribute is linked to by other external pages. Whether you accomplish that by social bookmarking or writing additional articles on EzineArticles that link to your articles on "lesser" sites or through some other method, you need to be sure that the content is linked to.

So, sure, Cuil's database might be a lot larger than Google's, but that doesn't mean that it's better -- Google just does more filtering. That's important for you, because if you want your content to stick around, you need to make sure Google considers it valuable by throwing some links at it.

Please post your thoughts and questions in a comment below.

Comments (99) Trackbacks (0)
  1. I think it is intelligent how Google does things so far. If it says it would not index every page on the Web, it wouldn’t for a reason–one of those was said above in this article. I mean, who would want duplicate/triplicate search results when what we want most of the time is to get the most relevant in our search results? I don’t know the search algorithms of Cuil (in fact I just heard about it here), but it’s not just the numbers–it’s relevance that matters.

  2. I’m really surprised that Yahoo! pulled more pages than Google. I don’t understand how Google can be aware of 1 trillion pages but not index them all, does that really count?

  3. Very well thought out and informative. I’m sure many others enjoy reading this too, but are just a little scared to post – anyway – thanks again!

  4. Thanks for sharing, this is really useful info for online jewelry marketers like me! =)

  5. As far as search engine is concerned Google is No1 and no one is near it. I don’t think anyone can even beat Google search engine.

  6. i really love Google, Google is very useful to us! Great! thanks for sharing this.

  7. I use Google from so many years and I always enjoy reading about it so much. Thanks for sharing it with us.

  8. Google controls how people use the web (videos, search, maps, email, and a host of other services). Suppose Google plays hardball with Apple, for example by introducing a nifty new Flash-based service in Google Maps (which works on Android, RIM, Palm, etc.) or, for that matter, in YouTube. I’m not sure what Apple would do. Google might get some negative press, but Apple will definitely be left out in the cold. Ultimately, this fight might cost consumers in the short term, until there is a level playing field, which Mr. Jobs is not acknowledging right now.

  9. Google is every thing today. that is the reason people are around google today.

  10. Great post but i already know the nonsense of duplicate content….thanks

  11. Thank you very much for sharing. This site is the best forever.

  12. If you want to have some fun showing Google’s hypocrisy, do searches for conservative value terms and compare them across search engines, then do one for liberal terms.
    I can all but guarantee you that anything to do with “green” or “environmentalism” will get indexed rather quickly and have staying power.
    LOL
    AL

  13. How does the new Google earth feature “street view” work?

  14. You really share very informative article about Google…great information thanks…

  15. Google is very big and is also a very helpful, important search engine. It has dropped one of the biggest PageRank nukes, which affected even big-name bloggers like ProBlogger, Andy Beard and Copyblogger.

  16. This is very interesting.

    Of course, I am mega ignorant when it comes to search engines. However, I had never heard of CUIL.

    I used it to search for my favorite hobby shop, and Google had many times more pages relative to the business, ToysPeriod, than CUIL.

    So, even with all the pages indexed, I’m wondering how Google compares with regard to appropriate content with CUIL.

    At least in my search, Google was far superior.

    Then I think, companies can work as hard as they want, but if people are using them, businesses are better off sticking with what brings in the business.

    Any information regarding what the traffic comparison might be comparing Yahoo to Google to CUIL would be interesting.

    My impression is that Google has a lion’s share of the internet traffic searches outside China.

    Beth

  17. Thanks for dispelling the duplicate content myth. I absolutely agree with you, it’s all about the links.

  18. According to me Google is the king. Google is amazing: I am trying to do a project for college about a car. I go to Google, click a link, and it takes me someplace to buy a car. It’s really amazing, thanks.

  19. It is scary the power of Google. But when any organism gets too big it starts to decay. US steel, Standard Oil, IBM, Microsoft, Yahoo, all got huge and then stagnated or shrunk.

  20. Why do people find it impossible to spell correctly and use Google? very interesting blog post thanks man.

  21. Great blog post and very nice article, Jonathan. This really helps me a lot in my linking, thanks.

  22. Great stuff as always Johnathan. This will be interesting to follow over the coming years. Perhaps the search engine wars are not over yet?

  23. Well said dude!! I found your post while searching Google and I am glad I found it. I have submitted my site to article sites, directories and blogs but I am not getting PR, so please let me know what I should do. Please reply as soon as possible.

  24. I feel like I’ve wasted so much time building links on pages that don’t have links pointing to them. It sure explains why it seemed like a lot of my link building only affected my Yahoo & Bing rankings.

  25. Thanks for clearing up the myth on duplicate content penalty. This is really useful information.

  26. Well, if you’re asking how Google compares to its competitors, then the answer is: Google is King. Yes, if it is possible to monopolize how people search for information, then Google is the reigning monarch.

  27. You are right that the following statement is a myth “Your page will get penalized if it’s a duplicate of some other page”

    Cutts of Google put out a video earlier this year explaining they don’t penalize for duplicate content, but they may/do penalize for spam content.

  28. Thanks for sharing such a nice article about Google. After reading this article I know how big Google is.

  29. Backlinks are important but quality of content is even more important in my opinion. A top ranking in Google but having so many visitors leave within 30 seconds does no good as well.

  30. The duplicate content Google should be most concerned with is the kind appearing on multiple sites.

  31. Good article Jonathan. I always learn a lot from your posts and really appreciate it! I wish you a Happy New Year. Regards, Martin

  32. The bigger they are the harder they fall. Google will outgrow itself. I can’t wait for the rise of organic search engines that are grass roots based similar to WordPress and Linux.

  33. I’m definitely going to do this with my links. Thanks for the tips John.

  34. Really good articles. Making sense to me. Thanks for sharing with us :)

  35. In my opinion it doesn’t matter how many sites are in the database, but what their quality is. If most of the sites found in a search engine are dead, damaged or dangerous, it is better not to find them or visit them.

  36. I’m already thinking about many more applications for networked learning we may yet see come out of Google Wave.

  37. It is interesting that cuil is the only one still bragging about their index size. I guess the other search engines are realizing that people aren’t always impressed with numbers alone. If the results are not relevant, it does not matter how many trillions of pages they have indexed.

  38. Wish we could go back to the “good old days” when there was MSN, Yahoo AND Google, all with reasonable shares of the search engine traffic.
    Had never fully understood why indexed pages could get dropped, but found the article (and comments) helpful.

  39. Since we’re on the subject of Google and its indexing habits, I thought I’d share a little trick I’ve been using. (Some of you older SEO folks may know this already.)

    Try this with your own site:

    In the Google search field, type site:yoursite.com

    (fill in YOUR domain name of course)

    then type site:yoursite.com/*

    See the difference?

    The /* will display all “truly” indexed pages in your site.

    leaving the /* out will show “supplemental pages” AND
    “truly” indexed pages.

    Have fun!

  40. You have a point Jon. Why would google want all of that information on their servers if no one is linking to it. It shows them that nobody is getting any benefit from it except the site owner.

  41. It just goes to show that with Google, the leader in the search engine market, you are naturally going to have to do more in order to stand out from the rest and have your website benefit from search engine traffic. Google is the leader of the pack and they are pushing the hardest to return the best results in their eyes; the most “relevant” results.

    With all of the new Google stuff that has been coming out, I can’t help but wonder how much their algo will change in just a few short years. It’s just so darn dynamic, the search engine game.

  42. Thanks Jon for another amazing post!!!

    Shannon

    On a side note, I have some support tickets at askjonleger.com that are more than a week old and still no response. I could really use some help on these… any chance you could look into the matter for me :)

  43. Sean @ Home Gyms:

    Generally, yes, but the sites you’ve named (EzineArticles, GoArticles, iSnare, etc.) are already well-linked enough that your articles will get indexed regardless.

    It’s for the lesser-linked article directories, blogs, etc. that you need to bookmark your articles so they remain indexed.

  44. I agree that Google is not indexing all the content they need to. For example, this small yoga site has all their pages indexed but I worked previously with a much larger site (20,000 pages) and only half of those were indexed. What I found to increase my success with the site was not only links to the main site but a number of deep links to the individual pages. It really helped to increase indexing since Google was following those links on to the deep pages and not having to go from my main page.

  45. Ok I see..

    So If I use one article.. and I distribute that exact same article to ezine articles, go articles, isnare, buzzle & article alley…

    Then… I should visit each page on those sites where my article is published and book-mark them? Or write another article and point it to those locations etc etc…

    Am I right in what I am saying here?

    Sean ;)

  46. Professor:

    My request for unique (spun) content is just an experiment. I almost never bother spinning or submitting unique content for the purpose of gaining backlinks.

    It’s EASIER to rank unique content, because you have less competition (Google will filter duplicates for any given search result, choosing the one with the most links as the “winner”), so I tend to try and rank unique content — but when gaining backlinks, I don’t bother going through the extra effort.

  47. Thanks Jonathan, for responding to my 2nd statement. I did as you suggested, and checked on the article sites to which I submitted an article last January. In February, when searching the article title, Google listed 92 sites and Yahoo 77. Yesterday, I checked again: Google only had 5 listings when I searched for the article title, but 309 listings when I searched for the author ( a unique pseudonym ). And Yahoo had 141 results when searching the title, and 90 results for the author.

    So I think that I will stick with my article submission service for a while. I don’t have the time to try to provide a link to each of the dozens of article sites they submit to, but I will try to do it for a few of them. However, the persistent Yahoo results make it worthwhile for me. I have found it much easier to rise in the Yahoo search rankings anyway. The website linked to by the article in question is still not on Google’s first page for any of its 3 main keywords, but it’s on Yahoo’s first page for all of them.

    You didn’t respond to my second statement directly, but I’m on your mailing list, so I got your invitation to submit a post to your new weight loss blog. In your submission requirements, you insist on “unique content”, so it seems that even though you’ve shown that it doesn’t seem to matter to Google, you still avoid duplicate content yourself.

  48. well I care!!!
    great post..

  49. It’s a neat trick by Google to start talking about how many URLs they know about, it inflates the numbers quite a bit and many will not understand that one page can have many urls pointing to it

    www . jonathanleger . com
    www . jonathanleger . com / index . php
    jonathanleger . com
    jonathanleger . com / index . php

    That was four URLs pointing to the exact same web page.

    Simon

  50. Well said. I’ve never taken much stock in the duplicate content rumors. The key to ranking is, as you’ve always said, keyword based links.

  51. This make sense.. thank you!

  52. Thank you Jonathan for this informative post as usual. However, I am having difficulty understanding the term “article sites” when you replied to Professor’s question. Are you talking about article directories or any sites that syndicate and accept articles?

    Thanks a lot.

  53. I use a method I call CashRank to solve the circular logic problem with link rep, it works like this:

    The basis for cashRank is that someone has to pay one dollar to have a page included, not to me but to anyone.

    Like if someone has a domain I know they are paying a domain renewal fee every year, lets say $10/year so I index the main page of a domain AND the first 9 pages linked to from that domain.

    This means I can’t get an infinite amount of pages in the index as there is always someone paying money to someone else for every page included.

    {snip: no personal links in the comments please}

    Simon

  54. Google is the key-player of the internet world, taking from PR to search results to indexing etc., these all are governed by Google mainly.

  55. Jonathan, Great post.

    Links are the key to page success. Most of us do not make links to inner pages. This reinforces that idea.

    Keller

  56. This goes with my mantra—pages and links, pages and links, and time.

    Sammy

  57. That’s a Great BlogPost! Really appreciate your work Jon. Thanks once again!

  58. Interesting find Jon, I never heard about cuil until today, but I will be checking it out more, it has more indexed pages than google? That’s definitely good for more keyword research and better chances of getting indexed, thanks.

  59. Professor:

    If the article sites you’re submitting to are well linked, then it’s not a problem. The best way to know is to check the URL of your articles in Google to see if the pages are indexed. If they are, you’re in good shape.

    If they’re not, then you can get them indexed by bookmarking them at social bookmarking sites and the like.

    Stan:

    It is an interesting situation, I agree. Articles need links from pages that need links, etc. Seems like circular logic. But the web is so huge now that if a page isn’t linked to by somebody else, then that page can be dismissed as unimportant. That’s how Google works right now, and since they command most of the market share, we have to work with it.

  60. I see what you’re saying Jon but this seems almost like perpetual motion or the trick of going round in circles ’til you disappear up your own exhaust pipe!
    The articles distributed for links also need links – then don’t those link generators also need links or do we just rely on the authority of where those secondary links are placed?
    Where does it end?

    Stan

  61. So what you’re saying is that I have wasted a whole lot of time and money on (1) spinning my articles to avoid duplicate content, and (2) submitting them to dozens of article sites without linking to them?

    Please respond – I can’t afford to continue to throw away any more time or money if it’s not going to do any good.

    I would also like a response to Shiva’s question.

  62. As usual, very informative and insightful post. It is nice to see someone post supportive content for why they say this or that is the way things work.

  63. Great post Jonathan. I have been doing exactly what you are talking about over the past few weeks and it has really paid dividends. Building backlinks to the lesser web2.0 sites has really helped my main site.

    Backlinks are King.

  64. There’s a big difference between spidering, indexing and displaying.

    Every time I’ve ever looked at my AWStats Google never fails to spider deeper than any other engine’s spider – bar none. On a site they ‘like’ (and that’s more an art than a science I’ve found) they’ll do the lot including, annoyingly, indexing even the site local search results if you’re not careful.

    If they ‘don’t like’ your site (and these are sites with almost identical inbound linking patterns to the ones they ‘like’) their spidering will be minimal thereafter and subsequent indexing and display will be paltry.

    Bottom line though, the site linked through from my name here currently has over 200,000 indexed pages (over 260,000 at its peak) and it’s primarily an article directory so there inevitably has to be a fair bit of duplicate content and there’s minimal deep linking. Google just happens to ‘like’ this site!

    In my own experience deep in-site linking has even more importance than inbound links when it comes to getting the Google spider active, but even that doesn’t affect whether Google will like you in the first place.

  65. I think another important thing is that Google has the biggest share of the search market. I cannot quote exactly, but when checking Nielsen stats (if I am not mistaken it was for US traffic) I saw that Google had well over 60% of the search market.

    And this is much more than 45% for Google, 40% for Yahoo about 5-6 years ago (the numbers are not exact, but close to that).

    So, thanks to all stuff that Google has been doing (including the media marketing) they managed to get the biggest part of the searches. That is why Cuil can be awesome, but until they grab at least 5% of this search market – they are simply not exciting for webmasters and SEO geeks.

  66. Jon

    It seems your Google watching geekiness knows no bounds. You always seem to make sense but of course with Google, you just never know. I will continue to strive to have unique self written content on my sites just in case and continue with my backlinking systems as normal, including using your tools!

    Jo

  67. Thanks John! This makes sense and should really help my linking efforts.

  68. It seems that the duplicate content penalty is not that Google doesn’t index it, but that they relegate it to their secondary index. 99.8% of people will never click the link at the bottom of the search results page to reveal the duplicate results, i.e. those in the secondary index.

    Jon, the breakthrough you’ve really brought to the table is that you can elevate those duplicates into the main index by linking to them. That is a real “ah ha” action point.

  69. Hi Jonathan,

    Thanks a lot for your post! I always heard about the importance of article marketing, but I didn’t understand the reason behind it. Now I understand and will do it with dedication. Thank you.

  70. I clearly get what you explain: we have to get links to our pages so that they get indexed in Google’s database.

    Thanks for the post.

  71. This probably explains why my blogger blog which only had articles from ezinearticles got de-indexed from Google recently. It just had duplicate articles on it.

    Let me build a few links to it and hope for the best.

  72. Hi Jon,

    That’s a great article and a definite eye opener for many, including me, about how Google is indexing pages and, more importantly, how it’s retaining them.

    But then I couldn’t clearly understand what you meant in the last but one paragraph:

    “if you’re going to distribute articles or other duplicate content in order to build links to your web site and rank better in Google, you need to make sure that the content you distribute is linked to by other external pages. Whether you accomplish that by social bookmarking or writing additional articles on EzineArticles that link to your articles on “lesser” sites or though some other method, you need to be sure that the content is linked to.”

    Can you explain this point again or with an example.

  73. Hey Jon,

    Insightful post as usual! I like how you’re always so methodical when trying to get to the root of things.

    I think Google is doing some smart things, given its objectives.

  74. I tried to get so many backlinks for my site with so many methods. But the methods discussed in this post will help me a lot to build strong backlinks. Thanks for sharing JON!!!

  75. Jon, the link in your email to this post is wrong, all I got was a 404 error. Unfortunately, since you’re using a “noreply” email address that is so anti-customer and since there’s no way to get to you via your support desk, I’ll have to inform you here.

  76. Great explanation Jon!

    It’s really no surprise if you look at the history of Google. The whole original algo was designed (as I understand it) around the premise that backlinks means more authority/credibility.

    So following that logic would mean (as you said) no links=no indexing.

  77. Wait a minute. Was that small voice, above, The Matt Cutts? Was Google confirming that it wants to see manufactured links? That if someone will run all over the web making links to themselves that it really does give them a competitive advantage?

    OK, then. I’m in. I’d been waiting for some authoritative confirmation before I began spreading links all over the web, if you will — no more.

    I had been disappointed when Terry Kyle’s grand link experiment on Warrior (124306) didn’t include a control site with no manufactured links. I must have been the last guy who wondered if Google would really allow a scenario where the biggest self-linkers win.

  78. I’ve only been marketing online for 4 months so please bear with me.
    Question.
    Is this part of the reason Ezine articles works better than other article directories? Google seems to index all/most Ezine articles, and if this is the case does this mean if you build links to your articles on other article directories they can be as powerful as Ezine articles for your site as far as links go?

  79. Great post. It is true, you should never create an article without linking to it from a page you know is already indexed. This is a great reason to have a blog. You can use it to link to any web page. It is also a good way to keep track of articles, etc.

  80. Good points – especially the one about the duplicate content issue. It’s not that we are concerned about the duplicate content (in the example of articles) but we want the articles’ links to count, and you are 100% right: start linking to those articles we have already put out.

    I have noted that buzzle is a good site to put articles on- if a little slow to put them up.

  81. Finally!!!…..

    Thank you, I knew the Duplicate Content nonsense was a myth put out by gurus looking to exploit webmasters’ fears.

    And you have succinctly (spelling?) explained it. It’s not the content, it’s the content’s link juice!!!

    Jonathon, do yourself and the world a favor and put this out as a press release.

    Kudos

  82. That’s pretty fascinating. Who knows what Google are up to or how their algorithm works. I’m sure you know a lot more than I do in these cases. I’m just sailing along in my ignorance and having a good time on the web.

    I have no trouble submitting Ezinearticles since I always submit original articles there and link to my other original articles on my website, so there are no duplicate content issues for me.

  83. Great post Jon. I know people who freak out over the way Google indexes their sites. However most of these individuals usually aren’t providing quality, unique content on their sites. I’ve found that Google rewards websites which possess quality, unique content.

  84. If there is a topic that is most misunderstood among online marketers, it must be “Duplicate Content”. Knowing what duplicate content is seems to pose less of a problem than knowing how Google views it and how they treat it.

    Thanks for the analysis in this post as it refines our understanding of the topic and at the same time shows the reason some pages are not indexed. Links.. Links.. Links.

    Links play such a major part in site promotion. Not many years ago on-page SEO played a much bigger role than today. Now, the real driver is links, or off-page SEO. I am learning much from your publishing, Jon. Keep up the good work.

    I hope you were able to start up as a RP in September. If so, look to PSS in the summer. Unless, perhaps you’ve already attended.

    Best wishes

    Tony

  85. Wow! So if you are using any kind of article distribution software you need to get some kind of linking to each of those articles otherwise your article will get de-indexed and thus lose that link. Automation anyone?

  86. I like the way you drew your conclusions. I’ve long been skeptical of anything that Google says. I don’t think they’re necessarily nefarious in nature; they’re political in nature and only want to obfuscate the truth for their own gains.

    Nevertheless, Google does matter. Just like Microsoft matters, even if you don’t like them.

    However, there will be a search engine that comes along to take their place. It happens in every industry. The SE that wins next time around will be the one that actually delivers relevant and useful results, unlike most of them today.

  87. Another great post Jonathan. You have a really good way of explaining complex issues like this indexing question. Once again the crux of your post on indexing is the all important step of back links, back links, back links. Thanks again for the clear, concise information.

  88. I worked my butt off for over 5 years to rank 1st page on Google…now this upstart comes along and doesn’t even list any of my keywords in their top 10…so how useful can they be…I say they’re useless…call it sour grapes if ya wanna…

  89. Jon, there is a lot of logic in all of this. I did the checks as you did. The keyword is “weight loss”.

    Cuil 2,435,697,197
    Google 108,000,000
    Bing 115,000,000
    Yahoo 02,000,000

    It’s nothing new that Google likes to confuse people, and they do it all the time by not giving adequate information. Either Google fools webmasters or webmasters trick Google. I feel it is all a sort of cat-and-mouse game.

    Anyway, thanks for these views of yours. Now I see why Google indexed my pages only to drop them later.
    Thanks Jon.

  90. Amazing! Your research on the search engines is simple, yet it leads to so many conclusions. Thanks as well for showing the examples of how you search Google for answers. Time to start building my links!

  91. Hi Jonathan,

    Thanks for another very interesting post.

    It would be interesting to know if Cuil is attracting more business for anyone. I agree that just because they have indexed more pages does not make them the best search engine, but if they are creating results for websites, then I don’t much care who is the biggest.

    I don’t believe that I have seen any results from Cuil, even with well-linked sites. Have you received any benefit from them?

    To me, that is the best judge of which search engine is best.

    Kind regards,

    Barry

  92. Great post Jonathan. You really make me question my own article marketing strategy. In the past I never took the time to build links pointing to each of those article publications. As a result, Google has likely not been indexing most copies of those articles. Do you know if Yahoo or Bing does something similar to this?

  93. Great info as always, Jonathan. I have heard all of these self-proclaimed gurus say that duplicate content could keep you from rising to the top of Google, but then wondered why it was OK for all of the article sites out there to do it.

  94. sounds like some affiliate marketers need to spin their articles better.. haha

    sounds like the golden rule of article marketing. put the content on YOUR blog first, get it indexed, then send out your article blast or whatever you do w/ your content (same w/ video, etc..)

    So the question becomes, does Google totally dismiss the links on those other pages too? ;-)

    Adam Holland

  95. Great post Jonathan, and it makes sense.

    I read somewhere recently that Google are about to place more relevance on unique content, and load time of web pages (great idea) and less relevance on back links.

    But I believe if you write for visitors and not search engines, back links and indexing will come naturally.

    Who knows, Google are Google!

  96. That’s why we encourage people to build links to the sites that they submit their content to. We have been saying that for years, whether it’s through other aggregators, social bookmarking, Twitter, etc.

  97. Indeed, many linking pages are not indexed in Google (for example, articles submitted to article directories).

    But what tools/methods do you use or suggest to get these numerous pages indexed?

    Laurent

  98. Hi Jonathan,

    Thank you for that detailed post again.

    It’s obvious from your earlier posts that back linking is one of the most important bedrocks of getting good rankings and, from today’s post, of whether you stay indexed as well.

    I love the way you communicate your thought processes and the way you deduce from the facts, and the clear facts you present here speak for themselves.

    Enjoyed your article immensely and learned a great deal about real SEO from your free content alone.

    Many thanks

    Hamant

  99. It’s all about the search results. It doesn’t matter how big your index is, what matters is how relevant the results that you deliver are. Last week when Google announced real time search, “relevant search results” was the mantra of the day. Great post!

    Rafael

