What happens when your website falls out of Google’s index? Most people react with panic. But after seven (7) years of reading forum threads whose contributors have suffered similar fates, we know panic is the last thing you should do.
From the outside looking in, Google seems like a well-behaved giant. Users rarely see Google errors. And Google is not in the habit of issuing press releases when it does have a technology issue. However, those who follow Google closely know better.
Google has a history of technology goofs; many of Google’s updates don’t go as planned. And for a few unlucky souls whose livelihoods are tied to the ‘giant’, the world comes crashing down when Google goofs. When Google Chaos happens:
- Websites, for no apparent reason, lose significant rankings.
- Pages appear to be de-indexed.
- And no matter what the Webmaster does to try to reverse the condition, in the short term, Google Chaos persists.
Self-Inflicted Wounds
To be fair, not all of these circumstances are Google’s fault. Sometimes, Webmasters inadvertently do something that creates problems for Google. Here is a short list that covers some fatal moves:
- A Webmaster decides to create a new home page and changes the URL to www.domain.com/home. Furthermore, the Webmaster uses a 302 server re-direct from www.domain.com to www.domain.com/home. And, finally, the Webmaster strips the content off the old /index.html page. All works perfectly in a browser, but Google sees something entirely different: Google no longer sees content on www.domain.com and, because a 302 signals only a temporary move, does not necessarily treat /home as the replacement. As a consequence, /home never inherits the SEO value associated with www.domain.com.
- A Webmaster accidentally makes an adjustment to the robots.txt file that disallows a primary directory. Google still knows about the pages but stops ranking all pages in that directory.
- A Webmaster makes an adjustment and adds a new, slick-looking JavaScript menu. Google does not typically read JavaScript and no longer follows the links in the menu. As a consequence, ranked pages disappear from Google’s index.
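Two of the wounds above can be checked before a change goes live. The following is a minimal sketch, assuming Python's standard urllib.robotparser as a stand-in for Google's own robots.txt handling (the two are not guaranteed to match rule for rule). The function names, the sample robots.txt, and the sample paths are hypothetical and only for illustration.

```python
from __future__ import annotations

from urllib.robotparser import RobotFileParser

# Standard HTTP redirect status codes, grouped by what they signal.
PERMANENT_REDIRECTS = {301, 308}
TEMPORARY_REDIRECTS = {302, 303, 307}


def blocked_paths(robots_txt: str, paths: list[str],
                  user_agent: str = "Googlebot") -> list[str]:
    """Return the paths that the given robots.txt text disallows."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(user_agent, p)]


def redirect_kind(status_code: int) -> str:
    """Classify an HTTP status code as a permanent or temporary redirect."""
    if status_code in PERMANENT_REDIRECTS:
        return "permanent"
    if status_code in TEMPORARY_REDIRECTS:
        return "temporary"
    return "other"


# Hypothetical robots.txt with an accidental directory-wide Disallow.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /products/
"""

if __name__ == "__main__":
    # Lists which of the important pages the sample file would block.
    print(blocked_paths(SAMPLE_ROBOTS, ["/products/widget.html", "/about.html"]))
    # A home page move that answers with 302 is flagged as temporary.
    print(redirect_kind(302))
```

Running a check like this against a list of your ranked pages each time robots.txt or a redirect changes is one way to catch a self-inflicted wound before Google does.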
Google Volatility
In other cases, Webmasters report wide swings in rankings. This symptom is not unusual. In May 2008, Matt Cutts, an engineer on the Google Spam team, went on record saying that the Search Engine Titan was conducting a major experiment, code named “Dewey.” Webmasters in the SEO community have observed that pages with little or no PageRank, which had never shown up in the top 100 SERPs (Search Engine Result Pages), are now displacing other sites that had ‘page one’ rankings for more than five (5) years. Others observe that site rankings fluctuate +/- 30 positions at different times of the day, and some have reported ranking fluctuations of more than 50 positions within the same day.
Position Research has been observing these conditions for many months. We speculate that Google is performing live testing similar to the tests described in a recent white paper titled “Search Engines that Learn from Implicit Feedback.” The premise of the paper is that search engines can determine website relevance by analyzing which listings users do not click. This testing requires that Google bring websites that may not otherwise deserve high ranking into a top position for a short period of time in order to record user behavior.
Data Loss
Many times, something happens within the Google system and data gets lost. You might think: “How could data get lost?” The answer has to do with understanding Google’s spidering and reporting network. Based on the latest reports, Google maintains over 200,000 spidering servers. These servers are constantly crawling website pages. When you consider that over 100,000 new website pages are added daily, and that the current number of website pages is estimated at over 80 billion, you begin to understand the enormity of the task. Other servers are consolidating and synchronizing this information so that data can be compiled into ranking data.
There is another set of servers dedicated to serving search results to the public. These servers are clustered in datacenters spread throughout the world. At last count, Google had over 40 datacenters with more than 750 IP addresses, each comprising several servers. Check out http://www.seocritique.com/datacentertool/ for a humbling view of Google datacenters.
Now we all know that (data) electrons are obedient most of the time, but not all the time. Hard drives crash. Data packets get lost during transit from one location to another. And sometimes hardware fails during critical transmissions. You can begin to understand how enormous Google’s task is and how easy it might be to lose data during all the consolidation / synchronization / compiling steps involved.
So what happens when data is lost? That depends on what data is lost. If the lost data is compiled data, then Google simply recompiles. But if the data is original data, then Google must re-gather and then re-compile. This can take time - weeks if not months.
Filter Traps
Google filters are another story. In part, Google compiles page data to determine page attributes - and Google collects over 300 unique attributes for each page. If Google determines that there is a combination of negative attributes sufficient to merit a ranking adjustment, then rankings decline. But these attributes are only reasonably predictive when combined with other attributes.
Google filters are based on statistics - and in statistics, the larger the sample, the more reliable the conclusions. In a perfect world, Google would have all the page attributes it needs and unlimited computing power to reach very high correlation coefficients. Under these circumstances, Google would be able to tell the bad websites from the good with 100% accuracy. But in reality, Google doesn’t have enough attributes or computing power. Therefore, its filters are less than perfect. In other words, Google presumes that if ‘it’ walks like a duck, quacks like a duck, and smells like a duck, ‘it’ is probably a duck, but Google cannot be certain. ‘It’ may be a goose. So to some degree, Google’s filters are throwing some ‘baby’ out with the ‘bathwater’.
Fireworks really start flying when Google introduces a new filter in an effort to improve rankings. Invariably, some website pages are collateral damage. It gets more interesting when Google starts turning the dials on these filters. Website pages pop back in and out like popcorn. The Webmaster’s hope is that Google engineers optimize their filter algorithms and minimize collateral damage. But there are always some pages that get the ‘short end of the stick’.
Minimize Google Chaos
So what can you do to avoid Google Chaos? First, recognize that Google offers rankings for free and as such is not obligated to be bug free. Second, realize Google ‘love’ goes to those whom Google chooses (through its complex algorithms). And third, benchmark, benchmark, benchmark.
Google bugs and Google ‘love’ are things you cannot control. But benchmarking is something you can control because when you record and log metrics (i.e. critical observations) you can better determine what course of action to take.
Here is a list of metrics that you should record:
- Keyword rankings - Track Google rankings on a daily basis - weekly is not good enough because you need to know if a poor ranking condition is temporary or permanent. You also need to know the exact date when rankings declined so that you can compare your date with that of other Webmasters who may have experienced a similar condition on the same date.
- Google ‘cache’ query - Make sure Google is caching your pages and check cache dates.
- Google ‘info:’ query - Make sure Google is reporting ‘info:’ query results. If your page does not show results for an ‘info:’ query, something is wrong.
- Google ‘URL’ query - Make sure Google is reporting results when a page URL is entered. Your page URL should be at or near the top of the results.
- Google Webmaster account - Record changes to and observations reported by Google.
- Change log - Record and log all website navigation and infrastructure changes. These can include changes to robots.txt and sitemap.xml files, your Google Webmaster account, and your server.
Unexpected results for any of the Google queries may be a Google glitch or may be indicative of something more serious. These queries may be run on a weekly basis, or daily if you notice something unusual. A Google Webmaster account should be checked on a monthly basis, or more frequently if any other metric is concerning.
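A daily keyword ranking log can be as simple as a table of dates and positions. The sketch below is one hedged way to answer the two questions raised above: on what date did rankings decline, and was the decline temporary? The function names, the 10-position threshold, and the sample data are assumptions made for illustration, not figures from any Google source.

```python
from __future__ import annotations

from datetime import date
from typing import Optional


def first_decline(history: dict[date, int], drop: int = 10) -> Optional[date]:
    """Return the first logged day on which the position worsened by at
    least `drop` places compared with the previous logged day."""
    days = sorted(history)
    for prev, cur in zip(days, days[1:]):
        # A larger position number means a worse ranking.
        if history[cur] - history[prev] >= drop:
            return cur
    return None


def has_recovered(history: dict[date, int], decline_day: date,
                  drop: int = 10) -> bool:
    """Check whether the latest position is back within `drop` places of
    the last position logged before the decline."""
    earlier = [d for d in history if d < decline_day]
    if not earlier:
        raise ValueError("no ranking logged before the decline day")
    baseline = history[max(earlier)]
    latest = history[max(history)]
    return latest - baseline < drop


# Hypothetical daily log for one keyword: date -> Google position.
SAMPLE_LOG = {
    date(2008, 5, 1): 4,
    date(2008, 5, 2): 5,
    date(2008, 5, 3): 38,
    date(2008, 5, 4): 41,
    date(2008, 5, 5): 6,
}

if __name__ == "__main__":
    day = first_decline(SAMPLE_LOG)
    print("decline on:", day)
    if day is not None:
        print("recovered:", has_recovered(SAMPLE_LOG, day))
```

The decline date is the value to compare against forum reports, and the recovery check is only possible because the log is daily; a weekly log of the same data would have missed the whole episode.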
Making Sense of Google Chaos
If Google Chaos strikes your website and you have benchmarked Google metrics, you will be armed with the kind of information necessary to determine your next course of action.
If your site loses rankings, the first thing to determine is whether your Google metrics have changed and whether your experience correlates with that of other Webmasters. Check the forums to see if something unusual is happening. If your observations seem to be isolated, then the problem is likely to be self-inflicted. Check your logs and start reverting to known stable conditions. Then allow Google to react to these changes, which may take weeks or months depending on what changes were made. As a rule of thumb, Google will need about as long to react to the reversal as it took to react to the original change. Observing your Google metrics will help determine whether your actions are making a real difference.
If your experience is not isolated and others are reporting the same conditions, it is probably a Google error. Don’t panic. Most of the time, Google corrects its mistakes within 1-2 weeks.
But if the reason for Google’s reaction is a new filter, hang on to your hat. It may take much more time for Google to sort things out. And even when it does, your site may be part of an ‘elite’ few that are considered acceptable collateral damage.
How can you tell if your site is part of the collateral damage? This is pretty tough - it is a process of elimination. First, wait and make sure that a Google bug has not caused your situation. The forums can help determine this condition. Second, wait until Google’s filters appear to have been tuned and have stabilized. Again, forum activity will help determine this condition. Only after both of these explanations have been ruled out should a more radical approach be considered.
If the forums are quiet and your site is still in Google Chaos, start an extensive research effort that considers any and all page attributes. Check outbound links. Check inbound links. Check duplicate and near-duplicate content. Check everything you can think of and start making site adjustments. It just may be something really subtle that needs to be changed so that Google’s filters think your website is a ‘goose’ and not a ‘duck’.
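The process of elimination in this section can be summarized as a small decision function. This is only a sketch of one reading of the advice: the three yes/no inputs, the function name, and the order of the checks are assumptions, and real situations are rarely this clean.

```python
def next_step(others_affected: bool, new_filter_suspected: bool,
              own_changes_reverted: bool) -> str:
    """Suggest a next action after a ranking loss.

    others_affected: forums show other Webmasters reporting the same symptoms.
    new_filter_suspected: forum consensus points to a new Google filter.
    own_changes_reverted: you have already rolled back your own recent changes.
    """
    if others_affected and new_filter_suspected:
        # Filters take longer to settle than ordinary bugs.
        return "wait for the filter to stabilize"
    if others_affected:
        # Widespread and no filter talk: probably a Google error.
        return "wait for Google to correct the error"
    if not own_changes_reverted:
        # Isolated problem: most likely self-inflicted.
        return "revert recent changes"
    # Isolated, nothing left to revert: start the extensive research effort.
    return "research page attributes"


if __name__ == "__main__":
    print(next_step(others_affected=False, new_filter_suspected=False,
                    own_changes_reverted=False))
```

The inputs come straight from your benchmarks and the forums, which is why the logging described earlier has to be in place before trouble starts.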










A new search engine has entered the scene: