Effective ways to get out of Google's sandbox


This time I'll explain how I managed to get one of my websites out of Google's sandbox, where it had stayed for almost a year. The technique I used combines part 1 and part 2 of the "How to get out of Google Sandbox" article series, so read them first to understand the main principles:

  • How to get out of Google Sandbox - part I

  • How to get out of Google Sandbox - part II

I applied all of the techniques described there, including:
    - redirect the website's www URLs to non-www URLs
    - keep the website structure no deeper than three levels (i.e., don't bury your content more than 3 links away from the home page, otherwise the crawler/spider may stop before reaching it)
    - use .htaccess rewrite rules to turn dynamic URLs into static-looking ones
    - rewrite all the meta tags and explicitly mark the pages that must not be indexed.
    How?
    Put in the head section of your webpage:
    <meta name="robots" content="index,follow"> - for webpages that should be indexed
    <meta name="robots" content="noindex,nofollow"> - for webpages that you don't want indexed
    For more SEO techniques and ways to avoid duplicate website content, look here: http://nevyan.blogspot.com/2007/01/duplicate-website-content-how-to-avoid.html

    - ask the crawlers to slow down slightly; this is especially important if your hosting server has limited bandwidth.
    How?
    In your robots.txt file put:
    User-agent: *
    Crawl-Delay: 20

    You can adjust the crawl delay value to suit your server.
    - remove the duplicate or invalid pages of your website that are still in Google's index/cache.
    How?
    First make a list of all the invalid pages. Then go to Google's page for urgent URL removal requests:
    http://services.google.com/urlconsole/controller
    - Wait. Then make sure that your website is no longer indexed in Google: type your full website address into google.com. If there are no results, you have succeeded in getting your website out of Google's index. It may sound strange, but this is what lets the pages be indexed again from scratch.
    - When ready, remove all the restrictions that you may have in robots.txt, .htaccess and the webpage head section (noindex,nofollow).
    - Go to http://www.google.com/addurl/?continue=/addurl . Enter your website in the inclusion field and wait for the re-indexing process to begin.
    - While waiting, get more links from forums and article directories. Links should point not only to your home page, but also to specific webpages.
    For example: not only <a href="http://www.website.com"> but also <a href="http://www.website.com/mywebpage1.html">
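
The "no deeper than 3 levels" rule from the list above can be checked with a few lines of code. Here is a minimal Python sketch, assuming you already have your internal links collected as a dictionary. The page paths, `SITE_LINKS` and the two helper functions are made up for illustration and do not come from any tool mentioned in this article.

```python
from collections import deque


def click_depths(links, start):
    """Breadth-first search: number of clicks from `start` to each reachable page."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def pages_too_deep(links, start, max_depth=3):
    """Pages that need more than `max_depth` clicks to reach from `start`."""
    depths = click_depths(links, start)
    return sorted(page for page, depth in depths.items() if depth > max_depth)


# Hypothetical site map: page -> pages it links to.
SITE_LINKS = {
    "/": ["/articles", "/about"],
    "/articles": ["/articles/2007"],
    "/articles/2007": ["/articles/2007/01"],
    "/articles/2007/01": ["/articles/2007/01/sandbox.html"],
}

if __name__ == "__main__":
    # Lists the pages a crawler would only find after more than 3 clicks.
    print(pages_too_deep(SITE_LINKS, "/"))
```

Any page that shows up in the result is a candidate for an extra link from the home page or from a category page.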

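Before you resubmit the website, it may help to confirm that no page still carries a leftover noindex tag and to see which crawl delay your robots.txt declares. The following is only a sketch using Python's standard library. The function names `robots_meta` and `declared_crawl_delay` are my own, and you would have to feed it the HTML and robots.txt text of your own website. Keep in mind that not every crawler honors the Crawl-delay directive.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            self.directives += [
                part.strip().lower() for part in content.split(",") if part.strip()
            ]


def robots_meta(html):
    """Return the robots directives of a page, e.g. ['noindex', 'nofollow']."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def declared_crawl_delay(robots_txt, user_agent="*"):
    """Return the Crawl-delay that robots.txt declares for `user_agent`, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex,nofollow"></head></html>'
    print("noindex" in robots_meta(page))  # True means the page is still blocked
    print(declared_crawl_delay("User-agent: *\nCrawl-delay: 20\n"))
```

A page that still reports noindex at this point will stay out of the index no matter how many times you submit it.
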
    That's it. In a few days your website will be re-crawled and out of the sandbox.
    Congratulations!
