Removing Your Materials from Google

How to remove your content from Google’s various web properties.

Some people are more than thrilled to have Google’s properties index their sites. Others don’t want Googlebot anywhere near them. If you fall into the latter category and the bot has already done its worst, there are several things you can do to remove your materials from Google’s index. Each of Google’s properties (Web Search, Google Images, and Google Groups) has its own removal procedure.

Google’s Web Search

Here are several tips to avoid being listed.

Making sure your pages never get there to begin with

While you can take steps to remove your content from the Google index after the fact, it’s always much easier to make sure the content is never found and indexed in the first place.

Google’s crawler obeys the “robots exclusion protocol,” a set of instructions you put on your web site that tells the crawler how to behave when it comes to your content. You can implement these instructions in two ways: via a META tag that you put on each page (handy when you want to restrict access to only certain pages or certain types of content) or via a robots.txt file that you place in your site’s root directory (handy when you want to block some spiders completely or restrict access to particular kinds of content or to whole directories). You can get more information about the robots exclusion protocol and how to implement it at http://www.robotstxt.org/.
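To make the two approaches concrete, here is a minimal sketch in Python. The robots.txt rules, the example.com URLs, the "OtherBot" crawler name, and the is_allowed helper are all illustrative assumptions rather than anything prescribed by Google; the sketch only uses the standard library's urllib.robotparser to check how a well-behaved crawler would read the rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut Googlebot out of the whole site,
# and keep every other crawler out of /private/ only.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow: /private/
"""

# The per-page alternative: a robots META tag placed in a page's <head>.
# "noindex" asks crawlers not to index the page; "nofollow" asks them
# not to follow its links.
META_TAG = '<meta name="robots" content="noindex, nofollow">'


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    checks = [
        ("Googlebot", "http://www.example.com/index.html"),
        ("OtherBot", "http://www.example.com/index.html"),
        ("OtherBot", "http://www.example.com/private/notes.html"),
    ]
    for agent, url in checks:
        verdict = "allowed" if is_allowed(ROBOTS_TXT, agent, url) else "blocked"
        print(f"{agent:10} {url} -> {verdict}")
```

Checking a draft robots.txt this way before uploading it can catch a rule that blocks more (or less) than you intended. The protocol is advisory, so it only keeps out crawlers that choose to obey it.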

Removing your pages after they’re indexed

There are ...
