Remove Your Materials from Google
Remove your content from Google’s various web properties.
Some people are more than thrilled to have Google index their sites. Other folks don’t want the GoogleBot anywhere near them. If you fall into the latter category and the bot’s already done its worst, there are several things you can do to remove your materials from Google’s index. Each part of Google—Web Search, Google Images, and Google Groups—has its own set of methodologies.
Google Web Search
Here are several tips to avoid being listed.
Making sure your pages never get there to begin with
While you can take steps to remove your content from the Google index after the fact, it’s always much easier to make sure the content is never found and indexed in the first place.
Google’s crawler obeys the robots exclusion protocol, a set of instructions you put on your web site that tells the crawler how to behave when it comes to your content. You can implement these instructions in two ways: via a META tag that you put on each page (handy when you want to restrict access to only certain pages or certain types of content) or via a robots.txt file that you place in your site’s root directory (handy when you want to block some spiders completely or want to restrict access to certain kinds or directories of content). You can get more information about the robots exclusion protocol and how to implement it at http://www.robotstxt.org/.
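As a rough illustration of the two approaches, the sketch below holds a sample robots.txt and a sample robots META tag as strings, then uses Python’s standard urllib.robotparser module to check what the robots.txt rules would permit. The /private/ directory, the www.example.com URLs, the "OtherBot" user agent, and the is_allowed helper are all made up for this example; they are assumptions, not anything prescribed by Google or by this hack.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep Googlebot out of /private/ and
# leave every other crawler unrestricted. Paths are illustrative only.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow:
"""

# Per-page alternative: a robots META tag placed in a page's <head>.
# It is shown only for reference and is not parsed by the code below.
META_TAG = '<meta name="robots" content="noindex, nofollow">'


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if the sample robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network access is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("Googlebot", "http://www.example.com/private/page.html"),
        ("Googlebot", "http://www.example.com/index.html"),
        ("OtherBot", "http://www.example.com/private/page.html"),
    ]:
        print(agent, url, is_allowed(agent, url))
```

Checking rules locally like this can be a handy way to confirm that a robots.txt blocks what you intend before you upload it. Keep in mind that the file only asks crawlers to stay away; it relies on the crawler choosing to obey it.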
Removing your pages after they’re indexed
There are several things you can have removed ...