Removing Your Materials from Google
How to remove your content from Google’s various web properties.
Some people are more than thrilled to have Google’s properties index their sites. Other folks don’t want Googlebot anywhere near them. If you fall into the latter category and the bot’s already done its worst, there are several things you can do to remove your materials from Google’s index. Each of Google’s properties (Web Search, Google Images, and Google Groups) has its own removal procedure.
Google’s Web Search
Here are several tips to avoid being listed.
Making sure your pages never get there to begin with
While you can take steps to remove your content from the Google index after the fact, it’s always much easier to make sure the content is never found and indexed in the first place.
Google’s crawler obeys the “robots exclusion protocol,” a set of instructions you put on your web site that tells the crawler how to behave when it comes to your content. You can implement these instructions in two ways: via a META tag that you put on each page (handy when you want to restrict access to only certain pages or certain types of content) or via a robots.txt file that you place in your root directory (handy when you want to block some spiders completely or restrict access to whole directories or kinds of content). You can get more information about the robots exclusion protocol and how to implement it at http://www.robotstxt.org/.
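As a rough illustration of the two approaches, the sketch below holds a sample robots.txt and a sample robots META tag as strings, then uses Python's standard urllib.robotparser module to check how a crawler that honors the protocol would read the rules. The example.com URLs, the /private/ directory, the "OtherBot" name, and the specific rules are assumptions made up for this example, not recommendations from the text.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for the site root: keep Googlebot out entirely,
# and keep every other crawler out of the (assumed) /private/ directory.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow: /private/
"""

# Per-page alternative: a robots META tag placed in the page's <head>.
# "noindex" asks crawlers not to index the page; "nofollow" asks them
# not to follow its links.
META_TAG = '<meta name="robots" content="noindex, nofollow">'


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network access needed)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Would a compliant crawler with this user-agent fetch this URL?
googlebot_allowed = parser.can_fetch("Googlebot", "http://www.example.com/index.html")
other_public = parser.can_fetch("OtherBot", "http://www.example.com/index.html")
other_private = parser.can_fetch("OtherBot", "http://www.example.com/private/notes.html")

print(googlebot_allowed, other_public, other_private)
```

The checker only tells you how a well-behaved crawler should interpret the file. The rules take effect on a real site only once robots.txt is served from the root directory, or the META tag appears on each page you want excluded.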
Removing your pages after they’re indexed
There are ...