More Searchable Content and Content Types
The emphasis throughout this book has been on providing the crawlers with textual content semantically marked up using HTML. However, less accessible content types (such as multimedia, content behind forms, and scanned historical documents) are increasingly being integrated into the search engine results pages (SERPs) as search algorithms evolve in how data is collected, parsed, and interpreted. Greater demand, availability, and usage also fuel the trend.
Engines Will Make Crawling Improvements
In June 2008, Google announced that it was crawling and indexing Flash content (http://googlewebmastercentral.blogspot.com/2008/06/improved-flash-indexing.html). In particular, the announcement indicated that Google was finding text and links within that content. However, there were still major limitations in Google’s ability to deal with Flash-based content. For example, it applied only to Flash implementations that did not rely on external ...