Chapter 15

Focused Web Crawling

Preamble

Focused web crawling stands one step above the other techniques discussed in this book. It integrates two base technologies, classification and web analytics, and this combination enables more complex decision making and yields information that is more specific to its intended use. Figure 15.1 shows a complete analysis system that “feeds” automatically on Internet data and outputs information that can be used to make ...
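
To make the combination of crawling and classification concrete, here is a minimal sketch of a best-first focused crawler. It is not the system in Figure 15.1. The in-memory `PAGES` graph, the `TOPIC_TERMS` vocabulary, the keyword-overlap `relevance` score, and the `threshold` parameter are all assumptions chosen for illustration. A real crawler would fetch pages over HTTP and would likely use a trained classifier.

```python
import heapq

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
# This stands in for real HTTP fetching and HTML parsing.
PAGES = {
    "a": ("text mining and document classification overview", ["b", "c"]),
    "b": ("football scores and match results", ["d"]),
    "c": ("text classification with statistical models", ["d", "e"]),
    "d": ("celebrity gossip and fashion news", []),
    "e": ("mining text data for classification tasks", []),
}

# Assumed topic vocabulary; a trained classifier would replace this.
TOPIC_TERMS = {"text", "mining", "classification"}


def relevance(text):
    """Toy classifier: fraction of topic terms that appear in the page."""
    words = set(text.lower().split())
    return len(TOPIC_TERMS & words) / len(TOPIC_TERMS)


def focused_crawl(seed, threshold=0.5, max_pages=10):
    """Best-first crawl that follows links only from relevant pages."""
    frontier = [(-1.0, seed)]  # heapq is a min-heap, so priorities are negated
    seen = {seed}
    relevant = []
    fetched = 0
    while frontier and fetched < max_pages:
        _, url = heapq.heappop(frontier)
        text, links = PAGES[url]
        fetched += 1
        score = relevance(text)
        if score < threshold:
            continue  # off-topic page: do not follow its links
        relevant.append((url, score))
        for link in links:
            if link not in seen:
                seen.add(link)
                # A link inherits the score of the page that cites it.
                heapq.heappush(frontier, (-score, link))
    return relevant


if __name__ == "__main__":
    for url, score in focused_crawl("a"):
        print(url, round(score, 2))
```

In this toy graph the crawl starting at `a` keeps pages `a`, `c`, and `e`. Page `b` is fetched but judged off-topic, so its outgoing links are never followed. That pruning is what separates a focused crawler from an exhaustive one.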
