CHAPTER 13

Log Analysis Using Hadoop

The explosive growth of the Web toward the end of the 20th century led to web-scale data, particularly log files. Suddenly, everyone who had a web site was generating large volumes of web access logs, which were initially used to debug problems with the site. Eventually, organizations realized that web access logs were a rich source of information about their customers and potential customers. Clickstream analysis offered insights into customer behavior within a web site, and search query analysis revealed which products and services were most important to customers.

Log files are not limited to web servers. ...
