Chapter 2. Web Scraping

The amount of data created each day on the Internet is staggering. Much of it comes from social media websites and individual blogs, and more comes from our cell phones, tablets, and wearable devices. According to http://www.livevault.com/2-5-quintillion-bytes-of-data-are-created-every-day/, IBM reported in 2015 that approximately 2.5 quintillion bytes of data are created per day on average. Any organization would benefit from getting its hands on this data and making sense of it. This is where web scraping comes into play.

Simply put, web scraping is a technique to extract data from different websites, manipulate the data into a structured format, ...
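To make the idea of extracting data and reshaping it into a structured format concrete, here is a minimal sketch in Python. It is an illustration only, not the approach this chapter goes on to use: the sample HTML, the `TableParser` class, and the `scrape_table` function are all hypothetical names invented for this example, and the page is inlined as a string so that the sketch runs without network access. It relies only on the standard library's `html.parser` module.

```python
from html.parser import HTMLParser

# Hypothetical sample page, inlined so the sketch needs no network access.
SAMPLE_HTML = """
<table>
  <tr><th>Product</th><th>Price</th></tr>
  <tr><td>Widget</td><td>9.99</td></tr>
  <tr><td>Gadget</td><td>24.50</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collects the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []         # completed rows, each a list of cell strings
        self._row = None       # row currently being built, or None
        self._in_cell = False  # True while inside a <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._in_cell:
            self._row[-1] += data.strip()

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_table(html):
    """Turn the table in `html` into a list of dicts keyed by the header row."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


records = scrape_table(SAMPLE_HTML)
for record in records:
    print(record)
```

Each row of the HTML table becomes a dictionary keyed by the column headers, which is the kind of structured form that can then be loaded into a database or a data frame. In practice the HTML would be downloaded from a live site rather than inlined, and the values would still need converting from strings to numbers.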
