5 SCRAPING A LIVE SITE

Seen through the eyes of a data sleuth, almost every piece of online content is a treasure trove of information to be collected. Think of a series of Tumblr posts, or the comments for a business listed on Yelp. Every day, people who use online accounts produce an ever-growing amount of content that is displayed on websites and apps. Everything is data just waiting to be structured.

In the previous chapter, we talked about web scraping, or extracting data from HTML elements using their tags and attributes. In that chapter we scraped data from archive files we downloaded from Facebook, but in this chapter we’ll turn our attention ...
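As a reminder of what extracting data by tag and attribute can look like, here is a minimal sketch using only Python's standard-library `html.parser` module. The sample HTML, the `comment` class name, and the `CommentExtractor` and `extract_comments` names are all hypothetical, invented for illustration; they are not taken from the book or from any real site.

```python
from html.parser import HTMLParser

# Hypothetical HTML snippet standing in for a page of user comments.
SAMPLE_HTML = """
<div class="post"><p class="comment">Great coffee!</p></div>
<div class="post"><p class="comment">Slow service.</p></div>
<p class="footer">Not a comment</p>
"""


class CommentExtractor(HTMLParser):
    """Collects the text of every <p> tag whose class attribute is "comment"."""

    def __init__(self):
        super().__init__()
        self.comments = []
        self._in_comment = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; match on tag and attribute.
        if tag == "p" and dict(attrs).get("class") == "comment":
            self._in_comment = True
            self.comments.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self._in_comment = False

    def handle_data(self, data):
        # Only keep text that appears inside a matching <p> element.
        if self._in_comment:
            self.comments[-1] += data


def extract_comments(html):
    """Return the stripped text of each matching comment paragraph."""
    parser = CommentExtractor()
    parser.feed(html)
    parser.close()
    return [text.strip() for text in parser.comments]


print(extract_comments(SAMPLE_HTML))
```

The parser ignores the footer paragraph because its `class` attribute does not match, which is the core idea of selecting elements by tag and attribute.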
