Chapter 14. Web Scraping

R provides a platform with easy access to statistical computing and data analysis. Given a data set, it is straightforward to transform the data and apply analytic models and numeric methods, using data structures chosen for flexibility or for high performance, as discussed in previous chapters.

However, input data is not always as readily available as the tables in a well-organized commercial database. Sometimes we have to collect the data ourselves. Web content is an important source of data for a wide range of research fields. To collect (scrape or harvest) data from the Internet, we need appropriate techniques and tools. In this chapter, we'll introduce the basic knowledge and tools of web scraping, including:

  • Looking ...
