Appendix B. Automating the web with scraping

This appendix covers

  • Creating structured data from web pages
  • Performing basic web scraping with cheerio
  • Handling dynamic content with jsdom
  • Parsing and outputting structured data

In the preceding chapter, you learned some general Node programming techniques, but now we’re going to start focusing on web development. Scraping the web is an ideal way to do this, because it requires a combination of server-side and client-side programming skills. Scraping is all about using programming techniques to make sense of web pages and transform them into structured data. Imagine you’re tasked with creating a new version of a book publisher’s website that’s currently just a set of old-fashioned, static HTML ...
