Screen Scraping and Web Services

Even in the early days of the Web, developers looked for ways to combine information from multiple sites. Back then, this meant screen scraping—writing code to dig through loosely structured HTML and extract the vital pieces—which was often a troublesome process. As Web 2.0 emerged, more and more of that information became available through web services, which presented it in a much more structured and more readily usable form.

Applicable Web 2.0 Patterns

These two types of content grabbing illustrate the following patterns:

  • Service-Oriented Architecture

  • Collaborative Tagging

You can find more information on these patterns in Chapter 7.

Intent and Interaction

In the earliest days of the Web, screen scraping often meant capturing information from the text-based interfaces of terminal applications and repurposing it for use in web applications, but the same technique was quickly turned on websites themselves. HTML is, after all, a text-based format, if a loosely (and sometimes chaotically) structured one. Web services, on the other hand, are protocols and standards from various standards bodies that can be used to provide programmatic access to resources in a predictable way. XML made the web services revolution possible by making it easy to create structured, labeled, and portable data.
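The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the HTML fragment, the XML response, the class name "p", and the helper names (PriceScraper, scrape_price, service_price) are all hypothetical and do not come from any real site or service. The scraper has to guess where the price lives from presentational markup, while the XML consumer asks for a labeled element by name.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical page fragment: the price is identifiable only by a
# presentational class name, which a site redesign could change at any time.
HTML_PAGE = """
<html><body>
<table>
<tr><td><b>Widget</b></td><td class="p">$19.99</td></tr>
</table>
</body></html>
"""

# Hypothetical web service response carrying the same data with explicit labels.
XML_RESPONSE = """
<product>
  <name>Widget</name>
  <price currency="USD">19.99</price>
</product>
"""


class PriceScraper(HTMLParser):
    """Screen scraping: collect the text of every <td class="p"> cell."""

    def __init__(self):
        super().__init__()
        self._in_price_cell = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == "td" and ("class", "p") in attrs:
            self._in_price_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_price_cell = False

    def handle_data(self, data):
        if self._in_price_cell and data.strip():
            self.prices.append(data.strip())


def scrape_price(html):
    """Extract the first price from loosely structured HTML."""
    scraper = PriceScraper()
    scraper.feed(html)
    # The currency is implied by a "$" prefix that must be stripped by hand.
    return float(scraper.prices[0].lstrip("$"))


def service_price(xml_text):
    """Read the price and its currency from a labeled XML document."""
    root = ET.fromstring(xml_text.strip())
    price = root.find("price")
    return float(price.text), price.get("currency")


if __name__ == "__main__":
    print(scrape_price(HTML_PAGE))
    print(service_price(XML_RESPONSE))
```

Note that the scraping path depends on details the page author never promised to keep stable (a class name and a currency symbol embedded in the text), whereas the XML path depends only on element and attribute names that are part of the data's structure.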

Note

There is no specific standardized definition of web services that explains the exact set of protocols and specifications that make up the stack, but there is a set that ...
