Amazon Hacks
by Paul Bausch
August 2003
Content level: Intermediate to advanced
304 pages
7h 33m
English
O'Reilly Media, Inc.

Accessing Community Features

Amazon, of course, provides access to all of its community features through its web site. As more sites integrate closely with Amazon, though, demand is growing to tap into the community via code.

Accessing Through Web Services

The Web Services API (see Chapter 6) offers some access. When accessing an individual product’s information through the API, you can find the following community data:

  • The three latest reviews

  • ASINs of five related items

  • Three lists that contain the item
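To make the shape of this data concrete, here is a minimal sketch in Python that pulls the three kinds of community data out of a product response. The XML fragment and every element name in it (`Product`, `Reviews`, `Review`, `Summary`, `RelatedItems`, `Asin`, `Lists`, `ListId`) are hypothetical placeholders invented for illustration; the real response format is described in Chapter 6 and may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical response fragment. The element names and values are
# placeholders for illustration, not the API's documented schema.
SAMPLE_XML = """
<Product>
  <Reviews>
    <Review><Summary>First review</Summary></Review>
    <Review><Summary>Second review</Summary></Review>
    <Review><Summary>Third review</Summary></Review>
  </Reviews>
  <RelatedItems>
    <Asin>0000000001</Asin>
    <Asin>0000000002</Asin>
    <Asin>0000000003</Asin>
    <Asin>0000000004</Asin>
    <Asin>0000000005</Asin>
  </RelatedItems>
  <Lists>
    <ListId>LIST0000001</ListId>
    <ListId>LIST0000002</ListId>
    <ListId>LIST0000003</ListId>
  </Lists>
</Product>
"""


def community_data(xml_text):
    """Collect review summaries, related ASINs, and list IDs from one product."""
    root = ET.fromstring(xml_text.strip())
    return {
        "reviews": [r.findtext("Summary") for r in root.findall("./Reviews/Review")],
        "related_asins": [a.text for a in root.findall("./RelatedItems/Asin")],
        "lists": [item.text for item in root.findall("./Lists/ListId")],
    }


data = community_data(SAMPLE_XML)
for kind, values in data.items():
    print(kind, len(values), values)
```

Whatever the real element names turn out to be, the counts are the point: at most three reviews, five related items, and three lists per product.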

This community data is valuable, and developers are building tools that work with it in many creative ways. But compared with the volume of information available on Amazon’s site, the community information in the API is only a small window into the larger community. That leaves one route for integration-minded developers: screen scraping.

Accessing Through Screen Scraping

The term screen scraping refers to requesting a web page programmatically with a script and picking through the resulting HTML for the interesting data. Finding the data usually involves writing regular expressions, a pattern-matching syntax that can become complicated quickly. For example, here’s a regular expression that extracts a list of books from a purchase circle page [Hack #44]:

<td.*?<b><a.*?-/(.*?)/.*?>(.*?)</a></b>.*?by (.*?)<br>.*?</td>

You can see some HTML there, and the expressions are based on where the data is within the HTML ...
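As a rough illustration of both halves of screen scraping, the Python sketch below fetches a page and applies the regular expression shown above to it. The sample HTML is a made-up approximation of a purchase circle table cell, not markup copied from Amazon, and the helper names `fetch_html` and `extract_books`, the `latin-1` decoding, and the `re.DOTALL` flag (so a cell can span several lines) are assumptions added for this example. The three captured groups are read here as the ID that follows `-/` in the link, the title, and the author.

```python
import re
import urllib.request

# The pattern from the text, unchanged. Groups: (1) the ID that follows
# "-/" in the link, (2) the link text (title), (3) the text after "by ".
BOOK_PATTERN = re.compile(
    r'<td.*?<b><a.*?-/(.*?)/.*?>(.*?)</a></b>.*?by (.*?)<br>.*?</td>',
    re.DOTALL,  # assumption: lets a table cell span multiple lines
)

# Made-up markup that imitates the structure the pattern expects.
SAMPLE_HTML = """
<table>
<tr><td valign="top"><b><a href="/exec/obidos/tg/detail/-/0000000001/ref=pc_1">Example Book One</a></b><br>by Jane Author<br>Paperback</td></tr>
<tr><td valign="top"><b><a href="/exec/obidos/tg/detail/-/0000000002/ref=pc_2">Example Book Two</a></b><br>by John Writer<br>Hardcover</td></tr>
</table>
"""


def fetch_html(url):
    """Request a page and return its HTML as text (not called in this sketch)."""
    with urllib.request.urlopen(url) as response:
        # Assumption: the page decodes as latin-1; adjust for the real page.
        return response.read().decode("latin-1", errors="replace")


def extract_books(html):
    """Return (id, title, author) tuples for every cell the pattern matches."""
    return [match.groups() for match in BOOK_PATTERN.finditer(html)]


books = extract_books(SAMPLE_HTML)
for asin, title, author in books:
    print(asin, title, author)
```

Because the pattern depends on exact tag order (`<b><a ...>` followed later by `by ` and `<br>`), even a small change to the page's markup can make it stop matching, which is why a scraper like this needs to be rechecked whenever the page layout changes.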


ISBN: 0596005423