Scraper

scrape.rb

Scraping, in its most basic form, is the act of pulling data from another website through ordinary HTTP requests. The scraper script is the culmination of the previous scripts: it combines the techniques discussed earlier into one larger script and adds a few more features, making it a one-stop shop for basic website scraping. It is not a bot, because it requires user interaction for each scrape, but with a few minor tweaks it could be completely automated.
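To make the idea of pulling data out of a fetched page concrete, here is a minimal sketch of the extraction step for links and images. It is not the book's listing: the helper names (extract_urls, extract_links, extract_images), the sample HTML, and the example.com URLs are all assumptions made for illustration, and it uses only Ruby's standard uri library with a simple regular expression instead of a real HTML parser.

```ruby
require 'uri'

# Hypothetical helper (not from the book): collect the values of one
# attribute on one kind of tag and resolve each against the page URL.
# A regular expression is a rough stand-in for a proper HTML parser.
def extract_urls(html, base_url, tag, attr)
  pattern = /<#{tag}\b[^>]*?\b#{attr}\s*=\s*["']([^"']+)["']/i
  html.scan(pattern).flatten.map { |ref| URI.join(base_url, ref).to_s }.uniq
end

# Absolute URLs of every <a href="..."> on the page.
def extract_links(html, base_url)
  extract_urls(html, base_url, 'a', 'href')
end

# Absolute URLs of every <img src="..."> on the page.
def extract_images(html, base_url)
  extract_urls(html, base_url, 'img', 'src')
end

# Made-up page content standing in for a downloaded document.
SAMPLE_HTML = <<~HTML
  <html><body>
    <a href="/about.html">About</a>
    <a href="http://example.org/contact">Contact</a>
    <img src="images/logo.png" alt="logo">
  </body></html>
HTML

SAMPLE_URL = 'http://example.com/docs/index.html'

puts extract_links(SAMPLE_HTML, SAMPLE_URL)
puts extract_images(SAMPLE_HTML, SAMPLE_URL)
```

In a real scrape the HTML string would come from the network rather than a constant; with open-uri on current Ruby versions that is typically URI.open(url).read. Resolving each reference with URI.join matters because pages often use relative paths, which are useless once separated from the page they came from.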

The Code

 require 'rio'
 require 'open-uri'
 require 'uri'

 unless ARGV[0] and ARGV[1]
     puts "You must specify an operation and URL."
     puts "USAGE: scrape.rb [page|images|links] <url to scrape>"
     exit
 end


 case ARGV[0]
 when "page"  ...
