Scraper

scrape.rb

Scraping, in its most basic form, is the act of pulling data from another website through ordinary HTTP requests. The scraper script is the culmination of the previous scripts: it combines the techniques covered earlier into one larger script and adds a few more features, making it a one-stop shop for basic website scraping. This script is not a bot, because it requires user interaction for each scrape, but with a few minor tweaks it could be completely automated.

The Code

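 # rio is a third-party gem; open-uri and uri ship with Ruby.
 require 'rio'
 require 'open-uri'
 require 'uri'

 # Both an operation and a URL are required; otherwise print usage and quit.
 unless ARGV[0] and ARGV[1]
     puts "You must specify an operation and URL."
     puts "USAGE: scrape.rb [page|images|links] <url to scrape>"
     exit
 end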


 case ARGV[0]
 when "page"
 ...
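The listing is cut off at this point, so the body of each operation is not shown here. As a rough illustration only, and not the book's code, the three operations named in the usage message (page, images, links) might be approached along these lines. The helper names `fetch_page`, `extract_attribute`, `extract_links`, and `extract_images` are hypothetical, the sketch uses only the standard library rather than rio, it assumes Ruby 2.5 or later for `URI.open`, and the regular expression is a simplification that a real HTML parser would handle more reliably.

```ruby
require 'open-uri'
require 'uri'

# Hypothetical helper for a "page" operation: download the raw HTML of a URL.
# Assumes Ruby 2.5+, where open-uri exposes URI.open.
def fetch_page(url)
  URI.open(url, &:read)
end

# Hypothetical helper: collect one attribute (e.g. href or src) from every
# occurrence of a tag, resolving relative values against the page URL.
# A regex is a rough approximation of HTML parsing, used here for brevity.
def extract_attribute(html, tag, attribute, base_url)
  pattern = /<#{tag}\b[^>]*?\b#{attribute}\s*=\s*["']([^"']+)["']/i
  html.scan(pattern).flatten.map do |value|
    begin
      URI.join(base_url, value).to_s
    rescue URI::Error
      nil # skip values that cannot be parsed as a URI
    end
  end.compact.uniq
end

# Hypothetical helper for a "links" operation.
def extract_links(html, base_url)
  extract_attribute(html, 'a', 'href', base_url)
end

# Hypothetical helper for an "images" operation.
def extract_images(html, base_url)
  extract_attribute(html, 'img', 'src', base_url)
end
```

Keeping the extraction helpers separate from the download step means they can be exercised on an HTML string without touching the network; a dispatcher such as the `case ARGV[0]` statement above could then call `fetch_page` once and hand the result to whichever helper matches the requested operation.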

