Skip to Main Content
Java Cookbook
book

Java Cookbook

by Ian F. Darwin
June 2001
Intermediate to advanced content levelIntermediate to advanced
888 pages
21h 1m
English
O'Reilly Media, Inc.
Content preview from Java Cookbook

Program: Data Mining

Suppose that I, as a published author, want to track how my book is selling in comparison to others. This information can be obtained for free just by clicking on the page for my book on any of the major bookseller sites, reading the sales rank number off the screen, and typing the number into a file, but that’s tedious. As I somewhat haughtily wrote in the book that this example looks for, “computers get paid to extract relevant information from files; people should not have to do such mundane tasks.” This program uses the regular expressions API and, in particular, newline matching to extract a value from an HTML page. It also reads from a URL (discussed later in Section 17.7.) The pattern to look for is something like this (bear in mind that the HTML may change at any time, so I want to keep the pattern fairly general):

<b>QuickBookShop.web Sales Rank: </b>
26,252
</font><br>

As the pattern may extend over more than one line, I read the entire web page from the URL into a single long string using my FileIO.readerAsString( ) method (see Section 9.6) instead of the more traditional line-at-a-time paradigm. I then plot a graph using an external program (see Section 26.2); this could (and should) be changed to use a Java graphics program. The complete program is shown in Example 4-2.

Example 4-2. BookRank.java

import java.io.*; import com.darwinsys.util.FileIO; import java.net.*; import java.text.*; import java.util.*; import org.apache.regexp.*; /** Graph ...
Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.
Start your free trial

You might also like

Practical Cloud-Native Java Development with MicroProfile

Practical Cloud-Native Java Development with MicroProfile

Emily Jiang, Andrew McCright, John Alcorn, David Chan, Alasdair Nottingham
Distributed Computing in Java 9

Distributed Computing in Java 9

Raja Malleswara Rao Malleswara Rao Pattamsetti

Publisher Resources

ISBN: 0596001703Supplemental ContentCatalog PageErrata