Hack #74. Program Yahoo! with Java

Java's native support for working with XML makes parsing Yahoo! Search Web Services responses a snap.

Simple REST, XML-over-HTTP interfaces like the one Yahoo! provides are most often associated with scripting languages such as Perl or PHP, while Java, a compiled language, is viewed as a better tool for working with more complex web services protocols. If Java is your preferred language, the XML tools added in Java 1.5 make it even easier to work with services like the Yahoo! API.

Java 1.5 added native support for working with XML documents using XPath. This means that if you know the XML format ahead of time, you can retrieve a specific piece of the document with a simple query. For example, every Yahoo! response includes an attribute called totalResultsAvailable in the <ResultSet> tag, which holds the total number of results Yahoo! has for that particular query. If you'd like to grab this value with XPath, you can use this simple query:

/ResultSet/@totalResultsAvailable
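To see that query in action without calling the service, here is a minimal sketch that runs it against a small in-memory document. The class name XPathAttributeDemo, the helper method, and the sample XML are assumptions made up for illustration; a real Yahoo! response contains many more elements and attributes.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class XPathAttributeDemo {
    // Hypothetical, trimmed-down stand-in for a Yahoo! response.
    static final String SAMPLE_XML =
            "<ResultSet totalResultsAvailable=\"1234\">"
            + "<Result><Url>http://www.example.com/</Url></Result>"
            + "</ResultSet>";

    // Parses the XML text and returns the totalResultsAvailable attribute.
    static String totalResultsAvailable(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        // The two-argument form of evaluate() returns the result as a String.
        return xpath.evaluate("/ResultSet/@totalResultsAvailable", doc);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(totalResultsAvailable(SAMPLE_XML)); // prints 1234
    }
}
```

The attribute comes back as a string, so convert it yourself (for example with Long.parseLong) if you need to do arithmetic on it.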

This hack presents a quick example using Java's built-in XML and XPath tools to work with Yahoo! Search Web Services. This native support means you won't need to download any external XML parsers; you simply need Java 1.5 or later.

What You Need

If you don't have Java 1.5 or higher, you'll need the latest Java Development Kit (JDK™), available at http://java.sun.com. On the right side of the page, you'll find a section called Popular Downloads. Click the link for Java 2 Standard Edition (J2SE™) 5.0 and download the JDK. Don't confuse this with the Java Runtime Environment (JRE), because you'll need the compiler included with the JDK.

Once you've downloaded the file for your system, install the JDK and you'll be set to compile and run the hack.

The Code

This simple Java program accepts a query term, builds the appropriate Yahoo! Search Web Services URL with that term, and parses the response. Then the program lists the top 10 URLs for that search term. The code uses XPath queries to pick out the total results available and each URL in the response.
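The URL-building step is worth a closer look, because the query term has to be URL-encoded before it goes into the request. The following is a minimal sketch of that step on its own; the class name RequestUrlDemo, the helper buildRequestUrl, and the application ID demo-app-id are placeholders invented for illustration, not values from the service.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.text.MessageFormat;

class RequestUrlDemo {
    // Request format: {0} is the application ID, {1} is the query.
    static final String WEB_SEARCH_URL_FORMAT =
            "http://api.search.yahoo.com/WebSearchService/V1/"
            + "webSearch?appid={0}&query={1}";

    // URL-encodes both values and substitutes them into the format string.
    static String buildRequestUrl(String appId, String query)
            throws UnsupportedEncodingException {
        return MessageFormat.format(WEB_SEARCH_URL_FORMAT,
                new Object[]{URLEncoder.encode(appId, "UTF-8"),
                        URLEncoder.encode(query, "UTF-8")});
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // "demo-app-id" is a placeholder, not a real application ID.
        System.out.println(buildRequestUrl("demo-app-id", "Java XML"));
        // prints http://api.search.yahoo.com/WebSearchService/V1/webSearch?appid=demo-app-id&query=Java+XML
    }
}
```

Encoding matters most for characters that have a meaning inside a URL: a query such as C# & XML becomes C%23+%26+XML, so the & in the term is not mistaken for a parameter separator.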

Save the following code to a file called WebSearch.java and be sure to include a unique application ID:

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLEncoder;
import java.text.MessageFormat;

/**
 * Simple demonstration of the Yahoo! Web Search Service using Java 1.5's
 * XML support.
 */
public class WebSearch {
    // Need to have an application ID to call the Yahoo! services.
    private static final String APPLICATION_ID = "insert app ID";

    // URL format for the request. The simplest request includes the
    // application ID and the query. See the service documentation for
    // a list of additional parameters.
    private static final String WEB_SEARCH_URL_FORMAT =
            "http://api.search.yahoo.com/WebSearchService/V1/"
            + "webSearch?appid={0}&query={1}";

    /**
     * Main program that takes a query and executes it as a web search
     * using the Yahoo! Web Search Service.
     *
     * @param args Command line arguments. There should be at least 1.
     */
    public static void main(String[] args) throws UnsupportedEncodingException,
            MalformedURLException, XPathExpressionException,
            ParserConfigurationException {
        // Make sure a query was given.
        String query = null;
        if (args.length == 0) {
            System.out.println("Usage: java WebSearch <query>");
            System.exit(1);
        } else {
            // Construct the query from the command line arguments.
            query = prepareQuery(args);
        }

        // Construct the URL. Inject the URL encoded application ID and
        // the search query.
        URL url = new URL(MessageFormat.format(WEB_SEARCH_URL_FORMAT,
                new Object[]{URLEncoder.encode(APPLICATION_ID, "utf-8"),
                        URLEncoder.encode(query, "utf-8")}));
        System.out.println("Request URL = " + url.toString());

        // Create an XPath engine.
        XPath xpath = XPathFactory.newInstance().newXPath();

        // Execute the query.
        Document responseDocument = null;
        try {
            // We need a Document to use XPath.
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            responseDocument = builder.parse(url.openStream());
        } catch (IOException e) {
            // Error calling the service.
            System.err.println("Error calling the service: " + e.toString());
            e.printStackTrace(System.err);
            System.exit(1);
        } catch (SAXException e) {
            // Error parsing the XML.
            System.err.println("Error parsing the XML: " + e.toString());
            e.printStackTrace(System.err);
            System.exit(1);
        }

        // Query the XML for the total results available.
        String totalResultsAvailable = (String) xpath.evaluate(
                "/ResultSet/@totalResultsAvailable",
                responseDocument,
                XPathConstants.STRING);
        System.out.println("Total results available for '" + query + "' is "
                + totalResultsAvailable);

        // Query the XML for the URLs.
        NodeList urls = (NodeList) xpath.evaluate("/ResultSet/Result/Url",
                responseDocument, XPathConstants.NODESET);
        for (int i = 0; i < urls.getLength(); i++) {
            Node urlNode = urls.item(i);
            System.out.println("URL " + (i + 1) + ": "
                    + urlNode.getTextContent());
        }
    }

    /**
     * Simple method that stitches together an array of strings into
     * a single string. Used to take multiple command line arguments
     * and turn them into a single query string.
     *
     * @param args The individual strings to stitch together.
     * @return A new string containing each of the strings passed in, all
     * separated by spaces.
     */
    private static String prepareQuery(String[] args) {
        StringBuffer queryBuffer = new StringBuffer();
        for (int i = 0; i < args.length; i++) {
            queryBuffer.append(args[i]);
            if ((i + 1) < args.length) {
                queryBuffer.append(" ");
            }
        }
        return queryBuffer.toString();
    }
}

To compile this code, open up a command prompt and type the following:

               
                  javac WebSearch.java

This should create the compiled WebSearch.class, which you can now run.

Running the Hack

From the same command prompt, you can run the code like so:

               
                  java WebSearch insert term
               

In response, the program shows the request URL it used, total results for that query, and the top 10 URLs. Figure 4-10 shows results for the term "Java XML".


Figure 4-10. Yahoo! Search results for "Java XML"

As you can see, handling REST queries and responses with Java is fairly quick work!
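If you'd like to experiment with the URL extraction before you have an application ID, you can run the same XPath query against a canned document. This is only a sketch under stated assumptions: the class name UrlListDemo, the helper extractUrls, and the two-result sample XML are invented for illustration and stand in for a real response.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class UrlListDemo {
    // Hypothetical sample with two results; not actual service output.
    static final String SAMPLE_XML =
            "<ResultSet totalResultsAvailable=\"2\">"
            + "<Result><Title>First</Title>"
            + "<Url>http://www.example.com/one</Url></Result>"
            + "<Result><Title>Second</Title>"
            + "<Url>http://www.example.com/two</Url></Result>"
            + "</ResultSet>";

    // Returns the text of every Url element, in document order.
    static List<String> extractUrls(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        // NODESET asks for all matching nodes rather than a single string.
        NodeList nodes = (NodeList) xpath.evaluate(
                "/ResultSet/Result/Url", doc, XPathConstants.NODESET);
        List<String> urls = new ArrayList<String>();
        for (int i = 0; i < nodes.getLength(); i++) {
            urls.add(nodes.item(i).getTextContent());
        }
        return urls;
    }

    public static void main(String[] args) throws Exception {
        List<String> urls = extractUrls(SAMPLE_XML);
        for (int i = 0; i < urls.size(); i++) {
            System.out.println("URL " + (i + 1) + ": " + urls.get(i));
        }
    }
}
```

Swapping the Url step in the query for Title would list the titles in the sample instead, which is a quick way to explore other fields.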

Ryan Kennedy
