Programming Web Services with Perl By Randy J. Ray, Pavel Kulchenko
December 2002
Pages: 486



Table of Contents

Chapter 1: Introduction to Web Services
The world is full of useful data and services offered by computer programs. But most of that data and most of those services are locked away. Web sites, designed for access by people and not programs, bury the information in an ever-changing morass of HTML. Communication protocols have been specific to applications and sometimes to operating systems. Precompiled libraries are useful only for particular programming languages on the system they were compiled for.
If you want to write a program to book a flight, check how much paid time off you have accrued, or find all the shows on TV that feature the stars of Buffy the Vampire Slayer, you're facing an uphill battle. All that data exists, but it's effectively inaccessible.
Sure, you could screenscrape HTML from web sites such as expedia.com and tvguide.com, but that puts you at the mercy of the web designers of those sites. Every time they decide to make their pages look prettier, you'll have to rewrite your screenscraper.
You might be able to wangle access to the machine that runs the payroll system, but it's unlikely. You might even know the programming language the payroll software was written in. But can you figure out the database structure?
Web services are all about enabling computers to communicate with each other, opening up services and data. Built on open standards, the way that the Web is, web services offer convenient standard ways to open up the functionality of your applications to other applications.
In this chapter you'll learn a bit about the history of web services and the current lie of the land—what systems you can choose from, where the hype exceeds reality, and so on.
History
Web services didn't spring full-formed from the collective forehead of Microsoft, IBM, and Sun. Systems such as SOAP, XML-RPC, and WSDL are merely the latest iteration in a long series of distributed computing initiatives. Ever since there were two computers, people have been trying to make them work together.
The web services we have today trace their ancestry back to the Sun Remote Procedure Call (RPC) system. This provided a standard way for a client to interact with a server, using the model of a procedure call. A server might offer many services (procedures, identified by name), and a client would tell the server which service to use and what values to pass it (parameters). The server would send back a value (the return value) from the service.
The problem at the time was that the binary representation of values varied depending on the operating system, hardware, and programming languages that created the value. The Sun RPC system solved the problem by specifying a standard binary format for encoding the values in the parameters. Representing data can still be a problem, but web services solve it by representing values in XML (Extensible Markup Language).
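The representation problem is easy to see with a single integer. The following is a minimal sketch, not taken from the Sun RPC system itself: it uses Perl's pack function to lay out the same 32-bit value in two byte orders, then shows what happens when one side misreads the other's bytes. The helper name hex_bytes is made up for this illustration.

```perl
use strict;
use warnings;

# The same 32-bit integer laid out in two byte orders.
my $value         = 1;
my $big_endian    = pack('N', $value);   # most significant byte first
my $little_endian = pack('V', $value);   # least significant byte first

# Hypothetical helper: render a byte string as space-separated hex.
sub hex_bytes {
    my ($bytes) = @_;
    return join(' ', map { sprintf('%02x', ord($_)) } split(//, $bytes));
}

print 'big-endian:    ', hex_bytes($big_endian), "\n";
print 'little-endian: ', hex_bytes($little_endian), "\n";

# Reading big-endian bytes as if they were little-endian gives a wrong value.
my $misread = unpack('V', $big_endian);
print "misread as little-endian: $misread\n";
```

Two machines that disagree on byte order would see 1 and 16777216 for the same four bytes, which is why an agreed encoding is needed.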
Microsoft offered the next major step forward, with its Component Object Model (COM). COM was based on language independence, interoperability, a strong focus on reusable components, and extensibility. The ability to develop components and object libraries that would be accessible over varying platforms and by multiple languages removed some old hurdles in areas of rapid application development and system integration. While RPC dealt only in procedures, COM was designed to exist in a world of objects and method calls.
Microsoft extended this model with DCOM, the Distributed Component Object Model. DCOM overcame many of the limitations of data and interface specifications. Remote objects could be accessed as though they were local, and you could even extend a remote interface. DCOM has never really taken off outside the Microsoft world, though.
The Web Services Dream
From this melting pot of history came web services. Dissatisfaction with various aspects of Sun RPC, COM, and CORBA led programmers to look for glue that had the best parts of all of these systems (cross-platform, high-level, interlanguage) and left behind the drawbacks (complex ORB systems, proprietary ownership, and confusing IDLs).
Web services escape being shackled to particular hardware or languages by using the Extensible Markup Language (XML) to represent data. There are XML parsers available for everything from embedded systems to supercomputers, and almost every conceivable programming language (including Perl!). The ubiquity of XML parsers and the platform neutrality of the XML standard mean that web services designers don't have to worry about the issues of byte ordering and datatype size that were a major hurdle for the Sun RPC designers.
To get away from the complexity of ORBs, sockets, and all manner of connectivity hassles, web services are built on top of the Hypertext Transfer Protocol (HTTP). HTTP is also ubiquitous, with web servers available for almost every platform. A server is identified by a URL, and managing communication simply becomes a problem of mapping a procedure call onto an HTTP request and response (this mapping is quite natural, as we'll see in Chapter 2).
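To make the mapping concrete, here is a rough sketch of how a procedure call might be carried in an HTTP request. The host, path, and XML payload are hypothetical placeholders, not part of any real protocol, and the function name http_request_for_call is invented for this illustration.

```perl
use strict;
use warnings;

# Build the text of an HTTP POST request that carries a procedure call.
# Assumes an ASCII payload, so character length equals byte length.
sub http_request_for_call {
    my ($host, $path, $payload) = @_;
    return join("\r\n",
        "POST $path HTTP/1.1",
        "Host: $host",
        'Content-Type: text/xml',
        'Content-Length: ' . length($payload),
        '',                      # blank line separates headers from body
        $payload);
}

# Made-up XML standing in for an encoded procedure call.
my $payload = '<call><name>getTime</name></call>';
print http_request_for_call('rpc.example.com', '/service', $payload), "\n";
```

The URL identifies the server, the request body names the procedure and its parameters, and the HTTP response body would carry the return value back.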
XML-RPC was the first web service protocol, forked from the early development of SOAP (Simple Object Access Protocol). As the name XML-RPC suggests, it only tries to encode procedure calls. It defines a standard way to encode data, method calls, and exceptions. It's quite simple, and has gained popularity in the world of scripting languages such as Perl and Python, because its type system is very similar to those of most scripting languages. Chapter 3 and Chapter 4 show how to develop XML-RPC servers and clients in Perl.
The Web Services Cold Shower
The previous section showed the positive side of web services: open standards, cross-platform interoperability, and loose coupling. But it's only half the story. As web services are more widely deployed, developers have found holes and shortcomings in the systems described earlier.
The XML-RPC specification, for instance, has no standardized error system. Sure, there's a way for a server to say "something went wrong," and it can even return an error string and number, but there are no standard values for the string and number. Each application invents its own error codes. (There is an effort outside the XML-RPC specification to come up with some standard error codes, and this is described in Chapter 3.)
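An XML-RPC fault response carries exactly two members, faultCode and faultString. The sketch below pulls them out of a sample response. The code 4 and its message are arbitrary values chosen for this example, which is the point: nothing in the specification says what they should be. The regular expressions are for illustration only, and a real client would use an XML parser. The function name parse_fault is made up.

```perl
use strict;
use warnings;

# A sample fault response; the code and message are arbitrary examples.
my $response = <<'XML';
<?xml version="1.0"?>
<methodResponse>
  <fault>
    <value>
      <struct>
        <member><name>faultCode</name><value><int>4</int></value></member>
        <member><name>faultString</name><value><string>Too many parameters.</string></value></member>
      </struct>
    </value>
  </fault>
</methodResponse>
XML

# Return a hash reference describing the fault, or nothing if there is none.
sub parse_fault {
    my ($xml) = @_;
    return unless $xml =~ /<fault>/;
    my ($code)   = $xml =~ m{<name>faultCode</name>\s*<value>\s*<(?:int|i4)>(-?\d+)</(?:int|i4)>};
    my ($string) = $xml =~ m{<name>faultString</name>\s*<value>\s*<string>(.*?)</string>}s;
    return { code => $code, string => $string };
}

my $fault = parse_fault($response);
print "fault $fault->{code}: $fault->{string}\n" if $fault;
```

A client can detect that a fault occurred, but deciding what code 4 means requires documentation from the particular server.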
The SOAP standard uses advanced features of XML, such as namespaces and XML Schema types (Chapter 2 introduces these if you've never met them before). This eliminates some parsers, which can't handle namespaces, and increases the processing overhead.
There are some who say that XML-RPC and SOAP promote a misleading view of distributed computing; that any complex application built around remote procedure calls is inevitably going to be poorly designed and ineffectively implemented. The Representational State Transfer (REST) philosophy of web services offers a completely different view of how you should design your web services. You'll see REST in detail in Chapter 11.
The web services world is so fractious that even REST has its detractors. They say REST is too academic, it is theory that's difficult to translate into practice, and it avoids the hard problems of standardized encodings that XML-RPC and SOAP are designed to solve.
UDDI is arguably the biggest example of the web-services hype. At one point there were predictions of artificially intelligent (AI) programs that would discover and connect web services automatically, from standardized descriptions of functionality offered and wanted. However, there are no standardized descriptions of functionality, UDDI covered only a very small part of the business relationship that most large-scale web services needed to express, and ultimately nobody has felt the need for this kind of Holy Grail strongly enough to implement it.
Who to Believe?
Reality, as always, lies somewhere between the optimist and the cynic. There are real advantages to using web services instead of systems such as CORBA, as well as real drawbacks.
The toolkits for SOAP and XML-RPC are far more convenient to use than CORBA or COM, and convenience matters. Perl programmers know that you don't have to run the fastest, just fast enough. (As the old joke says, I don't have to outrun the bear; I just have to outrun you.) In a world in which development cycles are shrinking and deadline pressures are growing, convenience leads to quicker development, which means higher profits.
Web services succeed in being lightweight, in the sense that it's easier to implement an HTTP server and client than it is to write an ORB or talk to one. As a result, there are SOAP and XML-RPC libraries for languages that don't have ORBs, and web services have spread further than CORBA, COM, and Sun RPC ever did.
While the AI pipe dream of UDDI servers reducing programming problems to a set of XML transformations has gone up in smoke, UDDI isn't a complete bust. For domain-specific applications, UDDI is still useful. For example, in bioinformatics you don't care which human genome server you search, so long as it has the data. A UDDI registry of searchable databases has been proposed to let programs say "I want to search the human genome" and respond by automatically finding a current list of servers.
Interoperability does remain a bugbear, though. The web-services standards are long and often nebulous, so it's quite easy to write a server or client that appears to conform to the standards but actually doesn't. There are interoperability suites and periodic "bakeoffs" in which toolkit implementers run extensive tests against one another to ensure interoperability. However, there are many toolkits that have some interoperability problems, and the promise of transparent interoperability is still just that.
Web Services in the Real World
Let's look at the applications in this book and see how they lend themselves to the web-services protocol they illustrate.
Chapter 3 gives a quick example of writing an XML-RPC client for the Meerkat news service from nothing more than an XML parser and an HTTP library. XML-RPC lends itself to this because there's no innate object orientation to the design of the news service, and the main audience for the news service is the programmers of the scripting languages that happen to best support XML-RPC.
Chapter 4 reimplements the Meerkat client using the three Perl XML-RPC toolkits to show how much simpler life is with a toolkit to do the heavy lifting for you. Each toolkit also implements another example, fetching an entry from one of several quote databases. This is a more complex example, with an API to implement as both server and client.
Here is a simple program that fetches and prints five Perl stories from Meerkat using the XMLRPC::Lite toolkit:
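#!/usr/bin/perl
use strict;
use warnings;

use XMLRPC::Lite;

# The fault handler receives the client object and either a fault
# object (server-side fault) or a plain value (transport failure).
my $client = XMLRPC::Lite
  ->proxy('http://www.oreillynet.com/meerkat/xml-rpc/server.php')
  ->on_fault(sub {
      my ($self, $res) = @_;
      die ref($res) ? 'Fault: ' . $res->faultstring
                    : 'Transport error: ' . $self->transport->status;
    });

# Ask for five stories whose text matches /[pP]erl/.
my $resp = $client->call('meerkat.getItems',
                         { search       => '/[pP]erl/',
                           num_items    => 5,
                           descriptions => 75 })->result;

foreach my $story (@$resp) {
  print $story->{description}, "\n";
  print "  ", $story->{link}, "\n";
  print "\n";
}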
This is the kind of output it produces:
Deploy USE_GNOMENG infrastrcuture o USE_REINPLACE instead of PERL o Mark
  http://www.FreshPorts.org/audio/freebirth/
   
Support for merging, speed improvements, support for XTMPath, and LTM.
  http://www.garshol.priv.no/download/xmltools/prod/XTMBase.html
   
Michael Stevens compares two popular mail filtering tools, both written in
  http://www.perl.com/pub/a/2002/08/27/filtering.html
   
Directory layouts of py-gtk and py-gnome packages have been changed, so tha
  http://www.FreshPorts.org/mail/pmail/
   
Directory layouts of py-gtk and py-gnome packages have been changed, so tha
  http://www.FreshPorts.org/editors/moleskine/
Chapter 2: HTTP and XML Basics
The web-services technologies described in this book are built primarily from XML and HTTP. While toolkits don't demand an intimate knowledge of these techniques, having a basic understanding of these low-level elements will lead to a better grasp of the more complex concepts presented later in this book.
This chapter presents overviews of HTTP (with some discussion of HTTP/S), XML, and a little coverage of the XML Schema language. The goal of this chapter is to present these topics if you aren't yet familiar with them, without straying too far from the main focus of the book. The discussion of these concepts will focus on their application to XML-RPC and SOAP, with references to other books and web sites if you want to learn more.
HTTP
Anyone who has surfed the Web has used the Hypertext Transfer Protocol; it's the dominant protocol for fetching web pages from a server. The name is now something of a misnomer because the protocol is used for more than HTML web pages. URLs (Uniform Resource Locators, or web addresses) that start with http indicate a page that is fetched through HTTP.
HTTP was originally developed as a layer over the TCP protocol to simplify applications that exchanged HTML data. Since then, it's been adopted and standardized by the World Wide Web Consortium (W3C) into the form currently in use. The current HTTP standard is at Version 1.1, which supports a number of optimizations over the 1.0 specification that had been the standard for some time.
Fortunately, as will be illustrated later, because there are excellent programming toolkits available for Perl that simplify using HTTP, it isn't necessary to have an intimate knowledge of HTTP details and internals.
HTTP is based on a simple model of a request/response conversation. A client sends a request to a target server, possibly with some amount of data accompanying the request. The server always gives a response, even if it's an error. There are even ways for the server to report to the client that it is completely unable to handle the client's request.
Figure 2-1 shows a simple layout of the request sent by a client and the corresponding response. Note the similarities between the two.
Figure 2-1: Basic request/response model
Both ends of the conversation communicate a lot of their information in message headers, which are similar in style to those used by electronic mail and Internet news servers.
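Because requests and responses share the same overall shape, one small routine can take either apart. The following is a simplified sketch, assuming a complete message held in a string and ignoring details such as folded header lines. The function name parse_http_message and the sample host are made up for this illustration.

```perl
use strict;
use warnings;

# Split a raw HTTP message into its first line, headers, and body.
sub parse_http_message {
    my ($raw) = @_;
    # A blank line separates the header block from the body.
    my ($head, $body) = split /\r?\n\r?\n/, $raw, 2;
    my ($first_line, @header_lines) = split /\r?\n/, $head;
    my %headers;
    for my $line (@header_lines) {
        my ($name, $value) = $line =~ /^([^:]+):\s*(.*)$/ or next;
        $headers{lc $name} = $value;   # header names are case-insensitive
    }
    return {
        first_line => $first_line,
        headers    => \%headers,
        body       => defined $body ? $body : '',
    };
}

my $request = join "\r\n",
    'POST /RPC2 HTTP/1.1',
    'Host: rpc.example.com',
    'Content-Type: text/xml',
    'Content-Length: 5',
    '',
    'hello';

my $msg = parse_http_message($request);
print "first line: $msg->{first_line}\n";
print "$_: $msg->{headers}{$_}\n" for sort keys %{ $msg->{headers} };
print "body: $msg->{body}\n";
```

Passing a response to the same routine works unchanged; only the content of the first line differs.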
The structure of an HTTP message is basically the same for requests and responses, except for the first line. Both messages start with a line specific to the message type (a
XML
The Extensible Markup Language, or XML, is arguably one of the most useful and important technologies to emerge as a result of HTML and the World Wide Web. While the basic concepts and theories behind it aren't very complicated, it has proven to be a critical tool in solving numerous problems, from providing neutral data representation between very different architectures, to bridging the gap between software systems with minimal effort.
XML is often referred to as "self-describing data," because the XML version of the data contains information you'd otherwise use to describe the data format including: element/parameter names, structural relationships between elements, hierarchical relationships, and so forth. If the element tags (names) are chosen so as to be meaningful and descriptive, the resulting XML can often be read and reasonably understood separate from the applications that use it.
At the simplest level, understanding XML is a matter of understanding the definitions and roles of the three primary building blocks: elements, attributes, and data. There are other things that can appear in an XML file, and they're also explained in the following sections.
This XML fragment shows elements (book), attributes (isbn), and data ("Programming Web Services with Perl"):
<book isbn="0596002068">Programming Web Services with Perl</book>
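To see the three building blocks as separate pieces, the fragment can be picked apart in Perl. This is only a sketch: the pattern below handles this one shape of fragment (a single element with one attribute and plain text content) and is no substitute for a real XML parser. The function name pick_apart is made up.

```perl
use strict;
use warnings;

my $fragment = '<book isbn="0596002068">Programming Web Services with Perl</book>';

# Return the element name, attribute name and value, and character data
# for a fragment of exactly this shape; return nothing otherwise.
sub pick_apart {
    my ($xml) = @_;
    my ($element, $attribute, $value, $data) =
        $xml =~ m{\A<(\w+)\s+(\w+)="([^"]*)">([^<]*)</\1>\z}
        or return;
    return {
        element   => $element,
        attribute => $attribute,
        value     => $value,
        data      => $data,
    };
}

my $parts = pick_apart($fragment);
print "element:   $parts->{element}\n";
print "attribute: $parts->{attribute} = $parts->{value}\n";
print "data:      $parts->{data}\n";
```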

Section 2.2.1.1: Elements and namespaces

Elements are the building blocks of XML. To those familiar with HTML, elements are what make XML look like HTML at first glance. However, XML is very different from HTML, and much of the difference is in the rules governing the elements.
An element (also referred to interchangeably as a tag) is a name, or symbol, made up of alphabetic, numeric, and a handful of special characters (hyphens, underscores or periods). The very first character of an element name must be either alphabetic or an underscore; numbers or the other special characters can't start an element name. Also, the leading three characters can't be
XML Schema
The XML Schema language is the specification that the W3C organization developed to replace the DTD as the preferred way to describe the content and structure of XML documents. While the DTD still has a well-established place in XML technologies, schemas are being used by more and more applications. The overall acceptance of the XML Schema format continues to grow at a steady pace.
In general, you have to read this section only if you're planning to read the chapters detailing the low-down and dirty details of SOAP, WSDL, and UDDI. Those three standards build heavily on the XML Schema Language. If you're planning on letting toolkits do all the heavy lifting for you, you can flip straight past this to Chapter 3 and enjoy the simple life.
The main argument against the DTD is simple: it isn't XML. The DTD structure was inherited from HTML's roots in SGML, which itself is designed to solve a much wider range of problems. Thus, the syntax and structure of the DTD has to manage and support this flexibility that XML itself doesn't use or need.
The DTD still has some benefits over XML Schema:
  • A DTD is generally simpler and smaller in size than the schema describing the same structure.
  • XML Schema doesn't provide a way to define named text entities, such as &eacute; for the character é.
  • While being an XML application is a boon for XML Schema, the selection of available tools still heavily favors DTD. This factor can be expected to change over time, however.
The main area in which XML Schema wins out over the DTD is in expressing more complex structures and relationships. While a DTD can express the same level of complexity, the complexity of the DTD itself grows at an alarming rate. The XML Schema language has very rich support not just for defining elements and types themselves, but also for defining them by extending and expanding upon existing structures.
Chapter 3: Introduction to XML-RPC
XML-RPC is a web services protocol that implements remote procedure calls over HTTP. While it doesn't have the advanced feature set of SOAP or CORBA, its simplicity makes it easy to implement and use.
This chapter introduces the main concepts and limitations of XML-RPC, and describes how XML-RPC uses XML and HTTP. If you plan to use a toolkit, you need to read only the first part of the chapter about the concepts in XML-RPC. If you're going to do it all yourself, read not only the concepts sections but also the XML and HTTP sections, and the sample client section about the Meerkat service.
History of XML-RPC
XML-RPC was designed primarily by Dave Winer of UserLand Software, Inc. He was one of the designers working on the SOAP specification and became frustrated with the mounting complexity. He wanted something to use immediately, but SOAP was taking a long time to coalesce. So he forked off what was then an early working draft of the SOAP protocol, and this became what is now known as XML-RPC.
The first implementation of the specification was in UserLand's Frontier product, a content management system with scripting, object database, and server capabilities. This was introduced in April 1998, and eventually the specification was published to encourage the development of other compliant toolkits. Currently, there are 65 implementations in languages ranging from AppleScript to Zope. There are toolkits for Lisp, Ruby, Eiffel, Scheme, Dylan, and an impressive seven different implementations for PHP. Perl features three different implementations, which will be covered in depth in Chapter 4.
The web site for XML-RPC, http://www.xmlrpc.com, is a good source for more history of the specification. It also features links to various toolkits and to the current specification.
XML-RPC uses a simple XML application to express function calls (requests) and returned values (responses) between clients and servers. The heart of an XML-RPC message is the way data is encoded into XML.
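As a rough illustration of what a request looks like on the wire, the sketch below wraps a method name and some string parameters in a methodCall document. The method name example.greet is hypothetical, as are the helper names xml_escape and method_call, and only string parameters are handled here.

```perl
use strict;
use warnings;

# Escape the characters that are special in XML text.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Wrap a method name and string parameters in an XML-RPC request document.
sub method_call {
    my ($method, @params) = @_;
    my $params = '';
    for my $param (@params) {
        $params .= '<param><value><string>' . xml_escape($param)
                 . '</string></value></param>';
    }
    return qq{<?xml version="1.0"?>\n}
         . "<methodCall><methodName>$method</methodName>"
         . "<params>$params</params></methodCall>";
}

print method_call('example.greet', 'Perl & friends'), "\n";
```

The response travels back in a similar methodResponse document, holding either a single return value or a fault.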

Section 3.1.1.1: Data encoding

Data is at the core of any interface, since the first and foremost goal is to send information between two points. XML-RPC supports six basic datatypes in messages (seven, technically, since i4 and int may be considered distinct), and also supports serialization of arrays and structures (name/value pairs just like Perl's hashes). The data types are explained in Table 3-1.
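The following sketch shows how nested Perl data might be serialized into XML-RPC values. It covers only integers, doubles, strings, arrays, and structures, and it guesses the scalar type from the value's appearance, since a Perl scalar does not carry an explicit type. That guessing is an assumption of this example, not something the specification prescribes. The function names are made up, and undefined values are not handled.

```perl
use strict;
use warnings;

# Escape the characters that are special in XML text.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Serialize a scalar, array reference, or hash reference as an XML-RPC value.
sub encode_value {
    my ($data) = @_;
    my $inner;
    if (ref($data) eq 'ARRAY') {
        my $items = '';
        $items .= encode_value($_) for @$data;
        $inner = "<array><data>$items</data></array>";
    }
    elsif (ref($data) eq 'HASH') {
        my $members = '';
        for my $name (sort keys %$data) {   # sorted for predictable output
            $members .= '<member><name>' . xml_escape($name) . '</name>'
                      . encode_value($data->{$name}) . '</member>';
        }
        $inner = "<struct>$members</struct>";
    }
    elsif ($data =~ /\A-?\d+\z/) {          # looks like an integer
        $inner = "<int>$data</int>";
    }
    elsif ($data =~ /\A-?\d+\.\d+\z/) {     # looks like a decimal number
        $inner = "<double>$data</double>";
    }
    else {
        $inner = '<string>' . xml_escape($data) . '</string>';
    }
    return "<value>$inner</value>";
}

print encode_value({ title => 'Perl news', hits => 5, tags => ['xml', 'rpc'] }), "\n";
```

The guessing breaks down for values such as a ZIP code that should stay a string, so a real encoder needs some way to state the intended type explicitly.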
Example Client: Meerkat
Before moving to Chapter 4 and diving straight into the different Perl toolkits for XML-RPC, let's look at a simple example of a client to give you a feel for the phases in the XML-RPC request/response lifecycle. The example is fairly rudimentary, so that it can be done without using any of the available toolkits. But even so, it's complex enough that it also serves as an incentive for you to read Chapter 4, which shows how much the toolkits can simplify an application.
Meerkat is an open wire service offered by O'Reilly & Associates, Inc. It offers application-level access to news stories in an array of channels and covers a large variety of topics. Meerkat demonstrates the early success of XML-RPC as an API layer.
Users register an account at http://meerkat.oreillynet.com and then customize the way the news content is presented. From the browser interface, it is possible to select not only the channels themselves but also to fine-tune the set of stories that are chosen for display by applying a search pattern (which can in fact be a regular expression) as a filter against the list. For example, a filter of "perl" against the stories from the "Scripting News" channel limits the results to just the stories that mention Perl.
Users may save a choice of channels to browse and a search pattern. To get started, Meerkat offers a set of ready-to-use basic profiles for the more common and popular topics. Figure 3-1 shows a screenshot of Meerkat displayed with the Mozilla browser running under Linux. The profile used for the contents that are displayed gathers items from most of the Perl-related channels, plus a few others such as the popular Slashdot news portal. All stories are searched for the word "perl," so that only the ones that actually mention Perl directly are displayed.
Figure 3-1: A sample Meerkat page
Limitations of XML-RPC
As flexible and powerful as XML-RPC is, it does suffer from a number of very significant limitations. None of these limitations prevent it from being usable, and there are some very useful systems built with XML-RPC. In some cases the use of XML-RPC is a supplementary feature, while in others it is a basic aspect of the system. However, understanding the limitations of the protocol makes it easier to avoid pitfalls or traps when designing applications to use it.
Probably the most limiting aspect of XML-RPC is within the specification itself. No, this doesn't mean that the protocol is the source of the limitation. Rather, the specification as it stands is frozen. Because there is no version specification within the base definitions, the author of the protocol has thus far chosen to not implement any changes or extensions to the specification. This is primarily out of a concern for maintaining a strict sense of compatibility at the wire level between implementations.
The definition of XML-RPC is intentionally simple and clear. The specification at http://www.xmlrpc.com is the single and definitive source for evaluating a claimant to the stamp of XML-RPC compatibility.
Some toolkits (in various languages) have strayed off the path in small steps, primarily in terms of being more flexible at the transport level. The most common example of this is when a toolkit chooses to allow HTTP 1.1 chunked transfer encoding. When this style of content transmission is used, it isn't always necessary to provide a Content-Length header. This method is often used with streaming content models such as multimedia types, but it can also be used in cases in which an application wants (or needs) to start the transmission of data before the complete length of the response is known.
If an application or toolkit does something like this, it still bears the responsibility of being completely compatible with even the strictest servers and clients. Otherwise, it can't refer to itself as being an implementation of XML-RPC.
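For readers who haven't met chunked transfer encoding, here is a small sketch of the framing it uses: each chunk is sent as its length in hexadecimal, a line break, the data, and another line break, with a zero-length chunk marking the end. The function names are made up, the code assumes ASCII data (so character length equals byte length), and it ignores optional features such as trailers.

```perl
use strict;
use warnings;

# Frame a list of data pieces as an HTTP/1.1 chunked body.
sub chunk_encode {
    my (@pieces) = @_;
    my $out = '';
    for my $piece (@pieces) {
        next unless length($piece);   # a zero-length chunk would end the body early
        $out .= sprintf("%x\r\n", length($piece)) . $piece . "\r\n";
    }
    return $out . "0\r\n\r\n";        # terminating chunk
}

# Reassemble the original data from a chunked body.
sub chunk_decode {
    my ($body) = @_;
    my $out = '';
    while ($body =~ s/\A([0-9a-fA-F]+)[^\r\n]*\r\n//) {
        my $size = hex($1);
        last if $size == 0;
        $out .= substr($body, 0, $size, '');   # take the chunk data off the front
        $body =~ s/\A\r\n//;                   # drop the line break after the data
    }
    return $out;
}

my $encoded = chunk_encode('Hello, ', 'world');
print chunk_decode($encoded), "\n";
```

Because each chunk announces only its own size, the sender never needs to know the total length in advance, which is why no Content-Length header is required.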
Chapter 4: Programming XML-RPC
Web services toolkits hide the XML and HTTP protocol details and let you, the applications programmer, focus on the application you're building. In this chapter we look at three XML-RPC toolkits for Perl. Each offers different features and a different interface, with different advantages and drawbacks for the programmer.
This chapter shows how to use each toolkit, and develops the same pair of applications in all three. You'll see the relative strengths and weaknesses of each toolkit and be able to select and use the right toolkit for your application.
Perl Toolkits for XML-RPC
This chapter presents the toolkits from oldest to newest. The first XML-RPC toolkit was the Frontier::RPC2 package by Ken MacLeod. The name refers to the original system the package was intended to support, the Frontier content management system from UserLand Software, the source of the XML-RPC specification itself. In mid-2002, the package was taken over by a new maintainer. The time proved right for renaming the modules, and now the module is known as RPC::XMLSimple.
Following this module was the XMLRPC::Lite component of the SOAP::Lite package for Perl. Support for XML-RPC was added at a later point than the SOAP components themselves, but the functionality builds on the framework that the author, Pavel Kulchenko, had already created to support the SOAP standards. As a result, it integrates very smoothly, and benefits from elements already present in SOAP::Lite, such as a pure-Perl XML parser that can be used when none of the CPAN-based XML modules are available.
The newest addition to the Perl/XML-RPC family is the RPC::XML package. Like the RPC::XMLSimple package, this module requires that the XML::Parser module from CPAN be installed. It doesn't provide a native XML parser the way XMLRPC::Lite does. It relies on the LWP package for client transport, but on the server side it can work with the HTTP::Daemon package (from LWP), the Net::Server package (from CPAN), or with Apache and mod_perl directly as a mod_perl location-handler.
In each toolkit we'll create the same client and server applications. This side-by-side comparison lets you see the strengths and weaknesses of each, to help you choose the best solution for your project.
Many of the code examples in this chapter show only the sections relevant to the technology and concepts under discussion. The full source of the examples in this chapter is provided in Appendix C.
RPC::XMLSimple
The RPC::XMLSimple module provides support code for the client and server classes, RPC::XMLSimple::Client and RPC::XMLSimple::Daemon. An application will include only the server or client code, as needed. Both of those modules already include the core elements.
Installation is simple because the module is available through CPAN and has only a few dependencies. It requires the XML::Parser module to handle the XML data, and the LWP module for both client communications and server functionality.
Let's reimplement the meer2html.pl tool from Chapter 3 using the toolkit instead of building and parsing XML-RPC requests and responses manually. Comparing just the length in lines of the two versions of the utility (with comments and blank lines excluded), the Frontier version is less than half the length of the manual version.
Example 4-1 shows the relevant parts of the meer2html-Frontier.pl code. The sections shown are those that differ significantly from the original version.
Example 4-1. The meer2html-Frontier.pl script
use RPC::XMLSimple::Client;
   
$client = RPC::XMLSimple::Client->new(url => MEERKAT);
   
sub show_data {
    my $data = shift;
   
    print STDOUT qq(<span class="meerkat">\n<dl>\n);
    for (@$data) {
        print STDOUT <<"END_HTML";
<dt class="title"><a href="$_->{link}">$_->{title}</a></dt>
<dd class="description">$_->{description}</dd>
END_HTML
    }
    print STDOUT qq(</dl>\n</span>\n);
}
   
sub resolve_name {
    my ($str, $name) = @_;
   
    $name = "meerkat.get${name}BySubstring";
    my $resp = $client->call($name, $str);
    die "resolve_name: $str returned more than 1 match"
        if (@$resp > 1);
   
    $resp->[0]{id};
}
   
sub get_data {
    my ($key, $val, $num) = @_;
   
    $client->call('meerkat.getItems',
                  { $key         => $val,
                    time_period  => '7DAY',
                    num_items    => $num,
                    descriptions => 200 });
}
XMLRPC::Lite
The XMLRPC::Lite package is part of the SOAP::Lite suite, written by one of the authors of this book, Pavel Kulchenko. It shares much of the same underlying architecture and structure as the SOAP package. As a result, it also supports some transport protocols that aren't officially part of XML-RPC, such as TCP/IP and the POP3 protocol. This package is also the only one of the three toolkits that can be used without dependency on an external XML parser. It will use the faster XML::Parser if it is available, however.
This module is installed as part of the larger SOAP::Lite installation. That process is described in greater detail in Chapter 6, so it won't be covered here.
Example 4-3 shows how Chapter 3's Meerkat application looks with XMLRPC::Lite. Notice how similar it appears to the RPC::XMLSimple version of this application (Example 4-1). The interface designs of all the XML-RPC toolkits are very similar, which should simplify switching between them, if need be. As always, full code is available in Appendix C.
Example 4-3. The meer2html-Lite.pl client
use XMLRPC::Lite;
   
$client = XMLRPC::Lite->proxy(MEERKAT)
          ->on_fault(sub { die "Transport error: " .
                               $_[1]->faultstring });
sub show_data {
    my $data = shift;
   
    print STDOUT qq(<span class="meerkat">\n<dl>\n);
    for (@$data) {
        print STDOUT <<"END_HTML";
<dt class="title"><a href="$_->{link}">$_->{title}</a></dt>
<dd class="description">$_->{description}</dd>
END_HTML
    }
    print STDOUT qq(</dl>\n</span>\n);
}
   
sub resolve_name {
    my ($str, $name) = @_;
   
    $name = "meerkat.get${name}BySubstring";
    my $resp = $client->call($name, $str)->result;
    die "resolve_name: $str returned more than 1 match"
        if (@$resp > 1);
   
    $resp->[0]{id};
}
   
sub get_data {
    my ($key, $val, $num) = @_;
   
    $client->call('meerkat.getItems',
                  { $key         => $val,
                    time_period  => '7DAY',
                    num_items    => $num,
                    descriptions => 200 })->result;
}
RPC::XML
The last of the XML-RPC toolkits discussed here is the RPC::XML package, developed by one of the authors of this book, Randy J. Ray. This package isn't quite as independent as the XMLRPC::Lite implementation; it requires an external XML parser (currently the XML::Parser package from CPAN).
This package gives you a lot of flexibility in creating server applications. The main server class, RPC::XML::Server, can function as a standalone server using either the HTTP::Daemon class from the LWP package or the Net::Server package from CPAN. This latter package supports several different multiprocess models and provides all the background operation for whichever model the application chooses to use. Applications can even choose a model on-the-fly, rather than being locked into a specific one. The RPC::XML package also provides a server class designed especially to act as a content handler for Apache and mod_perl.
In addition to these choices in server management, the package comes with a set of server-side methods that implement the introspection interface pioneered by the PHP XML-RPC suite that was used in building the Meerkat API. This introspection interface was described in Chapter 3.
The RPC::XML version of the Meerkat example isn't significantly different from the other toolkits' versions. Example 4-5 shows the relevant parts of this version of the script. Appendix C lists the full program.
Example 4-5. The meer2html-RPC::XML.pl script
use RPC::XML::Client;
   
$client = RPC::XML::Client
              # Remember that MEERKAT was declared with "use constant"
              ->new(MEERKAT,
                    error_handler =>
                    sub { die "Transport error: $_[0]" });
   
sub show_data {
    my $data = shift;
   
    print STDOUT qq(<span class="meerkat">\n<dl>\n);
    for (@$data) {
        print STDOUT <<"END_HTML";
<dt class="title"><a href="$_->{link}">$_->{title}</a></dt>
<dd class="description">$_->{description}</dd>
END_HTML
    }
    print STDOUT qq(</dl>\n</span>\n);
}
   
sub resolve_name {
    my ($str, $name) = @_;
   
    $name = "meerkat.get${name}BySubstring";
    my $resp = $client->simple_request($name, $str);
    die "resolve_name: $str returned more than 1 match"
        if (@$resp > 1);
   
    $resp->[0]{id};
}
   
sub get_data {
    my ($key, $val, $num) = @_;
   
    $client->simple_request('meerkat.getItems',
                            { $key         => $val,
                              time_period  => '7DAY',
                              num_items    => $num,
                              descriptions => 200 });
}
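Side by side, Examples 4-1, 4-3, and 4-5 differ mainly in how the client object is constructed and in how a return value is unwrapped. The following is a minimal sketch of hiding that second difference behind one function. The stub packages and the rpc_call helper are hypothetical and belong to no toolkit; the stubs only mimic the call shapes used in the three examples, so the sketch runs without a network connection or any of the toolkits installed.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the three toolkit clients. Each one mimics
# only the call shape used in the corresponding example above.
package StubResponse;
sub new    { my ($class, $value) = @_; return bless { value => $value }, $class }
sub result { my ($self) = @_; return $self->{value} }

package StubCallClient;      # call() returns plain data (Example 4-1 style)
sub new  { my ($class) = @_; return bless {}, $class }
sub call { my ($self, $method, @args) = @_; return [ { id => 42 } ] }

package StubLiteClient;      # call() returns an object with result() (Example 4-3 style)
sub new  { my ($class) = @_; return bless {}, $class }
sub call { my ($self, $method, @args) = @_; return StubResponse->new([ { id => 42 } ]) }

package StubSimpleClient;    # simple_request() returns plain data (Example 4-5 style)
sub new            { my ($class) = @_; return bless {}, $class }
sub simple_request { my ($self, $method, @args) = @_; return [ { id => 42 } ] }

package main;

# One closure per call style; each returns the unwrapped Perl data.
my %invoke = (
    call   => sub { my ($client, @request) = @_; return $client->call(@request) },
    lite   => sub { my ($client, @request) = @_; return $client->call(@request)->result },
    simple => sub { my ($client, @request) = @_; return $client->simple_request(@request) },
);

# Hypothetical helper: issue a request in the named style.
sub rpc_call {
    my ($style, $client, $method, @args) = @_;
    my $invoker = $invoke{$style} or die "unknown call style: $style";
    return $invoker->($client, $method, @args);
}

for my $case ([ call   => StubCallClient->new ],
              [ lite   => StubLiteClient->new ],
              [ simple => StubSimpleClient->new ]) {
    my ($style, $client) = @$case;
    my $resp = rpc_call($style, $client, 'meerkat.getItems', { num_items => 1 });
    print "${style}: id $resp->[0]{id}\n";
}
```

With a wrapper of this kind, functions such as resolve_name and get_data would not need to change when the toolkit does; only the client construction would.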
Chapter 5: Introduction to SOAP
The Simple Object Access Protocol (SOAP) is the basis for the W3C design for web services. The specifications that make up SOAP cover the expression of data, what to require for communication, and how to link messages with communication layers. When using toolkits (introduced in Chapter 6), it is often not necessary to know more than the basic elements of SOAP. But the more thorough your understanding, the more easily you can develop better, more efficient applications.
This chapter introduces the basic parts of SOAP, and illustrates how they work together to create a platform for distributed application development. The focus of this chapter is the XML that implements SOAP requests and responses and how the SOAP specification is built from other XML technologies, such as XML Schema. Later chapters build SOAP-enabled applications with toolkits, but to make full use of the toolkits, you must know the constraints of the SOAP protocol that they implement.
Background
Where CORBA and COM+ followed RPC, SOAP and XML-RPC are more like its contemporaries. While SOAP was going through a very thorough process of requirements gathering and design analysis, XML-RPC was spun off from an early draft of SOAP.
The time spent in design and planning was far from squandered. The resulting specification from the working group is very flexible and leaves a considerable amount of room for expansion of the protocol. Expansion and extension of the protocol may come from added XML applications bundled into the encoding layer, in either (or both) of the message header and body. Beyond this, the protocol itself is kept by version and maintained as a W3C Technical Recommendation, the term used by the W3C for standards that have been adopted.
At the time of this writing, the Technical Recommendation (TR) of the specification is at Version 1.1, with Version 1.2 currently very close to acceptance as the new current TR. By virtue of the clear definitions of versions and capabilities of the specification, an application (usually at the server end) can tell almost immediately whether it can handle a given request well before reading the entire message.
As this chapter outlines the elements and structure of SOAP communications, there will occasionally be the need to highlight a significant difference between the two versions of the protocol. Look for such differences in paragraphs like this.
SOAP offers a lot more capability and expressiveness than XML-RPC. In addition to enabling remote procedure calls, it also supports a document model in which messages are more free-form and not as strictly bound to the typical call/response pattern. While it doesn't inherently provide features such as security, authentication, or transactions, it allows these concepts to be layered over the basic SOAP elements.
SOAP allows for conversations to be designed around more than one server, or
XML Definitions
This chapter introduces the structure and mechanics of SOAP and SOAP messages. In practice, the majority of the work behind creating SOAP messages and disassembling them on the receiving end is done using a toolkit such as SOAP::Lite, which will be introduced in Chapter 6. Understanding the form and function of SOAP messages helps you understand the functionality of the toolkit components. This chapter is by no means a complete overview of SOAP. The number (and length) of books devoted to the topic of SOAP is a testimony to the depth of the subject.
Because SOAP is by its very nature a more complex protocol than XML-RPC, it should be no surprise that the depth of the XML it uses is appropriately more complex. One of the key differences between parsing SOAP messages as opposed to XML-RPC is the need for support of XML namespaces. SOAP not only uses namespaces for their original purpose of mixing document-type elements, it also uses them to distinguish between different versions of the specification.
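As a rough illustration of using the namespace to distinguish versions, the sketch below reads the namespace URI bound to the Envelope element and maps it to a version number. The soap_version function is a hypothetical helper, not part of any toolkit. The 1.1 URI is the one defined by that specification; the 1.2 URI shown is that of the final W3C Recommendation, which postdates the drafts discussed in this chapter. The regular expressions are an assumption made to keep the sketch dependency-free; a real application would use a namespace-aware XML parser.

```perl
use strict;
use warnings;

# Envelope namespace URIs and the SOAP version each one identifies.
my %SOAP_VERSION = (
    'http://schemas.xmlsoap.org/soap/envelope/' => '1.1',
    'http://www.w3.org/2003/05/soap-envelope'   => '1.2',
);

# Hypothetical helper: return '1.1', '1.2', or undef if not recognized.
sub soap_version {
    my ($xml) = @_;

    # Locate the Envelope start tag; the prefix is optional.
    my ($prefix, $attrs) = $xml =~ /<(?:([\w.-]+):)?Envelope\b([^>]*)>/
        or return;

    # Find the namespace declaration that matches the prefix in use.
    my $decl = defined $prefix ? "xmlns:\Q$prefix\E" : 'xmlns';
    my ($quote, $uri) = $attrs =~ /(?:^|\s)$decl\s*=\s*(["'])(.*?)\1/
        or return;

    return $SOAP_VERSION{$uri};
}

my $message = <<'END_XML';
<?xml version="1.0"?>
<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Body/>
</SOAP-ENV:Envelope>
END_XML

my $version = soap_version($message);
print defined $version ? "SOAP $version\n" : "not a recognized SOAP message\n";
```

Because the namespace appears on the outermost element, a check like this needs only the first few lines of a message.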
Another important factor in processing a typical SOAP message is that the specification requires that the message contain no DTD declaration and no processing instructions. This may limit the ability to apply other technologies, such as XSLT (XSL Transformations), to either the request or response messages. The current draft proposals for SOAP 1.2 allow processing instructions to be present but mandate that receivers ignore them. This allows other processing, such as XSLT, without burdening the servers.
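A receiver could screen incoming messages for these constructs before handing them on. The following is a minimal sketch under the assumption that a plain pattern match is good enough; the forbidden_constructs function is a hypothetical helper, and the patterns do not account for the same character sequences appearing inside comments or CDATA sections. Note that the XML declaration at the top of a document is not a processing instruction and is deliberately skipped.

```perl
use strict;
use warnings;

# Hypothetical helper: list the constructs that the SOAP 1.1 rules forbid
# and that appear in the message. An empty list means none were found.
sub forbidden_constructs {
    my ($xml) = @_;
    my @found;

    push @found, 'document type declaration'
        if $xml =~ /<!DOCTYPE\b/;

    # Match "<?target", except the XML declaration "<?xml version=...?>".
    push @found, 'processing instruction'
        if $xml =~ /<\?(?!xml[\s?])/;

    return @found;
}

my $message = <<'END_XML';
<?xml version="1.0"?>
<?xml-stylesheet href="view.xsl" type="text/xsl"?>
<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Body/>
</SOAP-ENV:Envelope>
END_XML

my @problems = forbidden_constructs($message);
print @problems ? "rejected: @problems\n" : "accepted\n";
```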
Despite these minor differences, a SOAP message is still at its core just another XML document. As such, it is self-describing and (in most cases) very readable by the average viewer. Thoughtful and consistent labeling of the namespaces can also add to the readability.
In the simplest of terms, a SOAP message is made up of the following parts:
RPC over SOAP
One of the most common applications of SOAP is to provide Remote Procedure Call (RPC) functionality. While SOAP goes beyond the goals of either XML-RPC or the original RPC itself, it is a simple fact that a large number of existing systems built around the RPC model are still in use. New applications are expected to communicate with these legacy systems in addition to any extra abilities they may offer.
The mechanics of implementing RPC via SOAP aren't at all difficult. Looking to XML-RPC as a model for encoding requests, responses, and the requisite data, adapting this protocol to run within the realm of SOAP is a fairly direct task. Both SOAP protocol versions specifically address this issue.
There is more to providing this functionality than merely taking an XML-RPC message and wrapping SOAP envelope elements around it. The SOAP RPC capabilities aren't direct mirrors of XML-RPC. The features of SOAP (named parameters, distinct separation of message header and body) are used to full advantage in the RPC framework.
To contain RPC functionality, a message has to provide the information that sets up the call and defines the result. Besides a URI to the target SOAP node, the call must provide the procedure name (referred to in some documents as the method) and the parameters for the call. Additionally, some information may be present that, while not required, adds further detail and specifics to the message.
From the viewpoint of the SOAP specification itself, a signature for the procedure or method is optional. Whether the node acting as the RPC server requires one is a different issue. A signature defines the type information for the parameters coming in to the procedure call, as well as for the return value if the procedure returns anything. Some servers require this information to provide polymorphic procedures and methods to their clients. The typing information allows the calls to be properly dispatched to the matching version of the code. It also lets a server detect badly formed calls without risking more serious runtime faults.
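To make these pieces concrete, the sketch below assembles a SOAP 1.1 RPC request by hand: the procedure name becomes an element in the message body, and each named parameter becomes a child element carrying an xsi:type attribute as its type information. The rpc_request and xml_escape functions, the urn:example:quotes namespace, and the getQuote procedure are all hypothetical, chosen only for illustration. In practice a toolkit generates this XML.

```perl
use strict;
use warnings;

# Escape the characters that are special in XML text and attribute values.
sub xml_escape {
    my ($text) = @_;
    my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $text =~ s/([&<>"])/$entity{$1}/g;
    return $text;
}

# Hypothetical helper: build a SOAP 1.1 RPC request. Each parameter is an
# array reference of [ name, XML Schema type, value ].
sub rpc_request {
    my ($method_ns, $method, @params) = @_;

    my $args = join '', map {
        my ($name, $type, $value) = @$_;
        sprintf qq{      <%s xsi:type="xsd:%s">%s</%s>\n},
            $name, $type, xml_escape($value), $name;
    } @params;
    my $ns = xml_escape($method_ns);

    return <<"END_XML";
<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <SOAP-ENV:Body>
    <m:$method xmlns:m="$ns">
$args    </m:$method>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
END_XML
}

print rpc_request('urn:example:quotes', 'getQuote',
                  [ symbol => 'string', 'ORA' ],
                  [ shares => 'int',    100   ]);
```

Unlike XML-RPC, where parameters are identified only by position, each parameter here carries a name as well as a type.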
SOAP Transport
This section briefly covers some of the different methods of transporting SOAP messages. Later chapters go into greater depth on some of these (such as HTTP and SMTP).
Arguably the most significant difference between SOAP and XML-RPC is the fact that the designers of SOAP refrained from binding it to a specific transport protocol. While both versions of the protocol refer heavily to the application of HTTP as a method of transport, this isn't the only available option. SOAP has been demonstrated using the Simple Mail Transfer Protocol (SMTP), the Jabber wire protocol, Microsoft's .NET framework, and others. Guidelines for protocols such as the Blocks Extensible Exchange Protocol (BEEP), raw TCP/IP, and even FTP have been produced in various levels of maturity and acceptance.
For each protocol, one thing is common: each must define not only how messages are physically sent but also the terms and conditions for sending a message along that route. In most cases, the protocol definitions themselves cover the "how" element. What remains to be defined is what extra material must be added to a message to enable the transport.
Using HTTP as an example, the specification that provides the binding for SOAP over HTTP is responsible for outlining a number of elements:
  • Which HTTP methods are used or supported?
  • What role does the request URI play in defining the service?
  • What should the Content-Type header (and any other relevant headers) look like?
  • How is a SOAP message encapsulated into the HTTP request?
Fortunately, the binding for HTTP is a part of the specification in both SOAP 1.1 and 1.2. Because of this, the binding is useful as both a practical application and a roadmap for defining new bindings.
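For SOAP 1.1, the answers to those questions can be seen in the shape of a single request: the message is sent with the POST method, the request URI identifies the service, the Content-Type is text/xml, a SOAPAction header accompanies the request, and the envelope is the request body. The following sketch assembles such a request as text. The soap_http_request function, the host, the path, and the action URI are hypothetical, and a toolkit or an HTTP library such as LWP would normally do this work. (SOAP 1.2 uses the application/soap+xml media type instead.)

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: return the text of an HTTP request that carries
# a SOAP 1.1 envelope as its body.
sub soap_http_request {
    my ($host, $path, $soap_action, $envelope) = @_;

    # Content-Length counts bytes, so encode the envelope first.
    my $body = encode('UTF-8', $envelope);

    my @lines = (
        "POST $path HTTP/1.1",
        "Host: $host",
        'Content-Type: text/xml; charset=utf-8',
        'Content-Length: ' . length($body),
        qq{SOAPAction: "$soap_action"},
    );

    # Header lines end with CRLF; a blank line separates them from the body.
    return join("\r\n", @lines) . "\r\n\r\n" . $body;
}

my $envelope = <<'END_XML';
<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Body/>
</SOAP-ENV:Envelope>
END_XML

print soap_http_request('www.example.com', '/soap',
                        'urn:example:quotes#getQuote', $envelope);
```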