RSS 0.91 and 0.92 feeds are created in the same way — the additional elements found in 0.92 are well handled by the existing RSS tools.
Of course, you can always hand-code your RSS feed. Doing so certainly gets you on top of the standard, but it’s neither convenient, quick, nor recommended. Ordinarily, feeds are created by a small program in one of the scripting languages: Perl, PHP, Python, etc. Many CMSs already create RSS feeds automatically, but you may want to create a feed in another context. Hey, you might even write your own CMS!
There are various ways to create a feed, all of which are used in real life:
- XML transformation
Running a transformation on an XML master document to convert the relevant parts into RSS. This technique is used in Apache AxKit-based systems, for example.
- Templates
Substituting values within an RSS feed template. This technique is used within the Movable Type weblogging platform, for example.
- An RSS-specific module or class
Used within hundreds of little ad hoc scripts across the Net, for example.
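To make the second approach concrete, here is a minimal sketch of template substitution using nothing but core Perl. The `{{name}}` placeholder syntax and the `fill_template` and `xml_escape` helpers are invented for this illustration; they are not Movable Type's template tags or any real templating module.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Escape the characters that must never appear raw in XML text.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Replace each {{name}} placeholder with the escaped value of that name.
# Unknown placeholders are replaced with an empty string.
sub fill_template {
    my ($template, %values) = @_;
    $template =~ s/\{\{(\w+)\}\}/exists $values{$1} ? xml_escape($values{$1}) : ''/ge;
    return $template;
}

# A hypothetical item template.
my $item_template = <<'END';
<item>
<title>{{title}}</title>
<link>{{link}}</link>
</item>
END

print fill_template(
    $item_template,
    title => 'Cats & Dogs',
    link  => 'http://www.example.org/entry1',
);
```

Escaping at substitution time keeps the template itself readable while making it harder to emit malformed XML by accident.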
We’ll look at all three of these methods, but let’s start with the third, using an RSS-specific module. In this case, it’s Perl’s XML::RSS.
Jonathan Eisenzopf’s XML::RSS module for Perl is one of the key tools in the Perl RSS world. It is built on top of XML::Parser (the basis for many Perl XML modules) and it is object-oriented. XML::RSS also supports creating RSS 1.0 and parsing existing feeds, but in this section we will deal only with its 0.91 creation capabilities. Currently, it does not support the additional elements within RSS 0.92.
Example 4-4 shows a simple Perl script that creates the feed shown in Example 4-5.
Example 4-4. A sample XML::RSS script
#!/usr/local/bin/perl -w
## Chapter 4, Example 1.
## Create an example RSS 0.91 feed

use XML::RSS;

my $rss = new XML::RSS (version => '0.91');

$rss->channel(title         => 'The Title of the Feed',
              link          => 'http://www.oreilly.com/example/',
              language      => 'en',
              description   => 'An example feed created by XML::RSS',
              lastBuildDate => 'Tue, 04 Jun 2002 16:20:26 GMT',
              docs          => 'http://backend.userland.com/rss092',
              );

$rss->image(title       => 'Oreilly',
            url         => 'http://meerkat.oreillynet.com/icons/meerkat-powered.jpg',
            link        => 'http://www.oreilly.com/example/',
            width       => 88,
            height      => 31,
            description => 'A nice logo for the feed'
            );

$rss->textinput(title       => "Search",
                description => "Search the site",
                name        => "query",
                link        => "http://www.oreilly.com/example/search.cgi"
                );

$rss->add_item(
               title       => "Example Entry 1",
               link        => "http://www.oreilly.com/example/entry1",
               description => 'blah blah',
               );

$rss->add_item(
               title       => "Example Entry 2",
               link        => "http://www.oreilly.com/example/entry2",
               description => 'blah blah'
               );

$rss->add_item(
               title       => "Example Entry 3",
               link        => "http://www.oreilly.com/example/entry3",
               description => 'blah blah'
               );

$rss->save("example.rss");
Example 4-5. The resultant RSS 0.91 feed
<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN"
          "http://my.netscape.com/publish/formats/rss-0.91.dtd">

<rss version="0.91">

<channel>
  <title>The Title of the Feed</title>
  <link>http://www.oreilly.com/example/</link>
  <description>An example feed created by XML::RSS</description>
  <language>en</language>
  <lastBuildDate>Tue, 04 Jun 2002 16:20:26</lastBuildDate>
  <docs>http://backend.userland.com/rss092</docs>

  <image>
    <title>Oreilly</title>
    <url>http://meerkat.oreillynet.com/icons/meerkat-powered.jpg</url>
    <link>http://www.oreilly.com/example/</link>
    <width>88</width>
    <height>31</height>
    <description>A nice logo for the feed</description>
  </image>

  <item>
    <title>Example Entry 1</title>
    <link>http://www.oreilly.com/example/entry1</link>
    <description>blah blah</description>
  </item>

  <item>
    <title>Example Entry 2</title>
    <link>http://www.oreilly.com/example/entry2</link>
    <description>blah blah</description>
  </item>

  <item>
    <title>Example Entry 3</title>
    <link>http://www.oreilly.com/example/entry3</link>
    <description>blah blah</description>
  </item>

  <textinput>
    <title>Search</title>
    <description>Search the site</description>
    <name>query</name>
    <link>http://www.oreilly.com/example/search.cgi</link>
  </textinput>

</channel>
</rss>
After the required Perl module declaration, we create a new instance of XML::RSS, like so:
my $rss = new XML::RSS (version => '0.91');
The new method returns a reference to the new XML::RSS object. It can take three arguments, two of which we are interested in here:
new XML::RSS (version=>$version, encoding=>$encoding);
The version attribute refers to the version of RSS you want to make (either '0.91' or '1.0'), and the encoding attribute sets the encoding of the XML declaration. The default encoding is UTF-8.
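As a quick illustration of those two arguments, the following sketch asks for an RSS 0.91 feed with a non-default encoding and prints the result instead of saving it. The channel values are placeholders chosen for this example, and it assumes an XML::RSS installation that accepts the encoding argument as described above.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use XML::RSS;

# Ask for RSS 0.91 and override the default UTF-8 encoding.
my $rss = XML::RSS->new(version => '0.91', encoding => 'ISO-8859-1');

# Placeholder channel details, for illustration only.
$rss->channel(
    title       => 'Encoding demo',
    link        => 'http://www.example.org/',
    description => 'A feed declared as ISO-8859-1',
    language    => 'en',
);

# as_string returns the feed as a string rather than writing a file.
my $xml = $rss->as_string;
print $xml;
```

The encoding argument only changes what the XML declaration claims; it is still up to you to make sure the strings you pass in really are in that encoding.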
The rest of the script is quite self-explanatory. The methods channel, image, textinput, and add_item all add new elements and associated values to the feed you are creating, and the $rss->save method saves the created feed as a file.
In Example 4-4, we’re passing known strings to the module. Therefore, it is not of much use as a script; we need to add a more dynamic form of data, or the feed will be very boring indeed.
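As a small first step in that direction, here is a sketch in which the items come from a Perl data structure rather than from hard-coded add_item calls. The @entries array and everything in it is made up for illustration; in a real script it might be filled from a database query or a directory listing.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use XML::RSS;

# Hypothetical data standing in for whatever your system produces.
my @entries = (
    { title => 'First entry',  link => 'http://www.example.org/1', description => 'One'   },
    { title => 'Second entry', link => 'http://www.example.org/2', description => 'Two'   },
    { title => 'Third entry',  link => 'http://www.example.org/3', description => 'Three' },
);

my $rss = XML::RSS->new(version => '0.91');

$rss->channel(
    title       => 'A data-driven feed',
    link        => 'http://www.example.org/',
    description => 'Items generated from a Perl data structure',
    language    => 'en',
);

# One add_item call per entry; each hash supplies the named arguments.
for my $entry (@entries) {
    $rss->add_item(%$entry);
}

my $xml = $rss->as_string;
print $xml;
```

Swapping the contents of @entries for live data is then the only change needed to keep the feed current.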
In the absence of a generalized publishing system to play with, let’s use Google’s SOAP API. This web-services interface was released with much fanfare in April 2002, and at the time of this writing it is still an experimental affair. It may even be defunct by the time you read this book, but you’ll get the idea.
The Google API requires a developer’s key. This is readily available (again, at the time of this writing) from http://www.google.com/apis. I have left it out of the code here, as daily usage is limited and I’m fond of my own. You will also need to grab Google’s WSDL file, which the SOAP::Lite module requires.
The script in Example 4-6 is designed to be run from a web browser. It takes two parameters — the query and the Google API key — so the URL would look something like this:
http://example.org/googlerss.cgi?q=queryHere&k=YourVeryOwnGoogleKeyHere
Example 4-6. googlerss.cgi Google API to RSS using Perl
#!/usr/local/bin/perl -w

use strict;
use SOAP::Lite;
use XML::RSS;
use CGI qw(:standard);
use HTML::Entities ();

# Set up the query term from the cgi input
my $query = param("q");
my $key   = param("k");

# Initialise the SOAP interface
my $service = SOAP::Lite->service('http://api.google.com/GoogleSearch.wsdl');

# Run the search
my $result = $service->doGoogleSearch($key, $query, 0, 10, "false", "",
                                      "false", "", "latin1", "latin1");

# Create the new RSS object
my $rss = new XML::RSS (version => '0.91');

# Add in the RSS channel data
$rss->channel(title       => "Google Search for $query",
              link        => "http://www.google.com/search?q=$query",
              description => "Google search for $query",
              language    => "en",
              );

# Add in the required image
# (The link below is single-quoted, so $query is not interpolated
# and appears literally in the output; use double quotes to interpolate.)
$rss->image(title       => 'Google2RSS',
            url         => 'http://www.example.org/icons/google2rss.jpg',
            link        => 'http://www.google.com/search?q=$query',
            width       => 88,
            height      => 31,
            description => 'Google2RSS'
            );

# Create each of the items
foreach my $element (@{$result->{'resultElements'}}) {
    $rss->add_item(
                   title => HTML::Entities::encode($element->{'title'}),
                   link  => HTML::Entities::encode($element->{'URL'})
                   );
}

# Print out the RSS
print header('application/xml+rss'), $rss->as_string;
Example 4-7 shows the RSS file created by the script in Example 4-6.
Example 4-7. The resultant RSS file from the Google script, searching for RSS
<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN"
          "http://my.netscape.com/publish/formats/rss-0.91.dtd">

<rss version="0.91">

<channel>
  <title>Google Search for RSS</title>
  <link>http://www.google.com/search?q=RSS</link>
  <description>Google search for RSS</description>

  <image>
    <title>Google2RSS</title>
    <url>http://www.example.org/icons/google2rss.jpg</url>
    <link>http://www.google.com/search?q=$query</link>
    <width>88</width>
    <height>31</height>
    <description>Google2RSS</description>
  </image>

  <item>
    <title>MAPS &lt;b&gt;RSS&lt;/b&gt;</title>
    <link>http://work-rss.mail-abuse.org/rss/</link>
  </item>

  <item>
    <title>Yahoo! Groups</title>
    <link>http://www.purl.org/rss/1.0/</link>
  </item>

  <item>
    <title>&lt;b&gt;RSS&lt;/b&gt; 0.92</title>
    <link>http://backend.userland.com/rss092</link>
  </item>

  <item>
    <title>&lt;b&gt;RSS&lt;/b&gt; 0.91</title>
    <link>http://backend.userland.com/stories/rss091</link>
  </item>

  <item>
    <title>Royal Statistical Society</title>
    <link>http://www.rss.org.uk/</link>
  </item>

  <item>
    <title>Latest &lt;b&gt;RSS&lt;/b&gt; News (&lt;b&gt;RSS&lt;/b&gt; Info)</title>
    <link>http://blogspace.com/rss/</link>
  </item>

  <item>
    <title>Yahoo! Groups</title>
    <link>http://groups.yahoo.com/group/rss-dev/files/specification.html</link>
  </item>

  <item>
    <title>Yahoo! Groups : &lt;b&gt;rss&lt;/b&gt;-dev</title>
    <link>http://groups.yahoo.com/group/rss-dev/</link>
  </item>

  <item>
    <title>Yahoo! Groups</title>
    <link>http://groups.yahoo.com/files/rss-dev/specification.html</link>
  </item>

  <item>
    <title>O'Reilly Network: &lt;b&gt;RSS&lt;/b&gt; DevCenter</title>
    <link>http://www.oreillynet.com/rss/</link>
  </item>

</channel>
</rss>
Walking through the script in Example 4-6, we see it loads the required modules and then sets up the CGI parameters. The SOAP interface is initialized, and the query is sent via the method doGoogleSearch.
At this point, $result holds the results returned by Google; the individual hits are in its resultElements array. We leave it there for a moment and initialize XML::RSS as before. We add the required channel and image details, in this case using the $query string to make the description more interesting.
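One caveat when dropping a user-supplied string such as $query into a link: characters such as spaces and ampersands are not valid in a URL query as-is. The sketch below shows one way to percent-encode the value first, using only core Perl. The escape_query helper is hypothetical and written for this illustration; it is not part of Example 4-6, and a CPAN module could do the same job.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Percent-encode everything except the URL "unreserved" characters.
sub escape_query {
    my ($text) = @_;
    utf8::encode($text);    # operate on UTF-8 bytes, not characters
    $text =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $text;
}

my $query = 'rss & xml';
my $link  = 'http://www.google.com/search?q=' . escape_query($query);

print "$link\n";    # prints http://www.google.com/search?q=rss%20%26%20xml
```

The human-readable title and description can keep the raw query text; only the link needs the encoded form.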
Google’s SOAP API returns only ten results by default, so there is no need to add any limit to the number of item elements in the Google results. A simple foreach loop is enough to deal with the results. But beware! Google’s results contain HTML that has not been entity-encoded: we have to whiz the relevant data through HTML::Entities::encode, or the angle brackets will come out unencoded. Unencoded brackets are not allowed in any form of RSS. (For a complete run-down of correct XML form, see Appendix A.)
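To see what that encoding step does in isolation, here is a short sketch that runs one title-like string through HTML::Entities::encode. The sample string is invented for the example.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use HTML::Entities ();

# A made-up title containing the kind of markup Google returns.
my $raw  = 'Latest <b>RSS</b> News & Info';
my $safe = HTML::Entities::encode($raw);

print "$safe\n";    # prints Latest &lt;b&gt;RSS&lt;/b&gt; News &amp; Info
```

The angle brackets and the ampersand are replaced with entities, so the string can sit inside a title element without breaking the XML.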
After that, it’s really just a matter of returning the RSS in the correct manner. Note that the script gives the returned file a MIME type of application/xml+rss; the form more commonly seen in practice is application/rss+xml.
So there it is: a dynamically created RSS feed from a SOAP interface. Other inputs could be included, obviously. For example, we could include a few lines to add a lastBuildDate.
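Those few lines might look something like the following sketch, which builds a date string in the same format as the lastBuildDate used in Example 4-4. The rfc822_date helper is a name made up for this illustration. It spells out the day and month names itself rather than relying on strftime, so the output does not change with the server's locale.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

my @DAYS   = qw(Sun Mon Tue Wed Thu Fri Sat);
my @MONTHS = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# Format an epoch time (default: now) like 'Tue, 04 Jun 2002 16:20:26 GMT'.
sub rfc822_date {
    my ($epoch) = @_;
    $epoch = time unless defined $epoch;
    my ($sec, $min, $hour, $mday, $mon, $year, $wday) = gmtime($epoch);
    return sprintf('%s, %02d %s %04d %02d:%02d:%02d GMT',
                   $DAYS[$wday], $mday, $MONTHS[$mon], $year + 1900,
                   $hour, $min, $sec);
}

# The result could be passed to $rss->channel as lastBuildDate.
print rfc822_date(), "\n";
```

In the Google script, the returned string would simply be added to the existing channel call alongside title, link, and description.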
When we move on to RSS 1.0, we’ll look at building RSS feeds from multiple data sources, but for that we will have to wait for Chapter 6.
Because of its relatively limited nature, RSS 0.9x tends to be used for simple feeds of simple content. Therefore, RSS 0.9x is usually created automatically by the CMS (blogging software is a prime example).