The W3C Validator to RSS
Of all the tasks of Hercules, the one where he had to keep his web site’s XHTML validated was the hardest. Without wanting to approach the whole Valid XHTML Controversy, we can still safely say that keeping a site validated is a pain. You have to validate your code, most commonly using the W3C validator service at http://validator.w3.org, and you have to keep going back there to make sure nothing has broken.
You have to do that unless, of course, you’re subscribed to a feed of validation results. This script does just that, providing an RSS interface to the W3C validator.
You pass the URL you want to test as a query in the feed URL, like so:
http://www.example.org/validator.cgi?url=http://www.example.org/index.html
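A page address that carries its own query string needs percent-encoding before it goes into the url parameter, or its & and = characters will be read as parameters of validator.cgi itself. The following is only a sketch of one way to build the feed URL: feed_url_for is a hypothetical helper, not part of the script, and it assumes URI::Escape is available (it ships with the URI distribution that LWP depends on).

```perl
use strict;
use warnings;
use URI::Escape qw(uri_escape);

# Hypothetical helper (not part of the original script): build the
# subscription URL for a page, percent-encoding the page address so it
# survives as a single query parameter.
sub feed_url_for {
    my ($feed_script, $page_url) = @_;
    return $feed_script . '?url=' . uri_escape($page_url);
}

print feed_url_for(
    'http://www.example.org/validator.cgi',
    'http://www.example.org/index.html?lang=en&page=2',
), "\n";
```

CGI decodes the parameter again on the way in, so the script still sees the original address when it calls param('url').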
Walking Through the Code
We start with the traditional Perl opening, then load XML::RSS, CGI, LWP::Simple, and XML::Simple; the last of these will parse the results coming back from the validator. Note the classic gotcha: LWP::Simple and CGI (loaded with :standard) both export a function named head, so we ask LWP::Simple for get alone. That avoids the name clash and the "Prototype mismatch" warning it produces.
    use warnings;
    use strict;
    use XML::RSS;
    use CGI qw(:standard);
    use LWP::Simple 'get';   # import get only; head would clash with CGI
    use XML::Simple;
Now, grab the URL from the query string, and use LWP::Simple to retrieve the results. The W3C provides an XML output mode for the validator, and this is what we’re using here. It is, however, classed as beta and flaky, and might not always work.
    my $cgi = CGI->new;
    my $url = $cgi->param('url');
    my $validator_results_in_xml =
        get("http://validator.w3.org/check?uri=$url;output=xml");
    ...
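Because the XML output mode might not always work, it can help to guard the fetch. LWP::Simple's get returns undef on failure, so the result is worth checking before it is handed to a parser. What follows is a minimal sketch rather than the script's own code: validator_request_url and fetch_results are hypothetical helpers, the error messages are invented for illustration, and URI::Escape is assumed to be installed.

```perl
use strict;
use warnings;
use URI::Escape qw(uri_escape);

# Hypothetical helper: build the validator request URL for a page,
# percent-encoding the page address before interpolating it.
sub validator_request_url {
    my ($page_url) = @_;
    return 'http://validator.w3.org/check?uri='
         . uri_escape($page_url)
         . ';output=xml';
}

# Hypothetical helper: fetch the results or die with a readable message.
# $fetch is any code ref that returns the response body, or undef on
# failure, which is how LWP::Simple's get behaves.
sub fetch_results {
    my ($page_url, $fetch) = @_;
    die "No url parameter supplied\n"
        unless defined $page_url && length $page_url;
    my $xml = $fetch->( validator_request_url($page_url) );
    die "The validator returned nothing for $page_url\n"
        unless defined $xml;
    return $xml;
}
```

In the script this could be called as fetch_results(scalar $cgi->param('url'), \&get). The scalar matters: in list context param returns an empty list when the parameter is missing, which would shift the code ref into the first argument. Passing the fetcher in as a code ref is a design choice made here so the logic can be exercised without touching the network.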