6.3. Namespace-Aware Parsing
Problem
You need to parse an XML document with multiple namespaces.
Solution
Use Digester to parse XML with multiple namespaces by calling digester.setNamespaceAware(true) and supplying two RuleSet objects, one to parse the elements in each namespace. Consider the following document, which contains elements from two namespaces, http://discursive.com/page and http://discursive.com/person:
<?xml version="1.0"?>
<pages xmlns="http://discursive.com/page"
       xmlns:person="http://discursive.com/person">
  <page type="standard">
    <person:person firstName="Al" lastName="Gore">
      <person:role>Co-author</person:role>
    </person:person>
    <person:person firstName="George" lastName="Bush">
      <person:role>Co-author</person:role>
    </person:person>
  </page>
</pages>
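To make concrete what namespace awareness changes, here is a minimal sketch that uses only the JDK's SAX parser, not Digester. It parses the document above with SAXParserFactory.setNamespaceAware(true) and counts the elements reported under each namespace URI. The class name NamespaceCount, the SAMPLE constant, and the countByNamespace method are hypothetical names chosen for this illustration.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration; plain JDK SAX, no Digester involved.
class NamespaceCount {

    // The sample document from this recipe, inlined as a string.
    static final String SAMPLE =
        "<?xml version=\"1.0\"?>"
        + "<pages xmlns=\"http://discursive.com/page\""
        + " xmlns:person=\"http://discursive.com/person\">"
        + "<page type=\"standard\">"
        + "<person:person firstName=\"Al\" lastName=\"Gore\">"
        + "<person:role>Co-author</person:role>"
        + "</person:person>"
        + "<person:person firstName=\"George\" lastName=\"Bush\">"
        + "<person:role>Co-author</person:role>"
        + "</person:person>"
        + "</page>"
        + "</pages>";

    // Counts start tags per namespace URI.
    static Map<String, Integer> countByNamespace(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        // Without this call the handler would not receive namespace URIs.
        factory.setNamespaceAware(true);
        SAXParser parser = factory.newSAXParser();

        final Map<String, Integer> counts = new TreeMap<>();
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                // uri is the namespace the element belongs to;
                // localName is the element name without its prefix.
                counts.merge(uri, 1, Integer::sum);
            }
        });
        return counts;
    }

    public static void main(String[] args) throws Exception {
        for (Map.Entry<String, Integer> entry : countByNamespace(SAMPLE).entrySet()) {
            System.out.println(entry.getKey() + " " + entry.getValue());
        }
    }
}
```

For the sample document, the handler sees two elements in http://discursive.com/page (pages and page) and four in http://discursive.com/person (two person elements and two role elements). Matching rules per namespace, as the RuleSet objects in this recipe do, depends on the parser reporting these URIs.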
To parse this XML document with the Digester, you need to create a separate set of rules for each namespace, adding each RuleSet object to the Digester with addRuleSet( ). A RuleSet adds Rule objects to an instance of Digester. By extending the RuleSetBase class and setting namespaceURI in the default constructor, the following class, PersonRuleSet, defines rules to parse the http://discursive.com/person namespace:
import org.apache.commons.digester.Digester;
import org.apache.commons.digester.RuleSetBase;

public class PersonRuleSet extends RuleSetBase {

    public PersonRuleSet( ) {
        this.namespaceURI = "http://discursive.com/person";
    }

    public void addRuleInstances(Digester digester) {
        digester.addObjectCreate("*/person", Person.class);
        ...