158 Extremely pureXML in DB2 10 for z/OS
8.1 XML representation in COBOL
COBOL offers several options for working with XML data. Most importantly, the pureXML data
type in DB2 is supported through several host variable variations, although in some cases
ordinary binary or character-based types can also be used. In addition, file reference variables
can be used with both LOBs and XML.
Because XML is always stored in Unicode, pay special attention to which code pages the
application uses and to how unnecessary data conversion can be avoided.
We use the DB2 precompiler throughout this chapter. Experiences might vary slightly if using
8.1.1 XML host variables in COBOL
In DB2, the data type XML is a basic data type with its own representation and associated
simple functions. You can insert, modify, and retrieve data as XML. In COBOL, the XML data
type always builds on one of the existing LOB formats. XML variable declarations built on the
basic LOB types are shown in Example 8-1.
Example 8-1 XML host variables in COBOL
01 DOC-AS-CLOB USAGE IS SQL TYPE IS XML AS CLOB(100K).
01 DOC-AS-BLOB USAGE IS SQL TYPE IS XML AS BLOB(100K).
01 DOC-AS-DBCLOB USAGE IS SQL TYPE IS XML AS DBCLOB(100K).
These declarations are translated by the precompiler, as shown in Example 8-2.
Example 8-2 XML host variables after transformation by the DB2 pre-compiler
01 DOC-AS-CLOB.
   49 DOC-AS-CLOB-LENGTH PIC S9(9) COMP-5.
   49 DOC-AS-CLOB-DATA PIC X(102400).
01 DOC-AS-BLOB.
   49 DOC-AS-BLOB-LENGTH PIC S9(9) COMP-5.
   49 DOC-AS-BLOB-DATA PIC X(102400).
01 DOC-AS-DBCLOB.
   49 DOC-AS-DBCLOB-LENGTH PIC S9(9) COMP-5.
   49 DOC-AS-DBCLOB-DATA PIC G(102400) DISPLAY-1.
Whether you use CLOB, DBCLOB, or BLOB as the base format depends on whether and how
you want to manipulate the contents of the XML document in the application.
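The following sketch shows one way that a character-based XML host variable might be used
in embedded SQL. It is an illustration only and rests on several assumptions: the table
XMLDOCS with the columns ID (INTEGER) and DOC (XML), the program name, and the sample
document are all hypothetical, and the program is assumed to be prepared with the DB2
precompiler. The fields DOC-AS-CLOB-LENGTH and DOC-AS-CLOB-DATA are the ones that the
precompiler generates from the declaration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLCLOB.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
           EXEC SQL INCLUDE SQLCA END-EXEC.
      * Key of the hypothetical XMLDOCS table.
       01  DOC-ID       PIC S9(9) COMP-5 VALUE 1.
      * XML host variable built on the CLOB base format.
       01  DOC-AS-CLOB  USAGE IS SQL TYPE IS XML AS CLOB(100K).
      * Sample document in the application code page.
       01  XML-TEXT     PIC X(31) VALUE
           '<order><item>pen</item></order>'.
       PROCEDURE DIVISION.
      * Fill the generated data and length fields.
           MOVE XML-TEXT TO DOC-AS-CLOB-DATA
           MOVE LENGTH OF XML-TEXT TO DOC-AS-CLOB-LENGTH
      * Insert the document into the XML column.
           EXEC SQL
               INSERT INTO XMLDOCS (ID, DOC)
               VALUES (:DOC-ID, :DOC-AS-CLOB)
           END-EXEC
           IF SQLCODE NOT = 0
               DISPLAY 'INSERT FAILED, SQLCODE = ' SQLCODE
               GOBACK
           END-IF
      * Read the document back into the same host variable.
           EXEC SQL
               SELECT DOC INTO :DOC-AS-CLOB
               FROM XMLDOCS
               WHERE ID = :DOC-ID
           END-EXEC
           IF SQLCODE = 0
               DISPLAY DOC-AS-CLOB-DATA(1:DOC-AS-CLOB-LENGTH)
           ELSE
               DISPLAY 'SELECT FAILED, SQLCODE = ' SQLCODE
           END-IF
           GOBACK.
```

Because the base format here is CLOB, the application can handle the document as ordinary
character data in its own code page. A BLOB-based variable would be declared and used in
the same way, but its content would be treated as binary data.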
One significant difference between the BLOB base format and the CLOB (or DBCLOB) base
format is the encoding. XML is always stored in UTF-8 Unicode, whereas the COBOL application
can work in EBCDIC or UTF-16 Unicode. Therefore, data conversion and data encoding are
always issues that you must consider.
The character-based formats are referred to as externally encoded; the binary-based formats
are referred to as internally encoded. A variable with subtype FOR BIT DATA is also
considered internally encoded.
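As a rough illustration of the internally encoded case, the following sketch converts a
document from the program's EBCDIC code page to UTF-8 and passes it to DB2 through a
BLOB-based XML host variable. It is a sketch under stated assumptions, not a definitive
implementation: the table XMLDOCS (ID INTEGER, DOC XML), the program name, and the sample
document are hypothetical; the NATIONAL-OF function is assumed to take its source CCSID
from the CODEPAGE compiler option; and the length calculation is valid only because every
character in the sample occupies a single byte in UTF-8.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLBLOB.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
           EXEC SQL INCLUDE SQLCA END-EXEC.
      * Key of the hypothetical XMLDOCS table.
       01  DOC-ID       PIC S9(9) COMP-5 VALUE 2.
      * Binary base format: the content is internally encoded.
       01  DOC-AS-BLOB  USAGE IS SQL TYPE IS XML AS BLOB(100K).
      * Sample document in the program's EBCDIC code page.
       01  XML-TEXT     PIC X(31) VALUE
           '<order><item>pen</item></order>'.
       PROCEDURE DIVISION.
      * EBCDIC to UTF-16, then UTF-16 to UTF-8 (CCSID 1208).
           MOVE FUNCTION DISPLAY-OF(
                FUNCTION NATIONAL-OF(XML-TEXT), 1208)
             TO DOC-AS-BLOB-DATA
      * Assumption: one UTF-8 byte per character in the sample,
      * so the byte length equals the length of XML-TEXT.
           MOVE LENGTH OF XML-TEXT TO DOC-AS-BLOB-LENGTH
      * The UTF-8 bytes are passed to DB2 as binary data.
           EXEC SQL
               INSERT INTO XMLDOCS (ID, DOC)
               VALUES (:DOC-ID, :DOC-AS-BLOB)
           END-EXEC
           IF SQLCODE NOT = 0
               DISPLAY 'INSERT FAILED, SQLCODE = ' SQLCODE
           END-IF
           GOBACK.
```

For documents that can contain characters outside the single-byte UTF-8 range, the byte
length of the converted data must be determined separately before it is moved to
DOC-AS-BLOB-LENGTH.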