By far the hardest part of this or any similar problem is parsing the non-XML input data. Everything else pales by comparison. Unlike parsing XML, you generally cannot rely on a library to do the hard work for you. You have to do it yourself. And also unlike XML, there's little guarantee that the data is well-formed. More likely than not, you will encounter incorrectly formatted data.

In this case, because the records are separated into lines, I'll read each line, one at a time, using the readLine() method. This method works well enough as long as the data is in a file, although it's potentially buggy when the data is served over a network socket.
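As a rough illustration of the line-at-a-time approach, the loop might look something like the sketch below. The text here does not say which class supplies readLine(), so the use of java.io.BufferedReader is an assumption. The class name LineReadingSketch and the method readRecords() are hypothetical names invented for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: the class and method names are not from the original text.
public class LineReadingSketch {

    // Reads the source one line at a time until end of stream.
    // Assumption: BufferedReader is the class providing readLine().
    public static List<String> readRecords(Reader source) throws IOException {
        List<String> records = new ArrayList<>();
        BufferedReader in = new BufferedReader(source);
        String line;
        // readLine() returns null at end of stream and strips the line terminator
        while ((line = in.readLine()) != null) {
            records.add(line);
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample data with mixed line endings
        String sample = "first record\nsecond record\r\nthird record";
        for (String record : readRecords(new StringReader(sample))) {
            System.out.println(record);
        }
    }
}
```

Taking a Reader rather than a file name keeps the sketch independent of where the data comes from. A FileReader could be passed in for file input, and the same loop would apply.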

Each line is dissected into its component fields inside the splitLine() ...

