Multiline Records

Not every data record will fit onto a single line. Here is a file in simplified Protein Data Bank (PDB) format that describes the arrangement of atoms in ammonia:

 COMPND AMMONIA
 ATOM 1 N 0.257 -0.363 0.000
 ATOM 2 H 0.257 0.727 0.000
 ATOM 3 H 0.771 -0.727 0.890
 ATOM 4 H 0.771 -0.727 -0.890
 END

The first line gives the name of the molecule. Each subsequent line, down to the one containing END, specifies the ID, type, and XYZ coordinates of one atom in the molecule.
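As a rough illustration of that layout, here is a minimal sketch of a reader for a single molecule. The function name read_molecule, the SAMPLE string, and the choice to return a name plus a list of tuples are assumptions made for this example, not code taken from the chapter, and the sketch assumes well-formed input with exactly six whitespace-separated fields on each ATOM line.

```python
from io import StringIO
from typing import TextIO

# Sample data copied from the single-molecule file shown above.
SAMPLE = """\
COMPND AMMONIA
ATOM 1 N 0.257 -0.363 0.000
ATOM 2 H 0.257 0.727 0.000
ATOM 3 H 0.771 -0.727 0.890
ATOM 4 H 0.771 -0.727 -0.890
END
"""


def read_molecule(reader: TextIO) -> tuple:
    """Read one molecule from reader and return (name, atoms).

    Each atom is a tuple (id, type, x, y, z). Hypothetical helper:
    it assumes the first line is a COMPND line and the data is well formed.
    """
    # The first line looks like 'COMPND AMMONIA'; keep the text after the keyword.
    keyword, name = reader.readline().split(maxsplit=1)
    name = name.strip()

    atoms = []
    for line in reader:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if fields[0] == 'END':
            break  # END closes the record
        # Remaining lines look like 'ATOM 1 N 0.257 -0.363 0.000'.
        keyword, atom_id, atom_type, x, y, z = fields
        atoms.append((int(atom_id), atom_type, float(x), float(y), float(z)))
    return name, atoms


name, atoms = read_molecule(StringIO(SAMPLE))
print(name, len(atoms))  # prints: AMMONIA 4
```

Because the function stops at END and leaves the reader positioned just after it, the record boundary is explicit rather than implied by the end of the file.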

Reading this file is straightforward using the techniques that we have built up in this chapter. But what if the file contained two or more molecules, like this:

 COMPND AMMONIA
 ATOM 1 N 0.257 -0.363 0.000
 ATOM 2 H 0.257 0.727 0.000
 ATOM 3 H 0.771 -0.727 ...
