Chapter 5: Delimited Data

So far we’ve worked only with unstructured, plain-text input. But in the real world, input typically has a structure that reflects the data stored within it. We’ve seen glimpses of this—separating parts of a line by whitespace, for example—but we haven’t delved deep into the idea of representing the structure of our data within our programs, so that we can transform and manipulate it however we like.

The simplest structured data is stored in so-called delimited files. You may well have encountered files with the extensions CSV (comma-separated values) or TSV (tab-separated values). Both of these are delimited file formats.
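As a rough illustration of what the two formats look like, here is a minimal sketch that parses the same small dataset written first with commas and then with tabs. It assumes Ruby's `csv` library is available, and the sample data and variable names are invented for this example rather than taken from any real file.

```ruby
require "csv"

# Hypothetical sample data, made up purely for illustration.
csv_text = "name,age\nAlice,30\nBob,25\n"
tsv_text = "name\tage\nAlice\t30\nBob\t25\n"

# CSV.parse splits on commas by default and returns an array of rows,
# each row being an array of strings.
csv_rows = CSV.parse(csv_text)

# The same parser handles tab-separated input when told which
# column separator to use.
tsv_rows = CSV.parse(tsv_text, col_sep: "\t")

p csv_rows
p tsv_rows
```

Both calls produce the same nested array, which suggests that the choice of delimiter is a detail of the file format rather than of the data itself.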

The structure that these file formats introduce is centered around the concepts of records

