Reading and parsing unstructured files

It is marvelous to have input files where the information is well formed; that is, the number of columns and the type of data in each column are precise, all rows follow the same pattern, and so on. However, it is common to find input files where the information has little or no structure, or where the structure doesn't follow the matrix (n rows by m columns) you expect. In this section you will learn how to deal with such files.

