Chapter 3. Data Munging

Hacks 19-27

Perl has always been in love with data. No matter where you find it, Perl happily processes and extracts and reports on files, databases, web pages, spreadsheets, other programs, and anything that produces data. Perl’s so happy to do this that it even overlooks brute-force, rough manipulations. Hey, pragmatism works!

Perl can be gentle, too. A little subtlety, a little style and finesse, and you can write maintainable, easy-to-understand code that’s just as powerful as the wild-eyed forge-ahead-at-all-costs just-do-the-job code. Why bother? The careful version is often faster and more correct, as well as more secure, more powerful, and shorter.

Sure, slinging data between sources sounds about as glamorous as slinging hash at the local diner, but it doesn’t have to be that way. Here are several ideas to munge that yummy data with all of the elegance and style and power and clarity that you know you have.

Treat a File As an Array

Pretend a big stream of data on disk is a nice, malleable Perl data structure.

One of the big disappointments in programming is realizing that, although you can think of a text file as a long list of properly terminated lines, to the computer it’s just a big blob of ones and zeroes. If all you need to do is read the lines of a file and process them in order, you’re fine. If you have a big file that you can’t load into memory and can’t process line by line in order... well, good luck.

Fortunately, Mark Jason Dominus’s Tie::File module exists, and is ...
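As a rough illustration of the idea, here is a minimal sketch of tying a file to an array with Tie::File. The scratch file and its contents (alpha, beta, gamma) are made up for the example and stand in for a real data file; this is not necessarily the code the hack itself goes on to use.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Hypothetical scratch file standing in for a large data file.
my ($fh, $filename) = tempfile(UNLINK => 1);
print {$fh} "alpha\nbeta\ngamma\n";
close $fh or die "Cannot close $filename: $!";

# Bind the file to an array; each element is one line, without its newline.
tie my @lines, 'Tie::File', $filename
    or die "Cannot tie $filename: $!";

my $count = @lines;        # number of lines in the file
my $last  = $lines[-1];    # random access, no manual seeking

$lines[1] = uc $lines[1];  # rewrite the second line in place
push @lines, 'delta';      # append a new line to the file

untie @lines;              # flush pending changes and release the file
```

Changes made through the array are written back to the file itself, so the usual array operations (indexing, push, assignment) double as file edits. Tie::File reads lines on demand rather than slurping the whole file, which is what makes the approach plausible for files too big to hold in memory.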
