You have a data file containing comma-separated values that you need to read in, but these data fields may have quoted commas or escaped quotes in them. Most spreadsheets and database programs use comma-separated values as a common interchange format.
Use the procedure in Mastering Regular Expressions.
sub parse_csv { my $text = shift; # record containing comma-separated values my @new = (); push(@new, $+) while $text =~ m{ # the first part groups the phrase inside the quotes. # see explanation of this pattern in MRE "([^\"\\]*(?:\\.[^\"\\]*)*)",? | ([^,]+),? | , }gx; push(@new, undef) if substr($text, -1,1) eq ','; return @new; # list of values that were comma-separated }
Or use the standard Text::ParseWords module.
use Text::ParseWords; sub parse_csv { return quotewords(",",0, $_[0]); }
Comma-separated input is a deceptive and complex format. It sounds
simple, but involves a fairly complex escaping system because the
fields themselves can contain commas. This makes the pattern matching
solution complex and rules out a simple split
/,/
.
Fortunately, Text::ParseWords hides the complexity from you. Pass its
quotewords
function two arguments and the CSV string. The first argument is the
separator (a comma, in this case) and the second is a true or false
value controlling whether the strings are returned with quotes around
them.
If you want to represent quotation marks inside a field delimited by
quotation marks, escape them with backslashes
"like\"this\""
. Quotation
marks and backslashes are the only characters that have meaning
backslashed. Any other use of a backslash will be left in the output
string.
Here’s how you’d use the
parse_csv
subroutines. The
q<>
is just a fancy quote so we didn’t
have to backslash everything.
$line = q<XYZZY,"","O'Reilly, Inc","Wall, Larry","a \"glug\" bit,",5, "Error, Core Dumped">; @fields = parse_csv($line); for ($i = 0; $i < @fields; $i++) { print "$i : $fields[$i]\n"; }
0 : XYZZY
1 :
2 : O'Reilly, Inc
3 : Wall, Larry
4 : a \"glug\" bit,
5 : 5
6 : Error, Core Dumped
Get Perl Cookbook now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.