Processing Variable-Length Text Fields
Problem
You want to extract variable-length fields from your input.
Solution
Use split with a pattern matching the field separators.
# given $RECORD with fields separated by PATTERN,
# extract @FIELDS.
@FIELDS = split(/PATTERN/, $RECORD);
Discussion
The split function takes up to three arguments: PATTERN, EXPRESSION, and LIMIT. The LIMIT parameter is the maximum number of fields to split into. (If the input contains more fields, they are returned unsplit in the final list element.) If LIMIT is omitted, all fields (except any final empty ones) are returned. EXPRESSION gives the string value to split. If EXPRESSION is omitted, $_ is split. PATTERN is a pattern matching the field separator. If PATTERN is omitted, contiguous stretches of whitespace are used as the field separator and leading empty fields are silently discarded.
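The effect of each argument can be seen in a short sketch. The sample record and variable names here are made up for illustration. The negative LIMIT form is not described above; it is standard Perl behavior that lifts the field cap and keeps trailing empty fields.

```perl
use strict;
use warnings;

# Hypothetical sample record: five colon-separated fields, the last two empty.
my $record = "alice:x:1001::";

# No LIMIT: trailing empty fields are dropped.
my @all = split(/:/, $record);        # ("alice", "x", "1001")

# LIMIT of 2: the remainder stays unsplit in the final element.
my @two = split(/:/, $record, 2);     # ("alice", "x:1001::")

# Negative LIMIT: no cap on the field count, trailing empty fields kept.
my @keep = split(/:/, $record, -1);   # ("alice", "x", "1001", "", "")

# PATTERN and EXPRESSION both omitted: split works on $_, separating on
# runs of whitespace and discarding leading empty fields.
my @words;
for ("  one two   three") {
    @words = split;
}
# @words is now ("one", "two", "three")
```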
If your input field separator isn’t a fixed string, you might want split to return the field separators as well as the data. Put parentheses around the separator in PATTERN to capture it. For instance:
split(/([+-])/, "3+5-2");
returns the values:
(3, '+', 5, '-', 2)
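One reason to keep the separators is that they may carry meaning. As a minimal sketch (the left-to-right evaluation loop is an illustration, not part of the recipe), the returned operators can be used to compute the value of the expression:

```perl
use strict;
use warnings;

# Capturing parentheses make split return the operators too.
my @tokens = split(/([+-])/, "3+5-2");   # (3, '+', 5, '-', 2)

# Fold the tokens left to right: start with the first number, then
# consume one operator and one operand at a time.
my $total = shift @tokens;
while (@tokens) {
    my ($op, $num) = splice(@tokens, 0, 2);
    $total = $op eq '+' ? $total + $num : $total - $num;
}

print "$total\n";   # prints 6
```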
To split colon-separated records in the style of the /etc/passwd file, use:
@fields = split(/:/, $RECORD);
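When the field layout is known in advance, assigning the result to a list of named variables often reads better than indexing into @fields. The sketch below assumes the conventional seven-field passwd layout; the sample line and the variable names are illustrative only.

```perl
use strict;
use warnings;

# Hypothetical passwd-style record with the usual seven fields.
my $RECORD = "daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin";

# Unpack each field into its own variable.
my ($user, $passwd, $uid, $gid, $gecos, $home, $shell) = split(/:/, $RECORD);

print "$user has uid $uid and shell $shell\n";
```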
The classic application of split is whitespace-separated records:
@fields = split(/\s+/, $RECORD);
If $RECORD started with whitespace, this last use of split would have put an empty string into the first element of @fields because split would consider the record to have ...
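The leading empty field is easy to reproduce. The sketch below uses a made-up record and contrasts the /\s+/ pattern with the special single-space string ' ', which in Perl splits on whitespace in the same way as omitting PATTERN and so discards leading empty fields.

```perl
use strict;
use warnings;

# Hypothetical record that starts with whitespace.
my $RECORD = "  alpha beta  gamma";

# /\s+/ matches the leading whitespace as a separator, so the first
# element is an empty string.
my @with_pattern = split(/\s+/, $RECORD);   # ("", "alpha", "beta", "gamma")

# The string ' ' is a special case: leading whitespace is skipped.
my @with_space = split(' ', $RECORD);       # ("alpha", "beta", "gamma")

printf "%d fields with /\\s+/, %d fields with ' '\n",
    scalar(@with_pattern), scalar(@with_space);
```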