Processing Variable-Length Text Fields
Problem
You want to extract variable-length fields from your input.
Solution
Use split with a pattern matching the field
separators.
# given $RECORD with fields separated by PATTERN,
# extract @FIELDS.
@FIELDS = split(/PATTERN/, $RECORD);
Discussion
The split function takes up to three arguments:
PATTERN, EXPRESSION, and
LIMIT. The LIMIT parameter is
the maximum number of fields to split into. (If the input contains
more fields, they are returned unsplit in the final list element.) If
LIMIT is omitted, all fields (except any final
empty ones) are returned. EXPRESSION gives the
string value to split. If EXPRESSION is omitted,
$_ is split. PATTERN is a
pattern matching the field separator. If PATTERN
is omitted, contiguous stretches of whitespace are used as the field
separator and leading empty fields are silently discarded.
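The effect of each argument is easier to see on a concrete string. The following is a minimal sketch; the record contents and variable names are made up for illustration. The negative LIMIT in the third call is not covered in the text above: per the split documentation, it keeps the trailing empty fields that are otherwise dropped.

```perl
use strict;
use warnings;

# Hypothetical record ending in two empty fields.
my $record = "a:b:c:d::";

# No LIMIT: every field is returned except the trailing empty ones.
my @all = split(/:/, $record);        # ('a', 'b', 'c', 'd')

# LIMIT of 2: the rest of the record stays unsplit in the last element.
my @two = split(/:/, $record, 2);     # ('a', 'b:c:d::')

# Negative LIMIT: no cap on the field count, trailing empty fields kept.
my @kept = split(/:/, $record, -1);   # ('a', 'b', 'c', 'd', '', '')

# No PATTERN and no EXPRESSION: $_ is split on runs of whitespace,
# and the leading empty field is discarded.
local $_ = "  x y   z";
my @words = split;                    # ('x', 'y', 'z')
```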
If your input field separator isn’t a fixed string, you might
want split to return the field separators as well
as the data by using parentheses in PATTERN to
save the field separators. For instance:
split(/([+-])/, "3+5-2");
returns the values:
(3, '+', 5, '-', 2)
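As a small sketch of the difference the parentheses make, the same string can be split with and without a capturing group. The variable names here are illustrative only.

```perl
use strict;
use warnings;

my $expr = "3+5-2";

# Capturing group: the separators are returned between the data fields.
my @tokens = split(/([+-])/, $expr);   # ('3', '+', '5', '-', '2')

# No capturing group: the separators are thrown away.
my @numbers = split(/[+-]/, $expr);    # ('3', '5', '2')
```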
To split colon-separated records in the style of the /etc/passwd file, use:
@fields = split(/:/, $RECORD);
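Because split returns a list, the fields can also be assigned straight to named variables. The sketch below assumes the conventional seven-field passwd layout; the sample line and the variable names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical passwd-style line, not read from a real system.
my $RECORD = "daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin";

# Name each field instead of indexing into an array.
my ($login, $passwd, $uid, $gid, $gecos, $home, $shell)
    = split(/:/, $RECORD);

print "$login has UID $uid and shell $shell\n";
```

If the last field of a line is empty, split drops it and the corresponding variable is left undefined, so code reading real files may want to check for that.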
The classic application of split is
whitespace-separated records:
@fields = split(/\s+/, $RECORD);
If $RECORD started with whitespace, this last use
of split would have put an empty string into the
first element of @fields because
split would consider the record to have ...
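The leading empty string is easy to reproduce. This sketch, using a made-up record, contrasts the /\s+/ pattern with the single-space string form of split, which splits on whitespace runs and discards the leading empty field in the same way as omitting the pattern.

```perl
use strict;
use warnings;

# Hypothetical record that starts with whitespace.
my $RECORD = "  alpha beta";

# The /\s+/ pattern matches the leading whitespace as a separator,
# so the first element is an empty string.
my @with_pattern = split(/\s+/, $RECORD);   # ('', 'alpha', 'beta')

# A single-space string skips the leading whitespace.
my @with_space = split(' ', $RECORD);       # ('alpha', 'beta')
```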