The split Operator
Another operator that uses regular expressions is split, which breaks up a string according to a
pattern. This is useful for tab-separated data, or colon-separated,
whitespace-separated, or
anything-separated data, really.[†] So long as you can specify the separator with a regular
expression (and generally, it’s a simple regular expression), you can
use split. It looks like this:
@fields = split /separator/, $string;
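As a minimal sketch of that general form, here is split applied to a tab-separated record. The record contents and the variable name $record are made up for illustration; only the split call itself follows the form shown above.

```perl
use strict;
use warnings;

# Hypothetical tab-separated record, invented for this example.
my $record = "fred\tflintstone\tbedrock";

# The pattern /\t/ is the separator, so the tabs themselves
# never appear in the returned list.
my @fields = split /\t/, $record;

print scalar(@fields), " fields\n";   # prints "3 fields"
print "$_\n" for @fields;             # one field per line
```

Changing the pattern to /:/ or /,/ would handle colon-separated or comma-separated records in the same way, provided the separator never occurs inside a field.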
The split operator[‡] drags the pattern through a string and returns a list of
fields (substrings) that were separated by the separators. Whenever the
pattern matches, that’s the end of one field and the start of the next.
So, anything that matches the pattern will never show up in the returned
fields. Here’s a typical split
pattern, splitting on colons:
@fields = split /:/, "abc:def:g:h"; # gives ("abc", "def", "g", "h")
You could even have an empty field, if there were two delimiters together:
@fields = split /:/, "abc:def::g:h"; # gives ("abc", "def", "", "g", "h")
Here’s a rule that seems odd at first, but it rarely causes problems: leading empty fields are always returned, but trailing empty fields are discarded. For example:[‖]
@fields = split /:/, ":::a:b:c:::"; # gives ("", "", "", "a", "b", "c")
It’s also common to split on
whitespace, using /\s+/ as the
pattern. Under that pattern, all whitespace runs are equivalent to a
single space:
my $some_input = "This is a \t test.\n";
my @args = split /\s+/, $some_input; # ("This", "is", "a", "test.")
The default ...