Regular expressions can be used to break a string into fields. The split function does this, and the join function glues the pieces back together.
The split function takes a regular expression and a string, and looks for all occurrences of the regular expression within that string. The parts of the string that don't match the regular expression are returned in sequence as a list of values. For example, here's something to parse colon-separated fields, such as in UNIX /etc/passwd files:
$line = "merlyn::118:10:Randal:/home/merlyn:/usr/bin/perl";
@fields = split(/:/,$line); # split $line, using : as delimiter
# now @fields is ("merlyn","","118","10","Randal",
#                 "/home/merlyn","/usr/bin/perl")
Note how the empty second field became an empty string. If you don't want this, match all of the colons in one fell swoop:
@fields = split(/:+/, $line);
This matches one or more adjacent colons together, so there is no empty second field.
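To see the difference between the two patterns side by side, here is a minimal sketch that splits the same line both ways and prints the field counts. The variable names @exact and @merged are chosen only for this illustration, and join is used just to display the results:

```perl
use strict;
use warnings;

my $line = "merlyn::118:10:Randal:/home/merlyn:/usr/bin/perl";

# Splitting on a single colon keeps the empty second field.
my @exact = split(/:/, $line);

# Splitting on runs of colons treats "::" as one delimiter.
my @merged = split(/:+/, $line);

# Show how many fields each pattern produced, and what they are.
print scalar(@exact),  " fields: ", join("|", @exact),  "\n";
print scalar(@merged), " fields: ", join("|", @merged), "\n";
```

Keep in mind that with /:+/ every field after the collapsed colons shifts down by one position, so for records where a field's position carries meaning (as in /etc/passwd), splitting on /:/ and tolerating the empty string is probably the safer choice.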
One common string to split is the $_ variable, and that turns out to be the default:
$_ = "some string";
@words = split(/ /); # same as @words = split(/ /, $_);
For this split, consecutive spaces in the string being split will produce null fields (empty strings) in the result. A better pattern would be / +/, or ideally /\s+/, which matches one or more whitespace characters together. In fact, splitting on whitespace is the default (the default behaves like /\s+/, except that it also skips any leading whitespace), so if you're splitting the $_ variable on whitespace, you can use all ...