Reading Lines with Continuation Characters
Problem
You have a file with long lines split over two or more lines, with backslashes to indicate that a continuation line follows. You want to rejoin those split lines. Makefiles, shell scripts, and many other scripting or configuration languages let you break a long line into several shorter ones in this fashion.
Solution
Build up the complete lines one at a time until reaching one without a backslash:
while (defined($line = <FH>)) {
    chomp $line;
    if ($line =~ s/\\$//) {
        $line .= <FH>;
        redo unless eof(FH);
    }
    # process full record in $line here
}
Discussion
Here’s an example input file:
DISTFILES = $(DIST_COMMON) $(SOURCES) $(HEADERS) \
$(TEXINFOS) $(INFOS) $(MANS) $(DATA)
DEP_DISTFILES = $(DIST_COMMON) $(SOURCES) $(HEADERS) \
$(TEXINFOS) $(INFO_DEPS) $(MANS) $(DATA) \
$(EXTRA_DIST)
You’d like to process that file with the escaped newlines ignored. That way, the first record would be the first two lines, the second record the next three lines, and so on.
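To make the goal concrete, here is a minimal, self-contained sketch that runs the sample input through a loop of the same shape and prints the resulting records. The helper name read_records and the in-memory filehandle are assumptions made for this illustration, not part of the recipe. The sketch also differs slightly from the Solution: it tests whether the next read returned anything instead of calling eof, so that every record, including the last, passes through chomp.

```perl
use strict;
use warnings;

# read_records is a hypothetical helper name used only for this sketch.
# It reads from an open filehandle and returns the rejoined records.
sub read_records {
    my ($fh) = @_;
    my @records;
    my $line;
    while (defined($line = <$fh>)) {
        chomp $line;
        if ($line =~ s/\\$//) {
            # A trailing backslash was removed, so append the next
            # physical line (if there is one) and re-run the checks.
            my $next = <$fh>;
            if (defined $next) {
                $line .= $next;
                redo;
            }
        }
        push @records, $line;    # $line now holds one full record
    }
    return @records;
}

# The sample input, held in a string for the demonstration.
my $input = <<'END_INPUT';
DISTFILES = $(DIST_COMMON) $(SOURCES) $(HEADERS) \
$(TEXINFOS) $(INFOS) $(MANS) $(DATA)
DEP_DISTFILES = $(DIST_COMMON) $(SOURCES) $(HEADERS) \
$(TEXINFOS) $(INFO_DEPS) $(MANS) $(DATA) \
$(EXTRA_DIST)
END_INPUT

# Open the string as a read-only, in-memory file.
open(my $fh, '<', \$input) or die "can't open in-memory file: $!";
my @records = read_records($fh);
close($fh);

# Print each record with its number.
printf "%d: %s\n", $_ + 1, $records[$_] for 0 .. $#records;
```

Run against the five physical lines of the sample, this should yield two records, one beginning with DISTFILES and one beginning with DEP_DISTFILES, each on a single line.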
Here’s how the algorithm works. The while loop reads lines, which may or may not be complete records, since they might end in a backslash (and a newline). The substitution operator s/// tries to remove a trailing backslash. If the substitution fails, we’ve found a line without a backslash at the end. Otherwise, we read another line, concatenate it onto the accumulating $line variable, and use redo to jump back to just inside the opening brace of the while loop. This lands us back on the chomp ...