When using the multiline quoting mechanism called a here document, the text must be flush against the margin, which looks out of place in the code. You would like to indent the here document text in the code, but not have the indentation appear in the final string value.
Use an s/// operator to strip out leading whitespace.

    # all in one
    ($var = <<HERE_TARGET) =~ s/^\s+//gm;
        your text
        goes here
    HERE_TARGET

    # or with two steps
    $var = <<HERE_TARGET;
        your text
        goes here
    HERE_TARGET
    $var =~ s/^\s+//gm;

The substitution is straightforward. It removes leading whitespace from the text of the here document. The /m modifier lets the ^ character match at the start of each line in the string, and the /g modifier makes the pattern matching engine repeat the substitution as often as it can (i.e., for every line in the here document).

    ($definition = <<'FINIS') =~ s/^\s+//gm;
        The five varieties of camelids
        are the familiar camel, his friends
        the llama and the alpaca, and the
        rather less well-known guanaco
        and vicuña.
    FINIS

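
To see what each modifier contributes, here is a small sketch that applies the same substitution with /m left off, with /g left off, and with both present. The variable names ($text, $no_m, $no_g, $both) are made up for this illustration.

```perl
use strict;
use warnings;

my $text = "    first line\n    second line\n";

# Without /m, ^ matches only at the very start of the string,
# so only the first line loses its indentation.
(my $no_m = $text) =~ s/^\s+//g;

# Without /g, the substitution stops after its first match,
# which is again the first line only.
(my $no_g = $text) =~ s/^\s+//m;

# With both modifiers, every line is stripped.
(my $both = $text) =~ s/^\s+//gm;

print "no /m: [$no_m]\n";
print "no /g: [$no_g]\n";
print "both:  [$both]\n";
```

Only the last form removes the indentation from the second line as well.
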
Be warned: all the patterns in this recipe use \s, which will also match newlines. This means they will remove any blank lines in your here document. If you don’t want this, replace \s with [^\S\n] in the patterns.
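
The difference is easy to see on a string with a blank line between two paragraphs. This is a minimal sketch; $text, $greedy, and $careful are names invented for the example.

```perl
use strict;
use warnings;

my $text = "    para one\n\n    para two\n";

# \s matches the newline of the blank line too, so the blank line vanishes.
(my $greedy = $text) =~ s/^\s+//gm;

# [^\S\n] means "whitespace other than newline", so the blank line survives.
(my $careful = $text) =~ s/^[^\S\n]+//gm;

print "with \\s:\n$greedy";
print "with [^\\S\\n]:\n$careful";
```
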
The substitution makes use of the property that the result of an assignment can be used as the left-hand side of =~. This lets us do it all in one line, but it only works when you’re assigning to a variable. When you’re using the here document directly, it would be considered a constant value and you wouldn’t be able to modify it. In fact, you can’t change a here document’s value unless you first put it into a variable.
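
If you can assume Perl 5.14 or later, there is another way around the constant-value restriction that this recipe does not use: the /r modifier makes s/// return a modified copy instead of changing its left-hand side. The following is only a sketch under that version assumption.

```perl
use 5.014;      # the /r modifier needs Perl 5.14 or later
use warnings;

# /r returns the stripped copy and leaves the constant here document alone,
# so no intermediate variable has to be modified in place.
my $text = <<"END" =~ s/^\s+//gmr;
    your text
    goes here
END

print $text;
```
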
Not to worry, though, because there’s an easy way around this, particularly if you’re going to do this a lot in the program. Just write a subroutine to do it:

    sub fix {
        my $string = shift;
        $string =~ s/^\s+//gm;
        return $string;
    }

    print fix(<<"END");
        My stuff goes here
    END

    # With function predeclaration, you can omit the parens:
    print fix <<"END";
        My stuff goes here
    END

As with all here documents, you have to place this here document’s target (the token that marks its end, END in this case) flush against the left-hand margin. If you want to have the target indented also, you’ll have to put the same amount of whitespace in the quoted string as you use to indent the token.

    ($quote = <<'    FINIS') =~ s/^\s+//gm;
            ...we will have peace, when you and all your works have
            perished--and the works of your dark master to whom you would
            deliver us. You are a liar, Saruman, and a corrupter of men's
            hearts.  --Theoden in /usr/src/perl/taint.c
        FINIS
    $quote =~ s/\s+--/\n--/;      # move attribution to line of its own

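
Perl 5.26 and later, which postdate this recipe, offer a built-in alternative to quoting the indentation into the target: the <<~ form of here document. The sketch below assumes such a Perl; the motto function is a hypothetical name used only for illustration.

```perl
use 5.026;      # the <<~ form needs Perl 5.26 or later
use warnings;

sub motto {
    # With <<~ the terminator may be indented, and the whitespace that
    # precedes the terminator is stripped from every line of the text.
    my $text = <<~'FINIS';
        first line
          second line keeps its extra two spaces
        FINIS
    return $text;
}

print motto();
```

Unlike s/^\s+//gm, this removes only the indentation shared with the terminator, so any deeper indentation inside the text is preserved.
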
If you’re doing this to strings that contain code you’re building up for an eval, or just text to print out, you might not want to blindly strip off all leading whitespace because that would destroy your indentation. Although eval wouldn’t care, your reader might.
Another embellishment is to use a special leading string for code that stands out. For example, here we’ll prepend each line with @@@, properly indented:

    if ($REMEMBER_THE_MAIN) {
        $perl_main_C = dequote<<'    MAIN_INTERPRETER_LOOP';
            @@@ int
            @@@ runops() {
            @@@     SAVEI32(runlevel);
            @@@     runlevel++;
            @@@     while ( op = (*op->op_ppaddr)() ) ;
            @@@     TAINT_NOT;
            @@@     return 0;
            @@@ }
        MAIN_INTERPRETER_LOOP
        # add more code here if you want
    }

Destroying indentation also gets you in trouble with poets.

    sub dequote;
    $poem = dequote<<EVER_ON_AND_ON;
           Now far ahead the Road has gone,
              And I must follow, if I can,
           Pursuing it with eager feet,
              Until it joins some larger way
           Where many paths and errands meet.
              And whither then? I cannot say.
                    --Bilbo in /usr/src/perl/pp_ctl.c
    EVER_ON_AND_ON
    print "Here's your poem:\n\n$poem\n";

Here is its sample output:

    Here's your poem:

    Now far ahead the Road has gone,
       And I must follow, if I can,
    Pursuing it with eager feet,
       Until it joins some larger way
    Where many paths and errands meet.
       And whither then? I cannot say.
             --Bilbo in /usr/src/perl/pp_ctl.c

The following dequote function handles all these cases. It expects to be called with a here document as its argument. It checks whether each line begins with a common substring, and if so, strips that off. Otherwise, it takes the amount of leading whitespace found on the first line and removes that much off each subsequent line.

    sub dequote {
        local $_ = shift;
        my ($white, $leader);  # common whitespace and common leading string
        if (/^\s*(?:([^\w\s]+)(\s*).*\n)(?:\s*\1\2?.*\n)+$/) {
            ($white, $leader) = ($2, quotemeta($1));
        } else {
            ($white, $leader) = (/^(\s+)/, '');
        }
        s/^\s*?$leader(?:$white)?//gm;
        return $_;
    }

If that pattern makes your eyes glaze over, you could always break it up and add comments by adding /x:

    if (m{
            ^                       # start of string
            \s *                    # 0 or more whitespace chars
            (?:                     # begin first non-remembered grouping
                 (                  #   begin save buffer $1
                    [^\w\s]         #     one character neither space nor word
                    +               #     1 or more of such
                 )                  #   end save buffer $1
                 ( \s* )            #   put 0 or more white in buffer $2
                 .* \n              #   match through the end of first line
            )                       # end of first grouping
            (?:                     # begin second non-remembered grouping
                \s *                #   0 or more whitespace chars
                \1                  #   whatever string is destined for $1
                \2 ?                #   what'll be in $2, but optionally
                .* \n               #   match through the end of the line
            ) +                     # now repeat that group idea 1 or more
            $                       # until the end of the string
         }x
       )
    {
        ($white, $leader) = ($2, quotemeta($1));
    } else {
        ($white, $leader) = (/^(\s+)/, '');
    }
    s{
         ^                          # start of each line (due to /m)
         \s *                       # any amount of leading whitespace
            ?                       #   but minimally matched
         $leader                    # our quoted, saved per-line leader
         (?:                        # begin unremembered grouping
            \Q$white\E              #   the same amount; \Q...\E keeps /x
                                    #   from ignoring the spaces in $white
         ) ?                        # optionalize in case EOL after leader
    }{}xgm;

There, isn’t that much easier to read? Well, maybe not; sometimes it doesn’t help to pepper your code with insipid comments that mirror the code. This may be one of those cases.
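
To watch both branches of dequote at work, here is a short sketch that repeats the compact version of the function and runs it on one input with a common leader and one without. The here document targets (END_CODE, END_VERSE) and the sample text are invented for the example.

```perl
use strict;
use warnings;

# Same dequote as in the recipe, repeated so the example is self-contained.
sub dequote {
    local $_ = shift;
    my ($white, $leader);  # common whitespace and common leading string
    if (/^\s*(?:([^\w\s]+)(\s*).*\n)(?:\s*\1\2?.*\n)+$/) {
        ($white, $leader) = ($2, quotemeta($1));
    } else {
        ($white, $leader) = (/^(\s+)/, '');
    }
    s/^\s*?$leader(?:$white)?//gm;
    return $_;
}

# Case 1: every line carries the same leader, so the leader and the
# whitespace that follows it on the first line are removed.
my $code = dequote(<<'END_CODE');
    @@@ if (x) {
    @@@     y();
    @@@ }
END_CODE

# Case 2: no common leader, so only the first line's indentation is removed
# and the second line keeps its extra two spaces.
my $verse = dequote(<<'END_VERSE');
    Roses are red,
      violets are blue
END_VERSE

print $code, $verse;
```

In the first case the indentation inside the quoted code is left intact; in the second, the relative indentation of the verse is preserved.
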