Reformatting Paragraphs

Problem

Your string is too long to fit on the screen, and you want to break it into lines of words without splitting a word between lines. For instance, a style correction script might read a text file a paragraph at a time, replacing bad phrases with good ones. Replacing a phrase like "utilizes the inherent functionality of" with "uses" changes the length of lines, so the script must somehow reformat the paragraphs when they’re output.

Solution

Use the standard Text::Wrap module to put line breaks at the right place.

use Text::Wrap;
$OUTPUT = wrap($LEADTAB, $NEXTTAB, @PARA);

Discussion

The Text::Wrap module provides the wrap function, shown in Example 1-3, which takes a list of lines and reformats them into a paragraph in which every line is shorter than $Text::Wrap::columns characters. We set $columns to 20, so no output line is longer than 19 characters. We pass wrap two arguments before the list of lines: the first is the indent for the first line of output, the second the indent for every subsequent line.

Example 1-3. wrapdemo

#!/usr/bin/perl -w
# wrapdemo - show how Text::Wrap works

@input = ("Folding and splicing is the work of an editor,",
          "not a mere collection of silicon",
          "and",
          "mobile electrons!");

use Text::Wrap qw($columns &wrap);

$columns = 20;
print "0123456789" x 2, "\n";
print wrap("    ", "  ", @input), "\n";

The result of this program is:

01234567890123456789
    Folding and
  splicing is the
  work of an
  editor, not a
  mere collection
  of silicon and
  mobile electrons!

We get back a single string, with newlines ending each line but the last:

# merge multiple lines into one, then wrap one long line
use Text::Wrap;
undef $/;
print wrap('', '', split(/\s*\n\s*/, <>));
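To inspect the shape of wrap's return value, the following minimal sketch wraps a short string and then splits the result on newlines. The sample text and the choice to set $Text::Wrap::unexpand to 0 (so that no spaces are turned into tabs) are assumptions made for illustration, not part of the original recipe.

```perl
use strict;
use warnings;
use Text::Wrap qw(wrap $columns);

$columns = 20;
$Text::Wrap::unexpand = 0;    # assumption: keep spaces, never convert to tabs

# wrap returns one string; the input here has no trailing newline
my $wrapped = wrap('', '', 'Folding and splicing is the work of an editor');

# Split on the embedded newlines to recover the individual lines
my @lines = split /\n/, $wrapped;

printf "%d lines, ends with newline: %s\n",
    scalar(@lines), ($wrapped =~ /\n\z/ ? 'yes' : 'no');
print "[$_]\n" for @lines;
```

Printing the string followed by "\n", as Example 1-3 does, supplies the final newline that wrap leaves off when the input does not end with one.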

If you have the Term::ReadKey module (available from CPAN) on your system, you can use it to determine your window size so you can wrap lines to fit the current screen size. If you don’t have the module, sometimes the screen size can be found in $ENV{COLUMNS} or by parsing the output of the stty command.
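The fallback chain described above might be sketched as follows. The helper name terminal_width and the default of 80 columns are hypothetical, and the stty step assumes a Unix-like system where "stty size" prints rows followed by columns.

```perl
use strict;
use warnings;
use Text::Wrap qw(wrap $columns);

# Hypothetical helper: return a best-guess terminal width.
sub terminal_width {
    # 1. Term::ReadKey, if installed; eval traps a missing module
    #    or a failure when output is not a terminal.
    my $width = eval {
        require Term::ReadKey;
        (Term::ReadKey::GetTerminalSize())[0];
    };
    return $width if defined $width && $width =~ /^[1-9]\d*$/;

    # 2. The COLUMNS environment variable, which some shells export.
    return $ENV{COLUMNS} if ($ENV{COLUMNS} // '') =~ /^[1-9]\d*$/;

    # 3. Parse "stty size" output (assumed format: "rows columns").
    my $size = `stty size 2>/dev/null` // '';
    return $2 if $size =~ /^(\d+)\s+(\d+)/ && $2 > 0;

    return 80;    # assumed default when nothing else works
}

$columns = terminal_width();
print wrap('', '', 'Text wrapped to fit the detected width.'), "\n";
```

Each step is tried only if the previous one yields nothing usable, so the script still runs when Term::ReadKey is absent or output is redirected to a file.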

The following program tries to reformat both short and long lines within a paragraph, similar to the fmt program, by setting the input record separator $/ to the empty string (causing <> to read paragraphs) and the output record separator $\ to two newlines. Then the paragraph is converted into one long line by changing all newlines (and any surrounding whitespace) to single spaces. Finally, we call the wrap function with both leading and subsequent tab strings set to the empty string so we can have block paragraphs.

use Text::Wrap      qw(&wrap $columns);
use Term::ReadKey   qw(GetTerminalSize);
($columns) = GetTerminalSize();
($/, $\)  = ('', "\n\n");   # read by paragraph, output 2 newlines
while (<>) {                # grab a full paragraph
    s/\s*\n\s*/ /g;         # convert intervening newlines to spaces
    print wrap('', '', $_); # and format
}
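The style correction script mentioned in the Problem could combine the same steps with a phrase substitution. This is only a sketch: the restyle function, the %fix table, the sample paragraph, and the 40-column width are all invented for illustration.

```perl
use strict;
use warnings;
use Text::Wrap qw(wrap $columns);

$columns = 40;
$Text::Wrap::unexpand = 0;    # assumption: emit spaces, never tabs

# Hypothetical replacement table: bad phrase => better phrase
my %fix = (
    'utilizes the inherent functionality of' => 'uses',
);

# Hypothetical helper: fix phrases in one paragraph, then re-wrap it
sub restyle {
    my ($para) = @_;
    $para =~ s/\s*\n\s*/ /g;      # join first, so phrases that span lines match
    $para =~ s/^\s+|\s+$//g;      # trim leading and trailing whitespace
    for my $bad (keys %fix) {
        $para =~ s/\Q$bad\E/$fix{$bad}/g;    # literal phrase replacement
    }
    return wrap('', '', $para);   # re-wrap at the new length
}

my $para = <<'END';
This script utilizes the inherent
functionality of Text::Wrap to keep
every paragraph tidy.
END

print restyle($para), "\n";
```

Joining the paragraph before substituting matters here, because the bad phrase in the sample is split across two input lines.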

See Also

The split and join functions in perlfunc(1) and Chapter 3 of Programming Perl; the manpage for the standard Text::Wrap module, also in Chapter 7 of Programming Perl; the CPAN module Term::ReadKey, and its use in Section 15.6
