Printf-Style Formatting

A standard feature that Java adopted from the C language is printf-style string formatting. printf-style formatting uses special format strings embedded in text to tell the formatting engine where to place arguments and to specify conversions, layout, and alignment in detail. The printf formatting methods also make use of variable-length argument lists, which makes working with them much easier. Here is a quick example of printf-formatted output:

    System.out.printf( "My name is %s and I am %d years old\n", name, age );

The printf formatting draws its name from the C language printf() function, so if you’ve done any C programming, this will look familiar. Java has extended the concept, adding some additional type safety and convenience features. Although Java has had some text formatting capabilities in the past (we’ll discuss the java.text package and MessageFormat later), printf formatting was not really feasible until variable-length argument lists and autoboxing of primitive types were added in Java 5.0. (We mention this to explain why these similar APIs both exist in Java.)

Formatter

The primary new tool in our text formatting arsenal is the java.util.Formatter class and its format() method. Several convenience methods hide the Formatter object from you, so you may never need to create one directly. First, the static String.format() method can be used to format a String with arguments (like the C language sprintf() function):

    String message =
        String.format("My name is %s and I am %d years old.", name, age );

Next, the java.io.PrintStream and java.io.PrintWriter classes, which are used for writing text to streams, have their own format() method. We discuss streams in Chapter 12, but this simply means that you can use this same printf-style formatting for writing strings to any kind of stream, whether it be to System.out standard console output, to a file, or to a network connection.

In addition to the format() method, PrintStream and PrintWriter also have a version of the format method that is actually called printf(). The printf() method is identical to and, in fact, simply delegates to the format() method. It’s there solely as a shout-out to the C programmers and ex-C programmers in the audience.
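As a minimal sketch of the three entry points described above, the following class formats the same message with String.format(), with an explicit Formatter, and with PrintStream.printf(). The class and method names are ours, chosen for illustration, and the PrintStream writes to an in-memory buffer rather than a real file or socket.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.util.Formatter;

class FormatEntryPoints {
    static final String TEMPLATE = "My name is %s and I am %d years old.";

    // Formats via the static convenience method on String.
    static String viaStringFormat(String name, int age) {
        return String.format(TEMPLATE, name, age);
    }

    // Formats via an explicit Formatter that appends to a StringBuilder.
    static String viaFormatter(String name, int age) {
        StringBuilder sb = new StringBuilder();
        try (Formatter formatter = new Formatter(sb)) {
            formatter.format(TEMPLATE, name, age);
        }
        return sb.toString();
    }

    // Formats via PrintStream.printf(), capturing the output in memory.
    static String viaPrintStream(String name, int age) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (PrintStream out = new PrintStream(buffer)) {
            out.printf(TEMPLATE, name, age);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.println(viaStringFormat("Pat", 30));
        System.out.println(viaFormatter("Pat", 30));
        System.out.println(viaPrintStream("Pat", 30));
    }
}
```

All three calls should produce the same text; which one you choose depends mainly on where the output needs to go.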

The Format String

The syntax of the format string is compact and a bit cryptic at first, but not bad once you get used to it. The simplest format string is just a percent sign (%) followed by a conversion character. For example, the following text has two embedded format strings:

    "My name is %s and I am %d years old."

The first conversion character is s, the most general format, which represents a string value; and the second is d, which represents an integer value. There are about a dozen basic conversion characters corresponding to different types and primitives and there are a couple of dozen more that are specifically used for formatting dates and times. We cover the basics here and return to date and time formatting in Chapter 11.

At first glance, some of the conversion characters may not seem to do much. For example, the %s general string conversion in our previous example would actually have handled the job of displaying the numeric age argument just as well as %d. However, these specialized conversion characters accomplish three things. First, they add a level of type safety. By specifying %d, we ensure that only an integer type is formatted at that location. If we make a mistake in the arguments, we get a runtime IllegalFormatConversionException instead of garbage in our string (and your IDE may flag it as well). Second, the format method is Locale-sensitive and capable of displaying numbers, percentages, dates, and times in many different languages just by specifying a Locale as an argument. By telling the Formatter the type of argument with type-specific conversion characters, printf can take into account language-specific localizations. Third, additional flags and fields can be used to govern layout with different meanings for different types of arguments. For example, with floating-point numbers, you can specify a precision in the format string.
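The first two points can be illustrated with a short sketch. The helper names below are hypothetical; the example assumes only the standard String.format() overload that takes a Locale and the IllegalFormatConversionException mentioned above.

```java
import java.util.IllegalFormatConversionException;
import java.util.Locale;

class ConversionSafety {
    // Returns true if the %d conversion refuses to format the argument.
    static boolean rejectedByPercentD(Object arg) {
        try {
            String.format("%d", arg);
            return false;
        } catch (IllegalFormatConversionException e) {
            return true;
        }
    }

    // Formats a number with grouping separators and two decimal places
    // according to the conventions of the given Locale.
    static String grouped(Locale locale, double value) {
        return String.format(locale, "%,.2f", value);
    }

    public static void main(String[] args) {
        System.out.println(rejectedByPercentD("forty-two")); // true
        System.out.println(rejectedByPercentD(42));          // false
        System.out.println(grouped(Locale.US, 1234567.891));
        System.out.println(grouped(Locale.GERMANY, 1234567.891));
    }
}
```

With the US locale the value is grouped with commas and uses a period as the decimal separator; with the German locale the two characters are swapped.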

The general layout of the embedded format string is as follows:

    %[argument_index$][flags][width][.precision]conversion_type

Following the literal % are a number of optional items before the conversion type character. We’ll discuss these as they come up, but here’s the rundown. The argument index can be used to reorder or reuse individual arguments in the variable-length argument list by referring to them by number. The flags field holds one or more special flag characters governing the format. The width and precision fields control the size of the output for text and the number of digits displayed for floating-point numbers.
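To make the layout concrete, here is a small sketch that uses every optional field at least once. The class name and the sample values are ours; Locale.US is passed explicitly only so that the decimal separator is predictable.

```java
import java.util.Locale;

class SpecifierAnatomy {
    // %2$-8s   : argument 2, left-justified (-), minimum width 8
    // %1$+10.2f: argument 1, explicit sign (+), width 10, precision 2
    static String describe(double value, String label) {
        return String.format(Locale.US, "[%2$-8s][%1$+10.2f]", value, label);
    }

    public static void main(String[] args) {
        System.out.println(describe(3.14159, "pi"));
        // [pi      ][     +3.14]
    }
}
```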

String Conversions

The conversion character s represents the general string conversion type. Ultimately, all of the conversion types produce a String; what we mean is that the general string conversion takes the easy route to turning its argument into a string. Normally, this simply means calling toString() on the object. Since all of the arguments in the variable argument list are passed as Objects, any primitives are autoboxed and represented by the results of calling toString() on their wrapper classes, which generally return the value as you’d expect. If the argument is null, the result is the String “null”.

More interesting are objects that implement the java.util.Formattable interface. For these, the argument’s formatTo() method is invoked, passing it the Formatter along with the flags, width, and precision information and allowing it to write its own output to the Formatter. In this way, objects can control their own printf string representation, just as an object can do so using toString().
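A minimal sketch of a Formattable implementation might look like the following. The Ticker class is hypothetical; the interface, the formatTo() signature, and the FormattableFlags constants are from java.util. How to honor width, precision, and the flags is up to the implementing class, and this version simply mimics what the s conversion does for ordinary strings.

```java
import java.util.Formattable;
import java.util.FormattableFlags;
import java.util.Formatter;
import java.util.Locale;

// Hypothetical value class that controls its own %s / %S output.
class Ticker implements Formattable {
    private final String symbol;

    Ticker(String symbol) {
        this.symbol = symbol;
    }

    @Override
    public void formatTo(Formatter formatter, int flags, int width, int precision) {
        String text = symbol;
        // A precision of -1 means "not specified"; otherwise truncate.
        if (precision >= 0 && text.length() > precision) {
            text = text.substring(0, precision);
        }
        // %S sets the UPPERCASE flag.
        if ((flags & FormattableFlags.UPPERCASE) != 0) {
            text = text.toUpperCase(Locale.ROOT);
        }
        // A width of -1 means "not specified", so the loop does nothing.
        boolean leftJustify = (flags & FormattableFlags.LEFT_JUSTIFY) != 0;
        StringBuilder padded = new StringBuilder(text);
        while (padded.length() < width) {
            if (leftJustify) {
                padded.append(' ');
            } else {
                padded.insert(0, ' ');
            }
        }
        // Write the result to the Formatter we were handed.
        formatter.format("%s", padded);
    }

    public static void main(String[] args) {
        Ticker t = new Ticker("orly");
        System.out.println(String.format("%s", t));      // orly
        System.out.println(String.format("%S", t));      // ORLY
        System.out.println(String.format("[%-6s]", t));  // [orly  ]
        System.out.println(String.format("[%6.2s]", t)); // [    or]
    }
}
```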

Width, precision, and justification

For simple text arguments, you can think of the width and precision as a minimum and maximum number of characters to be output. As we’ll see later, for floating-point numeric types, the precision changes meaning slightly and controls the number of digits displayed after the decimal point. We can see the effect on a simple string here:

    System.out.printf("String is '%5s'\n", "A");
    // String is '    A'
    System.out.printf("String is '%.5s'\n", "Happy Birthday!");
    // String is 'Happy'

In the first case, we specified a width of five characters, resulting in spaces being added to pad our argument. In the second example, we used the literal . followed by the precision value of 5 to limit the length of the string displayed, so our “Happy Birthday!” string is truncated after the first five characters.

When our string was padded, it was right-justified (leading spaces added). You can control this with the flag character literal minus (-). Reversing our example:

    System.out.printf("String is '%-5s'\n", "A");
    // String is 'A    '

And, of course, we can combine all three, specifying a justification flag and a minimum and maximum width. Here is an example that prints words of varying lengths in two columns:

    String [] words =
       new String [] { "abalone", "ape", "antidisestablishmentarianism" };
    System.out.printf( "%-10s %s\n", "Word", "Length" );
    for ( String word : words )
       System.out.printf( "%-10.10s %s\n", word, word.length() );

    // output
    Word       Length
    abalone    7
    ape        3
    antidisest 28

Uppercase

The s conversion’s big brother S indicates that the output of the conversion should be forced to uppercase. Several other primitive and numeric conversion characters follow this pattern, as we’ll see later. For example:

    String word = "abalone";
    System.out.printf( "The lucky word is: %S\n", word );
    // The lucky word is: ABALONE

Numbered arguments

You can refer to an arbitrary argument by number from a format string using the %n$ notation, where n is the position of the argument in the list, starting at 1. For example, the following code snippet uses the single argument three times:

    System.out.printf( "A %1$s is a %1$s is a %1$S...\n", "rose" );
    // A rose is a rose is a ROSE...

Numbered arguments are useful for two reasons. The first, shown here, is simply for reusing the same argument in different places and with different conversions. The usefulness of this becomes more apparent when we look at date and time formatting in Chapter 11, where we may refer to the same item half a dozen times to get individual fields. The second advantage is that numbered arguments give the message the flexibility to reorder the arguments. This is important when you’re using formatting strings to lay out a message for internationalization or customization purposes where convention may dictate a different ordering. For example:

    log.format("Error %d : %s\n", errNo, errMsg );
    // Error 42 : Low Power
    log.format("%2$s (Error %1$d)\n", errNo, errMsg );
    // Low Power (Error 42)
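The practical consequence is that the calling code never has to change when the wording does. In the sketch below, the two templates are hypothetical stand-ins for strings that would normally be loaded from a resource bundle or configuration file; the render() method passes its arguments in a fixed order regardless of which template is selected.

```java
import java.util.Locale;

class ErrorMessages {
    // Hypothetical templates; in practice these might come from a resource bundle.
    static final String CODE_FIRST = "Error %1$d : %2$s";
    static final String MESSAGE_FIRST = "%2$s (Error %1$d)";

    // The argument order is fixed here; the template decides the layout.
    static String render(String template, int errNo, String errMsg) {
        return String.format(Locale.ROOT, template, errNo, errMsg);
    }

    public static void main(String[] args) {
        System.out.println(render(CODE_FIRST, 42, "Low Power"));
        // Error 42 : Low Power
        System.out.println(render(MESSAGE_FIRST, 42, "Low Power"));
        // Low Power (Error 42)
    }
}
```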

Primitive and Numeric Conversions

Table 10-3 shows character and Boolean conversion characters.

Table 10-3. Character and Boolean conversion characters

| Conversion | Type | Description | Example output |
| --- | --- | --- | --- |
| c | Character | Formats the result as a Unicode character | a |
| b, B | Boolean | Formats the result as a Boolean | true, FALSE |

The c conversion character produces a Unicode character:

    System.out.printf("The first letter is: %c\n", 'a' );

The b and B conversion characters output the Boolean value of their arguments. If the argument is null, the output is false. Strangely, if the argument is non-null and of a type other than Boolean, the output is true. B is identical to b except that it forces the output to uppercase.

    System.out.printf( "The door is open: %b\n", ( door.status() == OPEN ) );

As for String types, a width value can be specified on c and b conversions to pad the result to a minimum length. Table 10-4 summarizes integer type conversion characters.
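The rules for b, and the effect of a width on b and c, can be checked with a few calls. This is only a sketch; the class and method names are ours.

```java
class BooleanAndCharDemo {
    // Applies the %b conversion to an arbitrary argument.
    static String asBoolean(Object arg) {
        return String.format("%b", arg);
    }

    public static void main(String[] args) {
        System.out.println(asBoolean(null));            // false
        System.out.println(asBoolean(Boolean.FALSE));   // false
        System.out.println(asBoolean("hello"));         // true (non-null, non-Boolean)
        System.out.println(asBoolean(0));               // true (non-null, non-Boolean)
        System.out.println(String.format("%B", true));  // TRUE
        System.out.println(String.format("[%6b]", true)); // [  true]
        System.out.println(String.format("[%3c]", 'a'));  // [  a]
    }
}
```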

Table 10-4. Integer type conversion characters

| Conversion | Type | Description | Example output |
| --- | --- | --- | --- |
| d | Integer | Formats the result as a decimal (base 10) integer. | 999 |
| x, X | Integer | Formats the result as hexadecimal. | ff, FF |
| o | Integer | Formats the result as an octal integer. | 10, 010 |
| h, H | Integer or object | Formats the argument’s hashCode() value as a hexadecimal number, or “null” for a null value. | 7a71e498 |

The d, x, and o conversion characters handle the integer type values byte, short, int, and long. (The d stands for decimal, meaning base 10, in contrast to the hexadecimal and octal conversions.) The h conversion is an oddity probably intended for debugging: it prints the argument’s hashCode() in hexadecimal. Several important flags give additional control over the formatting of these numeric types. See the section Flags for details.

A width value can be specified on these conversions to pad the result. Precision values are not allowed on integer conversions.
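The following sketch exercises each integer conversion and shows what happens when a precision is supplied anyway. The class and method names are ours; IllegalFormatPrecisionException is the standard java.util exception thrown in that case.

```java
import java.util.IllegalFormatPrecisionException;

class IntegerConversions {
    // Returns true if the format string is rejected because of its precision.
    static boolean precisionRejected(String format, int value) {
        try {
            String.format(format, value);
            return false;
        } catch (IllegalFormatPrecisionException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(String.format("%d", 255));    // 255
        System.out.println(String.format("%x", 255));    // ff
        System.out.println(String.format("%X", 255));    // FF
        System.out.println(String.format("%o", 8));      // 10
        System.out.println(String.format("[%5d]", 42));  // [   42]
        // %h prints the hash code of any object in hexadecimal.
        System.out.println(String.format("%h", "hi"));
        System.out.println(precisionRejected("%.2d", 42)); // true
    }
}
```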

Table 10-5 lists floating-point type conversion characters.

Table 10-5. Floating-point type conversion characters

| Conversion | Type | Description | Example output |
| --- | --- | --- | --- |
| f | Floating point | Formats the result as a decimal number. | 3.14 |
| e, E | Floating point | Formats the result in scientific notation. | 3.000000e+08 |
| g, G | Floating point | Formats the result in either decimal or scientific notation depending on value and precision. | 3.14000, 1.00000e-14 |
| a, A | Floating point | Formats the result as a hexadecimal floating-point number with significand and exponent. | 0x1.fep7 |

The f conversion character is the primary floating-point conversion character. e and g conversions allow for values to be formatted in scientific notation. a complements the ability in Java to assign floating-point values using hexadecimal significand and exponent notation, allowing bit-for-bit floating-point values to be displayed without ambiguity.
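As a quick illustration of the e and a conversions, the sketch below formats a few values and checks that the hexadecimal form printed by a is the same value Java accepts as a hexadecimal floating-point literal. The class and helper names are ours, and Locale.US is used only to keep the decimal separator predictable.

```java
import java.util.Locale;

class FloatingPointConversions {
    // Formats with a fixed Locale so the output does not vary by machine.
    static String fmt(String pattern, double value) {
        return String.format(Locale.US, pattern, value);
    }

    public static void main(String[] args) {
        System.out.println(fmt("%e", 3.0e8));        // 3.000000e+08
        System.out.println(fmt("%E", 3.0e8));        // 3.000000E+08
        System.out.println(fmt("%.2e", 12345.678));  // 1.23e+04
        System.out.println(fmt("%a", 255.0));        // 0x1.fep7
        // The same notation is a valid Java literal for the same value.
        System.out.println(0x1.fep7 == 255.0);       // true
    }
}
```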

As always, a width value may be used to pad results to a minimum length. The precision value of the conversion, as its name suggests, controls the number of digits displayed after the decimal point for floating-point values. The value is rounded as necessary. If no precision value is specified, it defaults to six digits:

    System.out.printf("float is %f\n",   1.23456789); // float is 1.234568
    System.out.printf("float is %.3f\n", 1.23456789); // float is 1.235
    System.out.printf("float is %.1f\n", 1.23456789); // float is 1.2
    System.out.printf("float is %.0f\n", 1.23456789); // float is 1

The g conversion character determines whether to use decimal or scientific notation. First, the value is rounded to the specified precision, which for g means the total number of significant digits rather than the number of digits after the decimal point. If the result is less than 10^-4 (less than .0001) or greater than or equal to 10^precision (10 to the power of the precision value), it is displayed in scientific notation. Otherwise, decimal notation is displayed.
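The switch between the two notations can be observed by formatting values of different magnitudes with the same precision. This is a sketch with names of our own choosing; with a precision of 3, the threshold for scientific notation on the large side is 10^3.

```java
import java.util.Locale;

class GeneralConversion {
    // Formats with a fixed Locale so the decimal separator is predictable.
    static String fmt(String pattern, double value) {
        return String.format(Locale.US, pattern, value);
    }

    public static void main(String[] args) {
        // Within [10^-4, 10^3): decimal notation, three significant digits.
        System.out.println(fmt("%.3g", 3.14159));       // 3.14
        // Below 10^-4: scientific notation.
        System.out.println(fmt("%.3g", 0.0000314159));  // 3.14e-05
        // At or above 10^3: scientific notation.
        System.out.println(fmt("%.3g", 1234.5));        // 1.23e+03
        // With no precision given, six significant digits are used.
        System.out.println(fmt("%g", 3.14));            // 3.14000
    }
}
```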

Flags

Table 10-6 summarizes supported flags to use in format strings.

Table 10-6. Flags for format strings

| Flag | Arg types | Description | Example output |
| --- | --- | --- | --- |
| - | Any | Left-justifies result (pad space on the right) | 'foo ' |
| + | Numeric | Prefixes a + sign on positive results | +1 |
| ' ' (space) | Numeric | Prefixes a space on positive results (aligning them with negative values) | ' 1' |
| 0 | Numeric | Pads number with leading zeros to accommodate width requirement | 000001 |
| , | Numeric | Formats numbers with commas or other Locale-specific grouping characters | 1,234,567 |
| ( | Numeric | Encloses negative numbers in parentheses (a convention used to show credits) | (42.50) |
| # | x, X, o | Uses an alternate form for octal and hexadecimal output | 0xcafe, 010 |

As mentioned earlier, the - flag can be used to left-justify formatted output. The remaining flags affect the display of numeric types as described.

The # alternate form flag can be used to print octal and hexadecimal values with their standard prefixes: 0x for hexadecimal or 0 for octal. Note that the uppercase X conversion uppercases the entire result, including the prefix:

    System.out.printf("%1$X, %1$#X, %1$#x\n", 0xCAFE ); // CAFE, 0XCAFE, 0xcafe
    System.out.printf("%1$o, %1$#o\n", 8 ); // 10, 010
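For reference, here is a sketch that applies each flag from Table 10-6 once. The class and helper names are ours, and Locale.US is passed explicitly so that the grouping and decimal characters are predictable.

```java
import java.util.Locale;

class FlagDemo {
    // Formats with a fixed Locale so separators do not vary by machine.
    static String fmt(String pattern, Object... args) {
        return String.format(Locale.US, pattern, args);
    }

    public static void main(String[] args) {
        System.out.println(fmt("[%-5s]", "foo"));  // [foo  ]
        System.out.println(fmt("%+d", 1));         // +1
        System.out.println(fmt("[% d]", 1));       // [ 1]
        System.out.println(fmt("%06d", 1));        // 000001
        System.out.println(fmt("%,d", 1234567));   // 1,234,567
        System.out.println(fmt("%(.2f", -42.5));   // (42.50)
        System.out.println(fmt("%#x", 0xCAFE));    // 0xcafe
        System.out.println(fmt("%#X", 0xCAFE));    // 0XCAFE
        System.out.println(fmt("%#o", 8));         // 010
    }
}
```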

Miscellaneous

Table 10-7 lists the remaining formatting items.

Table 10-7. Miscellaneous formatting items

| Conversion | Description |
| --- | --- |
| % | Produces a literal % character (Unicode \u0025) |
| n | Produces the platform-specific line separator (e.g., newline or carriage-return, newline) |
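In a format string these two items are written as %% and %n. A brief sketch, with a class name of our own choosing:

```java
class MiscConversions {
    public static void main(String[] args) {
        // %% emits a single literal percent sign.
        System.out.println(String.format("%d%% complete", 50)); // 50% complete

        // %n emits the same string that System.lineSeparator() returns.
        String twoLines = String.format("first%nsecond");
        System.out.println(twoLines.equals("first" + System.lineSeparator() + "second")); // true
    }
}
```

Unlike a hardcoded \n, %n adapts to the platform the program is running on.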
