Parsing and Formatting Text
Parsing and formatting text is a large, open-ended topic. So far in this chapter, we’ve looked at only primitive operations on strings—creation, basic editing, searching, and turning simple values into strings. Now we’d like to move on to more structured forms of text. Java has a rich set of APIs for parsing and printing formatted strings, including numbers, dates, times, and currency values. We’ll cover most of these topics in this chapter, but we’ll wait to discuss date and time formatting until Chapter 11.
We’ll start with parsing: reading primitive numbers and values from
strings and chopping long strings into tokens. Then we’ll go the other way
and look at formatting strings and the java.text package. We’ll revisit the topic of
internationalization to see how Java can localize parsing and formatting
of text, numbers, and dates for particular locales. Finally, we’ll take a
detailed look at regular expressions, the most powerful text-parsing tool
Java offers. Regular expressions let you define your own patterns of
arbitrary complexity, search for them, and parse them from text.
We should mention that you’re going to see a great deal of overlap
between the newer formatting and parsing APIs (printf and Scanner), introduced in Java 5.0, and the older
APIs of the java.text package. The newer APIs effectively replace many of the older ones and in some ways are easier to use. Nonetheless, it’s good to know about both because so much existing code uses the older APIs. ...
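To give a rough feel for that overlap, here is a minimal sketch that formats and parses the same number in both styles. The class and method names (OldAndNewFormatting, formatNew, formatOld, parseNew, parseOld) are hypothetical, chosen only for this illustration, and the locale is pinned to Locale.US so that the output does not depend on the machine's default settings.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;
import java.util.Scanner;

// Hypothetical demo class: names are illustrative, not from the text.
public class OldAndNewFormatting {

    // Newer style: a printf-style format string (Java 5.0 and later).
    // "%,.2f" asks for grouping separators and two fraction digits.
    static String formatNew(double value) {
        return String.format(Locale.US, "%,.2f", value);
    }

    // Older style: a NumberFormat object from the java.text package.
    static String formatOld(double value) {
        NumberFormat nf = NumberFormat.getNumberInstance(Locale.US);
        nf.setMinimumFractionDigits(2);
        nf.setMaximumFractionDigits(2);
        return nf.format(value);
    }

    // Newer style: Scanner reads the next token as a double.
    static double parseNew(String text) {
        Scanner scanner = new Scanner(text).useLocale(Locale.US);
        return scanner.nextDouble();
    }

    // Older style: NumberFormat.parse() returns a Number and throws a
    // checked ParseException, rethrown here as an unchecked exception.
    static double parseOld(String text) {
        try {
            return NumberFormat.getNumberInstance(Locale.US).parse(text).doubleValue();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a number: " + text, e);
        }
    }

    public static void main(String[] args) {
        double value = 1234567.891;
        System.out.println(formatNew(value)); // 1,234,567.89
        System.out.println(formatOld(value)); // 1,234,567.89
        System.out.println(parseNew("3.75")); // 3.75
        System.out.println(parseOld("3.75")); // 3.75
    }
}
```

Both pairs of methods produce the same results here. The printf-style call packs its options into a compact format string, while the java.text version configures a reusable formatter object step by step.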