        while ((line = infile.readLine()) != null) {
            Matcher m = regexp.matcher(line);
            if (m.find()) {
                System.out.println(line);
            }
        }
    }
}
International Components for Unicode (ICU)
The International Components for Unicode (ICU) activity is driven by major software
companies, but it also involves voluntary work and follows the open source principle.
The ICU software consists of components (subroutines, modules) that are available as
source code and are portable across operating systems. ICU is often characterized as a
“project,” but by its nature it has to be a continuous activity in order to keep up
with the development of the Unicode standard and related specifications.
Originally released in 1999 as “IBM Classes for Unicode” and still substantially
supported by IBM and other vendors, ICU has become the first choice, whenever it can
be used, for building software that works with Unicode data. ICU was originally
written in Java, and support for C and C++ was added later. The Java version is called
ICU4J, and the C and C++ version is called ICU4C.
The official ICU site is hosted at http://www.ibm.com/software/globalization/icu/. It
contains a handy “Getting started with ICU” section. The other key site is
http://icu.sourceforge.net/, hosted by SourceForge, a development and download
repository for open source code and applications. The two sites link to each other in
many places, so you can start from either one. ICU contains software components for
several purposes:
Basic text
    Unicode text handling, character properties, and character code conversions
Text analysis
    Unicode regular expressions and characters, operations on collections (sets) of
    characters, and detection of word and line boundaries
Sorting and searching
    Language-sensitive collation and searching
Transformations
    Normalization forms, case mappings, transliterations
Locales
    General locale data and resource bundle architecture
Complex text layout
    For example, Arabic, Hebrew, Indic, and Thai
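
To make a few of these areas concrete, here is a minimal sketch of normalization, language-sensitive sorting, and word boundary detection. It deliberately uses the JDK's own java.text classes rather than ICU itself, so that it runs without any extra library. ICU4J offers classes with the same names (Normalizer, Collator, BreakIterator) in its com.ibm.icu.text package, but treat the correspondence as an assumption to check against the ICU4J documentation. The class name TextServicesSketch and its method names are invented for this illustration.

```java
import java.text.BreakIterator;
import java.text.Collator;
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Illustrative only: JDK java.text classes stand in for their ICU counterparts.
public class TextServicesSketch {

    // Transformations: convert a string to Normalization Form C (NFC).
    static String toNfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    // Sorting: order words by the collation rules of the given locale
    // instead of by raw code unit values.
    static String[] sortFor(Locale locale, String... words) {
        String[] copy = words.clone();
        Arrays.sort(copy, Collator.getInstance(locale));
        return copy;
    }

    // Text analysis: split text at word boundaries and keep only the
    // segments that start with a letter or digit.
    static List<String> words(String text, Locale locale) {
        BreakIterator it = BreakIterator.getWordInstance(locale);
        it.setText(text);
        List<String> out = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String piece = text.substring(start, end);
            if (Character.isLetterOrDigit(piece.codePointAt(0))) {
                out.add(piece);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // "e" followed by U+0301 COMBINING ACUTE ACCENT composes to U+00E9.
        System.out.println(toNfc("e\u0301").length());
        System.out.println(Arrays.toString(sortFor(Locale.ENGLISH, "banana", "Zebra", "apple")));
        System.out.println(words("Hello, world!", Locale.ENGLISH));
    }
}
```

A plain Arrays.sort without the Collator would place "Zebra" first, because uppercase letters have smaller code values than lowercase ones. The collator compares base letters first, which is the kind of language-sensitive behavior the “Sorting and searching” components provide.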