
        while ((line = infile.readLine()) != null) {
            Matcher m = regexp.matcher(line);
            if (m.find()) {
                System.out.println(line);
            }
        }
    }
}
International Components for Unicode (ICU)
The International Components for Unicode (ICU) activity is driven by major software
companies, but it also involves voluntary work and is based on the open source
principle. The ICU software consists of components (subroutines, modules) that are
available as source code and are portable to different operating systems. ICU is often
characterized as a “project,” but by its nature it has to be a continuous activity in
order to keep up with the development of the Unicode standard and related specifications.
Originally released (in 1999) as “IBM Classes for Unicode” and still substantially
supported by IBM and other vendors, ICU has become the first choice, whenever it can be
used, for building software that works with Unicode data. ICU was originally written in
Java, and support for C and C++ was added later. The Java version is called ICU4J, and
the C and C++ version is called ICU4C.
The official ICU site is hosted at http://www.ibm.com/software/globalization/icu/. It
contains a handy “Getting started with ICU” section. The other key site is
http://icu.sourceforge.net/, hosted by SourceForge, a development and download
repository for open source code and applications. The sites are linked together in many
ways, so you can