5.3. Conversion of Text from One Encoding to Another

There are few tools for converting text, no doubt because text editors (such as BBEdit and UltraEdit) and word-processing packages (such as MS Word and Corel WordPerfect) handle this process internally. There is, however, a free library of subroutines devoted to converting between encodings: libiconv, developed by Bruno Haible. The command-line program distributed with this library that performs conversions is called iconv.
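As a minimal sketch of iconv at the command line, assuming an iconv that accepts the usual -f (source encoding) and -t (target encoding) options, the following converts a small UTF-8 file to ISO-8859-1. The file names and the sample text are invented for the illustration.

```shell
# Create a sample file containing "café" and a newline, encoded in UTF-8
# (é is the two bytes 0xC3 0xA9, written here in octal).
printf 'caf\303\251\n' > sample-utf8.txt

# Convert from UTF-8 to ISO-8859-1; iconv writes the result to standard output.
iconv -f UTF-8 -t ISO-8859-1 sample-utf8.txt > sample-latin1.txt

# The UTF-8 file is 6 bytes long, the ISO-8859-1 file only 5,
# because é now occupies a single byte (0xE9).
wc -c sample-utf8.txt sample-latin1.txt
```

Running iconv -l prints the encoding names known to the local installation, which is a quick way to check how a given encoding is spelled.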

Figure 5-10. The main window of XKeyCaps

5.3.1. The recode Utility

In this section we shall describe a program with a long history (its origins, under a different name, go back to the 1970s) that today is based on libiconv: recode, by the Québécois François Pinard [293].

To convert a file foo.txt, all that we need is to find the source encoding A and the target encoding B on the list of encodings supported by recode and write:

    recode A..B foo.txt

The file foo.txt will be overwritten. We can also write:

    recode A..B < foo.txt > foo-converted.txt
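For a concrete instance of the two forms, assume that A is ISO-8859-1 and B is UTF-8. Both names, like the sample text, are chosen only for illustration, and the sketch does nothing if recode is not installed.

```shell
if command -v recode >/dev/null 2>&1; then
    # Sample input: "café" and a newline in ISO-8859-1
    # (é is the single byte 0xE9, octal 351).
    printf 'caf\351\n' > foo.txt

    # Filter form: foo.txt is left untouched.
    recode ISO-8859-1..UTF-8 < foo.txt > foo-converted.txt

    # In-place form, applied to a copy so that the original survives.
    cp foo.txt foo-copy.txt
    recode ISO-8859-1..UTF-8 foo-copy.txt

    # Both forms should yield the same bytes; cmp is silent when they match.
    cmp foo-converted.txt foo-copy.txt
fi
```

Because the in-place form destroys the original, working on a copy, as here, is a sensible precaution when the source encoding is not known with certainty.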

In fact, we can go through multiple steps:

    recode A..B..C..D..E foo.txt
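The effect of a chained request can be imitated with a pipeline of iconv calls, one per step. This is only an analogy, under the assumption that each step is an ordinary conversion; it says nothing about how recode itself carries out the chain. The encodings and file names are illustrative.

```shell
# Sample input: "é" and a newline in UTF-8.
printf '\303\251\n' > chain-in.txt

# UTF-8 -> ISO-8859-1 -> UTF-16BE, each arrow being one iconv process.
iconv -f UTF-8 -t ISO-8859-1 < chain-in.txt |
    iconv -f ISO-8859-1 -t UTF-16BE > chain-out.txt

# Dump the result in hexadecimal to inspect the bytes.
od -An -tx1 chain-out.txt
```

In such a pipeline, a character that an intermediate encoding cannot represent makes that step fail, so the choice of intermediate encodings matters.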

What is even more interesting is that recode has the notion of a surface, which is roughly the equivalent of Unicode's serialization mechanisms (see page 62): a technique for transmitting data without changing the encoding. If S is a serialization mechanism, we can write:

    recode A..B/S foo.txt

and, in addition to the ...
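To make the idea of a surface more tangible, here is a rough analogy that uses iconv and base64 rather than recode itself: the text is first converted to the target encoding, and the resulting bytes are then wrapped in Base64 for transport. Base64 stands in for S purely as an assumption of this sketch, and the file names are invented.

```shell
# Sample input: "café" and a newline in UTF-8.
printf 'caf\303\251\n' > surf-in.txt

# Step 1 changes the encoding (UTF-8 to ISO-8859-1).
# Step 2 changes only the outer representation of those bytes (Base64).
iconv -f UTF-8 -t ISO-8859-1 surf-in.txt | base64 > surf-out.b64

# The file now holds plain ASCII that decodes back to the ISO-8859-1 bytes.
cat surf-out.b64
```

The second step leaves the encoding alone: decoding the Base64 text gives back exactly the ISO-8859-1 bytes produced by the first step.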
