
Character Requirements of Languages
Although Unicode contains almost all characters used in languages in current use, it
remains, and will always remain, relevant to consider the character requirements that
different languages impose. Here we first list some of the reasons for this, then analyze
the concept of “character requirements,” and finally study some specific languages.
The Impact of Character Repertoire
As mentioned in the section “Definitions of Character Repertoires” in Chapter 1, there
are good reasons to try to estimate the repertoire of characters that will appear in a
document or in an application. In more detail, the reasons include the following:
• A font typically supports only a limited character repertoire. Full Unicode fonts
are rare, and usually not suitable for copy text.
• In particular, artistic or otherwise special fonts, such as those used for headings
and buttons, often have a very limited character repertoire.
• A program that will be used for processing your document in some way might be
able to handle only a limited repertoire.
• Special characters in normal text often result from mistyping or other errors. When
checking input data, it is often useful to detect any “unusual” characters and issue
warnings about them.
• In particular, character recognition (in scanning text or in processing handwritten
characters) works best if the assumed repertoire is small. ...
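The idea of checking input against an assumed repertoire can be sketched in a few lines of Python. This is only an illustration, not a recommended implementation: the repertoire in `EXPECTED` (ASCII letters, digits, some punctuation, and the letters ä, ö, and å) and the function name `find_unusual` are assumptions made up for this example, and a real application would define the repertoire according to its own languages.

```python
import string
import unicodedata

# Hypothetical repertoire for this example: ASCII letters and digits,
# some common punctuation, and a few extra letters (an assumption).
EXPECTED = set(string.ascii_letters + string.digits + " .,;:!?'\"()-\n") | set("äöåÄÖÅ")


def find_unusual(text, expected=EXPECTED):
    """Return (index, character, Unicode name) for each character
    that falls outside the expected repertoire."""
    report = []
    for index, ch in enumerate(text):
        if ch not in expected:
            # Some characters (e.g., most control characters) have no name.
            name = unicodedata.name(ch, "<unnamed>")
            report.append((index, ch, name))
    return report


if __name__ == "__main__":
    for index, ch, name in find_unusual("A na\u00efve caf\u00e9"):
        print(f"warning: unusual character U+{ord(ch):04X} ({name}) at index {index}")
```

A warning of this kind does not mean that the character is wrong, only that it deserves a second look. The smaller the assumed repertoire, the more such checks can catch, but also the more false alarms they raise.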