Grapheme Clusters
Unicode 3.2 introduces a new concept called the “grapheme cluster.” Actually, the concept isn't all that new; Unicode 3.2 merely formalizes a concept that was already out there, nailing down a more specific definition and some related character properties, and giving it a new name.
A grapheme cluster is a sequence of one or more Unicode code points that should be treated as a single unit by various processes:
Text-editing software should generally allow placement of the cursor only at grapheme cluster boundaries. Clicking the mouse on a piece of text should place the insertion point at the nearest grapheme cluster boundary, and the arrow keys should move forward and back one grapheme cluster at a time.
Text-rendering software ...
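The cursor behavior described for text-editing software can be sketched in a few lines of Python. This is only a rough illustration, not the Unicode definition of a grapheme cluster: it assumes a deliberately simplified rule in which a cluster is a code point plus any combining marks (general categories Mn, Me, Mc) that follow it, with CR LF kept together. It ignores Hangul syllables and other cases that the real rules cover. The function names `cluster_boundaries`, `nearest_boundary`, `next_boundary`, and `prev_boundary` are hypothetical and chosen for this example.

```python
import unicodedata

# Assumption: combining marks attach to the preceding code point.
COMBINING_CATEGORIES = ("Mn", "Me", "Mc")


def cluster_boundaries(text: str) -> list[int]:
    """Return the offsets where a cursor may sit, under the simplified rule."""
    boundaries = [0]
    for i in range(1, len(text)):
        prev, cur = text[i - 1], text[i]
        if unicodedata.category(cur) in COMBINING_CATEGORIES:
            continue  # a combining mark never starts a new cluster
        if prev == "\r" and cur == "\n":
            continue  # keep CR LF together as one unit
        boundaries.append(i)
    if text:
        boundaries.append(len(text))
    return boundaries


def nearest_boundary(text: str, offset: int) -> int:
    """Mouse click: snap an arbitrary offset to the closest boundary."""
    return min(cluster_boundaries(text), key=lambda b: abs(b - offset))


def next_boundary(text: str, offset: int) -> int:
    """Right arrow: move forward one cluster, stopping at the end of the text."""
    return next((b for b in cluster_boundaries(text) if b > offset), len(text))


def prev_boundary(text: str, offset: int) -> int:
    """Left arrow: move back one cluster, stopping at the start of the text."""
    return max((b for b in cluster_boundaries(text) if b < offset), default=0)


# "a", then "e" with two combining marks (acute, dot below), then "x".
sample = "ae\u0301\u0323x"
print(cluster_boundaries(sample))
print(nearest_boundary(sample, 3))
```

In the sample string, offsets 2 and 3 fall inside the cluster formed by "e" and its two combining marks, so a click there snaps to an edge of that cluster and an arrow key steps over all three code points at once. Production code would normally rely on a library that implements the full segmentation rules rather than a hand-rolled check like this one.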