4.2. Coding in a Post-ASCII World

The “age of ASCII” is gone, though some have not realized it yet. Many assumptions commonly made by programmers in the past are no longer true. We need a new way of thinking.

There are two concepts that I consider to be fundamental, almost axiomatic. First, a string has no intrinsic interpretation; it must be interpreted according to some external standard. Second, a byte does not necessarily correspond to a character; a single character may occupy one or more bytes. There are other lessons to be learned, but these two come first.
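Both points can be seen in a few lines of code. The following is only a minimal sketch, and it assumes a Ruby interpreter with built-in encoding support (1.9 or later), which is not the jcode-era environment this section goes on to describe. The variable names are purely illustrative. The same four bytes are read under two different standards and yield two different strings of two different lengths.

```ruby
# Assumption: Ruby 1.9+ (String#force_encoding, String#bytesize).
# Four raw bytes with no interpretation attached yet (binary string).
bytes = [0xC3, 0xA9, 0xC3, 0xA9].pack("C*")

# Interpret the same bytes according to two external standards.
as_utf8   = bytes.dup.force_encoding("UTF-8")       # two-byte sequences
as_latin1 = bytes.dup.force_encoding("ISO-8859-1")  # one byte per character

byte_count   = bytes.bytesize    # 4 bytes in every case
utf8_chars   = as_utf8.length    # 2 characters (e-acute twice)
latin1_chars = as_latin1.length  # 4 characters (A-tilde, copyright sign, twice)

# Transcode the Latin-1 reading to UTF-8 so the two can be compared.
latin1_text = as_latin1.encode("UTF-8")

puts "bytes: #{byte_count}, UTF-8 chars: #{utf8_chars}, Latin-1 chars: #{latin1_chars}"
```

Nothing in the bytes themselves says which reading is correct; that knowledge has to come from outside the string. Note also that only in the Latin-1 reading does the byte count happen to equal the character count.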

These facts may sometimes affect our programming in subtle ways. Let’s examine in detail how to handle character strings in a modern fashion.

4.2.1. The jcode Library and $KCODE

To use different character ...
