PHP, as of version 5, knows very little about character sets. As we have seen previously, in UTF-8 what people see as a single character may be one, two, or three bytes long in simple cases, and four bytes for less common characters such as those outside the Basic Multilingual Plane. But when PHP examines a string with a function such as strlen, it counts bytes, not characters. The length returned by strlen for a single UTF-8 character could therefore be 1, 2, 3, or 4.
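To make the byte counting concrete, here is a minimal sketch that calls strlen on four single characters. The sample characters and the variable names ($samples, $lengths) are chosen purely for illustration. Each character is written as explicit UTF-8 byte escapes, so the result does not depend on the encoding in which the script itself is saved.

```php
<?php
// Four single characters, spelled out as raw UTF-8 bytes.
// The labels and array names are illustrative, not from the text.
$samples = array(
    'a'      => "a",                 // U+0061, 1 byte
    'eacute' => "\xC3\xA9",          // U+00E9, 2 bytes
    'euro'   => "\xE2\x82\xAC",      // U+20AC, 3 bytes
    'gclef'  => "\xF0\x9D\x84\x9E",  // U+1D11E, 4 bytes
);

$lengths = array();
foreach ($samples as $label => $char) {
    // strlen reports the number of bytes, not the number of characters.
    $lengths[$label] = strlen($char);
    echo $label, ': ', $lengths[$label], " byte(s)\n";
}
```

Every entry is one character to a human reader, yet the reported lengths are 1, 2, 3, and 4 respectively.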
On the plus side, PHP does not damage or alter strings simply by handling them. A string that contains UTF-8 characters can be moved around, stored, retrieved, and sent to the browser without being corrupted. Provided, that is, we ...