The class of Java by Pravin Jain

APPENDIX A

What is UTF?

UTF is an abbreviation for UCS Transformation Format, and UCS is an abbreviation for Universal Character Set. The Universal Character Set is synchronized with the Unicode Standard. There are three commonly known UTF encodings: UTF-8, UTF-16 and UTF-32.

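UTF-8 encodes Unicode characters into a sequence of 8-bit values known as code units; that is, in UTF-8 the code unit is 8 bits long. Similarly, UTF-16 and UTF-32 use code units of 16 and 32 bits respectively. A single character may need more than one code unit: UTF-8 uses one to four code units per character, UTF-16 uses one or two, and UTF-32 always uses exactly one.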
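The difference in code unit size can be observed by encoding the same character in each format and counting the resulting code units. The following is a minimal sketch, not code from the book: the class name `UtfCodeUnits` and its helper methods are hypothetical, it assumes Java 7 or later for `StandardCharsets`, and it assumes the runtime provides the `UTF-32BE` charset (common JDKs do, but it is not one of the charsets every Java platform is required to support).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class UtfCodeUnits {
    // Code units = encoded length in bytes / bytes per code unit.
    static int codeUnits(String s, Charset cs, int bytesPerUnit) {
        return s.getBytes(cs).length / bytesPerUnit;
    }

    static int utf8Units(String s) {
        return codeUnits(s, StandardCharsets.UTF_8, 1);
    }

    // The big-endian variant is used so that no byte order mark is emitted.
    static int utf16Units(String s) {
        return codeUnits(s, StandardCharsets.UTF_16BE, 2);
    }

    // Assumption: the runtime supports the "UTF-32BE" charset.
    static int utf32Units(String s) {
        return codeUnits(s, Charset.forName("UTF-32BE"), 4);
    }

    public static void main(String[] args) {
        String[] samples = {
            "A",                                     // U+0041
            "\u00E9",                                // U+00E9
            "\u20AC",                                // U+20AC
            new String(Character.toChars(0x1D11E))   // U+1D11E, outside the 16-bit range
        };
        for (String s : samples) {
            System.out.printf("U+%04X: UTF-8=%d, UTF-16=%d, UTF-32=%d code unit(s)%n",
                    s.codePointAt(0), utf8Units(s), utf16Units(s), utf32Units(s));
        }
    }
}
```

For these four samples the UTF-8 counts are 1, 2, 3 and 4, the UTF-16 counts are 1, 1, 1 and 2, and the UTF-32 count is always 1.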

The Unicode code space contains over a million code points, although the current version of the Unicode Standard (v5.2.0 at the time of writing this book) assigns characters to only a fraction of them. The valid range of code points for Unicode characters is from 0 to 10FFFF (in hex). Out of ...
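The range of code points described above is exposed in Java through constants and methods of `java.lang.Character`. The sketch below is illustrative only; the class name `CodePointRange` and its methods are hypothetical, while `Character.MIN_CODE_POINT`, `Character.MAX_CODE_POINT` and `Character.isValidCodePoint` are part of the standard library.

```java
// Hypothetical helper class, for illustration only.
class CodePointRange {
    // Size of the code space: every integer from 0 to 0x10FFFF inclusive.
    static int totalCodePoints() {
        return Character.MAX_CODE_POINT - Character.MIN_CODE_POINT + 1;
    }

    // True when cp lies in the range 0 to 0x10FFFF.
    static boolean inRange(int cp) {
        return Character.isValidCodePoint(cp);
    }

    public static void main(String[] args) {
        System.out.printf("Lowest code point:  U+%04X%n", Character.MIN_CODE_POINT);
        System.out.printf("Highest code point: U+%X%n", Character.MAX_CODE_POINT);
        System.out.println("Code points in range: " + totalCodePoints());
        System.out.println("0x10FFFF in range? " + inRange(0x10FFFF));
        System.out.println("0x110000 in range? " + inRange(0x110000));
    }
}
```

The total works out to 1,114,112 code points, which is the size of the code space rather than the number of characters actually assigned.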
