CHAPTER 15 ■ CASE STUDY: PORTING CHARDET TO PYTHON 3
Hebrew is handled as a special case. If the text appears to be Hebrew based on two-character
distribution analysis, HebrewProber (defined in hebrewprober.py) tries to distinguish between Visual
Hebrew (where the source text is stored “backwards” line-by-line and then displayed verbatim, so it can
be read from right to left) and Logical Hebrew (where the source text is stored in reading order and then
rendered right-to-left by the client). Certain characters are encoded differently based on whether they
appear in the middle of or at the end of a word, so we can make a reasonable guess about the direction of the
source text and return the appropriate encoding (windows-1255 for Logical Hebrew ...
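
The final-letter idea described above can be illustrated with a small sketch. This is not the code in hebrewprober.py: it is a simplified illustration that assumes the text has already been decoded to a Python 3 string, whereas a real prober works on raw bytes. The function name guess_hebrew_direction and the simple scoring rule are hypothetical, introduced here only for illustration. Five Hebrew letters (kaf, mem, nun, pe, and tsadi) have a distinct final form, so in Logical Hebrew a final form should sit at the end of a stored word, and in Visual Hebrew it should sit at the start.

```python
# Hebrew letters with a distinct word-final form, written as Unicode escapes:
# final kaf, final mem, final nun, final pe, final tsadi.
FINAL_FORMS = {"\u05da", "\u05dd", "\u05df", "\u05e3", "\u05e5"}


def guess_hebrew_direction(text):
    """Guess whether decoded Hebrew text is stored in logical or visual order.

    Hypothetical helper for illustration only: it counts where final-form
    letters fall within each whitespace-separated word.
    """
    logical_score = 0
    visual_score = 0
    for word in text.split():
        if len(word) < 2:
            continue  # a one-letter word says nothing about direction
        if word[-1] in FINAL_FORMS:
            logical_score += 1  # final form stored last: reading order
        if word[0] in FINAL_FORMS:
            visual_score += 1  # final form stored first: reversed order
    if logical_score > visual_score:
        return "logical"
    if visual_score > logical_score:
        return "visual"
    return "unknown"


# "shalom olam" stored in reading order; both words end in final mem.
logical_text = "\u05e9\u05dc\u05d5\u05dd \u05e2\u05d5\u05dc\u05dd"
# The same line stored "backwards", as Visual Hebrew would store it.
visual_text = logical_text[::-1]

print(guess_hebrew_direction(logical_text))
print(guess_hebrew_direction(visual_text))
```

A caller could then map the guessed direction to an encoding name. The sketch ignores punctuation and non-final letters at word boundaries, both of which a more careful heuristic would probably need to weigh.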