August 2019
Intermediate to advanced
560 pages
13h 41m
English
A character filter receives the original input text as a stream of characters and preprocesses it, by adding, removing, or replacing characters, before the stream is passed to the tokenizer. Three built-in character filters are supported: html_strip, mapping, and pattern_replace. We'll practice each one using the same input text string as in the previous section.
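As a rough illustration of how the three filters could be exercised, the sketch below builds one request body per filter. It assumes the filters are those of Elasticsearch and that the bodies would be sent to its `_analyze` API. The sample text, the `analyze_body` helper, and the specific mapping and pattern are hypothetical stand-ins, since the input string from the previous section is not reproduced here.

```python
import json

# Hypothetical sample text; not the input string used in the book.
SAMPLE_TEXT = "<b>R&D</b> hotline: 555-1234"


def analyze_body(char_filter, text=SAMPLE_TEXT):
    """Build an _analyze request body that applies a single character filter.

    The keyword tokenizer emits the filtered text as one token, so the
    response would show the effect of the character filter alone.
    """
    return {
        "tokenizer": "keyword",
        "char_filter": [char_filter],
        "text": text,
    }


REQUESTS = {
    # Strips HTML elements and decodes HTML entities.
    "html_strip": analyze_body("html_strip"),
    # Replaces each occurrence of a key with its mapped value.
    "mapping": analyze_body({
        "type": "mapping",
        "mappings": ["& => and"],  # illustrative mapping, an assumption
    }),
    # Replaces text matching a Java regular expression.
    "pattern_replace": analyze_body({
        "type": "pattern_replace",
        "pattern": "(\\d{3})-(\\d{4})",  # illustrative pattern, an assumption
        "replacement": "$1$2",
    }),
}

if __name__ == "__main__":
    # Print each body in a form that can be pasted into a REST console.
    for name, body in REQUESTS.items():
        print(f"# {name}")
        print("POST /_analyze")
        print(json.dumps(body, indent=2))
        print()
```

Using the keyword tokenizer here is a design choice for inspection only: it keeps the filtered text in a single token, which makes it easier to compare the three filters on the same input.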