A token filter takes the stream of tokens emitted by the tokenizer and can modify tokens (for example, change their characters), add tokens, or delete tokens. There are approximately 50 built-in token filters. The following table covers some of the popular ones; the remaining token filters are documented at https://www.elastic.co/guide/en/elasticsearch/reference/7.x/analysis-tokenfilters.html. Each example in the table uses the standard tokenizer together with the named token filter. Note that no character filter is applied:
Token filter | Field | Value |
asciifolding | Input text | "Ÿőű'ľľ ľőνė Ȅľȁśťĩćŝėȁŕćĥ 7.0" |
 | Description | This transforms the terms when letters, numbers, and Unicode symbols are not in the first 127 ASCII characters ... |
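
The setup described above (standard tokenizer, one token filter, no character filter) can be tried out through the `_analyze` REST API. The following is a minimal Python sketch, not a definitive client: the helper names `build_analyze_request` and `analyze` are hypothetical, and the cluster address `http://localhost:9200` is an assumption about a local, unsecured test node.

```python
import json
import urllib.request


def build_analyze_request(text, token_filter, tokenizer="standard"):
    """Build an _analyze request body: one tokenizer, one token filter.

    No "char_filter" key is set, so no character filter is applied.
    (Hypothetical helper, named here for illustration only.)
    """
    return {
        "tokenizer": tokenizer,
        "filter": [token_filter],
        "text": text,
    }


def analyze(base_url, body):
    """POST the body to <base_url>/_analyze and return the token strings.

    Assumes an Elasticsearch node is reachable at base_url without
    authentication (hypothetical helper).
    """
    request = urllib.request.Request(
        url=base_url.rstrip("/") + "/_analyze",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    # Each entry in "tokens" carries the term text under the "token" key.
    return [entry["token"] for entry in payload["tokens"]]


# Request body for the asciifolding row of the table.
body = build_analyze_request("Ÿőű'ľľ ľőνė Ȅľȁśťĩćŝėȁŕćĥ 7.0", "asciifolding")

# Non-ASCII characters are printed as \u escapes by json.dumps.
print(json.dumps(body, indent=2))

# Uncomment to send the request to a running node (assumed address):
# print(analyze("http://localhost:9200", body))
```

Passing a different filter name as the second argument, such as `"lowercase"`, produces the request body for another row of the table without changing the tokenizer.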