The Standard Analyzer is suitable for many languages and situations. It can also be customized for the underlying language or situation. The Standard Analyzer is composed of the following components:
Tokenizer:
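- Standard Tokenizer: A grammar-based tokenizer that splits text into tokens on word boundaries, as defined by the Unicode Text Segmentation algorithm (Unicode Standard Annex #29), and discards most punctuation.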
Token Filters:
- Standard Token Filter: A placeholder token filter within the Standard Analyzer. It passes the input tokens through unchanged, but may be used in the future to perform some tasks.
- Lowercase Token Filter: Makes all tokens in the input lowercase.
- Stop Token Filter: Removes the specified stopwords. The default stopword list is _none_, so no stopwords are removed unless a list is configured.
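
As a rough illustration of how the tokenizer and token filters listed above chain together, the following Python sketch mimics the pipeline. It is not the actual implementation: the function names are hypothetical, and the regular expression only approximates the real tokenizer's Unicode word-boundary rules.

```python
import re


def standard_tokenize(text):
    # Assumption: a simple regex stands in for the real tokenizer.
    # It keeps runs of word characters and drops punctuation.
    return re.findall(r"\w+", text)


def standard_filter(tokens):
    # Placeholder filter: returns the tokens unchanged.
    return list(tokens)


def lowercase_filter(tokens):
    # Converts every token to lowercase.
    return [token.lower() for token in tokens]


def stop_filter(tokens, stopwords=None):
    # With no stopword list (the _none_ default), nothing is removed.
    if not stopwords:
        return list(tokens)
    stop_set = set(stopwords)
    return [token for token in tokens if token not in stop_set]


def analyze(text, stopwords=None):
    # Tokenizer first, then the token filters in the order listed above.
    tokens = standard_tokenize(text)
    tokens = standard_filter(tokens)
    tokens = lowercase_filter(tokens)
    return stop_filter(tokens, stopwords)


if __name__ == "__main__":
    print(analyze("The QUICK Brown-Foxes jumped!"))
    print(analyze("The QUICK Brown-Foxes jumped!", stopwords=["the"]))
```

With the default of no stopwords, the first call keeps "the" in its output; passing a stopword list, as in the second call, removes it. Because the stop filter runs after lowercasing in this sketch, the stopword list is given in lowercase.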