OpenSearch/docs/reference/analysis/tokenizers
James Rodewig 095c34359f [DOCS] Note limitations of `max_gram` parm in `edge_ngram` tokenizer for index analyzers (#49007)
The `edge_ngram` tokenizer limits tokens to the `max_gram` character
length. Autocomplete searches for terms longer than this limit return
no results.

To prevent this, you can use the `truncate` token filter to truncate
tokens to the `max_gram` character length. However, this could return irrelevant results.

This commit adds some advisory text to make users aware of this limitation and outline the tradeoffs for each approach.

Closes #48956.
2019-11-13 14:28:12 -05:00
..
chargroup-tokenizer.asciidoc [DOCS] [2 of 5] Change // CONSOLE comments to [source,console] (#46353) (#46502) 2019-09-09 13:38:14 -04:00
classic-tokenizer.asciidoc [DOCS] [2 of 5] Change // CONSOLE comments to [source,console] (#46353) (#46502) 2019-09-09 13:38:14 -04:00
edgengram-tokenizer.asciidoc [DOCS] Note limitations of `max_gram` parm in `edge_ngram` tokenizer for index analyzers (#49007) 2019-11-13 14:28:12 -05:00
keyword-tokenizer.asciidoc [DOCS] [2 of 5] Change // CONSOLE comments to [source,console] (#46353) (#46502) 2019-09-09 13:38:14 -04:00
letter-tokenizer.asciidoc [DOCS] [2 of 5] Change // CONSOLE comments to [source,console] (#46353) (#46502) 2019-09-09 13:38:14 -04:00
lowercase-tokenizer.asciidoc [DOCS] [2 of 5] Change // CONSOLE comments to [source,console] (#46353) (#46502) 2019-09-09 13:38:14 -04:00
ngram-tokenizer.asciidoc [DOCS] [2 of 5] Change // CONSOLE comments to [source,console] (#46353) (#46502) 2019-09-09 13:38:14 -04:00
pathhierarchy-tokenizer-examples.asciidoc [DOCS] [2 of 5] Change // CONSOLE comments to [source,console] (#46353) (#46502) 2019-09-09 13:38:14 -04:00
pathhierarchy-tokenizer.asciidoc [DOCS] [2 of 5] Change // CONSOLE comments to [source,console] (#46353) (#46502) 2019-09-09 13:38:14 -04:00
pattern-tokenizer.asciidoc [DOCS] [2 of 5] Change // CONSOLE comments to [source,console] (#46353) (#46502) 2019-09-09 13:38:14 -04:00
simplepattern-tokenizer.asciidoc [DOCS] [2 of 5] Change // CONSOLE comments to [source,console] (#46353) (#46502) 2019-09-09 13:38:14 -04:00
simplepatternsplit-tokenizer.asciidoc [DOCS] [2 of 5] Change // CONSOLE comments to [source,console] (#46353) (#46502) 2019-09-09 13:38:14 -04:00
standard-tokenizer.asciidoc [DOCS] [2 of 5] Change // CONSOLE comments to [source,console] (#46353) (#46502) 2019-09-09 13:38:14 -04:00
thai-tokenizer.asciidoc [DOCS] [2 of 5] Change // CONSOLE comments to [source,console] (#46353) (#46502) 2019-09-09 13:38:14 -04:00
uaxurlemail-tokenizer.asciidoc [DOCS] [2 of 5] Change // CONSOLE comments to [source,console] (#46353) (#46502) 2019-09-09 13:38:14 -04:00
whitespace-tokenizer.asciidoc [DOCS] [2 of 5] Change // CONSOLE comments to [source,console] (#46353) (#46502) 2019-09-09 13:38:14 -04:00