OpenSearch/docs/reference/analysis/tokenizers
Mayya Sharipova 148376c2c5
Add limits for ngram and shingle settings (#27211)

Create index-level settings:
max_ngram_diff - maximum allowed difference between max_gram and min_gram in
NGramTokenFilter/NGramTokenizer. Default is 1.
max_shingle_diff - maximum allowed difference between max_shingle_size and
min_shingle_size in ShingleTokenFilter. Default is 3.

Throw an IllegalArgumentException when trying to create an
NGramTokenFilter, NGramTokenizer, or ShingleTokenFilter where the
difference between max_size and min_size exceeds the setting's value.
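
The check described above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the class name `DiffLimitSketch`, the method `checkDiff`, and the wording of the error message are hypothetical. Only the setting names, their defaults, and the exception type are taken from the commit message.

```java
// Hypothetical sketch of the limit check; names are illustrative only.
final class DiffLimitSketch {
    // Defaults as stated in the commit message.
    static final int DEFAULT_MAX_NGRAM_DIFF = 1;
    static final int DEFAULT_MAX_SHINGLE_DIFF = 3;

    // Rejects a min/max pair whose difference exceeds the index-level limit.
    static void checkDiff(String settingName, int limit, int minSize, int maxSize) {
        int diff = maxSize - minSize;
        if (diff > limit) {
            throw new IllegalArgumentException(
                "The difference between max size [" + maxSize + "] and min size ["
                    + minSize + "] is " + diff + ", which exceeds the limit of "
                    + limit + " set by [" + settingName + "]");
        }
    }

    public static void main(String[] args) {
        // min_gram=1, max_gram=2: difference 1 is within the default limit.
        checkDiff("max_ngram_diff", DEFAULT_MAX_NGRAM_DIFF, 1, 2);
        // min_shingle_size=2, max_shingle_size=5: difference 3 is within the default limit.
        checkDiff("max_shingle_diff", DEFAULT_MAX_SHINGLE_DIFF, 2, 5);
        try {
            // min_gram=1, max_gram=3: difference 2 exceeds the default limit of 1.
            checkDiff("max_ngram_diff", DEFAULT_MAX_NGRAM_DIFF, 1, 3);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A difference equal to the limit is accepted in this sketch, since the commit message only describes failing when the difference exceeds the setting's value.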

Closes #25887
2017-11-07 08:14:55 -05:00
classic-tokenizer.asciidoc Remove wait_for_status=yellow from the docs 2016-07-15 16:02:07 -04:00
edgengram-tokenizer.asciidoc Add a shard filter search phase to pre-filter shards based on query rewriting (#25658) 2017-07-12 22:19:20 +02:00
keyword-tokenizer.asciidoc Docs: Improved tokenizer docs (#18356) 2016-05-19 19:42:23 +02:00
letter-tokenizer.asciidoc Docs: Improved tokenizer docs (#18356) 2016-05-19 19:42:23 +02:00
lowercase-tokenizer.asciidoc Update lowercase-tokenizer.asciidoc (#21896) 2016-12-02 10:49:51 -05:00
ngram-tokenizer.asciidoc Add limits for ngram and shingle settings (#27211) 2017-11-07 08:14:55 -05:00
pathhierarchy-tokenizer.asciidoc Remove wait_for_status=yellow from the docs 2016-07-15 16:02:07 -04:00
pattern-tokenizer.asciidoc [Docs] Fix typo in pattern-tokenizer.asciidoc (#25626) 2017-07-13 18:43:48 +02:00
simplepattern-tokenizer.asciidoc Update experimental labels in the docs (#25727) 2017-07-18 14:06:22 +02:00
simplepatternsplit-tokenizer.asciidoc Update experimental labels in the docs (#25727) 2017-07-18 14:06:22 +02:00
standard-tokenizer.asciidoc Remove wait_for_status=yellow from the docs 2016-07-15 16:02:07 -04:00
thai-tokenizer.asciidoc Docs: Improved tokenizer docs (#18356) 2016-05-19 19:42:23 +02:00
uaxurlemail-tokenizer.asciidoc Remove wait_for_status=yellow from the docs 2016-07-15 16:02:07 -04:00
whitespace-tokenizer.asciidoc Add configurable `maxTokenLength` parameter to whitespace tokenizer (#26749) 2017-09-25 17:21:19 +02:00