OpenSearch/docs/reference/analysis
Alan Woodward 5107949402
Allow TokenFilterFactories to rewrite themselves against their preceding chain (#33702)
We currently special-case SynonymFilterFactory and SynonymGraphFilterFactory, which need to
know their predecessors in the analysis chain in order to correctly analyze their synonym lists. This
special-casing doesn't work with referring filter factories, such as the Multiplexer or Conditional
filters. We also have a number of filters (e.g. the Multiplexer) that break any synonym filters
appearing after them in a chain, because they produce multiple tokens at the same position.

This commit adds two methods to the `TokenFilterFactory` interface (sketched below).

* `getChainAwareTokenFilterFactory()` allows a filter factory to rewrite itself against its preceding
  filter chain, or to resolve references to other filters. It replaces `ReferringFilterFactory` and
  `CustomAnalyzerProvider.checkAndApplySynonymFilter`, and by default returns `this`.
* `getSynonymFilter()` defines whether or not a filter should be applied when building the `Analyzer`
  used to analyze a synonym list. By default it returns `true`.

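For illustration, here is a minimal sketch of how the extended interface might look. Only the two default methods and their default return values come from the description above; the parameter list of `getChainAwareTokenFilterFactory()` and the other interface members are assumptions made for the example.

```java
import java.util.List;
import java.util.function.Function;

import org.apache.lucene.analysis.TokenStream;

// Hypothetical sketch of the extended interface; signatures are illustrative,
// not copied verbatim from the change. TokenizerFactory and CharFilterFactory
// are assumed to be the sibling factory interfaces in the same analysis package.
public interface TokenFilterFactory {

    /** The configured name of this filter. */
    String name();

    /** Wraps the given stream with this filter. */
    TokenStream create(TokenStream tokenStream);

    /**
     * Allows a factory to rewrite itself against the filters that precede it in the
     * analysis chain, or to resolve references to other named filters, replacing the
     * old ReferringFilterFactory / CustomAnalyzerProvider.checkAndApplySynonymFilter
     * special-casing. The parameters shown here are assumed for illustration.
     */
    default TokenFilterFactory getChainAwareTokenFilterFactory(
            TokenizerFactory tokenizer,
            List<CharFilterFactory> charFilters,
            List<TokenFilterFactory> previousTokenFilters,
            Function<String, TokenFilterFactory> allFilters) {
        return this; // most filters need no rewriting
    }

    /**
     * Whether this filter should be applied when building the Analyzer used to
     * analyze a synonym list. Filters that emit multiple tokens at the same
     * position (e.g. the Multiplexer) would override this to return false.
     */
    default boolean getSynonymFilter() {
        return true;
    }
}
```

With something like this in place, a filter such as the Multiplexer could resolve its referenced filters through the supplied lookup of other filters instead of implementing `ReferringFilterFactory`, and could opt out of synonym-list analysis by returning `false` from `getSynonymFilter()`.
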
Fixes #33609
2018-09-19 15:52:14 +01:00
| Name | Last commit message | Last commit date |
| --- | --- | --- |
| analyzers | Upgrade to a Lucene 8 snapshot (#33310) | 2018-09-06 14:42:06 +02:00 |
| charfilters | fixed elements in array of produced terms (#32519) | 2018-08-02 11:12:15 -04:00 |
| tokenfilters | Allow TokenFilterFactories to rewrite themselves against their preceding chain (#33702) | 2018-09-19 15:52:14 +01:00 |
| tokenizers | [Feature] Adding a char_group tokenizer (#24186) | 2018-05-22 16:26:31 +02:00 |
| analyzers.asciidoc | First pass at improving analyzer docs (#18269) | 2016-05-11 14:17:56 +02:00 |
| anatomy.asciidoc | Correction of the names of numirals (#21531) | 2016-11-25 14:30:49 +01:00 |
| charfilters.asciidoc | Hindu-Arabico-Latino Numerals (#22476) | 2017-01-10 15:24:56 +01:00 |
| normalizers.asciidoc | [DOCS] Add supported token filters | 2018-02-13 14:10:25 -08:00 |
| testing.asciidoc | Allow `_doc` as a type. (#27816) | 2017-12-14 17:47:53 +01:00 |
| tokenfilters.asciidoc | Add predicate_token_filter (#33431) | 2018-09-11 09:16:39 +01:00 |
| tokenizers.asciidoc | [Feature] Adding a char_group tokenizer (#24186) | 2018-05-22 16:26:31 +02:00 |