From 3a35427b6d81c0fb7fbe65aa9ac223430ec1d5ed Mon Sep 17 00:00:00 2001
From: Alan Woodward
Date: Tue, 7 May 2019 09:09:28 +0100
Subject: [PATCH] Improvements to docs around multiplexer and synonyms (#41645)

This commit fixes a multiplexer doc error concerning synonyms, and adds
suggestions on how to combine the two filters.
---
 .../tokenfilters/multiplexer-tokenfilter.asciidoc | 11 +++++------
 .../tokenfilters/synonym-graph-tokenfilter.asciidoc | 4 ++++
 .../tokenfilters/synonym-tokenfilter.asciidoc | 4 ++++
 3 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/docs/reference/analysis/tokenfilters/multiplexer-tokenfilter.asciidoc b/docs/reference/analysis/tokenfilters/multiplexer-tokenfilter.asciidoc
index a92e2476ad7..50462cc2871 100644
--- a/docs/reference/analysis/tokenfilters/multiplexer-tokenfilter.asciidoc
+++ b/docs/reference/analysis/tokenfilters/multiplexer-tokenfilter.asciidoc
@@ -116,9 +116,8 @@ And it'd respond:
 duplicate of this token it has been removed from the token stream
 
 NOTE: The synonym and synonym_graph filters use their preceding analysis chain to
-parse and analyse their synonym lists, and ignore any token filters in the chain
-that produce multiple tokens at the same position. This means that any filters
-within the multiplexer will be ignored for the purpose of synonyms. If you want to
-use filters contained within the multiplexer for parsing synonyms (for example, to
-apply stemming to the synonym lists), then you should append the synonym filter
-to the relevant multiplexer filter list.
+parse and analyse their synonym lists, and will throw an exception if that chain
+contains token filters that produce multiple tokens at the same position.
+If you want to apply synonyms to a token stream containing a multiplexer, then you
+should append the synonym filter to each relevant multiplexer filter list, rather than
+placing it after the multiplexer in the main token chain definition.
diff --git a/docs/reference/analysis/tokenfilters/synonym-graph-tokenfilter.asciidoc b/docs/reference/analysis/tokenfilters/synonym-graph-tokenfilter.asciidoc
index 2285c6f6e89..b434129626d 100644
--- a/docs/reference/analysis/tokenfilters/synonym-graph-tokenfilter.asciidoc
+++ b/docs/reference/analysis/tokenfilters/synonym-graph-tokenfilter.asciidoc
@@ -188,6 +188,10 @@ parsing synonyms, e.g. `asciifolding` will only produce the folded version of th
 token. Others, e.g. `multiplexer`, `word_delimiter_graph` or `ngram` will throw an
 error.
 
+If you need to build analyzers that include both multi-token filters and synonym
+filters, consider using the <> filter,
+with the multi-token filters in one branch and the synonym filter in the other.
+
 WARNING: The synonym rules should not contain words that are removed by
 a filter that appears after in the chain (a `stop` filter for instance).
 Removing a term from a synonym rule breaks the matching at query time.
diff --git a/docs/reference/analysis/tokenfilters/synonym-tokenfilter.asciidoc b/docs/reference/analysis/tokenfilters/synonym-tokenfilter.asciidoc
index 11072361946..139f7c3ab0a 100644
--- a/docs/reference/analysis/tokenfilters/synonym-tokenfilter.asciidoc
+++ b/docs/reference/analysis/tokenfilters/synonym-tokenfilter.asciidoc
@@ -177,3 +177,7 @@ multiple versions of a token may choose which version of the token to emit when
 parsing synonyms, e.g. `asciifolding` will only produce the folded version of the
 token. Others, e.g. `multiplexer`, `word_delimiter_graph` or `ngram` will throw an
 error.
+
+If you need to build analyzers that include both multi-token filters and synonym
+filters, consider using the <> filter,
+with the multi-token filters in one branch and the synonym filter in the other.
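
Illustration (not part of the patch): the advice in the multiplexer NOTE might translate into index settings along the lines of the sketch below. It builds the settings as a plain Python dict so the shape is easy to inspect. The names `my_synonyms`, `my_multiplexer` and `my_analyzer`, the sample synonym rule, and the choice of `lowercase` and `porter_stem` in the branches are all assumptions made for the example; none of them come from the commit.

```python
import json


def build_analysis_settings():
    """Return hypothetical index settings that combine a multiplexer with synonyms.

    The synonym filter is appended to each multiplexer branch instead of being
    placed after the multiplexer in the analyzer's main filter chain.
    """
    return {
        "analysis": {
            "filter": {
                # Hypothetical synonym filter with one illustrative rule.
                "my_synonyms": {
                    "type": "synonym",
                    "synonyms": ["quick, fast"],
                },
                # Each entry in "filters" is one branch: a comma-separated
                # chain of filter names. Every branch ends with the synonym
                # filter, so the synonym list is parsed with that branch's
                # preceding filters.
                "my_multiplexer": {
                    "type": "multiplexer",
                    "filters": [
                        "lowercase, my_synonyms",
                        "lowercase, porter_stem, my_synonyms",
                    ],
                },
            },
            "analyzer": {
                "my_analyzer": {
                    "tokenizer": "standard",
                    # Only the multiplexer appears in the main chain; no
                    # synonym filter follows it here.
                    "filter": ["my_multiplexer"],
                }
            },
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_analysis_settings(), indent=2))
```

The printed JSON could be used as the `settings` body when creating an index. Whether a particular branch layout suits a given synonym list is something to verify against the Elasticsearch version in use.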