[Docs] Clarify caveats for phonetic filters replace option (#42807)

The `replace` option in the phonetic token filter can have surprising side
effects, such as those described in #26921. This PR adds a note to be mindful
of such scenarios and offers alternatives to using the `replace` option.

Closes #26921
Christoph Büscher 2019-06-05 22:02:17 +02:00
parent 1300183001
commit 99542e66a6
2 changed files with 10 additions and 1 deletions

@@ -65,6 +65,14 @@ GET phonetic_sample/_analyze
<1> Returns: `J`, `joe`, `BLKS`, `bloggs`
+It is important to note that `"replace": false` can lead to unexpected behavior since
+the original and the phonetically analyzed version are both kept at the same token position.
+Some queries handle these stacked tokens in special ways. For example, the fuzzy `match`
+query does not apply {ref}/common-options.html#fuzziness[fuzziness] to stacked synonym tokens.
+This can lead to issues that are difficult to diagnose and reason about. For this reason, it
+is often beneficial to use separate fields for analysis with and without phonetic filtering.
+That way searches can be run against both fields with differing boosts and trade-offs (e.g.
+only run a fuzzy `match` query on the original text field, but not on the phonetic version).
[float]
===== Double metaphone settings

@@ -56,7 +56,8 @@ rewritten.
Fuzzy transpositions (`ab` -> `ba`) are allowed by default but can be disabled
by setting `fuzzy_transpositions` to `false`.
-Note that fuzzy matching is not applied to terms with synonyms, as under the hood
+NOTE: Fuzzy matching is not applied to terms with synonyms or in cases where the
+analysis process produces multiple tokens at the same position. Under the hood
these terms are expanded to a special synonym query that blends term frequencies,
which does not support fuzzy expansion.
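
Illustration (not part of this commit): the separate-fields alternative recommended in the phonetic filter note could look roughly like the sketch below. It assumes the `analysis-phonetic` plugin is installed and a 7.x-style typeless mapping. The index name `phonetic_multifield_sample`, the analyzer name `phonetic_only`, the field names `name` and `name.phonetic`, and the boost value are hypothetical and chosen only for this example.

```
PUT phonetic_multifield_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "phonetic_only": {
            "tokenizer": "standard",
            "filter": ["lowercase", "my_metaphone"]
          }
        },
        "filter": {
          "my_metaphone": {
            "type": "phonetic",
            "encoder": "metaphone",
            "replace": true
          }
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "fields": {
          "phonetic": {
            "type": "text",
            "analyzer": "phonetic_only"
          }
        }
      }
    }
  }
}

GET phonetic_multifield_sample/_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "name": { "query": "Joe Bloggs", "fuzziness": "AUTO", "boost": 2 } } },
        { "match": { "name.phonetic": { "query": "Joe Bloggs" } } }
      ]
    }
  }
}
```

With `"replace": true` the `name.phonetic` sub-field holds only the phonetic codes, so no stacked tokens are produced there, while the fuzzy `match` clause runs only against the original `name` field. Whether the original text should be boosted above the phonetic match depends on the use case.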