[Docs] Clarify behaviour of Pattern Capture Token Filter during search (#26278)
There was some confusion about the fact that tokens emitted from a Pattern Capture Token Filter are treated as synonyms when used to analyze a search query. This commit adds an explanation to the note in the docs to emphasize this behaviour.

Closes #25746
parent 181e881a0f
commit 254c1b28e9
@@ -131,10 +131,12 @@ Multiple patterns are required to allow overlapping captures, but also
 means that patterns are less dense and easier to understand.
 
 *Note:* All tokens are emitted in the same position, and with the same
-character offsets, so when combined with highlighting, the whole
-original token will be highlighted, not just the matching subset. For
-instance, querying the above email address for `"smith"` would
-highlight:
+character offsets. This means, for example, that a `match` query for
+`john-smith_123@foo-bar.com` that uses this analyzer will return documents
+containing any of these tokens, even when using the `and` operator.
+Also, when combined with highlighting, the whole original token will
+be highlighted, not just the matching subset. For instance, querying
+the above email address for `"smith"` would highlight:
 
 [source,html]
 --------------------------------------------------
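The behaviour described in the updated note can be illustrated with a rough Python sketch. It is not the filter's implementation: the pattern list, the `pattern_capture` and `matches` helpers, and the treatment of each field value as a single token are all assumptions made for illustration, since the analyzer configuration lies outside this hunk. The sketch only shows why tokens that share one position behave like synonyms, so that a single shared token is enough for a match.

```python
import re

# Hypothetical patterns chosen for illustration; the real analyzer
# configuration is not part of this diff hunk.
PATTERNS = [r"([^@]+)", r"([^\W\d_]+)", r"(\d+)", r"@(.+)"]


def pattern_capture(token, patterns=PATTERNS, preserve_original=True):
    """Return the original token plus every captured substring.

    All returned tokens are meant to share the position of the input
    token, which is what makes them behave like synonyms at search time.
    """
    emitted = [token] if preserve_original else []
    for pattern in patterns:
        for match in re.finditer(pattern, token):
            for group in match.groups():
                if group and group not in emitted:
                    emitted.append(group)
    return emitted


def matches(query, field_value):
    """Simplified model of a match query on a single-token field.

    The query tokens occupy one position, so they are alternatives:
    sharing any one of them with the document is enough, regardless
    of whether the operator is `or` or `and`.
    """
    query_tokens = set(pattern_capture(query))
    doc_tokens = set(pattern_capture(field_value))
    return bool(query_tokens & doc_tokens)


query = "john-smith_123@foo-bar.com"
print(sorted(pattern_capture(query)))
print(matches(query, "jane@foo-baz.com"))   # shares "foo" and "com"
print(matches(query, "alice@example.org"))  # shares nothing
```

Under these assumptions, a document that holds only `jane@foo-baz.com` still matches the query for `john-smith_123@foo-bar.com`, because the two values share the tokens `foo` and `com`.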