[DOCS] Rewrite `fuzzy` query docs (#42078)

James Rodewig 2019-08-14 13:06:23 -04:00
parent 79a1390935
commit 6904778a14
1 changed file with 73 additions and 49 deletions

@@ -4,75 +4,99 @@
 <titleabbrev>Fuzzy</titleabbrev>
 ++++
-The fuzzy query uses similarity based on Levenshtein edit distance.
-==== String fields
-The `fuzzy` query generates matching terms that are within the
-maximum edit distance specified in `fuzziness` and then checks the term
-dictionary to find out which of those generated terms actually exist in the
-index. The final query uses up to `max_expansions` matching terms.
-Here is a simple example:
+Returns documents that contain terms similar to the search term, as measured by
+a http://en.wikipedia.org/wiki/Levenshtein_distance[Levenshtein edit distance].
+An edit distance is the number of one-character changes needed to turn one term
+into another. These changes can include:
+* Changing a character (**b**ox → **f**ox)
+* Removing a character (**b**lack → lack)
+* Inserting a character (sic → sic**k**)
+* Transposing two adjacent characters (**ac**t → **ca**t)
+To find similar terms, the `fuzzy` query creates a set of all possible
+variations, or expansions, of the search term within a specified edit distance.
+The query then returns exact matches for each expansion.
+[[fuzzy-query-ex-request]]
+==== Example requests
+[[fuzzy-query-ex-simple]]
+===== Simple example
 [source,js]
---------------------------------------------------
+----
 GET /_search
 {
     "query": {
-        "fuzzy" : { "user" : "ki" }
+        "fuzzy": {
+            "user": {
+                "value": "ki"
+            }
+        }
     }
 }
---------------------------------------------------
+----
 // CONSOLE
-Or with more advanced settings:
+[[fuzzy-query-ex-advanced]]
+===== Example using advanced parameters
 [source,js]
---------------------------------------------------
+----
 GET /_search
 {
     "query": {
         "fuzzy": {
             "user": {
                 "value": "ki",
-                "boost": 1.0,
-                "fuzziness": 2,
+                "fuzziness": "AUTO",
+                "max_expansions": 50,
                 "prefix_length": 0,
-                "max_expansions": 100
+                "transpositions": true,
+                "rewrite": "constant_score"
             }
         }
     }
 }
---------------------------------------------------
+----
 // CONSOLE
-[float]
-===== Parameters
-[horizontal]
+[[fuzzy-query-top-level-params]]
+==== Top-level parameters for `fuzzy`
+`<field>`::
+(Required, object) Field you wish to search.
+[[fuzzy-query-field-params]]
+==== Parameters for `<field>`
+`value`::
+(Required, string) Term you wish to find in the provided `<field>`.
 `fuzziness`::
-The maximum edit distance. Defaults to `AUTO`. See <<fuzziness>>.
-`prefix_length`::
-The number of initial characters which will not be ``fuzzified''. This
-helps to reduce the number of terms which must be examined. Defaults
-to `0`.
+(Optional, string) Maximum edit distance allowed for matching. See <<fuzziness>>
+for valid values and more information.
 `max_expansions`::
-The maximum number of terms that the `fuzzy` query will expand to.
-Defaults to `50`.
++
+--
+(Optional, integer) Maximum number of variations created. Defaults to `50`.
+WARNING: Avoid using a high value in the `max_expansions` parameter, especially
+if the `prefix_length` parameter value is `0`. High values in the
+`max_expansions` parameter can cause poor performance due to the high number of
+variations examined.
+--
+`prefix_length`::
+(Optional, integer) Number of beginning characters left unchanged when creating
+expansions. Defaults to `0`.
 `transpositions`::
-Whether fuzzy transpositions (`ab` -> `ba`) are supported.
-Default is `true`.
-WARNING: This query can be very heavy if `prefix_length` is set to `0` and if
-`max_expansions` is set to a high number. It could result in every term in the
-index being examined!
+(Optional, boolean) Indicates whether edits include transpositions of two
+adjacent characters (ab → ba). Defaults to `true`.
+`rewrite`::
+(Optional, string) Method used to rewrite the query. For valid values and more
+information, see the <<query-dsl-multi-term-rewrite, `rewrite` parameter>>.
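
Reviewer note: the four edit types in the rewritten introduction (change, remove, insert, transpose) and the roles of `fuzziness`, `prefix_length`, `max_expansions`, and `transpositions` can be illustrated with a small sketch. This is not how Elasticsearch or Lucene implements the `fuzzy` query. It is a minimal Python illustration that assumes a toy in-memory list standing in for the term dictionary, and the function names `edit_distance` and `fuzzy_matches` are hypothetical.

```python
def edit_distance(a: str, b: str, transpositions: bool = True) -> int:
    """Count one-character edits needed to turn `a` into `b`.

    Uses the optimal string alignment variant of edit distance:
    changes, removals, insertions, and (optionally) swaps of two
    adjacent characters each cost 1.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # remove all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # remove a character
                d[i][j - 1] + 1,         # insert a character
                d[i - 1][j - 1] + cost,  # change a character (or match)
            )
            if (transpositions and i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                # transpose two adjacent characters
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


def fuzzy_matches(value, terms, fuzziness=1, prefix_length=0,
                  max_expansions=50, transpositions=True):
    """Return terms from a toy term list that are within `fuzziness` edits.

    Assumption: `terms` is a plain list of strings standing in for an
    index's term dictionary. At most `max_expansions` matches are kept.
    """
    prefix = value[:prefix_length]
    matches = []
    for term in sorted(terms):
        if not term.startswith(prefix):
            continue  # the first `prefix_length` characters must be unchanged
        if edit_distance(value, term, transpositions) <= fuzziness:
            matches.append(term)
            if len(matches) == max_expansions:
                break
    return matches
```

For instance, `fuzzy_matches("box", ["fox", "box", "boxes", "lack"])` keeps `box` and `fox`, while passing `prefix_length=1` drops `fox` because its first character differs. The sketch scans every term, which is the kind of cost the `max_expansions` and `prefix_length` warning in the docs is about.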