[[query-dsl-query-string-query]]
=== Query String Query

A query that uses a query parser in order to parse its content. Here is
an example:

[source,js]
--------------------------------------------------
{
    "query_string" : {
        "default_field" : "content",
        "query" : "this AND that OR thus"
    }
}
--------------------------------------------------

The `query_string` top level parameters include:

[cols="<,<",options="header",]
|=======================================================================
|Parameter |Description
|`query` |The actual query to be parsed. See <<query-string-syntax>>.

|`default_field` |The default field for query terms if no prefix field
is specified. Defaults to the `index.query.default_field` index
setting, which in turn defaults to `_all`.

|`default_operator` |The default operator used if no explicit operator
is specified. For example, with a default operator of `OR`, the query
`capital of Hungary` is translated to `capital OR of OR Hungary`, and
with a default operator of `AND`, the same query is translated to
`capital AND of AND Hungary`. The default value is `OR`.

|`analyzer` |The analyzer name used to analyze the query string.

|`allow_leading_wildcard` |When set, `*` or `?` are allowed as the first
character. Defaults to `true`.

|`lowercase_expanded_terms` |Whether terms of wildcard, prefix, fuzzy,
and range queries are to be automatically lower-cased or not (since they
are not analyzed). Defaults to `true`.

|`enable_position_increments` |Set to `true` to enable position
increments in result queries. Defaults to `true`.

|`fuzzy_max_expansions` |Controls the number of terms fuzzy queries will
expand to. Defaults to `50`.

|`fuzziness` |Set the fuzziness for fuzzy queries. Defaults
to `AUTO`. See <<fuzziness>> for allowed settings.

|`fuzzy_prefix_length` |Set the prefix length for fuzzy queries. Default
is `0`.

|`phrase_slop` |Sets the default slop for phrases. If zero, then exact
phrase matches are required. Default value is `0`.

|`boost` |Sets the boost value of the query. Defaults to `1.0`.

|`analyze_wildcard` |By default, wildcard terms in a query string are
not analyzed. By setting this value to `true`, a best effort will be
made to analyze those as well.

|`auto_generate_phrase_queries` |Defaults to `false`.

|`minimum_should_match` |A value controlling how many "should" clauses
in the resulting boolean query should match. It can be an absolute value
(`2`), a percentage (`30%`) or a
<<query-dsl-minimum-should-match,combination of
both>>.

|`lenient` |If set to `true`, format-based failures (like
providing text to a numeric field) are ignored.
|=======================================================================

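As a rough illustration of how several of the parameters above can be
combined in one request, the following sketch sets a default operator,
a fuzziness, a phrase slop and lenient parsing. The field name `content`
and the query text are placeholders chosen for the example, not values
from a real index:

[source,js]
--------------------------------------------------
{
    "query_string" : {
        "default_field" : "content",
        "query" : "capital of Hungary",
        "default_operator" : "AND",
        "fuzziness" : "AUTO",
        "phrase_slop" : 0,
        "lenient" : true
    }
}
--------------------------------------------------

With `default_operator` set to `AND`, the query string here is
interpreted as `capital AND of AND Hungary`, as described in the table.
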
When a multi term query is being generated, one can control how it gets
rewritten using the
<<query-dsl-multi-term-rewrite,rewrite>>
parameter.

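A minimal sketch of passing `rewrite`, assuming a wildcard term in the
query string and `constant_score_boolean` as the rewrite method. Both
are picked only for illustration; see
<<query-dsl-multi-term-rewrite,rewrite>> for the values that are
actually accepted:

[source,js]
--------------------------------------------------
{
    "query_string" : {
        "default_field" : "content",
        "query" : "th*",
        "rewrite" : "constant_score_boolean"
    }
}
--------------------------------------------------
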
[float]
==== Default Field

When not explicitly specifying the field to search on in the query
string syntax, the `index.query.default_field` will be used to derive
which field to search on. It defaults to the `_all` field.

So, if the `_all` field is disabled, it might make sense to set
a different default field.

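To make the distinction concrete, here is a small sketch in which one
term carries an explicit field prefix and the other does not. The field
name `title` is hypothetical. The term `this` is searched in `title`,
while `that` has no prefix and is therefore searched in the default
field:

[source,js]
--------------------------------------------------
{
    "query_string" : {
        "query" : "title:this AND that"
    }
}
--------------------------------------------------
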
[float]
==== Multi Field

The `query_string` query can also run against multiple fields. This
works by internally creating several queries for the same query string,
each with a `default_field` that matches one of the fields provided.
Since several queries are generated, combining them can be
automatically done either using a `dis_max` query or a simple `bool`
query. For example (the `name` field is boosted by 5 using `^5`
notation):

[source,js]
--------------------------------------------------
{
    "query_string" : {
        "fields" : ["content", "name^5"],
        "query" : "this AND that OR thus",
        "use_dis_max" : true
    }
}
--------------------------------------------------

A simple wildcard can also be used to search "within" specific inner
elements of the document. For example, if we have a `city` object with
several fields (or an inner object with fields) in it, we can
automatically search on all "city" fields:

[source,js]
--------------------------------------------------
{
    "query_string" : {
        "fields" : ["city.*"],
        "query" : "this AND that OR thus",
        "use_dis_max" : true
    }
}
--------------------------------------------------

Another option is to provide the wildcard fields search in the query
string itself (properly escaping the `*` sign), for example:
`city.\*:something`.

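When the wildcard field is written inside the query string and sent as a
JSON request body, the backslash itself has to be escaped as well. A
sketch, reusing the `city` object and the search term `something` from
the text above:

[source,js]
--------------------------------------------------
{
    "query_string" : {
        "query" : "city.\\*:something"
    }
}
--------------------------------------------------

After JSON decoding, the query parser receives `city.\*:something`.
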
When running the `query_string` query against multiple fields, the
following additional parameters are allowed:

[cols="<,<",options="header",]
|=======================================================================
|Parameter |Description
|`use_dis_max` |Should the queries be combined using `dis_max` (set it
to `true`), or a `bool` query (set it to `false`). Defaults to `true`.

|`tie_breaker` |When using `dis_max`, the disjunction max tie breaker.
Defaults to `0`.
|=======================================================================

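A short sketch of both parameters used together. The `tie_breaker`
value of `0.3` is an arbitrary illustrative choice, not a recommended
setting:

[source,js]
--------------------------------------------------
{
    "query_string" : {
        "fields" : ["content", "name^5"],
        "query" : "this AND that OR thus",
        "use_dis_max" : true,
        "tie_breaker" : 0.3
    }
}
--------------------------------------------------
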
The `fields` parameter can also include pattern-based field names,
which are automatically expanded to the relevant fields (dynamically
introduced fields included). For example:

[source,js]
--------------------------------------------------
{
    "query_string" : {
        "fields" : ["content", "name.*^5"],
        "query" : "this AND that OR thus",
        "use_dis_max" : true
    }
}
--------------------------------------------------

include::query-string-syntax.asciidoc[]