[[query-dsl-query-string-query]]
=== Query String Query

A query that uses a query parser in order to parse its content. Here is
an example:

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string" : {
            "default_field" : "content",
            "query" : "this AND that OR thus"
        }
    }
}
--------------------------------------------------
// CONSOLE

The `query_string` top level parameters include:

[cols="<,<",options="header",]
|=======================================================================
|Parameter |Description
|`query` |The actual query to be parsed. See <<query-string-syntax>>.

|`default_field` |The default field for query terms if no prefix field
is specified. Defaults to the `index.query.default_field` index
setting, which in turn defaults to `_all`.

|`default_operator` |The default operator used if no explicit operator
is specified. For example, with a default operator of `OR`, the query
`capital of Hungary` is translated to `capital OR of OR Hungary`, and
with default operator of `AND`, the same query is translated to
`capital AND of AND Hungary`. The default value is `OR`.

|`analyzer` |The analyzer name used to analyze the query string.

|`allow_leading_wildcard` |When set, `*` or `?` are allowed as the first
character. Defaults to `true`.

|`enable_position_increments` |Set to `true` to enable position
increments in result queries. Defaults to `true`.

|`fuzzy_max_expansions` |Controls the number of terms fuzzy queries will
expand to. Defaults to `50`.

|`fuzziness` |Set the fuzziness for fuzzy queries. Defaults
to `AUTO`. See <<fuzziness>> for allowed settings.

|`fuzzy_prefix_length` |Set the prefix length for fuzzy queries. Default
is `0`.

|`phrase_slop` |Sets the default slop for phrases. If zero, then exact
phrase matches are required. Default value is `0`.

|`boost` |Sets the boost value of the query. Defaults to `1.0`.

|`auto_generate_phrase_queries` |Defaults to `false`.

|`analyze_wildcard` |By default, wildcard terms in a query string are
not analyzed. By setting this value to `true`, a best effort will be
made to analyze those as well.

|`max_determinized_states` |Limit on how many automaton states regexp
queries are allowed to create. This protects against too-difficult
(e.g. exponentially hard) regexps. Defaults to `10000`.

|`minimum_should_match` |A value controlling how many "should" clauses
in the resulting boolean query should match. It can be an absolute value
(`2`), a percentage (`30%`) or a
<<query-dsl-minimum-should-match,combination of
both>>.

|`lenient` |If set to `true`, format-based failures (like
providing text to a numeric field) are ignored.

|`time_zone` |Time zone to be applied to any range query related to dates. See also
http://www.joda.org/joda-time/apidocs/org/joda/time/DateTimeZone.html[JODA timezone].

|`quote_field_suffix` |A suffix to append to fields for quoted parts of
the query string. This allows using a field that has a different analysis chain
for exact matching. See <<mixing-exact-search-with-stemming,here>> for a
comprehensive example.

|`split_on_whitespace` |Whether query text should be split on whitespace prior to analysis.
When it is not, the query parser parses only around real operators.
Defaults to `false`.

|`all_fields` |Perform the query on all fields detected in the mapping that can
be queried. Used by default when the `_all` field is disabled and no
`default_field` is specified (either in the index settings or in the request
body) and no `fields` are specified.

|=======================================================================

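As a minimal sketch of how these parameters are passed, the following request
sets `default_operator` to `AND`, so that `capital of Hungary` is parsed as
`capital AND of AND Hungary`. The `content` field is only illustrative, as in
the first example on this page:

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string" : {
            "default_field" : "content",
            "query" : "capital of Hungary",
            "default_operator" : "AND"
        }
    }
}
--------------------------------------------------
// CONSOLE
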
When a multi term query is being generated, one can control how it gets
rewritten using the
<<query-dsl-multi-term-rewrite,rewrite>>
parameter.

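A sketch of passing `rewrite` alongside a wildcard term follows. The
`content` field is illustrative, and `scoring_boolean` is assumed here to be
one of the values described in the linked rewrite section; check that section
for the values your version accepts:

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string" : {
            "default_field" : "content",
            "query" : "th*",
            "rewrite" : "scoring_boolean"
        }
    }
}
--------------------------------------------------
// CONSOLE
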
[float]
==== Default Field

When the query string syntax does not explicitly specify the field to
search on, the `index.query.default_field` index setting is used to derive
which field to search on. It defaults to the `_all` field.

If the `_all` field is disabled and neither a `default_field` nor `fields`
are specified, the `query_string` query will automatically attempt to
determine the existing fields in the index's mapping that are queryable,
and perform the search on those fields.

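The same expansion can be requested explicitly with the `all_fields`
parameter. A minimal sketch, which deliberately sets neither `default_field`
nor `fields`:

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string" : {
            "query" : "this AND that OR thus",
            "all_fields" : true
        }
    }
}
--------------------------------------------------
// CONSOLE
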
[float]
==== Multi Field

The `query_string` query can also run against multiple fields. Fields can be
provided via the `"fields"` parameter (example below).

The idea of running the `query_string` query against multiple fields is to
expand each query term to an OR clause like this:

    field1:query_term OR field2:query_term OR ...

For example, the following query

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string" : {
            "fields" : ["content", "name"],
            "query" : "this AND that"
        }
    }
}
--------------------------------------------------
// CONSOLE

matches the same words as

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string": {
            "query": "(content:this OR name:this) AND (content:that OR name:that)"
        }
    }
}
--------------------------------------------------
// CONSOLE

Since several queries are generated from the individual search terms,
they are combined automatically using either a `dis_max` query or a
simple `bool` query. For example (the `name` field is boosted by 5 using
the `^5` notation):

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string" : {
            "fields" : ["content", "name^5"],
            "query" : "this AND that OR thus",
            "use_dis_max" : true
        }
    }
}
--------------------------------------------------
// CONSOLE

A simple wildcard can also be used to search "within" specific inner
elements of the document. For example, if we have a `city` object with
several fields (or an inner object with fields) in it, we can automatically
search on all "city" fields:

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string" : {
            "fields" : ["city.*"],
            "query" : "this AND that OR thus",
            "use_dis_max" : true
        }
    }
}
--------------------------------------------------
// CONSOLE

Another option is to provide the wildcard fields search in the query
string itself (properly escaping the `*` sign), for example:
`city.\*:something`.

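In a JSON request body the backslash itself has to be escaped, so the
in-query form would look roughly like this (the `city` object is the
illustrative one used above):

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string" : {
            "query" : "city.\\*:(this AND that OR thus)"
        }
    }
}
--------------------------------------------------
// CONSOLE
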
When running the `query_string` query against multiple fields, the
following additional parameters are allowed:

[cols="<,<",options="header",]
|=======================================================================
|Parameter |Description
|`use_dis_max` |Should the queries be combined using `dis_max` (set it
to `true`), or a `bool` query (set it to `false`). Defaults to `true`.

|`tie_breaker` |When using `dis_max`, the disjunction max tie breaker.
Defaults to `0`.
|=======================================================================

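A sketch combining `use_dis_max` with `tie_breaker` in one request. The
`tie_breaker` value of `0.3` is an arbitrary illustration, not a
recommendation:

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string" : {
            "fields" : ["content", "name^5"],
            "query" : "this AND that OR thus",
            "use_dis_max" : true,
            "tie_breaker" : 0.3
        }
    }
}
--------------------------------------------------
// CONSOLE
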
The `fields` parameter can also include pattern-based field names,
which are automatically expanded to the relevant fields (dynamically
introduced fields included). For example:

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string" : {
            "fields" : ["content", "name.*^5"],
            "query" : "this AND that OR thus",
            "use_dis_max" : true
        }
    }
}
--------------------------------------------------
// CONSOLE

include::query-string-syntax.asciidoc[]