OpenSearch/docs/reference/query-dsl/query-string-query.asciidoc

228 lines
7.1 KiB
Plaintext

[[query-dsl-query-string-query]]
=== Query String Query
A query that uses a query parser in order to parse its content. Here is
an example:
[source,js]
--------------------------------------------------
GET /_search
{
"query": {
"query_string" : {
"default_field" : "content",
"query" : "this AND that OR thus"
}
}
}
--------------------------------------------------
// CONSOLE
The `query_string` top level parameters include:
[cols="<,<",options="header",]
|=======================================================================
|Parameter |Description
|`query` |The actual query to be parsed. See <<query-string-syntax>>.
|`default_field` |The default field for query terms if no prefix field
is specified. Defaults to the `index.query.default_field` index
settings, which in turn defaults to `_all`.
|`default_operator` |The default operator used if no explicit operator
is specified. For example, with a default operator of `OR`, the query
`capital of Hungary` is translated to `capital OR of OR Hungary`, and
with default operator of `AND`, the same query is translated to
`capital AND of AND Hungary`. The default value is `OR`.
|`analyzer` |The analyzer name used to analyze the query string.
|`allow_leading_wildcard` |When set, `*` or `?` are allowed as the first
character. Defaults to `true`.
|`enable_position_increments` |Set to `true` to enable position
increments in result queries. Defaults to `true`.
|`fuzzy_max_expansions` |Controls the number of terms fuzzy queries will
expand to. Defaults to `50`
|`fuzziness` |Set the fuzziness for fuzzy queries. Defaults
to `AUTO`. See <<fuzziness>> for allowed settings.
|`fuzzy_prefix_length` |Set the prefix length for fuzzy queries. Default
is `0`.
|`phrase_slop` |Sets the default slop for phrases. If zero, then exact
phrase matches are required. Default value is `0`.
|`boost` |Sets the boost value of the query. Defaults to `1.0`.
|`auto_generate_phrase_queries` |Defaults to `false`.
|`analyze_wildcard` |By default, wildcards terms in a query string are
not analyzed. By setting this value to `true`, a best effort will be
made to analyze those as well.
|`max_determinized_states` |Limit on how many automaton states regexp
queries are allowed to create. This protects against too-difficult
(e.g. exponentially hard) regexps. Defaults to 10000.
|`minimum_should_match` |A value controlling how many "should" clauses
in the resulting boolean query should match. It can be an absolute value
(`2`), a percentage (`30%`) or a
<<query-dsl-minimum-should-match,combination of
both>>.
|`lenient` |If set to `true` will cause format based failures (like
providing text to a numeric field) to be ignored.
|`time_zone` | Time Zone to be applied to any range query related to dates. See also
http://www.joda.org/joda-time/apidocs/org/joda/time/DateTimeZone.html[JODA timezone].
|`quote_field_suffix` | A suffix to append to fields for quoted parts of
the query string. This allows to use a field that has a different analysis chain
for exact matching. Look <<mixing-exact-search-with-stemming,here>> for a
comprehensive example.
|`split_on_whitespace` |Whether query text should be split on whitespace prior to analysis.
Instead the queryparser would parse around only real 'operators'.
Default to `false`.
|=======================================================================
When a multi term query is being generated, one can control how it gets
rewritten using the
<<query-dsl-multi-term-rewrite,rewrite>>
parameter.
[float]
==== Default Field
When not explicitly specifying the field to search on in the query
string syntax, the `index.query.default_field` will be used to derive
which field to search on. It defaults to `_all` field.
So, if `_all` field is disabled, it might make sense to change it to set
a different default field.
[float]
==== Multi Field
The `query_string` query can also run against multiple fields. Fields can be
provided via the `"fields"` parameter (example below).
The idea of running the `query_string` query against multiple fields is to
expand each query term to an OR clause like this:
field1:query_term OR field2:query_term | ...
For example, the following query
[source,js]
--------------------------------------------------
GET /_search
{
"query": {
"query_string" : {
"fields" : ["content", "name"],
"query" : "this AND that"
}
}
}
--------------------------------------------------
// CONSOLE
matches the same words as
[source,js]
--------------------------------------------------
GET /_search
{
"query": {
"query_string": {
"query": "(content:this OR name:this) AND (content:that OR name:that)"
}
}
}
--------------------------------------------------
// CONSOLE
Since several queries are generated from the individual search terms,
combining them can be automatically done using either a `dis_max` query or a
simple `bool` query. For example (the `name` is boosted by 5 using `^5`
notation):
[source,js]
--------------------------------------------------
GET /_search
{
"query": {
"query_string" : {
"fields" : ["content", "name^5"],
"query" : "this AND that OR thus",
"use_dis_max" : true
}
}
}
--------------------------------------------------
// CONSOLE
Simple wildcard can also be used to search "within" specific inner
elements of the document. For example, if we have a `city` object with
several fields (or inner object with fields) in it, we can automatically
search on all "city" fields:
[source,js]
--------------------------------------------------
GET /_search
{
"query": {
"query_string" : {
"fields" : ["city.*"],
"query" : "this AND that OR thus",
"use_dis_max" : true
}
}
}
--------------------------------------------------
// CONSOLE
Another option is to provide the wildcard fields search in the query
string itself (properly escaping the `*` sign), for example:
`city.\*:something`.
When running the `query_string` query against multiple fields, the
following additional parameters are allowed:
[cols="<,<",options="header",]
|=======================================================================
|Parameter |Description
|`use_dis_max` |Should the queries be combined using `dis_max` (set it
to `true`), or a `bool` query (set it to `false`). Defaults to `true`.
|`tie_breaker` |When using `dis_max`, the disjunction max tie breaker.
Defaults to `0`.
|=======================================================================
The fields parameter can also include pattern based field names,
allowing to automatically expand to the relevant fields (dynamically
introduced fields included). For example:
[source,js]
--------------------------------------------------
GET /_search
{
"query": {
"query_string" : {
"fields" : ["content", "name.*^5"],
"query" : "this AND that OR thus",
"use_dis_max" : true
}
}
}
--------------------------------------------------
// CONSOLE
include::query-string-syntax.asciidoc[]