[[query-dsl-query-string-query]]
=== Query string query
++++
<titleabbrev>Query string</titleabbrev>
++++

A query that uses a query parser to parse its content. For example:

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string" : {
            "default_field" : "content",
            "query" : "this AND that OR thus"
        }
    }
}
--------------------------------------------------
// CONSOLE

The `query_string` query parses the input and splits text around operators.
Each textual part is analyzed independently. For instance, the following query:

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string" : {
            "default_field" : "content",
            "query" : "(new york city) OR (big apple)" <1>
        }
    }
}
--------------------------------------------------
// CONSOLE

<1> will be split into `new york city` and `big apple` and each part is then
analyzed independently by the analyzer configured for the field.

WARNING: Whitespace is not considered an operator. This means that `new york city`
will be passed "as is" to the analyzer configured for the field. If the field is a `keyword`
field, the analyzer will create a single term `new york city` and the query builder will
use this term in the query. If you want to query each term separately, you need to add explicit
operators around the terms (e.g. `new AND york AND city`).
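
A minimal sketch of the explicit-operator form, reusing the `content` field from
the examples above (the query terms are only illustrative):

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string" : {
            "default_field" : "content",
            "query" : "new AND york AND city"
        }
    }
}
--------------------------------------------------
// CONSOLE

With explicit operators, `new`, `york` and `city` are each analyzed and queried as separate terms.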

When multiple fields are provided, it is also possible to modify how the different
field queries are combined inside each textual part using the `type` parameter.
The possible modes are described <<multi-match-types, here>>; the default is `best_fields`.

The `query_string` top-level parameters include:

[cols="<,<",options="header",]
|=======================================================================
|Parameter |Description
|`query` |The actual query to be parsed. See <<query-string-syntax>>.

|`default_field` |The default field for query terms if no prefix field is
specified. Defaults to the `index.query.default_field` index setting, which in
turn defaults to `*`. `*` extracts all fields in the mapping that are eligible
for term queries and filters out the metadata fields. All extracted fields are then
combined to build a query when no prefix field is provided.

WARNING: There is a limit on the number of fields that can be queried
at once. It is defined by the `indices.query.bool.max_clause_count` setting
(see <<search-settings>>), which defaults to 1024.

|`default_operator` |The default operator used if no explicit operator
is specified. For example, with a default operator of `OR`, the query
`capital of Hungary` is translated to `capital OR of OR Hungary`, and
with a default operator of `AND`, the same query is translated to
`capital AND of AND Hungary`. The default value is `OR`.

|`analyzer` |The name of the analyzer used to analyze the query string.

|`quote_analyzer` |The name of the analyzer that is used to analyze
quoted phrases in the query string. For those parts, it overrides other
analyzers that are set using the `analyzer` parameter or the
<<search-quote-analyzer,`search_quote_analyzer`>> setting.

|`allow_leading_wildcard` |When set, `*` or `?` are allowed as the first
character. Defaults to `true`.

|`enable_position_increments` |Set to `true` to enable position
increments in result queries. Defaults to `true`.

|`fuzzy_max_expansions` |Controls the number of terms fuzzy queries will
expand to. Defaults to `50`.

|`fuzziness` |Set the fuzziness for fuzzy queries. Defaults
to `AUTO`. See <<fuzziness>> for allowed settings.

|`fuzzy_prefix_length` |Set the prefix length for fuzzy queries. Default
is `0`.

|`fuzzy_transpositions` |Set to `false` to disable fuzzy transpositions (`ab` -> `ba`).
Default is `true`.

|`phrase_slop` |Sets the default slop for phrases. If zero, then exact
phrase matches are required. Default value is `0`.

|`boost` |Sets the boost value of the query. Defaults to `1.0`.

|`analyze_wildcard` |By default, wildcard terms in a query string are
not analyzed. By setting this value to `true`, a best effort will be
made to analyze those as well.

|`max_determinized_states` |Limit on how many automaton states regexp
queries are allowed to create. This protects against too-difficult
(e.g. exponentially hard) regexps. Defaults to 10000.

|`minimum_should_match` |A value controlling how many "should" clauses
in the resulting boolean query should match. It can be an absolute value
(`2`), a percentage (`30%`) or a
<<query-dsl-minimum-should-match,combination of
both>>.

|`lenient` |If set to `true`, format-based failures (like
providing text to a numeric field) are ignored.

|`time_zone` |Time zone to be applied to any range query related to dates.

|`quote_field_suffix` |A suffix to append to fields for quoted parts of
the query string. This allows the use of a field that has a different analysis chain
for exact matching. See <<mixing-exact-search-with-stemming,here>> for a
comprehensive example.

|`auto_generate_synonyms_phrase_query` |Whether phrase queries should be automatically generated for multi-term synonyms.
Defaults to `true`.

|=======================================================================
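
As a sketch of how a top-level parameter is passed alongside `query`, the request
below sets `default_operator` to `AND`, so the input `capital of Hungary` is treated
as `capital AND of AND Hungary`, as described in the table. The `content` field
is an assumption made for illustration:

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string" : {
            "default_field" : "content",
            "query" : "capital of Hungary",
            "default_operator" : "AND"
        }
    }
}
--------------------------------------------------
// CONSOLE

The other parameters in the table are set the same way, as sibling keys of `query`.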

When a multi-term query is generated, you can control how it is
rewritten using the
<<query-dsl-multi-term-rewrite,rewrite>>
parameter.
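
A minimal sketch of passing `rewrite` with a wildcard term, which produces a
multi-term query. The `content` field and the term `qu*` are illustrative, and
`scoring_boolean` is assumed to be one of the rewrite methods available in your
version (see <<query-dsl-multi-term-rewrite,rewrite>> for the supported values):

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string" : {
            "default_field" : "content",
            "query" : "qu*",
            "rewrite" : "scoring_boolean"
        }
    }
}
--------------------------------------------------
// CONSOLE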

[float]
==== Default Field

When the field to search on is not explicitly specified in the query
string syntax, the `index.query.default_field` setting is used to derive
which field to search on. If `index.query.default_field` is not specified,
the `query_string` query automatically attempts to determine the existing fields in the index's
mapping that are queryable, and performs the search on those fields.
This does not include nested documents. Use a nested query to search those documents.

NOTE: For mappings with a large number of fields, searching across all queryable
fields in the mapping could be expensive.
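
One way to avoid searching every queryable field is to set
`index.query.default_field` on the index. The following is a sketch only: the
index name `my_index` and the field `content` are hypothetical, and the setting
is written as a list, which may need adjusting for older versions:

[source,js]
--------------------------------------------------
PUT /my_index
{
    "settings": {
        "index.query.default_field": ["content"]
    }
}

GET /my_index/_search
{
    "query": {
        "query_string" : {
            "query" : "this AND that"
        }
    }
}
--------------------------------------------------
// CONSOLE

Because the search request specifies neither `default_field` nor `fields`, the
query falls back to the index-level setting.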

[float]
==== Multi Field

The `query_string` query can also run against multiple fields. Fields can be
provided via the `fields` parameter (example below).

The idea of running the `query_string` query against multiple fields is to
expand each query term to an OR clause like this:

    field1:query_term OR field2:query_term | ...

For example, the following query

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string" : {
            "fields" : ["content", "name"],
            "query" : "this AND that"
        }
    }
}
--------------------------------------------------
// CONSOLE

matches the same words as

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string": {
            "query": "(content:this OR name:this) AND (content:that OR name:that)"
        }
    }
}
--------------------------------------------------
// CONSOLE

Since several queries are generated from the individual search terms,
they are automatically combined using a `dis_max` query with a `tie_breaker`.
For example (the `name` field is boosted by 5 using the `^5` notation):

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string" : {
            "fields" : ["content", "name^5"],
            "query" : "this AND that OR thus",
            "tie_breaker" : 0
        }
    }
}
--------------------------------------------------
// CONSOLE

A simple wildcard can also be used to search "within" specific inner
elements of the document. For example, if we have a `city` object with
several fields (or an inner object with fields) in it, we can automatically
search on all "city" fields:

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string" : {
            "fields" : ["city.*"],
            "query" : "this AND that OR thus"
        }
    }
}
--------------------------------------------------
// CONSOLE

Another option is to provide the wildcard fields search in the query
string itself (properly escaping the `*` sign), for example:
`city.\*:something`:

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string" : {
            "query" : "city.\\*:(this AND that OR thus)"
        }
    }
}
--------------------------------------------------
// CONSOLE

NOTE: Since `\` (backslash) is a special character in JSON strings, it needs to
be escaped, hence the two backslashes in the above `query_string`.

When running the `query_string` query against multiple fields, the
following additional parameters are allowed:

[cols="<,<",options="header",]
|=======================================================================
|Parameter |Description

|`type` |How the fields should be combined to build the text query.
See <<multi-match-types, types>> for a complete example.
Defaults to `best_fields`.

|`tie_breaker` |The disjunction max tie breaker for multi fields.
Defaults to `0`.
|=======================================================================
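
A minimal sketch combining both parameters, reusing the `content` and `name`
fields from the earlier examples. The `tie_breaker` value of `0.3` is chosen only
for illustration:

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string" : {
            "fields" : ["content", "name"],
            "query" : "this AND that",
            "type" : "best_fields",
            "tie_breaker" : 0.3
        }
    }
}
--------------------------------------------------
// CONSOLE

With a non-zero `tie_breaker`, fields other than the best-matching one also
contribute to the score, scaled by that factor.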

The `fields` parameter can also include pattern-based field names,
which automatically expand to the relevant fields (dynamically
introduced fields included). For example:

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string" : {
            "fields" : ["content", "name.*^5"],
            "query" : "this AND that OR thus"
        }
    }
}
--------------------------------------------------
// CONSOLE

[float]
==== Synonyms

The `query_string` query supports multi-term synonym expansion with the
<<analysis-synonym-graph-tokenfilter,synonym_graph>> token filter. When this filter is used, the parser creates a phrase query for each multi-term synonym.
For example, the following synonym: `ny, new york` would produce:

`(ny OR ("new york"))`

It is also possible to match multi-term synonyms with conjunctions instead:

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string" : {
            "default_field": "title",
            "query" : "ny city",
            "auto_generate_synonyms_phrase_query" : false
        }
    }
}
--------------------------------------------------
// CONSOLE

The example above creates a boolean query:

`(ny OR (new AND york)) city`

that matches documents with the term `ny` or the conjunction `new AND york`.
By default the parameter `auto_generate_synonyms_phrase_query` is set to `true`.
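
The synonym examples in this section assume that the `title` field is searched with an
analyzer that includes a `synonym_graph` filter. A minimal sketch of such a
setup follows. The index name `my_index`, the filter name `my_synonyms`, and the
analyzer name `my_search_analyzer` are all hypothetical, and the typeless
`mappings` body assumes a version that supports it:

[source,js]
--------------------------------------------------
PUT /my_index
{
    "settings": {
        "analysis": {
            "filter": {
                "my_synonyms": {
                    "type": "synonym_graph",
                    "synonyms": ["ny, new york"]
                }
            },
            "analyzer": {
                "my_search_analyzer": {
                    "tokenizer": "standard",
                    "filter": ["lowercase", "my_synonyms"]
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "title": {
                "type": "text",
                "analyzer": "standard",
                "search_analyzer": "my_search_analyzer"
            }
        }
    }
}
--------------------------------------------------
// CONSOLE

Applying the synonym filter only at search time keeps the indexed terms
unchanged, so the synonym list can be revised without reindexing.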

[float]
==== Minimum should match

The `query_string` query splits the query around each operator to create a boolean
query for the entire input. You can use `minimum_should_match` to control how
many "should" clauses in the resulting query should match.

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string": {
            "fields": [
                "title"
            ],
            "query": "this that thus",
            "minimum_should_match": 2
        }
    }
}
--------------------------------------------------
// CONSOLE

The example above creates a boolean query:

`(title:this title:that title:thus)~2`

that matches documents with at least two of the terms `this`, `that` or `thus`
in the single field `title`.
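
As noted in the parameter table, `minimum_should_match` also accepts a
percentage. The sketch below reuses the same `title` field, with `75%` chosen
only for illustration. With three optional clauses, this should round down to
two required clauses under the rules described in
<<query-dsl-minimum-should-match>>:

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string": {
            "fields": [
                "title"
            ],
            "query": "this that thus",
            "minimum_should_match": "75%"
        }
    }
}
--------------------------------------------------
// CONSOLE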

[float]
===== Multi Field

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string": {
            "fields": [
                "title",
                "content"
            ],
            "query": "this that thus",
            "minimum_should_match": 2
        }
    }
}
--------------------------------------------------
// CONSOLE

The example above creates a boolean query:

`((content:this content:that content:thus) | (title:this title:that title:thus))`

that matches documents with the disjunction max over the fields `title` and
`content`. Here the `minimum_should_match` parameter can't be applied.

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string": {
            "fields": [
                "title",
                "content"
            ],
            "query": "this OR that OR thus",
            "minimum_should_match": 2
        }
    }
}
--------------------------------------------------
// CONSOLE

Adding explicit operators forces each term to be considered as a separate clause.

The example above creates a boolean query:

`((content:this | title:this) (content:that | title:that) (content:thus | title:thus))~2`

that matches documents with at least two of the three "should" clauses, each of
them made of the disjunction max over the fields for each term.

[float]
===== Cross Field

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string": {
            "fields": [
                "title",
                "content"
            ],
            "query": "this OR that OR thus",
            "type": "cross_fields",
            "minimum_should_match": 2
        }
    }
}
--------------------------------------------------
// CONSOLE

The `cross_fields` value in the `type` field indicates that fields that have the
same analyzer should be grouped together when the input is analyzed.

The example above creates a boolean query:

`(blended(terms:[field2:this, field1:this]) blended(terms:[field2:that, field1:that]) blended(terms:[field2:thus, field1:thus]))~2`

that matches documents with at least two of the three per-term blended queries.

include::query-string-syntax.asciidoc[]