diff --git a/docs/reference/query-dsl/match-query.asciidoc b/docs/reference/query-dsl/match-query.asciidoc
index 27dde4c2a91..a894ef0dae2 100644
--- a/docs/reference/query-dsl/match-query.asciidoc
+++ b/docs/reference/query-dsl/match-query.asciidoc
@@ -185,15 +185,3 @@ The example above creates a boolean query:
that matches documents with the term `ny` or the conjunction `new AND york`.
By default the parameter `auto_generate_synonyms_phrase_query` is set to `true`.
-
-.Comparison to query_string / field
-**************************************************
-
-The match family of queries does not go through a "query parsing"
-process. It does not support field name prefixes, wildcard characters,
-or other "advanced" features. For this reason, chances of it failing are
-very small / non existent, and it provides an excellent behavior when it
-comes to just analyze and run that text as a query behavior (which is
-usually what a text search box does).
-
-**************************************************
diff --git a/docs/reference/query-dsl/query-string-query.asciidoc b/docs/reference/query-dsl/query-string-query.asciidoc
index 967dd906eec..cced4f30eeb 100644
--- a/docs/reference/query-dsl/query-string-query.asciidoc
+++ b/docs/reference/query-dsl/query-string-query.asciidoc
@@ -4,8 +4,39 @@
Query string
++++
-A query that uses a query parser in order to parse its content. Here is
-an example:
+Returns documents based on a provided query string, using a parser with a strict
+syntax.
+
+This query uses a <> to parse and split the provided
+query string based on operators, such as `AND` or `NOT`. The query
+then <> each split text independently before returning
+matching documents.
+
+You can use the `query_string` query to create a complex search that includes
+wildcard characters, searches across multiple fields, and more. While versatile,
+the query is strict and returns an error if the query string includes any
+invalid syntax.
+
+[WARNING]
+====
+Because it returns an error for any invalid syntax, we don't recommend using
+the `query_string` query for search boxes.
+
+If you don't need to support a query syntax, consider using the
+<> query. If you need the features of a query
+syntax, use the <>
+query, which is less strict.
+====
+
+
+[[query-string-query-ex-request]]
+==== Example request
+
+When running the following search, the `query_string` query splits `(new york
+city) OR (big apple)` into two parts: `new york city` and `big apple`. The
+`content` field's analyzer then independently converts each part into tokens
+before returning matching documents. Because the query syntax does not use
+whitespace as an operator, `new york city` is passed as-is to the analyzer.
[source,js]
--------------------------------------------------
@@ -13,154 +44,211 @@ GET /_search
{
"query": {
"query_string" : {
- "default_field" : "content",
- "query" : "this AND that OR thus"
+ "query" : "(new york city) OR (big apple)",
+ "default_field" : "content"
}
}
}
--------------------------------------------------
// CONSOLE
-The `query_string` query parses the input and splits text around operators.
-Each textual part is analyzed independently of each other. For instance the following query:
+[[query-string-top-level-params]]
+==== Top-level parameters for `query_string`
+`query`::
+(Required, string) Query string you wish to parse and use for search. See
+<>.
-[source,js]
---------------------------------------------------
-GET /_search
-{
- "query": {
- "query_string" : {
- "default_field" : "content",
- "query" : "(new york city) OR (big apple)" <1>
- }
- }
-}
---------------------------------------------------
-// CONSOLE
+`default_field`::
++
+--
+(Optional, string) Default field you wish to search if no field is provided in
+the query string.
-<1> will be split into `new york city` and `big apple` and each part is then
-analyzed independently by the analyzer configured for the field.
+Defaults to the `index.query.default_field` index setting, which has a default
+value of `*`. The `*` value extracts all fields that are eligible for term
+queries and filters out the metadata fields. All extracted fields are then
+combined to build a query if no field prefix is specified.
-WARNING: Whitespaces are not considered operators, this means that `new york city`
-will be passed "as is" to the analyzer configured for the field. If the field is a `keyword`
-field the analyzer will create a single term `new york city` and the query builder will
-use this term in the query. If you want to query each term separately you need to add explicit
-operators around the terms (e.g. `new AND york AND city`).
+WARNING: There is a limit on the number of fields that can be queried at once.
+It is defined by the `indices.query.bool.max_clause_count`
+<>, which defaults to 1024.
+--
-When multiple fields are provided it is also possible to modify how the different
-field queries are combined inside each textual part using the `type` parameter.
-The possible modes are described <> and the default is `best_fields`.
+`allow_leading_wildcard`::
+(Optional, boolean) If `true`, the wildcard characters `*` and `?` are allowed
+as the first character of the query string. Defaults to `true`.
-The `query_string` top level parameters include:
+`analyze_wildcard`::
+(Optional, boolean) If `true`, the query attempts to analyze wildcard terms in
+the query string. Defaults to `false`.
-[cols="<,<",options="header",]
-|=======================================================================
-|Parameter |Description
-|`query` |The actual query to be parsed. See <>.
+`analyzer`::
+(Optional, string) <> used to convert text in the
+query string into tokens. Defaults to the
+<> mapped for the
+`default_field`. If no analyzer is mapped, the index's default analyzer is used.
-|`default_field` |The default field for query terms if no prefix field is
-specified. Defaults to the `index.query.default_field` index settings, which in
-turn defaults to `*`. `*` extracts all fields in the mapping that are eligible
-to term queries and filters the metadata fields. All extracted fields are then
-combined to build a query when no prefix field is provided.
+`auto_generate_synonyms_phrase_query`::
+(Optional, boolean) If `true`, <>
+queries are automatically created for multi-term synonyms. Defaults to `true`.
+See <> for an example.
-WARNING: There is a limit on the number of fields that can be queried
-at once. It is defined by the `indices.query.bool.max_clause_count` <>
-which defaults to 1024.
+`boost`::
++
+--
+(Optional, float) Floating point number used to decrease or increase the
+<> of the query. Defaults to `1.0`.
-|`default_operator` |The default operator used if no explicit operator
-is specified. For example, with a default operator of `OR`, the query
-`capital of Hungary` is translated to `capital OR of OR Hungary`, and
-with default operator of `AND`, the same query is translated to
-`capital AND of AND Hungary`. The default value is `OR`.
+Boost values are relative to the default value of `1.0`. A boost value between
+`0` and `1.0` decreases the relevance score. A value greater than `1.0`
+increases the relevance score.
+--
-|`analyzer` |The analyzer name used to analyze the query string.
+`default_operator`::
++
+--
+(Optional, string) Default boolean logic used to interpret text in the query
+string if no operators are specified. Valid values are:
-|`quote_analyzer` |The name of the analyzer that is used to analyze
-quoted phrases in the query string. For those parts, it overrides other
-analyzers that are set using the `analyzer` parameter or the
-<> setting.
+ `OR` (Default)::
+For example, a query string of `capital of Hungary` is interpreted as `capital
+OR of OR Hungary`.
-|`allow_leading_wildcard` |When set, `*` or `?` are allowed as the first
-character. Defaults to `true`.
+ `AND`::
+For example, a query string of `capital of Hungary` is interpreted as `capital
+AND of AND Hungary`.
+--
-|`enable_position_increments` |Set to `true` to enable position
-increments in result queries. Defaults to `true`.
+`enable_position_increments`::
+(Optional, boolean) If `true`, enables position increments in queries constructed
+from a `query_string` search. Defaults to `true`.
-|`fuzzy_max_expansions` |Controls the number of terms fuzzy queries will
-expand to. Defaults to `50`
+`fields`::
++
+--
+(Optional, array of strings) Array of fields you wish to search.
-|`fuzziness` |Set the fuzziness for fuzzy queries. Defaults
-to `AUTO`. See <> for allowed settings.
+You can use this parameter to search across multiple fields. See
+<>.
+--
-|`fuzzy_prefix_length` |Set the prefix length for fuzzy queries. Default
-is `0`.
+`fuzziness`::
+(Optional, string) Maximum edit distance allowed for matching. See <>
+for valid values and more information.
-|`fuzzy_transpositions` |Set to `false` to disable fuzzy transpositions (`ab` -> `ba`).
-Default is `true`.
+`fuzzy_max_expansions`::
+(Optional, integer) Maximum number of terms to which the query expands for fuzzy
+matching. Defaults to `50`.
-|`phrase_slop` |Sets the default slop for phrases. If zero, then exact
-phrase matches are required. Default value is `0`.
+`fuzzy_prefix_length`::
+(Optional, integer) Number of beginning characters left unchanged for fuzzy
+matching. Defaults to `0`.
-|`boost` |Sets the boost value of the query. Defaults to `1.0`.
+`fuzzy_transpositions`::
+(Optional, boolean) If `true`, edits for fuzzy matching include
+transpositions of two adjacent characters (ab → ba). Defaults to `true`.
-|`analyze_wildcard` |By default, wildcards terms in a query string are
-not analyzed. By setting this value to `true`, a best effort will be
-made to analyze those as well.
+`lenient`::
+(Optional, boolean) If `true`, format-based errors, such as providing a text
+value for a <> field, are ignored. Defaults to `false`.
-|`max_determinized_states` |Limit on how many automaton states regexp
-queries are allowed to create. This protects against too-difficult
-(e.g. exponentially hard) regexps. Defaults to 10000.
+`max_determinized_states`::
++
+--
+(Optional, integer) Maximum number of
+https://en.wikipedia.org/wiki/Deterministic_finite_automaton[automaton states]
+required for the query. Defaults to `10000`.
-|`minimum_should_match` |A value controlling how many "should" clauses
-in the resulting boolean query should match. It can be an absolute value
-(`2`), a percentage (`30%`) or a
-<>.
+{es} uses https://lucene.apache.org/core/[Apache Lucene] internally to parse
+regular expressions. Lucene converts each regular expression to a finite
+automaton containing a number of determinized states.
-|`lenient` |If set to `true` will cause format based failures (like
-providing text to a numeric field) to be ignored.
+You can use this parameter to prevent that conversion from unintentionally
+consuming too many resources. You may need to increase this limit to run complex
+regular expressions.
+--
-|`time_zone` | Time Zone to be applied to any range query related to dates.
+`minimum_should_match`::
+(Optional, string) Minimum number of clauses that must match for a document to
+be returned. See the <> for valid values and more information. See
+<> for an example.
-|`quote_field_suffix` | A suffix to append to fields for quoted parts of
-the query string. This allows to use a field that has a different analysis chain
-for exact matching. Look <> for a
-comprehensive example.
+`quote_analyzer`::
++
+--
+(Optional, string) <> used to convert quoted text in the
+query string into tokens. Defaults to the
+<> mapped for the
+`default_field`.
-|`auto_generate_synonyms_phrase_query` |Whether phrase queries should be automatically generated for multi terms synonyms.
-Defaults to `true`.
+For quoted text, this parameter overrides the analyzer specified in the
+`analyzer` parameter.
+--
-|=======================================================================
+`phrase_slop`::
+(Optional, integer) Maximum number of positions allowed between matching tokens
+for phrases. Defaults to `0`. If `0`, exact phrase matches are required.
+Transposed terms have a slop of `2`.
-When a multi term query is being generated, one can control how it gets
-rewritten using the
-<>
-parameter.
+`quote_field_suffix`::
++
+--
+(Optional, string) Suffix appended to field names for quoted text in the query string.
-[float]
-==== Default Field
+You can use this suffix to use a different analysis method for exact matches.
+See <>.
+--
-When not explicitly specifying the field to search on in the query
-string syntax, the `index.query.default_field` will be used to derive
-which field to search on. If the `index.query.default_field` is not specified,
-the `query_string` will automatically attempt to determine the existing fields in the index's
-mapping that are queryable, and perform the search on those fields.
-This will not include nested documents, use a nested query to search those documents.
+`rewrite`::
+(Optional, string) Method used to rewrite the query. For valid values and more
+information, see the <>.
-NOTE: For mappings with a large number of fields, searching across all queryable
-fields in the mapping could be expensive.
+`time_zone`::
++
+--
+(Optional, string)
+https://en.wikipedia.org/wiki/List_of_UTC_time_offsets[Coordinated Universal
+Time (UTC) offset] or
+https://en.wikipedia.org/wiki/List_of_tz_database_time_zones[IANA time zone]
+used to convert `date` values in the query string to UTC.
-[float]
-==== Multi Field
+Valid values are ISO 8601 UTC offsets, such as `+01:00` or `-08:00`, and IANA
+time zone IDs, such as `America/Los_Angeles`.
-The `query_string` query can also run against multiple fields. Fields can be
-provided via the `fields` parameter (example below).
+[NOTE]
+====
+The `time_zone` parameter does **not** affect the <> value
+of `now`. `now` is always the current system time in UTC. However, the
+`time_zone` parameter does convert dates calculated using `now` and
+<>. For example, the `time_zone` parameter will
+convert a value of `now/d`.
+====
+--
+
+[[query-string-query-notes]]
+==== Notes
+
+include::query-string-syntax.asciidoc[]
+
+[[query-string-nested]]
+====== Avoid using the `query_string` query for nested documents
+
+`query_string` searches do not return <> documents. To search
+nested documents, use the <>.
+
+[[query-string-multi-field]]
+====== Search multiple fields
+
+You can use the `fields` parameter to perform a `query_string` search across
+multiple fields.
The idea of running the `query_string` query against multiple fields is to
expand each query term to an OR clause like this:
- field1:query_term OR field2:query_term | ...
+```
+field1:query_term OR field2:query_term | ...
+```
For example, the following query
@@ -252,21 +340,6 @@ GET /_search
NOTE: Since `\` (backslash) is a special character in json strings, it needs to
be escaped, hence the two backslashes in the above `query_string`.
-When running the `query_string` query against multiple fields, the
-following additional parameters are allowed:
-
-[cols="<,<",options="header",]
-|=======================================================================
-|Parameter |Description
-
-|`type` |How the fields should be combined to build the text query.
-See <> for a complete example.
-Defaults to `best_fields`
-
-|`tie_breaker` |The disjunction max tie breaker for multi fields.
-Defaults to `0`
-|=======================================================================
-
The fields parameter can also include pattern based field names,
allowing to automatically expand to the relevant fields (dynamically
introduced fields included). For example:
@@ -285,8 +358,50 @@ GET /_search
--------------------------------------------------
// CONSOLE
-[float]
-==== Synonyms
+[[query-string-multi-field-parms]]
+====== Additional parameters for multiple field searches
+
+When running the `query_string` query against multiple fields, the
+following additional parameters are supported.
+
+`type`::
++
+--
+(Optional, string) Determines how the query matches and scores documents. Valid
+values are:
+
+`best_fields` (Default)::
+Finds documents which match any field and uses the highest
+<> from any matching field. See
+<>.
+
+`bool_prefix`::
+Creates a `match_bool_prefix` query on each field and combines the `_score` from
+each field. See <>.
+
+`cross_fields`::
+Treats fields with the same `analyzer` as though they were one big field. Looks
+for each word in **any** field. See <>.
+
+`most_fields`::
+Finds documents which match any field and combines the `_score` from each field.
+See <>.
+
+`phrase`::
+Runs a `match_phrase` query on each field and uses the `_score` from the best
+field. See <>.
+
+`phrase_prefix`::
+Runs a `match_phrase_prefix` query on each field and uses the `_score` from the
+best field. See <>.
+
+NOTE:
+Additional top-level `multi_match` parameters may be available based on the
+<> value.
+--
+
+[[query-string-synonyms]]
+===== Synonyms and the `query_string` query
The `query_string` query supports multi-terms synonym expansion with the <> token filter. When this filter is used, the parser creates a phrase query for each multi-terms synonyms.
@@ -318,8 +433,8 @@ The example above creates a boolean query:
that matches documents with the term `ny` or the conjunction `new AND york`.
By default the parameter `auto_generate_synonyms_phrase_query` is set to `true`.
-[float]
-==== Minimum should match
+[[query-string-min-should-match]]
+===== How `minimum_should_match` works
The `query_string` splits the query around each operator to create a boolean
query for the entire input. You can use `minimum_should_match` to control how
@@ -349,8 +464,8 @@ The example above creates a boolean query:
that matches documents with at least two of the terms `this`, `that` or `thus`
in the single field `title`.
-[float]
-===== Multi Field
+[[query-string-min-should-match-multi]]
+===== How `minimum_should_match` works for multiple fields
[source,js]
--------------------------------------------------
@@ -404,8 +519,11 @@ The example above creates a boolean query:
that matches documents with at least two of the three "should" clauses, each of
them made of the disjunction max over the fields for each term.
-[float]
-===== Cross Field
+[[query-string-min-should-match-cross]]
+===== How `minimum_should_match` works for cross-field searches
+
+A `cross_fields` value in the `type` parameter indicates that fields with the
+same analyzer are grouped together when the input is analyzed.
[source,js]
--------------------------------------------------
@@ -426,13 +544,8 @@ GET /_search
--------------------------------------------------
// CONSOLE
-The `cross_fields` value in the `type` field indicates that fields that have the
-same analyzer should be grouped together when the input is analyzed.
-
The example above creates a boolean query:
`(blended(terms:[field2:this, field1:this]) blended(terms:[field2:that, field1:that]) blended(terms:[field2:thus, field1:thus]))~2`
that matches documents with at least two of the three per-term blended queries.
-
-include::query-string-syntax.asciidoc[]
diff --git a/docs/reference/query-dsl/query-string-syntax.asciidoc b/docs/reference/query-dsl/query-string-syntax.asciidoc
index 765b54b5883..03a2e8b8212 100644
--- a/docs/reference/query-dsl/query-string-syntax.asciidoc
+++ b/docs/reference/query-dsl/query-string-syntax.asciidoc
@@ -1,6 +1,6 @@
[[query-string-syntax]]
-==== Query string syntax
+===== Query string syntax
The query string ``mini-language'' is used by the
<> and by the
@@ -14,10 +14,9 @@ phrase, in the same order.
Operators allow you to customize the search -- the available options are
explained below.
-===== Field names
+====== Field names
-As mentioned in <>, the `default_field` is searched for the
-search terms, but it is possible to specify other fields in the query syntax:
+You can specify fields to search in the query syntax:
* where the `status` field contains `active`
@@ -40,7 +39,7 @@ search terms, but it is possible to specify other fields in the query syntax:
_exists_:title
-===== Wildcards
+====== Wildcards
Wildcard searches can be run on individual terms, using `?` to replace
a single character, and `*` to replace zero or more characters:
@@ -88,7 +87,7 @@ analyzed and a boolean query will be built out of the different tokens, by
ensuring exact matches on the first N-1 tokens, and prefix match on the last
token.
-===== Regular expressions
+====== Regular expressions
Regular expression patterns can be embedded in the query string by
wrapping them in forward-slashes (`"/"`):
@@ -108,7 +107,7 @@ Elasticsearch to visit every term in the index:
Use with caution!
=======
-===== Fuzziness
+====== Fuzziness
We can search for terms that are
similar to, but not exactly like our search terms, using the ``fuzzy''
@@ -128,7 +127,7 @@ sufficient to catch 80% of all human misspellings. It can be specified as:
quikc~1
-===== Proximity searches
+====== Proximity searches
While a phrase query (eg `"john smith"`) expects all of the terms in exactly
the same order, a proximity query allows the specified words to be further
@@ -143,7 +142,7 @@ query string, the more relevant that document is considered to be. When
compared to the above example query, the phrase `"quick fox"` would be
considered more relevant than `"quick brown fox"`.
-===== Ranges
+====== Ranges
Ranges can be specified for date, numeric or string fields. Inclusive ranges
are specified with square brackets `[min TO max]` and exclusive ranges with
@@ -197,7 +196,7 @@ The parsing of ranges in query strings can be complex and error prone. It is
much more reliable to use an explicit <>.
-===== Boosting
+====== Boosting
Use the _boost_ operator `^` to make one term more relevant than another.
For instance, if we want to find all documents about foxes, but we are
@@ -212,7 +211,7 @@ Boosts can also be applied to phrases or to groups:
"john smith"^2 (foo bar)^4
-===== Boolean operators
+====== Boolean operators
By default, all terms are optional, as long as one term matches. A search
for `foo bar baz` will find any document that contains one or more of
@@ -255,7 +254,7 @@ would look like this:
}
-===== Grouping
+====== Grouping
Multiple terms or clauses can be grouped together with parentheses, to form
sub-queries:
@@ -267,7 +266,7 @@ of a sub-query:
status:(active OR pending) title:(full text search)^2
-===== Reserved characters
+====== Reserved characters
If you need to use any of the characters which function as operators in your
query itself (and not as operators), then you should escape them with
@@ -283,7 +282,9 @@ NOTE: `<` and `>` can't be escaped at all. The only way to prevent them from
attempting to create a range query is to remove them from the query string
entirely.
-===== Empty Query
+====== Whitespaces and empty queries
+
+Whitespace is not considered an operator.
If the query string is empty or only contains whitespaces the query will
yield an empty result set.