[DOCS] Reformats simple query string query (#45343)

James Rodewig 2019-08-09 08:32:46 -04:00
parent 634a070430
commit eec87ffab8


@ -4,10 +4,21 @@
<titleabbrev>Simple query string</titleabbrev>
++++
A query that uses the SimpleQueryParser to parse its context. Unlike the
regular `query_string` query, the `simple_query_string` query will never
throw an exception, and discards invalid parts of the query. Here is
an example:
Returns documents based on a provided query string, using a parser with a
limited but fault-tolerant syntax.
This query uses a <<simple-query-string-syntax,simple syntax>> to parse and
split the provided query string into terms based on special operators. The query
then <<analysis,analyzes>> each term independently before returning matching
documents.
While its syntax is more limited than the
<<query-dsl-query-string-query,`query_string` query>>, the `simple_query_string`
query does not return errors for invalid syntax. Instead, it ignores any invalid
parts of the query string.
[[simple-query-string-query-ex-request]]
==== Example request
[source,js]
--------------------------------------------------
@ -24,72 +35,108 @@ GET /_search
--------------------------------------------------
// CONSOLE
The `simple_query_string` top level parameters include:
[cols="<,<",options="header",]
|=======================================================================
|Parameter |Description
|`query` |The actual query to be parsed. See below for syntax.
[[simple-query-string-top-level-params]]
==== Top-level parameters for `simple_query_string`
|`fields` |The fields to perform the parsed query against. Defaults to the
`index.query.default_field` index settings, which in turn defaults to `*`. `*`
extracts all fields in the mapping that are eligible to term queries and filters
the metadata fields.
`query`::
(Required, string) Query string you wish to parse and use for search. See <<simple-query-string-syntax>>.
WARNING: There is a limit on the number of fields that can be queried
at once. It is defined by the `indices.query.bool.max_clause_count` <<search-settings>>
which defaults to 1024.
`fields`::
+
--
(Optional, array of strings) Array of fields you wish to search.
|`default_operator` |The default operator used if no explicit operator
is specified. For example, with a default operator of `OR`, the query
`capital of Hungary` is translated to `capital OR of OR Hungary`, and
with default operator of `AND`, the same query is translated to
`capital AND of AND Hungary`. The default value is `OR`.
This field accepts wildcard expressions. You can also boost relevance scores for
matches to particular fields using caret (`^`) notation. See
<<simple-query-string-boost>> for examples.
|`analyzer` |Force the analyzer to use to analyze each term of the query when
creating composite queries.
Defaults to the `index.query.default_field` index setting, which has a default
value of `*`. The `*` value extracts all fields that are eligible for term
queries and filters out the metadata fields. All extracted fields are then
combined to build a query if no `prefix` is specified.
|`flags` |A set of <<supported-flags,flags>> specifying which features of the
`simple_query_string` to enable. Defaults to `ALL`.
WARNING: There is a limit on the number of fields that can be queried at once.
It is defined by the `indices.query.bool.max_clause_count`
<<search-settings,search setting>>, which defaults to `1024`.
--
|`analyze_wildcard` | Whether terms of prefix queries should be automatically
analyzed or not. If `true` a best effort will be made to analyze the prefix. However,
some analyzers will be not able to provide a meaningful results
based just on the prefix of a term. Defaults to `false`.
`default_operator`::
+
--
(Optional, string) Default boolean logic used to interpret text in the query
string if no operators are specified. Valid values are:
|`lenient` | If set to `true` will cause format based failures
(like providing text to a numeric field) to be ignored.
`OR` (Default)::
For example, a query string of `capital of Hungary` is interpreted as `capital
OR of OR Hungary`.
|`minimum_should_match` | The minimum number of clauses that must match for a
document to be returned. See the
<<query-dsl-minimum-should-match,`minimum_should_match`>> documentation for the
full list of options.
`AND`::
For example, a query string of `capital of Hungary` is interpreted as `capital
AND of AND Hungary`.
--
|`quote_field_suffix` | A suffix to append to fields for quoted parts of
the query string. This allows to use a field that has a different analysis chain
for exact matching. Look <<mixing-exact-search-with-stemming,here>> for a
comprehensive example.
`all_fields`::
deprecated:[6.0.0, set `fields` to `*` instead] (Optional, boolean) If `true`,
search all searchable fields in the index's field mapping.
|`auto_generate_synonyms_phrase_query` |Whether phrase queries should be automatically generated for multi terms synonyms.
Defaults to `true`.
`analyze_wildcard`::
(Optional, boolean) If `true`, the query attempts to analyze wildcard terms in
the query string. Defaults to `false`.
|`all_fields` | deprecated[6.0.0, set `fields` to `*` instead]
Perform the query on all fields detected in the mapping that can
be queried.
`analyzer`::
(Optional, string) <<analysis,Analyzer>> used to convert text in the
query string into tokens. Defaults to the
<<specify-index-time-analyzer,index-time analyzer>> mapped for the
`default_field`. If no analyzer is mapped, the index's default analyzer is used.
|`fuzzy_prefix_length` |Set the prefix length for fuzzy queries. Default
is `0`.
`auto_generate_synonyms_phrase_query`::
(Optional, boolean) If `true`, <<query-dsl-match-query-phrase,match phrase>>
queries are automatically created for multi-term synonyms. Defaults to `true`.
See <<simple-query-string-synonyms>> for an example.
|`fuzzy_max_expansions` |Controls the number of terms fuzzy queries will
expand to. Defaults to `50`
`flags`::
(Optional, string) List of enabled operators for the
<<simple-query-string-syntax,simple query string syntax>>. Defaults to `ALL`
(all operators). See <<supported-flags>> for valid values.
|`fuzzy_transpositions` |Set to `false` to disable fuzzy transpositions (`ab` -> `ba`).
Default is `true`.
|=======================================================================
`fuzzy_max_expansions`::
(Optional, integer) Maximum number of terms to which the query expands for fuzzy
matching. Defaults to `50`.
[float]
===== Simple Query String Syntax
The `simple_query_string` supports the following special characters:
`fuzzy_prefix_length`::
(Optional, integer) Number of beginning characters left unchanged for fuzzy
matching. Defaults to `0`.
`fuzzy_transpositions`::
(Optional, boolean) If `true`, edits for fuzzy matching include
transpositions of two adjacent characters (ab → ba). Defaults to `true`.
`lenient`::
(Optional, boolean) If `true`, format-based errors, such as providing a text
value for a <<number,numeric>> field, are ignored. Defaults to `false`.
`minimum_should_match`::
(Optional, string) Minimum number of clauses that must match for a document to
be returned. See the <<query-dsl-minimum-should-match, `minimum_should_match`
parameter>> for valid values and more information.
`quote_field_suffix`::
+
--
(Optional, string) Suffix appended to quoted text in the query string.
You can use this suffix to use a different analysis method for exact matches.
See <<mixing-exact-search-with-stemming>>.
--
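As a rough illustration of how several of these parameters fit together, the following Python sketch assembles a `simple_query_string` request body as a plain dictionary. The helper name `build_simple_query_string` and the sample field names are hypothetical. The sketch only builds JSON and does not contact a cluster.

```python
import json


def build_simple_query_string(query, fields=None, default_operator="OR", **options):
    """Assemble a search request body for a simple_query_string query.

    Hypothetical helper for illustration; it only builds a dict.
    """
    # This sketch accepts only the two values listed for default_operator.
    if default_operator not in ("OR", "AND"):
        raise ValueError("default_operator must be 'OR' or 'AND'")
    params = {"query": query, "default_operator": default_operator}
    if fields is not None:
        params["fields"] = list(fields)
    # Pass through other top-level parameters, such as lenient
    # or minimum_should_match, unchanged.
    params.update(options)
    return {"query": {"simple_query_string": params}}


body = build_simple_query_string(
    "capital of Hungary",
    fields=["title", "body"],  # sample field names, assumed for this example
    default_operator="AND",
    lenient=True,
)
print(json.dumps(body, indent=2))
```

The printed JSON could be sent as the body of a `GET /_search` request.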
[[simple-query-string-query-notes]]
==== Notes
[[simple-query-string-syntax]]
===== Simple query string syntax
The `simple_query_string` query supports the following operators:
* `+` signifies AND operation
* `|` signifies OR operation
@ -100,11 +147,11 @@ The `simple_query_string` supports the following special characters:
* `~N` after a word signifies edit distance (fuzziness)
* `~N` after a phrase signifies slop amount
In order to search for any of these special characters, they will need to
be escaped with `\`.
To use one of these characters literally, escape it with a preceding backslash
(`\`).
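A client that accepts free-form user input might escape these characters before building the query string. The sketch below is one possible approach in Python. The set of characters is an assumption, because the operator list shown here is partly elided, and `escape_simple_query` is a hypothetical helper rather than part of any Elasticsearch client.

```python
# Characters this sketch treats as operators. This set is an assumption;
# adjust it to match the operators you have enabled.
SPECIAL_CHARS = set('+|-"*()~\\')


def escape_simple_query(text):
    """Prefix each special character with a backslash so it is matched literally."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in text)


print(escape_simple_query("C++ (beta)"))  # prints C\+\+ \(beta\)
```

When the escaped string is placed in a JSON request body, each backslash must itself be escaped by the JSON encoder, which `json.dumps` does automatically.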
Be aware that this syntax may have a different behavior depending on the
`default_operator` value. For example, consider the following query:
The behavior of these operators may differ depending on the `default_operator`
value. For example:
[source,js]
--------------------------------------------------
@ -120,47 +167,20 @@ GET /_search
--------------------------------------------------
// CONSOLE
You may expect that documents containing only "foo" or "bar" will be returned,
as long as they do not contain "baz", however, due to the `default_operator`
being OR, this really means "match documents that contain "foo" or documents
that contain "bar", or documents that don't contain "baz". If this is unintended
then the query can be switched to `"foo bar +-baz"` which will not return
documents that contain "baz".
This search is intended to return only documents containing `foo` or `bar` that
also do **not** contain `baz`. However, because of a `default_operator` of `OR`,
this search actually returns documents that contain `foo` or `bar` and any
documents that don't contain `baz`. To return documents as intended, change the
query string to `foo bar +-baz`.
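The difference between the two readings can be checked with a toy model. The Python sketch below does not run the real parser. It hard-codes the two boolean interpretations described above and applies them to three made-up documents, each represented as a set of terms.

```python
def matches_with_or_default(terms):
    """Toy reading of `foo bar -baz` with an OR default: foo OR bar OR (NOT baz)."""
    return "foo" in terms or "bar" in terms or "baz" not in terms


def matches_intended(terms):
    """Toy reading of `foo bar +-baz`: (foo OR bar) AND (NOT baz)."""
    return ("foo" in terms or "bar" in terms) and "baz" not in terms


# Made-up documents, reduced to the terms they contain.
docs = {
    "doc1": {"foo"},
    "doc2": {"bar", "baz"},
    "doc3": {"qux"},
}

for name, terms in docs.items():
    print(name, matches_with_or_default(terms), matches_intended(terms))
```

In this model, `doc2` and `doc3` match under the first reading but not under the intended one: `doc2` contains `baz`, and `doc3` contains neither `foo` nor `bar`.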
[float]
==== Default Field
When not explicitly specifying the field to search on in the query
string syntax, the `index.query.default_field` will be used to derive
which fields to search on. It defaults to `*` and the query will automatically
attempt to determine the existing fields in the index's mapping that are queryable,
and perform the search on those fields.
[float]
==== Multi Field
The fields parameter can also include pattern based field names,
allowing to automatically expand to the relevant fields (dynamically
introduced fields included). For example:
[source,js]
--------------------------------------------------
GET /_search
{
"query": {
"simple_query_string" : {
"fields" : ["content", "name.*^5"],
"query" : "foo bar baz"
}
}
}
--------------------------------------------------
// CONSOLE
[float]
[[supported-flags]]
==== Flags
`simple_query_string` support multiple flags to specify which parsing features
should be enabled. It is specified as a `|`-delimited string with the
`flags` parameter:
===== Limit operators
You can use the `flags` parameter to limit the supported operators for the
simple query string syntax.
To explicitly enable only specific operators, use a `|` separator. For example,
a `flags` value of `OR|AND|PREFIX` disables all operators except `OR`, `AND`,
and `PREFIX`.
[source,js]
--------------------------------------------------
@ -176,28 +196,92 @@ GET /_search
--------------------------------------------------
// CONSOLE
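If the `flags` value is assembled in client code, a small helper can catch misspelled flag names before the request is sent. The following Python sketch is one way to do that, using the flag names documented for this parameter. `build_flags` is a hypothetical helper, not part of any Elasticsearch client.

```python
# Flag names documented for the `flags` parameter.
VALID_FLAGS = {
    "ALL", "AND", "ESCAPE", "FUZZY", "NEAR", "NONE", "NOT", "OR",
    "PHRASE", "PRECEDENCE", "PREFIX", "SLOP", "WHITESPACE",
}


def build_flags(*flags):
    """Join flag names with `|`, rejecting names that are not documented."""
    if not flags:
        raise ValueError("at least one flag is required")
    unknown = [flag for flag in flags if flag not in VALID_FLAGS]
    if unknown:
        raise ValueError("unknown flags: " + ", ".join(unknown))
    return "|".join(flags)


print(build_flags("OR", "AND", "PREFIX"))  # prints OR|AND|PREFIX
```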
[[supported-flags-values]]
====== Valid values
The available flags are:
[cols="<,<",options="header",]
|=======================================================================
|Flag |Description
|`ALL` |Enables all parsing features. This is the default.
|`NONE` |Switches off all parsing features.
|`AND` |Enables the `+` AND operator.
|`OR` |Enables the `\|` OR operator.
|`NOT` |Enables the `-` NOT operator.
|`PREFIX` |Enables the `*` Prefix operator.
|`PHRASE` |Enables the `"` quotes operator used to search for phrases.
|`PRECEDENCE` |Enables the `(` and `)` operators to control operator precedence.
|`ESCAPE` |Enables `\` as the escape character.
|`WHITESPACE` |Enables whitespaces as split characters.
|`FUZZY` |Enables the `~N` operator after a word where N is an integer denoting the allowed edit distance for matching (see <<fuzziness>>).
|`SLOP` |Enables the `~N` operator after a phrase where N is an integer denoting the slop amount.
|`NEAR` |Synonymous to `SLOP`.
|=======================================================================
`ALL` (Default)::
Enables all optional operators.
[float]
==== Synonyms
`AND`::
Enables the `+` AND operator.
`ESCAPE`::
Enables `\` as an escape character.
`FUZZY`::
Enables the `~N` operator after a word, where `N` is an integer denoting the
allowed edit distance for matching. See <<fuzziness>>.
`NEAR`::
Enables the `~N` operator after a phrase, where `N` is the maximum number of
positions allowed between matching tokens. Synonymous with `SLOP`.
`NONE`::
Disables all operators.
`NOT`::
Enables the `-` NOT operator.
`OR`::
Enables the `\|` OR operator.
`PHRASE`::
Enables the `"` quotes operator used to search for phrases.
`PRECEDENCE`::
Enables the `(` and `)` operators to control operator precedence.
`PREFIX`::
Enables the `*` prefix operator.
`SLOP`::
Enables the `~N` operator after a phrase, where `N` is the maximum number of
positions allowed between matching tokens. Synonymous with `NEAR`.
`WHITESPACE`::
Enables whitespace as split characters.
[[simple-query-string-boost]]
===== Wildcards and per-field boosts in the `fields` parameter
Fields can be specified with wildcards. For example:
[source,js]
--------------------------------------------------
GET /_search
{
"query": {
"simple_query_string" : {
"query": "Will Smith",
"fields": [ "title", "*_name" ] <1>
}
}
}
--------------------------------------------------
// CONSOLE
<1> Query the `title`, `first_name` and `last_name` fields.
Individual fields can be boosted with the caret (`^`) notation:
[source,js]
--------------------------------------------------
GET /_search
{
"query": {
"simple_query_string" : {
"query" : "this is a test",
"fields" : [ "subject^3", "message" ] <1>
}
}
}
--------------------------------------------------
// CONSOLE
<1> The `subject` field is three times as important as the `message` field.
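The caret notation is plain string formatting, so a client can generate or inspect it with a few lines of code. The Python sketch below shows one way to do so. Both helpers are hypothetical, and treating an unboosted field as having a boost of `1.0` is an assumption of the sketch.

```python
def boosted(field, boost=None):
    """Return the field name, with ^boost appended when a boost is given."""
    return field if boost is None else f"{field}^{boost}"


def split_boost(spec):
    """Split "subject^3" into ("subject", 3.0).

    Fields without a caret are returned with a boost of 1.0 (assumed default).
    """
    name, sep, boost = spec.rpartition("^")
    if not sep:
        return spec, 1.0
    return name, float(boost)


fields = [boosted("subject", 3), boosted("message")]
print(fields)
print([split_boost(spec) for spec in fields])
```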
[[simple-query-string-synonyms]]
===== Synonyms
The `simple_query_string` query supports multi-term synonym expansion with the <<analysis-synonym-graph-tokenfilter,
synonym_graph>> token filter. When this filter is used, the parser creates a phrase query for each multi-term synonym.