---
layout: default
title: Match
parent: Full-text queries
grand_parent: Query DSL
nav_order: 10
---

# Match query

Use the `match` query for full-text search on a specific document field. If you run a `match` query on a [`text`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/text/) field, the `match` query [analyzes]({{site.url}}{{site.baseurl}}/analyzers/index/) the provided search string and returns documents that match any of the string's terms. If you run a `match` query on an exact-value field, it returns documents that match the exact value. The preferred way to search exact-value fields is to use a filter because, unlike a query, a filter is cached.

The following example shows a basic `match` query for the word `wind` in the `title`:

```json
GET _search
{
  "query": {
    "match": {
      "title": "wind"
    }
  }
}
```
{% include copy-curl.html %}

To pass additional parameters, you can use the expanded syntax:

```json
GET _search
{
  "query": {
    "match": {
      "title": {
        "query": "wind",
        "analyzer": "stop"
      }
    }
  }
}
```
{% include copy-curl.html %}

## Examples

In the following examples, you'll use the index that contains the following documents:

```json
PUT testindex/_doc/1
{
  "title": "Let the wind rise"
}
```
{% include copy-curl.html %}

```json
PUT testindex/_doc/2
{
  "title": "Gone with the wind"
}
```
{% include copy-curl.html %}

```json
PUT testindex/_doc/3
{
  "title": "Rise is gone"
}
```
{% include copy-curl.html %}

## Operator

If a `match` query is run on a `text` field, the text is analyzed with the analyzer specified in the `analyzer` parameter. Then the resulting tokens are combined into a Boolean query using the operator specified in the `operator` parameter. The default operator is `OR`, so the query `wind rise` is changed into `wind OR rise`. In this example, this query returns documents 1--3 because each document has a term that matches the query.

To specify the `and` operator, use the following query:

```json
GET testindex/_search
{
  "query": {
    "match": {
      "title": {
        "query": "wind rise",
        "operator": "and"
      }
    }
  }
}
```
{% include copy-curl.html %}

The query is constructed as `wind AND rise` and returns document 1 as the matching document:

Response
{: .text-delta}

```json
{
  "took": 17,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1.2667098,
    "hits": [
      {
        "_index": "testindex",
        "_id": "1",
        "_score": 1.2667098,
        "_source": {
          "title": "Let the wind rise"
        }
      }
    ]
  }
}
```

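
As a rough illustration of how the `operator` parameter combines tokens, the following Python sketch evaluates `wind rise` against the three example titles. It is not how OpenSearch executes queries: the `tokenize` and `match` helpers are hypothetical, and lowercasing plus whitespace splitting only approximates the `standard` analyzer for these simple titles.

```python
# Hypothetical stand-in for the example index: document ID -> title.
DOCS = {
    "1": "Let the wind rise",
    "2": "Gone with the wind",
    "3": "Rise is gone",
}


def tokenize(text):
    """Approximate the standard analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def match(query, operator="or"):
    """Return the IDs of documents whose title satisfies the Boolean combination."""
    terms = tokenize(query)
    # "and" requires every term to be present; "or" requires at least one.
    combine = all if operator.lower() == "and" else any
    return [
        doc_id
        for doc_id, title in DOCS.items()
        if combine(term in tokenize(title) for term in terms)
    ]


or_hits = match("wind rise")           # wind OR rise
and_hits = match("wind rise", "and")   # wind AND rise
print(or_hits, and_hits)
```

Under these assumptions, the `or` form selects all three documents and the `and` form selects only document 1, which agrees with the results described in this section.
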
### Minimum should match

You can control the minimum number of terms that a document must match to be returned in the results by specifying the [`minimum_should_match`]({{site.url}}{{site.baseurl}}/query-dsl/minimum-should-match/) parameter:

```json
GET testindex/_search
{
  "query": {
    "match": {
      "title": {
        "query": "wind rise",
        "operator": "or",
        "minimum_should_match": 2
      }
    }
  }
}
```
{% include copy-curl.html %}

Now documents are required to match both terms, so only document 1 is returned (this is equivalent to the `and` operator):

Response
{: .text-delta}

```json
{
  "took": 23,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1.2667098,
    "hits": [
      {
        "_index": "testindex",
        "_id": "1",
        "_score": 1.2667098,
        "_source": {
          "title": "Let the wind rise"
        }
      }
    ]
  }
}
```

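
The effect of an integer `minimum_should_match` can be sketched by counting how many query terms each title contains. This is a simplified model that assumes a positive integer value and whitespace tokenization. The `count_matches` and `search` helpers are hypothetical, and percentages, negative values, and combinations are not handled.

```python
# Hypothetical stand-in for the example index: document ID -> title.
DOCS = {
    "1": "Let the wind rise",
    "2": "Gone with the wind",
    "3": "Rise is gone",
}


def count_matches(query, title):
    """Count the distinct query terms that appear in the title."""
    title_terms = set(title.lower().split())
    query_terms = set(query.lower().split())
    return sum(term in title_terms for term in query_terms)


def search(query, minimum_should_match=1):
    """Keep documents that contain at least `minimum_should_match` query terms."""
    return [
        doc_id
        for doc_id, title in DOCS.items()
        if count_matches(query, title) >= minimum_should_match
    ]


one_required = search("wind rise", 1)
two_required = search("wind rise", 2)
print(one_required, two_required)
```

With a two-term query, requiring two matches leaves only the title that contains both `wind` and `rise`, which is why the result coincides with the `and` operator.
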
## Analyzer

Because the preceding examples didn't explicitly specify an analyzer, the default `standard` analyzer is used. The `standard` analyzer does not perform stemming, so if you run the query `the wind rises` with the `and` operator, you receive no results because the token `rises` does not match the token `rise`.

To change the search analyzer, specify it in the `analyzer` field. For example, the following query uses the `english` analyzer:

```json
GET testindex/_search
{
  "query": {
    "match": {
      "title": {
        "query": "the wind rises",
        "operator": "and",
        "analyzer": "english"
      }
    }
  }
}
```
{% include copy-curl.html %}

The `english` analyzer removes the stop word `the` and performs stemming, producing the tokens `wind` and `rise`. Both tokens match document 1, which is returned in the results:

Response
{: .text-delta}

```json
{
  "took": 19,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1.2667098,
    "hits": [
      {
        "_index": "testindex",
        "_id": "1",
        "_score": 1.2667098,
        "_source": {
          "title": "Let the wind rise"
        }
      }
    ]
  }
}
```

## Empty query

In some cases, an analyzer might remove all tokens from a query. For example, the `english` analyzer removes stop words, so in a query `and OR or`, all tokens are removed. To check the analyzer behavior, you can use the [Analyze API]({{site.url}}{{site.baseurl}}/api-reference/analyze-apis/#apply-a-built-in-analyzer):

```json
GET testindex/_analyze
{
  "analyzer" : "english",
  "text" : "and OR or"
}
```
{% include copy-curl.html %}

As expected, the query produces no tokens:

```json
{
  "tokens": []
}
```

You can specify the behavior for an empty query in the `zero_terms_query` parameter. Setting `zero_terms_query` to `all` returns all documents in the index and setting it to `none` returns no documents:

```json
GET testindex/_search
{
  "query": {
    "match": {
      "title": {
        "query": "and OR or",
        "analyzer" : "english",
        "zero_terms_query": "all"
      }
    }
  }
}
```
{% include copy-curl.html %}

## Fuzziness

To account for typos, you can specify `fuzziness` for your query as either of the following:

- An integer that specifies the maximum allowed [Levenshtein distance](https://en.wikipedia.org/wiki/Levenshtein_distance) for this edit.
- `AUTO`:
    - Strings of 0–2 characters must match exactly.
    - Strings of 3–5 characters allow 1 edit.
    - Strings longer than 5 characters allow 2 edits.

Setting `fuzziness` to the default `AUTO` value works best in most cases:

```json
GET testindex/_search
{
  "query": {
    "match": {
      "title": {
        "query": "wnid",
        "fuzziness": "AUTO"
      }
    }
  }
}
```
{% include copy-curl.html %}

The token `wnid` matches `wind` and the query returns documents 1 and 2:

Response
{: .text-delta}

```json
{
  "took": 31,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 0.47501624,
    "hits": [
      {
        "_index": "testindex",
        "_id": "1",
        "_score": 0.47501624,
        "_source": {
          "title": "Let the wind rise"
        }
      },
      {
        "_index": "testindex",
        "_id": "2",
        "_score": 0.47501624,
        "_source": {
          "title": "Gone with the wind"
        }
      }
    ]
  }
}
```

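
The `AUTO` length thresholds described in this section can be expressed as a small function. The following Python sketch uses the hypothetical helper name `auto_max_edits`. It only mirrors the length rules stated here and is not OpenSearch code.

```python
def auto_max_edits(term):
    """Return the number of edits that AUTO allows for a term of this length."""
    length = len(term)
    if length <= 2:
        return 0  # 0-2 characters: must match exactly
    if length <= 5:
        return 1  # 3-5 characters: 1 edit allowed
    return 2      # more than 5 characters: 2 edits allowed


allowed = {term: auto_max_edits(term) for term in ("to", "wnid", "rising")}
print(allowed)
```

Because `wnid` has four characters, one edit is allowed, which is enough to reach `wind` when a swap of adjacent characters counts as a single edit.
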
### Prefix length

Misspellings rarely occur in the beginning of words. Thus, you can specify the minimum length the matched prefix must be to return a document in the results. For example, you can change the preceding query to include a `prefix_length`:

```json
GET testindex/_search
{
  "query": {
    "match": {
      "title": {
        "query": "wnid",
        "fuzziness": "AUTO",
        "prefix_length": 2
      }
    }
  }
}
```
{% include copy-curl.html %}

The preceding query returns no results. If you change the `prefix_length` to 1, documents 1 and 2 are returned because the first letter of the token `wnid` is not misspelled.

### Transpositions

In the preceding example, the word `wnid` contained a transposition (`in` was changed to `ni`). By default, transpositions are allowed in fuzzy matching, but you can disallow them by setting `fuzzy_transpositions` to `false`:

```json
GET testindex/_search
{
  "query": {
    "match": {
      "title": {
        "query": "wnid",
        "fuzziness": "AUTO",
        "fuzzy_transpositions": false
      }
    }
  }
}
```
{% include copy-curl.html %}

Now the query returns no results.

## Synonyms

If you use a `synonym_graph` filter and `auto_generate_synonyms_phrase_query` is set to `true` (default), OpenSearch parses the query into terms and then combines the terms to generate a [phrase query](https://lucene.apache.org/core/8_9_0/core/org/apache/lucene/search/PhraseQuery.html) for multi-term synonyms. For example, if you specify `ba,batting average` as synonyms and search for `ba`, OpenSearch searches for `ba OR "batting average"`.

To match multi-term synonyms with conjunctions, set `auto_generate_synonyms_phrase_query` to `false`:

```json
GET /testindex/_search
{
  "query": {
    "match": {
      "text": {
        "query": "good ba",
        "auto_generate_synonyms_phrase_query": false
      }
    }
  }
}
```
{% include copy-curl.html %}

The query produced is `ba OR (batting AND average)`.

## Parameters

The query accepts the name of the field (`<field>`) as a top-level parameter:

```json
GET _search
{
  "query": {
    "match": {
      "<field>": {
        "query": "text to search for",
        ...
      }
    }
  }
}
```
{% include copy-curl.html %}

The `<field>` accepts the following parameters. All parameters except `query` are optional.

Parameter | Data type | Description
:--- | :--- | :---
`query` | String | The query string to use for search. Required.
`auto_generate_synonyms_phrase_query` | Boolean | Specifies whether to create a [match phrase query]({{site.url}}{{site.baseurl}}/query-dsl/full-text/match-phrase/) automatically for multi-term synonyms. For example, if you specify `ba,batting average` as synonyms and search for `ba`, OpenSearch searches for `ba OR "batting average"` (if this option is `true`) or `ba OR (batting AND average)` (if this option is `false`). Default is `true`.
`analyzer` | String | The [analyzer]({{site.url}}{{site.baseurl}}/analyzers/index/) used to tokenize the query string text. Default is the index-time analyzer specified for the `default_field`. If no analyzer is specified for the `default_field`, the `analyzer` is the default analyzer for the index.
`boost` | Floating-point | Boosts the clause by the given multiplier. Useful for weighing clauses in compound queries. Values in the [0, 1) range decrease relevance, and values greater than 1 increase relevance. Default is `1`.
`enable_position_increments` | Boolean | When `true`, resulting queries are aware of position increments. This setting is useful when the removal of stop words leaves an unwanted "gap" between terms. Default is `true`.
`fuzziness` | String | The number of character edits (insert, delete, substitute) that it takes to change one word to another when determining whether a term matched a value. For example, the distance between `wined` and `wind` is 1. Valid values are non-negative integers or `AUTO`. The default, `AUTO`, chooses a value based on the length of each term and is a good choice for most use cases.
`fuzzy_rewrite` | String | Determines how OpenSearch rewrites the query. Valid values are `constant_score`, `scoring_boolean`, `constant_score_boolean`, `top_terms_N`, `top_terms_boost_N`, and `top_terms_blended_freqs_N`. If the `fuzziness` parameter is not `0`, the query uses a `fuzzy_rewrite` method of `top_terms_blended_freqs_${max_expansions}` by default. Default is `constant_score`.
`fuzzy_transpositions` | Boolean | Setting `fuzzy_transpositions` to `true` (default) adds swaps of adjacent characters to the insert, delete, and substitute operations of the `fuzziness` option. For example, the distance between `wind` and `wnid` is 1 if `fuzzy_transpositions` is true (swap "n" and "i") and 2 if it is false (delete "n", insert "n"). If `fuzzy_transpositions` is false, `rewind` and `wnid` have the same distance (2) from `wind`, despite the more human-centric opinion that `wnid` is an obvious typo. The default is a good choice for most use cases.
`lenient` | Boolean | Setting `lenient` to `true` ignores data type mismatches between the query and the document field. For example, a query string of `"8.2"` could match a field of type `float`. Default is `false`.
`max_expansions` | Positive integer | The maximum number of terms to which the query can expand. Fuzzy queries “expand to” a number of matching terms that are within the distance specified in `fuzziness`. Then OpenSearch tries to match those terms. Default is `50`.
`minimum_should_match` | Positive or negative integer, positive or negative percentage, combination | If the query string contains multiple search terms and you use the `or` operator, the number of terms that need to match for the document to be considered a match. For example, if `minimum_should_match` is 2, `wind often rising` does not match `The Wind Rises.` If `minimum_should_match` is `1`, it matches. For details, see [Minimum should match]({{site.url}}{{site.baseurl}}/query-dsl/minimum-should-match/).
`operator` | String | If the query string contains multiple search terms, whether all terms need to match (`AND`) or only one term needs to match (`OR`) for a document to be considered a match. Valid values are `OR` (the string `to be` is interpreted as `to OR be`) and `AND` (the string `to be` is interpreted as `to AND be`). Default is `OR`.
`prefix_length` | Non-negative integer | The number of leading characters that are not considered in fuzziness. Default is `0`.
`zero_terms_query` | String | In some cases, the analyzer removes all terms from a query string. For example, the `stop` analyzer removes all terms from the string `an but this`. In those cases, `zero_terms_query` specifies whether to match no documents (`none`) or all documents (`all`). Valid values are `none` and `all`. Default is `none`.
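
To make the distances quoted for `fuzziness` and `fuzzy_transpositions` concrete, the following Python sketch computes an edit distance with and without adjacent swaps. It is an illustration only: the `edit_distance` helper is hypothetical, and it uses the optimal string alignment variant of the algorithm, which may differ from what OpenSearch uses internally.

```python
def edit_distance(a, b, transpositions=True):
    """Levenshtein distance; optionally count an adjacent swap as one edit."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i leading characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j leading characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            if (
                transpositions
                and i > 1
                and j > 1
                and a[i - 1] == b[j - 2]
                and a[i - 2] == b[j - 1]
            ):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # adjacent swap
    return d[-1][-1]


with_swaps = edit_distance("wnid", "wind")
without_swaps = edit_distance("wnid", "wind", transpositions=False)
rewind = edit_distance("rewind", "wind", transpositions=False)
wined = edit_distance("wined", "wind")
print(with_swaps, without_swaps, rewind, wined)
```

In this sketch, `wnid` is one edit away from `wind` when swaps are counted and two edits away when they are not, the same distance as `rewind`, which matches the values given in the table.
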