Refactor term query documentation (#4787)

* Refactor term query documentation

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Refactor term-level query documentation

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* fix links

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* More examples

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Apply suggestions from code review

Co-authored-by: Chris Moore <107723039+cwillum@users.noreply.github.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

* Add section about expensive queries

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Apply suggestions from code review

Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

* More editorial feedback

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

---------

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: Chris Moore <107723039+cwillum@users.noreply.github.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
kolchfa-aws 2023-08-17 14:38:31 -04:00 committed by GitHub
parent b1767a1e0b
commit cac580ef00
19 changed files with 1342 additions and 522 deletions

@ -221,7 +221,7 @@ GET testindex/_search
## Date math
The date field type supports using date math to specify durations in queries. For example, the `gt`, `gte`, `lt`, and `lte` parameters in [range queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/term/#range) and the `from` and `to` parameters in [date range aggregations]({{site.url}}{{site.baseurl}}/query-dsl/aggregations/bucket/date-range/) accept date math expressions.
The date field type supports using date math to specify durations in queries. For example, the `gt`, `gte`, `lt`, and `lte` parameters in [range queries]({{site.url}}{{site.baseurl}}/query-dsl/term/range/) and the `from` and `to` parameters in [date range aggregations]({{site.url}}{{site.baseurl}}/query-dsl/aggregations/bucket/date-range/) accept date math expressions.
A date math expression contains a fixed date, optionally followed by one or more mathematical expressions. The fixed date may be either `now` (current date and time in milliseconds since the epoch) or a string ending with `||` that specifies a date (for example, `2022-05-18||`). The date must be in the `strict_date_optional_time||epoch_millis` format.
@ -256,7 +256,7 @@ The following example expressions illustrate using date math:
### Using date math in a range query
The following example illustrates using date math in a [range query]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/term/#range).
The following example illustrates using date math in a [range query]({{site.url}}{{site.baseurl}}/query-dsl/term/range/).
Set up an index with `release_date` mapped as `date`:

@ -45,16 +45,16 @@ Flat objects do not support:
The flat object field type supports the following queries:
- [Term]({{site.url}}{{site.baseurl}}/query-dsl/term#term)
- [Terms]({{site.url}}{{site.baseurl}}/query-dsl/term#terms)
- [Terms set]({{site.url}}{{site.baseurl}}/query-dsl/term#terms-set)
- [Prefix]({{site.url}}{{site.baseurl}}/query-dsl/term#prefix)
- [Range]({{site.url}}{{site.baseurl}}/query-dsl/term#range)
- [Term]({{site.url}}{{site.baseurl}}/query-dsl/term/term/)
- [Terms]({{site.url}}{{site.baseurl}}/query-dsl/term/terms/)
- [Terms set]({{site.url}}{{site.baseurl}}/query-dsl/term/terms-set/)
- [Prefix]({{site.url}}{{site.baseurl}}/query-dsl/term/prefix/)
- [Range]({{site.url}}{{site.baseurl}}/query-dsl/term/range/)
- [Match]({{site.url}}{{site.baseurl}}/query-dsl/full-text/#match)
- [Multi-match]({{site.url}}{{site.baseurl}}/query-dsl/full-text/#multi-match)
- [Query string]({{site.url}}{{site.baseurl}}/query-dsl/full-text/query-string/)
- [Simple query string]({{site.url}}{{site.baseurl}}/query-dsl/full-text/#simple-query-string)
- [Exists]({{site.url}}{{site.baseurl}}/query-dsl/term#exists)
- [Exists]({{site.url}}{{site.baseurl}}/query-dsl/term/exists/)
## Limitations

@ -61,64 +61,7 @@ PUT testindex/_doc/1
```
{% include copy-curl.html %}
You can use a [Term query](#term-query) or a [Range query](#range-query) to search for values within range fields.
### Term query
A term query takes a value and matches all range fields for which the value is within the range.
The following query will return document 1 because 3.5 is within the range [1.0, 4.0]:
```json
GET testindex/_search
{
"query" : {
"term" : {
"gpa" : {
"value" : 3.5
}
}
}
}
```
{% include copy-curl.html %}
### Range query
A range query on a range field returns documents within that range. Along with the field to be matched, you can further specify a date format or relational operators with the following optional parameters:
Parameter | Description
:--- | :---
format | A [format]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/date/#formats) for dates in this query. Default is the field's mapped format.
relation | Provides a relation between the query's date range and the document's date range. There are three types of relations that you can specify:<br> 1. `intersects` matches documents for which there are dates that belong to both the query's date range and document's date range. This is the default. <br> 2. `contains` matches documents for which the query's date range is a subset of the document's date range. <br> 3. `within` matches documents for which the document's date range is a subset of the query's date range.
To use a date format other than the field's mapped format in a query, specify it in the `format` field.
For a full description of range query usage, including all range query parameters, see [Range query]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/term/#range).
{: .tip }
Query for all graduation dates in 2019, providing the date range in a "MM/dd/yyyy" format:
```json
GET testindex1/_search
{
"query": {
"range": {
"graduation_date": {
"gte": "01/01/2019",
"lte": "12/31/2019",
"format": "MM/dd/yyyy",
"relation" : "within"
}
}
}
}
```
{% include copy-curl.html %}
The above query will return document 1 for the `within` and `intersects` relations but will not return it for the `contains` relation.
### IP address ranges
## IP address ranges
You can specify IP address ranges in two formats: as a range and in [CIDR notation](https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing#CIDR_notation).
@ -155,6 +98,55 @@ PUT testindex/_doc/2
```
{% include copy-curl.html %}
## Querying range fields
You can use a [Term query](#term-query) or a [Range query](#range-query) to search for values within range fields.
### Term query
A term query takes a value and matches all range fields for which the value is within the range.
The following query will return document 1 because 3.5 is within the range [1.0, 4.0]:
```json
GET testindex/_search
{
  "query" : {
    "term" : {
      "gpa" : {
        "value" : 3.5
      }
    }
  }
}
```
{% include copy-curl.html %}
### Range query
A range query on a range field returns documents within that range.
Query for all graduation dates in 2019, providing the date range in a "MM/dd/yyyy" format:
```json
GET testindex1/_search
{
  "query": {
    "range": {
      "graduation_date": {
        "gte": "01/01/2019",
        "lte": "12/31/2019",
        "format": "MM/dd/yyyy",
        "relation" : "within"
      }
    }
  }
}
```
{% include copy-curl.html %}
The preceding query will return document 1 for the `within` and `intersects` relations but will not return it for the `contains` relation. For more information about relation types, see [range query parameters]({{site.url}}{{site.baseurl}}/query-dsl/term/range#parameters).
## Parameters
The following table lists the parameters accepted by range field types. All parameters are optional.

@ -446,7 +446,7 @@ Option | Valid values | Description
You can also use synonyms with the `terms` query type to search for multiple terms. Use the `auto_generate_synonyms_phrase_query` Boolean field. By default it is set to `true`. It automatically generates phrase queries for multiple term synonyms. For example, if you have the synonym `"ba, batting average"` and search for "ba," OpenSearch searches for `ba OR "batting average"` when the option is `true` or `ba OR (batting AND average)` when the option is `false`.
To learn more about the multiple terms query type, see [Terms]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/term/#terms). For more reference information about phrase queries, see the [Lucene documentation](https://lucene.apache.org/core/8_9_0/core/org/apache/lucene/search/PhraseQuery.html).
To learn more about the multiple terms query type, see [Terms]({{site.url}}{{site.baseurl}}/query-dsl/term/terms/). For more reference information about phrase queries, see the [Lucene documentation](https://lucene.apache.org/core/8_9_0/core/org/apache/lucene/search/PhraseQuery.html).
### Other advanced options

@ -4,6 +4,7 @@ title: Query DSL
nav_order: 2
has_children: true
nav_exclude: true
has_toc: false
redirect_from:
- /opensearch/query-dsl/
- /opensearch/query-dsl/index/
@ -38,7 +39,7 @@ Broadly, you can classify queries into two categories---*leaf queries* and *comp
- **Full-text queries**: Use full-text queries to search text documents. For an analyzed text field search, full-text queries split the query string into terms using the same analyzer that was used when the field was indexed. For an exact value search, full-text queries look for the specified value without applying text analysis. To learn more, see [Full-text queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index/).
- **Term-level queries**: Use term-level queries to search documents for an exact term, such as an ID or value range. Term-level queries do not analyze search terms or sort results by relevance score. To learn more, see [Term-level queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/term/).
- **Term-level queries**: Use term-level queries to search documents for an exact term, such as an ID or value range. Term-level queries do not analyze search terms or sort results by relevance score. To learn more, see [Term-level queries]({{site.url}}{{site.baseurl}}/query-dsl/term/index/).
- **Geographic and xy queries**: Use geographic queries to search documents that include geographic data. Use xy queries to search documents that include points and shapes in a two-dimensional coordinate system. To learn more, see [Geographic and xy queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/geo-and-xy/index).
@ -82,4 +83,30 @@ The following examples illustrate values containing special characters that will
To avoid this circumstance when using either query DSL or the REST API, you can use a custom analyzer or map the field as `keyword`, which performs an exact-match search. See [Keyword field type]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/keyword/) for the latter option.
For a list of characters that should be avoided when using `text` field types, see [Word Boundaries](https://unicode.org/reports/tr29/#Word_Boundaries).
For a list of characters that should be avoided when using `text` field types, see [Word Boundaries](https://unicode.org/reports/tr29/#Word_Boundaries).
## Expensive queries
Expensive queries can consume a lot of memory and lead to a decline in cluster performance. The following query types can be resource intensive:
- [`fuzzy`]({{site.url}}{{site.baseurl}}/query-dsl/term/fuzzy/) queries
- [`prefix`]({{site.url}}{{site.baseurl}}/query-dsl/term/prefix/) queries
- [`range`]({{site.url}}{{site.baseurl}}/query-dsl/term/range/) queries on [`text`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/text/) and [`keyword`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/keyword/) fields
- [`regexp`]({{site.url}}{{site.baseurl}}/query-dsl/term/regexp/) queries
- [`wildcard`]({{site.url}}{{site.baseurl}}/query-dsl/term/wildcard/) queries
- [`query_string`]({{site.url}}{{site.baseurl}}/query-dsl/full-text/query-string/) queries that are internally transformed into prefix queries
To disallow expensive queries, you can disable the `search.allow_expensive_queries` cluster setting as follows:
```json
PUT _cluster/settings
{
  "persistent": {
    "search.allow_expensive_queries": false
  }
}
```
{% include copy-curl.html %}
To track expensive queries, enable [slow logs]({{site.url}}{{site.baseurl}}/monitoring-your-cluster/logs/#slow-logs).
{: .tip}
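To allow expensive queries again, one option is to reset the setting to its default. The following sketch assumes that the setting was previously changed in the `persistent` section; assigning `null` to a cluster setting removes the override so that the default value applies:

```json
PUT _cluster/settings
{
  "persistent": {
    "search.allow_expensive_queries": null
  }
}
```
{% include copy-curl.html %}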

@ -1,450 +0,0 @@
---
layout: default
title: Term-level queries
nav_order: 20
redirect_from:
- /opensearch/query-dsl/term/
- /query-dsl/query-dsl/term/
---
# Term-level queries
Term-level queries search an index for documents that contain an exact search term. Documents returned by a term-level query are not sorted by their relevance scores.
When working with text data, use term-level queries for fields mapped as `keyword` only.
Term-level queries are not suited for searching analyzed text fields. To return analyzed fields, use a [full-text query]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text).
## Term-level query types
The following table lists all term-level query types.
| Query type | Description
:--- | :--- | :---
[`term`](#term) | Searches for documents with an exact term in a specific field.
[`terms`](#terms) | Searches for documents with one or more terms in a specific field.
[`terms_set`](#terms-set) | Searches for documents that match a minimum number of terms in a specific field.
[`ids`](#ids) | Searches for documents by document ID.
[`range`](#range) | Searches for documents with field values in a specific range.
[`prefix`](#prefix) | Searches for documents with terms that begin with a specific prefix.
[`exists`](#exists) | Searches for documents with any indexed value in a specific field.
[`fuzzy`](#fuzzy) | Searches for documents with terms that are similar to the search term within the maximum allowed [Levenshtein distance](https://en.wikipedia.org/wiki/Levenshtein_distance). The Levenshtein distance measures the number of one-character changes needed to change one term to another term.
[`wildcard`](#wildcard) | Searches for documents with terms that match a wildcard pattern.
[`regexp`](#regexp) | Searches for documents with terms that match a regular expression.
## Term
Use the `term` query to search for an exact term in a field.
```json
GET shakespeare/_search
{
"query": {
"term": {
"line_id": {
"value": "61809"
}
}
}
}
```
{% include copy-curl.html %}
## Terms
Use the `terms` query to search for multiple terms in the same field.
```json
GET shakespeare/_search
{
"query": {
"terms": {
"line_id": [
"61809",
"61810"
]
}
}
}
```
{% include copy-curl.html %}
You get back documents that match any of the terms.
## Terms set
With a terms set query, you can search for documents that match a minimum number of exact terms in a specified field. The `terms_set` query is similar to the `terms` query, but you can specify the minimum number of matching terms that are required to return a document. You can specify this number either in a field in the index or with a script.
As an example, consider an index that contains students with classes they have taken. When setting up the mapping for this index, you need to provide a [numeric]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/numeric) field that specifies the minimum number of matching terms that are required to return a document:
```json
PUT students
{
"mappings": {
"properties": {
"name": {
"type": "keyword"
},
"classes": {
"type": "keyword"
},
"min_required": {
"type": "integer"
}
}
}
}
```
{% include copy-curl.html %}
Next, index two documents that correspond to students:
```json
PUT students/_doc/1
{
"name": "Mary Major",
"classes": [ "CS101", "CS102", "MATH101" ],
"min_required": 2
}
```
{% include copy-curl.html %}
```json
PUT students/_doc/2
{
"name": "John Doe",
"classes": [ "CS101", "MATH101", "ENG101" ],
"min_required": 2
}
```
{% include copy-curl.html %}
Now search for students who have taken at least two of the following classes: `CS101`, `CS102`, `MATH101`:
```json
GET students/_search
{
"query": {
"terms_set": {
"classes": {
"terms": [ "CS101", "CS102", "MATH101" ],
"minimum_should_match_field": "min_required"
}
}
}
}
```
{% include copy-curl.html %}
The response contains both students:
```json
{
"took" : 44,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 1.4544616,
"hits" : [
{
"_index" : "students",
"_id" : "1",
"_score" : 1.4544616,
"_source" : {
"name" : "Mary Major",
"classes" : [
"CS101",
"CS102",
"MATH101"
],
"min_required" : 2
}
},
{
"_index" : "students",
"_id" : "2",
"_score" : 0.5013843,
"_source" : {
"name" : "John Doe",
"classes" : [
"CS101",
"MATH101",
"ENG101"
],
"min_required" : 2
}
}
]
}
}
```
To specify the minimum number of terms a document should match with a script, provide the script in the `minimum_should_match_script` field:
```json
GET students/_search
{
"query": {
"terms_set": {
"classes": {
"terms": [ "CS101", "CS102", "MATH101" ],
"minimum_should_match_script": {
"source": "Math.min(params.num_terms, doc['min_required'].value)"
}
}
}
}
}
```
{% include copy-curl.html %}
## IDs
Use the `ids` query to search for one or more document ID values.
```json
GET shakespeare/_search
{
"query": {
"ids": {
"values": [
34229,
91296
]
}
}
}
```
{% include copy-curl.html %}
## Range
You can search for a range of values in a field with the `range` query.
To search for documents where the `line_id` value is >= 10 and <= 20:
```json
GET shakespeare/_search
{
"query": {
"range": {
"line_id": {
"gte": 10,
"lte": 20
}
}
}
}
```
{% include copy-curl.html %}
Parameter | Behavior
:--- | :---
`gte` | Greater than or equal to.
`gt` | Greater than.
`lte` | Less than or equal to.
`lt` | Less than.
In addition to the range query parameters, you can provide date formats or relation operators such as "contains" or "within." To see the supported field types for range queries, see [Range query optional parameters]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/range/#range-query). To see all date formats, see [Formats]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/date/#formats).
{: .tip }
Assume that you have a `products` index and you want to find all the products that were added in the year 2019:
```json
GET products/_search
{
"query": {
"range": {
"created": {
"gte": "2019/01/01",
"lte": "2019/12/31"
}
}
}
}
```
{% include copy-curl.html %}
Specify relative dates by using [date math]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/date/#date-math).
To subtract 1 year and 1 day from the specified date, use the following query:
```json
GET products/_search
{
"query": {
"range": {
"created": {
"gte": "2019/01/01||-1y-1d"
}
}
}
}
```
{% include copy-curl.html %}
The first date that we specify is the anchor date or the starting point for the date math. Add two trailing pipe symbols. You could then add one day (`+1d`) or subtract two weeks (`-2w`). This math expression is relative to the anchor date that you specify.
You could also round off dates by adding a forward slash to the date or time unit.
To find products added in the last year and rounded off by month:
```json
GET products/_search
{
"query": {
"range": {
"created": {
"gte": "now-1y/M"
}
}
}
}
```
{% include copy-curl.html %}
The keyword `now` refers to the current date and time.
## Prefix
Use the `prefix` query to search for terms that begin with a specific prefix.
```json
GET shakespeare/_search
{
"query": {
"prefix": {
"speaker": "KING"
}
}
}
```
{% include copy-curl.html %}
## Exists
Use the `exists` query to search for documents that contain a specific field.
```json
GET shakespeare/_search
{
"query": {
"exists": {
"field": "speaker"
}
}
}
```
{% include copy-curl.html %}
## Fuzzy
A fuzzy query searches for documents with terms that are similar to the search term within the maximum allowed [Levenshtein distance](https://en.wikipedia.org/wiki/Levenshtein_distance). The Levenshtein distance measures the number of one-character changes needed to change one term to another term. These changes include:
- Replacements: **c**at to **b**at
- Insertions: cat to cat**s**
- Deletions: **c**at to at
- Transpositions: **ca**t to **ac**t
A fuzzy query creates a list of all possible expansions of the search term that fall within the Levenshtein distance. You can specify the maximum number of such expansions in the `max_expansions` field. Then is searches for documents that match any of the expansions.
The following example query searches for the speaker `HALET` (misspelled `HAMLET`). The maximum edit distance is not specified, so the default `AUTO` edit distance is used:
```json
GET shakespeare/_search
{
"query": {
"fuzzy": {
"speaker": {
"value": "HALET"
}
}
}
}
```
{% include copy-curl.html %}
The response contains all documents where `HAMLET` is the speaker.
The following example query searches for the word `cat` with advanced parameters:
```json
GET shakespeare/_search
{
"query": {
"fuzzy": {
"speaker": {
"value": "HALET",
"fuzziness": "2",
"max_expansions": 40,
"prefix_length": 0,
"transpositions": true,
"rewrite": "constant_score"
}
}
}
}
```
{% include copy-curl.html %}
## Wildcard
Use wildcard queries to search for terms that match a wildcard pattern.
Feature | Behavior
:--- | :---
`*` | Specifies all valid values.
`?` | Specifies a single valid value.
To search for terms that start with `H` and end with `Y`:
```json
GET shakespeare/_search
{
"query": {
"wildcard": {
"speaker": {
"value": "H*Y"
}
}
}
}
```
{% include copy-curl.html %}
If we change `*` to `?`, we get no matches, because `?` refers to a single character.
Wildcard queries tend to be slow because they need to iterate over a lot of terms. Avoid placing wildcard characters at the beginning of a query because it could be a very expensive operation in terms of both resources and time.
## Regexp
Use the `regexp` query to search for terms that match a regular expression.
This regular expression matches any single uppercase or lowercase letter:
```json
GET shakespeare/_search
{
"query": {
"regexp": {
"play_name": "[a-zA-Z]amlet"
}
}
}
```
{% include copy-curl.html %}
A few important notes:
- Regular expressions are applied to the terms in the field (i.e. tokens), not the entire field.
- Regular expressions use the Lucene syntax, which differs from more standardized implementations. Test thoroughly to ensure that you receive the results you expect. To learn more, see [the Lucene documentation](https://lucene.apache.org/core/8_9_0/core/index.html).
- `regexp` queries can be expensive operations and require the `search.allow_expensive_queries` setting to be set to `true`. Before making frequent `regexp` queries, test their impact on cluster performance and examine alternative queries for achieving similar results.

_query-dsl/term/exists.md

@ -0,0 +1,149 @@
---
layout: default
title: Exists
parent: Term-level queries
grand_parent: Query DSL
nav_order: 10
---
# Exists query
Use the `exists` query to search for documents that contain a specific field.
An indexed value will not exist for a document field in any of the following cases:
- The field has `"index" : false` specified in the mapping.
- The field in the source JSON is `null` or `[]`.
- The length of the field value exceeds the `ignore_above` setting in the mapping.
- The field value is malformed and `ignore_malformed` is defined in the mapping.
An indexed value will exist for a document field in any of the following cases:
- The value is an array that contains one or more null elements and one or more non-null elements (for example, `["one", null]`).
- The value is an empty string (`""`) or a placeholder string such as `"-"`.
- The value is a custom `null_value`, as defined in the field mapping.
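The `null_value` case can be sketched as follows. The index name `testindex_nulls` and the `status` field are hypothetical and are used only for illustration. The mapping replaces an explicit `null` with the indexed term `NULL`:

```json
PUT testindex_nulls
{
  "mappings": {
    "properties": {
      "status": {
        "type": "keyword",
        "null_value": "NULL"
      }
    }
  }
}
```
{% include copy-curl.html %}

Next, index a document in which `status` is explicitly `null`:

```json
PUT testindex_nulls/_doc/1
{
  "status": null
}
```
{% include copy-curl.html %}

Under these assumptions, an `exists` query on `status` is expected to match the document because the `null` is indexed as the `null_value` term:

```json
GET testindex_nulls/_search
{
  "query": {
    "exists": {
      "field": "status"
    }
  }
}
```
{% include copy-curl.html %}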
## Example
For example, consider an index that contains the following two documents:
```json
PUT testindex/_doc/1
{
  "title": "The wind rises"
}
```
{% include copy-curl.html %}
```json
PUT testindex/_doc/2
{
  "title": "Gone with the wind",
  "description": "A 1939 American epic historical film"
}
```
{% include copy-curl.html %}
The following query searches for documents that contain the `description` field:
```json
GET testindex/_search
{
  "query": {
    "exists": {
      "field": "description"
    }
  }
}
```
{% include copy-curl.html %}
The response contains the matching document:
```json
{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "testindex",
        "_id": "2",
        "_score": 1,
        "_source": {
          "title": "Gone with the wind",
          "description": "A 1939 American epic historical film"
        }
      }
    ]
  }
}
```
## Finding documents with missing indexed values
To find documents with missing indexed values, you can use the `must_not` [Boolean query]({{site.url}}{{site.baseurl}}/query-dsl/compound/bool/) with the inner `exists` query. For example, the following request searches for documents in which the `description` field is missing:
```json
GET testindex/_search
{
  "query": {
    "bool": {
      "must_not": {
        "exists": {
          "field": "description"
        }
      }
    }
  }
}
```
{% include copy-curl.html %}
The response contains the matching document:
```json
{
  "took": 19,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 0,
    "hits": [
      {
        "_index": "testindex",
        "_id": "1",
        "_score": 0,
        "_source": {
          "title": "The wind rises"
        }
      }
    ]
  }
}
```
## Parameters
The query accepts the name of the field to check for an indexed value in the `field` parameter.
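As a sketch of the request shape, with `<field>` standing in for the name of the field to check (a placeholder, not a literal value):

```json
GET _search
{
  "query": {
    "exists": {
      "field": "<field>"
    }
  }
}
```
{% include copy-curl.html %}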

_query-dsl/term/fuzzy.md

@ -0,0 +1,93 @@
---
layout: default
title: Fuzzy
parent: Term-level queries
grand_parent: Query DSL
nav_order: 20
---
# Fuzzy query
A fuzzy query searches for documents containing terms that are similar to the search term within the maximum allowed [Levenshtein distance](https://en.wikipedia.org/wiki/Levenshtein_distance). The Levenshtein distance measures the number of one-character changes needed to change one term to another term. These changes include:
- Replacements: **c**at to **b**at
- Insertions: cat to cat**s**
- Deletions: **c**at to at
- Transpositions: **ca**t to **ac**t
A fuzzy query creates a list of all possible expansions of the search term that fall within the Levenshtein distance. You can specify the maximum number of such expansions in the `max_expansions` field. Then it searches for documents that match any of the expansions.
The following example query searches for the speaker `HALET` (misspelled `HAMLET`). The maximum edit distance is not specified, so the default `AUTO` edit distance is used:
```json
GET shakespeare/_search
{
  "query": {
    "fuzzy": {
      "speaker": {
        "value": "HALET"
      }
    }
  }
}
```
{% include copy-curl.html %}
The response contains all documents in which `HAMLET` is the speaker.
The following example query searches for the word `HALET` with advanced parameters:
```json
GET shakespeare/_search
{
  "query": {
    "fuzzy": {
      "speaker": {
        "value": "HALET",
        "fuzziness": "2",
        "max_expansions": 40,
        "prefix_length": 0,
        "transpositions": true,
        "rewrite": "constant_score"
      }
    }
  }
}
```
{% include copy-curl.html %}
## Parameters
The query accepts the name of the field (`<field>`) as a top-level parameter:
```json
GET _search
{
  "query": {
    "fuzzy": {
      "<field>": {
        "value": "sample",
        ...
      }
    }
  }
}
```
{% include copy-curl.html %}
The `<field>` accepts the following parameters. All parameters except `value` are optional.
Parameter | Data type | Description
:--- | :--- | :---
`value` | String | The term to search for in the field specified in `<field>`.
`fuzziness` | `AUTO`, `0`, or a positive integer | The number of character edits (insert, delete, substitute) needed to change one word to another when determining whether a term matched a value. For example, the distance between `wined` and `wind` is 1. The default, `AUTO`, chooses a value based on the length of each term and is a good choice for most use cases.
`max_expansions` | Positive integer | The maximum number of terms to which the query can expand. Fuzzy queries “expand to” a number of matching terms that are within the distance specified in `fuzziness`. Then OpenSearch tries to match those terms. Default is `50`.
`prefix_length` | Non-negative integer | The number of leading characters that are not considered in fuzziness. Default is `0`.
`rewrite` | String | Determines how OpenSearch rewrites and scores multi-term queries. Valid values are `constant_score`, `scoring_boolean`, `constant_score_boolean`, `top_terms_N`, `top_terms_boost_N`, and `top_terms_blended_freqs_N`. Default is `constant_score`.
`transpositions` | Boolean | Specifies whether to allow transpositions of two adjacent characters (`ab` to `ba`) as edits. Default is `true`.
Specifying a large value in `max_expansions` can lead to poor performance, especially if `prefix_length` is set to `0`, because of the large number of variations of the word that OpenSearch tries to match.
{: .warning}
If [`search.allow_expensive_queries`]({{site.url}}{{site.baseurl}}/query-dsl/index/#expensive-queries) is set to `false`, fuzzy queries are not run.
{: .important}
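One way to act on the preceding performance warning is to require an exact prefix and cap the number of expansions. The following sketch reuses the `shakespeare` index and the `HALET` example from this page; the specific values `2` and `10` are illustrative assumptions, not recommended settings. Because `prefix_length` is `2`, only terms that start with `HA` are considered as candidates:

```json
GET shakespeare/_search
{
  "query": {
    "fuzzy": {
      "speaker": {
        "value": "HALET",
        "fuzziness": "AUTO",
        "prefix_length": 2,
        "max_expansions": 10
      }
    }
  }
}
```
{% include copy-curl.html %}

A nonzero `prefix_length` narrows the set of candidate terms, but it also means that a misspelling within the first characters of the search term is no longer tolerated.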

_query-dsl/term/ids.md

@ -0,0 +1,34 @@
---
layout: default
title: IDs
parent: Term-level queries
grand_parent: Query DSL
nav_order: 30
---
# IDs query
Use the `ids` query to search for documents with one or more specific document ID values in the `_id` field. For example, the following query requests documents with the IDs `34229` and `91296`:
```json
GET shakespeare/_search
{
  "query": {
    "ids": {
      "values": [
        34229,
        91296
      ]
    }
  }
}
```
{% include copy-curl.html %}
## Parameters
The query accepts the following parameter.
Parameter | Data type | Description
:--- | :--- | :---
`values` | Array of strings | The document IDs to search for. Required.

_query-dsl/term/index.md

@ -0,0 +1,31 @@
---
layout: default
title: Term-level queries
has_children: true
nav_order: 20
---
# Term-level queries
Term-level queries search an index for documents that contain an exact search term. Documents returned by a term-level query are not sorted by their relevance scores.
When working with text data, use term-level queries for fields mapped as `keyword` only.
Term-level queries are not suited for searching analyzed text fields. To search analyzed text fields, use a [full-text query]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/).
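As an illustration of this guidance, the following sketch runs a `term` query against the `speaker` field of the `shakespeare` index used in the term-level query examples. It assumes that `speaker` is mapped as `keyword`, so the indexed term is the exact string `HAMLET`. If the field were instead mapped as analyzed `text`, the indexed terms would typically be lowercased, and this query might not match:

```json
GET shakespeare/_search
{
  "query": {
    "term": {
      "speaker": {
        "value": "HAMLET"
      }
    }
  }
}
```
{% include copy-curl.html %}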
## Term-level query types
The following table lists all term-level query types.
Query type | Description
:--- | :---
[`term`]({{site.url}}{{site.baseurl}}/query-dsl/term/term/) | Searches for documents containing an exact term in a specific field.
[`terms`]({{site.url}}{{site.baseurl}}/query-dsl/term/terms/) | Searches for documents containing one or more terms in a specific field.
[`terms_set`]({{site.url}}{{site.baseurl}}/query-dsl/term/terms-set/) | Searches for documents that match a minimum number of terms in a specific field.
[`ids`]({{site.url}}{{site.baseurl}}/query-dsl/term/ids/) | Searches for documents by document ID.
[`range`]({{site.url}}{{site.baseurl}}/query-dsl/term/range/) | Searches for documents with field values in a specific range.
[`prefix`]({{site.url}}{{site.baseurl}}/query-dsl/term/prefix/) | Searches for documents containing terms that begin with a specific prefix.
[`exists`]({{site.url}}{{site.baseurl}}/query-dsl/term/exists/) | Searches for documents with any indexed value in a specific field.
[`fuzzy`]({{site.url}}{{site.baseurl}}/query-dsl/term/fuzzy/) | Searches for documents containing terms that are similar to the search term within the maximum allowed [Levenshtein distance](https://en.wikipedia.org/wiki/Levenshtein_distance). The Levenshtein distance measures the number of one-character changes needed to change one term to another term.
[`wildcard`]({{site.url}}{{site.baseurl}}/query-dsl/term/wildcard/) | Searches for documents containing terms that match a wildcard pattern.
[`regexp`]({{site.url}}{{site.baseurl}}/query-dsl/term/regexp/) | Searches for documents containing terms that match a regular expression.

_query-dsl/term/prefix.md Normal file

@ -0,0 +1,70 @@
---
layout: default
title: Prefix
parent: Term-level queries
grand_parent: Query DSL
nav_order: 40
---
# Prefix query
Use the `prefix` query to search for terms that begin with a specific prefix. For example, the following query searches for documents in which the `speaker` field contains a term that starts with `KING H`:
```json
GET shakespeare/_search
{
  "query": {
    "prefix": {
      "speaker": "KING H"
    }
  }
}
```
{% include copy-curl.html %}
To provide parameters, you can use a query equivalent to the preceding one with the following extended syntax:
```json
GET shakespeare/_search
{
  "query": {
    "prefix": {
      "speaker": {
        "value": "KING H"
      }
    }
  }
}
```
{% include copy-curl.html %}
## Parameters
The query accepts the name of the field (`<field>`) as a top-level parameter:
```json
GET _search
{
  "query": {
    "prefix": {
      "<field>": {
        "value": "sample",
        ...
      }
    }
  }
}
```
{% include copy-curl.html %}
The `<field>` accepts the following parameters. All parameters except `value` are optional.
Parameter | Data type | Description
:--- | :--- | :---
`value` | String | The term to search for in the field specified in `<field>`.
`case_insensitive` | Boolean | If `true`, allows case-insensitive matching of the value with the indexed field values. Default is `false` (case sensitivity is determined by the field's mapping).
`rewrite` | String | Determines how OpenSearch rewrites and scores multi-term queries. Valid values are `constant_score`, `scoring_boolean`, `constant_score_boolean`, `top_terms_N`, `top_terms_boost_N`, and `top_terms_blended_freqs_N`. Default is `constant_score`.
If [`search.allow_expensive_queries`]({{site.url}}{{site.baseurl}}/query-dsl/index/#expensive-queries) is set to `false`, prefix queries are not run. If `index_prefixes` is enabled, the `search.allow_expensive_queries` setting is ignored and an optimized query is built and run.
{: .important}
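The following sketch combines the optional parameters from the preceding table in one request. It assumes the `shakespeare` index from the earlier examples; the lowercase prefix and the chosen `rewrite` value are illustrative only:

```json
GET shakespeare/_search
{
  "query": {
    "prefix": {
      "speaker": {
        "value": "king h",
        "case_insensitive": true,
        "rewrite": "constant_score"
      }
    }
  }
}
```
{% include copy-curl.html %}

Because `case_insensitive` is `true`, the lowercase prefix `king h` can match terms that begin with `KING H`.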

_query-dsl/term/range.md Normal file

@ -0,0 +1,220 @@
---
layout: default
title: Range
parent: Term-level queries
grand_parent: Query DSL
nav_order: 50
---
# Range query
You can search for a range of values in a field with the `range` query.
To search for documents in which the `line_id` value is greater than or equal to 10 and less than or equal to 20, use the following request:
```json
GET shakespeare/_search
{
  "query": {
    "range": {
      "line_id": {
        "gte": 10,
        "lte": 20
      }
    }
  }
}
```
{% include copy-curl.html %}
## Operators
The field parameter in the range query accepts the following optional operator parameters:
- `gte`: Greater than or equal to
- `gt`: Greater than
- `lte`: Less than or equal to
- `lt`: Less than
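For example, to exclude the endpoints of a range, you can use `gt` and `lt` in place of `gte` and `lte`. The following sketch assumes the `shakespeare` sample index and matches documents in which `line_id` is strictly between 10 and 20:

```json
GET shakespeare/_search
{
  "query": {
    "range": {
      "line_id": {
        "gt": 10,
        "lt": 20
      }
    }
  }
}
```
{% include copy-curl.html %}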
## Date fields
You can use range queries on fields containing dates. For example, assume that you have a `products` index and you want to find all the products that were added in the year 2019:
```json
GET products/_search
{
  "query": {
    "range": {
      "created": {
        "gte": "2019/01/01",
        "lte": "2019/12/31"
      }
    }
  }
}
```
{% include copy-curl.html %}
### Format
To use a date format other than the field's mapped format in a query, specify it in the `format` field.
For example, if the `products` index maps the `created` field as `strict_date_optional_time`, you can specify a different format for a query date as follows:
```json
GET /products/_search
{
  "query": {
    "range": {
      "created": {
        "gte": "01/01/2022",
        "lte": "31/12/2022",
        "format": "dd/MM/yyyy"
      }
    }
  }
}
```
{% include copy-curl.html %}
### Missing date components
OpenSearch populates missing date components with the following values:
- `MONTH_OF_YEAR`: `01`
- `DAY_OF_MONTH`: `01`
- `HOUR_OF_DAY`: `23`
- `MINUTE_OF_HOUR`: `59`
- `SECOND_OF_MINUTE`: `59`
- `NANO_OF_SECOND`: `999_999_999`
If the year is missing, it is not populated.
For example, consider the following request that specifies only the year in the start date:
```json
GET /products/_search
{
  "query": {
    "range": {
      "created": {
        "gte": "2022",
        "lte": "2022-12-31"
      }
    }
  }
}
```
{% include copy-curl.html %}
The start date is populated with the default values, so the `gte` parameter used is `2022-01-01T23:59:59.999999999Z`.
### Relative dates
You can specify relative dates by using [date math]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/date/#date-math).
To subtract 1 year and 1 day from the specified date, use the following query:
```json
GET products/_search
{
  "query": {
    "range": {
      "created": {
        "gte": "2019/01/01||-1y-1d"
      }
    }
  }
}
```
{% include copy-curl.html %}
In the preceding example, `2019/01/01` is the anchor date (the starting point) for the date math. After the two pipe characters (`||`), you are specifying a mathematical expression relative to the anchor date. In this example, you are subtracting 1 year (`-1y`) and 1 day (`-1d`).
You can also round dates by appending a forward slash (`/`) and a date or time unit to the expression.
To find products added within the last year, rounded off by month, use the following query:
```json
GET products/_search
{
  "query": {
    "range": {
      "created": {
        "gte": "now-1y/M"
      }
    }
  }
}
```
{% include copy-curl.html %}
The keyword `now` refers to the current date and time.
{: .tip}
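Date math can be used for both bounds of a range. As an illustrative sketch (the `products` index and `created` field are the same assumed names used in the preceding examples), the following request looks for products added during the previous 7 full days, excluding the current day:

```json
GET products/_search
{
  "query": {
    "range": {
      "created": {
        "gte": "now-7d/d",
        "lt": "now/d"
      }
    }
  }
}
```
{% include copy-curl.html %}

In this sketch, `gte` rounds down to the start of the day 7 days ago, and `lt` excludes everything from the start of the current day onward.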
### Rounding relative dates
The following table specifies how relative dates are rounded.
Parameter | Rounding rule | Example: The value `2022-05-18||/M` is rounded to
:--- | :--- | :---
`gt` | Rounds up to the first millisecond that is not in the rounding interval. | `2022-06-01T00:00:00.000`
`gte` | Rounds down to the first millisecond. | `2022-05-01T00:00:00.000`
`lt` | Rounds down to the last millisecond before the rounded date. | `2022-04-30T23:59:59.999`
`lte` | Rounds up to the last millisecond in the rounding interval. | `2022-05-31T23:59:59.999`
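
To illustrate the rounding rules in the preceding table, the following sketch uses the same rounded value for both bounds. Assuming the `products` index from the earlier examples, `gte` rounds down to `2022-05-01T00:00:00.000` and `lte` rounds up to `2022-05-31T23:59:59.999`, so the request should match products created at any time in May 2022:

```json
GET products/_search
{
  "query": {
    "range": {
      "created": {
        "gte": "2022-05-18||/M",
        "lte": "2022-05-18||/M"
      }
    }
  }
}
```
{% include copy-curl.html %}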
### Time zone
To convert `date` values to [Coordinated Universal Time (UTC)](https://en.wikipedia.org/wiki/Coordinated_Universal_Time) using a [UTC offset](https://en.wikipedia.org/wiki/UTC_offset), use the `time_zone` parameter:
```json
GET /products/_search
{
  "query": {
    "range": {
      "created": {
        "time_zone": "-04:00",
        "gte": "2022-04-17T06:00:00"
      }
    }
  }
}
```
{% include copy-curl.html %}
The preceding query specifies the `-04:00` offset, so the `gte` parameter is converted to `2022-04-17T10:00:00 UTC`.
The `time_zone` parameter does not affect the `now` value.
{: .note}
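The `time_zone` parameter also accepts IANA time zone IDs. The following sketch is a variant of the preceding request that uses `America/New_York` in place of a fixed offset; the index and field names are the same assumed ones:

```json
GET /products/_search
{
  "query": {
    "range": {
      "created": {
        "time_zone": "America/New_York",
        "gte": "2022-04-17T06:00:00"
      }
    }
  }
}
```
{% include copy-curl.html %}

On that date, New York observes daylight saving time (UTC-04:00), so the `gte` value is again converted to `2022-04-17T10:00:00 UTC`. Unlike a fixed offset, a time zone ID accounts for daylight saving time changes.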
## Parameters
The query accepts the name of the field (`<field>`) as a top-level parameter:
```json
GET _search
{
  "query": {
    "range": {
      "<field>": {
        "gt": 10,
        ...
      }
    }
  }
}
```
{% include copy-curl.html %}
In addition to [operators](#operators), you can specify the following optional parameters for the `<field>`.
Parameter | Data type | Description
:--- | :--- | :---
`format` | String | A [format]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/date/#formats) for dates in this query. Default is the field's mapped format.
`relation` | String | Indicates how the range query matches values for [`range`]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/range/) fields. Valid values are:<br> - `INTERSECTS` (default): Matches documents whose `range` field value intersects the range provided in the query. <br> - `CONTAINS`: Matches documents whose `range` field value contains the entire range provided in the query. <br> - `WITHIN`: Matches documents whose `range` field value is entirely within the range provided in the query.
`boost` | Floating-point | Boosts the query by the given multiplier. Useful for searches that contain more than one query. Values in the [0, 1) range decrease relevance, and values greater than 1 increase relevance. Default is `1`.
`time_zone` | String | The time zone used to convert [`date`]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/date/) values to UTC in the query. Valid values are ISO 8601 [UTC offsets](https://en.wikipedia.org/wiki/List_of_UTC_offsets) and [IANA time zone IDs](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones). For more information, see [Time zone](#time-zone).
If [`search.allow_expensive_queries`]({{site.url}}{{site.baseurl}}/query-dsl/index/#expensive-queries) is set to `false`, range queries on [`text`]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/text/) and [`keyword`]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/keyword/) fields are not run.
{: .important}
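The `relation` parameter applies to fields mapped as one of the `range` field types. As a hedged sketch, assume a hypothetical `events` index with a `duration` field mapped as `date_range` (neither name appears elsewhere in this documentation). The following request matches only events whose entire `duration` falls within 2022:

```json
GET events/_search
{
  "query": {
    "range": {
      "duration": {
        "gte": "2022-01-01",
        "lte": "2022-12-31",
        "relation": "WITHIN"
      }
    }
  }
}
```
{% include copy-curl.html %}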

_query-dsl/term/regexp.md Normal file

@ -0,0 +1,66 @@
---
layout: default
title: Regexp
parent: Term-level queries
grand_parent: Query DSL
nav_order: 60
---
# Regexp query
Use the `regexp` query to search for terms that match a regular expression.
The following query searches for any term that starts with any uppercase or lowercase letter followed by `amlet`:
```json
GET shakespeare/_search
{
  "query": {
    "regexp": {
      "play_name": "[a-zA-Z]amlet"
    }
  }
}
```
{% include copy-curl.html %}
Note the following important considerations:
- Regular expressions are applied to the terms (that is, tokens) in the field---not to the entire field.
- By default, the maximum length of a regular expression is 1,000 characters. To change the maximum length, update the `index.max_regex_length` setting.
- Regular expressions use the Lucene syntax, which differs from more standardized implementations. Test thoroughly to ensure that you receive the results you expect. To learn more, see [the Lucene documentation](https://lucene.apache.org/core/8_9_0/core/index.html).
- To improve regexp query performance, avoid wildcard patterns without a prefix or suffix, such as `.*` or `.*?+`.
- `regexp` queries can be expensive operations and require the [`search.allow_expensive_queries`]({{site.url}}{{site.baseurl}}/query-dsl/index/#expensive-queries) setting to be set to `true`. Before making frequent `regexp` queries, test their impact on cluster performance and examine alternative queries that may achieve similar results.
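
To raise the maximum regular expression length mentioned in the preceding list, you can update the index setting. The following sketch assumes the `shakespeare` index, and the value `2000` is an arbitrary example:

```json
PUT shakespeare/_settings
{
  "index": {
    "max_regex_length": 2000
  }
}
```
{% include copy-curl.html %}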
## Parameters
The query accepts the name of the field (`<field>`) as a top-level parameter:
```json
GET _search
{
  "query": {
    "regexp": {
      "<field>": {
        "value": "[Ss]ample",
        ...
      }
    }
  }
}
```
{% include copy-curl.html %}
The `<field>` accepts the following parameters. All parameters except `value` are optional.
Parameter | Data type | Description
:--- | :--- | :---
`value` | String | The regular expression used for matching terms in the field specified in `<field>`.
`case_insensitive` | Boolean | If `true`, allows case-insensitive matching of the regular expression value with the indexed field values. Default is `false` (case sensitivity is determined by the field's mapping).
`flags` | String | Enables optional operators for Lucene's regular expression engine.
`max_determinized_states` | Integer | Lucene converts a regular expression to an automaton with a number of determinized states. This parameter specifies the maximum number of automaton states allowed for the query. Use this parameter to prevent high resource consumption. To run complex regular expressions, you may need to increase the value of this parameter. Default is 10,000.
`rewrite` | String | Determines how OpenSearch rewrites and scores multi-term queries. Valid values are `constant_score`, `scoring_boolean`, `constant_score_boolean`, `top_terms_N`, `top_terms_boost_N`, and `top_terms_blended_freqs_N`. Default is `constant_score`.
If [`search.allow_expensive_queries`]({{site.url}}{{site.baseurl}}/query-dsl/index/#expensive-queries) is set to `false`, `regexp` queries are not run.
{: .important}
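The following sketch shows several of the optional parameters together. It assumes the `shakespeare` index from the earlier example, and the parameter values are illustrative rather than recommended settings:

```json
GET shakespeare/_search
{
  "query": {
    "regexp": {
      "play_name": {
        "value": "[a-z]amlet",
        "case_insensitive": true,
        "flags": "ALL",
        "max_determinized_states": 20000,
        "rewrite": "constant_score"
      }
    }
  }
}
```
{% include copy-curl.html %}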

_query-dsl/term/term.md Normal file

@ -0,0 +1,99 @@
---
layout: default
title: Term
parent: Term-level queries
grand_parent: Query DSL
nav_order: 70
---
# Term query
Use the `term` query to search for an exact term in a field. For example, the following query searches for a line with an exact line number:
```json
GET shakespeare/_search
{
  "query": {
    "term": {
      "line_id": {
        "value": "61809"
      }
    }
  }
}
```
{% include copy-curl.html %}
When a document is indexed, the `text` fields are [analyzed]({{site.url}}{{site.baseurl}}/analyzers/index/). Analysis includes tokenizing and lowercasing the text and removing punctuation. Unlike `match` queries, which analyze the query text, `term` queries only match the exact term and thus may not return relevant results. Avoid using `term` queries on `text` fields. For more information, see [Term-level and full-text queries compared]({{site.url}}{{site.baseurl}}/query-dsl/term-vs-full-text/).
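For comparison, the following sketch uses a `match` query, which analyzes the query text before matching. It assumes that the `text_entry` field in the `shakespeare` index is mapped as `text`:

```json
GET shakespeare/_search
{
  "query": {
    "match": {
      "text_entry": "A little more than kin"
    }
  }
}
```
{% include copy-curl.html %}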
To make the query case insensitive, set the `case_insensitive` parameter to `true`:
```json
GET shakespeare/_search
{
  "query": {
    "term": {
      "speaker": {
        "value": "HAMLET",
        "case_insensitive": true
      }
    }
  }
}
```
{% include copy-curl.html %}
The response contains the matching documents despite any differences in case:
```json
"hits": {
  "total": {
    "value": 1582,
    "relation": "eq"
  },
  "max_score": 2,
  "hits": [
    {
      "_index": "shakespeare",
      "_id": "32700",
      "_score": 2,
      "_source": {
        "type": "line",
        "line_id": 32701,
        "play_name": "Hamlet",
        "speech_number": 9,
        "line_number": "1.2.66",
        "speaker": "HAMLET",
        "text_entry": "[Aside] A little more than kin, and less than kind."
      }
    },
    ...
  ]
}
```
## Parameters
The query accepts the name of the field (`<field>`) as a top-level parameter:
```json
GET _search
{
  "query": {
    "term": {
      "<field>": {
        "value": "sample",
        ...
      }
    }
  }
}
```
{% include copy-curl.html %}
The `<field>` accepts the following parameters. All parameters except `value` are optional.
Parameter | Data type | Description
:--- | :--- | :---
`value` | String | The term to search for in the field specified in `<field>`. A document is returned in the results only if its field value exactly matches the term, with the correct spacing and capitalization.
`boost` | Floating-point | Boosts the query by the given multiplier. Useful for searches that contain more than one query. Values in the [0, 1) range decrease relevance, and values greater than 1 increase relevance. Default is `1`.
`case_insensitive` | Boolean | If `true`, allows case-insensitive matching of the value with the indexed field values. Default is `false` (case sensitivity is determined by the field's mapping).
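
The `boost` parameter is most useful when a `term` query is combined with other queries. The following sketch wraps two `term` queries in a `bool` query and weights a match on `speaker` twice as heavily as a match on `play_name`. The field values and the boost value are illustrative assumptions:

```json
GET shakespeare/_search
{
  "query": {
    "bool": {
      "should": [
        { "term": { "speaker": { "value": "HAMLET", "boost": 2 } } },
        { "term": { "play_name": { "value": "Hamlet" } } }
      ]
    }
  }
}
```
{% include copy-curl.html %}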


@ -0,0 +1,170 @@
---
layout: default
title: Terms set
parent: Term-level queries
grand_parent: Query DSL
nav_order: 90
---
# Terms set query
With a terms set query, you can search for documents that match a minimum number of exact terms in a specified field. A `terms_set` query is similar to a `terms` query, except that you can specify the minimum number of matching terms that are required in order to return a document. You can specify this number either in a field in the index or with a script.
As an example, consider an index that contains names of students and classes those students have taken. When setting up the mapping for this index, you need to provide a [numeric]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/numeric/) field that specifies the minimum number of matching terms that are required in order to return a document:
```json
PUT students
{
  "mappings": {
    "properties": {
      "name": {
        "type": "keyword"
      },
      "classes": {
        "type": "keyword"
      },
      "min_required": {
        "type": "integer"
      }
    }
  }
}
```
{% include copy-curl.html %}
Next, index two documents that correspond to students:
```json
PUT students/_doc/1
{
  "name": "Mary Major",
  "classes": [ "CS101", "CS102", "MATH101" ],
  "min_required": 2
}
```
{% include copy-curl.html %}
```json
PUT students/_doc/2
{
  "name": "John Doe",
  "classes": [ "CS101", "MATH101", "ENG101" ],
  "min_required": 2
}
```
{% include copy-curl.html %}
Now search for students who have taken at least two of the following classes: `CS101`, `CS102`, `MATH101`:
```json
GET students/_search
{
  "query": {
    "terms_set": {
      "classes": {
        "terms": [ "CS101", "CS102", "MATH101" ],
        "minimum_should_match_field": "min_required"
      }
    }
  }
}
```
{% include copy-curl.html %}
The response contains both documents:
```json
{
  "took" : 44,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.4544616,
    "hits" : [
      {
        "_index" : "students",
        "_id" : "1",
        "_score" : 1.4544616,
        "_source" : {
          "name" : "Mary Major",
          "classes" : [
            "CS101",
            "CS102",
            "MATH101"
          ],
          "min_required" : 2
        }
      },
      {
        "_index" : "students",
        "_id" : "2",
        "_score" : 0.5013843,
        "_source" : {
          "name" : "John Doe",
          "classes" : [
            "CS101",
            "MATH101",
            "ENG101"
          ],
          "min_required" : 2
        }
      }
    ]
  }
}
```
To specify the minimum number of terms a document should match with a script, provide the script in the `minimum_should_match_script` field:
```json
GET students/_search
{
  "query": {
    "terms_set": {
      "classes": {
        "terms": [ "CS101", "CS102", "MATH101" ],
        "minimum_should_match_script": {
          "source": "Math.min(params.num_terms, doc['min_required'].value)"
        }
      }
    }
  }
}
```
{% include copy-curl.html %}
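The script does not need to read a document field. As a further sketch under the same assumptions (the `students` index defined earlier), the following request returns `params.num_terms` directly, so a document should match only if it contains all three listed classes:

```json
GET students/_search
{
  "query": {
    "terms_set": {
      "classes": {
        "terms": [ "CS101", "CS102", "MATH101" ],
        "minimum_should_match_script": {
          "source": "params.num_terms"
        }
      }
    }
  }
}
```
{% include copy-curl.html %}

Of the two sample documents, only the one for Mary Major lists all three classes.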
## Parameters
The query accepts the name of the field (`<field>`) as a top-level parameter:
```json
GET _search
{
  "query": {
    "terms_set": {
      "<field>": {
        "terms": [ "term1", "term2" ],
        ...
      }
    }
  }
}
```
{% include copy-curl.html %}
The `<field>` accepts the following parameters. All parameters except `terms` are optional.
Parameter | Data type | Description
:--- | :--- | :---
`terms` | Array of strings | The array of terms to search for in the field specified in `<field>`. A document is returned in the results only if the required number of terms matches the document's field values exactly, with the correct spacing and capitalization.
`minimum_should_match_field` | String | The name of the [numeric]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/numeric/) field that specifies the number of matching terms required in order to return a document in the results.
`minimum_should_match_script` | Object | A script that returns the number of matching terms required in order to return a document in the results.

_query-dsl/term/terms.md Normal file

@ -0,0 +1,252 @@
---
layout: default
title: Terms
parent: Term-level queries
grand_parent: Query DSL
nav_order: 80
---
# Terms query
Use the `terms` query to search for multiple terms in the same field. For example, the following query searches for lines with the IDs `61809` and `61810`:
```json
GET shakespeare/_search
{
  "query": {
    "terms": {
      "line_id": [
        "61809",
        "61810"
      ]
    }
  }
}
```
{% include copy-curl.html %}
A document is returned if it matches any of the terms in the array.
By default, the maximum number of terms allowed in a `terms` query is 65,536. To change the maximum number of terms, update the `index.max_terms_count` setting.
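As a sketch of how that limit might be changed, the following request updates the setting on the `shakespeare` index. The value `100000` is an arbitrary example, not a recommendation:

```json
PUT shakespeare/_settings
{
  "index": {
    "max_terms_count": 100000
  }
}
```
{% include copy-curl.html %}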
The ability to [highlight results]({{site.url}}{{site.baseurl}}/search-plugins/searching-data/highlight/) for terms queries may not be guaranteed, depending on the highlighter type and the number of terms in the query.
{: .note}
## Parameters
The query accepts the following parameters. All parameters except `<field>` are optional.
Parameter | Data type | Description
:--- | :--- | :---
`<field>` | String | The field in which to search. A document is returned in the results only if its field value exactly matches at least one term, with the correct spacing and capitalization.
`boost` | Floating-point | Boosts the query by the given multiplier. Useful for searches that contain more than one query. Values in the [0, 1) range decrease relevance, and values greater than 1 increase relevance. Default is `1`.
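
In a `terms` query, `boost` is specified next to the field rather than inside it. The following sketch repeats the earlier request against the `shakespeare` index with an illustrative boost value of `2`:

```json
GET shakespeare/_search
{
  "query": {
    "terms": {
      "line_id": [ "61809", "61810" ],
      "boost": 2
    }
  }
}
```
{% include copy-curl.html %}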
## Terms lookup
Terms lookup retrieves the field values of a single document and uses them as search terms. You can use terms lookup to search for a large number of terms.
To use terms lookup, you must enable the `_source` mapping field because terms lookup fetches values from a document. The `_source` field is enabled by default.
Terms lookup tries to fetch the document field values from a shard on a local data node. Thus, using an index with a single primary shard that has full replicas on all applicable data nodes reduces network traffic.
### Example
As an example, create an index that contains student data, mapping `student_id` as a `keyword`:
```json
PUT students
{
  "mappings": {
    "properties": {
      "student_id": { "type": "keyword" }
    }
  }
}
```
{% include copy-curl.html %}
Next, index three documents that correspond to students:
```json
PUT students/_doc/1
{
  "name": "Jane Doe",
  "student_id": "111"
}
```
{% include copy-curl.html %}
```json
PUT students/_doc/2
{
  "name": "Mary Major",
  "student_id": "222"
}
```
{% include copy-curl.html %}
```json
PUT students/_doc/3
{
  "name": "John Doe",
  "student_id": "333"
}
```
{% include copy-curl.html %}
Create a separate index that contains class information, including the class name and an array of student IDs corresponding to the students enrolled in the class:
```json
PUT classes/_doc/101
{
  "name": "CS101",
  "enrolled": [ "111", "222" ]
}
```
{% include copy-curl.html %}
To search for students enrolled in the `CS101` class, specify the document ID of the document that corresponds to the class, the index of that document, and the path of the field in which the terms are located:
```json
GET students/_search
{
  "query": {
    "terms": {
      "student_id": {
        "index": "classes",
        "id": "101",
        "path": "enrolled"
      }
    }
  }
}
```
{% include copy-curl.html %}
The response contains the documents in the `students` index for every student whose ID matches one of the values in the `enrolled` array:
```json
{
  "took": 13,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "students",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "Jane Doe",
          "student_id": "111"
        }
      },
      {
        "_index": "students",
        "_id": "2",
        "_score": 1,
        "_source": {
          "name": "Mary Major",
          "student_id": "222"
        }
      }
    ]
  }
}
```
### Example: Nested fields
The second example demonstrates querying nested fields. Consider an index with the following document:
```json
PUT classes/_doc/102
{
  "name": "CS102",
  "enrolled_students": {
    "id_list": [ "111", "333" ]
  }
}
```
{% include copy-curl.html %}
To search for students enrolled in `CS102`, use the dot path notation to specify the full path to the field in the `path` parameter:
```json
GET students/_search
{
  "query": {
    "terms": {
      "student_id": {
        "index": "classes",
        "id": "102",
        "path": "enrolled_students.id_list"
      }
    }
  }
}
```
{% include copy-curl.html %}
The response contains the matching documents:
```json
{
  "took": 18,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "students",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "Jane Doe",
          "student_id": "111"
        }
      },
      {
        "_index": "students",
        "_id": "3",
        "_score": 1,
        "_source": {
          "name": "John Doe",
          "student_id": "333"
        }
      }
    ]
  }
}
```
### Parameters
The following table lists the terms lookup parameters.
Parameter | Data type | Description
:--- | :--- | :---
`index` | String | The name of the index in which to fetch field values. Required.
`id` | String | The document ID of the document from which to fetch field values. Required.
`path` | String | The name of the field from which to fetch field values. Specify nested fields using dot path notation. Required.
`routing` | String | Custom routing value of the document from which to fetch field values. Optional. Required if a custom routing value was provided when the document was indexed.
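
If the lookup document was indexed with a custom routing value, pass the same value in `routing`. The following sketch assumes that the `classes` document with the ID `101` was indexed with the hypothetical routing value `cs-department`:

```json
GET students/_search
{
  "query": {
    "terms": {
      "student_id": {
        "index": "classes",
        "id": "101",
        "path": "enrolled",
        "routing": "cs-department"
      }
    }
  }
}
```
{% include copy-curl.html %}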


@ -0,0 +1,67 @@
---
layout: default
title: Wildcard
parent: Term-level queries
grand_parent: Query DSL
nav_order: 100
---
# Wildcard query
Use wildcard queries to search for terms that match a wildcard pattern. Wildcard queries support the following operators.
Operator | Description
:--- | :---
`*` | Matches zero or more characters.
`?` | Matches any single character.
To search for terms that start with `H` and end with `Y`, use the following request:
```json
GET shakespeare/_search
{
  "query": {
    "wildcard": {
      "speaker": {
        "value": "H*Y"
      }
    }
  }
}
```
{% include copy-curl.html %}
If you change `*` to `?`, you get no matches because `?` matches exactly one character.
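For the `?` operator to match, the term must have exactly one character in that position. As a sketch, the following request should match the `HAMLET` speaker value in the `shakespeare` index because `?` stands for the single character `A`:

```json
GET shakespeare/_search
{
  "query": {
    "wildcard": {
      "speaker": {
        "value": "H?MLET"
      }
    }
  }
}
```
{% include copy-curl.html %}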
Wildcard queries tend to be slow because they need to iterate over many terms. Avoid placing wildcard characters at the beginning of a pattern because doing so can make the query very expensive in terms of both resources and time.
## Parameters
The query accepts the name of the field (`<field>`) as a top-level parameter:
```json
GET _search
{
  "query": {
    "wildcard": {
      "<field>": {
        "value": "patt*rn",
        ...
      }
    }
  }
}
```
{% include copy-curl.html %}
The `<field>` accepts the following parameters. All parameters except `value` are optional.
Parameter | Data type | Description
:--- | :--- | :---
`value` | String | The wildcard pattern used for matching terms in the field specified in `<field>`.
`boost` | Floating-point | Boosts the query by the given multiplier. Useful for searches that contain more than one query. Values in the [0, 1) range decrease relevance, and values greater than 1 increase relevance. Default is `1`.
`case_insensitive` | Boolean | If `true`, allows case-insensitive matching of the value with the indexed field values. Default is `false` (case sensitivity is determined by the field's mapping).
`rewrite` | String | Determines how OpenSearch rewrites and scores multi-term queries. Valid values are `constant_score`, `scoring_boolean`, `constant_score_boolean`, `top_terms_N`, `top_terms_boost_N`, and `top_terms_blended_freqs_N`. Default is `constant_score`.
If [`search.allow_expensive_queries`]({{site.url}}{{site.baseurl}}/query-dsl/index/#expensive-queries) is set to `false`, wildcard queries are not run.
{: .important}
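As a sketch of how the setting in the preceding note might be changed, the following request disables expensive queries for the whole cluster. It assumes that you want the change to persist across cluster restarts, so it uses `persistent`:

```json
PUT _cluster/settings
{
  "persistent": {
    "search.allow_expensive_queries": false
  }
}
```
{% include copy-curl.html %}

After this change, wildcard queries are not run until the setting is changed back to `true`.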


@ -193,8 +193,8 @@ Specify a condition to filter the results.
`>=` | Greater than or equal to.
`<=` | Less than or equal to.
`IN` | Specify multiple `OR` operators.
`BETWEEN` | Similar to a range query. For more information about range queries, see [Range query]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/term#range).
`LIKE` | Use for full-text search. For more information about full-text queries, see [Full-text queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index).
`BETWEEN` | Similar to a range query. For more information about range queries, see [Range query]({{site.url}}{{site.baseurl}}/query-dsl/term/range/).
`LIKE` | Use for full-text search. For more information about full-text queries, see [Full-text queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index/).
`IS NULL` | Check if the field value is `NULL`.
`IS NOT NULL` | Check if the field value is `NOT NULL`.


@ -168,7 +168,7 @@ You can perform term-level lookup queries (TLQs) with document-level security (D
By default, the Security plugin detects if a DLS query contains a TLQ or not and chooses the appropriate mode automatically at runtime.
To learn more about OpenSearch queries, see [Term-level queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/term/).
To learn more about OpenSearch queries, see [Term-level queries]({{site.url}}{{site.baseurl}}/query-dsl/term/index/).
### How to set the DLS evaluation mode in `opensearch.yml`