Refactor the Query DSL section (#2904)
* for query dsl index page rewrites for proper index page Signed-off-by: alicejw <alicejw@amazon.com> * fix formatting in table Signed-off-by: alicejw <alicejw@amazon.com> * update query table intro Signed-off-by: alicejw <alicejw@amazon.com> * rmv proprietary from overview Signed-off-by: alicejw <alicejw@amazon.com> * awkward sentence fix Signed-off-by: alicejw <alicejw@amazon.com> * to add list of all query categories Signed-off-by: alicejw <alicejw@amazon.com> * for query category descriptions Signed-off-by: alicejw <alicejw@amazon.com> * remove commented note Signed-off-by: alicejw <alicejw@amazon.com> * update term-level query page Signed-off-by: alicejw <alicejw@amazon.com> * for clarity about term and full-text query use cases Signed-off-by: alicejw <alicejw@amazon.com> * for parallel bullet list of queries Signed-off-by: alicejw <alicejw@amazon.com> * remove redundant word Signed-off-by: alicejw <alicejw@amazon.com> * Update _opensearch/query-dsl/index.md Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> * Update _opensearch/query-dsl/index.md Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> * Update _opensearch/query-dsl/index.md Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> * Update _opensearch/query-dsl/term.md Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> * Update _opensearch/query-dsl/term.md Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> * for tech review feedback Signed-off-by: alicejw <alicejw@amazon.com> * for entire list of query types we support, even though we don't have document topic pages for them yet. 
Signed-off-by: alicejw <alicejw@amazon.com> * to include full list of query types we support Signed-off-by: alicejw <alicejw@amazon.com> * change Boolean to type for consistency in the section Signed-off-by: alicejw <alicejw@amazon.com> * update query type category list title Signed-off-by: alicejw <alicejw@amazon.com> * for compound query type definitions Signed-off-by: alicejw <alicejw@amazon.com> * for additional descriptions Signed-off-by: alicejw <alicejw@amazon.com> * for query context descriptions Signed-off-by: alicejw <alicejw@amazon.com> * for additional edits to query descriptions list Signed-off-by: alicejw <alicejw@amazon.com> * create span query category page and update bullet list on index to cross-reference to it. Signed-off-by: alicejw <alicejw@amazon.com> * add pages for geo and shape query category, and add cross-references Signed-off-by: alicejw <alicejw@amazon.com> * remove regex it is part of term-level queries Signed-off-by: alicejw <alicejw@amazon.com> * for bullet list granular edits Signed-off-by: alicejw <alicejw@amazon.com> * put bullet list in alphabetical order Signed-off-by: alicejw <alicejw@amazon.com> * for doc review updates Signed-off-by: alicejw <alicejw@amazon.com> * reword for reviewer feedback Signed-off-by: alicejw <alicejw@amazon.com> * small rewording Signed-off-by: alicejw <alicejw@amazon.com> * typo space Signed-off-by: alicejw <alicejw@amazon.com> * put topics in alphabetical order in left nav Signed-off-by: alicejw <alicejw@amazon.com> * additional reviewer's comment Signed-off-by: alicejw <alicejw@amazon.com> * for second doc reviewer's feedback updates Signed-off-by: alicejw <alicejw@amazon.com> * for doc reviewer comment that was hidden Signed-off-by: alicejw <alicejw@amazon.com> * Update _opensearch/query-dsl/geo-and-shape.md Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> * Update _opensearch/query-dsl/index.md Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> 
* Update _opensearch/query-dsl/index.md Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> * Update _opensearch/query-dsl/index.md Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> * Update _opensearch/query-dsl/index.md Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> * Update _opensearch/query-dsl/index.md Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> * Update _opensearch/query-dsl/index.md Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> * Update _opensearch/query-dsl/span-query.md Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> * Update _opensearch/query-dsl/span-query.md Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> * Update _opensearch/query-dsl/term.md Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> * updates from third doc review for tech accuracy requested by editorial Signed-off-by: alicejw <alicejw@amazon.com> * create compound query sub-page to move descriptions to make bullet list parallel Signed-off-by: alicejw <alicejw@amazon.com> * fix compound query page title Signed-off-by: alicejw <alicejw@amazon.com> * add fuzzy query definition Signed-off-by: alicejw <alicejw@amazon.com> * for editorial feedback updates Signed-off-by: alicejw <alicejw@amazon.com> * Update _opensearch/query-dsl/term.md Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> * Refactor Query DSL section Signed-off-by: Fanit Kolchina <kolchfa@amazon.com> * Adds doc review comments Signed-off-by: Fanit Kolchina <kolchfa@amazon.com> * Fix typo Signed-off-by: Fanit Kolchina <kolchfa@amazon.com> * Implemented editorial comments Signed-off-by: Fanit Kolchina <kolchfa@amazon.com> * Changed periods to colons when introducing code blocks Signed-off-by: Fanit Kolchina <kolchfa@amazon.com> --------- Signed-off-by: alicejw <alicejw@amazon.com> 
Signed-off-by: Fanit Kolchina <kolchfa@amazon.com> Co-authored-by: alicejw <alicejw@amazon.com> Co-authored-by: Alice Williams <88908598+alicejw-aws@users.noreply.github.com>
This commit is contained in:
parent
f3833a0fe8
commit
5abc22147c
|
@ -9,7 +9,7 @@ has_toc: false
|
|||
|
||||
# Index rollups
|
||||
|
||||
Time series data increases storage costs, strains cluster health, and slows down aggregations over time. Index rollup lets you periodically reduce data granularity by rolling up old data into summarized indices.
|
||||
Time series data increases storage costs, strains cluster health, and slows down aggregations over time. Index rollup lets you periodically reduce data granularity by rolling up old data into summarized indexes.
|
||||
|
||||
You pick the fields that interest you and use index rollup to create a new index with only those fields aggregated into coarser time buckets. You can store months or years of historical data at a fraction of the cost with the same query performance.
|
||||
|
||||
|
@ -18,7 +18,7 @@ For example, say you collect CPU consumption data every five seconds and store i
|
|||
You can use index rollup in three ways:
|
||||
|
||||
1. Use the index rollup API for an on-demand index rollup job that operates on an index that's not being actively ingested, such as a rolled-over index. For example, you can perform an index rollup operation to reduce data collected at a five-minute interval to a weekly average for trend analysis.
|
||||
2. Use the OpenSearch Dashboards UI to create an index rollup job that runs on a defined schedule. You can also set it up to roll up your indices as it’s being actively ingested. For example, you can continuously roll up Logstash indices from a five second interval to a one hour interval.
|
||||
2. Use the OpenSearch Dashboards UI to create an index rollup job that runs on a defined schedule. You can also set it up to roll up your indexes as they’re being actively ingested. For example, you can continuously roll up Logstash indexes from a five-second interval to a one-hour interval.
|
||||
3. Specify the index rollup job as an ISM action for complete index management. This allows you to roll up an index after a certain event such as a rollover, index age reaching a certain point, index becoming read-only, and so on. You can also have rollover and index rollup jobs running in sequence, where the rollover first moves the current index to a warm node and then the index rollup job creates a new index with the minimized data on the hot node.
|
||||
|
||||
## Create an Index Rollup Job
|
||||
|
@ -26,7 +26,7 @@ You can use index rollup in three ways:
|
|||
To get started, choose **Index Management** in OpenSearch Dashboards.
|
||||
Select **Rollup Jobs** and choose **Create rollup job**.
|
||||
|
||||
### Step 1: Set up indices
|
||||
### Step 1: Set up indexes
|
||||
|
||||
1. In the **Job name and description** section, specify a unique name and an optional description for the index rollup job.
|
||||
2. In the **Indices** section, select the source and target index. The source index is the one that you want to roll up. The source index remains as is; the index rollup job creates a new index, referred to as the target index, where the index rollup results are saved. For the target index, you can either type in a name for a new index or select an existing index.
|
||||
|
@ -48,7 +48,7 @@ The order in which you select attributes is critical. A city followed by a demog
|
|||
|
||||
### Step 3: Specify schedule
|
||||
|
||||
Specify a schedule to roll up your indices as it’s being ingested. The index rollup job is enabled by default.
|
||||
Specify a schedule to roll up your indexes as they’re being ingested. The index rollup job is enabled by default.
|
||||
|
||||
1. Specify if the data is continuous or not.
|
||||
3. For roll up execution frequency, select **Define by fixed interval** and specify the **Rollup interval** and the time unit or **Define by cron expression** and add in a cron expression to select the interval. To learn how to define a cron expression, see [Alerting]({{site.url}}{{site.baseurl}}/monitoring-plugins/alerting/cron/).
|
||||
|
@ -303,7 +303,7 @@ PUT _plugins/_rollup/jobs/example
|
|||
```
|
||||
|
||||
You can query the `example_rollup` index for the terms aggregations on the fields set up in the rollup job.
|
||||
You get back the same response that you would on the original `opensearch_dashboards_sample_data_ecommerce` source index.
|
||||
You get back the same response that you would on the original `opensearch_dashboards_sample_data_ecommerce` source index:
|
||||
|
||||
```json
|
||||
POST example_rollup/_search
|
||||
|
@ -520,7 +520,7 @@ The `doc_count` field in bucket aggregations contains the number of documents co
|
|||
|
||||
## Query string queries
|
||||
|
||||
To take advantage of shorter and easier to write strings in Query DSL, you can use [query strings]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/query-string/) to simplify search queries in rollup indexes. To use query strings, add the following fields to your rollup search request.
|
||||
To take advantage of shorter and more easily written strings in Query DSL, you can use [query strings]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/query-string/) to simplify search queries in rollup indexes. To use query strings, add the following fields to your rollup search request:
|
||||
|
||||
```json
|
||||
"query": {
|
||||
|
@ -530,7 +530,7 @@ To take advantage of shorter and easier to write strings in Query DSL, you can u
|
|||
}
|
||||
```
|
||||
|
||||
The following example uses a query string with a `*` wildcard operator to search inside a rollup index called `my_server_logs_rollup`.
|
||||
The following example uses a query string with a `*` wildcard operator to search inside a rollup index called `my_server_logs_rollup`:
|
||||
|
||||
```json
|
||||
GET my_server_logs_rollup/_search
|
||||
|
@ -567,7 +567,7 @@ GET my_server_logs_rollup/_search
|
|||
}
|
||||
```
|
||||
|
||||
For more information on which parameters are supported in query strings, see [Advanced filter options]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/query-string/#parameters).
|
||||
For more information about query string query parameters, see [Query string query]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/query-string/#parameters).
|
||||
|
||||
## Dynamic target index
|
||||
|
||||
|
@ -575,23 +575,23 @@ For more information on which parameters are supported in query strings, see [Ad
|
|||
.nobr { white-space: nowrap }
|
||||
</style>
|
||||
|
||||
In ISM rollup, the `target_index` field may contain a template that is compiled at the time of each rollup indexing. For example, if you specify the `target_index` field as <span style="white-space: nowrap">`{% raw %}rollup_ndx-{{ctx.source_index}}{% endraw %}`,</span> the source index `log-000001` will roll up into a target index `rollup_ndx-log-000001`. This allows you to roll up data into multiple time-based indices, with one rollup job created for each source index.
|
||||
In ISM rollup, the `target_index` field may contain a template that is compiled at the time of each rollup indexing. For example, if you specify the `target_index` field as <span style="white-space: nowrap">`{% raw %}rollup_ndx-{{ctx.source_index}}{% endraw %}`,</span> the source index `log-000001` will roll up into a target index `rollup_ndx-log-000001`. This allows you to roll up data into multiple time-based indexes, with one rollup job created for each source index.
|
||||
|
||||
The `source_index` parameter in {% raw %}`{{ctx.source_index}}`{% endraw %} cannot contain wildcards.
|
||||
{: .note}
|
||||
|
||||
## Searching multiple rollup indices
|
||||
## Searching multiple rollup indexes
|
||||
|
||||
When data is rolled up into multiple target indices, you can run one search across all of the rollup indices. To search multiple target indices that have the same rollup, specify the index names as a comma-separated list or a wildcard pattern. For example, with `target_index` as <span style="white-space: nowrap">`{% raw %}rollup_ndx-{{ctx.source_index}}{% endraw %}`</span> and source indices that start with `log`, specify the `rollup_ndx-log*` pattern. Or, to search for rolled up log-000001 and log-000002 indices, specify the `rollup_ndx-log-000001,rollup_ndx-log-000002` list.
|
||||
When data is rolled up into multiple target indexes, you can run one search across all of the rollup indexes. To search multiple target indexes that have the same rollup, specify the index names as a comma-separated list or a wildcard pattern. For example, with `target_index` as <span style="white-space: nowrap">`{% raw %}rollup_ndx-{{ctx.source_index}}{% endraw %}`</span> and source indexes that start with `log`, specify the `rollup_ndx-log*` pattern. Or, to search the rolled-up `log-000001` and `log-000002` indexes, specify the `rollup_ndx-log-000001,rollup_ndx-log-000002` list.
|
||||
|
||||
You cannot search a mix of rollup and non-rollup indices with the same query.
|
||||
You cannot search a mix of rollup and non-rollup indexes with the same query.
|
||||
{: .note}
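
As a rough sketch of the wildcard pattern described above, a single search across all rollup target indexes might look like the following. The `msg_size` field and the `sum` aggregation are assumptions made for illustration; substitute a field and metric that your own rollup job defines. The request sets `size` to 0 on the assumption that only aggregation results, not individual hits, are wanted from the rollup indexes:

```json
GET rollup_ndx-log*/_search
{
  "size": 0,
  "query": {
    "match_all": {}
  },
  "aggregations": {
    "total_msg_size": {
      "sum": {
        "field": "msg_size"
      }
    }
  }
}
```

To target specific rollup indexes instead, replace the pattern with the comma-separated list, for example `rollup_ndx-log-000001,rollup_ndx-log-000002`.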
|
||||
|
||||
## Example
|
||||
|
||||
The following example demonstrates the `doc_count` field, dynamic index names, and searching multiple rollup indices with the same rollup.
|
||||
The following example demonstrates the `doc_count` field, dynamic index names, and searching multiple rollup indexes with the same rollup.
|
||||
|
||||
**Step 1:** Add an index template for ISM to manage the rolling over of the indices aliased by `log`.
|
||||
**Step 1:** Add an index template for ISM to manage the rolling over of the indexes aliased by `log`:
|
||||
|
||||
```json
|
||||
PUT _index_template/ism_rollover
|
||||
|
|
|
@ -144,7 +144,7 @@ Per query monitors run your specified query and then check whether the query's r
|
|||
|
||||
- Visual definition works well for monitors that you can define as "some value is above or below some threshold for some amount of time."
|
||||
|
||||
- Query definition gives you flexibility in terms of what you query for (using [the OpenSearch query DSL]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text)) and how you evaluate the results of that query (Painless scripting).
|
||||
- Query definition gives you flexibility in terms of what you query for (using [OpenSearch query DSL]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index)) and how you evaluate the results of that query (Painless scripting).
|
||||
|
||||
This example averages the `cpu_usage` field:
|
||||
|
||||
|
@ -197,7 +197,7 @@ Per query monitors run your specified query and then check whether the query's r
|
|||
|
||||
If you use the Security plugin, you can only choose indexes that you have permission to access. For details, see [Alerting security]({{site.url}}{{site.baseurl}}/security/).
|
||||
|
||||
To use a query, choose **Extraction query editor**, add your query (using [the OpenSearch query DSL]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/)), and test it using the **Run** button.
|
||||
To use a query, choose **Extraction query editor**, add your query (using [OpenSearch query DSL]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index)), and test it using the **Run** button.
|
||||
|
||||
The monitor makes this query to OpenSearch as often as the schedule dictates; check the **Query Performance** section and make sure you're comfortable with the performance implications.
|
||||
|
||||
|
|
|
@ -1,8 +1,9 @@
|
|||
---
|
||||
layout: default
|
||||
title: Boolean queries
|
||||
parent: Query DSL
|
||||
nav_order: 45
|
||||
parent: Compound queries
|
||||
grand_parent: Query DSL
|
||||
nav_order: 10
|
||||
---
|
||||
|
||||
# Boolean queries
|
|
@ -0,0 +1,19 @@
|
|||
---
|
||||
layout: default
|
||||
title: Compound queries
|
||||
parent: Query DSL
|
||||
has_children: true
|
||||
nav_order: 40
|
||||
---
|
||||
|
||||
# Compound queries
|
||||
|
||||
Compound queries serve as wrappers for multiple leaf or compound clauses either to combine their results or to modify their behavior.
|
||||
|
||||
OpenSearch supports the following compound query types:
|
||||
|
||||
- **Boolean**: Combines multiple query clauses with Boolean logic. To learn more, see [Boolean queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/compound/bool/).
|
||||
- **Constant score**: Wraps a query or a filter and assigns a constant score to all matching documents. This score is equal to the `boost` value.
|
||||
- **Disjunction max**: Returns documents that match one or more query clauses. If a document matches multiple query clauses, it is assigned a higher relevance score. The relevance score is calculated using the highest score from any matching clause and, optionally, the scores from the other matching clauses multiplied by the tiebreaker value.
|
||||
- **Function score**: Recalculates the relevance score of documents that are returned by a query using a function that you define.
|
||||
- **Boosting**: Changes the relevance score of documents without removing them from the search results. Returns documents that match a `positive` query, but downgrades the relevance of documents in the results that match a `negative` query.
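
The following sketch illustrates two of the compound query types listed above. The index name `testindex` and the `speaker`, `text_entry`, and `play_name` fields are assumptions made for illustration, as are the `tie_breaker` and `negative_boost` values. The first request is a disjunction max query that scores each document by its best-matching clause plus a fraction of the other matching clauses. The second is a boosting query that keeps documents matching the `negative` query in the results but lowers their relevance score:

```json
GET testindex/_search
{
  "query": {
    "dis_max": {
      "queries": [
        { "match": { "speaker": "queen" } },
        { "match": { "text_entry": "queen" } }
      ],
      "tie_breaker": 0.3
    }
  }
}

GET testindex/_search
{
  "query": {
    "boosting": {
      "positive": { "match": { "text_entry": "queen" } },
      "negative": { "match": { "play_name": "Cymbeline" } },
      "negative_boost": 0.5
    }
  }
}
```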
|
|
@ -2,10 +2,11 @@
|
|||
layout: default
|
||||
title: Full-text queries
|
||||
parent: Query DSL
|
||||
nav_order: 40
|
||||
has_children: true
|
||||
nav_order: 30
|
||||
---
|
||||
|
||||
# Full-text query types and options
|
||||
# Full-text queries
|
||||
|
||||
This page lists all full-text query types and common options. There are many optional fields that you can use to create subtle search behaviors, so we recommend that you test out some basic query types against representative indexes and verify the output before you perform more advanced or complex searches with multiple options.
|
||||
|
|
@ -1,8 +1,9 @@
|
|||
---
|
||||
layout: default
|
||||
title: Query string queries
|
||||
parent: Query DSL
|
||||
nav_order: 70
|
||||
parent: Full-text queries
|
||||
grand_parent: Query DSL
|
||||
nav_order: 25
|
||||
---
|
||||
|
||||
# Query string queries
|
||||
|
@ -41,7 +42,7 @@ Parameter | Data type | Description
|
|||
`phrase_slop` | Integer | The maximum number of words that are allowed between the matched words. If `phrase_slop` is 2, a maximum of two words is allowed between matched words in a phrase. Transposed words have a slop of 2. Default is 0 (an exact phrase match where matched words must be next to each other).
|
||||
`minimum_should_match` | Positive or negative integer, positive or negative percentage, combination | If the query string contains multiple search terms and you used the `or` operator, the number of terms that need to match for the document to be considered a match. For example, if `minimum_should_match` is 2, "wind often rising" does not match "The Wind Rises." If `minimum_should_match` is 1, it matches.
|
||||
`rewrite` | String | Determines how OpenSearch rewrites and scores multi-term queries. Valid values are `constant_score`, `scoring_boolean`, `constant_score_boolean`, `top_terms_N`, `top_terms_boost_N`, and `top_terms_blended_freqs_N`. Default is `constant_score`.
|
||||
`auto_generate_synonyms_phrase_query` | Boolean | Specifies whether to create [match queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text#match) automatically for multi-term synonyms. Default is `true`.
|
||||
`auto_generate_synonyms_phrase_query` | Boolean | Specifies whether to create [match queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index#match) automatically for multi-term synonyms. Default is `true`.
|
||||
`boost` | Floating-point | Boosts the clause by the given multiplier. Values less than 1.0 decrease relevance, and values greater than 1.0 increase relevance. Default is 1.0.
|
||||
`default_operator`| String | The default Boolean operator used if no operators are specified. Valid values are:<br>- `OR`: The string `to be` is interpreted as `to OR be`<br>- `AND`: The string `to be` is interpreted as `to AND be`<br> Default is `OR`.
|
||||
`enable_position_increments` | Boolean | When true, resulting queries are aware of position increments. This setting is useful when the removal of stop words leaves an unwanted "gap" between terms. Default is `true`.
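
As an illustration of how `default_operator` and `minimum_should_match` work together, the following sketch reuses the "wind often rising" example from the table. The index name `testindex` and the `title` field are assumptions, and the field is assumed to use the default standard analyzer, which does not reduce `rising` and `rises` to a common stem:

```json
GET testindex/_search
{
  "query": {
    "query_string": {
      "query": "wind often rising",
      "default_field": "title",
      "default_operator": "OR",
      "minimum_should_match": 2
    }
  }
}
```

Under these assumptions, a document titled "The Wind Rises" is not returned because only `wind` matches. Lowering `minimum_should_match` to 1 returns it.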
|
|
@ -1,8 +1,9 @@
|
|||
---
|
||||
layout: default
|
||||
title: Geo-bounding box queries
|
||||
parent: Query DSL
|
||||
nav_order: 55
|
||||
parent: Geographic and xy queries
|
||||
grand_parent: Query DSL
|
||||
nav_order: 10
|
||||
---
|
||||
|
||||
# Geo-bounding box queries
|
|
@ -0,0 +1,32 @@
|
|||
---
|
||||
layout: default
|
||||
title: Geographic and xy queries
|
||||
parent: Query DSL
|
||||
has_children: true
|
||||
nav_order: 50
|
||||
---
|
||||
|
||||
# Geographic and xy queries
|
||||
|
||||
Geographic and xy queries let you search fields that contain points and shapes on a map or coordinate plane. Geographic queries work on geospatial data, while xy queries work on two-dimensional coordinate data. Out of all geographic queries, the geoshape query is very similar to the xy query, but the former searches [geographic fields]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/geographic), while the latter searches [Cartesian fields]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/xy).
|
||||
|
||||
## xy queries
|
||||
|
||||
[xy queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/geo-and-xy/xy) search for documents that contain geometries in a Cartesian coordinate system. These geometries can be specified in [`xy_point`]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/xy-point) fields, which support points, and [`xy_shape`]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/xy-shape) fields, which support points, lines, circles, and polygons.
|
||||
|
||||
xy queries return documents that contain:
|
||||
- xy shapes and xy points that have one of four spatial relations to the provided shape: `INTERSECTS`, `DISJOINT`, `WITHIN`, or `CONTAINS`.
|
||||
- xy points that intersect the provided shape.
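
A minimal sketch of an xy query follows. It assumes an index named `testindex` with a field named `geometry` that is mapped as `xy_shape`; both names and the coordinates are illustrative. The query searches for documents whose geometries are within the provided envelope:

```json
GET testindex/_search
{
  "query": {
    "xy_shape": {
      "geometry": {
        "shape": {
          "type": "envelope",
          "coordinates": [ [0.0, 6.0], [4.0, 2.0] ]
        },
        "relation": "WITHIN"
      }
    }
  }
}
```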
|
||||
|
||||
## Geographic queries
|
||||
|
||||
Geographic queries search for documents that contain geospatial geometries. These geometries can be specified in [`geo_point`]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/geo-point) fields, which support points on a map, and [`geo_shape`]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/geo-shape) fields, which support points, lines, circles, and polygons.
|
||||
|
||||
OpenSearch provides the following geographic query types:
|
||||
|
||||
- [**Geo-bounding box queries**]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/geo-and-xy/geo-bounding-box/): Return documents with geopoint field values that are within a bounding box.
|
||||
- **Geodistance queries**: Return documents with geopoints that are within a specified distance from the provided geopoint.
|
||||
- **Geopolygon queries**: Return documents with geopoints that are within a polygon.
|
||||
- **Geoshape queries**: Return documents that contain:
|
||||
  - Geoshapes and geopoints that have one of four spatial relations to the provided shape: `INTERSECTS`, `DISJOINT`, `WITHIN`, or `CONTAINS`.
|
||||
  - Geopoints that intersect the provided shape.
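
As a sketch of one of the query types listed above, the following geodistance query returns documents whose geopoint lies within 50 km of the provided location. The index name `testindex1`, the `point` field (assumed to be mapped as `geo_point`), and the coordinates are all illustrative assumptions:

```json
GET testindex1/_search
{
  "query": {
    "geo_distance": {
      "distance": "50km",
      "point": {
        "lat": 74.0,
        "lon": 40.71
      }
    }
  }
}
```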
|
|
@ -1,8 +1,9 @@
|
|||
---
|
||||
layout: default
|
||||
title: xy queries
|
||||
parent: Query DSL
|
||||
nav_order: 65
|
||||
parent: Geographic and xy queries
|
||||
grand_parent: Query DSL
|
||||
nav_order: 50
|
||||
---
|
||||
|
||||
# xy queries
|
|
@ -12,121 +12,43 @@ redirect_from:
|
|||
|
||||
# Query DSL
|
||||
|
||||
While you can use HTTP request parameters to perform simple searches, you can also use the OpenSearch query domain-specific language (DSL), which provides a wider range of search options. The query DSL uses the HTTP request body, so you can more easily customize your queries to get the exact results that you want.
|
||||
OpenSearch provides a search language called *query domain-specific language (DSL)* that you can use to search your data. Query DSL is a flexible language with a JSON interface.
|
||||
|
||||
For example, the following request performs a simple search to search for a `speaker` field that has a value of `queen`.
|
||||
With query DSL, you need to specify a query in the `query` parameter of the search. One of the simplest searches in OpenSearch uses the `match_all` query, which matches all documents in an index:
|
||||
|
||||
**Sample request**
|
||||
```json
|
||||
GET _search?q=speaker:queen
|
||||
```
|
||||
|
||||
**Sample response**
|
||||
```
|
||||
{
|
||||
"took": 87,
|
||||
"timed_out": false,
|
||||
"_shards": {
|
||||
"total": 68,
|
||||
"successful": 68,
|
||||
"skipped": 0,
|
||||
"failed": 0
|
||||
},
|
||||
"hits": {
|
||||
"total": {
|
||||
"value": 4080,
|
||||
"relation": "eq"
|
||||
},
|
||||
"max_score": 4.4368687,
|
||||
"hits": [
|
||||
{
|
||||
"_index": "new_shakespeare",
|
||||
"_id": "28559",
|
||||
"_score": 4.4368687,
|
||||
"_source": {
|
||||
"type": "line",
|
||||
"line_id": 28560,
|
||||
"play_name": "Cymbeline",
|
||||
"speech_number": 20,
|
||||
"line_number": "1.1.81",
|
||||
"speaker": "QUEEN",
|
||||
"text_entry": "No, be assured you shall not find me, daughter,"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
With query DSL, however, you can include an HTTP request body to look for results more tailored to your needs. The following example shows how to search for `speaker` and `text_entry` fields that have a value of `QUEEN`.
|
||||
|
||||
<!-- need to include the HTTP method in example here GET _search is missing from code block
|
||||
-->
|
||||
**Sample request**
|
||||
```json
|
||||
GET _search
|
||||
GET testindex/_search
|
||||
{
|
||||
"query": {
|
||||
"multi_match": {
|
||||
"query": "QUEEN",
|
||||
"fields": ["speaker", "text_entry"]
|
||||
}
|
||||
"match_all": {
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Sample Response**
|
||||
```json
|
||||
{
|
||||
"took": 39,
|
||||
"timed_out": false,
|
||||
"_shards": {
|
||||
"total": 68,
|
||||
"successful": 68,
|
||||
"skipped": 0,
|
||||
"failed": 0
|
||||
},
|
||||
"hits": {
|
||||
"total": {
|
||||
"value": 5837,
|
||||
"relation": "eq"
|
||||
},
|
||||
"max_score": 7.8623476,
|
||||
"hits": [
|
||||
{
|
||||
"_index": "new_shakespeare",
|
||||
"_id": "100763",
|
||||
"_score": 7.8623476,
|
||||
"_source": {
|
||||
"type": "line",
|
||||
"line_id": 100764,
|
||||
"play_name": "Troilus and Cressida",
|
||||
"speech_number": 43,
|
||||
"line_number": "3.1.68",
|
||||
"speaker": "PANDARUS",
|
||||
"text_entry": "Sweet queen, sweet queen! thats a sweet queen, i faith."
|
||||
}
|
||||
},
|
||||
{
|
||||
"_index": "shakespeare",
|
||||
"_id": "28559",
|
||||
"_score": 5.8923807,
|
||||
"_source": {
|
||||
"type": "line",
|
||||
"line_id": 28560,
|
||||
"play_name": "Cymbeline",
|
||||
"speech_number": 20,
|
||||
"line_number": "1.1.81",
|
||||
"speaker": "QUEEN",
|
||||
"text_entry": "No, be assured you shall not find me, daughter,"
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
```
|
||||
The OpenSearch query DSL comes in three varieties: term-level queries, full-text queries, and boolean queries. You can even perform more complicated searches by using different elements from each variety to find whatever data you need.
|
||||
A query can consist of many query clauses. You can combine query clauses to produce complex queries.
|
||||
|
||||
Broadly, you can classify queries into two categories---*leaf queries* and *compound queries*:
|
||||
|
||||
- **Leaf queries**: Leaf queries search for a specified value in a certain field or fields. You can use leaf queries on their own. They include the following query types:
|
||||
|
||||
- **Full-text queries**: Use full-text queries to search text documents. For an analyzed text field search, full-text queries split the query string into terms with the same analyzer that was used when the field was indexed. For an exact value search, full-text queries look for the specified value without applying text analysis. To learn more, see [Full-text queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index).
|
||||
|
||||
- **Term-level queries**: Use term-level queries to search documents for an exact specified term, such as an ID or value range. Term-level queries do not analyze search terms or sort results by relevance score. To learn more, see [Term-level queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/term/).
|
||||
|
||||
- **Geographic and xy queries**: Use geographic queries to search documents that include geographic data. Use xy queries to search documents that include points and shapes in a two-dimensional coordinate system. To learn more, see [Geographic and xy queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/geo-and-xy/index).
|
||||
|
||||
- **Joining queries**: Use joining queries to search nested fields or return parent and child documents that match a specific query. Types of joining queries include `nested`, `has_child`, `has_parent`, and `parent_id` queries.
|
||||
|
||||
- **Span queries**: Use span queries to perform precise positional searches. Span queries are low-level, specific queries that provide control over the order and proximity of specified query terms. They are primarily used to search legal documents. To learn more, see [Span queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/span-query/).
|
||||
|
||||
- **Specialized queries**: Specialized queries include all other query types (`distance_feature`, `more_like_this`, `percolate`, `rank_feature`, `script`, `script_score`, `wrapper`, and `pinned_query`).
|
||||
|
||||
- **Compound queries**: Compound queries serve as wrappers for multiple leaf or compound clauses either to combine their results or to modify their behavior. They include the Boolean, disjunction max, constant score, function score, and boosting query types. To learn more, see [Compound queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/compound/index).
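
To make the distinction between the two categories concrete, the following sketch wraps a full-text leaf query and a term-level leaf query in a Boolean compound query. The index name `testindex` and the `text_entry` and `line_id` fields are assumed for illustration:

```json
GET testindex/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "text_entry": "sweet queen"
          }
        }
      ],
      "filter": [
        {
          "term": {
            "line_id": 100764
          }
        }
      ]
    }
  }
}
```

The `match` clause is analyzed and contributes to the relevance score, while the `term` clause in the `filter` context only includes or excludes documents.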
|
||||
|
||||
## A note on Unicode special characters in text fields
|
||||
|
||||
Due to word boundaries associated with Unicode special characters, the Unicode standard analyzer cannot index a [text field type](https://opensearch.org/docs/2.2/opensearch/supported-field-types/text/) value as a whole value when it includes one of these special characters. As a result, a text field value that includes a special character is parsed by the standard analyzer as multiple values separated by the special character, effectively tokenizing the different elements on either side of it. This can lead to unintentional filtering of documents and potentially compromise control over their access.
|
||||
Due to word boundaries associated with Unicode special characters, the Unicode standard analyzer cannot index a [text field type]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/text/) value as a whole value when it includes one of these special characters. As a result, a text field value that includes a special character is parsed by the standard analyzer as multiple values separated by the special character, effectively tokenizing the different elements on either side of it. This can lead to unintentional filtering of documents and potentially compromise control over their access.
|
||||
|
||||
The examples below illustrate values containing special characters that will be parsed improperly by the standard analyzer. In this example, the existence of the hyphen/minus sign in the value prevents the analyzer from distinguishing between the two different users for `user.id` and interprets them as one and the same:
|
||||
|
||||
|
@ -154,7 +76,6 @@ The examples below illustrate values containing special characters that will be
|
|||
}
|
||||
```
|
||||
|
||||
To avoid this circumstance when using either query DSL or the REST API, you can use a custom analyzer or map the field as `keyword`, which performs an exact-match search. See [Keyword field type](https://opensearch.org/docs/2.2/opensearch/supported-field-types/keyword/) for the latter option.
|
||||
|
||||
For a list of characters that should be avoided when field type is `text`, see [Word Boundaries](https://unicode.org/reports/tr29/#Word_Boundaries).
|
||||
To avoid this circumstance when using either query DSL or the REST API, you can use a custom analyzer or map the field as `keyword`, which performs an exact-match search. See [Keyword field type]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/keyword/) for the latter option.
|
||||
|
||||
For a list of characters that should be avoided for `text` field types, see [Word Boundaries](https://unicode.org/reports/tr29/#Word_Boundaries).
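
As a minimal sketch of the `keyword` option, the following mapping stores `user.id` values as single, unanalyzed terms. The index name `testindex` is a placeholder and is not part of the original example:

```json
PUT testindex
{
  "mappings": {
    "properties": {
      "user": {
        "properties": {
          "id": {
            "type": "keyword"
          }
        }
      }
    }
  }
}
```

With this mapping, a value that contains a hyphen is indexed as one term, so a term-level query must supply the complete value to match it.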
|
|
@ -0,0 +1,22 @@
|
|||
---
layout: default
title: Span queries
parent: Query DSL
nav_order: 60
---
|
||||
|
||||
# Span queries
|
||||
|
||||
You can use span queries to perform precise positional searches. Span queries are low-level, specific queries that provide control over the order and proximity of specified query terms. They are primarily used to search legal documents and patents.
|
||||
|
||||
Span queries include the following query types:
|
||||
|
||||
- **Span containing**: Wraps a list of span queries and only returns spans that match a second span query.
- **Span field masking**: Combines `span_near` or `span_or` across different fields.
- **Span first**: Matches spans close to the beginning of the field.
- **Span multi-term**: Provides a wrapper around the following query types: `term`, `range`, `prefix`, `wildcard`, `regexp`, or `fuzzy`.
- **Span near**: Matches spans that are near each other. Wraps multiple span queries that must match within the specified `slop` distance of each other, and optionally in the same order. The `slop` parameter represents the maximum number of intervening unmatched positions, and the `in_order` parameter indicates whether the matches are required to appear in order.
- **Span not**: Provides a wrapper for another span query and excludes any documents that match the internal query.
- **Span or**: Provides a wrapper for multiple span queries and includes any documents that match any of the specified queries.
- **Span term**: Functions in the same way as a `term` query, but is designed to be used with other span queries.
- **Span within**: Used with other span queries to return a single span query if its span is within the spans that are returned by a list of other span queries.
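
As a rough sketch of how these query types combine, the following `span_near` query wraps two `span_term` clauses. It assumes the `shakespeare` sample index with an analyzed `text_entry` field, so the terms are written in lowercase to match the indexed tokens. The `slop` and `in_order` values are illustrative:

```json
GET shakespeare/_search
{
  "query": {
    "span_near": {
      "clauses": [
        { "span_term": { "text_entry": "to" } },
        { "span_term": { "text_entry": "be" } }
      ],
      "slop": 1,
      "in_order": true
    }
  }
}
```

With `in_order` set to `true`, the term `to` must appear before `be`, with at most one unmatched position between them.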
|
|
@ -0,0 +1,233 @@
|
|||
---
layout: default
title: Term-level and full-text queries compared
parent: Query DSL
nav_order: 10
---
|
||||
|
||||
# Term-level and full-text queries compared
|
||||
|
||||
You can use both term-level and full-text queries to search text, but while term-level queries are usually used to search structured data, full-text queries are used for full-text search. The main difference between term-level and full-text queries is that term-level queries search documents for an exact specified term, while full-text queries analyze the query string. The following table summarizes the differences between term-level and full-text queries.
|
||||
|
||||
| | Term-level queries | Full-text queries
:--- | :--- | :---
*Description* | Term-level queries answer which documents match a query. | Full-text queries answer how well the documents match a query.
*Analyzer* | The search term isn't analyzed. This means that the term query searches for your search term as it is. | The search term is analyzed by the same analyzer that was used for the specific document field at the time it was indexed. This means that your search term goes through the same analysis process as the document's field.
*Relevance* | Term-level queries simply return documents that match without sorting them based on the relevance score. They still calculate the relevance score, but this score is the same for all the documents that are returned. | Full-text queries calculate a relevance score for each match and sort the results by decreasing order of relevance.
*Use Case* | Use term-level queries when you want to match exact values such as numbers, dates, or tags and don't need the matches to be sorted by relevance. | Use full-text queries to match text fields and sort by relevance after taking into account factors like casing and stemming variants.
|
||||
|
||||
OpenSearch uses the BM25 ranking algorithm to calculate relevance scores. To learn more, see [Okapi BM25](https://en.wikipedia.org/wiki/Okapi_BM25).
{: .note }
|
||||
|
||||
## Should I use a full-text or a term-level query?
|
||||
|
||||
To clarify the difference between full-text and term-level queries, consider the following two examples. Both assume that the complete works of Shakespeare are indexed in an OpenSearch cluster.
|
||||
|
||||
### Example: Phrase search
|
||||
|
||||
In this example, you'll search the complete works of Shakespeare for the phrase "To be, or not to be" in the `text_entry` field.
|
||||
|
||||
First, use a **term-level query** for this search:
|
||||
|
||||
```json
GET shakespeare/_search
{
  "query": {
    "term": {
      "text_entry": "To be, or not to be"
    }
  }
}
```
|
||||
|
||||
The response contains no matches, indicated by zero `hits`:
|
||||
|
||||
```json
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}
```
|
||||
|
||||
This is because the term “To be, or not to be” is searched literally in the inverted index, where only the analyzed values of the text fields are stored. Term-level queries aren’t suited for searching analyzed text fields because they often yield unexpected results. When working with text data, use term-level queries only for fields mapped as `keyword`.
|
||||
|
||||
Now search for the same phrase using a **full-text query**:
|
||||
|
||||
```json
GET shakespeare/_search
{
  "query": {
    "match": {
      "text_entry": "To be, or not to be"
    }
  }
}
```
|
||||
|
||||
The search query “To be, or not to be” is analyzed and tokenized into an array of tokens just like the `text_entry` field of the documents. The full-text query takes an intersection of tokens between the search query and the `text_entry` fields for all the documents, and then sorts the results by relevance score:
|
||||
|
||||
```json
|
||||
{
|
||||
"took" : 19,
|
||||
"timed_out" : false,
|
||||
"_shards" : {
|
||||
"total" : 1,
|
||||
"successful" : 1,
|
||||
"skipped" : 0,
|
||||
"failed" : 0
|
||||
},
|
||||
"hits" : {
|
||||
"total" : {
|
||||
"value" : 10000,
|
||||
"relation" : "gte"
|
||||
},
|
||||
"max_score" : 17.419369,
|
||||
"hits" : [
|
||||
{
|
||||
"_index" : "shakespeare",
|
||||
"_id" : "34229",
|
||||
"_score" : 17.419369,
|
||||
"_source" : {
|
||||
"type" : "line",
|
||||
"line_id" : 34230,
|
||||
"play_name" : "Hamlet",
|
||||
"speech_number" : 19,
|
||||
"line_number" : "3.1.64",
|
||||
"speaker" : "HAMLET",
|
||||
"text_entry" : "To be, or not to be: that is the question:"
|
||||
}
|
||||
},
|
||||
{
|
||||
"_index" : "shakespeare",
|
||||
"_id" : "109930",
|
||||
"_score" : 14.883024,
|
||||
"_source" : {
|
||||
"type" : "line",
|
||||
"line_id" : 109931,
|
||||
"play_name" : "A Winters Tale",
|
||||
"speech_number" : 23,
|
||||
"line_number" : "4.4.153",
|
||||
"speaker" : "PERDITA",
|
||||
"text_entry" : "Not like a corse; or if, not to be buried,"
|
||||
}
|
||||
},
|
||||
{
|
||||
"_index" : "shakespeare",
|
||||
"_id" : "103117",
|
||||
"_score" : 14.782743,
|
||||
"_source" : {
|
||||
"type" : "line",
|
||||
"line_id" : 103118,
|
||||
"play_name" : "Twelfth Night",
|
||||
"speech_number" : 53,
|
||||
"line_number" : "1.3.95",
|
||||
"speaker" : "SIR ANDREW",
|
||||
"text_entry" : "will not be seen; or if she be, its four to one"
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
...
|
||||
```
|
||||
|
||||
For a list of all full-text queries, see [Full-text queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index).
|
||||
|
||||
### Example: Exact term search
|
||||
|
||||
If you want to search for an exact term like “HAMLET” in the `speaker` field and don't need the results to be sorted by relevance score, a term-level query is more efficient:
|
||||
|
||||
```json
GET shakespeare/_search
{
  "query": {
    "term": {
      "speaker": "HAMLET"
    }
  }
}
```
|
||||
|
||||
The response contains document matches:
|
||||
|
||||
```json
|
||||
{
|
||||
"took" : 5,
|
||||
"timed_out" : false,
|
||||
"_shards" : {
|
||||
"total" : 1,
|
||||
"successful" : 1,
|
||||
"skipped" : 0,
|
||||
"failed" : 0
|
||||
},
|
||||
"hits" : {
|
||||
"total" : {
|
||||
"value" : 1582,
|
||||
"relation" : "eq"
|
||||
},
|
||||
"max_score" : 4.2540946,
|
||||
"hits" : [
|
||||
{
|
||||
"_index" : "shakespeare",
|
||||
"_id" : "32700",
|
||||
"_score" : 4.2540946,
|
||||
"_source" : {
|
||||
"type" : "line",
|
||||
"line_id" : 32701,
|
||||
"play_name" : "Hamlet",
|
||||
"speech_number" : 9,
|
||||
"line_number" : "1.2.66",
|
||||
"speaker" : "HAMLET",
|
||||
"text_entry" : "[Aside] A little more than kin, and less than kind."
|
||||
}
|
||||
},
|
||||
{
|
||||
"_index" : "shakespeare",
|
||||
"_id" : "32702",
|
||||
"_score" : 4.2540946,
|
||||
"_source" : {
|
||||
"type" : "line",
|
||||
"line_id" : 32703,
|
||||
"play_name" : "Hamlet",
|
||||
"speech_number" : 11,
|
||||
"line_number" : "1.2.68",
|
||||
"speaker" : "HAMLET",
|
||||
"text_entry" : "Not so, my lord; I am too much i' the sun."
|
||||
}
|
||||
},
|
||||
{
|
||||
"_index" : "shakespeare",
|
||||
"_id" : "32709",
|
||||
"_score" : 4.2540946,
|
||||
"_source" : {
|
||||
"type" : "line",
|
||||
"line_id" : 32710,
|
||||
"play_name" : "Hamlet",
|
||||
"speech_number" : 13,
|
||||
"line_number" : "1.2.75",
|
||||
"speaker" : "HAMLET",
|
||||
"text_entry" : "Ay, madam, it is common."
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
...
|
||||
```
|
||||
|
||||
Term-level queries provide exact matches. If you search for “Hamlet”, you don’t receive any matches because `speaker` is a keyword field, so the value “HAMLET” is stored in OpenSearch literally and not in an analyzed form.
The search query “HAMLET” is also searched literally, so to get a match for this field, you need to enter the exact same characters.
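
To make the case sensitivity concrete, the following sketch repeats the previous query with different casing. Assuming the `speaker` field is mapped as `keyword`, this request is expected to return no matches because `Hamlet` and `HAMLET` are different terms:

```json
GET shakespeare/_search
{
  "query": {
    "term": {
      "speaker": "Hamlet"
    }
  }
}
```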
|
|
@ -2,231 +2,33 @@
|
|||
layout: default
|
||||
title: Term-level queries
|
||||
parent: Query DSL
|
||||
nav_order: 30
|
||||
nav_order: 20
|
||||
---
|
||||
|
||||
# Term-level queries
|
||||
|
||||
OpenSearch supports two types of queries when you search for data: term-level queries and full-text queries.
|
||||
Term-level queries search an index for documents that contain an exact search term. Documents returned by a term-level query are not sorted by their relevance scores.
|
||||
|
||||
The following table describes the differences between them:
|
||||
When working with text data, use term-level queries for fields mapped as `keyword` only.
|
||||
|
||||
| | Term-level queries | Full-text queries
|
||||
Term-level queries are not suited for searching analyzed text fields. To search analyzed fields, use a [full-text query]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text).
|
||||
|
||||
## Term-level query types
|
||||
|
||||
The following table lists all term-level query types.
|
||||
|
||||
| Query type | Description
|
||||
:--- | :--- | :---
|
||||
*Description* | Term-level queries answer which documents match a query. | Full-text queries answer how well the documents match a query.
|
||||
*Analyzer* | The search term isn't analyzed. This means that the term query searches for your search term as it is. | The search term is analyzed by the same analyzer that was used for the specific field of the document at the time it was indexed. This means that your search term goes through the same analysis process that the document's field did.
|
||||
*Relevance* | Term-level queries simply return documents that match without sorting them based on the relevance score. They still calculate the relevance score, but this score is the same for all the documents that are returned. | Full-text queries calculate a relevance score for each match and sort the results by decreasing order of relevance.
|
||||
*Use Case* | Use term-level queries when you want to match exact values such as numbers, dates, tags, and so on, and don't need the matches to be sorted by relevance. | Use full-text queries to match text fields and sort by relevance after taking into account factors like casing and stemming variants.
|
||||
|
||||
OpenSearch uses a probabilistic ranking framework called Okapi BM25 to calculate relevance scores. To learn more about Okapi BM25, see [Wikipedia](https://en.wikipedia.org/wiki/Okapi_BM25).
|
||||
{: .note }
|
||||
|
||||
Assume that you have the complete works of Shakespeare indexed in an OpenSearch cluster. We use a term-level query to search for the phrase "To be, or not to be" in the `text_entry` field:
|
||||
|
||||
```json
|
||||
GET shakespeare/_search
|
||||
{
|
||||
"query": {
|
||||
"term": {
|
||||
"text_entry": "To be, or not to be"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
#### Sample response
|
||||
|
||||
```json
|
||||
{
|
||||
"took" : 3,
|
||||
"timed_out" : false,
|
||||
"_shards" : {
|
||||
"total" : 1,
|
||||
"successful" : 1,
|
||||
"skipped" : 0,
|
||||
"failed" : 0
|
||||
},
|
||||
"hits" : {
|
||||
"total" : {
|
||||
"value" : 0,
|
||||
"relation" : "eq"
|
||||
},
|
||||
"max_score" : null,
|
||||
"hits" : [ ]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
We don’t get back any matches (`hits`). This is because the term “To be, or not to be” is searched literally in the inverted index, where only the analyzed values of the text fields are stored. Term-level queries aren't suited for searching on analyzed text fields because they often yield unexpected results. When working with text data, use term-level queries only for fields mapped as keyword only.
|
||||
|
||||
Using a full-text query:
|
||||
|
||||
```json
|
||||
GET shakespeare/_search
|
||||
{
|
||||
"query": {
|
||||
"match": {
|
||||
"text_entry": "To be, or not to be"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
The search query “To be, or not to be” is analyzed and tokenized into an array of tokens just like the `text_entry` field of the documents. The full-text query performs an intersection of tokens between our search query and the `text_entry` fields for all the documents, and then sorts the results by relevance scores:
|
||||
|
||||
#### Sample response
|
||||
|
||||
```json
|
||||
{
|
||||
"took" : 19,
|
||||
"timed_out" : false,
|
||||
"_shards" : {
|
||||
"total" : 1,
|
||||
"successful" : 1,
|
||||
"skipped" : 0,
|
||||
"failed" : 0
|
||||
},
|
||||
"hits" : {
|
||||
"total" : {
|
||||
"value" : 10000,
|
||||
"relation" : "gte"
|
||||
},
|
||||
"max_score" : 17.419369,
|
||||
"hits" : [
|
||||
{
|
||||
"_index" : "shakespeare",
|
||||
"_id" : "34229",
|
||||
"_score" : 17.419369,
|
||||
"_source" : {
|
||||
"type" : "line",
|
||||
"line_id" : 34230,
|
||||
"play_name" : "Hamlet",
|
||||
"speech_number" : 19,
|
||||
"line_number" : "3.1.64",
|
||||
"speaker" : "HAMLET",
|
||||
"text_entry" : "To be, or not to be: that is the question:"
|
||||
}
|
||||
},
|
||||
{
|
||||
"_index" : "shakespeare",
|
||||
"_id" : "109930",
|
||||
"_score" : 14.883024,
|
||||
"_source" : {
|
||||
"type" : "line",
|
||||
"line_id" : 109931,
|
||||
"play_name" : "A Winters Tale",
|
||||
"speech_number" : 23,
|
||||
"line_number" : "4.4.153",
|
||||
"speaker" : "PERDITA",
|
||||
"text_entry" : "Not like a corse; or if, not to be buried,"
|
||||
}
|
||||
},
|
||||
{
|
||||
"_index" : "shakespeare",
|
||||
"_id" : "103117",
|
||||
"_score" : 14.782743,
|
||||
"_source" : {
|
||||
"type" : "line",
|
||||
"line_id" : 103118,
|
||||
"play_name" : "Twelfth Night",
|
||||
"speech_number" : 53,
|
||||
"line_number" : "1.3.95",
|
||||
"speaker" : "SIR ANDREW",
|
||||
"text_entry" : "will not be seen; or if she be, its four to one"
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
...
|
||||
```
|
||||
|
||||
For a list of all full-text queries, see [Full-text queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/).
|
||||
|
||||
If you want to query for an exact term like “HAMLET” in the speaker field and don't need the results to be sorted by relevance scores, a term-level query is more efficient:
|
||||
|
||||
```json
|
||||
GET shakespeare/_search
|
||||
{
|
||||
"query": {
|
||||
"term": {
|
||||
"speaker": "HAMLET"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
#### Sample response
|
||||
|
||||
```json
|
||||
{
|
||||
"took" : 5,
|
||||
"timed_out" : false,
|
||||
"_shards" : {
|
||||
"total" : 1,
|
||||
"successful" : 1,
|
||||
"skipped" : 0,
|
||||
"failed" : 0
|
||||
},
|
||||
"hits" : {
|
||||
"total" : {
|
||||
"value" : 1582,
|
||||
"relation" : "eq"
|
||||
},
|
||||
"max_score" : 4.2540946,
|
||||
"hits" : [
|
||||
{
|
||||
"_index" : "shakespeare",
|
||||
"_id" : "32700",
|
||||
"_score" : 4.2540946,
|
||||
"_source" : {
|
||||
"type" : "line",
|
||||
"line_id" : 32701,
|
||||
"play_name" : "Hamlet",
|
||||
"speech_number" : 9,
|
||||
"line_number" : "1.2.66",
|
||||
"speaker" : "HAMLET",
|
||||
"text_entry" : "[Aside] A little more than kin, and less than kind."
|
||||
}
|
||||
},
|
||||
{
|
||||
"_index" : "shakespeare",
|
||||
"_id" : "32702",
|
||||
"_score" : 4.2540946,
|
||||
"_source" : {
|
||||
"type" : "line",
|
||||
"line_id" : 32703,
|
||||
"play_name" : "Hamlet",
|
||||
"speech_number" : 11,
|
||||
"line_number" : "1.2.68",
|
||||
"speaker" : "HAMLET",
|
||||
"text_entry" : "Not so, my lord; I am too much i' the sun."
|
||||
}
|
||||
},
|
||||
{
|
||||
"_index" : "shakespeare",
|
||||
"_id" : "32709",
|
||||
"_score" : 4.2540946,
|
||||
"_source" : {
|
||||
"type" : "line",
|
||||
"line_id" : 32710,
|
||||
"play_name" : "Hamlet",
|
||||
"speech_number" : 13,
|
||||
"line_number" : "1.2.75",
|
||||
"speaker" : "HAMLET",
|
||||
"text_entry" : "Ay, madam, it is common."
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
...
|
||||
```
|
||||
|
||||
The term-level queries are exact matches. So, if you search for “Hamlet”, you don’t get back any matches, because “HAMLET” is a keyword field and is stored in OpenSearch literally and not in an analyzed form.
|
||||
The search query “HAMLET” is also searched literally. So, to get a match on this field, we need to enter the exact same characters.
|
||||
|
||||
---
|
||||
[`term`](#term) | Searches for documents with an exact term in a specific field.
|
||||
[`terms`](#terms) | Searches for documents with one or more terms in a specific field.
|
||||
[`terms_set`](#terms-set) | Searches for documents that match a minimum number of terms in a specific field.
|
||||
[`ids`](#ids) | Searches for documents by document ID.
|
||||
[`range`](#range) | Searches for documents with field values in a specific range.
|
||||
[`prefix`](#prefix) | Searches for documents with terms that begin with a specific prefix.
|
||||
[`exists`](#exists) | Searches for documents with any indexed value in a specific field.
|
||||
[`fuzzy`](#fuzzy) | Searches for documents with terms that are similar to the search term within the maximum allowed [Levenshtein distance](https://en.wikipedia.org/wiki/Levenshtein_distance). The Levenshtein distance measures the number of one-character changes needed to change one term to another term.
|
||||
[`wildcard`](#wildcard) | Searches for documents with terms that match a wildcard pattern.
|
||||
[`regexp`](#regexp) | Searches for documents with terms that match a regular expression.
|
||||
|
||||
## Term
|
||||
|
||||
|
@ -244,6 +46,7 @@ GET shakespeare/_search
|
|||
}
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
## Terms
|
||||
|
||||
|
@ -262,9 +65,146 @@ GET shakespeare/_search
|
|||
}
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
You get back documents that match any of the terms.
|
||||
|
||||
## Terms set
|
||||
|
||||
With a terms set query, you can search for documents that match a minimum number of exact terms in a specified field. The `terms_set` query is similar to the `terms` query, but you can specify the minimum number of matching terms that are required to return a document. You can specify this number either in a field in the index or with a script.
|
||||
|
||||
As an example, consider an index that contains students with classes they have taken. When setting up the mapping for this index, you need to provide a [numeric]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/numeric) field that specifies the minimum number of matching terms that are required to return a document:
|
||||
|
||||
```json
PUT students
{
  "mappings": {
    "properties": {
      "name": {
        "type": "keyword"
      },
      "classes": {
        "type": "keyword"
      },
      "min_required": {
        "type": "integer"
      }
    }
  }
}
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
Next, index two documents that correspond to students:
|
||||
|
||||
```json
PUT students/_doc/1
{
  "name": "Mary Major",
  "classes": [ "CS101", "CS102", "MATH101" ],
  "min_required": 2
}
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
```json
PUT students/_doc/2
{
  "name": "John Doe",
  "classes": [ "CS101", "MATH101", "ENG101" ],
  "min_required": 2
}
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
Now search for students who have taken at least two of the following classes: `CS101`, `CS102`, `MATH101`:
|
||||
|
||||
```json
GET students/_search
{
  "query": {
    "terms_set": {
      "classes": {
        "terms": [ "CS101", "CS102", "MATH101" ],
        "minimum_should_match_field": "min_required"
      }
    }
  }
}
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
The response contains both students:
|
||||
|
||||
```json
|
||||
{
|
||||
"took" : 44,
|
||||
"timed_out" : false,
|
||||
"_shards" : {
|
||||
"total" : 1,
|
||||
"successful" : 1,
|
||||
"skipped" : 0,
|
||||
"failed" : 0
|
||||
},
|
||||
"hits" : {
|
||||
"total" : {
|
||||
"value" : 2,
|
||||
"relation" : "eq"
|
||||
},
|
||||
"max_score" : 1.4544616,
|
||||
"hits" : [
|
||||
{
|
||||
"_index" : "students",
|
||||
"_id" : "1",
|
||||
"_score" : 1.4544616,
|
||||
"_source" : {
|
||||
"name" : "Mary Major",
|
||||
"classes" : [
|
||||
"CS101",
|
||||
"CS102",
|
||||
"MATH101"
|
||||
],
|
||||
"min_required" : 2
|
||||
}
|
||||
},
|
||||
{
|
||||
"_index" : "students",
|
||||
"_id" : "2",
|
||||
"_score" : 0.5013843,
|
||||
"_source" : {
|
||||
"name" : "John Doe",
|
||||
"classes" : [
|
||||
"CS101",
|
||||
"MATH101",
|
||||
"ENG101"
|
||||
],
|
||||
"min_required" : 2
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
To specify the minimum number of terms a document should match with a script, provide the script in the `minimum_should_match_script` field:
|
||||
|
||||
```json
GET students/_search
{
  "query": {
    "terms_set": {
      "classes": {
        "terms": [ "CS101", "CS102", "MATH101" ],
        "minimum_should_match_script": {
          "source": "Math.min(params.num_terms, doc['min_required'].value)"
        }
      }
    }
  }
}
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
## IDs
|
||||
|
||||
Use the `ids` query to search for one or more document ID values.
|
||||
|
@ -282,8 +222,9 @@ GET shakespeare/_search
|
|||
}
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
## Range query
|
||||
## Range
|
||||
|
||||
You can search for a range of values in a field with the `range` query.
|
||||
|
||||
|
@ -302,6 +243,7 @@ GET shakespeare/_search
|
|||
}
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
Parameter | Behavior
|
||||
:--- | :---
|
||||
|
@ -328,6 +270,7 @@ GET products/_search
|
|||
}
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
Specify relative dates by using [date math]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/date/#date-math).
|
||||
|
||||
|
@ -345,6 +288,7 @@ GET products/_search
|
|||
}
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
The first date that you specify is the anchor date, or the starting point for the date math. Add two trailing pipe symbols after it. You can then add one day (`+1d`) or subtract two weeks (`-2w`). This math expression is relative to the anchor date that you specify.
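
The following sketch puts these pieces together. The `created` field and the anchor date are hypothetical. Substitute a date field from your own index:

```json
GET products/_search
{
  "query": {
    "range": {
      "created": {
        "gte": "2022-01-01||-2w",
        "lte": "2022-01-01||+1d"
      }
    }
  }
}
```

This request matches documents whose `created` value falls between two weeks before and one day after the anchor date.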
|
||||
|
||||
|
@ -364,6 +308,7 @@ GET products/_search
|
|||
}
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
The keyword `now` refers to the current date and time.
|
||||
|
||||
|
@ -381,6 +326,7 @@ GET shakespeare/_search
|
|||
}
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
## Exists
|
||||
|
||||
|
@ -396,8 +342,59 @@ GET shakespeare/_search
|
|||
}
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
## Wildcards
|
||||
## Fuzzy
|
||||
|
||||
A fuzzy query searches for documents with terms that are similar to the search term within the maximum allowed [Levenshtein distance](https://en.wikipedia.org/wiki/Levenshtein_distance). The Levenshtein distance measures the number of one-character changes needed to change one term to another term. These changes include:
|
||||
|
||||
- Replacements: **c**at to **b**at
|
||||
- Insertions: cat to cat**s**
|
||||
- Deletions: **c**at to at
|
||||
- Transpositions: **ca**t to **ac**t
|
||||
|
||||
A fuzzy query creates a list of all possible expansions of the search term that fall within the Levenshtein distance. You can specify the maximum number of such expansions in the `max_expansions` field. The query then searches for documents that match any of the expansions.
|
||||
|
||||
The following example query searches for the speaker `HALET` (misspelled `HAMLET`). The maximum edit distance is not specified, so the default `AUTO` edit distance is used:
|
||||
|
||||
```json
GET shakespeare/_search
{
  "query": {
    "fuzzy": {
      "speaker": {
        "value": "HALET"
      }
    }
  }
}
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
The response contains all documents where `HAMLET` is the speaker.
|
||||
|
||||
The following example query searches for the speaker `HALET` with advanced parameters:
|
||||
|
||||
```json
GET shakespeare/_search
{
  "query": {
    "fuzzy": {
      "speaker": {
        "value": "HALET",
        "fuzziness": "2",
        "max_expansions": 40,
        "prefix_length": 0,
        "transpositions": true,
        "rewrite": "constant_score"
      }
    }
  }
}
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
## Wildcard
|
||||
|
||||
Use wildcard queries to search for terms that match a wildcard pattern.
|
||||
|
||||
|
@ -420,12 +417,13 @@ GET shakespeare/_search
|
|||
}
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
If we change `*` to `?`, we get no matches, because `?` refers to a single character.
|
||||
|
||||
Wildcard queries tend to be slow because they need to iterate over a lot of terms. Avoid placing wildcard characters at the beginning of a query because it could be a very expensive operation in terms of both resources and time.
|
||||
|
||||
## Regex
|
||||
## Regexp
|
||||
|
||||
Use the `regexp` query to search for terms that match a regular expression.
|
||||
|
||||
|
@ -441,6 +439,7 @@ GET shakespeare/_search
|
|||
}
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
A few important notes:
|
||||
|
||||
|
|
|
@ -2,7 +2,7 @@
|
|||
layout: default
|
||||
title: Text analyzers
|
||||
parent: Query DSL
|
||||
nav_order: 41
|
||||
nav_order: 75
|
||||
---
|
||||
|
||||
|
||||
|
@ -16,14 +16,14 @@ OpenSearch provides several text analyzers to convert your structured text into
|
|||
|
||||
OpenSearch supports the following text analyzers:
|
||||
|
||||
1. **Standard analyzer** – Parses strings into terms at word boundaries per the Unicode text segmentation algorithm. It removes most, but not all, punctuation. It converts strings to lowercase. You can remove stop words if you turn on that option, but it does not remove stop words by default.
|
||||
1. **Simple analyzer** – Converts strings to lowercase and removes non-letter characters when it splits a string into tokens on any non-letter character.
|
||||
1. **Whitespace analyzer** – Parses strings into terms between each whitespace.
|
||||
1. **Stop analyzer** – Converts strings to lowercase and removes non-letter characters by splitting strings into tokens at each non-letter character. It also removes stop words (e.g., "but" or "this") from strings.
|
||||
1. **Keyword analyzer** – Receives a string as input and outputs the entire string as one term.
|
||||
1. **Pattern analyzer** – Splits strings into terms using regular expressions and supports converting strings to lowercase. It also supports removing stop words.
|
||||
1. **Language analyzer** – Provides analyzers specific to multiple languages.
|
||||
1. **Fingerprint analyzer** – Creates a fingerprint to use as a duplicate detector.
|
||||
- **Standard analyzer** – Parses strings into terms at word boundaries according to the Unicode text segmentation algorithm. It removes most, but not all, punctuation and converts strings to lowercase. You can remove stop words if you enable that option, but it does not remove stop words by default.
|
||||
- **Simple analyzer** – Converts strings to lowercase and removes non-letter characters when it splits a string into tokens on any non-letter character.
|
||||
- **Whitespace analyzer** – Parses strings into terms between each whitespace.
|
||||
- **Stop analyzer** – Converts strings to lowercase and removes non-letter characters by splitting strings into tokens at each non-letter character. It also removes stop words (for example, "but" or "this") from strings.
|
||||
- **Keyword analyzer** – Receives a string as input and outputs the entire string as one term.
|
||||
- **Pattern analyzer** – Splits strings into terms using regular expressions and supports converting strings to lowercase. It also supports removing stop words.
|
||||
- **Language analyzer** – Provides analyzers specific to multiple languages.
|
||||
- **Fingerprint analyzer** – Creates a fingerprint to use as a duplicate detector.
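
As a rough illustration of how the tokenization rules in this list differ, the following Python sketch approximates the whitespace, simple, stop, and keyword analyzers. It is not OpenSearch's implementation: the function names and the two-word stop list are assumptions made for this example, and the real analyzers handle many more cases.

```python
import re

# Hypothetical stop list for this sketch; the built-in list is longer.
STOP_WORDS = {"but", "this"}


def whitespace_analyzer(text):
    """Split on runs of whitespace; case and punctuation are kept."""
    return text.split()


def simple_analyzer(text):
    """Split at any non-letter character and lowercase each token."""
    return [token.lower() for token in re.findall(r"[^\W\d_]+", text)]


def stop_analyzer(text):
    """Behave like the simple analyzer, then drop stop words."""
    return [token for token in simple_analyzer(text) if token not in STOP_WORDS]


def keyword_analyzer(text):
    """Return the whole input as a single term."""
    return [text]


sample = "But this Quick-Fox ran 2 miles!"
for analyzer in (whitespace_analyzer, simple_analyzer, stop_analyzer, keyword_analyzer):
    print(analyzer.__name__, analyzer(sample))
```

Running the same string through each function shows where the outputs diverge: only the whitespace and keyword versions keep the digit and the punctuation, and only the stop version drops "but" and "this".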
|
||||
|
||||
The full specialized text analyzers reference is in progress and will be published soon.
|
||||
{: .note }
|
||||
|
|
|
@ -6,7 +6,7 @@ nav_order: 16
|
|||
|
||||
# Reindex data
|
||||
|
||||
After creating an index, you might need to make an extensive change such as adding a new field to every document or combining multiple indices to form a new one. Rather than deleting your index, making the change offline, and then indexing your data all over again, you can use the `reindex` operation.
|
||||
After creating an index, you might need to make an extensive change such as adding a new field to every document or combining multiple indexes to form a new one. Rather than deleting your index, making the change offline, and then indexing your data again, you can use the `reindex` operation.
|
||||
|
||||
With the `reindex` operation, you can copy all or a subset of documents that you select through a query to another index. Reindex is a `POST` operation. In its most basic form, you specify a source index and a destination index.
|
||||
|
||||
|
@ -113,13 +113,13 @@ POST _reindex
|
|||
}
|
||||
```
|
||||
|
||||
For a list of all query operations, see [Full-text queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/).
|
||||
For a list of all query operations, see [Full-text queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index).
|
||||
|
||||
## Combine one or more indices
|
||||
## Combine one or more indexes
|
||||
|
||||
You can combine documents from one or more indices by adding the source indices as a list.
|
||||
You can combine documents from one or more indexes by adding the source indexes as a list.
|
||||
|
||||
This command copies all documents from two source indices to one destination index:
|
||||
This command copies all documents from two source indexes to one destination index:
|
||||
|
||||
```json
|
||||
POST _reindex
|
||||
|
@ -135,7 +135,7 @@ POST _reindex
|
|||
}
|
||||
}
|
||||
```
|
||||
Make sure the number of shards for your source and destination indices are the same.
|
||||
Make sure the number of shards for your source and destination indexes is the same.
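
The request body for combining indexes can also be assembled programmatically. The following Python sketch builds a `_reindex` body with a list of source indexes and a single destination. The helper name `build_reindex_body` and the index names are assumptions for illustration only, and sending the request is left to whichever client you use.

```python
import json


def build_reindex_body(source_indexes, destination_index):
    """Return a _reindex body that copies every source index into one destination."""
    return {
        "source": {"index": list(source_indexes)},
        "dest": {"index": destination_index},
    }


# Hypothetical index names used only for this example.
body = build_reindex_body(["source_1", "source_2"], "destination")
print(json.dumps(body, indent=2))
```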
|
||||
|
||||
## Reindex only unique documents
|
||||
|
||||
|
@ -246,7 +246,7 @@ You can specify the following options for your source index:
|
|||
|
||||
Option | Valid values | Description | Required
|
||||
:--- | :--- | :--- | :---
|
||||
`index` | String | The name of the source index. You can provide multiple source indices as a list. | Yes
|
||||
`index` | String | The name of the source index. You can provide multiple source indexes as a list. | Yes
|
||||
`max_docs` | Integer | The maximum number of documents to reindex. | No
|
||||
`query` | Object | The search query to use for the reindex operation. | No
|
||||
`size` | Integer | The number of documents to reindex. | No
|
||||
|
|
|
@ -42,7 +42,7 @@ GET shakespeare/_search
|
|||
}
|
||||
```
|
||||
|
||||
To make the word order and relative positions flexible, specify a `slop` value. To learn about the `slop` option, see [Other advanced options]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text#other-advanced-options).
|
||||
To make the word order and relative positions flexible, specify a `slop` value. To learn about the `slop` option, see [Other advanced options]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index#other-advanced-options).
|
||||
|
||||
Prefix matching doesn’t require any special mappings. It works with your data as is.
|
||||
However, it’s a fairly resource-intensive operation. A prefix of `a` could match hundreds of thousands of terms and not be useful to your user.
|
||||
|
@ -63,7 +63,7 @@ GET shakespeare/_search
|
|||
}
|
||||
```
|
||||
|
||||
To learn about the `max_expansions` option, see [Other advanced options]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text#other-advanced-options).
|
||||
To learn about the `max_expansions` option, see [Other advanced options]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index#other-advanced-options).
|
||||
|
||||
The ease of implementing query-time autocomplete comes at the cost of performance.
|
||||
When implementing this feature on a large scale, we recommend an index-time solution. With an index-time solution, you might experience slower indexing, but it’s a price you pay only once and not for every query. The edge n-gram, search-as-you-type, and completion suggester methods are index-time solutions.
|
||||
|
|
|
@ -154,7 +154,7 @@ Format name and description | Pattern and examples
|
|||
|
||||
## Custom formats
|
||||
|
||||
You can create custom formats for date fields. For example, the following request specifies a date in the common "MM/dd/yyyy" format.
|
||||
You can create custom formats for date fields. For example, the following request specifies a date in the common "MM/dd/yyyy" format:
|
||||
|
||||
```json
|
||||
PUT testindex
|
||||
|
@ -217,7 +217,7 @@ GET testindex/_search
|
|||
|
||||
## Date math
|
||||
|
||||
The date field type supports using date math to specify duration in queries. For example, the `gt`, `gte`, `lt`, and `lte` parameters in [range queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/term/#range-query) and the `from` and `to` parameters in [date range aggregations]({{site.url}}{{site.baseurl}}/opensearch/bucket-agg/#range-date_range-ip_range) accept date math expressions.
|
||||
The date field type supports using date math to specify durations in queries. For example, the `gt`, `gte`, `lt`, and `lte` parameters in [range queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/term/#range) and the `from` and `to` parameters in [date range aggregations]({{site.url}}{{site.baseurl}}/opensearch/bucket-agg/#range-date_range-ip_range) accept date math expressions.
|
||||
|
||||
A date math expression contains a fixed date, optionally followed by one or more mathematical expressions. The fixed date may be either `now` (current date and time in milliseconds since the epoch) or a string ending with `||` that specifies a date (for example, `2022-05-18||`). The date must be in the `strict_date_optional_time||epoch_millis` format.
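
To make the structure of a date math expression concrete, the following Python sketch resolves a deliberately small subset of it: an anchor of `now` or a `YYYY-MM-DD||` date, followed by `+` or `-` steps in weeks, days, hours, minutes, or seconds. The function name `resolve_date_math` is an assumption for this example. Month and year units and rounding are not handled, fixed dates are assumed to be UTC, and OpenSearch itself evaluates these expressions on the server.

```python
import re
from datetime import datetime, timedelta, timezone

# Fixed-length units only; months, years, and rounding are deliberately omitted.
_UNITS = {"w": "weeks", "d": "days", "h": "hours", "m": "minutes", "s": "seconds"}
_STEP = re.compile(r"([+-])(\d+)([wdhms])")


def resolve_date_math(expression, now=None):
    """Resolve a simplified date math expression to a UTC datetime."""
    if expression.startswith("now"):
        base = now or datetime.now(timezone.utc)
        steps = expression[len("now"):]
    else:
        date_part, separator, steps = expression.partition("||")
        if not separator:
            raise ValueError("a fixed date must end with '||'")
        base = datetime.fromisoformat(date_part).replace(tzinfo=timezone.utc)
    # Reject anything this sketch does not understand instead of guessing.
    if not re.fullmatch(r"(?:[+-]\d+[wdhms])*", steps):
        raise ValueError(f"unsupported date math: {steps!r}")
    for sign, amount, unit in _STEP.findall(steps):
        delta = timedelta(**{_UNITS[unit]: int(amount)})
        base = base + delta if sign == "+" else base - delta
    return base


print(resolve_date_math("2022-05-18||+1d-2h"))
```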
|
||||
|
||||
|
@ -252,7 +252,7 @@ The following example expressions illustrate using date math:
|
|||
|
||||
### Using date math in a range query
|
||||
|
||||
The following example illustrates using date math in a [range query]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/term/#range-query).
|
||||
The following example illustrates using date math in a [range query]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/term/#range).
|
||||
|
||||
Set up an index with `release_date` mapped as `date`:
|
||||
|
||||
|
|
|
@ -64,7 +64,7 @@ You can use a [Term query](#term-query) or a [Range query](#range-query) to sear
|
|||
|
||||
A term query takes a value and matches all range fields for which the value is within the range.
|
||||
|
||||
The following query will return document 1 because 3.5 is within the range [1.0, 4.0].
|
||||
The following query will return document 1 because 3.5 is within the range [1.0, 4.0]:
|
||||
|
||||
```json
|
||||
GET testindex/_search
|
||||
|
@ -91,7 +91,7 @@ relation | Provides a relation between the query's date range and the document's
|
|||
|
||||
To use a date format other than the field's mapped format in a query, specify it in the `format` field.
|
||||
|
||||
To see the full description of range query usage, including all range query parameters, see [Range query]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/term/#range-query).
|
||||
For a full description of range query usage, including all range query parameters, see [Range query]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/term/#range).
|
||||
{: .tip }
|
||||
|
||||
Query for all graduation dates in 2019, providing the date range in a "MM/dd/yyyy" format:
|
||||
|
|
|
@ -11,10 +11,10 @@ has_math: true
|
|||
Introduced 2.4
|
||||
{: .label .label-purple }
|
||||
|
||||
You can create custom filters using Query domain-specific language (DSL) search options to refine your k-NN searches. You define the filter criteria within the `knn_vector` field's `filter` subsection in your query. You can use any of the OpenSearch Query DSL query types as a filter. This includes the common query types: `term`, `range`, `regexp`, and `wildcard`, as well as custom query types. To include or exclude results, use Boolean query clauses. You can also specify a query point with the `knn_vector` type and search for nearest neighbors that match your filter criteria.
|
||||
You can create custom filters using query domain-specific language (DSL) search options to refine your k-NN searches. You define the filter criteria within the `knn_vector` field's `filter` subsection in your query. You can use any of the OpenSearch query DSL query types as a filter. This includes the common query types: `term`, `range`, `regexp`, and `wildcard`, as well as custom query types. To include or exclude results, use Boolean query clauses. You can also specify a query point with the `knn_vector` type and search for nearest neighbors that match your filter criteria.
|
||||
To run k-NN queries with a filter, the Lucene search engine and Hierarchical Navigable Small World (HNSW) method are required.
|
||||
|
||||
To learn more about how to use Query DSL Boolean query clauses, see [Boolean queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/bool). For more details about the `knn_vector` data type definition, see [k-NN Index]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index/).
|
||||
To learn more about how to use query DSL Boolean query clauses, see [Boolean queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/compound/bool). For more details about the `knn_vector` data type definition, see [k-NN Index]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index/).
|
||||
{: .note }
|
||||
|
||||
## How does a k-NN filter work?
|
||||
|
@ -205,7 +205,7 @@ A very restrictive filter returns the lowest number of documents in your dataset
|
|||
|
||||
### Use case 2: Somewhat restrictive 38% filter
|
||||
|
||||
A somewhat restrictive filter returns 38% of the documents in the data set that you search. For example, the following filter criteria specifies hotels with parking and feedback ratings less than or equal to 8 and returns 5 documents.
|
||||
A somewhat restrictive filter returns 38% of the documents in the dataset that you search. For example, the following filter criteria specify hotels with parking and feedback ratings less than or equal to 8 and return 5 documents:
|
||||
|
||||
```json
|
||||
"filter": {
|
||||
|
@ -230,7 +230,7 @@ A somewhat restrictive filter returns 38% of the documents in the data set that
|
|||
|
||||
### Use case 3: Not very restrictive 80% filter
|
||||
|
||||
A filter that is not very restrictive will return 80% of the documents that you search. For example, the following filter criteria specifies hotels with feedback ratings greater than or equal to 5 and returns 10 documents.
|
||||
A filter that is not very restrictive returns 80% of the documents that you search. For example, the following filter criteria specify hotels with feedback ratings greater than or equal to 5 and return 10 documents:
|
||||
|
||||
```json
|
||||
"filter": {
|
||||
|
@ -254,7 +254,7 @@ You can search with a filter by following these three steps:
|
|||
1. Create an index and specify the Lucene engine and HNSW method requirements in the mapping.
|
||||
1. Add your data to the index.
|
||||
1. Search the index and specify these three items in your query:
|
||||
* One or more filters defined by Query DSL
|
||||
* One or more filters defined by query DSL
|
||||
* A vector reference point defined by the `vector` field
|
||||
* The number of matches you want returned with the `k` field
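
The three query items in the last step can be sketched as a small Python helper that assembles the search body. This is an illustration under assumptions, not the definitive request format: the helper name `build_knn_filter_query`, the `location` and `rating` field names, and the filter values are chosen for this example, so check the layout against the sample requests on this page before relying on it.

```python
import json


def build_knn_filter_query(field, vector, k, filters):
    """Combine a reference vector, k, and query DSL filters into one search body."""
    return {
        "size": k,
        "query": {
            "knn": {
                field: {
                    "vector": list(vector),                       # reference point
                    "k": k,                                       # number of matches
                    "filter": {"bool": {"must": list(filters)}},  # query DSL filters
                }
            }
        },
    }


# Hypothetical field names and values, used only for this example.
query = build_knn_filter_query(
    field="location",
    vector=[5.0, 4.0],
    k=3,
    filters=[
        {"range": {"rating": {"gte": 8, "lte": 10}}},
        {"term": {"parking": "true"}},
    ],
)
print(json.dumps(query, indent=2))
```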
|
||||
|
||||
|
@ -405,9 +405,9 @@ Upon success, you should receive a "200-OK" status with entries for each documen
|
|||
|
||||
## Step 3: Search your data with a filter
|
||||
|
||||
Now you can create a k-NN search that specifies filters by using Query DSL Boolean clauses. You need to include your reference point to search for nearest neighbors. Provide an x-y coordinate for the point within the `vector` field, such as `"vector": [ 5.0, 4.0]`.
|
||||
Now you can create a k-NN search that specifies filters by using query DSL Boolean clauses. You need to include your reference point to search for nearest neighbors. Provide an x-y coordinate for the point within the `vector` field, such as `"vector": [ 5.0, 4.0]`.
|
||||
|
||||
To learn more about how to specify ranges with Query DSL, see [Range query]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/term/#range-query).
|
||||
To learn more about how to specify ranges with query DSL, see [Range query]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/term/#range).
|
||||
{: .note }
|
||||
|
||||
#### Sample request
|
||||
|
@ -525,7 +525,7 @@ Depending on how restrictive you want your filter to be, you can add multiple qu
|
|||
|
||||
#### Sample request
|
||||
|
||||
The following request returns hotels that provide parking. This request illustrates multiple alternative mechanisms to obtain the parking filter criteria. It uses a regular expression for the value `true`, a term query for the key-value pair `"parking":"true"`, a wildcard for the characters that spell "true", and the `must_not` clause to eliminate hotels with "parking" set to `false`.
|
||||
The following request returns hotels that provide parking. This request illustrates multiple alternative mechanisms to obtain the parking filter criteria. It uses a regular expression for the value `true`, a term query for the key-value pair `"parking":"true"`, a wildcard for the characters that spell "true", and the `must_not` clause to eliminate hotels with "parking" set to `false`:
|
||||
|
||||
```json
|
||||
POST /hotels-index/_search
|
||||
|
|
|
@ -9,7 +9,7 @@ nav_order: 11
|
|||
|
||||
Use SQL commands for full-text search. The SQL plugin supports a subset of full-text queries available in OpenSearch.
|
||||
|
||||
To learn about full-text queries in OpenSearch, see [Full-text queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/).
|
||||
To learn about full-text queries in OpenSearch, see [Full-text queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index).
|
||||
|
||||
## Match
|
||||
|
||||
|
@ -36,7 +36,7 @@ You can specify the following options in any order:
|
|||
- `zero_terms_query`
|
||||
- `boost`
|
||||
|
||||
Please, refer to `match` query [documentation]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/#match) for parameter description and supported values.
|
||||
Refer to the `match` query [documentation]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index#match) for parameter descriptions and supported values.
|
||||
|
||||
### Example 1: Search the `message` field for the text "this is a test":
|
||||
|
||||
|
@ -224,7 +224,7 @@ You can specify the following options for `QUERY_STRING` in any order:
|
|||
- `tie_breaker`
|
||||
- `time_zone`
|
||||
|
||||
Please, refer to `query_string` query [documentation]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/#query-string) for parameter description and supported values.
|
||||
Refer to the `query_string` query [documentation]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index#query-string) for parameter descriptions and supported values.
|
||||
|
||||
### Example of using `query_string` in SQL and PPL queries:
|
||||
|
||||
|
@ -281,7 +281,7 @@ The `MATCHPHRASE`/`MATCH_PHRASE` functions let you specify the following options
|
|||
- `zero_terms_query`
|
||||
- `boost`
|
||||
|
||||
Please, refer to `match_phrase` query [documentation]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/#match-phrase) for parameter description and supported values.
|
||||
Refer to the `match_phrase` query [documentation]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index#match-phrase) for parameter descriptions and supported values.
|
||||
|
||||
### Example of using `match_phrase` in SQL and PPL queries:
|
||||
|
||||
|
@ -349,7 +349,7 @@ You can specify the following options for `SIMPLE_QUERY_STRING` in any order:
|
|||
- `minimum_should_match`
|
||||
- `quote_field_suffix`
|
||||
|
||||
Please, refer to `simple_query_string` query [documentation]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/#simple-query-string) to check parameter meanings and available values.
|
||||
Refer to the `simple_query_string` query [documentation]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index#simple-query-string) for parameter descriptions and supported values.
|
||||
|
||||
### *Example* of using `simple_query_string` in SQL and PPL queries:
|
||||
|
||||
|
@ -400,7 +400,7 @@ The `MATCH_PHRASE_PREFIX` function lets you specify the following options in any
|
|||
- `zero_terms_query`
|
||||
- `boost`
|
||||
|
||||
Please, refer to `match_phrase_prefix` query [documentation]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/#match-phrase-prefix) for parameter description and supported values.
|
||||
Refer to the `match_phrase_prefix` query [documentation]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index#match-phrase-prefix) for parameter descriptions and supported values.
|
||||
|
||||
### *Example* of using `match_phrase_prefix` in SQL and PPL queries:
|
||||
|
||||
|
@ -456,7 +456,7 @@ The `MATCH_BOOL_PREFIX` function lets you specify the following options in any o
|
|||
- `analyzer`
|
||||
- `operator`
|
||||
|
||||
Please, refer to `match_bool_prefix` query [documentation]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/#match-boolean-prefix) for parameter description and supported values.
|
||||
Refer to the `match_bool_prefix` query [documentation]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index#match-boolean-prefix) for parameter descriptions and supported values.
|
||||
|
||||
### Example of using `match_bool_prefix` in SQL and PPL queries:
|
||||
|
||||
|
|
|
@ -191,8 +191,8 @@ Specify a condition to filter the results.
|
|||
`>=` | Greater than or equal to.
|
||||
`<=` | Less than or equal to.
|
||||
`IN` | Specify multiple `OR` operators.
|
||||
`BETWEEN` | Similar to a range query. For more information about range queries, see [Range query]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/term#range-query).
|
||||
`LIKE` | Use for full text search. For more information about full-text queries, see [Full-text queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/).
|
||||
`BETWEEN` | Similar to a range query. For more information about range queries, see [Range query]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/term#range).
|
||||
`LIKE` | Use for full-text search. For more information about full-text queries, see [Full-text queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index).
|
||||
`IS NULL` | Check if the field value is `NULL`.
|
||||
`IS NOT NULL` | Check if the field value is not `NULL`.
|
||||
|
||||
|
|
|
@ -220,7 +220,7 @@ Get all SM policies:
|
|||
```json
|
||||
GET _plugins/_sm/policies
|
||||
```
|
||||
You can use a [query string]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/#query-string) and specify pagination, the field to be sorted by, and sort order:
|
||||
You can use a [query string]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index#query-string) and specify pagination, the field to be sorted by, and sort order:
|
||||
|
||||
```json
|
||||
GET _plugins/_sm/policies?from=0&size=20&sortField=sm_policy.name&sortOrder=desc&queryString=*
|
||||
|
|