Added delete by query and index document to document APIs

This commit is contained in:
keithhc2 2021-08-06 15:52:53 -07:00
parent a3808e858a
commit eb51f06694
4 changed files with 220 additions and 2 deletions

View File

@ -3,7 +3,7 @@ layout: default
title: Bulk
parent: Document APIs
grand_parent: REST API reference
nav_order: 25
nav_order: 20
---
# Bulk

View File

@ -0,0 +1,125 @@
---
layout: default
title: Delete by query
parent: Document APIs
grand_parent: REST API reference
nav_order: 25
---
# Delete by query
Introduced 1.0
{: .label .label-purple}
You can include a query as part of your delete request so OpenSearch deletes all documents that match that query.
## Example
```json
POST sample-index1/_delete_by_query
{
  "query": {
    "match": {
      "movie-length": "124"
    }
  }
}
```
## Path and HTTP methods
```
POST <target>/_delete_by_query
```
## URL parameters
All URL parameters are optional.
Parameter | Type | Description
:--- | :--- | :---
&lt;index&gt; | String | Name of the data streams, indices, or aliases to delete from. Supports wildcards. If left blank, OpenSearch searches through all indices.
allow_no_indices | Boolean | If `false`, the request returns an error when any wildcard expression or index alias targets only missing or closed indices. Default is `true`.
analyzer | String | The analyzer to use in the query string.
analyze_wildcard | Boolean | Specifies whether to analyze wildcard and prefix queries. Default is false.
conflicts | String | Indicates to OpenSearch what should happen if the delete by query operation runs into a version conflict. Valid options are `abort` and `proceed`. Default is `abort`.
default_operator | String | Indicates whether the default operator for a string query should be AND or OR. Default is OR.
df | String | The default field in case a field prefix is not provided in the query string.
expand_wildcards | String | Specifies the type of index that wildcard expressions can match. Supports comma-separated values. Valid values are `all` (match any index), `open` (match open, non-hidden indices), `closed` (match closed, non-hidden indices), `hidden` (match hidden indices), and `none` (deny wildcard expressions). Default is `open`.
from | Integer | The starting index to search from. Default is 0.
ignore_unavailable | Boolean | Specifies whether to include missing or closed indices in the response. Default is false.
lenient | Boolean | Specifies whether OpenSearch should ignore format-based query failures (for example, querying a text field for an integer). Default is false.
max_docs | Integer | Maximum number of documents the operation should process. Default is all documents.
preference | String | Specifies the shard or node OpenSearch should perform the operation on.
q | String | Query in the Lucene query string syntax.
request_cache | Boolean | Specifies whether OpenSearch should use the request cache for the request. Defaults to the request cache setting of the index.
refresh | Boolean | Specifies whether OpenSearch should refresh all of the shards involved in the delete request once the operation finishes. Default is false.
requests_per_second | Integer | Specifies the request's throttling in sub-requests per second. Default is -1, which means no throttling.
routing | String | Value used to route the operation to a specific shard.
scroll | Time | Amount of time to keep the search results of documents that matched the query.
scroll_size | Integer | Size of the scroll request of the operation. Default is 1000.
search_type | String | Whether OpenSearch should use global term and document frequencies when calculating relevance scores. Valid choices are `query_then_fetch` and `dfs_query_then_fetch`. `query_then_fetch` scores documents using local term and document frequencies for the shard. It's usually faster but less accurate. `dfs_query_then_fetch` scores documents using global term and document frequencies across all shards. It's usually slower but more accurate. Default is `query_then_fetch`.
search_timeout | Time | Amount of time until timeout for the search request. Default is no timeout.
slices | Integer | Number of sub-tasks OpenSearch should divide this task into. Default is 1, which means OpenSearch should not divide this task.
sort | String | A comma-separated list of &lt;field&gt; : &lt;direction&gt; pairs to sort by.
_source | String | Specifies whether to include the `_source` field in the response.
_source_excludes | String | A comma-separated list of source fields to exclude from the response.
_source_includes | String | A comma-separated list of source fields to include in the response.
stats | String | Value to associate with the request for additional logging.
terminate_after | Integer | The maximum number of documents OpenSearch should process before terminating the request.
timeout | Time | How long the operation should wait for a response from active shards. Default is `1m`.
version | Boolean | Whether to include the document version as a match.
wait_for_active_shards | Integer | The number of shards that must be active before OpenSearch executes the operation. Valid values are `all` or any integer up to the total number of shards in the index. Default is 1, which is the primary shard.
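
The URL parameters above are passed in the query string of the request. As a minimal sketch, the following Python snippet only assembles such a URL and sends nothing. The helper name `delete_by_query_url` and the cluster address `http://localhost:9200` are assumptions made for illustration and are not part of OpenSearch.

```python
from urllib.parse import quote, urlencode


def delete_by_query_url(base_url, target, **params):
    """Build the URL for POST <target>/_delete_by_query (hypothetical helper)."""
    # Serialize booleans as lowercase true/false and drop unset parameters.
    encoded = {
        key: str(value).lower() if isinstance(value, bool) else value
        for key, value in params.items()
        if value is not None
    }
    # Keep commas and wildcards readable in the target, since it supports both.
    path = f"{base_url.rstrip('/')}/{quote(target, safe=',*')}/_delete_by_query"
    return f"{path}?{urlencode(encoded)}" if encoded else path


url = delete_by_query_url(
    "http://localhost:9200",  # assumed local cluster address
    "sample-index1",
    conflicts="proceed",
    refresh=True,
    requests_per_second=500,
)
print(url)
```

The resulting URL can be used with any HTTP client in a `POST` request, with the query from the request body section as the payload.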
## Request body
To search your index for specific documents, you must include a [query]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/index) in the request body that OpenSearch uses to match documents. If you don't use a query, OpenSearch treats your delete request as a simple [delete document operation]({{site.url}}{{site.baseurl}}/opensearch/rest-api/document-apis/delete-document).
```json
{
  "query": {
    "match": {
      "movie-length": "124"
    }
  }
}
```
## Response
```json
{
  "took": 143,
  "timed_out": false,
  "total": 1,
  "deleted": 1,
  "batches": 1,
  "version_conflicts": 0,
  "noops": 0,
  "retries": {
    "bulk": 0,
    "search": 0
  },
  "throttled_millis": 0,
  "requests_per_second": -1.0,
  "throttled_until_millis": 0,
  "failures": []
}
```
## Response body fields
Field | Description
:--- | :---
took | The amount of time in milliseconds OpenSearch needed to complete the operation.
timed_out | Whether any delete requests during the operation timed out.
total | Total number of documents processed.
deleted | Total number of documents deleted.
batches | Number of scroll responses the request processed.
version_conflicts | Number of conflicts the request ran into.
noops | How many delete requests OpenSearch ignored during the operation. This field always returns 0.
retries | The number of bulk and search retry requests.
throttled_millis | Number of throttled milliseconds during the request.
requests_per_second | Number of requests executed per second during the operation.
throttled_until_millis | The amount of time until OpenSearch executes the next throttled request. Always equal to 0 in a delete by query request.
failures | Any failures that occur during the request.
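
A client will usually want to check a few of these fields before treating the operation as complete. The following Python sketch parses the sample response shown above. The function name `delete_by_query_succeeded` and the choice of which fields count as a problem are assumptions for illustration, not rules defined by OpenSearch.

```python
import json


def delete_by_query_succeeded(response):
    """Return True if there were no timeouts, failures, or version conflicts."""
    return (
        not response["timed_out"]
        and not response["failures"]
        and response["version_conflicts"] == 0
    )


# The sample response from this page, parsed into a dictionary.
sample = json.loads("""
{
  "took": 143,
  "timed_out": false,
  "total": 1,
  "deleted": 1,
  "batches": 1,
  "version_conflicts": 0,
  "noops": 0,
  "retries": {"bulk": 0, "search": 0},
  "throttled_millis": 0,
  "requests_per_second": -1.0,
  "throttled_until_millis": 0,
  "failures": []
}
""")

ok = delete_by_query_succeeded(sample)
print(ok, sample["deleted"], "of", sample["total"], "documents deleted")
```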

View File

@ -0,0 +1,93 @@
---
layout: default
title: Index document
parent: Document APIs
grand_parent: REST API reference
nav_order: 1
---
# Index document
Introduced 1.0
{: .label .label-purple}
Before you can search for data, you must first add a document by indexing it.
## Example
```json
PUT sample-index/_doc/1
{
  "Description": "This is just a sample document"
}
```
## Path and HTTP methods
```
PUT <index>/_doc/<_id>
POST <index>/_doc
PUT <index>/_create/<_id>
POST <index>/_create/<_id>
```
## URL parameters
In your request, you must specify the index you want to add your document to. If the index doesn't already exist, OpenSearch automatically creates the index and adds your document. All other URL parameters are optional.
Parameter | Type | Description | Required
:--- | :--- | :--- | :---
&lt;index&gt; | String | Name of the index. | Yes
&lt;_id&gt; | String | A unique identifier to attach to the document. To automatically generate an ID, use `POST <index>/_doc` in your request. | No
if_seq_no | Integer | Only perform the operation if the document has the specified sequence number. | No
if_primary_term | Integer | Only perform the operation if the document has the specified primary term. | No
op_type | Enum | Specifies the type of operation to complete with the document. Valid values are `create` (index the document only if no document with that ID already exists) and `index`. If a document ID is included in the request, then the default is `index`. Otherwise, the default is `create`. | No
pipeline | String | ID used to route the indexing operation to a certain pipeline. | No
routing | String | Value used to assign operations to specific shards. | No
timeout | Time | How long to wait for a response from the cluster. Default is `1m`. | No
version | Integer | The document's version number. | No
version_type | Enum | Assigns a specific type of versioning to the document. Valid options are `external` (index the document only if the specified version number is greater than the document's current version) and `external_gte` (index the document only if the specified version number is greater than or equal to the document's current version). For example, to index version 3 of a document, use `/_doc/1?version=3&version_type=external`. | No
wait_for_active_shards | String | The number of active shards that must be available before OpenSearch processes the request. Default is 1 (only the primary shard). Set to `all` or a positive integer. Values greater than 1 require replicas. For example, if you specify a value of 3, the index must have two replicas distributed across two additional nodes for the operation to succeed. | No
require_alias | Boolean | Specifies whether the target index must be an index alias. Default is false. | No
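
The path and URL parameters above combine into a single request line. The following Python sketch chooses between `PUT <index>/_doc/<_id>` and `POST <index>/_doc` and appends optional parameters such as `if_seq_no` and `if_primary_term`. The helper name `index_document_request` is an assumption for illustration, and the snippet only builds the method and path without contacting a cluster.

```python
from urllib.parse import quote, urlencode


def index_document_request(index, doc_id=None, **params):
    """Return (method, path) for an index document request (hypothetical helper)."""
    # With an ID, use PUT <index>/_doc/<_id>. Without one, POST <index>/_doc
    # lets OpenSearch generate the ID.
    method = "PUT" if doc_id is not None else "POST"
    path = f"/{quote(index, safe='')}/_doc"
    if doc_id is not None:
        path += f"/{quote(str(doc_id), safe='')}"
    # Drop unset parameters and append the rest as a query string.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    if query:
        path = f"{path}?{query}"
    return method, path


# Overwrite document 1 only if it still has sequence number 0 and primary term 1.
method, path = index_document_request(
    "sample-index", 1, if_seq_no=0, if_primary_term=1
)
print(method, path)
```

Passing both `if_seq_no` and `if_primary_term` in this way makes the write conditional on the document not having changed since it was last read.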
## Request body
Your request body must contain the information you want to index.
```json
{
  "Description": "This is just a sample document"
}
```
## Response
```json
{
  "_index": "sample-index",
  "_type": "_doc",
  "_id": "1",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 0,
  "_primary_term": 1
}
```
## Response body fields
Field | Description
:--- | :---
_index | The name of the index.
_type | The document's type. OpenSearch supports only one type, which is `_doc`.
_id | The document's ID.
_version | The document's version.
result | The result of the index operation.
_shards | Detailed information about the cluster's shards.
total | The total number of shards.
successful | The number of shards OpenSearch successfully added the document to.
failed | The number of shards OpenSearch failed to add the document to.
_seq_no | The sequence number assigned when the document was indexed.
_primary_term | The primary term assigned when the document was indexed.
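
As a rough sketch of how a client might read these fields, the following Python snippet summarizes the sample response shown above. The function name `describe_index_response` and the keys of the summary it returns are assumptions for illustration. In particular, `unreported_shards` is simply `total` minus `successful` minus `failed`, and this page does not say how such shards should be interpreted.

```python
def describe_index_response(response):
    """Summarize an index document response (hypothetical helper)."""
    shards = response["_shards"]
    return {
        # "created" is the result value in the sample response on this page.
        "created": response["result"] == "created",
        "shard_failures": shards["failed"] > 0,
        # Shards counted in total that reported neither success nor failure.
        "unreported_shards": shards["total"] - shards["successful"] - shards["failed"],
    }


sample = {
    "_index": "sample-index",
    "_type": "_doc",
    "_id": "1",
    "_version": 1,
    "result": "created",
    "_shards": {"total": 2, "successful": 1, "failed": 0},
    "_seq_no": 0,
    "_primary_term": 1,
}

summary = describe_index_response(sample)
print(summary)
```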

View File

@ -3,7 +3,7 @@ layout: default
title: Multi-get document
parent: Document APIs
grand_parent: REST API reference
nav_order: 20
nav_order: 25
---
# Multi-get documents