opensearch-docs-cn/_api-reference/document-apis/bulk.md

190 lines
6.5 KiB
Markdown
Raw Normal View History

2021-05-28 13:48:19 -04:00
---
layout: default
title: Bulk
parent: Document APIs
nav_order: 20
redirect_from:
- /opensearch/rest-api/document-apis/bulk/
2021-05-28 13:48:19 -04:00
---
# Bulk
**Introduced 1.0**
2021-07-26 19:14:22 -04:00
{: .label .label-purple }
2021-05-28 13:48:19 -04:00
The bulk operation lets you add, update, or delete multiple documents in a single request. Compared to individual OpenSearch indexing requests, the bulk operation has significant performance benefits. Whenever practical, we recommend batching indexing operations into bulk requests.
2021-05-28 13:48:19 -04:00
Beginning in OpenSearch 2.9, when indexing documents using the bulk operation, the document `_id` must be 512 bytes or less in size.
{: .note}
2021-05-28 13:48:19 -04:00
## Example
```json
POST _bulk
{ "delete": { "_index": "movies", "_id": "tt2229499" } }
{ "index": { "_index": "movies", "_id": "tt1979320" } }
{ "title": "Rush", "year": 2013 }
{ "create": { "_index": "movies", "_id": "tt1392214" } }
{ "title": "Prisoners", "year": 2013 }
{ "update": { "_index": "movies", "_id": "tt0816711" } }
{ "doc" : { "title": "World War Z" } }
```
{% include copy-curl.html %}
2021-05-28 13:48:19 -04:00
## Path and HTTP methods
```
POST _bulk
POST <index>/_bulk
2021-05-28 13:48:19 -04:00
```
Make API reference top level (#1637) * Make API reference top level Signed-off-by: Naarcha-AWS <naarcha@amazon.com> * Fix typo on Drag and Drop page (#1633) * Fix typo on Drag and Drop page * Update _dashboards/drag-drop-wizard.md Co-authored-by: Nate Bower <nbower@amazon.com> * Update drag-drop-wizard.md Co-authored-by: Nate Bower <nbower@amazon.com> * Putting all the Docker install material on a single page (#1452) * Putting all the Docker install material on a single page Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Making room for revamp Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Intro added Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Continuing to flesh out the intro section and overview Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Overview finalized Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Introducing docker compose Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Added link to compose Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Continuing docker image commentary Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Sometimes I wonder if anyone reads these Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Adding notes on installing compose with pip Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Adding prereqs Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Magnets - how do they work? Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Almonds and peaches are part of the same plant subgenus, Amygdalus Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * There are 293 ways to make change for a dollar Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * A shark is the only known fish that can blink with both eyes Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * A crocodile cannot stick its tongue out Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * wording Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Reorganizing a couple paragraphs to make it flow better Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Forgot a word Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Add tip about pruning stopped containers Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Cleaning up Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Add blurb about container ls Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Adding the Docker Compose stuff Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Working on compose Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Continuing work on the compose section - it's a lot of info Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Added important settings Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Updates to settings that need configured Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Still working through compose things Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Fixed wording Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Working through compose commands and guidance Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Reordering/rewording Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * More phrasing Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * More wording in steps Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * More wording in steps Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Organizing Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Adding stuff and things Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Continuing to work through the configuration steps Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Fixes Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Fixes Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Still working on the configuration steps Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Changes Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * More work Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Removed perf analyzer - refer to GH issue 1555 Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Fixing things Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Adding guidance on passing settings in compose Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Working through dockerfile materials now Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * wording Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Finalized the sample dev compose file Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Continuing work with configuration Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Finished - ready for reviews Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Fixed a link I forgot to change before Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Changes from first proofread Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Changed heading Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Addressed reviewer comments and made some changes Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Forgot to incorporate one change. Fixed. Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Final editorial changes Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * fix#1584-custom_attr_allowlist (#1636) Signed-off-by: cwillum <cwmmoore@amazon.com> Signed-off-by: cwillum <cwmmoore@amazon.com> * Update TERMS.md with definition for Setting (#1632) * fix#1631-Terms-setting Signed-off-by: cwillum <cwmmoore@amazon.com> * fix#1631-Terms-setting Signed-off-by: cwillum <cwmmoore@amazon.com> Signed-off-by: cwillum <cwmmoore@amazon.com> * Add disclaimer about remote fs usage and an example of setting env var (#1644) * Add disclaimer about remote fs usage and an example of setting env var Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * Enhanced wording a little bit Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> * [DOC] New documentation: Self-host maps server (#1625) * Add new page self-host maps server Signed-off-by: vagimeli <vagimeli@amazon.com> * Added new content Signed-off-by: vagimeli <vagimeli@amazon.com> * Copy edit Signed-off-by: vagimeli <vagimeli@amazon.com> * Tech review edits Signed-off-by: vagimeli <vagimeli@amazon.com> * Doc review edits Signed-off-by: vagimeli <vagimeli@amazon.com> * Editorial review changes Signed-off-by: vagimeli <vagimeli@amazon.com> * Final edits Signed-off-by: vagimeli <vagimeli@amazon.com> Signed-off-by: vagimeli <vagimeli@amazon.com> * Add feedback. Signed-off-by: Naarcha-AWS <naarcha@amazon.com> * Fix links Signed-off-by: Naarcha-AWS <naarcha@amazon.com> Signed-off-by: Naarcha-AWS <naarcha@amazon.com> Signed-off-by: JeffH-AWS <jeffhuss@amazon.com> Signed-off-by: cwillum <cwmmoore@amazon.com> Signed-off-by: vagimeli <vagimeli@amazon.com> Co-authored-by: Nate Bower <nbower@amazon.com> Co-authored-by: Jeff Huss <jeffhuss@amazon.com> Co-authored-by: Chris Moore <107723039+cwillum@users.noreply.github.com> Co-authored-by: Melissa Vagi <105296784+vagimeli@users.noreply.github.com>
2022-10-27 12:50:39 -04:00
Specifying the index in the path means you don't need to include it in the [request body]({{site.url}}{{site.baseurl}}/api-reference/document-apis/bulk/#request-body).
2021-05-28 13:48:19 -04:00
OpenSearch also accepts PUT requests to the `_bulk` path, but we highly recommend using POST. The accepted usage of PUT---adding or replacing a single resource at a given path---doesn't make sense for bulk requests.
{: .note }
## URL parameters
All bulk URL parameters are optional.
Parameter | Type | Description
:--- | :--- | :---
pipeline | String | The pipeline ID for preprocessing documents.
refresh | Enum | Whether to refresh the affected shards after performing the indexing operations. Default is `false`. `true` makes the changes show up in search results immediately, but hurts cluster performance. `wait_for` waits for a refresh. Requests take longer to return, but cluster performance doesn't suffer.
require_alias | Boolean | Set to `true` to require that all actions target an index alias rather than an index. Default is `false`.
routing | String | Routes the request to the specified shard.
timeout | Time | How long to wait for the request to return. Default `1m`.
type | String | (Deprecated) The default document type for documents that don't specify a type. Default is `_doc`. We highly recommend ignoring this parameter and using a type of `_doc` for all indexes.
2021-05-28 13:48:19 -04:00
wait_for_active_shards | String | Specifies the number of active shards that must be available before OpenSearch processes the bulk request. Default is 1 (only the primary shard). Set to `all` or a positive integer. Values greater than 1 require replicas. For example, if you specify a value of 3, the index must have two replicas distributed across two additional nodes for the request to succeed.
{% comment %}_source | List | asdf
_source_excludes | list | asdf
_source_includes | list | asdf{% endcomment %}
## Request body
The bulk request body follows this pattern:
```
Action and metadata\n
Optional document\n
Action and metadata\n
Optional document\n
```
The optional JSON document doesn't need to be minified---spaces are fine---but it does need to be on a single line. OpenSearch uses newline characters to parse bulk requests and requires that the request body end with a newline character.
All actions support the same metadata: `_index`, `_id`, and `_require_alias`. If you don't provide an ID, OpenSearch generates one automatically, which can make it challenging to update the document at a later time.
- Create
Creates a document if it doesn't already exist and returns an error otherwise. The next line must include a JSON document:
2021-05-28 13:48:19 -04:00
```json
{ "create": { "_index": "movies", "_id": "tt1392214" } }
{ "title": "Prisoners", "year": 2013 }
```
- Delete
This action deletes a document if it exists. If the document doesn't exist, OpenSearch doesn't return an error but instead returns `not_found` under `result`. Delete actions don't require documents on the next line:
2021-05-28 13:48:19 -04:00
```json
{ "delete": { "_index": "movies", "_id": "tt2229499" } }
```
- Index
Index actions create a document if it doesn't yet exist and replace the document if it already exists. The next line must include a JSON document:
2021-05-28 13:48:19 -04:00
```json
{ "index": { "_index": "movies", "_id": "tt1979320" } }
{ "title": "Rush", "year": 2013}
```
- Update
By default, this action updates existing documents and returns an error if the document doesn't exist. The next line must include a full or partial JSON document, depending on how much of the document you want to update:
2021-05-28 13:48:19 -04:00
```json
{ "update": { "_index": "movies", "_id": "tt0816711" } }
{ "doc" : { "title": "World War Z" } }
```
To upsert a document, specify `doc_as_upsert` as `true`. If a document exists, it is updated; if it does not exist, a new document is indexed with the parameters specified in the `doc` field:
2021-05-28 13:48:19 -04:00
- Upsert
```json
{ "update": { "_index": "movies", "_id": "tt0816711" } }
{ "doc" : { "title": "World War Z" }, "doc_as_upsert": true }
```
You can specify a script for more complex document updates:
- Script
```json
{ "update": { "_index": "movies", "_id": "tt0816711" } }
{ "script" : { "source": "ctx._source.title = \"World War Z\"" } }
```
2021-05-28 13:48:19 -04:00
## Response
In the response, pay particular attention to the top-level `errors` boolean. If true, you can iterate over the individual actions for more detailed information.
```json
{
"took": 11,
"errors": true,
"items": [
{
"index": {
"_index": "movies",
"_id": "tt1979320",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 1,
"_primary_term": 1,
"status": 201
}
},
{
"create": {
"_index": "movies",
"_id": "tt1392214",
"status": 409,
"error": {
"type": "version_conflict_engine_exception",
"reason": "[tt1392214]: version conflict, document already exists (current version [1])",
"index": "movies",
"shard": "0",
"index_uuid": "yhizhusbSWmP0G7OJnmcLg"
}
}
},
{
"update": {
"_index": "movies",
"_id": "tt0816711",
"status": 404,
"error": {
"type": "document_missing_exception",
"reason": "[_doc][tt0816711]: document missing",
"index": "movies",
"shard": "0",
"index_uuid": "yhizhusbSWmP0G7OJnmcLg"
}
}
}
]
}
```