From 7aee5fc916d9e1117b50b66775707ac9f63d653d Mon Sep 17 00:00:00 2001
From: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>
Date: Wed, 19 Jul 2023 15:43:59 -0700
Subject: [PATCH] Add document limits to index and bulk pages (#4537)

* Add document limits to index and bulk pages

Signed-off-by: Naarcha-AWS

* Apply suggestions from code review

Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>

* Apply suggestions from code review

Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Nathan Bower
Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>

---------

Signed-off-by: Naarcha-AWS
Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>
Co-authored-by: Nathan Bower
---
 _api-reference/document-apis/bulk.md | 3 +++
 _im-plugin/index.md                  | 4 ++++
 2 files changed, 7 insertions(+)

diff --git a/_api-reference/document-apis/bulk.md b/_api-reference/document-apis/bulk.md
index a4b63706..efb52db7 100644
--- a/_api-reference/document-apis/bulk.md
+++ b/_api-reference/document-apis/bulk.md
@@ -14,6 +14,9 @@ Introduced 1.0
 The bulk operation lets you add, update, or delete multiple documents in a single request. Compared to individual OpenSearch indexing requests, the bulk operation has significant performance benefits. Whenever practical, we recommend batching indexing operations into bulk requests.
+Beginning in OpenSearch 2.9, when indexing documents using the bulk operation, the document `_id` must be 512 bytes or less in size.
+{: .note}
+
 ## Example
 ```json
diff --git a/_im-plugin/index.md b/_im-plugin/index.md
index fb6cc8c9..5804a236 100644
--- a/_im-plugin/index.md
+++ b/_im-plugin/index.md
@@ -16,6 +16,8 @@ You index data using the OpenSearch REST API. Two APIs exist: the index API and
 For situations in which new data arrives incrementally (for example, customer orders from a small business), you might use the index API to add documents individually as they arrive. For situations in which the flow of data is less frequent (for example, weekly updates to a marketing website), you might prefer to generate a file and send it to the `_bulk` API. For large numbers of documents, lumping requests together and using the `_bulk` API offers superior performance. If your documents are enormous, however, you might need to index them individually.
+When indexing documents, the document `_id` must be 512 bytes or less in size.
+
 ## Introduction to indexing
@@ -91,6 +93,8 @@ OpenSearch indexes have the following naming restrictions:
 `:`, `"`, `*`, `+`, `/`, `\`, `|`, `?`, `#`, `>`, or `<`
+
+
 ## Read data
 After you index a document, you can retrieve it by sending a GET request to the same endpoint that you used for indexing: