--
:api: bulk
:request: BulkRequest
:response: BulkResponse
--

[id="{upid}-{api}"]
=== Bulk API

NOTE: The Java High Level REST Client provides the
<<{upid}-{api}-processor>> to assist with bulk requests.

[id="{upid}-{api}-request"]
==== Bulk Request

A +{request}+ can be used to execute multiple index, update and/or delete
operations using a single request.

It requires at least one operation to be added to the Bulk request:

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-request]
--------------------------------------------------
<1> Creates the +{request}+
<2> Adds a first `IndexRequest` to the Bulk request. See <<{upid}-index>> for
more information on how to build `IndexRequest`.
<3> Adds a second `IndexRequest`
<4> Adds a third `IndexRequest`

WARNING: The Bulk API supports only documents encoded in JSON or SMILE.
Providing documents in any other format will result in an error.

Different operation types can be added to the same +{request}+:

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-request-with-mixed-operations]
--------------------------------------------------
<1> Adds a `DeleteRequest` to the +{request}+. See <<{upid}-delete>>
for more information on how to build `DeleteRequest`.
<2> Adds an `UpdateRequest` to the +{request}+. See <<{upid}-update>>
for more information on how to build `UpdateRequest`.
<3> Adds an `IndexRequest` using the SMILE format

==== Optional arguments
The following arguments can optionally be provided:

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-request-timeout]
--------------------------------------------------
<1> Timeout to wait for the bulk request to be performed as a `TimeValue`
<2> Timeout to wait for the bulk request to be performed as a `String`

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-request-refresh]
--------------------------------------------------
<1> Refresh policy as a `WriteRequest.RefreshPolicy` instance
<2> Refresh policy as a `String`

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-request-active-shards]
--------------------------------------------------
<1> Sets the number of shard copies that must be active before proceeding with
the index/update/delete operations.
<2> Number of shard copies provided as an `ActiveShardCount`: can be
`ActiveShardCount.ALL`, `ActiveShardCount.ONE` or
`ActiveShardCount.DEFAULT` (default)

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-request-pipeline]
--------------------------------------------------
<1> Global pipelineId used on all sub requests, unless overridden on a sub request

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-request-routing]
--------------------------------------------------
<1> Global routingId used on all sub requests, unless overridden on a sub request

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-request-index-type]
--------------------------------------------------
<1> A bulk request with global index and type used on all sub requests, unless overridden on a sub request.
Both parameters are `@Nullable` and can only be set during +{request}+ creation.

include::../execution.asciidoc[]

[id="{upid}-{api}-response"]
==== Bulk Response

The returned +{response}+ contains information about the executed operations and
lets you iterate over each result as follows:

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-response]
--------------------------------------------------
<1> Iterate over the results of all operations
<2> Retrieve the response of the operation (successful or not). It can be an
`IndexResponse`, `UpdateResponse` or `DeleteResponse`, all of which can be
treated as `DocWriteResponse` instances
<3> Handle the response of an index operation
<4> Handle the response of an update operation
<5> Handle the response of a delete operation

The Bulk response provides a method to quickly check whether one or more
operations have failed:
["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-has-failures]
--------------------------------------------------
<1> This method returns `true` if at least one operation failed

In that case, iterate over all operation results to check whether each
operation failed and, if so, retrieve the corresponding failure:
["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-errors]
--------------------------------------------------
<1> Indicate if a given operation failed
<2> Retrieve the failure of the failed operation
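
As a minimal sketch of that two-step check, the following plain-Java model first asks whether anything failed and only then walks the per-operation results. `BulkFailureScan` and `ItemResult` are hypothetical stand-ins, not classes from the client; the methods `hasFailures` and `isFailed` are named after the client's `hasFailures()` and `isFailed()` methods but are reimplemented here purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of a bulk response; only the control flow is the point.
class BulkFailureScan {
    /** One per-operation result. failureMessage is null when the operation succeeded. */
    static final class ItemResult {
        final String id;
        final String failureMessage;

        ItemResult(String id, String failureMessage) {
            this.id = id;
            this.failureMessage = failureMessage;
        }

        boolean isFailed() {
            return failureMessage != null;
        }
    }

    /** Quick check: true as soon as one operation is found to have failed. */
    static boolean hasFailures(List<ItemResult> items) {
        for (ItemResult item : items) {
            if (item.isFailed()) return true;
        }
        return false;
    }

    /** Full scan, performed only when the quick check reports a failure. */
    static List<String> collectFailures(List<ItemResult> items) {
        List<String> failures = new ArrayList<>();
        if (!hasFailures(items)) return failures;
        for (ItemResult item : items) {
            if (item.isFailed()) failures.add(item.id + ": " + item.failureMessage);
        }
        return failures;
    }

    public static void main(String[] args) {
        List<ItemResult> items = Arrays.asList(
            new ItemResult("1", null),
            new ItemResult("2", "version conflict"),
            new ItemResult("3", null));
        System.out.println(collectFailures(items)); // prints [2: version conflict]
    }
}
```

The quick check keeps the common all-succeeded path cheap, while the full scan preserves which operation each failure belongs to.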

[id="{upid}-{api}-processor"]
==== Bulk Processor

The `BulkProcessor` simplifies the usage of the Bulk API by providing
a utility class that allows index/update/delete operations to be
transparently executed as they are added to the processor.

To execute the requests, the `BulkProcessor` requires the following
components:

`RestHighLevelClient`:: This client is used to execute the +{request}+
and to retrieve the `BulkResponse`
`BulkProcessor.Listener`:: This listener is called before and after
every +{request}+ execution or when a +{request}+ fails

The `BulkProcessor.builder` method can then be used to build a new
`BulkProcessor`:
["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-processor-init]
--------------------------------------------------
<1> Create the `BulkProcessor.Listener`
<2> This method is called before each execution of a +{request}+
<3> This method is called after each execution of a +{request}+
<4> This method is called when a +{request}+ fails
<5> Create the `BulkProcessor` by calling the `build()` method from
the `BulkProcessor.Builder`. The `RestHighLevelClient.bulkAsync()`
method will be used to execute the +{request}+ under the hood.

The `BulkProcessor.Builder` provides methods to configure how the
`BulkProcessor` should handle request execution:
["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-processor-options]
--------------------------------------------------
<1> Set when to flush a new bulk request based on the number of
actions currently added (defaults to 1000, use -1 to disable it)
<2> Set when to flush a new bulk request based on the size of
actions currently added (defaults to 5MB, use -1 to disable it)
<3> Set the number of concurrent requests allowed to be executed
(defaults to 1, use 0 to only allow the execution of a single request)
<4> Set a flush interval: any pending +{request}+ is flushed when the
interval passes (defaults to not set)
<5> Set a constant back off policy that initially waits for 1 second
and retries up to 3 times. See `BackoffPolicy.noBackoff()`,
`BackoffPolicy.constantBackoff()` and `BackoffPolicy.exponentialBackoff()`
for more options.
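
To make the back off wording concrete, here is a small plain-Java sketch of the delay schedules such policies produce. `BackoffSketch` and both of its methods are hypothetical names, and the doubling schedule is the generic textbook form, assumed for illustration only; it is not the formula used by `BackoffPolicy.exponentialBackoff()`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that lists the wait before each retry, in milliseconds.
class BackoffSketch {
    /** Constant policy: the same wait before each of maxRetries retries. */
    static List<Long> constantDelays(long delayMillis, int maxRetries) {
        List<Long> delays = new ArrayList<>();
        for (int i = 0; i < maxRetries; i++) {
            delays.add(delayMillis);
        }
        return delays;
    }

    /** Doubling policy (generic illustration): each wait is twice the previous one. */
    static List<Long> doublingDelays(long initialDelayMillis, int maxRetries) {
        List<Long> delays = new ArrayList<>();
        long delay = initialDelayMillis;
        for (int i = 0; i < maxRetries; i++) {
            delays.add(delay);
            delay *= 2;
        }
        return delays;
    }

    public static void main(String[] args) {
        // 1 second wait, up to 3 retries, as in the constant policy described above.
        System.out.println(constantDelays(1000L, 3)); // prints [1000, 1000, 1000]
        System.out.println(doublingDelays(1000L, 3)); // prints [1000, 2000, 4000]
    }
}
```

Once the list of delays is exhausted, no further retry is attempted, which is what "retries up to 3 times" means for the constant case.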

Once the `BulkProcessor` is created, requests can be added to it:
["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-processor-add]
--------------------------------------------------

The requests will be executed by the `BulkProcessor`, which takes care of
calling the `BulkProcessor.Listener` for every bulk request.
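
The following plain-Java sketch illustrates the general idea of buffering operations, flushing them as a batch once a count threshold is reached, and notifying a listener around each batch. It is a simplified model under stated assumptions, not the `BulkProcessor` implementation: `BatchingSketch`, its `Listener` interface and the use of `String` as an operation are all hypothetical, and the batch is executed synchronously with no network call.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, single-threaded model of count-based flushing with listener hooks.
class BatchingSketch {
    /** Hypothetical listener, called before and after every batch. */
    interface Listener {
        void beforeBatch(long executionId, List<String> batch);
        void afterBatch(long executionId, List<String> batch);
    }

    private final int maxActions;
    private final Listener listener;
    private List<String> pending = new ArrayList<>();
    private long executionId = 0;
    private boolean closed = false;

    BatchingSketch(int maxActions, Listener listener) {
        this.maxActions = maxActions;
        this.listener = listener;
    }

    /** Buffers one operation and flushes when the count threshold is reached. */
    void add(String action) {
        if (closed) throw new IllegalStateException("processor already closed");
        pending.add(action);
        if (pending.size() >= maxActions) flush();
    }

    /** Hands the buffered operations to the listener as one batch. */
    void flush() {
        if (pending.isEmpty()) return;
        List<String> batch = pending;
        pending = new ArrayList<>();
        long id = ++executionId;
        listener.beforeBatch(id, batch);
        // A real processor would send the batch to the cluster at this point.
        listener.afterBatch(id, batch);
    }

    /** Flushes what is left and rejects further additions. */
    void close() {
        flush();
        closed = true;
    }

    public static void main(String[] args) {
        BatchingSketch processor = new BatchingSketch(2, new Listener() {
            public void beforeBatch(long id, List<String> batch) {
                System.out.println("batch " + id + " starting with " + batch.size() + " operations");
            }
            public void afterBatch(long id, List<String> batch) {
                System.out.println("batch " + id + " done");
            }
        });
        for (int i = 0; i < 5; i++) processor.add("op-" + i);
        processor.close(); // flushes the fifth, still pending operation
    }
}
```

With a threshold of 2 and five operations, the listener sees batches of 2, 2 and 1 operations, the last one triggered by `close()`.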

The listener provides methods to access the +{request}+ and the +{response}+:
["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-processor-listener]
--------------------------------------------------
<1> Called before each execution of a +{request}+. This method exposes
the number of operations that are going to be executed within the +{request}+
<2> Called after each execution of a +{request}+. This method indicates whether
the +{response}+ contains errors
<3> Called if the +{request}+ failed. This method exposes
the failure

Once all requests have been added to the `BulkProcessor`, its instance needs to
be closed using one of the two available closing methods.

The `awaitClose()` method can be used to wait until all requests have been
processed or the specified waiting time elapses:
["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-processor-await]
--------------------------------------------------
<1> The method returns `true` if all bulk requests completed and `false` if the
waiting time elapsed before all the bulk requests completed
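
The same "wait with a deadline, report completion as a boolean" contract exists in the JDK as `ExecutorService.awaitTermination`. The sketch below uses only JDK classes to show how such a return value behaves; it is an analogy, not `BulkProcessor` code, and `AwaitSketch` and `runAndAwait` are hypothetical names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// JDK-only analogy for a timed close that reports whether all work completed.
class AwaitSketch {
    /** Runs one task, stops accepting new work, and waits up to timeoutMillis for it. */
    static boolean runAndAwait(Runnable task, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        executor.submit(task);
        executor.shutdown(); // no new tasks are accepted from here on
        boolean finished;
        try {
            finished = executor.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            finished = false;
        }
        if (!finished) executor.shutdownNow(); // interrupt the task so the JVM can exit
        return finished;
    }

    public static void main(String[] args) {
        // A task that returns immediately completes well within the deadline.
        boolean quick = runAndAwait(() -> { }, 5_000);

        // A task blocked on a latch that is never released misses the deadline.
        CountDownLatch never = new CountDownLatch(1);
        boolean stuck = runAndAwait(() -> {
            try {
                never.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, 100);

        System.out.println(quick + " " + stuck); // prints true false
    }
}
```

A `false` result therefore signals only that the deadline passed first, so callers typically log or handle the unfinished work rather than assume it was lost.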

The `close()` method can be used to immediately close the `BulkProcessor`:
["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-processor-close]
--------------------------------------------------

Both methods flush the requests added to the processor before closing the
processor and also prevent any new request from being added to it.