[DOCS] add java REST client docs (#19618)

Add some docs on how to get started with the Java REST client, some common configuration that may be needed and the sniffer component.
Luca Cavanna 2016-07-29 11:22:47 +02:00 committed by GitHub
parent 5a99ce5b91
commit 502217f035
5 changed files with 530 additions and 0 deletions

@ -0,0 +1,113 @@
== Common configuration
The `RestClientBuilder` supports providing both a `RequestConfigCallback` and
an `HttpClientConfigCallback` which allow for any customization that the Apache
Async Http Client exposes. Those callbacks make it possible to modify some
specific behaviour of the client without overriding every other default
configuration that the `RestClient` is initialized with. This section
describes some common scenarios that require additional configuration for the
low-level Java REST Client.
=== Timeouts
Configuring request timeouts can be done by providing an instance of
`RequestConfigCallback` while building the `RestClient` through its builder.
The interface has one method that receives an instance of
https://hc.apache.org/httpcomponents-client-ga/httpclient/apidocs/org/apache/http/client/config/RequestConfig.Builder.html[`org.apache.http.client.config.RequestConfig.Builder`]
as an argument and has the same return type. The request config builder can
be modified and then returned. In the following example we increase the
connect timeout (defaults to 1 second) and the socket timeout (defaults to 10
seconds). We also adjust the max retry timeout accordingly (it defaults to 10
seconds as well).
[source,java]
--------------------------------------------------
RestClient restClient = RestClient.builder(new HttpHost("localhost", 9200))
        .setRequestConfigCallback(new RestClientBuilder.RequestConfigCallback() {
            @Override
            public RequestConfig.Builder customizeRequestConfig(RequestConfig.Builder requestConfigBuilder) {
                return requestConfigBuilder.setConnectTimeout(5000)
                        .setSocketTimeout(30000);
            }
        })
        .setMaxRetryTimeoutMillis(30000)
        .build();
--------------------------------------------------
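Why the max retry timeout is raised together with the socket timeout can be illustrated with a small sketch. It assumes that the retry timeout acts as a total time budget shared by all attempts made for the same request. The `RetryBudget` class and its methods are hypothetical and not part of the client, whose actual logic may differ.

```java
// Hypothetical helper, not part of the REST client.
class RetryBudget {

    /** Time left for further attempts, or 0 once the budget is used up. */
    static long remainingMillis(long maxRetryTimeoutMillis, long elapsedMillis) {
        return Math.max(0L, maxRetryTimeoutMillis - elapsedMillis);
    }

    /** Assumption: another attempt may only start while some budget is left. */
    static boolean mayRetry(long maxRetryTimeoutMillis, long elapsedMillis) {
        return remainingMillis(maxRetryTimeoutMillis, elapsedMillis) > 0;
    }
}
```

Under that assumption, an attempt that fails after 12 seconds leaves no room for a retry within a 10 second budget, while a 30 second budget still leaves 18 seconds for another node to answer.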
=== Number of threads
By default, the Apache Http Async Client starts one dispatcher thread and a
number of worker threads used by the connection manager, as many as the number
of locally detected processors (that is, the value returned by
`Runtime.getRuntime().availableProcessors()`). The number of threads
can be modified as follows:
[source,java]
--------------------------------------------------
RestClient restClient = RestClient.builder(new HttpHost("localhost", 9200))
        .setHttpClientConfigCallback(new RestClientBuilder.HttpClientConfigCallback() {
            @Override
            public HttpAsyncClientBuilder customizeHttpClient(HttpAsyncClientBuilder httpClientBuilder) {
                return httpClientBuilder.setDefaultIOReactorConfig(
                        IOReactorConfig.custom().setIoThreadCount(1).build());
            }
        })
        .build();
--------------------------------------------------
=== Basic authentication
Configuring basic authentication can be done by providing an
`HttpClientConfigCallback` while building the `RestClient` through its builder.
The interface has one method that receives an instance of
https://hc.apache.org/httpcomponents-asyncclient-dev/httpasyncclient/apidocs/org/apache/http/impl/nio/client/HttpAsyncClientBuilder.html[`org.apache.http.impl.nio.client.HttpAsyncClientBuilder`]
as an argument and has the same return type. The http client builder can be
modified and then returned. In the following example we set a default
credentials provider that requires basic authentication.
[source,java]
--------------------------------------------------
final CredentialsProvider credentialsProvider = new BasicCredentialsProvider();
credentialsProvider.setCredentials(AuthScope.ANY,
        new UsernamePasswordCredentials("user", "password"));
RestClient restClient = RestClient.builder(new HttpHost("localhost", 9200))
        .setHttpClientConfigCallback(new RestClientBuilder.HttpClientConfigCallback() {
            @Override
            public HttpAsyncClientBuilder customizeHttpClient(HttpAsyncClientBuilder httpClientBuilder) {
                return httpClientBuilder.setDefaultCredentialsProvider(credentialsProvider);
            }
        })
        .build();
--------------------------------------------------
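For reference, the following sketch shows what HTTP basic authentication puts on the wire for a given user and password: the `Authorization` header value is the word `Basic` followed by the Base64 encoding of `user:password`. The `BasicAuthHeader` class is a hypothetical illustration built on the JDK only (it needs Java 8 for `java.util.Base64`); the client computes this header itself once credentials are configured as shown above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, shown only to illustrate the header format.
class BasicAuthHeader {

    /** Builds the value of the Authorization header for basic authentication. */
    static String value(String user, String password) {
        String pair = user + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(pair.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }
}
```

Since Base64 is an encoding and not encryption, basic authentication is best combined with encrypted communication.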
=== Encrypted communication
Encrypted communication can also be configured through the
`HttpClientConfigCallback`. The
https://hc.apache.org/httpcomponents-asyncclient-dev/httpasyncclient/apidocs/org/apache/http/impl/nio/client/HttpAsyncClientBuilder.html[`org.apache.http.impl.nio.client.HttpAsyncClientBuilder`]
received as an argument exposes multiple methods to configure encrypted
communication: `setSSLContext`, `setSSLSessionStrategy` and
`setConnectionManager`, in order of precedence from the least important.
The following is an example:
[source,java]
--------------------------------------------------
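// keyStorePath and keyStorePass are placeholders for your own store location and password
KeyStore keyStore = KeyStore.getInstance("jks");
try (InputStream is = Files.newInputStream(keyStorePath)) {
    keyStore.load(is, keyStorePass.toCharArray());
}
// build the SSL context from the loaded store (org.apache.http.ssl.SSLContexts),
// assuming that the store holds the certificates to trust
final SSLContext sslContext = SSLContexts.custom()
        .loadTrustMaterial(keyStore, null)
        .build();
RestClient restClient = RestClient.builder(new HttpHost("localhost", 9200))
        .setHttpClientConfigCallback(new RestClientBuilder.HttpClientConfigCallback() {
            @Override
            public HttpAsyncClientBuilder customizeHttpClient(HttpAsyncClientBuilder httpClientBuilder) {
                return httpClientBuilder.setSSLContext(sslContext);
            }
        })
        .build();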
--------------------------------------------------
=== Others
For any other required configuration, consult the Apache HttpAsyncClient
docs: https://hc.apache.org/httpcomponents-asyncclient-4.1.x/ .

@ -0,0 +1,12 @@
[[java-rest]]
= Java REST Client
:version: 5.0.0-alpha4
include::overview.asciidoc[]
include::usage.asciidoc[]
include::configuration.asciidoc[]
include::sniffer.asciidoc[]

@ -0,0 +1,42 @@
== Overview
The official low-level client for Elasticsearch. It allows you to communicate
with an Elasticsearch cluster through http and is compatible with all
Elasticsearch versions.
=== Features
The low-level client's features include:
* minimal dependencies
* load balancing across all available nodes
* failover in case of node failures and upon specific response codes
* failed connection penalization (whether a failed node is retried depends on
how many consecutive times it failed; the more failed attempts the longer the
client will wait before trying that same node again)
* persistent connections
* trace logging of requests and responses
* optional automatic <<sniffer,discovery of cluster nodes>>
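The load balancing and failed connection penalization features listed above can be sketched as a round-robin selector that skips nodes for a delay that grows with their consecutive failures. This is a minimal illustration, not the client's implementation: the `PenalizingSelector` class, the one minute base delay, the 30 minute cap and the doubling rule are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of round-robin selection with failure penalization.
class PenalizingSelector {
    static final long BASE_DELAY_MILLIS = 60_000L;    // assumed base penalty
    static final long MAX_DELAY_MILLIS = 1_800_000L;  // assumed upper bound

    private final List<String> hosts;
    private final Map<String, Integer> failedAttempts = new HashMap<>();
    private final Map<String, Long> retryAtMillis = new HashMap<>();
    private int next = 0;

    PenalizingSelector(List<String> hosts) {
        this.hosts = hosts;
    }

    /** The penalty doubles with each consecutive failure, up to the cap. */
    static long delayMillis(int consecutiveFailures) {
        int shift = Math.max(0, Math.min(consecutiveFailures - 1, 20));
        return Math.min(BASE_DELAY_MILLIS << shift, MAX_DELAY_MILLIS);
    }

    void markFailed(String host, long nowMillis) {
        Integer previous = failedAttempts.get(host);
        int attempts = previous == null ? 1 : previous + 1;
        failedAttempts.put(host, attempts);
        retryAtMillis.put(host, nowMillis + delayMillis(attempts));
    }

    void markAlive(String host) {
        failedAttempts.remove(host);
        retryAtMillis.remove(host);
    }

    /** Returns the next host in rotation that is not penalized, or null if all are. */
    String nextHost(long nowMillis) {
        for (int i = 0; i < hosts.size(); i++) {
            String candidate = hosts.get(next);
            next = (next + 1) % hosts.size();
            Long blockedUntil = retryAtMillis.get(candidate);
            if (blockedUntil == null || blockedUntil <= nowMillis) {
                return candidate;
            }
        }
        return null;
    }
}
```

The current time is passed in explicitly so that the behaviour is easy to follow: a host marked as failed at time 0 is skipped until one minute has passed, and a second consecutive failure would keep it out of rotation for two minutes.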
=== License
Copyright 2013-2016 Elasticsearch
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

@ -0,0 +1,136 @@
[[sniffer]]
== Sniffer
A minimal library that automatically discovers nodes from a running
Elasticsearch cluster and sets them on an existing `RestClient` instance.
By default it retrieves the nodes that belong to the cluster using the
Nodes Info api and uses jackson to parse the obtained json response.
Compatible with Elasticsearch 2.x and onwards.
=== Maven Repository
Here is how you can configure the dependency using maven as a dependency manager.
Add the following to your `pom.xml` file:
["source","xml",subs="attributes"]
--------------------------------------------------
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>sniffer</artifactId>
    <version>{version}</version>
</dependency>
--------------------------------------------------
The low-level REST client is subject to the same release cycle as
Elasticsearch. Replace the version with the desired sniffer version, first
released with `5.0.0-alpha4`. There is no relation between the sniffer version
and the Elasticsearch version that the client can communicate with. The sniffer
supports fetching the nodes list from Elasticsearch 2.x and onwards.
=== Usage
Once a `RestClient` instance has been created, a `Sniffer` can be associated
with it. The `Sniffer` will make use of the provided `RestClient` to periodically
(every 5 minutes by default) fetch the list of current nodes from the cluster
and update them by calling `RestClient#setHosts`.
[source,java]
--------------------------------------------------
Sniffer sniffer = Sniffer.builder(restClient).build();
--------------------------------------------------
It is important to close the `Sniffer` so that its background thread gets
properly shut down and all of its resources are released. The `Sniffer`
object should have the same lifecycle as the `RestClient` and get closed
right before the client:
[source,java]
--------------------------------------------------
sniffer.close();
restClient.close();
--------------------------------------------------
The Elasticsearch Nodes Info api doesn't return the protocol to use when
connecting to the nodes but only their `host:port` pair, hence `http`
is used by default. In case `https` should be used instead, the
`ElasticsearchHostsSniffer` object has to be manually created and provided
as follows:
[source,java]
--------------------------------------------------
HostsSniffer hostsSniffer = new ElasticsearchHostsSniffer(restClient,
        ElasticsearchHostsSniffer.DEFAULT_SNIFF_REQUEST_TIMEOUT,
        ElasticsearchHostsSniffer.Scheme.HTTPS);
Sniffer sniffer = Sniffer.builder(restClient)
        .setHostsSniffer(hostsSniffer).build();
--------------------------------------------------
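Why the scheme has to be supplied separately can be seen in a small sketch: a `host:port` string alone does not say whether to connect over `http` or `https`, so the scheme is combined with it when the node address is built. The `SniffedHosts` class below is a hypothetical, JDK-only illustration and does not reflect how the sniffer parses the Nodes Info response.

```java
import java.net.URI;

// Hypothetical helper: combines a configured scheme with a sniffed host:port pair.
class SniffedHosts {

    static URI toUri(String scheme, String hostAndPort) {
        URI uri = URI.create(scheme + "://" + hostAndPort);
        // reject input that does not carry both a host and a port
        if (uri.getHost() == null || uri.getPort() == -1) {
            throw new IllegalArgumentException(
                    "expected host:port but got [" + hostAndPort + "]");
        }
        return uri;
    }
}
```

For instance, `SniffedHosts.toUri("https", "127.0.0.1:9200")` yields an address whose scheme is `https`, whose host is `127.0.0.1` and whose port is `9200`.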
In the same way it is also possible to customize the `sniffRequestTimeout`,
which defaults to one second. That is the `timeout` parameter provided as a
querystring parameter when calling the Nodes Info api, so that when the
timeout expires on the server side, a valid response is still returned
although it may contain only a subset of the nodes that are part of the
cluster, namely the ones that have responded until then.
Also, a custom `HostsSniffer` implementation can be provided for advanced
use-cases that may require fetching the hosts from external sources.
The `Sniffer` updates the nodes by default every 5 minutes. This interval can
be customized by providing it (in milliseconds) as follows:
[source,java]
--------------------------------------------------
Sniffer sniffer = Sniffer.builder(restClient)
        .setSniffIntervalMillis(60000).build();
--------------------------------------------------
It is also possible to enable sniffing on failure, meaning that after each
failure the nodes list gets updated straightaway rather than at the following
ordinary sniffing round. In this case a `SniffOnFailureListener` needs to
be created first and provided at `RestClient` creation. Once the
`Sniffer` is later created, it also needs to be associated with that same
`SniffOnFailureListener` instance, which will be notified at each failure
and use the `Sniffer` to perform the additional sniffing round as described.
[source,java]
--------------------------------------------------
SniffOnFailureListener sniffOnFailureListener = new SniffOnFailureListener();
RestClient restClient = RestClient.builder(new HttpHost("localhost", 9200))
        .setFailureListener(sniffOnFailureListener).build();
Sniffer sniffer = Sniffer.builder(restClient).build();
sniffOnFailureListener.setSniffer(sniffer);
--------------------------------------------------
When using sniffing on failure, not only do the nodes get updated after each
failure, but an additional sniffing round is also scheduled sooner than usual,
by default one minute after the failure, assuming that things will go back to
normal and we want to detect that as soon as possible. Said interval can be
customized at `Sniffer` creation time as follows:
[source,java]
--------------------------------------------------
Sniffer sniffer = Sniffer.builder(restClient)
        .setSniffAfterFailureDelayMillis(30000).build();
--------------------------------------------------
Note that this last configuration parameter has no effect in case sniffing
on failure is not enabled as explained above.
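How the two intervals interact can be summarized with a small scheduling sketch. It assumes that each sniffing round simply decides when the next one runs: after the ordinary interval for a regular round, or after the shorter delay when the round was triggered by a failure. The `SniffSchedule` class is hypothetical and only mirrors the behaviour described in this section, with times passed in as milliseconds.

```java
// Hypothetical model of the sniffing schedule, not the Sniffer's own code.
class SniffSchedule {
    private final long sniffIntervalMillis;
    private final long sniffAfterFailureDelayMillis;
    private long nextSniffAtMillis;

    SniffSchedule(long sniffIntervalMillis, long sniffAfterFailureDelayMillis, long nowMillis) {
        this.sniffIntervalMillis = sniffIntervalMillis;
        this.sniffAfterFailureDelayMillis = sniffAfterFailureDelayMillis;
        this.nextSniffAtMillis = nowMillis + sniffIntervalMillis;
    }

    /** An ordinary round ran at nowMillis: the next one follows after the regular interval. */
    void onOrdinaryRound(long nowMillis) {
        nextSniffAtMillis = nowMillis + sniffIntervalMillis;
    }

    /** A failure triggered a round at nowMillis: the next one is scheduled sooner. */
    void onFailure(long nowMillis) {
        nextSniffAtMillis = nowMillis + sniffAfterFailureDelayMillis;
    }

    long nextSniffAt() {
        return nextSniffAtMillis;
    }
}
```

With the defaults mentioned above (5 minutes and 1 minute), a failure 10 seconds after startup moves the next round from the 5 minute mark to 70 seconds, after which the ordinary interval applies again.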
=== License
Copyright 2013-2016 Elasticsearch
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

@ -0,0 +1,227 @@
== Getting started
=== Maven Repository
The low-level Java REST client is hosted on
http://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.elasticsearch.client%22[Maven
Central]. The minimum Java version required is `1.7`.
Here is how you can configure the dependency using maven as a dependency manager.
Add the following to your `pom.xml` file:
["source","xml",subs="attributes"]
--------------------------------------------------
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>rest</artifactId>
    <version>{version}</version>
</dependency>
--------------------------------------------------
The low-level REST client is subject to the same release cycle as
Elasticsearch. Replace the version with the desired client version, first
released with `5.0.0-alpha4`. There is no relation between the client version
and the Elasticsearch version that the client can communicate with. The
low-level REST client is compatible with all Elasticsearch versions.
=== Dependencies
The low-level Java REST client internally uses the
http://hc.apache.org/httpcomponents-asyncclient-dev/[Apache Http Async Client]
to send http requests. It depends on the following artifacts, namely the async
http client and its own transitive dependencies:
- org.apache.httpcomponents:httpasyncclient
- org.apache.httpcomponents:httpcore-nio
- org.apache.httpcomponents:httpclient
- org.apache.httpcomponents:httpcore
- commons-codec:commons-codec
- commons-logging:commons-logging
=== Initialization
A `RestClient` instance can be built through the corresponding
`RestClientBuilder` class, created via `RestClient#builder(HttpHost...)`
static method. The only required argument is one or more hosts that the
client will communicate with, provided as instances of
https://hc.apache.org/httpcomponents-core-ga/httpcore/apidocs/org/apache/http/HttpHost.html[HttpHost]
as follows:
[source,java]
--------------------------------------------------
RestClient restClient = RestClient.builder(
        new HttpHost("localhost", 9200, "http"),
        new HttpHost("localhost", 9201, "http")).build();
--------------------------------------------------
The `RestClient` class is thread-safe and ideally has the same lifecycle as
the application that uses it. It is important that it gets closed when no
longer needed so that all the resources used by it get properly released,
as well as the underlying http client instance and its threads:
[source,java]
--------------------------------------------------
restClient.close();
--------------------------------------------------
`RestClientBuilder` also allows you to optionally set the following
configuration parameters while building the `RestClient` instance:
`setDefaultHeaders`:: default headers that need to be sent with each request,
to prevent having to specify them with each single request
`setMaxRetryTimeoutMillis`:: the timeout that should be honoured in case
multiple attempts are made for the same request. The default value is 10
seconds, same as the default socket timeout. In case the socket timeout is
customized, the maximum retry timeout should be adjusted accordingly
`setFailureListener`:: a listener that gets notified every time a node
fails, in case actions need to be taken. Used internally when sniffing on
failure is enabled
`setRequestConfigCallback`:: callback that makes it possible to modify the default
request configuration (e.g. request timeouts, authentication, or anything that
the https://hc.apache.org/httpcomponents-client-ga/httpclient/apidocs/org/apache/http/client/config/RequestConfig.Builder.html[`org.apache.http.client.config.RequestConfig.Builder`]
allows you to set)
`setHttpClientConfigCallback`:: callback that makes it possible to modify the http client
configuration (e.g. encrypted communication over ssl, or anything that the
http://hc.apache.org/httpcomponents-asyncclient-dev/httpasyncclient/apidocs/org/apache/http/impl/nio/client/HttpAsyncClientBuilder.html[`org.apache.http.impl.nio.client.HttpAsyncClientBuilder`]
allows you to set)
=== Performing requests
Once the `RestClient` has been created, requests can be sent by calling one of
the available `performRequest` method variants. The ones that return the
`Response` are executed synchronously, meaning that the client will block and
wait for a response to be returned. The `performRequest` variants that return
`void` accept a `ResponseListener` as an argument and are executed
asynchronously. The provided listener will be notified upon completion or
failure. The following are the arguments accepted by the different
`performRequest` methods:
`method`:: the http method or verb
`endpoint`:: the request path, which identifies the Elasticsearch api to
call (e.g. `/_cluster/health`)
`params`:: the optional parameters to be sent as querystring parameters
`entity`:: the optional request body enclosed in an
`org.apache.http.HttpEntity` object
`responseConsumer`:: the optional
http://hc.apache.org/httpcomponents-core-ga/httpcore-nio/apidocs/org/apache/http/nio/protocol/HttpAsyncResponseConsumer.html[`org.apache.http.nio.protocol.HttpAsyncResponseConsumer`]
callback. Controls how the response body gets streamed from a non-blocking
HTTP connection on the client side. When not provided, the default
implementation is used which buffers the whole response body in heap memory
`responseListener`:: the listener to be notified upon request success or failure
whenever the async `performRequest` method variants are used
`headers`:: optional request headers
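To make the relation between `endpoint` and `params` concrete, the sketch below appends a map of parameters to a request path as a querystring. It is a hypothetical, JDK-only helper (`QueryStringSketch` is not part of the client) and relies on `URLEncoder`, which applies form encoding, so the client's own encoding of special characters may differ.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Map;

// Hypothetical helper showing how params end up in the request URI.
class QueryStringSketch {

    static String withParams(String endpoint, Map<String, String> params) {
        StringBuilder path = new StringBuilder(endpoint);
        char separator = '?';
        for (Map.Entry<String, String> param : params.entrySet()) {
            path.append(separator)
                    .append(encode(param.getKey()))
                    .append('=')
                    .append(encode(param.getValue()));
            separator = '&'; // every parameter after the first is joined with '&'
        }
        return path.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }
}
```

Calling `withParams("/_cluster/health", Collections.singletonMap("pretty", "true"))` returns `/_cluster/health?pretty=true`, while an empty map leaves the endpoint unchanged.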
=== Reading responses
The `Response` object, either returned by the sync `performRequest` methods or
received as an argument in `ResponseListener#onSuccess(Response)`, wraps the
response object returned by the http client and exposes the following information:
`getRequestLine`:: information about the performed request
`getHost`:: the host that returned the response
`getStatusLine`:: the response status line
`getHeaders`:: the response headers, which can also be retrieved by name
through `getHeader(String)`
`getEntity`:: the response body enclosed in an
https://hc.apache.org/httpcomponents-core-ga/httpcore/apidocs/org/apache/http/HttpEntity.html[`org.apache.http.HttpEntity`]
object
When performing a request, an exception is thrown (or received as an argument
in `ResponseListener#onFailure(Exception)`) in the following scenarios:
`IOException`:: communication problem (e.g. `SocketTimeoutException`)
`ResponseException`:: a response was returned, but its status code indicated
an error (either `4xx` or `5xx`). A `ResponseException` originates from a valid
http response, hence it exposes its corresponding `Response` object which gives
access to the returned response.
=== Example requests
Here are a couple of examples:
[source,java]
--------------------------------------------------
Response response = restClient.performRequest("GET", "/",
        Collections.singletonMap("pretty", "true"));
System.out.println(EntityUtils.toString(response.getEntity()));
//index a document
HttpEntity entity = new NStringEntity(
        "{\n" +
        " \"user\" : \"kimchy\",\n" +
        " \"post_date\" : \"2009-11-15T14:12:12\",\n" +
        " \"message\" : \"trying out Elasticsearch\"\n" +
        "}", ContentType.APPLICATION_JSON);
Response indexResponse = restClient.performRequest(
        "PUT",
        "/twitter/tweet/1",
        Collections.<String, String>emptyMap(),
        entity);
--------------------------------------------------
Note that the low-level client doesn't expose any helper for json marshalling
and un-marshalling. Users are free to use the library that they prefer for that
purpose.
The underlying Apache Async Http Client ships with different
https://hc.apache.org/httpcomponents-core-ga/httpcore/apidocs/org/apache/http/HttpEntity.html[`org.apache.http.HttpEntity`]
implementations that make it possible to provide the request body in different
formats (stream, byte array, string etc.). As for reading the response body, the
`HttpEntity#getContent` method comes in handy: it returns an `InputStream`
reading from the previously buffered response body. As an alternative, it is
possible to provide a custom
http://hc.apache.org/httpcomponents-core-ga/httpcore-nio/apidocs/org/apache/http/nio/protocol/HttpAsyncResponseConsumer.html[`org.apache.http.nio.protocol.HttpAsyncResponseConsumer`]
that controls how bytes are read and buffered.
The following is a basic example of how async requests can be sent:
[source,java]
--------------------------------------------------
int numRequests = 10;
final CountDownLatch latch = new CountDownLatch(numRequests);
for (int i = 0; i < numRequests; i++) {
    restClient.performRequest(
        "PUT",
        "/twitter/tweet/" + i,
        Collections.<String, String>emptyMap(),
        //assume that the documents are stored in an entities array
        entities[i],
        new ResponseListener() {
            @Override
            public void onSuccess(Response response) {
                System.out.println(response);
                latch.countDown();
            }
            @Override
            public void onFailure(Exception exception) {
                latch.countDown();
            }
        }
    );
}
//wait for all requests to be completed
latch.await();
--------------------------------------------------
=== Logging
The Java REST client uses the same logging library that the Apache Async Http
Client uses: https://commons.apache.org/proper/commons-logging/[Apache Commons Logging],
which comes with support for a number of popular logging implementations. The
java packages to enable logging for are `org.elasticsearch.client` for the
client itself and `org.elasticsearch.client.sniffer` for the sniffer.
The request tracer logging can also be enabled to log every request and
corresponding response in curl format. That comes in handy when debugging, for
instance in case a request needs to be manually executed to check whether it
still yields the same response as it did. Enable trace logging for the `tracer`
package to have such log lines printed out. Do note that this type of logging is
expensive and should not be enabled at all times in production environments,
but rather temporarily used only when needed.
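To give an idea of what a request logged in curl format may look like, the following sketch turns a method, a URI and an optional body into a command line that can be pasted into a shell. The `CurlTraceSketch` class and the exact layout of the line are assumptions made for illustration; the lines printed by the client's tracer may be formatted differently.

```java
// Hypothetical formatter, shown only to illustrate curl-style trace lines.
class CurlTraceSketch {

    static String format(String method, String uri, String body) {
        StringBuilder line = new StringBuilder("curl -iX ")
                .append(method)
                .append(" '").append(uri).append("'");
        if (body != null) {
            // escape single quotes so that the body survives shell quoting
            line.append(" -d '").append(body.replace("'", "'\\''")).append("'");
        }
        return line.toString();
    }
}
```

For a health check, `format("GET", "http://localhost:9200/_cluster/health", null)` produces `curl -iX GET 'http://localhost:9200/_cluster/health'`.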