This commit changes the default out-of-the-box configuration for the
number of shards from five to one. We think this will help address a
common problem of oversharding. For users with time-based indices that
need a different default, this can be managed with index templates. For
users with non-time-based indices that find they need to re-shard with
the split API in place they no longer need to resort only to
reindexing.
Since this has the impact of changing the default number of shards used
in REST tests, we want to ensure that we still have coverage for issues
that could arise from multiple shards. As such, we randomize (rarely)
the default number of shards in REST tests to two. This is managed via a
global index template. However, some tests check the templates that are
in the cluster state during the test. Since this template is randomly
there, we need a way for tests to skip adding the template used to set
the number of shards to two. For this we add the default_shards feature
skip. To avoid having to write our docs in a complicated way because
sometimes they might be behind one shard, and sometimes they might be
behind two shards we apply the default_shards feature skip to all docs
tests. That is, these tests will always run with the default number of
shards (one).
The High Level REST Client's documentation suggested that users should
use the Low Level REST Client for index management activities. This
change removes that suggestion because the high level REST client
supports those APIs now.
This also changes the examples in the migration docs to that still use
the Low Level REST Client to use the non-deprecated varieats of
`performRequest`.
These tests are sharing the same server and client for every test. Yet,
we are seeing some tests fail with mysterious connection resets. It is
not clear what is happening but one theory is that the tests are
interfering with each other. This commit moves to use a separate server
and client per test.
Previously `BulkProcessor` retry logic was based on the exception type of the failed response (`EsRejectedExecutionException`). This commit changes it to be based on the returned status code. This allows us to reproduce the same retry behaviour when the `BulkProcessor` is used from the high-level REST client, which was previously not the case as we cannot rebuild the same exception type when parsing back the response. This change has no effect on the transport client.
Closes#28885
This commit adds the Snapshot Client with a first API call within it,
the get repositories call in snapshot/restore module. This also creates
a snapshot namespace for the docs, as well as get repositories docs.
Relates #27205
We've been setting this value to 500ms in the default low-level REST
client configuration, misunderstanding the effect that it would have.
This proved very problematic, as it ends up causing `TimeoutException`
returned from the leased pool in some cases even for successful requests.
Closes#24069
Deprecate the many arguments versions of `performRequest` and
`performRequestAsync` in favor of the `Request` object flavored variants
introduced in #29623. We'll be dropping the many arguments variants in
7.0 because they make it difficult to add new features in a backwards
compatible way and they create a *ton* of intellisense noise.
This PR adds support for the Get Settings API to the java high-level rest client.
Furthermore, logic related to the retrieval of default settings has been moved from the rest layer into the transport layer and now default settings may be retrieved consistency via both the rest API and the transport API.
Adds two new methods to `RestClient` that take a `Request` object. These
methods will allows us to add more per-request customizable options
without creating more and more and more overloads of the `performRequest`
and `performRequestAsync` methods. These new methods look like:
```
Response performRequest(Request request)
```
and
```
void performRequestAsync(Request request, ResponseListener responseListener)
```
This change doesn't add any actual features but enables adding things like
per request timeouts and per request node selectors. This change *does*
rework the `HighLevelRestClient` and its tests to use these new `Request`
objects and it does update the docs.
This *mostly* silences `javadoc`'s warning about defaulting to
generating html4 files by enabling generating html5 file for the
projects for which that works. It didn't work in a half dozen projects,
about half of which I've fixed in this PR, entirely by replacing
`<tt>thing</tt>` with `{@code thing}`.
There are a few remaining projects that contain javadoc with invalid
html5. I'll fix those projects in a followup.
The low-level REST client targets JDK 7. To avoid compiling against JDK
functionality not available in JDK 7, we use animal sniffer. However,
when we switched to using the JDK 9 and now the JDK 10 compiler which
has built-in support for targeting previous JDKs, we no longer need to
use animal sniffer. This is because the JDK is now packaged with the
signatures needed to ensure that when we target JDK 7 at compile-time it
is detected that we are only using JDK 7 functionality. This commit
removes the use of animal sniffer from the low-level REST client build.
This commit adds the distribution type to the startup scripts so that we
can discern from log output and the main response the type of the
distribution (deb/rpm/tar/zip).
As part of adding support for new API to the high-level REST client,
we added support for the `flat_settings` parameter to some of our
request classes. We added documentation that such flag is only ever
read by the high-level REST client, but the truth is that it doesn't
do anything given that settings are always parsed back into a `Settings`
object, no matter whether they are returned in a flat format or not.
It was a mistake to add support for this flag in the context of the
high-level REST client, hence this commit removes it.
The following is the current behaviour, tested now through a specific
test.
The low-level REST client doesn't add a leading wildcard when not
provided, unless a `pathPrefix` is configured in which case a trailing
slash will be automatically added when concatenating the prefix and the
provided uri.
Also when configuring a pathPrefix, if it doesn't start with a '/' it
will be modified by adding the missing leading '/'.
CRUD: Parsing changes for UpdateRequest (#29293)
Use `ObjectParser` to parse `UpdateRequest` so we reject unknown fields
and drop support for the `_fields` parameter because it was deprecated
in 5.x.
This change validates that the `_search` request does not have trailing
tokens after the main object and fails the request with a parsing exception otherwise.
Closes#28995
Some features have been deprecated since `6.0` like the `_parent` field or the
ability to have multiple types per index. This allows to remove quite some
code, which in-turn will hopefully make it easier to proceed with the removal
of types.
Currently the ranking evaluation API doesn't support many of the
standard parameters of the search API. Some of these make sense, like
adding support for the common indices options parameters, which this
change adds.
When the `BulkProcessor` is used with the high-level REST client, a scheduler is internally created that allows to schedule tasks. Such scheduler is not exposed to users and needs to be closed once the `BulkProcessor` is closed. There are two ways to close the `BulkProcessor` though, one is the ordinary `close` method and the other one is `awaitClose`. The former closes the scheduler while the latter doesn't, leaving threads lingering.
* Remove all dependencies from XContentBuilder
This commit removes all of the non-JDK dependencies from XContentBuilder, with
the exception of `CollectionUtils.ensureNoSelfReferences`. It adds a third
extension point around dealing with time-based fields and formatters to work
around the Joda dependency.
This decoupling allows us to be able to move XContentBuilder to a separate lib
so it can be available for things like the high level rest client.
Relates to #28504
This was the plan from day one but due to a silly bug nodes were immediately retried after they were marked as dead for the first time. From the second time on, the expected backoff was applied.
Some source files seem to have the execute bit (a+x) set, which doesn't
really seem to hurt but is a bit odd. This change removes those, making
the permissions similar to other source files in the repository.
Adds docs for `HighLevelRestClient#multiSearch`. Unlike the `multiGet`
docs these are much more sparse because multi-search doesn't support
setting many options on the `MultiSearchRequest` and instead just wraps
a list of `SearchRequest`s.
Closes#28389
Currently we store the indices specified in the request URL together with all
the other ranking evaluation specification in RankEvalSpec. This is not ideal
since e.g. the indices are not rendered to xContent and so cannot be parsed
back. Instead we should keep them in RankEvalRequest.
Adds SSLHandshakeException to the list of Exceptions that are
specifically rethrown from the async thread so its type is preserved.
This should make it easier to debug synchronous calls with ssl issues.
In the past the Low Level REST Client was super careful not to wrap
any exceptions that it throws from synchronous calls so that callers can
catch the exceptions and work with them. The trouble with that is that
the exceptions are originally thrown on the async thread pool and then
transfered back into calling thread. That means that the stack trace of
the exception doesn't have the calling method which is *super* *ultra*
confusing.
This change always wraps exceptions transferred from the async thread
pool so that the stack trace of the thrown exception contains the
caller's stack. It tries to preserve the type of the throw exception but
this is quite a fiddly thing to get right. We have to catch every type
of exception that we want to preserve, wrap with the same type and
rethrow. I've preserved the types of all exceptions that we had tests
mentioning but no other exceptions. The other exceptions are either
wrapped in `IOException` or `RuntimeException`.
Closes#28399
* Decouple XContentBuilder from BytesReference
This commit removes all mentions of `BytesReference` from `XContentBuilder`.
This is needed so that we can completely decouple the XContent code and move it
into its own dependency.
While this change appears large, it is due to two main changes, moving
`.bytes()` and `.string()` out of XContentBuilder itself into static methods
`BytesReference.bytes` and `Strings.toString` respectively. The rest of the
change is code reacting to these changes (the majority of it in tests).
Relates to #28504
The REST status 503 means "I can not handle the request that you sent
me." However today we respond to a main request with a 503 when there
are certain cluster blocks despite still responding with an actual main
response. This is broken, we should respond with a 200 status. This
commit removes this silliness.
The recent addition of the _cluster/settings API was merged together with another change that added encoding of the different URL parts. That broke the cluster PUT settings API straight-away. This commit fixes this problem. The '/' that's part of the /_cluster/settings endpoint should not be encoded.
The REST high-level client supports now encoding of path parts, so that for instance documents with valid ids, but containing characters that need to be encoded as part of urls (`#` etc.), are properly supported. We also make sure that each path part can contain `/` by encoding them properly too.
Closes#28625
* Move to non-deprecated XContentHelper.createParser(...)
This moves away from one of the now-deprecated XContentHelper.createParser
methods in favor of specifying the deprecation logger at parser creation time.
Relates to #28449
Note that this doesn't move all the `createParser` calls because some of them
use the already-deprecated method that doesn't specify the XContentType.
* Remove the deprecated (and now non-needed) createParser method
* [DOCS] expand examples on providing mappings for create index and put mapping
The create index API and put mappings API docs the for high-level Java REST client didn't have a lot of info on how to provide mappings. This commit adds some examples.
This commit splits the async execution documentation into 2 parts, one
for the async method itself and one for the action listener. This allows
to add more doc and to use CountDownLatches in doc tests to wait for
asynchronous operations to be completed before moving to the next test.
It also renames few files.
Related to #28457
Similarly to other documentation tests in the high level client, the
asynchronous operation like update, index or delete can make the test
fail if it sneak in between the middle of another operation.
This commit moves the async doc tests to be the last ones executed and
adds assert busy loops to ensure that the asynchronous operations are
correctly terminated.
closes#28446
Adds allow_partial_search_results flag to search requests with default setting = true.
When false, will error if search either timeouts, has partial errors or has missing shards rather
than returning partial search results. A cluster-level setting provides a default for search requests with no flag.
Closes#27435
This change adds support for the new ranking evaluation API to the High Level Rest Client.
This mostly means adding support for parsing the various response objects back from the
REST representation. It includes one change to the response syntax where previously we didn't
print the type of the metric details section but we now need it to pick the right parser to
parse this section back.
Closes#28198
Script fields can get a bit more complicated than just stored fields. A script can return null, an object and also an array. Extended parsing to support such valid values. Also renamed util method from `parseStoredFieldsValue` to `parseFieldsValue` given that it can parse stored fields but also script fields, anything that's returned as `fields`.
Closes#28380
It has been pointed out that GET with body may cause problems to some proxies. We are then switching to POST the API that retrieve info and support a request body.
Closes#28326
The search request body can never be null as `SearchRequest` doesn't allow the inner `SearchSourceBuilder` to be null. Instead, when search source is not set, the request body is going to be an empty json object (`{}``)
Today, the way to call them API under the indices namespace is by doing e.g. `client.indices().createIndex()`. Our spec define the API under the indices namespace as e.g. `indices.create`, hence there is no need to repeat the index suffix for each method as that is already defined by the namespace. Using the `index` suffix in each method was an oversight which must be corrected.
This is related to #27933. It introduces a jar named elasticsearch-core
in the lib directory. This commit moves the JarHell class from server to
elasticsearch-core. Additionally, PathUtils and some of Loggers are
moved as JarHell depends on them.
Several responses include the shards_acknowledged flag (indicating whether the
requisite number of shard copies started before the completion of the operation)
and there are two different getters used : isShardsAcknowledged() and isShardsAcked().
This PR deprecates the isShardsAcked() in favour of isShardsAcknowledged() in
CreateIndexResponse, RolloverResponse and CreateIndexClusterStateUpdateResponse.
Closes#27784
With this commit we upgrade the Gradle Shadow plugin that is used in our
benchmarks to version 2.0.2. This version does not use APIs that are
deprecated in Gradle 4.x.
* Adds task dependenciesInfo to BuildPlugin to generate a CSV file with dependencies information (name,version,url,license)
* Adds `ConcatFilesTask.groovy` to concatenates multiple files into one
* Adds task `:distribution:generateDependenciesReport` to concatenate `dependencies.csv` files into a single file (`es-dependencies.csv` by default)
# Examples:
$ gradle dependenciesInfo :distribution:generateDependenciesReport
## Use `csv` system property to customize the output file path
$ gradle dependenciesInfo :distribution:generateDependenciesReport -Dcsv=/tmp/elasticsearch-dependencies.csv
## When branch is not master, use `build.branch` system property to generate correct licenses URLs
$ gradle dependenciesInfo :distribution:generateDependenciesReport -Dbuild.branch=6.x -Dcsv=/tmp/elasticsearch-dependencies.csv
The last operation executed in IndicesClientDocumentationIT.testCreate()
is an asynchronous index creation. Because nothing waits for its
completion, on slow machines the index can sometimes be created after
the testCreate() test is finished, and it can fail the following test.
Closes#27754
This commit removes the usage of system properties for the HttpAsyncClient as this overrides some
defaults that we intentionally change. In order to set the default SSLContext to the system context
we set the SSLContext on the builder explicitly.
Closes#27827
This was already changed in 6.x as part of the backport of the recently added open and create index API. wait_for_active_shards can be a number but also "all", with this commit we verify that providing "all" works too.
Today we require users to prepare their indices for split operations.
Yet, we can do this automatically when an index is created which would
make the split feature a much more appealing option since it doesn't have
any 3rd party prerequisites anymore.
This change automatically sets the number of routinng shards such that
an index is guaranteed to be able to split once into twice as many shards.
The number of routing shards is scaled towards the default shard limit per index
such that indices with a smaller amount of shards can be split more often than
larger ones. For instance an index with 1 or 2 shards can be split 10x
(until it approaches 1024 shards) while an index created with 128 shards can only
be split 3x by a factor of 2. Please note this is just a default value and users
can still prepare their indices with `index.number_of_routing_shards` for custom
splitting.
NOTE: this change has an impact on the document distribution since we are changing
the hash space. Documents are still uniformly distributed across all shards but since
we are artificually changing the number of buckets in the consistent hashign space
document might be hashed into different shards compared to previous versions.
This is a 7.0 only change.
This change removes the module named aggs-composite and adds the `composite` aggs
as a core aggregation. This allows other plugins to use this new aggregation
and simplifies the integration in the HL rest client.
Today Cross Cluster Search requires at least one node in each remote cluster to be up once the cross cluster search is run. Otherwise the whole search request fails despite some of the data (either local and/or remote) is available. This happens when performing the _search/shards calls to find out which remote shards the query has to be executed on. This scenario is different from shard failures that may happen later on when the query is actually executed, in case e.g. remote shards are missing, which is not going to fail the whole request but rather yield partial results, and the _shards section in the response will indicate that.
This commit introduces a boolean setting per cluster called search.remote.$cluster_alias.skip_if_disconnected, set to false by default, which allows to skip certain clusters if they are down when trying to reach them through a cross cluster search requests. By default all clusters are mandatory.
Scroll requests support such setting too when they are first initiated (first search request with scroll parameter), but subsequent scroll rounds (_search/scroll endpoint) will fail if some of the remote clusters went down meanwhile.
The search API response contains now a new _clusters section, similar to the _shards section, that gets returned whenever one or more clusters were disconnected and got skipped:
"_clusters" : {
"total" : 3,
"successful" : 2,
"skipped" : 1
}
Such section won't be part of the response if no clusters have been skipped.
The per cluster skip_unavailable setting value has also been added to the output of the remote/info API.
Stardardize underscore requirements in parameters across different type of
requests:
_index, _type, _source, _id keep their underscores
params like version and retry_on_conflict will be without underscores
Throw an error if older versions of parameters are used
BulkRequest, MultiGetRequest, TermVectorcRequest, MoreLikeThisQuery
were changed
Closes#26886
* This change adds a module called `aggs-composite` that defines a new aggregation named `composite`.
The `composite` aggregation is a multi-buckets aggregation that creates composite buckets made of multiple sources.
The sources for each bucket can be defined as:
* A `terms` source, values are extracted from a field or a script.
* A `date_histogram` source, values are extracted from a date field and rounded to the provided interval.
This aggregation can be used to retrieve all buckets of a deeply nested aggregation by flattening the nested aggregation in composite buckets.
A composite buckets is composed of one value per source and is built for each document as the combinations of values in the provided sources.
For instance the following aggregation:
````
"test_agg": {
"terms": {
"field": "field1"
},
"aggs": {
"nested_test_agg":
"terms": {
"field": "field2"
}
}
}
````
... which retrieves the top N terms for `field1` and for each top term in `field1` the top N terms for `field2`, can be replaced by a `composite` aggregation in order to retrieve **all** the combinations of `field1`, `field2` in the matching documents:
````
"composite_agg": {
"composite": {
"sources": [
{
"field1": {
"terms": {
"field": "field1"
}
}
},
{
"field2": {
"terms": {
"field": "field2"
}
}
},
}
}
````
The response of the aggregation looks like this:
````
"aggregations": {
"composite_agg": {
"buckets": [
{
"key": {
"field1": "alabama",
"field2": "almanach"
},
"doc_count": 100
},
{
"key": {
"field1": "alabama",
"field2": "calendar"
},
"doc_count": 1
},
{
"key": {
"field1": "arizona",
"field2": "calendar"
},
"doc_count": 1
}
]
}
}
````
By default this aggregation returns 10 buckets sorted in ascending order of the composite key.
Pagination can be achieved by providing `after` values, the values of the composite key to aggregate after.
For instance the following aggregation will aggregate all composite keys that sorts after `arizona, calendar`:
````
"composite_agg": {
"composite": {
"after": {"field1": "alabama", "field2": "calendar"},
"size": 100,
"sources": [
{
"field1": {
"terms": {
"field": "field1"
}
}
},
{
"field2": {
"terms": {
"field": "field2"
}
}
}
}
}
````
This aggregation is optimized for indices that set an index sorting that match the composite source definition.
For instance the aggregation above could run faster on indices that defines an index sorting like this:
````
"settings": {
"index.sort.field": ["field1", "field2"]
}
````
In this case the `composite` aggregation can early terminate on each segment.
This aggregation also accepts multi-valued field but disables early termination for these fields even if index sorting matches the sources definition.
This is mandatory because index sorting picks only one value per document to perform the sort.
This is the first step to supporting WKT (and other future) format(s). The ShapeBuilders are quite messy and can be simplified by decoupling the parse logic from the build logic. This commit refactors the parsing logic into its own package separate from the Shape builders. It also decouples the GeoShapeType into a standalone enumerator that is responsible for validating the parsed data and providing the appropriate builder. This future-proofs the code making it easier to maintain and add new shape types.
While it's not possible to upgrade the Jackson dependencies
to their latest versions yet (see #27032 (comment) for more)
it's still possible to upgrade to the latest 2.8.x version.
The headers passed to reindex were skipped except for the last one. This
commit fixes the copying of the headers, as well as adds a base test
case for rest client builders to access the headers within the built
rest client.
relates #22976
Introduce minimal thread scheduler as a base class for `ThreadPool`. Such a class can be used from the `BulkProcessor` to schedule retries and the flush task. This allows to remove the `ThreadPool` dependency from `BulkProcessor`, which requires to provide settings that contain `node.name` and also needed log4j for logging. Instead, it needs now a `Scheduler` that is much lighter and gets automatically created and shut down on close.
Closes#26028
Upgrade to Jackson 2.9.2 and also use a boolean `closed` flag to
indicate that a FastStringReader instance is closed, so that length
is still correctly reported after the reader is closed.
The shard preference _primary, _replica and its variants were useful
for the asynchronous replication. However, with the current impl, they
are no longer useful and should be removed.
Closes#26335
Request class is currently package protected, making it difficult for
the users to extend the RestHighLevelClient and to use its protected
methods to execute requests. This commit makes the Request class public
and changes few methods of RestHighLevelClient to be protected.
This avoids messages with malformed URLs, like
"org.elasticsearch.client.ResponseException: PUT
http://127.0.0.1:9502customer: HTTP/1.1 400 Bad Request".
Relates #26564
It's easy to create a wrong Content-Type header when converting a
XContentType to a Apache HTTP ContentType instance.
This commit the direct usages of ContentType.create() methods in favor of a Request.createContentType(XContentType) method that does the right thing.
Closes#26438
At current, we do not feel there is enough of a reason to shade the low
level rest client. It caused problems with commons logging and IDE's
during the brief time it was used. We did not know exactly how many
users will need this, and decided that leaving shading out until we
gather more information is best. Users can still shade the jar
themselves. For information and feeback, see issue #26366.
Closes#26328
This reverts commit 3a20922046.
This reverts commit 2c271f0f22.
This reverts commit 9d10dbea39.
This reverts commit e816ef89a2.
This allows plugins to plug rescore implementations into
Elasticsearch. While this is a fairly expert thing to do I've
done my best to point folks to the QueryRescorer as one that at
least documents the tradeoffs that it makes. I've attempted to
limit the API surface area by removing `SearchContext` from the
exposed interface, instead exposing just the IndexSearcher and
`QueryShardContext`. I also tried to make some of the class names
more consistent and do some general cleanup while I was there.
I entertained the notion of moving the `QueryRescorer` to module.
After all, it'd be a wonderful test to prove that you can plug
rescore implementation into Elasticsearch if the only built in
rescore implementation is in the module. But I decided against it
because the new module would require a client jar and it'd require
moving some more things around. I think if we really want to do
it, we should do it as a followup.
I did, on the other hand, create an "example" rescore plugin which
should both be a nice example for anyone wanting to plug in their
own rescore implementation and servers as a good integration test
to make sure that you can indeed plug one in.
Closes#26208
The parser for the `ip_range` aggregation response is currently missing from the
NamedXContentRegistry in the high level rest client. Also changes the testing
around the expected number of parsers so we at least check that we register all
the parsers that we also test in InternalAggregationTestCase.
By making RestHighLevelClient Closeable, its close method will close the internal low-level REST client instance by default, which simplifies the way most users interact with the high-level client.
Its constructor accepts now a RestClientBuilder, which clarifies that the low-level REST client is internally created and managed.
It is still possible to provide an already built `RestClient` instance, but that can only be done by subclassing `RestHighLevelClient` and calling the protected constructor that accepts a `RestClient`. In such case a consumer has also to be provided, which controls what has to be done when the high-level client gets done.
Closes#26086
The client sniffer depends on the low-level REST client, while the Java high-level REST client and the transport client depend on Elasticsearch itself. Javadoc are not that useful unless they have links to the Elasticsearch classes in the latter case, and to the low-level REST client in the sniffer javadoc. This commit adds those links.
* A cycle was detected in eclipse, and was fixed in the same fashion as
core and core-tests.
* The rest client deps jar was not properly exported in the generated
eclipse classpath file for rest client.
Relates #25208
The low level rest client does not need the shadow plugin applied, it
only needs the plugin jar in the classpath, in order to create a
ShadowJar task.
Relates #25208
The configuration removed from the runtime configuration did not
properly remove the deps jar from gradle versions > 3.3. The rest client
now removes both the 3.3 and 3.3+ configurations so this works on both
versions of gradle.
Closes#25884
Relates #25208
This commit removes all external dependencies from the rest client jar
and shades them in an 'org.elasticsearch.client' package within the jar
using shadowJar gradle plugin. All projects that depended on the
existing jar have been converted to using the 'org.elasticsearch.client'
package prefixes to interact with the rest client.
Closes#25208
This commit calls the `useSystemProperties` method on the HttpAsyncClientBuilder so that the jvm
system properties are used. The primary reason for doing this is to ensure the builder uses the
system default SSLContext rather than the default instance created by the http client library.
Closes#23231
We currently use fielddata on the `_id` field which is trappy, especially as we
do it implicitly. This changes the `random_score` function to use doc ids when
no seed is provided and to suggest a field when a seed is provided.
For now the change only emits a deprecation warning when no field is supplied
but this should be replaced by a strict check on 7.0.
Closes#25240