* Adds a new auto-interval date histogram
This change adds a new type of histogram aggregation called `auto_date_histogram` where you can specify the target number of buckets you require and it will find an appropriate interval for the returned buckets. The aggregation works by first collecting documents in buckets at second interval, when it has created more than the target number of buckets it merges these buckets into minute interval bucket and continues collecting until it reaches the target number of buckets again. It will keep merging buckets when it exceeds the target until either collection is finished or the highest interval (currently years) is reached. A similar process happens at reduce time.
This aggregation intentionally does not support min_doc_count, offest and extended_bounds to keep the already complex logic from becoming more complex. The aggregation accepts sub-aggregations but will always operate in `breadth_first` mode deferring the computation of sub-aggregations until the final buckets from the shard are known. min_doc_count is effectively hard-coded to zero meaning that we will insert empty buckets where necessary.
Closes#9572
* Adds documentation
* Added sub aggregator test
* Fixes failing docs test
* Brings branch up to date with master changes
* trying to get tests to pass again
* Fixes multiBucketConsumer accounting
* Collects more buckets than needed on shards
This gives us more options at reduce time in terms of how we do the
final merge of the buckeets to produce the final result
* Revert "Collects more buckets than needed on shards"
This reverts commit 993c782d117892af9a3c86a51921cdee630a3ac5.
* Adds ability to merge within a rounding
* Fixes nonn-timezone doc test failure
* Fix time zone tests
* iterates on tests
* Adds test case and documentation changes
Added some notes in the documentation about the intervals that can bbe
returned.
Also added a test case that utilises the merging of conseecutive buckets
* Fixes performance bug
The bug meant that getAppropriate rounding look a huge amount of time
if the range of the data was large but also sparsely populated. In
these situations the rounding would be very low so iterating through
the rounding values from the min key to the max keey look a long time
(~120 seconds in one test).
The solution is to add a rough estimate first which chooses the
rounding based just on the long values of the min and max keeys alone
but selects the rounding one lower than the one it thinks is
appropriate so the accurate method can choose the final rounding taking
into account the fact that intervals are not always fixed length.
Thee commit also adds more tests
* Changes to only do complex reduction on final reduce
* merge latest with master
* correct tests and add a new test case for 10k buckets
* refactor to perform bucket number check in innerBuild
* correctly derive bucket setting, update tests to increase bucket threshold
* fix checkstyle
* address code review comments
* add documentation for default buckets
* fix typo
This commit adds the _xpack/usage api to the high level rest client.
Currently in the transport api, the usage data is exposed in a limited
fashion, at most giving one level of helper methods for the inner keys
of data, but then exposing thos subobjects as maps of objects. Rather
than making parsers for every set of usage data from each feature, this
PR exposes the entire set of usage data as a map of maps.
In #29623 we added `Request` object flavored requests to the low level
REST client and in #30315 we deprecated the old `performRequest`s. This
changes all calls in the `client/rest-high-level` project to use the new
versions.
The `:x-pack:protocol` project is an implementation detail shared by the
xpack projects and the high level rest client and really doesn't deserve
its own maven coordinants and published javadoc. This change bundles
`:x-pack:protocol` into the high level rest client.
Relates to #29827
Originally I put the X-Pack info object into the top level rest client
object. I did that because we thought we'd like to squash `xpack` from
the name of the X-Pack APIs now that it is part of the default
distribution. We still kind of want to do that, but at least for now we
feel like it is better to keep the high level rest client aligned with
the other language clients like C# and Python. This shifts the X-Pack
info API to align with its json spec file.
Relates to #31870
This is the first x-pack API we're adding to the high level REST client
so there is a lot to talk about here!
= Open source
The *client* for these APIs is open source. We're taking the previously
Elastic licensed files used for the `Request` and `Response` objects and
relicensing them under the Apache 2 license.
The implementation of these features is staying under the Elastic
license. This lines up with how the rest of the Elasticsearch language
clients work.
= Location of the new files
We're moving all of the `Request` and `Response` objects that we're
relicensing to the `x-pack/protocol` directory. We're adding a copy of
the Apache 2 license to the root fo the `x-pack/protocol` directory to
line up with the language in the root `LICENSE.txt` file. All files in
this directory will have the Apache 2 license header as well. We don't
want there to be any confusion. Even though the files are under the
`x-pack` directory, they are Apache 2 licensed.
We chose this particular directory layout because it keeps the X-Pack
stuff together and easier to think about.
= Location of the API in the REST client
We've been following the layout of the rest-api-spec files for other
APIs and we plan to do this for the X-Pack APIs with one exception:
we're dropping the `xpack` from the name of most of the APIs. So
`xpack.graph.explore` will become `graph().explore()` and
`xpack.license.get` will become `license().get()`.
`xpack.info` and `xpack.usage` are special here though because they
don't belong to any proper category. For now I'm just calling
`xpack.info` `xPackInfo()` and intend to call usage `xPackUsage` though
I'm not convinced that this is the final name for them. But it does get
us started.
= Jars, jars everywhere!
This change makes the `xpack:protocol` project a `compile` scoped
dependency of the `x-pack:plugin:core` and `client:rest-high-level`
projects. I intend to keep it a compile scoped dependency of
`x-pack:plugin:core` but I intend to bundle the contents of the protocol
jar into the `client:rest-high-level` jar in a follow up. This change
has grown large enough at this point.
In that followup I'll address javadoc issues as well.
= Breaking-Java
This breaks that transport client by a few classes around. We've
traditionally been ok with doing this to the transport client.
This is a followup to #31537. It makes a number of changes requested by
a review that came after the PR was merged. These are mostly cleanups
and doc improvements.
This PR does the server side work for adding the Get Index API to the REST
high-level-client, namely moving resolving default settings to the
transport action. A follow up would be the client side changes.
The test could sometimes generate an empty array of random stored fields
to fetch which would cause the test to fail. It is fine not to allow
empty here because we already randomize whether or not to generate the
stored fields param anyway.
Added support to the high-level rest client for the create snapshot API call. This required
several changes to toXContent which may need to be cleaned up in a later PR. Also
added several parsers for fromXContent to be able to retrieve appropriate responses
along with tests.
While the other two ranking evaluation metrics (precicion and reciprocal rank)
already provide a more detailed output for how their score is calculated, the
discounted cumulative gain metric (dcg) and its normalized variant are lacking
this until now. Its not really clear which level of detail might be useful for
debugging and understanding the final metric calculation, but this change adds a
`metric_details` section to REST output that contains some information about the
evaluation details.
With #29331 we added support for the cluster health API to the
high-level REST client. The transport client does not support the level
parameter, and it always returns all the info needed for shards level
rendering. We have maintained that behaviour when adding support for
cluster health to the high-level REST client, to ease migration, but the
correct thing to do is to default the high-level REST client to
`cluster` level, which is the same default as when going through the
Elasticsearch REST layer.
The default wait_for_active_shards is NONE for cluster health, which
differs from all the other API in master, hence we need to make sure to
set the parameter whenever it differs from NONE (0). The test around
this also had a bug, which is why this was not originally uncovered.
Relates to #29331
This commit removes all the API methods that accept a `Header` varargs
argument, in favour of the newly introduced API methods that accept a
`RequestOptions` argument.
Relates to #31069
The response currently implements ToXContentFragment although the only time it's used
it is supposed to print out a complete object rather than a fragment. Note that this
is the client version of the response, used only in the high-level client.
Given the weirdness of the response returned by the get alias API, we went for a client specific response, which allows us to hold the error message, exception and status returned as part of the response together with aliases. See #30536 .
Relates to #27205
* Initial commit of rest high level exposure of cancel task
* fix javadocs
* address some code review comments
* update branch to use tasks namespace instead of cluster
* High-level client: list tasks failure to not lose nodeId
This commit reworks testing for `ListTasksResponse` so that random
fields insertion can be tested and xcontent equivalence can be checked
too. Proper exclusions need to be configured, and failures need to be
tested separately. This helped finding a little problem, whenever there
is a node failure returned, the nodeId was lost as it was never printed
out as part of the exception toXContent.
* added comment
* merge from master
* re-work CancelTasksResponseTests to separate XContent failure cases from non-failure cases
* remove duplication of logic in parser creation
* code review changes
* refactor TasksClient to support RequestOptions
* add tests for parent task id
* address final PR review comments, mostly formatting and such
With #30490 we have introduced a new way to provide request options
whenever sending a request using the high-level REST client. Before you
could provide headers as the last argument varargs of each API method,
now you can provide `RequestOptions` that in the future will allow to
provide more options which can be specified per request.
This commit deprecates all of the client methods that accept a `Header`
varargs argument in favour of new methods that accept `RequestOptions`
instead. For some API we don't even go through deprecation given that
they were not released since they were added, hence in that case we can
just move them to the new method.
This modifies the high level rest client to allow calling code to
customize per request options for the bulk API. You do the actual
customization by passing a `RequestOptions` object to the API call
which is set on the `Request` that is generated by the high level
client. It also makes the `RequestOptions` a thing in the low level
rest client. For now that just means you use it to customize the
headers and the `httpAsyncResponseConsumerFactory` and we'll add
node selectors and per request timeouts in a follow up.
I only implemented this on the bulk API because it is the first one
in the list alphabetically and I wanted to keep the change small
enough to review. I'll convert the remaining APIs in a followup.
This commit adds Verify Repository, the associated docs and tests for
the high level REST API client. A few small changes to the Verify
Repository Response went into the commit as well.
Relates #27205
Our API spec define the tasks API as e.g. tasks.list, meaning that they belong to their own namespace. This commit moves them from the cluster namespace to their own namespace.
Relates to #29546
Adding headers rather than setting them all at once seems more
user-friendly and we already do it in a similar way for parameters
(see Request#addParameter).
This commit adds Delete Repository, the associated docs and tests for
the high level REST API client. It also cleans up a seemingly innocuous
line in the RestDeleteRepositoryAction and some naming in SnapshotIT.
Relates #27205
This change adds a `listTasks` method to the high level java
ClusterClient which allows listing running tasks through the
task management API.
Related to #27205
This commit adds Create Repository, the associated docs and tests
for the high level REST API client. A few small changes to the
PutRepository Request and Response went into the commit as well.
This commit changes the default out-of-the-box configuration for the
number of shards from five to one. We think this will help address a
common problem of oversharding. For users with time-based indices that
need a different default, this can be managed with index templates. For
users with non-time-based indices that find they need to re-shard with
the split API in place they no longer need to resort only to
reindexing.
Since this has the impact of changing the default number of shards used
in REST tests, we want to ensure that we still have coverage for issues
that could arise from multiple shards. As such, we randomize (rarely)
the default number of shards in REST tests to two. This is managed via a
global index template. However, some tests check the templates that are
in the cluster state during the test. Since this template is randomly
there, we need a way for tests to skip adding the template used to set
the number of shards to two. For this we add the default_shards feature
skip. To avoid having to write our docs in a complicated way because
sometimes they might be behind one shard, and sometimes they might be
behind two shards we apply the default_shards feature skip to all docs
tests. That is, these tests will always run with the default number of
shards (one).
The High Level REST Client's documentation suggested that users should
use the Low Level REST Client for index management activities. This
change removes that suggestion because the high level REST client
supports those APIs now.
This also changes the examples in the migration docs to that still use
the Low Level REST Client to use the non-deprecated varieats of
`performRequest`.
Previously `BulkProcessor` retry logic was based on the exception type of the failed response (`EsRejectedExecutionException`). This commit changes it to be based on the returned status code. This allows us to reproduce the same retry behaviour when the `BulkProcessor` is used from the high-level REST client, which was previously not the case as we cannot rebuild the same exception type when parsing back the response. This change has no effect on the transport client.
Closes#28885
This commit adds the Snapshot Client with a first API call within it,
the get repositories call in snapshot/restore module. This also creates
a snapshot namespace for the docs, as well as get repositories docs.
Relates #27205
This PR adds support for the Get Settings API to the java high-level rest client.
Furthermore, logic related to the retrieval of default settings has been moved from the rest layer into the transport layer and now default settings may be retrieved consistency via both the rest API and the transport API.
Adds two new methods to `RestClient` that take a `Request` object. These
methods will allows us to add more per-request customizable options
without creating more and more and more overloads of the `performRequest`
and `performRequestAsync` methods. These new methods look like:
```
Response performRequest(Request request)
```
and
```
void performRequestAsync(Request request, ResponseListener responseListener)
```
This change doesn't add any actual features but enables adding things like
per request timeouts and per request node selectors. This change *does*
rework the `HighLevelRestClient` and its tests to use these new `Request`
objects and it does update the docs.
This commit adds the distribution type to the startup scripts so that we
can discern from log output and the main response the type of the
distribution (deb/rpm/tar/zip).
As part of adding support for new API to the high-level REST client,
we added support for the `flat_settings` parameter to some of our
request classes. We added documentation that such flag is only ever
read by the high-level REST client, but the truth is that it doesn't
do anything given that settings are always parsed back into a `Settings`
object, no matter whether they are returned in a flat format or not.
It was a mistake to add support for this flag in the context of the
high-level REST client, hence this commit removes it.
This change validates that the `_search` request does not have trailing
tokens after the main object and fails the request with a parsing exception otherwise.
Closes#28995
Some features have been deprecated since `6.0` like the `_parent` field or the
ability to have multiple types per index. This allows to remove quite some
code, which in-turn will hopefully make it easier to proceed with the removal
of types.
Currently the ranking evaluation API doesn't support many of the
standard parameters of the search API. Some of these make sense, like
adding support for the common indices options parameters, which this
change adds.
When the `BulkProcessor` is used with the high-level REST client, a scheduler is internally created that allows to schedule tasks. Such scheduler is not exposed to users and needs to be closed once the `BulkProcessor` is closed. There are two ways to close the `BulkProcessor` though, one is the ordinary `close` method and the other one is `awaitClose`. The former closes the scheduler while the latter doesn't, leaving threads lingering.
* Remove all dependencies from XContentBuilder
This commit removes all of the non-JDK dependencies from XContentBuilder, with
the exception of `CollectionUtils.ensureNoSelfReferences`. It adds a third
extension point around dealing with time-based fields and formatters to work
around the Joda dependency.
This decoupling allows us to be able to move XContentBuilder to a separate lib
so it can be available for things like the high level rest client.
Relates to #28504
Some source files seem to have the execute bit (a+x) set, which doesn't
really seem to hurt but is a bit odd. This change removes those, making
the permissions similar to other source files in the repository.
Adds docs for `HighLevelRestClient#multiSearch`. Unlike the `multiGet`
docs these are much more sparse because multi-search doesn't support
setting many options on the `MultiSearchRequest` and instead just wraps
a list of `SearchRequest`s.
Closes#28389
Currently we store the indices specified in the request URL together with all
the other ranking evaluation specification in RankEvalSpec. This is not ideal
since e.g. the indices are not rendered to xContent and so cannot be parsed
back. Instead we should keep them in RankEvalRequest.
* Decouple XContentBuilder from BytesReference
This commit removes all mentions of `BytesReference` from `XContentBuilder`.
This is needed so that we can completely decouple the XContent code and move it
into its own dependency.
While this change appears large, it is due to two main changes, moving
`.bytes()` and `.string()` out of XContentBuilder itself into static methods
`BytesReference.bytes` and `Strings.toString` respectively. The rest of the
change is code reacting to these changes (the majority of it in tests).
Relates to #28504
The REST status 503 means "I can not handle the request that you sent
me." However today we respond to a main request with a 503 when there
are certain cluster blocks despite still responding with an actual main
response. This is broken, we should respond with a 200 status. This
commit removes this silliness.
The recent addition of the _cluster/settings API was merged together with another change that added encoding of the different URL parts. That broke the cluster PUT settings API straight-away. This commit fixes this problem. The '/' that's part of the /_cluster/settings endpoint should not be encoded.
The REST high-level client supports now encoding of path parts, so that for instance documents with valid ids, but containing characters that need to be encoded as part of urls (`#` etc.), are properly supported. We also make sure that each path part can contain `/` by encoding them properly too.
Closes#28625