Improve the hard_limit memory audit message by reporting how many bytes
over the configured memory limit the job was at the point of the last
allocation failure.
Previously the model memory usage was reported, however this was
inaccurate and hence of limited use - primarily because the total
memory used by the model can decrease significantly after the models
status is changed to hard_limit but before the model size stats are
reported from autodetect to ES.
While this PR contains the changes to the format of the hard_limit audit
message it is dependent on modifications to the ml-cpp backend to
send additional data fields in the model size stats message. These
changes will follow in a subsequent PR. It is worth noting that this PR
must be merged prior to the ml-cpp one, to keep CI tests happy.
* [ML] adding pivot.size option for setting paging size
* Changing field name to address PR comments
* fixing ctor usage
* adjust hlrc for field name change
* [ML] Adds progress reporting for transforms
* fixing after master merge
* Addressing PR comments
* removing unused imports
* Adjusting afterKey handling and percentage to be 100*
* Making sure it is a linked hashmap for serialization
* removing unused import
* addressing PR comments
* removing unused import
* simplifying code, only storing total docs and decrementing
* adjusting for rewrite
* removing initial progress gathering from executor
Today the `_field_caps` API returns the list of indices where a field
is present only if this field has different types within the requested indices.
However if the request is an index pattern (or an alias, or both...) there
is no way to infer the indices if the response contains only fields that have
the same type in all indices. This commit changes the response to always return
the list of indices in the response. It also adds a way to retrieve unmapped field
in a specific section per field called `unmapped`. This section is created for each field
that is present in some indices but not all if the parameter `include_unmapped` is set to
true in the request (defaults to false).
In order to support empty action metadata in the first msearch item,
we need to remove support for prepending msearch request body with an
empty line, which prevents us from parsing the empty line as action
metadata for the first search item.
Relates to #41011
hamcrest has some improvements in newer versions, like FileMatchers
that make assertions regarding file exists cleaner. This commit upgrades
to the latest version of hamcrest so we can start using new and improved
matchers.
* The test fails for the retry backoff enabled case because the retry handler in the bulk processor hasn't been adjusted to account for #40866 which now might lead to an outright rejection of the request instead of its items individually
* Fixed by adding retry functionality to the top level request as well
* Also fixed the duplicate test for the HLRC that wasn't handling the non-backoff case yet the same way the non-client IT did
* closes#41324
This change adds either ToXContentObject or ToXContentFragment to classes
directly implementing ToXContent currently. This helps in reasoning about
whether those implementations output full xcontent object or just fragments.
Relates to #16347
* moved hlrc parsing tests from xpack to hlrc module and removed dependency on hlrc from xpack core
* deprecated old base test class
* added deprecated jdoc tag
* split test between xpack-core part and hlrc part
* added lang-mustache test dependency, this previously came in via
hlrc dependency.
* added hlrc dependency on a qa module
* duplicated ClusterPrivilegeName class in xpack-core, since x-pack
core no longer has a dependency on hlrc.
* replace ClusterPrivilegeName usages with string literals
* moved tests to dedicated to hlrc packages in order to remove Hlrc part from the name and make sure to use imports instead of full qualified class where possible
* remove ESTestCase. from method invocation and use method directly,
because these tests indirectly extend from ESTestCase
changed hlrc ccr request tests to use AbstractRequestTestCase base class.
This way the request classes are tested in a more realistic setting.
Note this change also adds a test dependency on xpack core module.
Similar to #39844 but then for hlrc request serialization tests.
Removed iterators from hlrc parsing tests.
Use empty xcontent registries.
Relates to #39745
* Replace usages RandomizedTestingTask with built-in Gradle Test (#40978)
This commit replaces the existing RandomizedTestingTask and supporting code with Gradle's built-in JUnit support via the Test task type. Additionally, the previous workaround to disable all tasks named "test" and create new unit testing tasks named "unitTest" has been removed such that the "test" task now runs unit tests as per the normal Gradle Java plugin conventions.
(cherry picked from commit 323f312bbc829a63056a79ebe45adced5099f6e6)
* Fix forking JVM runner
* Don't bump shadow plugin version
Made changes to ensure that unique IDs are generated for model snapshots
used by the deleteExpiredDataTest test in the MachineLearningIT suite.
Previously a sleep of 1s was performed between jobs under the assumption
that this would be sufficient to guarantee that the timestamps used in
the composition of the snapshot IDs would be different.
The new approach is to wait on the condition that the old and new
timestamps are in fact different (to 1s resolution).
This change updates our version of httpclient to version 4.5.8, which
contains the fix for HTTPCLIENT-1968, which is a bug where the client
started re-writing paths that contained encoded reserved characters
with their unreserved form.
Many gradle projects specifically use the -try exclude flag, because
there are many cases where auto-closeable resource ignore is never
referenced in body of corresponding try statement. Suppressing this
warning specifically in each case that it happens using
`@SuppressWarnings("try")` would be very verbose.
This change removes `-try` from any gradle project and adds it to the
build plugin. Also this change removes exclude flags from gradle projects
that is already specified in build plugin (for example -deprecation).
Relates to #40366
This commit fixes a problem with BWC that was brought up in #40511. A
newer version of the code was emitting a new value for an enum to an
older version, and the older version could not handle that. It caused
the response to error. The MainResponse is now relaxed, and will accept
whatever values the server expose, and holds most of them as Strings
instead of complex objects.
Fixes#40511
This adds a new `role_templates` field to role mappings that is an
alternative to the existing roles field.
These templates are evaluated at runtime to determine which roles should be
granted to a user.
For example, it is possible to specify:
"role_templates": [
{ "template":{ "source": "_user_{{username}}" } }
]
which would mean that every user is assigned to their own role based on
their username.
You may not specify both roles and role_templates in the same role
mapping.
This commit adds support for templates to the role mapping API, the role
mapping engine, the Java high level rest client, and Elasticsearch
documentation.
Due to the lack of caching in our role mapping store, it is currently
inefficient to use a large number of templated role mappings. This will be
addressed in a future change.
Backport of: #39984, #40504
It initially mentioned the type in the exception because the type used to be
required to uniquely identify a document. This is not necessary anymore given
that indices have at most one type.
The reindex family of APIs (reindex, update-by-query, delete-by-query) can
sometimes return responses that have an error status code (409 Conflict in this
case) but still have a body in the usual BulkByScrollResponse format. When the
HLRC tries to handle such responses, it blows up because it tris to parse it
expecting the error format that errors in general use. This change prompts the
HLRC to parse the response using the expected BulkByScrollResponse format.
* [ML] Add data frame task state object and field
* A new state item is added so that the overall task state can be
accoutned for
* A new FAILED state and reason have been added as well so that failures
can be shown to the user for optional correction
* Addressing PR comments
* adjusting after master merge
* addressing pr comment
* Adjusting auditor usage with failure state
* Refactor, renamed state items to task_state and indexer_state
* Adding todo and removing redundant auditor call
* Address HLRC changes and PR comment
* adjusting hlrc IT test
* [ML] make source and dest objects in the transform config
* addressing PR comments
* Fixing compilation post merge
* adding comment for Arrays.hashCode
* addressing changes for moving dest to object
* fixing data_frame yml tests
* fixing API test
The base class facilitates generating a server side response test instance,
that gets serialized as xcontent, which then gets parsed into a hlrc response
instance, which then gets asserted against the server side response instance.
This way of testing is more realistic then how hlrc response classes are tested
today, which basically tests that serialization works by generating
hlrc response instance, serialize that to xcontent and then parse it back
to a hlrc response instance.
Besides adding a base test class, this change also cuts AcknowledgedResponseTests
and BroadcastResponseTests over to use this base class.
Relates to #39745
The Migration Assistance API has been functionally replaced by the
Deprecation Info API, and the Migration Upgrade API is not used for the
transition from ES 6.x to 7.x, and does not need to be kept around to
repair indices that were not properly upgraded before upgrading the
cluster, as was the case in 6.
This PR adds an internal REST API for querying context information about
Painless whitelists.
Commands include the following:
GET /_scripts/painless/_context -- retrieves a list of contexts
GET /_scripts/painless/_context?context=%name% retrieves all available
information about the API for this specific context
As discovered in #40041, when parsing certificates from files, the
SUN Security Provider normalizes DNs from parsed certificates by
adding spaces between RDNs, while the BouncyCastle one (which we
use in FIPS tests) does not.
We could proceed to normalize the DNs in the same manner in this
test by using i.e. the Unbound LDAP SDK but since the goal of this
test is to validate that we do get to read these exact certificates
from our trust sources and not to validate subject DNs, this commit
changes the test to check the serial number instead
Resolves: #40041
The monitoring bulk API accepts the same format as the bulk API, yet its concept
of types is different from "mapping types" and the deprecation warning is only
emitted as a side-effect of this API reusing the parsing logic of bulk requests.
This commit extracts the parsing logic from `_bulk` into its own class with a
new flag that allows to configure whether usage of `_type` should emit a warning
or not. Support for payloads has been removed for simplicity since they were
unused.
@jakelandis has a separate change that removes this notion of type from the
monitoring bulk API that we are considering bringing to 8.0.
This commit introduces the forget follower API. This API is needed in cases that
unfollowing a following index fails to remove the shard history retention leases
on the leader index. This can happen explicitly through user action, or
implicitly through an index managed by ILM. When this occurs, history will be
retained longer than necessary. While the retention lease will eventually
expire, it can be expensive to allow history to persist for that long, and also
prevent ILM from performing actions like shrink on the leader index. As such, we
introduce an API to allow for manual removal of the shard history retention
leases in this case.
This change adds two new cluster privileges:
* manage_data_frame_transforms
* monitor_data_frame_transforms
And two new built-in roles:
* data_frame_transforms_admin
* data_frame_transforms_user
These permit access to the data frame transform endpoints.
(Index privileges are also required on the source and
destination indices for each data frame transform, but
since these indices are configurable they it is not
appropriate to grant them via built-in roles.)
The data frame plugin allows users to create feature indexes by pivoting a source index. In a
nutshell this can be understood as reindex supporting aggregations or similar to the so called entity
centric indexing.
Full history is provided in: feature/data-frame-transforms
We have had various reports of problems caused by the maxRetryTimeout
setting in the low-level REST client. Such setting was initially added
in the attempts to not have requests go through retries if the request
already took longer than the provided timeout.
The implementation was problematic though as such timeout would also
expire in the first request attempt (see #31834), would leave the
request executing after expiration causing memory leaks (see #33342),
and would not take into account the http client internal queuing (see #25951).
Given all these issues, it seems that this custom timeout mechanism
gives little benefits while causing a lot of harm. We should rather rely
on connect and socket timeout exposed by the underlying http client
and accept that a request can overall take longer than the configured
timeout, which is the case even with a single retry anyways.
This commit removes the `maxRetryTimeout` setting and all of its usages.
Updated IndexTemplateMetaData to use ObjectParser.
The IndexTemplateMetaData class used old parsing logic and was not
resiliant to new fields. This commit updates it to use the
ConstructingObjectParser and allow unknown fields.
Relates #36938
Co-authored-by: Michael Basnight <mbasnight@gmail.com>
Co-authored-by: Martijn van Groningen <martijn.v.groningen@gmail.com>
This test has been disabled since November 2018, but I was not able to reproduce
the failure. Re-enabling this so we can see the full log and get more context if
it fails again.
Relates to #35514
Elasticsearch has long [supported](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#index-versioning) compare and set (a.k.a optimistic concurrency control) operations using internal document versioning. Sadly that approach is flawed and can sometime do the wrong thing. Here's the relevant excerpt from the resiliency status page:
> When a primary has been partitioned away from the cluster there is a short period of time until it detects this. During that time it will continue indexing writes locally, thereby updating document versions. When it tries to replicate the operation, however, it will discover that it is partitioned away. It won’t acknowledge the write and will wait until the partition is resolved to negotiate with the master on how to proceed. The master will decide to either fail any replicas which failed to index the operations on the primary or tell the primary that it has to step down because a new primary has been chosen in the meantime. Since the old primary has already written documents, clients may already have read from the old primary before it shuts itself down. The version numbers of these reads may not be unique if the new primary has already accepted writes for the same document
We recently [introduced](https://www.elastic.co/guide/en/elasticsearch/reference/6.x/optimistic-concurrency-control.html) a new sequence number based approach that doesn't suffer from this dirty reads problem.
This commit removes support for internal versioning as a concurrency control mechanism in favor of the sequence number approach.
Relates to #1078
`CreateIndexRequest#source(Map<String, Object>, ... )`, which is used when
deserializing index creation requests, accidentally accepts mappings that are
nested twice under the type key (as described in the bug report #38266).
This in turn causes us to be too lenient in parsing typeless mappings. In
particular, we accept the following index creation request, even though it
should not contain the type key `_doc`:
```
PUT index?include_type_name=false
{
"mappings": {
"_doc": {
"properties": { ... }
}
}
}
```
There is a similar issue for both 'put templates' and 'put mappings' requests
as well.
This PR makes the minimal changes to detect and reject these typed mappings in
requests. It does not address #38266 generally, or attempt a larger refactor
around types in these server-side requests, as I think this should be done at a
later time.
This PR fixes a couple test issues:
* It narrows an assertWarnings call that was too broad, and wasn't always
applicable with certain random sequences.
* Previously, we could send a typeless bulk request containing '_type: 'null'.
Now we omit the _type key altogether for typeless requests.
This commit ensures that the parts of rollup caps that can allow unknown
fields will allow them. It also modifies the test such that we can use
the features we need for disallowing fields in spots where they would
not be allowed.
Relates #36938
Introduced FollowParameters class that put follow, resume follow,
put auto follow pattern requests and follow info response classes reuse.
The FollowParameters class had the fields, getters etc. for the common parameters
that all these APIs have. Also binary and xcontent serialization /
parsing is handled by this class.
The follow, resume follow, put auto follow pattern request classes originally
used optional non primitive fields, so FollowParameters has that too and the follow info api can handle that now too.
Also the followerIndex field can in production only be specified via
the url path. If it is also specified via the request body then
it must have the same value as is specified in the url path. This
option only existed to xcontent testing. However the AbstractSerializingTestCase
base class now also supports createXContextTestInstance() to provide
a different test instance when testing xcontent, so allowing followerIndex
to be specified via the request body is no longer needed.
By moving the followerIndex field from Body to ResumeFollowAction.Request
class and not allowing the followerIndex field to be specified via
the request body the Body class is redundant and can be removed. The
ResumeFollowAction.Request class can then directly use the
FollowParameters class.
For consistency I also removed the ability to specified followerIndex
in the put follow api and the name in put auto follow pattern api via
the request body.
* Adding apm_user
* Fixing SecurityDocumentationIT testGetRoles test
* Adding access to .ml-anomalies-*
* Fixing APM test, we don't have access to the ML state index
X-Pack security supports built-in authentication service
`token-service` that allows access tokens to be used to
access Elasticsearch without using Basic authentication.
The tokens are generated by `token-service` based on
OAuth2 spec. The access token is a short-lived token
(defaults to 20m) and refresh token with a lifetime of 24 hours,
making them unsuitable for long-lived or recurring tasks where
the system might go offline thereby failing refresh of tokens.
This commit introduces a built-in authentication service
`api-key-service` that adds support for long-lived tokens aka API
keys to access Elasticsearch. The `api-key-service` is consulted
after `token-service` in the authentication chain. By default,
if TLS is enabled then `api-key-service` is also enabled.
The service can be disabled using the configuration setting.
The API keys:-
- by default do not have an expiration but expiration can be
configured where the API keys need to be expired after a
certain amount of time.
- when generated will keep authentication information of the user that
generated them.
- can be defined with a role describing the privileges for accessing
Elasticsearch and will be limited by the role of the user that
generated them
- can be invalidated via invalidation API
- information can be retrieved via a get API
- that have been expired or invalidated will be retained for 1 week
before being deleted. The expired API keys remover task handles this.
Following are the API key management APIs:-
1. Create API Key - `PUT/POST /_security/api_key`
2. Get API key(s) - `GET /_security/api_key`
3. Invalidate API Key(s) `DELETE /_security/api_key`
The API keys can be used to access Elasticsearch using `Authorization`
header, where the auth scheme is `ApiKey` and the credentials, is the
base64 encoding of API key Id and API key separated by a colon.
Example:-
```
curl -H "Authorization: ApiKey YXBpLWtleS1pZDphcGkta2V5" http://localhost:9200/_cluster/health
```
Closes#34383
The HLRC client currently uses `org.elasticsearch.action.admin.indices.get.GetIndexRequest`
and `org.elasticsearch.action.admin.indices.get.GetIndexResponse` in its get index calls. Both request and
response are designed for the typed APIs, including some return types e.g. for `getMappings()` which in
the maps it returns still use a level including the type name.
In order to change this without breaking existing users of the HLRC API, this PR introduces two new request
and response objects in the `org.elasticsearch.client.indices` client package. These are used by the
IndicesClient#get and IndicesClient#exists calls now by default and support the type-less API. The old request
and response objects are still kept for use in similarly named, but deprecated methods.
The newly introduced client side classes are simplified versions of the server side request/response classes since
they don't need to support wire serialization, and only the response needs fromXContent parsing (but no
xContent-serialization, since this is the responsibility of the server-side class).
Also changing the return type of `GetIndexResponse#getMapping` to
`Map<String, MappingMetaData> getMappings()`, while it previously was returning another map
keyed by the type-name. Similar getters return simple Maps instead of the ImmutableOpenMaps that the
server side response objects return.
As mapping types are being removed throughout Elasticsearch, the use of
`_type` in pipeline simulation requests is deprecated. Additionally, the
default `_type` used if one is not supplied has been changed to `_doc` for
consistency with the rest of Elasticsearch.
IndexLifecycleExplainResponse did not allow unknown fields. This commit
fixes the test and ConstructingObjectParser such that it allows unknown
fields.
This commit deprecates the few methods that had their parameters
reordered to facilitate the move from EmptyResponse to boolean. This
commit also readds the boolean based methods with the proper
signatures.
Relates #37540
Relates #36938
With this commit we add a monotonically strict timer to ensure time is
advancing even if the timer is called in a tight loop in tests. We also
relax a condition in a similar test so it only checks that time is not
moving backwards.
Closes#33747
Implements `geotile_grid` aggregation
This patch refactors previous implementation https://github.com/elastic/elasticsearch/pull/30240
This code uses the same base classes as `geohash_grid` agg, but uses a different hashing
algorithm to allow zoom consistency. Each grid bucket is aligned to Web Mercator tiles.
With #37566 we have introduced the ability to merge multiple search responses into one. That makes it possible to expose a new way of executing cross-cluster search requests, that makes CCS much faster whenever there is network latency between the CCS coordinating node and the remote clusters. The coordinating node can now send a single search request to each remote cluster, which gets reduced by each one of them. from + size results are requested to each cluster, and the reduce phase in each cluster is non final (meaning that buckets are not pruned and pipeline aggs are not executed). The CCS coordinating node performs an additional, final reduction, which produces one search response out of the multiple responses received from the different clusters.
This new execution path will be activated by default for any CCS request unless a scroll is provided or inner hits are requested as part of field collapsing. The search API accepts now a new parameter called ccs_minimize_roundtrips that allows to opt-out of the default behaviour.
Relates to #32125
The existing implementation was slow due to exceptions being thrown if
an accessor did not have a time zone. This implementation queries for
having a timezone, local time and local date and also checks for an
instant preventing to throw an exception and thus speeding up the conversion.
This removes the existing method and create a new one named
DateFormatters.from(TemporalAccessor accessor) to resemble the naming of
the java time ones.
Before this change an epoch millis parser using the toZonedDateTime
method took approximately 50x longer.
Relates #37826
* move watcher to seq# occ
* top level set
* fix parsing and missing setters
* share toXContent for PutResponse and rest end point
* fix redacted password
* fix username reference
* fix deactivate-watch.asciidoc have seq no references
* add seq# + term to activate-watch.asciidoc
* more doc fixes
This commit fixes the test case that ensures only a priority
less then 0 is used with testNonPositivePriority. This also
allows the HLRC to support a value of 0.
Closes#37652
Previously, ShrinkAction would fail if
it was executed on an index that had
the same number of shards as the target
shrunken number.
This PR introduced a new BranchingStep that
is used inside of ShrinkAction to branch which
step to move to next, depending on the
shard values. So no shrink will occur if the
shard count is unchanged.
The apache commons http client implementations recently released
versions that solve TLS compatibility issues with the new TLS engine
that supports TLSv1.3 with JDK 11. This change updates our code to
use these versions since JDK 11 is a supported JDK and we should
allow the use of TLSv1.3.
A few of the ILM Lifecycle Policy and classes did not allow for unknown
fields. This commit sets those parsers and fixes the tests for the
parsers.
Relates #36938
The subparser in verify repository allows for unknown fields. This
commit sets the value to true for the parser and modifies the test such
that it accurately tests it.
Relates #36938
The LLRC's exception handling for strict mode was previously throwing an
exception the HLRC assumed was an error response. This is not the case
if the result is valid in strict mode, as it will return the proper
response wrapped in an exception with warnings. This commit fixes the
HLRC such that it no longer spews if it encounters a strict LLRC
response.
Closes#37090
Added deprecation warnings for use of include_type_name in put/get index templates.
HLRC changes:
GetIndexTemplateRequest has a new client-side class which is a copy of server's GetIndexTemplateResponse but modified to be typeless.
PutIndexTemplateRequest has a new client-side counterpart which doesn't use types in the mappings
Relates to #35190
This commit modifies the put follow index action to use a
CcrRepository when creating a follower index. It routes
the logic through the snapshot/restore process. A
wait_for_active_shards parameter can be used to configure
how long to wait before returning the response.
The update request has a lesser known support for a one off update of a known document version. This PR adds an a seq# based alternative to power these operations.
Relates #36148
Relates #10708
We inject an Unfollow action before Shrink because the Shrink action
cannot be safely used on a following index, as it may not be fully
caught up with the leader index before the "original" following index is
deleted and replaced with a non-following Shrunken index. The Unfollow
action will verify that 1) the index is marked as "complete", and 2) all
operations up to this point have been replicated from the leader to the
follower before explicitly disconnecting the follower from the leader.
Injecting an Unfollow action before the Rollover action is done mainly
as a convenience: This allow users to use the same lifecycle policy on
both the leader and follower cluster without having to explictly modify
the policy to unfollow the index, while doing what we expect users to
want in most cases.
From previous PRs, we've already added support for include_type_name to
the get mapping API. We had also taken an approach to the HLRC where the
server-side `GetMappingResponse#fromXContent` could only handle typeless
input.
This PR updates the HLRC for 'get mapping' to be in line with our new approach:
* Add a typeless 'get mappings' method to the Java HLRC, that accepts new
client-side request and response objects. This new response only handles
typeless mapping definitions.
* Switch the old version of `GetMappingResponse` back to expecting typed
mappings, and deprecate the corresponding method on the HLRC.
Finally, the PR also does some small, related clean-up around 'get field mappings'.