This is related to #36652. We intend to deprecate a number of transport
settings in 7.x and remove them in 8.0. This commit removes the string
usages of these settings.
This commit removes xpack dependencies of many xpack qa modules.
(for some qa modules this will require some more work)
The reason behind this change is that qa rest modules should not depend
on the x-pack plugins, because the plugins are an implementation detail and
the tests should only know about the rest interface and qa cluster that is
being tested.
Also some qa modules rely on xpack plugins and hlrc (which is a valid
dependency for rest qa tests) creates a cyclic dependency and this is
something that we should avoid. Also Eclipse can't handle gradle cyclic
dependencies (see #41064).
* don't copy xpack-core's plugin property into the test resource of qa
modules. Otherwise installing security manager fails, because it tries
to find the XPackPlugin class.
This commit adds an OpenID Connect authentication realm to
elasticsearch. Elasticsearch (with the assistance of kibana or
another web component) acts as an OpenID Connect Relying
Party and supports the Authorization Code Grant and Implicit
flows as described in http://ela.st/oidc-spec. It adds support
for consuming and verifying signed ID Tokens, both RP
initiated and 3rd party initiated Single Sign on and RP
initiated signle logout.
It also adds an OpenID Connect Provider in the idp-fixture to
be used for the associated integration tests.
This is a backport of #40674
Backport of (#41087)
* Use environment settings instead of state settings for Watcher config
Prior to this we used the settings from cluster state to see whether ILM was
enabled of disabled, however, these settings don't accurately reflect the
`xpack.ilm.enabled` setting in `elasticsearch.yml`.
This commit changes to using the `Environment` settings, which correctly reflect
the ILM enabled setting.
Resolves#41042
* moved hlrc parsing tests from xpack to hlrc module and removed dependency on hlrc from xpack core
* deprecated old base test class
* added deprecated jdoc tag
* split test between xpack-core part and hlrc part
* added lang-mustache test dependency, this previously came in via
hlrc dependency.
* added hlrc dependency on a qa module
* duplicated ClusterPrivilegeName class in xpack-core, since x-pack
core no longer has a dependency on hlrc.
* replace ClusterPrivilegeName usages with string literals
* moved tests to dedicated to hlrc packages in order to remove Hlrc part from the name and make sure to use imports instead of full qualified class where possible
* remove ESTestCase. from method invocation and use method directly,
because these tests indirectly extend from ESTestCase
* Replace usages RandomizedTestingTask with built-in Gradle Test (#40978)
This commit replaces the existing RandomizedTestingTask and supporting code with Gradle's built-in JUnit support via the Test task type. Additionally, the previous workaround to disable all tasks named "test" and create new unit testing tasks named "unitTest" has been removed such that the "test" task now runs unit tests as per the normal Gradle Java plugin conventions.
(cherry picked from commit 323f312bbc829a63056a79ebe45adced5099f6e6)
* Fix forking JVM runner
* Don't bump shadow plugin version
* Avoid sharing source directories as it breaks intellij
* Subprojects share main project output classes directory
* Fix jar hell
* Fix sql security with ssl integ tests
* Relax dependency ordering rule so we don't explode on cycles
Many gradle projects specifically use the -try exclude flag, because
there are many cases where auto-closeable resource ignore is never
referenced in body of corresponding try statement. Suppressing this
warning specifically in each case that it happens using
`@SuppressWarnings("try")` would be very verbose.
This change removes `-try` from any gradle project and adds it to the
build plugin. Also this change removes exclude flags from gradle projects
that is already specified in build plugin (for example -deprecation).
Relates to #40366
By default, in integ tests we wait for the standalone cluster to start
by using the ant Get task to retrieve the cluster health endpoint.
However the ant task has no facilities for customising the trusted
CAs for a https resource, so if the integ test cluster has TLS enabled
on the http interface (using a custom CA) we need a separate utility
for that purpose.
Backport of: #40573
* [ML] Addressing bug streaming DatafeedConfig aggs from (<= 6.5.4) -> 6.7.0 (#40610)
* Addressing stream failure and adding tests to catch such in the future
* Add aggs to full cluster restart tests
* Test BWC for datafeeds with and without aggs
The wire serialisation is different for null/non-null
aggs, so it's worth testing both cases.
* Fixing bwc test, removing types
* Fixing BWC test for datafeed
* Update 40_ml_datafeed_crud.yml
* Update build.gradle
This change removes the variants of the rolling upgrade and full
cluster restart tests that use or do not use a system key. These tests
were added during 5.x when the system key was still used for security
and now the system key is only used as the watcher encryption key so
duplicating rolling upgrade and full cluster restarts is not needed.
The change here removes the subprojects for testing these scenarios and
defaults to always run with the watcher sensitive values encrypted for
these tests.
Replaces the vagrant based kerberos fixtures with docker based test fixtures plugin.
The configuration is now entirely static on the docker side and no longer driven by Gradle,
also two different services are being configured since there are two different consumers of the fixture that can run in parallel and require different configurations.
This change removes the use of hardcoded port values for the
idp-fixture in favor of the mapped ephemeral ports. This should prevent
failures due to port conflicts in CI.
The Migration Assistance API has been functionally replaced by the
Deprecation Info API, and the Migration Upgrade API is not used for the
transition from ES 6.x to 7.x, and does not need to be kept around to
repair indices that were not properly upgraded before upgrading the
cluster, as was the case in 6.
This commit enables full-cluster-restart and rolling-upgrade tests
to run with nodes using a JVM in fips approved only node by using
PEM key material instead of a JKS for the transport layer in that
case.
The change replaces the Vagrant box based fixture with a fixture
based on docker compose and 2 docker images, one for an openldap
server and one for a Shibboleth SAML Identity Provider.
The configuration of both openldap and shibboleth is identical to
the previous one, in order to minimize required changes in the
tests
This change is a backport of #39252
- Fixes TokenBackwardsCompatibilityIT: Existing tests seemed to made
the assumption that in the oneThirdUpgraded stage the master node
will be on the old version and in the twoThirdsUpgraded stage, the
master node will be one of the upgraded ones. However, there is no
guarantee that the master node in any of the states will or will
not be one of the upgraded ones.
This class now tests:
- That we can generate and consume tokens before we start the
rolling upgrade.
- That we can consume tokens generated in the old cluster during
all the stages of the rolling upgrade.
- That while on a mixed cluster, when/if the master node is
upgraded, we can generate, consume and refresh a token
- That after the rolling upgrade, we can consume a token
generated in an old cluster and can invalidate it so that it
can't be used any more.
- Ensures that during the rolling upgrade, the upgraded nodes have
the same configuration as the old nodes. Specifically that the
file realm we use is explicitly named `file1`. This is needed
because while attempting to refresh a token in a mixed cluster
we might create a token hitting an old node and attempt to refresh
it hitting a new node. If the file realm name is not the same, the
refresh will be seen as being made by a "different" client, and
will, thus, fail.
- Renames the Authentication variable we check while refreshing a
token to be clientAuth in order to make the code more readable.
Some of the above were possibly causing the flakiness of #37379
* Remove Hipchat support from Watcher (#39199)
Hipchat has been shut down and has previously been deprecated in
Watcher (#39160), therefore we should remove support for these actions.
* Add migrate note
There have been intermittent failures where either
LDAP server could not be started or KDC server could
not be started causing failures during test runs.
`KdcNetwork` class from Apache kerby project does not set reuse
address to `true` on the socket so if the port that we found to be free
is in `TIME_WAIT` state it may fail to bind. As this is an internal
class for kerby, I could not find a way to extend.
This commit adds a retry loop for initialization. It will keep
trying in an await busy loop and fail after 10 seconds if not
initialized.
Closes#35982
* Disable specific locales for tests in fips mode
The Bouncy Castle FIPS provider that we use for running our tests
in fips mode has an issue with locale sensitive handling of Dates as
described in https://github.com/bcgit/bc-java/issues/405
This causes certificate validation to fail if any given test that
includes some form of certificate validation happens to run in one
of the locales. This manifested earlier in #33081 which was
handled insufficiently in #33299
This change ensures that the problematic 3 locales
* th-TH
* ja-JP-u-ca-japanese-x-lvariant-JP
* th-TH-u-nu-thai-x-lvariant-TH
will not be used when running our tests in a FIPS 140 JVM. It also
reverts #33299
* Fix#38623 remove xpack namespace REST API
Except for xpack.usage and xpack.info API's, this moves the last remaining API's out of the xpack namespace
* rename xpack api's inside inside the files as well
* updated yaml tests references to xpack namespaces api's
* update callsApi calls in the IT subclasses
* make sure docs testing does not use xpack namespaced api's
* fix leftover xpack namespaced method names in docs/build.gradle
* found another leftover reference
(cherry picked from commit ccb5d934363c37506b76119ac050a254fa80b5e7)
Follow index in follow cluster that follows an index in the leader cluster and another
follow index in the leader index that follows that index in the follow cluster.
During the upgrade index following is paused and after the upgrade
index following is resumed and then verified index following works as expected.
Relates to #38037
The rest of `CCRIT` is now no longer relevant, because the remaining
test tests the same of the index following test in the rolling upgrade
multi cluster module.
Added `tests.upgrade_from_version` version to test. It is not needed
in this branch, but is in 6.7 branch.
Closes#37231
* Add rolling upgrade multi cluster test module (#38277)
This test starts 2 clusters, each with 3 nodes.
First the leader cluster is started and tests are run against it and
then the follower cluster is started and tests execute against this two cluster.
Then the follower cluster is upgraded, one node at a time.
After that the leader cluster is upgraded, one node at a time.
Every time a node is upgraded tests are ran while both clusters are online.
(and either leader cluster has mixed node versions or the follower cluster)
This commit only tests CCR index following, but could be used for CCS tests as well.
In particular for CCR, unidirectional index following is tested during a rolling upgrade.
During the test several indices are created and followed in the leader cluster before or
while the follower cluster is being upgraded.
This tests also verifies that attempting to follow an index in the upgraded cluster
from the not upgraded cluster fails. After both clusters are upgraded following the
index that previously failed should succeed.
Relates to #37231 and #38037
* Filter out upgraded version index settings when starting index following (#38838)
The `index.version.upgraded` and `index.version.upgraded_string` are likely
to be different between leader and follower index. In the event that
a follower index gets restored on a upgraded node while the leader index
is still on non-upgraded nodes.
Closes#38835
Instead of using `WarningsHandler.PERMISSIVE`, we only match warnings
that are due to types removal.
This PR also renames `allowTypeRemovalWarnings` to `allowTypesRemovalWarnings`.
Relates to #37920.
* Rename integTest to bwcTestSample for bwc test projects
This change renames the `integTest` task to `bwcTestSample` for projects
testing bwc to make it possible to run all the bwc tests that check
would run without running on bwc tests.
This change makes it possible to add a new PR check on backports to make
sure these don't break BWC tests in master.
* Rename task as per PR
This commit adds the 7.1 version constant to the 7.x branch.
Co-authored-by: Andy Bristol <andy.bristol@elastic.co>
Co-authored-by: Tim Brooks <tim@uncontended.net>
Co-authored-by: Christoph Büscher <cbuescher@posteo.de>
Co-authored-by: Luca Cavanna <javanna@users.noreply.github.com>
Co-authored-by: markharwood <markharwood@gmail.com>
Co-authored-by: Ioannis Kakavas <ioannis@elastic.co>
Co-authored-by: Nhat Nguyen <nhat.nguyen@elastic.co>
Co-authored-by: David Roberts <dave.roberts@elastic.co>
Co-authored-by: Jason Tedor <jason@tedor.me>
Co-authored-by: Alpar Torok <torokalpar@gmail.com>
Co-authored-by: David Turner <david.turner@elastic.co>
Co-authored-by: Martijn van Groningen <martijn.v.groningen@gmail.com>
Co-authored-by: Tim Vernum <tim@adjective.org>
Co-authored-by: Albert Zaharovits <albert.zaharovits@gmail.com>
We have had various reports of problems caused by the maxRetryTimeout
setting in the low-level REST client. Such setting was initially added
in the attempts to not have requests go through retries if the request
already took longer than the provided timeout.
The implementation was problematic though as such timeout would also
expire in the first request attempt (see #31834), would leave the
request executing after expiration causing memory leaks (see #33342),
and would not take into account the http client internal queuing (see #25951).
Given all these issues, it seems that this custom timeout mechanism
gives little benefits while causing a lot of harm. We should rather rely
on connect and socket timeout exposed by the underlying http client
and accept that a request can overall take longer than the configured
timeout, which is the case even with a single retry anyways.
This commit removes the `maxRetryTimeout` setting and all of its usages.
`waitForRollUpJob` is an assertBusy that waits for the rollup job
to appear in the tasks list, and waits for it to be a certain state.
However, there was a null check around the state assertion, which meant
if the job _was_ null, the assertion would be skipped, and the
assertBusy would pass withouot an exception. This could then lead to
downstream assertions to fail because the job was not actually ready,
or in the wrong state.
This changes the test to assert the job is not null, so the assertBusy
operates as intended.
For some users, the built in authorization mechanism does not fit their
needs and no feature that we offer would allow them to control the
authorization process to meet their needs. In order to support this,
a concept of an AuthorizationEngine is being introduced, which can be
provided using the security extension mechanism.
An AuthorizationEngine is responsible for making the authorization
decisions about a request. The engine is responsible for knowing how to
authorize and can be backed by whatever mechanism a user wants. The
default mechanism is one backed by roles to provide the authorization
decisions. The AuthorizationEngine will be called by the
AuthorizationService, which handles more of the internal workings that
apply in general to authorization within Elasticsearch.
In order to support external authorization services that would back an
authorization engine, the entire authorization process has become
asynchronous, which also includes all calls to the AuthorizationEngine.
The use of roles also leaked out of the AuthorizationService in our
existing code that is not specifically related to roles so this also
needed to be addressed. RequestInterceptor instances sometimes used a
role to ensure a user was not attempting to escalate their privileges.
Addressing this leakage of roles meant that the RequestInterceptor
execution needed to move within the AuthorizationService and that
AuthorizationEngines needed to support detection of whether a user has
more privileges on a name than another. The second area where roles
leaked to the user is in the handling of a few privilege APIs that
could be used to retrieve the user's privileges or ask if a user has
privileges to perform an action. To remove the leakage of roles from
these actions, the AuthorizationService and AuthorizationEngine gained
methods that enabled an AuthorizationEngine to return the response for
these APIs.
Ultimately this feature is the work included in:
#37785#37495#37328#36245#38137#38219Closes#32435
`CreateIndexRequest#source(Map<String, Object>, ... )`, which is used when
deserializing index creation requests, accidentally accepts mappings that are
nested twice under the type key (as described in the bug report #38266).
This in turn causes us to be too lenient in parsing typeless mappings. In
particular, we accept the following index creation request, even though it
should not contain the type key `_doc`:
```
PUT index?include_type_name=false
{
"mappings": {
"_doc": {
"properties": { ... }
}
}
}
```
There is a similar issue for both 'put templates' and 'put mappings' requests
as well.
This PR makes the minimal changes to detect and reject these typed mappings in
requests. It does not address #38266 generally, or attempt a larger refactor
around types in these server-side requests, as I think this should be done at a
later time.
Renames the following settings to remove the mention of `zen` in their names:
- `discovery.zen.hosts_provider` -> `discovery.seed_providers`
- `discovery.zen.ping.unicast.concurrent_connects` -> `discovery.seed_resolver.max_concurrent_resolvers`
- `discovery.zen.ping.unicast.hosts.resolve_timeout` -> `discovery.seed_resolver.timeout`
- `discovery.zen.ping.unicast.hosts` -> `discovery.seed_addresses`
This PR removes the temporary change we made to the yml test harness in #37285
to automatically set `include_type_name` to `true` in index creation requests
if it's not already specified. This is possible now that the vast majority of
index creation requests were updated to be typeless in #37611. A few additional
tests also needed updating here.
Additionally, this PR updates the test harness to set `include_type_name` to
`false` in index creation requests when communicating with 6.x nodes. This
mirrors the logic added in #37611 to allow for typeless document write requests
in test set-up code. With this update in place, we can remove many references
to `include_type_name: false` from the yml tests.
With #37566 we have introduced the ability to merge multiple search responses into one. That makes it possible to expose a new way of executing cross-cluster search requests, that makes CCS much faster whenever there is network latency between the CCS coordinating node and the remote clusters. The coordinating node can now send a single search request to each remote cluster, which gets reduced by each one of them. from + size results are requested to each cluster, and the reduce phase in each cluster is non final (meaning that buckets are not pruned and pipeline aggs are not executed). The CCS coordinating node performs an additional, final reduction, which produces one search response out of the multiple responses received from the different clusters.
This new execution path will be activated by default for any CCS request unless a scroll is provided or inner hits are requested as part of field collapsing. The search API accepts now a new parameter called ccs_minimize_roundtrips that allows to opt-out of the default behaviour.
Relates to #32125
The certgen, certutil and saml-metadata tools did not correctly return
their exit code to the calling shell.
These commands now explicitly exit with the code that was returned
from the main(args, terminal) method.
This commit adds deprecation warnings for index actions
and search actions when executed via watcher. Unit and
integration tests updated accordingly.
relates #35190
Made the test tolerant to index upgrade being run
in between the old/mixed/upgraded portions. This
can occur because the rolling upgrade tests all
share the same indices.
Fixes#37763
* Testing conventions now checks for tests in main
This is the last outstanding feature of the old NamingConventionsTask,
so time to remove it.
* PR review
This commit removes the Index Audit Output type, following its deprecation
in 6.7 by 8765a31d4e6770. It also adds the migration notice (settings notice).
In general, the problem with the index audit output is that event indexing
can be slower than the rate with which audit events are generated,
especially during the daily rollovers or the rolling cluster upgrades.
In this situation audit events will be lost which is a terrible failure situation
for an audit system.
Besides of the settings under the `xpack.security.audit.index` namespace, the
`xpack.security.audit.outputs` setting has also been deprecated and will be
removed in 7. Although explicitly configuring the logfile output does not touch
any deprecation bits, this setting is made redundant in 7 so this PR deprecates
it as well.
Relates #29881
This change moves the update to the results index mappings
from the open job action to the code that starts the
autodetect process.
When a rolling upgrade is performed we need to update the
mappings for already-open jobs that are reassigned from an
old version node to a new version node, but the open job
action is not called in this case.
Closes#37607
The integ tests currently use the raw zip project name as the
distribution type. This commit simplifies this specification to be
"default" or "oss". Whether zip or tar is used should be an internal
implementation detail of the integ test setup, which can (in the future)
be platform specific.
Removes all sensitive settings (passwords, auth tokens, urls, etc...) for
watcher notifications accounts. These settings were deprecated (and
herein removed) in favor of their secure sibling that is set inside the
elasticsearch keystore. For example:
`xpack.notification.email.account.<id>.smtp.password`
is no longer a valid setting, and it is replaced by
`xpack.notification.email.account.<id>.smtp.secure_password`
This change fixes the setup of the SSL configuration for the test
openldap realm. The configuration was missing the realm identifier so
the SSL settings being used were just the default JDK ones that do not
trust the certificate of the idp fixture.
See #37591
Migrate ml job and datafeed config of open jobs and update
the parameters of the persistent tasks as they become unallocated
during a rolling upgrade. Block allocation of ml persistent tasks
until the configs are migrated.
This commit removes the fallback for SSL settings. While this may be
seen as a non user friendly change, the intention behind this change
is to simplify the reasoning needed to understand what is actually
being used for a given SSL configuration. Each configuration now needs
to be explicitly specified as there is no global configuration or
fallback to some other configuration.
Closes#29797
Added warnings checks to existing tests
Added “defaultTypeIfNull” to DocWriteRequest interface so that Bulk requests can override a null choice of document type with any global custom choice.
Related to #35190
* ML: add migrate anomalies assistant
* adjusting failure handling for reindex
* Fixing request and tests
* Adding tests to blacklist
* adjusting test
* test fix: posting data directly to the job instead of relying on datafeed
* adjusting API usage
* adding Todos and adjusting endpoint
* Adding types to reindexRequest
* removing unreliable "live" data test
* adding index refresh to test
* adding index refresh to test
* adding index refresh to yaml test
* fixing bad exists call
* removing todo
* Addressing remove comments
* Adjusting rest endpoint name
* making service have its own logger
* adjusting validity check for newindex names
* fixing typos
* fixing renaming
In Lucene 8 searches can skip non-competitive hits if the total hit count is not requested.
It is also possible to track the number of hits up to a certain threshold. This is a trade off to speed up searches while still being able to know a lower bound of the total hit count. This change adds the ability to set this threshold directly in the track_total_hits search option. A boolean value (true, false) indicates whether the total hit count should be tracked in the response. When set as an integer this option allows to compute a lower bound of the total hits while preserving the ability to skip non-competitive hits when enough matches have been collected.
Relates #33028
The phrase "missing authentication token" is historic and is based
around the use of "AuthenticationToken" objects inside the Realm code.
However, now that we have a TokenService and token API, this message
would sometimes lead people in the wrong direction and they would try
and generate a "token" for authentication purposes when they would
typically just need a username:password Basic Auth header.
This change replaces the word "token" with "credentials".
Commit #36786 updated docs and strings to reference transport.port instead of
transport.tcp.port. However, this breaks backwards compatibility tests
as the tests rely on string configurations and transport.port does not
exist prior to 6.6. This commit reverts the places were we reference
transport.tcp.port for tests. This work will need to be reintroduced in
a backwards compatible way.
This is related to #36652. In 7.0 we plan to deprecate a number of
settings that make reference to the concept of a tcp transport. We
mostly just have a single transport type now (based on tcp). Settings
should only reference tcp if they are referring to socket options. This
commit updates the settings in the docs. And removes string usages of
the old settings. Additionally it adds a missing remote compress setting
to the docs.
* [ML] Job and datafeed mappings with index template (#32719)
Index mappings for the configuration documents
* [ML] Job config document CRUD operations (#32738)
* [ML] Datafeed config CRUD operations (#32854)
* [ML] Change JobManager to work with Job config in index (#33064)
* [ML] Change Datafeed actions to read config from the config index (#33273)
* [ML] Allocate jobs based on JobParams rather than cluster state config (#33994)
* [ML] Return missing job error when .ml-config is does not exist (#34177)
* [ML] Close job in index (#34217)
* [ML] Adjust finalize job action to work with documents (#34226)
* [ML] Job in index: Datafeed node selector (#34218)
* [ML] Job in Index: Stop and preview datafeed (#34605)
* [ML] Delete job document (#34595)
* [ML] Convert job data remover to work with index configs (#34532)
* [ML] Job in index: Get datafeed and job stats from index (#34645)
* [ML] Job in Index: Convert get calendar events to index docs (#34710)
* [ML] Job in index: delete filter action (#34642)
This changes the delete filter action to search
for jobs using the filter to be deleted in the index
rather than the cluster state.
* [ML] Job in Index: Enable integ tests (#34851)
Enables the ml integration tests excluding the rolling upgrade tests and a lot of fixes to
make the tests pass again.
* [ML] Reimplement established model memory (#35500)
This is the 7.0 implementation of a master node service to
keep track of the native process memory requirement of each ML
job with an associated native process.
The new ML memory tracker service works when the whole cluster
is upgraded to at least version 6.6. For mixed version clusters
the old mechanism of established model memory stored on the job
in cluster state was used. This means that the old (and complex)
code to keep established model memory up to date on the job object
has been removed in 7.0.
Forward port of #35263
* [ML] Need to wait for shards to replicate in distributed test (#35541)
Because the cluster was expanded from 1 node to 3 indices would
initially start off with 0 replicas. If the original node was
killed before auto-expansion to 1 replica was complete then
the test would fail because the indices would be unavailable.
* [ML] DelayedDataCheckConfig index mappings (#35646)
* [ML] JIndex: Restore finalize job action (#35939)
* [ML] Replace Version.CURRENT in streaming functions (#36118)
* [ML] Use 'anomaly-detector' in job config doc name (#36254)
* [ML] Job In Index: Migrate config from the clusterstate (#35834)
Migrate ML configuration from clusterstate to index for closed jobs
only once all nodes are v6.6.0 or higher
* [ML] Check groups against job Ids on update (#36317)
* [ML] Adapt to periodic persistent task refresh (#36633)
* [ML] Adapt to periodic persistent task refresh
If https://github.com/elastic/elasticsearch/pull/36069/files is
merged then the approach for reallocating ML persistent tasks
after refreshing job memory requirements can be simplified.
This change begins the simplification process.
* Remove AwaitsFix and implement TODO
* [ML] Default search size for configs
* Fix TooManyJobsIT.testMultipleNodes
Two problems:
1. Stack overflow during async iteration when lots of
jobs on same machine
2. Not effectively setting search size in all cases
* Use execute() instead of submit() in MlMemoryTracker
We don't need a Future to wait for completion
* [ML][TEST] Fix NPE in JobManagerTests
* [ML] JIindex: Limit the size of bulk migrations (#36481)
* [ML] Prevent updates and upgrade tests (#36649)
* [FEATURE][ML] Add cluster setting that enables/disables config migration (#36700)
This commit adds a cluster settings called `xpack.ml.enable_config_migration`.
The setting is `true` by default. When set to `false`, no config migration will
be attempted and non-migrated resources (e.g. jobs, datafeeds) will be able
to be updated normally.
Relates #32905
* [ML] Snapshot ml configs before migrating (#36645)
* [FEATURE][ML] Split in batches and migrate all jobs and datafeeds (#36716)
Relates #32905
* SQL: Fix translation of LIKE/RLIKE keywords (#36672)
* SQL: Fix translation of LIKE/RLIKE keywords
Refactor Like/RLike functions to simplify internals and improve query
translation when chained or within a script context.
Fix#36039Fix#36584
* Fixing line length for EnvironmentTests and RecoveryTests (#36657)
Relates #34884
* Add back one line removed by mistake regarding java version check and
COMPAT jvm parameter existence
* Do not resolve addresses in remote connection info (#36671)
The remote connection info API leads to resolving addresses of seed
nodes when invoked. This is problematic because if a hostname fails to
resolve, we would not display any remote connection info. Yet, a
hostname not resolving can happen across remote clusters, especially in
the modern world of cloud services with dynamically chaning
IPs. Instead, the remote connection info API should be providing the
configured seed nodes. This commit changes the remote connection info to
display the configured seed nodes, avoiding a hostname resolution. Note
that care was taken to preserve backwards compatibility with previous
versions that expect the remote connection info to serialize a transport
address instead of a string representing the hostname.
* [Painless] Add boxed type to boxed type casts for method/return (#36571)
This adds implicit boxed type to boxed types casts for non-def types to create asymmetric casting relative to the def type when calling methods or returning values. This means that a user calling a method taking an Integer can call it with a Byte, Short, etc. legally which matches the way def works. This creates consistency in the casting model that did not previously exist.
* SNAPSHOTS: Adjust BwC Versions in Restore Logic (#36718)
* Re-enables bwc tests with adjusted version conditions now that #36397 enables concurrent snapshots in 6.6+
* ingest: fix on_failure with Drop processor (#36686)
This commit allows a document to be dropped when a Drop processor
is used in the on_failure fork of the processor chain.
Fixes#36151
* Initialize startup `CcrRepositories` (#36730)
Currently, the CcrRepositoryManger only listens for settings updates
and installs new repositories. It does not install the repositories that
are in the initial settings. This commit, modifies the manager to
install the initial repositories. Additionally, it modifies the ccr
integration test to configure the remote leader node at startup, instead
of using a settings update.
* [TEST] fix float comparison in RandomObjects#getExpectedParsedValue
This commit fixes a test bug introduced with #36597. This caused some
test failure as stored field values comparisons would not work when CBOR
xcontent type was used.
Closes#29080
* [Geo] Integrate Lucene's LatLonShape (BKD Backed GeoShapes) as default `geo_shape` indexing approach (#35320)
This commit exposes lucene's LatLonShape field as the
default type in GeoShapeFieldMapper. To use the new
indexing approach, simply set "type" : "geo_shape" in
the mappings without setting any of the strategy, precision,
tree_levels, or distance_error_pct parameters. Note the
following when using the new indexing approach:
* geo_shape query does not support querying by
MULTIPOINT.
* LINESTRING and MULTILINESTRING queries do not
yet support WITHIN relation.
* CONTAINS relation is not yet supported.
The tree, precision, tree_levels, distance_error_pct,
and points_only parameters are deprecated.
* TESTS:Debug Log. IndexStatsIT#testFilterCacheStats
* ingest: support default pipelines + bulk upserts (#36618)
This commit adds support to enable bulk upserts to use an index's
default pipeline. Bulk upsert, doc_as_upsert, and script_as_upsert
are all supported.
However, bulk script_as_upsert has slightly surprising behavior since
the pipeline is executed _before_ the script is evaluated. This means
that the pipeline only has access the data found in the upsert field
of the script_as_upsert. The non-bulk script_as_upsert (existing behavior)
runs the pipeline _after_ the script is executed. This commit
does _not_ attempt to consolidate the bulk and non-bulk behavior for
script_as_upsert.
This commit also adds additional testing for the non-bulk behavior,
which remains unchanged with this commit.
fixes#36219
* Fix duplicate phrase in shrink/split error message (#36734)
This commit removes a duplicate "must be a" from the shrink/split error
messages.
* Deprecate types in get_source and exist_source (#36426)
This change adds a new untyped endpoint `{index}/_source/{id}` for both the
GET and the HEAD methods to get the source of a document or check for its
existance. It also adds deprecation warnings to RestGetSourceAction that emit
a warning when the old deprecated "type" parameter is still used. Also updating
documentation and tests where appropriate.
Relates to #35190
* Revert "[Geo] Integrate Lucene's LatLonShape (BKD Backed GeoShapes) as default `geo_shape` indexing approach (#35320)"
This reverts commit 5bc7822562.
* Enhance Invalidate Token API (#35388)
This change:
- Adds functionality to invalidate all (refresh+access) tokens for all users of a realm
- Adds functionality to invalidate all (refresh+access)tokens for a user in all realms
- Adds functionality to invalidate all (refresh+access) tokens for a user in a specific realm
- Changes the response format for the invalidate token API to contain information about the
number of the invalidated tokens and possible errors that were encountered.
- Updates the API Documentation
After back-porting to 6.x, the `created` field will be removed from master as a field in the
response
Resolves: #35115
Relates: #34556
* Add raw sort values to SearchSortValues transport serialization (#36617)
In order for CCS alternate execution mode (see #32125) to be able to do the final reduction step on the CCS coordinating node, we need to serialize additional info in the transport layer as part of each `SearchHit`. Sort values are already present but they are formatted according to the provided `DocValueFormat` provided. The CCS node needs to be able to reconstruct the lucene `FieldDoc` to include in the `TopFieldDocs` and `CollapseTopFieldDocs` which will feed the `mergeTopDocs` method used to reduce multiple search responses (one per cluster) into one.
This commit adds such information to the `SearchSortValues` and exposes it through a new getter method added to `SearchHit` for retrieval. This info is only serialized at transport and never printed out at REST.
* Watcher: Ensure all internal search requests count hits (#36697)
In previous commits only the stored toXContent version of a search
request was using the old format. However an executed search request was
already disabling hit counts. In 7.0 hit counts will stay enabled by
default to allow for proper migration.
Closes#36177
* [TEST] Ensure shard follow tasks have really stopped.
Relates to #36696
* Ensure MapperService#getAllMetaFields elements order is deterministic (#36739)
MapperService#getAllMetaFields returns an array, which is created out of
an `ObjectHashSet`. Such set does not guarantee deterministic hash
ordering. The array returned by its toArray may be sorted differently
at each run. This caused some repeatability issues in our tests (see #29080)
as we pick random fields from the array of possible metadata fields,
but that won't be repeatable if the input array is sorted differently at
every run. Once setting the tests seed, hppc picks that up and the sorting is
deterministic, but failures don't repeat with the seed that gets printed out
originally (as a seed was not originally set).
See also https://issues.carrot2.org/projects/HPPC/issues/HPPC-173.
With this commit, we simply create a static sorted array that is used for
`getAllMetaFields`. The change is in production code but really affects
only testing as the only production usage of this method was to iterate
through all values when parsing fields in the high-level REST client code.
Anyways, this seems like a good change as returning an array would imply
that it's deterministically sorted.
* Expose Sequence Number based Optimistic Concurrency Control in the rest layer (#36721)
Relates #36148
Relates #10708
* [ML] Mute MlDistributedFailureIT
* Deprecate types in index API
- deprecate type-based constructors of IndexRequest
- update tests to use typeless IndexRequest constructors
- no yaml tests as they have been already added in #35790
Relates to #35190
An issue was introduced due to the merge of authorization_realms with
the change to use Affix Settings for realms.
The ".type" setting no longer exists as the type is now part of the
setting key.
* Upgrae plugin to latest and expose udp
* Explicit check for windows
* Rename the properties for the port numbers
* Tasks for pre and pos container actions
Redeprecates the `/_xpack/rollup` endpoints in favor of `/_rollup`.
When we cleanup the rollup in a cluster containing 6.x nodes we need to
use `/_xpack/rollup` instead of `/_rollup` because the 6.x nodes don't
know about `/_rollup`. In those cases we must ignore the deprecation
warnings that the 7.0 node will return for the end point.
Closes#36044
* This commit is part of our plan to deprecate and ultimately remove the use of _xpack in the REST APIs.
- REST API docs
- HLRC docs and doc tests
- Handle REST actions with deprecation warnings
- Changed endpoints in rest-api-spec and relevant file names
For each API, the following updates were made:
- Add deprecation warnings to `Rest*Action`, plus tests in `Rest*ActionTests`.
- For each REST yml test, make sure there is one version without types, and another legacy version that retains types (called *_with_types.yml).
- Deprecate relevant methods on the Java HLRC requests/ responses.
- Update documentation (for both the REST API and Java HLRC).
Moves all remaining (rolling-upgrade and mixed-version) REST tests to use Zen2. To avoid adding
extra configuration, it relies on Zen2 being set as the default discovery type. This required a few
smaller changes in other tests. I've removed AzureMinimumMasterNodesTests which tests Zen1
functionality and dates from a time where host providers were not configurable and each cloud
plugin had its own discovery.type, subclassing the ZenDiscovery class. I've also adapted a few tests
which were unnecessarily adding addTestZenDiscovery = false for the same legacy reasons. Finally,
this also moves the unconfigured-node-name REST test to Zen2, testing the auto-bootstrapping
functionality in development mode when no discovery configuration is provided.
Set rest_total_hits_as_int in search requests only on the new cluster,
the old cluster can be on a version older than don't support this
parameter (< 6.6) and total hits is not an object in 6x anyway.
Closes#36291
This commit changes the format of the `hits.total` in the search response to be an object with
a `value` and a `relation`. The `value` indicates the number of hits that match the query and the
`relation` indicates whether the number is accurate (in which case the relation is equals to `eq`)
or a lower bound of the total (in which case it is equals to `gte`).
This change also adds a parameter called `rest_total_hits_as_int` that can be used in the
search APIs to opt out from this change (retrieve the total hits as a number in the rest response).
Note that currently all search responses are accurate (`track_total_hits: true`) or they don't contain
`hits.total` (`track_total_hits: true`). We'll add a way to get a lower bound of the total hits in a
follow up (to allow numbers to be passed to `track_total_hits`).
Relates #33028
Closes#35435
- make it easier to add additional testing tasks with the proper configuration and add some where they were missing.
- mute or fix failing tests
- add a check as part of testing conventions to find classes not included in any testing task.
If the randomly selected port was already in use the Kerberos
tests would fail. This commit adds check to see if the network
port is available and if not continue to find one for KDC server.
If it does not find port after 100 retries it throws an exception.
Closes#34261
This change adds the support for rest_total_hits_as_int
in the watcher search inputs. Setting this parameter in the request
will transform the search response to contain the total hits as
a number (instead of an object).
Note that this parameter is currently a noop since #35849 is not
merged.
Closes#36008
In #30241 Realm settings were changed, but the Kerberos realm settings
were not registered correctly. This change fixes the registration of
those Kerberos settings.
Also adds a new integration test that ensures every internal realm can
be configured in a test cluster.
Also fixes the QA test for kerberos.
Resolves: #35942
Right now using the `GET /_tasks/<taskid>` API and causing a task to opt
in to saving its result after being completed requires permissions on
the `.tasks` index. When we built this we thought that that was fine,
but we've since moved towards not leaking details like "persisting task
results after the task is completed is done by saving them into an index
named `.tasks`." A more modern way of doing this would be to save the
tasks into the index "under the hood" and to have APIs to manage the
saved tasks. This is the first step down that road: it drops the
requirement to have permissions to interact with the `.tasks` index when
fetching task statuses and when persisting statuses beyond the lifetime
of the task.
In particular, this moves the concept of the "origin" of an action into
a more prominent place in the Elasticsearch server. The origin of an
action is ignored by the server, but the security plugin uses the origin
to make requests on behalf of a user in such a way that the user need
not have permissions to perform these actions. It *can* be made to be
fairly precise. More specifically, we can create an internal user just
for the tasks API that just has permission to interact with the `.tasks`
index. This change doesn't do that, instead, it uses the ubiquitus
"xpack" user which has most permissions because it is simpler. Adding
the tasks user is something I'd like to get to in a follow up change.
Instead, the majority of this change is about moving the "origin"
concept from the security portion of x-pack into the server. This should
allow any code to use the origin. To keep the change managable I've also
opted to deprecate rather than remove the "origin" helpers in the
security code. Removing them is almost entirely mechanical and I'd like
to that in a follow up as well.
Relates to #35573
Clients can use the Kerberos V5 security mechanism and when it
used this to establish security context it failed to do so as
Elasticsearch server only accepted Spengo mechanism.
This commit adds support to accept Kerberos V5 credentials
over spnego.
Closes#34763
- Add the authentication realm and lookup realm name and type in the response for the _authenticate API
- The authentication realm is set as the lookup realm too (instead of setting the lookup realm to null or empty ) when no lookup realm is used.
This commit removes the use of AbstractComponent in xpack where it was
still being extended. It has been replaced with explicit logger
declarations.
See #34488
* ML: Removing result_finalization_window && overlapping_buckets
* Reverting bad method deletions
* Setting to current before backport to try and get a green build
* fixing testBuildAutodetectCommand test
* disabling bwc tests for backport
* Deprecate types in count requests.
* Move RestCountAction to the 'search' package.
* Deprecate types in multi search requests.
* Add tests for types deprecation in the _search endpoint.
The DefaultAuthenticationFailureHandler has a deprecated constructor
that was present to prevent a breaking change to custom realm plugin
authors in 6.x. This commit removes the constructor and its uses.
* DISCOVERY: Fix RollingUpgradeTests
* Don't manually manage min master nodes if not necessary
* Remove some dead code
* Allow for manually supplying list of seed nodes
* Closes#35178
Ensure that Watcher is correctly started and stopped between tests for
SmokeTestWatcherWithSecurityIT,
SmokeTestWatcherWithSecurityClientYamlTestSuiteIT,
SmokeTestWatcherTestSuiteIT, WatcherRestIT,
XDocsClientYamlTestSuiteIT, and XPackRestIT
The change here is to throw an `AssertionError` instead of `break;` to
allow the `assertBusy()` to continue to busy wait until the desired
state is reached.
closes#33291, closes#29877, closes#34462, closes#30705, closes#34448
Adds basic rolling upgrade tests to check that lifecycles are still recognizable between rolling cluster upgrades and managed indices stay managed.
This is a placeholder for discussing types of checks so they are ready once we backported
This is a re-boot of the previous PR against index-lifecycle that needed to be
reverted due to CI bwc issues. #32828
With this change, `Version` no longer carries information about the qualifier,
we still need a way to show the "display version" that does have both
qualifier and snapshot. This is now stored by the build and red from `META-INF`.
This moves all Realm settings to an Affix definition.
However, because different realm types define different settings
(potentially conflicting settings) this requires that the realm type
become part of the setting key.
Thus, we now need to define realm settings as:
xpack.security.authc.realms:
file.file1:
order: 0
native.native1:
order: 1
- This is a breaking change to realm config
- This is also a breaking change to custom security realms (SecurityExtension)
Fixes
```
Could not determine the dependencies of task ':x-pack:qa:full-cluster-restart:with-system-key:v5.6.13#oldClusterTestCluster#node1.copyBwcPlugins'.
```
Stop passing `Settings` to `AbstractComponent`'s ctor. This allows us to
stop passing around `Settings` in a *ton* of places. While this change
touches many files, it touches them all in fairly small, mechanical
ways, doing a few things per file:
1. Drop the `super(settings);` line on everything that extends
`AbstractComponent`.
2. Drop the `settings` argument to the ctor if it is no longer used.
3. If the file doesn't use `logger` then drop `extends
AbstractComponent` from it.
4. Clean up all compilation failure caused by the `settings` removal
and drop any now unused `settings` isntances and method arguments.
I've intentionally *not* removed the `settings` argument from a few
files:
1. TransportAction
2. AbstractLifecycleComponent
3. BaseRestHandler
These files don't *need* `settings` either, but this change is large
enough as is.
Relates to #34488
The java yaml test runner supports sending request headers, yet not all clients support headers. This commit makes sure that we enforce adding a skip section with feature "headers" whenever headers are used in a do section as part of a test. That decreases the chance for new tests to break client builds due to the missing skip section.
Closes#34650
Move `x-pack/qa/smoke-test-graph-with-security` to
`x-pack/plugin/graph/qa/security` which should make it easier to run all
of the tests with a single command. It also lines up the directories
more closely with newer projects like cross cluster replication.
Moves `x-pack/qa/sql/*` into `x-pack/plugin/sql/qa` to make it simpler
to run all of the sql tests. This lines up with how newer projects like
cross cluster replication are testing themselves.
This commit filters out usage of deprecated tzs by tests. These are
tested separately and should not require checking for warnings on any
test using random timezones.
closes#34188
Previously, `Mapper` was returning an incorrect plan which resulted in an
```
SQLFeatureNotSupportedException: Found 1 problem(s)
line 1:8: Unexecutable item
```
Queries with a `WHERE` and/or `HAVING` clause which results in NO_MATCH
are now handled correctly and return 0 rows.
Fixes: #34613
Co-authored-by: Costin Leau <costin.leau@gmail.com>
* Adding stack_monitoring_agent role
* Fixing checkstyle issues
* Adding tests for new role
* Tighten up privileges around index templates
* s/stack_monitoring_user/remote_monitoring_collector/ + remote_monitoring_user
* Fixing checkstyle violation
* Fix test
* Removing unused field
* Adding missed code
* Fixing data type
* Update Integration Test for new builtin user
Implemented null handling for both the value tested but also for
values inside the list of values tested against.
The null handling is implemented for local processors, painless scripts
and Lucene Terms queries making it available for `IN` expressions occuring
in `SELECT`, `WHERE` and `HAVING` clauses.
Closes: #34582
#33708 introduced a strict deprecation mode that makes a REST request
fail if there is a warning header in the response returned by
Elasticsearch (usually a deprecation message signaling that a feature
or a field has been deprecated).
This change adds the strict deprecation mode into the REST integration
tests, and makes the tests fail if a deprecated feature is used. Also
any test using a deprecated feature has been modified to pass the build.
The YAML integration tests already analyzed HTTP warnings so they do
not use this mode, keeping their "expected vs actual" behavior.
We should delete a job by directly talking to the allocated
task and telling it to shutdown. Today we shut down a job
via the persistent task framework. This is not ideal because,
while the job has been removed from the persistent task
CS, the allocated task continues to live until it gets the
shutdown message.
This means a user can delete a job, immediately delete
the rollup index, and then see new documents appear in
the just-deleted index. This happens because the indexer
in the allocated task is still running and indexes a few
more documents before getting the shutdown command.
In this PR, the transport action is changed to a TransportTasksAction,
and we invoke onCancelled() directly on the matching job.
The race condition still exists after this PR (albeit less likely),
but this was a precursor to fixing the issue and a self-contained
chunk of code. A second PR will followup to fix the race itself.
Extend querying support on multiple indices from being strictly
identical to being just compatible.
Use FieldCapabilities API (extended through #33803) for mapping merging.
Close#31837#31611
Implement the functionality to translate the
`field IN (value1, value2,...)` expressions to proper Lucene queries
or painless script or local processors depending on the use case.
The `IN` expression can be used in SELECT, WHERE and HAVING clauses.
Closes: #32955
`CONVERT` works exactly like cast with slightly different syntax:
`CONVERT(<value>, <data_type)` as opposed to `CAST(<value> AS <data_type>)`
Moreover it support format of the MS-SQL data types `SQL_<type>`,
e.g.: `SQL_INTEGER`
Closes: #34513
Make SQL aware of missing and/or unmapped fields treating them as NULL
Make _all_ functions and operators null-safe aware, including when used
in filtering or sorting contexts
Add missing and null-safe doc value extractor
Modify dataset to have null fields spread around (in groups of 10)
Enforce missing last and unmapped_type inside sorting
Consolidate Predicate templating and declaration
Add support for Like/RLike in scripting
Generalize NULLS LAST/FIRST
Introduce early schema declaration for CSV spec tests: to keep the doc
snippets in place (introduce schema:: prefix for declaration)
upfront.
Fix#32079
A constant can now be used outside aggregation only queries.
Don't skip an ES query in case of constants-only selects.
Loosen the binary pipe restriction of being used only in aggregation queries.
Fixes https://github.com/elastic/elasticsearch/issues/31863
This moves the rollup cleanup code for http tests from the high level rest
client into the test framework and then entirely removes the rollup cleanup
code for http tests that lived in x-pack. This is nice because it
consolidates the cleanup into one spot, automatically invokes the cleanup
without the test having to know that it is "about rollup", and should allow
us to run the rollup docs tests.
Part of #34530
Security caches the result of role lookups and negative lookups are
cached indefinitely. In the case of transient failures this leads to a
bad experience as the roles could truly exist. The CompositeRolesStore
needs to know if a failure occurred in one of the roles stores in order
to make the appropriate decision as it relates to caching. In order to
provide this information to the CompositeRolesStore, the return type of
methods to retrieve roles has changed to a new class,
RoleRetrievalResult. This class provides the ability to pass back an
exception to the roles store. This exception does not mean that a
request should be failed but instead serves as a signal to the roles
store that missing roles should not be cached and neither should the
combined role if there are missing roles.
As part of this, the negative lookup cache was also changed from an
unbounded cache to a cache with a configurable limit.
Relates #33205
* New OCTET_LENGTH function
* Changed the way the FunctionRegistry stores functions, considering the alphabetic ordering by name
* Added documentation for the RANDOM function
Since all calls to `ESLoggerFactory` outside of the logging package were
deprecated, it seemed like it'd simplify things to migrate all of the
deprecated calls and declare `ESLoggerFactory` to be package private.
This does that.
Previously, parsing an arithmetic expression with `*` and no spaces,
e.g.: `2*i` threw a parsing exception as the grammar rule for
tableIdentifier was clashing with the rule for arithmetic operator `*`.
This issue comes already in the lexer and the left part of the
expression (in our example `2*`) was recognised as a
TABLE_IDENTIFIER token.
The solution adopted is to allow the `*` wildcard in the table name
only if it's surrounded with double quotes, e.g.: `"my*index"`
Closes: #33957
We use wrap code in `// tag` and `//end` to include it in our docs. Our
current docs style wraps code snippets in a box that is only wide enough
for 76 characters and adds a horizontal scroll bar for wider snippets
which makes the snippet much harder to read. This adds a checkstyle check
that looks for java code that is included in the docs and is wider than
that 76 characters so all snippets fit into the box. It solves many of
the failures that this catches but suppresses many more. I will clean
those up in a follow up change.
Centralize and simplify the script generation between operators and functions which are currently decoupled. As part of this process most predicates (<, <=, etc...) were made ScalarFunction as their purpose and functionality is quite similar (see % and MOD functions).
Renamed ProcessDefinition to Pipe
Add ScriptWeaver as a mixin-in interface for script customization
Add logic for equals/lte/lt
Improve BinaryOperator/expression toString
Minimize duplication across string functions
Close#33975