Currently when an ILM policy finishes its execution, the index moves into the `TerminalPolicyStep`,
denoted by a completed/completed/completed phase/action/step lifecycle execution state.
This commit changes the behavior so that the index lifecycle execution state halts at the last
configured phase's `PhaseCompleteStep`, so for instance, if an index were configured with a policy
containing a `hot` and `cold` phase, the index would stop at the `cold/complete/complete`
`PhaseCompleteStep`. This allows an ILM user to update the policy to add any later phases and have
indices configured to use that policy pick up execution at the newly added "later" phase. For
example, if a `delete` phase were added to the policy specified about, the index would then move
from `cold/complete/complete` into the `delete` phase.
Relates to #48431
The changes are to help users prepare for migration to next major
release (v8.0.0) regarding to the break change of realm order config.
Warnings are added for when:
* A realm does not have an order config
* Multiple realms have the same order config
The warning messages are added to both deprecation API and loggings.
The main reasons for doing this are: 1) there is currently no automatic relay
between the two; 2) deprecation API is under basic and we need logging
for OSS.
This commit switches the strategy for managing dot-prefixed indices that
should be hidden indices from using "fake" system indices to an explicit
exclusions list that must be updated when those indices are converted to
hidden indices.
* Rename ILM history index enablement setting
The previous setting was `index.lifecycle.history_index_enabled`, this commit changes it to
`indices.lifecycle.history_index_enabled` to indicate this is not an index-level setting (it's node
level).
* [ML][Inference] Fix weighted mode definition (#51648)
Weighted mode inaccurately assumed that the "max value" of the input values would be the maximum class value. This does not make sense.
Weighted Mode should know how many classes there are. Hence the new parameter `num_classes`. This indicates what the maximum class value to be expected.
Three fixes for when the `compressed_definition` is utilized on PUT
* Update the inflate byte limit to be the minimum of 10% the max heap, or 1GB (what it was previously)
* Stream data directly to the JSON parser, so if it is invalid, we don't have to inflate the whole stream to find out
* Throw when the maximum bytes are reach indicating that is why the request was rejected
This commit creates a new index privilege named `maintenance`.
The privilege grants the following actions: `refresh`, `flush` (also synced-`flush`),
and `force-merge`. Previously the actions were only under the `manage` privilege
which in some situations was too permissive.
Co-authored-by: Amir H Movahed <arhd83@gmail.com>
The timeout.tcp_read AD/LDAP realm setting, despite the low-level
allusion, controls the time interval the realms wait for a response for
a query (search or bind). If the connection to the server is synchronous
(un-pooled) the response timeout is analogous to the tcp read timeout.
But the tcp read timeout is irrelevant in the common case of a pooled
connection (when a Bind DN is specified).
The timeout.tcp_read qualifier is hereby deprecated in favor of
timeout.response.
In addition, the default value for both timeout.tcp_read and
timeout.response is that of timeout.ldap_search, instead of the 5s (but
the default for timeout.ldap_search is still 5s). The
timeout.ldap_search defines the server-controlled timeout of a search
request. There is no practical use case to have a smaller tcp_read
timeout compared to ldap_search (in this case the request would time-out
on the client but continue to be processed on the server). The proposed
change aims to simplify configuration so that the more common
configuration change, adjusting timeout.ldap_search up, has the expected
result (no timeout during searches) without any additional
modifications.
Closes#46028
* Allow Repository Plugins to Filter Metadata on Create
Add a hook that allows repository plugins to filter the repository metadata
before it gets written to the cluster state.
This commit deprecates the creation of dot-prefixed index names (e.g.
.watches) unless they are either 1) a hidden index, or 2) registered by
a plugin that extends SystemIndexPlugin. This is the first step
towards more thorough protections for system indices.
This commit also modifies several plugins which use dot-prefixed indices
to register indices they own as system indices, and adds a plugin to
register .tasks as a system index.
* Reload secure settings with password (#43197)
If a password is not set, we assume an empty string to be
compatible with previous behavior.
Only allow the reload to be broadcast to other nodes if TLS is
enabled for the transport layer.
* Add passphrase support to elasticsearch-keystore (#38498)
This change adds support for keystore passphrases to all subcommands
of the elasticsearch-keystore cli tool and adds a subcommand for
changing the passphrase of an existing keystore.
The work to read the passphrase in Elasticsearch when
loading, which will be addressed in a different PR.
Subcommands of elasticsearch-keystore can handle (open and create)
passphrase protected keystores
When reading a keystore, a user is only prompted for a passphrase
only if the keystore is passphrase protected.
When creating a keystore, a user is allowed (default behavior) to create one with an
empty passphrase
Passphrase can be set to be empty when changing/setting it for an
existing keystore
Relates to: #32691
Supersedes: #37472
* Restore behavior for force parameter (#44847)
Turns out that the behavior of `-f` for the add and add-file sub
commands where it would also forcibly create the keystore if it
didn't exist, was by design - although undocumented.
This change restores that behavior auto-creating a keystore that
is not password protected if the force flag is used. The force
OptionSpec is moved to the BaseKeyStoreCommand as we will presumably
want to maintain the same behavior in any other command that takes
a force option.
* Handle pwd protected keystores in all CLI tools (#45289)
This change ensures that `elasticsearch-setup-passwords` and
`elasticsearch-saml-metadata` can handle a password protected
elasticsearch.keystore.
For setup passwords the user would be prompted to add the
elasticsearch keystore password upon running the tool. There is no
option to pass the password as a parameter as we assume the user is
present in order to enter the desired passwords for the built-in
users.
For saml-metadata, we prompt for the keystore password at all times
even though we'd only need to read something from the keystore when
there is a signing or encryption configuration.
* Modify docs for setup passwords and saml metadata cli (#45797)
Adds a sentence in the documentation of `elasticsearch-setup-passwords`
and `elasticsearch-saml-metadata` to describe that users would be
prompted for the keystore's password when running these CLI tools,
when the keystore is password protected.
Co-Authored-By: Lisa Cawley <lcawley@elastic.co>
* Elasticsearch keystore passphrase for startup scripts (#44775)
This commit allows a user to provide a keystore password on Elasticsearch
startup, but only prompts when the keystore exists and is encrypted.
The entrypoint in Java code is standard input. When the Bootstrap class is
checking for secure keystore settings, it checks whether or not the keystore
is encrypted. If so, we read one line from standard input and use this as the
password. For simplicity's sake, we allow a maximum passphrase length of 128
characters. (This is an arbitrary limit and could be increased or eliminated.
It is also enforced in the keystore tools, so that a user can't create a
password that's too long to enter at startup.)
In order to provide a password on standard input, we have to account for four
different ways of starting Elasticsearch: the bash startup script, the Windows
batch startup script, systemd startup, and docker startup. We use wrapper
scripts to reduce systemd and docker to the bash case: in both cases, a
wrapper script can read a passphrase from the filesystem and pass it to the
bash script.
In order to simplify testing the need for a passphrase, I have added a
has-passwd command to the keystore tool. This command can run silently, and
exit with status 0 when the keystore has a password. It exits with status 1 if
the keystore doesn't exist or exists and is unencrypted.
A good deal of the code-change in this commit has to do with refactoring
packaging tests to cleanly use the same tests for both the "archive" and the
"package" cases. This required not only moving tests around, but also adding
some convenience methods for an abstraction layer over distribution-specific
commands.
* Adjust docs for password protected keystore (#45054)
This commit adds relevant parts in the elasticsearch-keystore
sub-commands reference docs and in the reload secure settings API
doc.
* Fix failing Keystore Passphrase test for feature branch (#50154)
One problem with the passphrase-from-file tests, as written, is that
they would leave a SystemD environment variable set when they failed,
and this setting would cause elasticsearch startup to fail for other
tests as well. By using a try-finally, I hope that these tests will fail
more gracefully.
It appears that our Fedora and Ubuntu environments may be configured to
store journald information under /var rather than under /run, so that it
will persist between boots. Our destructive tests that read from the
journal need to account for this in order to avoid trying to limit the
output we check in tests.
* Run keystore management tests on docker distros (#50610)
* Add Docker handling to PackagingTestCase
Keystore tests need to be able to run in the Docker case. We can do this
by using a DockerShell instead of a plain Shell when Docker is running.
* Improve ES startup check for docker
Previously we were checking truncated output for the packaged JDK as
an indication that Elasticsearch had started. With new preliminary
password checks, we might get a false positive from ES keystore
commands, so we have to check specifically that the Elasticsearch
class from the Bootstrap package is what's running.
* Test password-protected keystore with Docker (#50803)
This commit adds two tests for the case where we mount a
password-protected keystore into a Docker container and provide a
password via a Docker environment variable.
We also fix a logging bug where we were logging the identifier for an
array of strings rather than the contents of that array.
* Add documentation for keystore startup prompting (#50821)
When a keystore is password-protected, Elasticsearch will prompt at
startup. This commit adds documentation for this prompt for the archive,
systemd, and Docker cases.
Co-authored-by: Lisa Cawley <lcawley@elastic.co>
* Warn when unable to upgrade keystore on debian (#51011)
For Red Hat RPM upgrades, we warn if we can't upgrade the keystore. This
commit brings the same logic to the code for Debian packages. See the
posttrans file for gets executed for RPMs.
* Restore handling of string input
Adds tests that were mistakenly removed. One of these tests proved
we were not handling the the stdin (-x) option correctly when no
input was added. This commit restores the original approach of
reading stdin one char at a time until there is no more (-1, \r, \n)
instead of using readline() that might return null
* Apply spotless reformatting
* Use '--since' flag to get recent journal messages
When we get Elasticsearch logs from journald, we want to fetch only log
messages from the last run. There are two reasons for this. First, if
there are many logs, we might get a string that's too large for our
utility methods. Second, when we're looking for a specific message or
error, we almost certainly want to look only at messages from the last
execution.
Previously, we've been trying to do this by clearing out the physical
files under the journald process. But there seems to be some contention
over these directories: if journald writes a log file in between when
our deletion command deletes the file and when it deletes the log
directory, the deletion will fail.
It seems to me that we might be able to use journald's "--since" flag to
retrieve only log messages from the last run, and that this might be
less likely to fail due to race conditions in file deletion.
Unfortunately, it looks as if the "--since" flag has a granularity of
one-second. I've added a two-second sleep to make sure that there's a
sufficient gap between the test that will read from journald and the
test before it.
* Use new journald wrapper pattern
* Update version added in secure settings request
Co-authored-by: Lisa Cawley <lcawley@elastic.co>
Co-authored-by: Ioannis Kakavas <ikakavas@protonmail.com>
This commit sets `xpack.security.ssl.diagnose.trust` to false in all
of our tests when running in FIPS 140 mode and when settings objects
are used to create an instance of the SSLService. This is needed
in 7.x because setting xpack.security.ssl.diagnose.trust to true
wraps SunJSSE TrustManager with our own DiagnosticTrustManager and
this is not allowed when SunJSSE is in FIPS mode.
An alternative would be to set xpack.security.fips.enabled to
true which would also implicitly disable
xpack.security.ssl.diagnose.trust but would have additional effects
(would require that we set PBKDF2 for password hashing algorithm in
all test clusters, would prohibit using JKS keystores in nodes even
if relevant tests have been muted in FIPS mode etc.)
Relates: #49900Resolves: #51268
Today we are repeatedly checking if the current build is a snapshot
build or not by reading the system property build.snapshot. This commit
formalizes this by adding a build parameter to indicate whether or not
the current build is a snapshot build.
This change changes the way to run our test suites in
JVMs configured in FIPS 140 approved mode. It does so by:
- Configuring any given runtime Java in FIPS mode with the bundled
policy and security properties files, setting the system
properties java.security.properties and java.security.policy
with the == operator that overrides the default JVM properties
and policy.
- When runtime java is 11 and higher, using BouncyCastle FIPS
Cryptographic provider and BCJSSE in FIPS mode. These are
used as testRuntime dependencies for unit
tests and internal clusters, and copied (relevant jars)
explicitly to the lib directory for testclusters used in REST tests
- When runtime java is 8, using BouncyCastle FIPS
Cryptographic provider and SunJSSE in FIPS mode.
Running the tests in FIPS 140 approved mode doesn't require an
additional configuration either in CI workers or locally and is
controlled by specifying -Dtests.fips.enabled=true
* Centralize mocks initialization in ILM steps tests
This change centralizes initialization of `Client`, `AdminClient`
and `IndicesAdminClient` for all classes extending `AbstractStepTestCase`.
This removes a lot of code duplication and make it easier to write tests.
This also removes need for `AsyncActionStep#setClient`
* Unused imports removed
* Added missed tests
* Fix OpenFollowerIndexStepTests
* [ML][Inference] add tags url param to GET (#51330)
Adds a new URL parameter, `tags` to the GET _ml/inference/<model_id> endpoint.
This parameter allows the list of models to be further reduced to those who contain all the provided tags.
check bulk indexing error for permanent problems and ensure the state goes into failed instead of
retry. Corrects the stats API to show the real error and avoids excessive audit logging.
fixes#50122
This change exposes master timeout to ILM steps through global dynamic setting.
All currently implemented steps make use of this setting as well.
Closes#44136
Data frame analytics classification currently only supports 2 classes for the
dependent variable. We were checking that the field's cardinality is not higher
than 2 but we should also check it is not less than that as otherwise the process
fails.
Backport of #51232
This makes the UpdateSettingsStep retryable. This step updates settings needed
during the execution of ILM actions (mark indexes as read-only, change
allocation configurations, mark indexing complete, etc)
As the index updates are idempotent in nature (PUT requests and are applied only
if the values have changed) and the settings values are seldom user-configurable
(aside from the allocate action) the testing for this change goes along the
lines of artificially simulating a setting update failure on a particular value
update, which is followed by a successful step execution (a retry) in an
environment outside of ILM (the step executions are triggered manually).
(cherry picked from commit 8391b0aba469f39532bfc2796b76148167dc0289)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
After we rollover the index we wait for the configured number of shards for the
rolled index to become active (based on the index.write.wait_for_active_shards
setting which might be present in a template, or otherwise in the default case,
for the primaries to become active).
This wait might be long due to disk watermarks being tripped, replicas not
being able to spring to life due to cluster nodes reconfiguration and others
and, the RolloverStep might not complete successfully due to this inherent
transient situation, albeit the rolled index having been created.
(cherry picked from commit 457a92fb4c68c55976cc3c3e2f00a053dd2eac70)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
If 1000 different category definitions are created for a job in
the first 100 buckets it processes then an audit warning will now
be created. (This will cause a yellow warning triangle in the
ML UI's jobs list.)
Such a large number of categories suggests that the field that
categorization is working on is not well suited to the ML
categorization functionality.
If a transform config got lost (e.g. because the internal index disappeared) tasks could not be
stopped using transform API. This change makes it possible to stop transforms without a config,
meaning to remove the background task. In order to do so force must be set to true.
Knowing about used analysis components and mapping types would be incredibly
useful in order to know which ones may be deprecated or should get more love.
Some field types also act as a proxy to know about feature usage of some APIs
like the `percolator` or `completion` fields types for percolation and the
completion suggester, respectively.
This change adds a new `kibana_admin` role, and deprecates
the old `kibana_user` and`kibana_dashboard_only_user`roles.
The deprecation is implemented via a new reserved metadata
attribute, which can be consumed from the API and also triggers
deprecation logging when used (by a user authenticating to
Elasticsearch).
Some docs have been updated to avoid references to these
deprecated roles.
Backport of: #46456
Co-authored-by: Larry Gregory <lgregorydev@gmail.com>
Check it out:
```
$ curl -u elastic:password -HContent-Type:application/json -XPOST localhost:9200/test/_update/foo?pretty -d'{
"dac": {}
}'
{
"error" : {
"root_cause" : [
{
"type" : "x_content_parse_exception",
"reason" : "[2:3] [UpdateRequest] unknown field [dac] did you mean [doc]?"
}
],
"type" : "x_content_parse_exception",
"reason" : "[2:3] [UpdateRequest] unknown field [dac] did you mean [doc]?"
},
"status" : 400
}
```
The tricky thing about implementing this is that x-content doesn't
depend on Lucene. So this works by creating an extension point for the
error message using SPI. Elasticsearch's server module provides the
"spell checking" implementation.
s
* [ML][Inference] Adding classification_weights to ensemble models
classification_weights are a way to allow models to
prefer specific classification results over others
this might be advantageous if classification value
probabilities are a known quantity and can improve
model error rates.
Adds a new parameter to regression and classification that enables computation
of importance for the top most important features. The computation of the importance
is based on SHAP (SHapley Additive exPlanations) method.
Backport of #50914
This adds a new "http" sub-command to the certutil CLI tool.
The http command generates certificates/CSRs for use on the http
interface of an elasticsearch node/cluster.
It is designed to be a guided tool that provides explanations and
sugestions for each of the configuration options. The generated zip
file output includes extensive "readme" documentation and sample
configuration files for core Elastic products.
Backport of: #49827
The Document Level Security BitSet Cache (see #43669) had a default
configuration of "small size, long lifetime". However, this is not
a very useful default as the cache is most valuable for BitSets that
take a long time to construct, which is (generally speaking) the same
ones that operate over a large number of documents and contain many
bytes.
This commit changes the cache to be "large size, short lifetime" so
that it can hold bitsets representing billions of documents, but
releases memory quickly.
The new defaults are 10% of heap, and 2 hours.
This also adds some logging when a single BitSet exceeds the size of
the cache and when the cache is full.
Backport of: #50535
Previously custom realms were limited in what services and components
they had easy access to. It was possible to work around this because a
security extension is packaged within a Plugin, so there were ways to
store this components in static/SetOnce variables and access them from
the realm, but those techniques were fragile, undocumented and
difficult to discover.
This change includes key services as an argument to most of the methods
on SecurityExtension so that custom realm / role provider authors can
have easy access to them.
Backport of: #50534
The Document Level Security BitSet cache stores a secondary "lookup
map" so that it can determine which cache entries to invalidate when
a Lucene index is closed (merged, etc).
There was a memory leak because this secondary map was not cleared
when entries were naturally evicted from the cache (due to size/ttl
limits).
This has been solved by adding a cache removal listener and processing
those removal events asyncronously.
Backport of: #50635
When creating a role, we do not check if the exceptions for
the field permissions are a subset of granted fields. If such
a role is assigned to a user then that user's authentication fails
for this reason.
We added a check to validate role query in #46275 and on the same lines,
this commit adds check if the exceptions for the field
permissions is a subset of granted fields when parsing the
index privileges from the role descriptor.
Backport of: #50212
Co-authored-by: Yogesh Gaikwad <bizybot@users.noreply.github.com>
The enterprise license type must have "max_resource_units" and may not
have "max_nodes".
This change adds support for this new field, validation that the field
is present if-and-only-if the license is enterprise and bumps the
license version number to reflect the new field.
Includes a BWC layer to return "max_nodes: ${max_resource_units}" in
the GET license API.
Backport of: #50735
* ILM action to wait for SLM policy execution (#50454)
This change add new ILM action to wait for SLM policy execution to ensure that index has snapshot before deletion.
Closes#45067
* Fix flaky TimeSeriesLifecycleActionsIT#testWaitForSnapshot test
This change adds some randomness and cleanup step to TimeSeriesLifecycleActionsIT#testWaitForSnapshot and testWaitForSnapshotSlmExecutedBefore tests in attempt to make them stable.
Reletes to #50781
* Formatting changes
* Longer timeout
* Fix Map.of in Java8
* Unused import removed
This commit changes the default behavior for
xpack.security.ssl.diagnose.trust when running in a FIPS 140 JVM.
More specifically, when xpack.security.fips_mode.enabled is true:
- If xpack.security.ssl.diagnose.trust is not explicitly set, the
default value of it becomes false and a log message is printed
on info level, notifying of the fact that the TLS/SSL diagnostic
messages are not enabled when in a FIPS 140 JVM.
- If xpack.security.ssl.diagnose.trust is explicitly set, the value of
it is honored, even in FIPS mode.
This is relevant only for 7.x where we support Java 8 in which
SunJSSE can still be used as a FIPS 140 provider for TLS. SunJSSE
in FIPS mode, disallows the use of other TrustManager implementations
than the one shipped with SunJSSE.