Adds assertions to Netty to make sure that its threads are not polluted by thread contexts (and
also that thread contexts are not leaked). Moves the ClusterApplierService to use the system
context (same as we do for MasterService), which allows to remove a hack from
TemplateUgradeService and makes it clearer that applying CS updates is fully executing under
system context.
Adds assertions to Netty to make sure that its threads are not polluted by thread contexts (and
also that thread contexts are not leaked). Moves the ClusterApplierService to use the system
context (same as we do for MasterService), which allows to remove a hack from
TemplateUgradeService and makes it clearer that applying CS updates is fully executing under
system context.
In #51089 where SamlAuthenticatorTests were refactored, we missed
to update one test case which meant that a single key would be
used both for signing and encryption in the same run. As explained
in #51089, and due to FIPS 140 requirements, BouncyCastle FIPS
provider will block RSA keys that have been used for signing from
being used for encryption and vice versa
This commit changes testNoAttributesReturnedWhenTheyCannotBeDecrypted
to always use the specific keys we have added for encryption.
- Use opensaml to sign and encrypt responses/assertions/attributes
instead of doing this manually
- Use opensaml to build response and assertion objects instead of
parsing xml strings
- Always use different keys for signing and encryption. Due to FIPS
140 requirements, BouncyCastle FIPS provider will block
RSA keys that have been used for signing from being used for
encryption and vice versa. This change adds new encryption specific
keys to be used throughout the tests.
This change aims to fix our setup in CI so that we can run 7.x in
FIPS 140 mode. The major issue that we have in 7.x and did not
have in master is that we can't use the diagnostic trust manager
in FIPS mode in Java 8 with SunJSSE in FIPS approved mode as it
explicitly disallows the wrapping of X509TrustManager.
Previous attempts like #56427 and #52211 focused on disabling the
setting in all of our tests when creating a Settings object or
on setting fips_mode.enabled accordingly (which implicitly disables
the diagnostic trust manager). The attempts weren't future proof
though as nothing would forbid someone to add new tests without
setting the necessary setting and forcing this would be very
inconvenient for any other case ( see
#56427 (comment) for the full argumentation).
This change introduces a runtime check in SSLService that overrides
the configuration value of xpack.security.ssl.diagnose.trust and
disables the diagnostic trust manager when we are running in Java 8
and the SunJSSE provider is set in FIPS mode.
This is another part of the breakup of the massive BuildPlugin. This PR
moves the code for configuring publications to a separate plugin. Most
of the time these publications are jar files, but this also supports the
zip publication we have for integ tests.
Currently Elasticsearch creates independent event loop groups for each
transport (http and internal) transport type. This is unnecessary and
can lead to contention when different threads access shared resources
(ex: allocators). This commit moves to a model where, by default, the
event loops are shared between the transports. The previous behavior can
be attained by specifically setting the http worker count.
AuthN realms are ordered as a chain so that the credentials of a given
user are verified in succession. Upon the first successful verification,
the user is authenticated. Realms do however have the option to cut short
this iterative process, when the credentials don't verify and the user
cannot exist in any other realm. This mechanism is currently used by
the Reserved and the Kerberos realm.
This commit improves the early termination operation by allowing
realms to gracefully terminate authentication, as if the chain has been
tried out completely. Previously, early termination resulted in an
authentication error which varies the response body compared
to the failed authentication outcome where no realm could verify the
credentials successfully.
Reserved users are hence denied authentication in exactly the same
way as other users are when no realm can validate their credentials.
Backport of #56034.
Move includeDataStream flag from an IndicesOptions to IndexNameExpressionResolver.Context
as a dedicated field that callers to IndexNameExpressionResolver can set.
Also alter indices stats api to support data streams.
The rollover api uses this api and otherwise rolling over data stream does no longer work.
Relates to #53100
Backport of #55858 to 7.x branch.
Currently the TransportBulkAction detects whether an index is missing and
then decides whether it should be auto created. The coordination of the
index creation also happens in the TransportBulkAction on the coordinating node.
This change adds a new transport action that the TransportBulkAction delegates to
if missing indices need to be created. The reasons for this change:
* Auto creation of data streams can't occur on the coordinating node.
Based on the index template (v2) either a regular index or a data stream should be created.
However if the coordinating node is slow in processing cluster state updates then it may be
unaware of the existence of certain index templates, which then can load to the
TransportBulkAction creating an index instead of a data stream. Therefor the coordination of
creating an index or data stream should occur on the master node. See #55377
* From a security perspective it is useful to know whether index creation originates from the
create index api or from auto creating a new index via the bulk or index api. For example
a user would be allowed to auto create an index, but not to use the create index api. The
auto create action will allow security to distinguish these two different patterns of
index creation.
This change adds the following new transport actions:
AutoCreateAction, the TransportBulkAction redirects to this action and this action will actually create the index (instead of the TransportCreateIndexAction). Later via #55377, can improve the AutoCreateAction to also determine whether an index or data stream should be created.
The create_index index privilege is also modified, so that if this permission is granted then a user is also allowed to auto create indices. This change does not yet add an auto_create index privilege. A future change can introduce this new index privilege or modify an existing index / write index privilege.
Relates to #53100
* Make xpack.monitoring.enabled setting a no-op
This commit turns xpack.monitoring.enabled into a no-op. Mostly, this involved
removing the setting from the setup for integration tests. Monitoring may
introduce some complexity for test setup and teardown, so we should keep an eye
out for turbulence and failures
* Docs for making deprecated setting a no-op
This commit converts the remaining isXXXAllowed methods to instead of
use isAllowed with a Feature value. There are a couple other methods
that are static, as well as some licensed features that check the
license directly, but those will be dealt with in other followups.
Anonymous roles resolution and user role deduplication are now performed during authentication instead of authorization. The change ensures:
* If anonymous access is enabled, user will be able to see the anonymous roles added in the roles field in the /_security/_authenticate response.
* Any duplication in user roles are removed and will not show in the above authenticate response.
* In any other case, the response is unchanged.
It also introduces a behaviour change: the anonymous role resolution is now authentication node specific, previously it was authorization node specific. Details can be found at #47195 (comment)
Currently there is a clear mechanism to stub sending a request through
the transport. However, this is limited to testing exceptions on the
sender side. This commit reworks our transport related testing
infrastructure to allow stubbing request handling on the receiving side.
This change adds validation when running the users tool so that
if Elasticsearch is expected to run in a JVM that is configured to
be in FIPS 140 mode and the password hashing algorithm is not
compliant, we would throw an error.
Users tool uses the configuration from the node and this validation
would also happen upon node startup but users might be added in the
file realm before the node is started and we would have the
opportunity to notify the user of this misconfiguration.
The changes in #55544 make this much less probable to happen in 8
since the default algorithm will be compliant but this change can
act as a fallback in anycase and makes for a better user experience.
Our handling for concurrent refresh of access tokens suffered from
a race condition where:
1. Thread A has just finished with updating the existing token
document, but hasn't stored the new tokens in a new document
yet
2. Thread B attempts to refresh the same token and since the
original token document is marked as refreshed, it decrypts and
gets the new access token and refresh token and returns that to
the caller of the API.
3. The caller attempts to use the newly refreshed access token
immediately and gets an authentication error since thread A still
hasn't finished writing the document.
This commit changes the behavior so that Thread B, would first try
to do a Get request for the token document where it expects that
the access token it decrypted is stored(with exponential backoff )
and will not respond until it can verify that it reads it in the
tokens index. That ensures that we only ever return tokens in a
response if they are already valid and can be used immediately
It also adjusts TokenAuthIntegTests
to test authenticating with the tokens each thread receives,
which would fail without the fix.
Resolves: #54289
This commit fixes reproducible test failures with the
SSLReloadDuringStartupIntegTests on the 7.x branch. The failures only
occur on 7.x due to the existence of the transport client and its usage
in our test infrastructure. This change removes the randomized usage of
transport clients when retrieving a client from a node in the internal
cluster. Transport clients do not support the reloading of files for
TLS configuration changes but if we build one from the nodes settings
and attempt to use it after the files have been changed, the client
will not know about the changes and the TLS connection will fail.
Closes#55524
License state is currently made up of boolean methods that check whether
a particular feature is allowed by the current license state. Each new
feature must copy/past boiler plate code. While that has gotten easier
with utilities like isAllowedByLicense, this is still more cumbersome
than should be necessary. This commit adds a general purpose isAllowed
method which takes a new Feature enum, where each value of the enum
defines the minimum license mode and whether the license must be active
to be allowed. Only security features are converted in this PR, in order
to keep the commit size relatively small. The rest of the features will
be converted in a followup.
Backport of #55115.
Replace calls to deprecate(String,Object...) with deprecateAndMaybeLog(...),
with an appropriate key, so that all messages can potentially be deduplicated.
Out of the box "access granted" audit events are not logged
for system users. The list of system users was stale and included
only the _system and _xpack users. This commit expands this list
with _xpack_security and _async_search, effectively reducing the
auditing noise by not logging the audit events of these system
users out of the box.
Closes#37924
This commit upgrades the ASM dependency used in the feature aware check
to 7.3.1. This gives support for JDK 14. Additionally, now that Gradle
understands JDK 13, it means we can remove a restriction on running the
feature aware check to JDK 12 and lower.
This change reworks the loading and monitoring of files that are used
for the construction of SSLContexts so that updates to these files are
not lost if the updates occur during startup. Previously, the
SSLService would parse the settings, build the SSLConfiguration
objects, and construct the SSLContexts prior to the
SSLConfigurationReloader starting to monitor these files for changes.
This allowed for a small window where updates to these files may never
be observed until the node restarted.
To remove the potential miss of a change to these files, the code now
parses the settings and builds SSLConfiguration instances prior to the
construction of the SSLService. The files back the SSLConfiguration
instances are then registered for monitoring and finally the SSLService
is constructed from the previously parse SSLConfiguration instances. As
the SSLService is not constructed when the code starts monitoring the
files for changes, a CompleteableFuture is used to obtain a reference
to the SSLService; this allows for construction of the SSLService to
complete and ensures that we do not miss any file updates during the
construction of the SSLService.
While working on this change, the SSLConfigurationReloader was also
refactored to reflect how it is currently used. When the
SSLConfigurationReloader was originally written the files that it
monitored could change during runtime. This is no longer the case as
we stopped the monitoring of files that back dynamic SSLContext
instances. In order to support the ability for items to change during
runtime, the class made use of concurrent data structures. The use of
these concurrent datastructures has been removed.
Closes#54867
Backport of #54999
Security features in the license state currently do a dynamic check on
whether security is enabled. This is because the license level can
change the default security enabled state. This commit splits out the
check on security being enabled, so that the combo method of security
enabled plus license allowed is no longer necessary.
We believe there's no longer a need to be able to disable basic-license
features completely using the "xpack.*.enabled" settings. If users don't
want to use those features, they simply don't need to use them. Having
such features always available lets us build more complex features that
assume basic-license features are present.
This commit deprecates settings of the form "xpack.*.enabled" for
basic-license features, excluding "security", which is a special case.
It also removes deprecated settings from integration tests and unit
tests where they're not directly relevant; e.g. monitoring and ILM are
no longer disabled in many integration tests.
Backport from: #54726
The INCLUDE_DATA_STREAMS indices option controls whether data streams can be resolved in an api for both concrete names and wildcard expressions. If data streams cannot be resolved then a 400 error is returned indicating that data streams cannot be used.
In this pr, the INCLUDE_DATA_STREAMS indices option is enabled in the following APIs: search, msearch, refresh, index (op_type create only) and bulk (index requests with op type create only). In a subsequent later change, we will determine which other APIs need to be able to resolve data streams and enable the INCLUDE_DATA_STREAMS indices option for these APIs.
Whether an api resolve all backing indices of a data stream or the latest index of a data stream (write index) depends on the IndexNameExpressionResolver.Context.isResolveToWriteIndex().
If isResolveToWriteIndex() returns true then data streams resolve to the latest index (for example: index api) and otherwise a data stream resolves to all backing indices of a data stream (for example: search api).
Relates to #53100
Today we pass the `RepositoriesService` to the searchable snapshots plugin
during the initialization of the `RepositoryModule`, forcing the plugin to be a
`RepositoryPlugin` even though it does not implement any repositories.
After discussion we decided it best for now to pass this in via
`Plugin#createComponents` instead, pending some future work in which plugins
can depend on services more dynamically.
This commit fixes our behavior regarding the responses we
return in various cases for the use of token related APIs.
More concretely:
- In the Get Token API with the `refresh` grant, when an invalid
(already deleted, malformed, unknown) refresh token is used in the
body of the request, we respond with `400` HTTP status code
and an `error_description` header with the message "could not
refresh the requested token".
Previously we would return erroneously return a `401` with "token
malformed" message.
- In the Invalidate Token API, when using an invalid (already
deleted, malformed, unknown) access or refresh token, we respond
with `404` and a body that shows that no tokens were invalidated:
```
{
"invalidated_tokens":0,
"previously_invalidated_tokens":0,
"error_count":0
}
```
The previous behavior would be to erroneously return
a `400` or `401` ( depending on the case ).
- In the Invalidate Token API, when the tokens index doesn't
exist or is closed, we return `400` because we assume this is
a user issue either because they tried to invalidate a token
when there is no tokens index yet ( i.e. no tokens have
been created yet or the tokens index has been deleted ) or the
index is closed.
- In the Invalidate Token API, when the tokens index is
unavailable, we return a `503` status code because
we want to signal to the caller of the API that the token they
tried to invalidate was not invalidated and we can't be sure
if it is still valid or not, and that they should try the request
again.
Resolves: #53323
The ResourceWatcherService enables watching of files for modifications
and deletions. During startup various consumers register the files that
should be watched by this service. There is behavior that might be
unexpected in that the service may not start polling until later in the
startup process due to the use of lifecycle states to control when the
service actually starts the jobs to monitor resources. This change
removes this unexpected behavior so that upon construction the service
has already registered its tasks to poll resources for changes. In
making this modification, the service no longer extends
AbstractLifecycleComponent and instead implements the Closeable
interface so that the polling jobs can be terminated when the service
is no longer required.
Relates #54867
Backport of #54993
I've noticed that a lot of our tests are using deprecated static methods
from the Hamcrest matchers. While this is not a big deal in any
objective sense, it seems like a small good thing to reduce compilation
warnings and be ready for a new release of the matcher library if we
need to upgrade. I've also switched a few other methods in tests that
have drop-in replacements.
Currently forbidden apis accounts for 800+ tasks in the build. These
tasks are aggressively created by the plugin. In forbidden apis 3.0, we
will get task avoidance
(https://github.com/policeman-tools/forbidden-apis/pull/162), but we
need to ourselves use the same task avoidance mechanisms to not trigger
these task creations. This commit does that for our foribdden apis
usages, in preparation for upgrading to 3.0 when it is released.
We implicitly only supported the prime256v1 ( aka secp256r1 )
curve for the EC keys we read as PEM files to be used in any
SSL Context. We would not fail when trying to read a key
pair using a different curve but we would silently assume
that it was using `secp256r1` which would lead to strange
TLS handshake issues if the curve was actually another one.
This commit fixes that behavior in that it
supports parsing EC keys that use any of the named curves
defined in rfc5915 and rfc5480 making no assumptions about
whether the security provider in use supports them (JDK8 and
higher support all the curves defined in rfc5480).
This commit refactors the `AuditTrail` to use the `TransportRequest` as a parameter
for all its audit methods, instead of the current `TransportMessage` super class.
The goal is to gain access to the `TransportRequest#parentTaskId` member,
so that it can be audited. The `parentTaskId` is used internally when spawning tasks
that handle transport requests; in this way tasks across nodes are related by the
same parent task.
Relates #52314
This is a first cut at giving NodeInfo the ability to carry a flexible
list of heterogeneous info responses. The trick is to be able to
serialize and deserialize an arbitrary list of blocks of information. It
is convenient to be able to deserialize into usable Java objects so that
we can aggregate nodes stats for the cluster stats endpoint.
In order to provide a little bit of clarity about which objects can and
can't be used as info blocks, I've introduced a new interface called
"ReportingService."
I have removed the hard-coded getters (e.g., getOs()) in favor of a
flexible method that can return heterogeneous kinds of info blocks
(e.g., getInfo(OsInfo.class)). Taking a class as an argument removes the
need to cast in the client code.
The isAuthAllowed() method for license checking is used by code that
wants to ensure security is both enabled and available. The enabled
state is dynamic and provided by isSecurityEnabled(). But since security
is available with all license types, an check on the license level is
not necessary. Thus, this change replaces isAuthAllowed() with calling
isSecurityEnabled().
This change reintroduces the system index APIs for Kibana without the
changes made for marking what system indices could be accessed using
these APIs. In essence, this is a partial revert of #53912. The changes
for marking what system indices should be allowed access will be
handled in a separate change.
The APIs introduced here are wrapped versions of the existing REST
endpoints. A new setting is also introduced since the Kibana system
indices' names are allowed to be changed by a user in case multiple
instances of Kibana use the same instance of Elasticsearch.
Relates #52385
Backport of #54858
Guava was removed from Elasticsearch many years ago, but remnants of it
remain due to transitive dependencies. When a dependency pulls guava
into the compile classpath, devs can inadvertently begin using methods
from guava without realizing it. This commit moves guava to a runtime
dependency in the modules that it is needed.
Note that one special case is the html sanitizer in watcher. The third
party dep uses guava in the PolicyFactory class signature. However, only
calling a method on the PolicyFactory actually causes the class to be
loaded, a reference alone does not trigger compilation to look at the
class implementation. There we utilize a MethodHandle for invoking the
relevant method at runtime, where guava will continue to exist.