Out of the box "access granted" audit events are not logged
for system users. The list of system users was stale and included
only the _system and _xpack users. This commit expands this list
with _xpack_security and _async_search, effectively reducing the
auditing noise by not logging the audit events of these system
users out of the box.
Closes#37924
This commit upgrades the ASM dependency used in the feature aware check
to 7.3.1. This gives support for JDK 14. Additionally, now that Gradle
understands JDK 13, it means we can remove a restriction on running the
feature aware check to JDK 12 and lower.
This change reworks the loading and monitoring of files that are used
for the construction of SSLContexts so that updates to these files are
not lost if the updates occur during startup. Previously, the
SSLService would parse the settings, build the SSLConfiguration
objects, and construct the SSLContexts prior to the
SSLConfigurationReloader starting to monitor these files for changes.
This allowed for a small window where updates to these files may never
be observed until the node restarted.
To remove the potential miss of a change to these files, the code now
parses the settings and builds SSLConfiguration instances prior to the
construction of the SSLService. The files back the SSLConfiguration
instances are then registered for monitoring and finally the SSLService
is constructed from the previously parse SSLConfiguration instances. As
the SSLService is not constructed when the code starts monitoring the
files for changes, a CompleteableFuture is used to obtain a reference
to the SSLService; this allows for construction of the SSLService to
complete and ensures that we do not miss any file updates during the
construction of the SSLService.
While working on this change, the SSLConfigurationReloader was also
refactored to reflect how it is currently used. When the
SSLConfigurationReloader was originally written the files that it
monitored could change during runtime. This is no longer the case as
we stopped the monitoring of files that back dynamic SSLContext
instances. In order to support the ability for items to change during
runtime, the class made use of concurrent data structures. The use of
these concurrent datastructures has been removed.
Closes#54867
Backport of #54999
Security features in the license state currently do a dynamic check on
whether security is enabled. This is because the license level can
change the default security enabled state. This commit splits out the
check on security being enabled, so that the combo method of security
enabled plus license allowed is no longer necessary.
We believe there's no longer a need to be able to disable basic-license
features completely using the "xpack.*.enabled" settings. If users don't
want to use those features, they simply don't need to use them. Having
such features always available lets us build more complex features that
assume basic-license features are present.
This commit deprecates settings of the form "xpack.*.enabled" for
basic-license features, excluding "security", which is a special case.
It also removes deprecated settings from integration tests and unit
tests where they're not directly relevant; e.g. monitoring and ILM are
no longer disabled in many integration tests.
Backport from: #54726
The INCLUDE_DATA_STREAMS indices option controls whether data streams can be resolved in an api for both concrete names and wildcard expressions. If data streams cannot be resolved then a 400 error is returned indicating that data streams cannot be used.
In this pr, the INCLUDE_DATA_STREAMS indices option is enabled in the following APIs: search, msearch, refresh, index (op_type create only) and bulk (index requests with op type create only). In a subsequent later change, we will determine which other APIs need to be able to resolve data streams and enable the INCLUDE_DATA_STREAMS indices option for these APIs.
Whether an api resolve all backing indices of a data stream or the latest index of a data stream (write index) depends on the IndexNameExpressionResolver.Context.isResolveToWriteIndex().
If isResolveToWriteIndex() returns true then data streams resolve to the latest index (for example: index api) and otherwise a data stream resolves to all backing indices of a data stream (for example: search api).
Relates to #53100
Today we pass the `RepositoriesService` to the searchable snapshots plugin
during the initialization of the `RepositoryModule`, forcing the plugin to be a
`RepositoryPlugin` even though it does not implement any repositories.
After discussion we decided it best for now to pass this in via
`Plugin#createComponents` instead, pending some future work in which plugins
can depend on services more dynamically.
This commit fixes our behavior regarding the responses we
return in various cases for the use of token related APIs.
More concretely:
- In the Get Token API with the `refresh` grant, when an invalid
(already deleted, malformed, unknown) refresh token is used in the
body of the request, we respond with `400` HTTP status code
and an `error_description` header with the message "could not
refresh the requested token".
Previously we would return erroneously return a `401` with "token
malformed" message.
- In the Invalidate Token API, when using an invalid (already
deleted, malformed, unknown) access or refresh token, we respond
with `404` and a body that shows that no tokens were invalidated:
```
{
"invalidated_tokens":0,
"previously_invalidated_tokens":0,
"error_count":0
}
```
The previous behavior would be to erroneously return
a `400` or `401` ( depending on the case ).
- In the Invalidate Token API, when the tokens index doesn't
exist or is closed, we return `400` because we assume this is
a user issue either because they tried to invalidate a token
when there is no tokens index yet ( i.e. no tokens have
been created yet or the tokens index has been deleted ) or the
index is closed.
- In the Invalidate Token API, when the tokens index is
unavailable, we return a `503` status code because
we want to signal to the caller of the API that the token they
tried to invalidate was not invalidated and we can't be sure
if it is still valid or not, and that they should try the request
again.
Resolves: #53323
The ResourceWatcherService enables watching of files for modifications
and deletions. During startup various consumers register the files that
should be watched by this service. There is behavior that might be
unexpected in that the service may not start polling until later in the
startup process due to the use of lifecycle states to control when the
service actually starts the jobs to monitor resources. This change
removes this unexpected behavior so that upon construction the service
has already registered its tasks to poll resources for changes. In
making this modification, the service no longer extends
AbstractLifecycleComponent and instead implements the Closeable
interface so that the polling jobs can be terminated when the service
is no longer required.
Relates #54867
Backport of #54993
I've noticed that a lot of our tests are using deprecated static methods
from the Hamcrest matchers. While this is not a big deal in any
objective sense, it seems like a small good thing to reduce compilation
warnings and be ready for a new release of the matcher library if we
need to upgrade. I've also switched a few other methods in tests that
have drop-in replacements.
Currently forbidden apis accounts for 800+ tasks in the build. These
tasks are aggressively created by the plugin. In forbidden apis 3.0, we
will get task avoidance
(https://github.com/policeman-tools/forbidden-apis/pull/162), but we
need to ourselves use the same task avoidance mechanisms to not trigger
these task creations. This commit does that for our foribdden apis
usages, in preparation for upgrading to 3.0 when it is released.
We implicitly only supported the prime256v1 ( aka secp256r1 )
curve for the EC keys we read as PEM files to be used in any
SSL Context. We would not fail when trying to read a key
pair using a different curve but we would silently assume
that it was using `secp256r1` which would lead to strange
TLS handshake issues if the curve was actually another one.
This commit fixes that behavior in that it
supports parsing EC keys that use any of the named curves
defined in rfc5915 and rfc5480 making no assumptions about
whether the security provider in use supports them (JDK8 and
higher support all the curves defined in rfc5480).
This commit refactors the `AuditTrail` to use the `TransportRequest` as a parameter
for all its audit methods, instead of the current `TransportMessage` super class.
The goal is to gain access to the `TransportRequest#parentTaskId` member,
so that it can be audited. The `parentTaskId` is used internally when spawning tasks
that handle transport requests; in this way tasks across nodes are related by the
same parent task.
Relates #52314
This is a first cut at giving NodeInfo the ability to carry a flexible
list of heterogeneous info responses. The trick is to be able to
serialize and deserialize an arbitrary list of blocks of information. It
is convenient to be able to deserialize into usable Java objects so that
we can aggregate nodes stats for the cluster stats endpoint.
In order to provide a little bit of clarity about which objects can and
can't be used as info blocks, I've introduced a new interface called
"ReportingService."
I have removed the hard-coded getters (e.g., getOs()) in favor of a
flexible method that can return heterogeneous kinds of info blocks
(e.g., getInfo(OsInfo.class)). Taking a class as an argument removes the
need to cast in the client code.
The isAuthAllowed() method for license checking is used by code that
wants to ensure security is both enabled and available. The enabled
state is dynamic and provided by isSecurityEnabled(). But since security
is available with all license types, an check on the license level is
not necessary. Thus, this change replaces isAuthAllowed() with calling
isSecurityEnabled().
This change reintroduces the system index APIs for Kibana without the
changes made for marking what system indices could be accessed using
these APIs. In essence, this is a partial revert of #53912. The changes
for marking what system indices should be allowed access will be
handled in a separate change.
The APIs introduced here are wrapped versions of the existing REST
endpoints. A new setting is also introduced since the Kibana system
indices' names are allowed to be changed by a user in case multiple
instances of Kibana use the same instance of Elasticsearch.
Relates #52385
Backport of #54858
Guava was removed from Elasticsearch many years ago, but remnants of it
remain due to transitive dependencies. When a dependency pulls guava
into the compile classpath, devs can inadvertently begin using methods
from guava without realizing it. This commit moves guava to a runtime
dependency in the modules that it is needed.
Note that one special case is the html sanitizer in watcher. The third
party dep uses guava in the PolicyFactory class signature. However, only
calling a method on the PolicyFactory actually causes the class to be
loaded, a reference alone does not trigger compilation to look at the
class implementation. There we utilize a MethodHandle for invoking the
relevant method at runtime, where guava will continue to exist.
This change ensures that the AsyncSearchUser is correctly (de)serialized when
an action executed by this user is sent to a remote node internally (via transport client).
* Refactor nodes stats request builders to match requests (#54363)
* Remove hard-coded setters from NodesInfoRequestBuilder
* Remove hard-coded setters from NodesStatsRequest
* Use static imports to reduce clutter
* Remove uses of old info APIs
Refactor SearchHit to have separate document and meta fields.
This is a part of bigger refactoring of issue #24422 to remove
dependency on MapperService to check if a field is metafield.
Relates to PR: #38373
Relates to issue #24422
Co-authored-by: sandmannn <bohdanpukalskyi@gmail.com>
This is a follow up to a previous commit that renamed MetaData to
Metadata in all of the places. In that commit in master, we renamed
META_DATA to METADATA, but lost this on the backport. This commit
addresses that.
This is a simple naming change PR, to fix the fact that "metadata" is a
single English word, and for too long we have not followed general
naming conventions for it. We are also not consistent about it, for
example, METADATA instead of META_DATA if we were trying to be
consistent with MetaData (although METADATA is correct when considered
in the context of "metadata"). This was a simple find and replace across
the code base, only taking a few minutes to fix this naming issue
forever.
Backport of #53982
In order to prepare the `AliasOrIndex` abstraction for the introduction of data streams,
the abstraction needs to be made more flexible, because currently it really can be only
an alias or an index.
* Renamed `AliasOrIndex` to `IndexAbstraction`.
* Introduced a `IndexAbstraction.Type` enum to indicate what a `IndexAbstraction` instance is.
* Replaced the `isAlias()` method that returns a boolean with the `getType()` method that returns the new Type enum.
* Moved `getWriteIndex()` up from the `IndexAbstraction.Alias` to the `IndexAbstraction` interface.
* Moved `getAliasName()` up from the `IndexAbstraction.Alias` to the `IndexAbstraction` interface and renamed it to `getName()`.
* Removed unnecessary casting to `IndexAbstraction.Alias` by just checking the `getType()` method.
Relates to #53100
Today the security plugin stashes a copy of the environment in its
constructor, and uses the stashed copy to construct its components even
though it is provided with an environment to create these
components. What is more, the environment it creates in its constructor
is not fully initialized, as it does not have the final copy of the
settings, but the environment passed in while creating components
does. This commit removes that stashed copy of the environment.
Currently all of our transport protocol decoding and aggregation occurs
in the individual transport modules. This means that each implementation
(test, netty, nio) must implement this logic. Additionally, it means
that the entire message has been read from the network before the server
package receives it.
This commit creates a pipeline in server which can be passed arbitrary
bytes to handle. Internally, the pipeline will decode, decompress, and
aggregate the messages. Additionally, this allows us to run many
megabytes of bytes through the pipeline in tests to ensure that the
logic works.
This work will enable future work:
Circuit breaking or backoff logic based on message type and byte
in the content aggregator.
Sharing bytes with the application layer using the ref counted
releasable network bytes.
Improved network monitoring based specifically on channels.
Finally, this fixes the bug where we do not circuit break on the correct
message size when compression is enabled.
Changes ThreadPool's schedule method to run the schedule task in the context of the thread
that scheduled the task.
This is the more sensible default for this method, and eliminates a range of bugs where the
current thread context is mistakenly dropped.
Closes#17143
Xpack license state contains a helper method to determine whether
security is disabled due to license level defaults. Most code needs to
know whether security is enabled, not disabled, but this method exists
so that the security being explicitly disabled can be distinguished from
licence level defaulting to disabled. However, in the case that security
is explicitly disabled, the handlers in question are never registered,
so security is implicitly not disabled explicitly, and thus we can share
a single method to know whether licensing is enabled.
This change merges the "feature-internal-idp" branch into Elasticsearch.
This introduces a small identity-provider plugin as a child of the x-pack module.
This allows ES to act as a SAML IdP, for users who are authenticated against the
Elasticsearch cluster.
This feature is intended for internal use within Elastic Cloud environments
and is not supported for any other use case. It falls under an enterprise license tier.
The IdP is disabled by default.
Co-authored-by: Ioannis Kakavas <ioannis@elastic.co>
Co-authored-by: Tim Vernum <tim.vernum@elastic.co>
Role names are now compiled from role templates before role mapping is saved.
This serves as validation for role templates to prevent malformed and invalid scripts
to be persisted, which could later break authentication.
Resolves: #48773
This commit removes the configuration time vs execution time distinction
with regards to certain BuildParms properties. Because of the cost of
determining Java versions for configuration JDK locations we deferred
this until execution time. This had two main downsides. First, we had
to implement all this build logic in tasks, which required a bunch of
additional plumbing and complexity. Second, because some information
wasn't known during configuration time, we had to nest any build logic
that depended on this in awkward callbacks.
We now defer to the JavaInstallationRegistry recently added in Gradle.
This utility uses a much more efficient method for probing Java
installations vs our jrunscript implementation. This, combined with some
optimizations to avoid probing the current JVM as well as deferring
some evaluation via Providers when probing installations for BWC builds
we can maintain effectively the same configuration time performance
while removing a bunch of complexity and runtime cost (snapshotting
inputs for the GenerateGlobalBuildInfoTask was very expensive). The end
result should be a much more responsive build execution in almost all
scenarios.
(cherry picked from commit ecdbd37f2e0f0447ed574b306adb64c19adc3ce1)
This change adds a "grant API key action"
POST /_security/api_key/grant
that creates a new API key using the privileges of one user ("the
system user") to execute the action, but creates the API key with
the roles of the second user ("the end user").
This allows a system (such as Kibana) to create API keys representing
the identity and access of an authenticated user without requiring
that user to have permission to create API keys on their own.
This also creates a new QA project for security on trial licenses and runs
the API key tests there
Backport of: #52886
This change adds a new exception with consistent metadata for when
security features are not enabled. This allows clients to be able to
tell that an API failed due to a configuration option, and respond
accordingly.
Relates: kibana#55255
Resolves: #52311, #47759
Backport of: #52811
In xpack the license state contains methods to determine whether a
particular feature is allowed to be used. The one exception is
allowsRealmTypes() which returns an enum of the types of realms allowed.
This change converts the enum values to boolean methods. There are 2
notable changes: NONE is removed as we always fall back to basic license
behavior, and NATIVE is not needed because it would always return true
since we should always have a basic license.
It's simple to deprecate a field used in an ObjectParser just by adding deprecation
markers to the relevant ParseField objects. The warnings themselves don't currently
have any context - they simply say that a deprecated field has been used, but not
where in the input xcontent it appears. This commit adds the parent object parser
name and XContentLocation to these deprecation messages.
Note that the context is automatically stripped from warning messages when they
are asserted on by integration tests and REST tests, because randomization of
xcontent type during these tests means that the XContentLocation is not constant
The AuditTrailService has historically been an AuditTrail itself, acting
as a composite of the configured audit trails. This commit removes that
interface from the service and instead builds a composite delegating
implementation internally. The service now has a single get() method to
get an AuditTrail implementation which may be called. If auditing is not
allowed by the license, an empty noop version is returned.
Sometimes we want to deprecate and remove a ParseField entirely, without replacement;
for example, the various places where we specify a _type field in 7x. Currently we can
tell users only that a particular field name should not be used, and that another name should
be used in its place. This commit adds the ability to say that a field should not be used at
all.
Password changes are only allowed when the user is currently
authenticated by a realm (that permits the password to be changed)
and not when authenticated by a bearer token or an API key.
The current implicit behaviour is that when an API keys is used to create another API key,
the child key is created without any privilege. This implicit behaviour is surprising and is
a source of confusion for users.
This change makes that behaviour explicit.
If security was disabled (explicitly), then the SecurityContext would
be null, but the set_security_user processor was still registered.
Attempting to define a pipeline that used that processor would fail
with an (intentional) NPE. This behaviour, introduced in #52032, is a
regression from previous releases where the pipeline was allowed, but
was no usable.
This change restores the previous behaviour (with a new warning).
Backport of: #52691
This change introduces a new API in x-pack basic that allows to track the progress of a search.
Users can submit an asynchronous search through a new endpoint called `_async_search` that
works exactly the same as the `_search` endpoint but instead of blocking and returning the final response when available, it returns a response after a provided `wait_for_completion` time.
````
GET my_index_pattern*/_async_search?wait_for_completion=100ms
{
"aggs": {
"date_histogram": {
"field": "@timestamp",
"fixed_interval": "1h"
}
}
}
````
If after 100ms the final response is not available, a `partial_response` is included in the body:
````
{
"id": "9N3J1m4BgyzUDzqgC15b",
"version": 1,
"is_running": true,
"is_partial": true,
"response": {
"_shards": {
"total": 100,
"successful": 5,
"failed": 0
},
"total_hits": {
"value": 1653433,
"relation": "eq"
},
"aggs": {
...
}
}
}
````
The partial response contains the total number of requested shards, the number of shards that successfully returned and the number of shards that failed.
It also contains the total hits as well as partial aggregations computed from the successful shards.
To continue to monitor the progress of the search users can call the get `_async_search` API like the following:
````
GET _async_search/9N3J1m4BgyzUDzqgC15b/?wait_for_completion=100ms
````
That returns a new response that can contain the same partial response than the previous call if the search didn't progress, in such case the returned `version`
should be the same. If new partial results are available, the version is incremented and the `partial_response` contains the updated progress.
Finally if the response is fully available while or after waiting for completion, the `partial_response` is replaced by a `response` section that contains the usual _search response:
````
{
"id": "9N3J1m4BgyzUDzqgC15b",
"version": 10,
"is_running": false,
"response": {
"is_partial": false,
...
}
}
````
Asynchronous search are stored in a restricted index called `.async-search` if they survive (still running) after the initial submit. Each request has a keep alive that defaults to 5 days but this value can be changed/updated any time:
`````
GET my_index_pattern*/_async_search?wait_for_completion=100ms&keep_alive=10d
`````
The default can be changed when submitting the search, the example above raises the default value for the search to `10d`.
`````
GET _async_search/9N3J1m4BgyzUDzqgC15b/?wait_for_completion=100ms&keep_alive=10d
`````
The time to live for a specific search can be extended when getting the progress/result. In the example above we extend the keep alive to 10 more days.
A background service that runs only on the node that holds the first primary shard of the `async-search` index is responsible for deleting the expired results. It runs every hour but the expiration is also checked by running queries (if they take longer than the keep_alive) and when getting a result.
Like a normal `_search`, if the http channel that is used to submit a request is closed before getting a response, the search is automatically cancelled. Note that this behavior is only for the submit API, subsequent GET requests will not cancel if they are closed.
Asynchronous search are not persistent, if the coordinator node crashes or is restarted during the search, the asynchronous search will stop. To know if the search is still running or not the response contains a field called `is_running` that indicates if the task is up or not. It is the responsibility of the user to resume an asynchronous search that didn't reach a final response by re-submitting the query. However final responses and failures are persisted in a system index that allows
to retrieve a response even if the task finishes.
````
DELETE _async_search/9N3J1m4BgyzUDzqgC15b
````
The response is also not stored if the initial submit action returns a final response. This allows to not add any overhead to queries that completes within the initial `wait_for_completion`.
The `.async-search` index is a restricted index (should be migrated to a system index in +8.0) that is accessible only through the async search APIs. These APIs also ensure that only the user that submitted the initial query can retrieve or delete the running search. Note that admins/superusers would still be able to cancel the search task through the task manager like any other tasks.
Relates #49091
Co-authored-by: Luca Cavanna <javanna@users.noreply.github.com>
This change makes it possible to send secondary authentication
credentials to select endpoints that need to perform a single action
in the context of two users.
Typically this need arises when a server process needs to call an
endpoint that users should not (or might not) have direct access to,
but some part of that action must be performed using the logged-in
user's identity.
Backport of: #52093
This change adds a new parameter to the authenticate methods in the
AuthenticationService to optionally exclude support for the anonymous
user (if an anonymous user exists).
Backport of: #52094
Using a Long alone is not strong enough for the id of search contexts
because we reset the id generator whenever a data node is restarted.
This can lead to two issues:
1. Fetch phase can fetch documents from another index
2. A scroll search can return documents from another index
This commit avoids these issues by adding a UUID to SearchContexId.
This commit changes the Get Aliases API to include hidden indices by
default - this is slightly different from other APIs, but is necessary
to make this API work intuitively.
This commit introduces hidden aliases. These are similar to hidden
indices, in that they are not visible by default, unless explicitly
specified by name or by indicating that hidden indices/aliases are
desired.
The new alias property, `is_hidden` is implemented similarly to
`is_write_index`, except that it must be consistent across all indices
with a given alias - that is, all indices with a given alias must
specify the alias as either hidden, or all specify it as non-hidden,
either explicitly or by omitting the `is_hidden` property.
This commit introduces a module for Kibana that exposes REST APIs that
will be used by Kibana for access to its system indices. These APIs are wrapped
versions of the existing REST endpoints. A new setting is also introduced since
the Kibana system indices' names are allowed to be changed by a user in case
multiple instances of Kibana use the same instance of Elasticsearch.
Additionally, the ThreadContext has been extended to indicate that the use of
system indices may be allowed in a request. This will be built upon in the future
for the protection of system indices.
Backport of #52385
When user A runs as user B and performs any API key related operations,
user B's realm should always be used to associate with the API key.
Currently user A's realm is used when getting or invalidating API keys
and owner=true. The PR is to fix this bug.
resolves: #51975
* Smarter copying of the rest specs and tests (#52114)
This PR addresses the unnecessary copying of the rest specs and allows
for better semantics for which specs and tests are copied. By default
the rest specs will get copied if the project applies
`elasticsearch.standalone-rest-test` or `esplugin` and the project
has rest tests or you configure the custom extension `restResources`.
This PR also removes the need for dozens of places where the x-pack
specs were copied by supporting copying of the x-pack rest specs too.
The plugin/task introduced here can also copy the rest tests to the
local project through a similar configuration.
The new plugin/task allows a user to minimize the surface area of
which rest specs are copied. Per project can be configured to include
only a subset of the specs (or tests). Configuring a project to only
copy the specs when actually needed should help with build cache hit
rates since we can better define what is actually in use.
However, project level optimizations for build cache hit rates are
not included with this PR.
Also, with this PR you can no longer use the includePackaged flag on
integTest task.
The following items are included in this PR:
* new plugin: `elasticsearch.rest-resources`
* new tasks: CopyRestApiTask and CopyRestTestsTask - performs the copy
* new extension 'restResources'
```
restResources {
restApi {
includeCore 'foo' , 'bar' //will include the core specs that start with foo and bar
includeXpack 'baz' //will include x-pack specs that start with baz
}
restTests {
includeCore 'foo', 'bar' //will include the core tests that start with foo and bar
includeXpack 'baz' //will include the x-pack tests that start with baz
}
}
```
Add validation for the following logfile audit settings:
xpack.security.audit.logfile.events.include
xpack.security.audit.logfile.events.exclude
xpack.security.audit.logfile.events.ignore_filters.*.users
xpack.security.audit.logfile.events.ignore_filters.*.realms
xpack.security.audit.logfile.events.ignore_filters.*.roles
xpack.security.audit.logfile.events.ignore_filters.*.indices
Closes#52357
Relates #47711#47038
Follows the example from #47246
This commit renames ElasticsearchAssertions#assertThrows to
assertRequestBuilderThrows and assertFutureThrows to avoid a
naming clash with JUnit 4.13+ and static imports of these methods.
Additionally, these methods have been updated to make use of
expectThrows internally to avoid duplicating the logic there.
Relates #51787
Backport of #52582
This commit modifies the codebase so that our production code uses a
single instance of the IndexNameExpressionResolver class. This change
is being made in preparation for allowing name expression resolution
to be augmented by a plugin.
In order to remove some instances of IndexNameExpressionResolver, the
single instance is added as a parameter of Plugin#createComponents and
PersistentTaskPlugin#getPersistentTasksExecutor.
Backport of #52596
Add enterprise operation mode to properly map enterprise license.
Aslo refactor XPackLicenstate class to consolidate license status and mode checks.
This class has many sychronised methods to check basically three things:
* Minimum operation mode required
* Whether security is enabled
* Whether current license needs to be active
Depends on the actual feature, either 1, 2 or all of above checks are performed.
These are now consolidated in to 3 helper methods (2 of them are new).
The synchronization is pushed down to the helper methods so actual checking
methods no longer need to worry about it.
resolves: #51081
Currently we used the secure random number generate when generating http
request ids in the security AuditUtil. We do not need to be using this
level of randomness for this use case. Additionally, this random number
generator involves locking that blocks the http worker threads at high
concurrency loads.
This commit modifies this randomness generator to use our reproducible
randomness generator for Elasticsearch. This generator will fall back to
thread local random when used in production.
This is useful in cases where the caller of the API needs to know
the name of the realm that consumed the SAML Response and
authenticated the user and this is not self evident (i.e. because
there are many saml realms defined in ES).
Currently, the way to learn the realm name would be to make a
subsequent request to the `_authenticate` API.
ML mappings and index templates have so far been created
programmatically. While this had its merits due to static typing,
there is consensus it would be clear to maintain those in json files.
In addition, we are going to adding ILM policies to these indices
and the component for a plugin to register ILM policies is
`IndexTemplateRegistry`. It expects the templates to be in resource
json files.
For the above reasons this commit refactors ML mappings and index
templates into json resource files that are registered via
`MlIndexTemplateRegistry`.
Backport of #51765
This commit removes the need for DeprecatedRoute and ReplacedRoute to
have an instance of a DeprecationLogger. Instead the RestController now
has a DeprecationLogger that will be used for all deprecated and
replaced route messages.
Relates #51950
Backport of #52278
This commit adds a new security origin, and an associated reserved user
and role, named `_async_search`, which can be used by internal clients to
manage the `.async-search-*` restricted index namespace.
The changes add more granularity for identiying the data ingestion user.
The ingest pipeline can now be configure to record authentication realm and
type. It can also record API key name and ID when one is in use.
This improves traceability when data are being ingested from multiple agents
and will become more relevant with the incoming support of required
pipelines (#46847)
Resolves: #49106
This change extracts the code that previously existed in the
"Authentication" class that was responsible for reading and writing
authentication objects to/from the ThreadContext.
This is needed to support multiple authentication objects under
separate keys.
This refactoring highlighted that there were a large number of places
where we extracted the Authentication/User objects from the thread
context, in a variety of ways. These have been consolidated to rely on
the SecurityContext object.
Backport of: #52032
This commit changes how RestHandlers are registered with the
RestController so that a RestHandler no longer needs to register itself
with the RestController. Instead the RestHandler interface has new
methods which when called provide information about the routes
(method and path combinations) that are handled by the handler
including any deprecated and/or replaced combinations.
This change also makes the publication of RestHandlers safe since they
no longer publish a reference to themselves within their constructors.
Closes#51622
Co-authored-by: Jason Tedor <jason@tedor.me>
Backport of #51950
Now that the FIPS 140 security provider is simply a test dependency
we don't need the thirdPartyAudit exceptions, but plugin-cli and
transport-netty4 do need jarHell disabled as they use the non fips
BouncyCastle security provider as a test dependency too.
Some parts of the User class (e.g. equals/hashCode) assumed that
principal could never be null, but the constructor didn't enforce
that.
This adds a null check into the constructor and fixes a few tests that
relied on being able to pass in null usernames.
Backport of: #51988
The changes are to help users prepare for migration to next major
release (v8.0.0) regarding to the break change of realm order config.
Warnings are added for when:
* A realm does not have an order config
* Multiple realms have the same order config
The warning messages are added to both deprecation API and loggings.
The main reasons for doing this are: 1) there is currently no automatic relay
between the two; 2) deprecation API is under basic and we need logging
for OSS.
This commit switches the strategy for managing dot-prefixed indices that
should be hidden indices from using "fake" system indices to an explicit
exclusions list that must be updated when those indices are converted to
hidden indices.
This commit creates a new index privilege named `maintenance`.
The privilege grants the following actions: `refresh`, `flush` (also synced-`flush`),
and `force-merge`. Previously the actions were only under the `manage` privilege
which in some situations was too permissive.
Co-authored-by: Amir H Movahed <arhd83@gmail.com>
The timeout.tcp_read AD/LDAP realm setting, despite the low-level
allusion, controls the time interval the realms wait for a response for
a query (search or bind). If the connection to the server is synchronous
(un-pooled) the response timeout is analogous to the tcp read timeout.
But the tcp read timeout is irrelevant in the common case of a pooled
connection (when a Bind DN is specified).
The timeout.tcp_read qualifier is hereby deprecated in favor of
timeout.response.
In addition, the default value for both timeout.tcp_read and
timeout.response is that of timeout.ldap_search, instead of the 5s (but
the default for timeout.ldap_search is still 5s). The
timeout.ldap_search defines the server-controlled timeout of a search
request. There is no practical use case to have a smaller tcp_read
timeout compared to ldap_search (in this case the request would time-out
on the client but continue to be processed on the server). The proposed
change aims to simplify configuration so that the more common
configuration change, adjusting timeout.ldap_search up, has the expected
result (no timeout during searches) without any additional
modifications.
Closes#46028
This commit deprecates the creation of dot-prefixed index names (e.g.
.watches) unless they are either 1) a hidden index, or 2) registered by
a plugin that extends SystemIndexPlugin. This is the first step
towards more thorough protections for system indices.
This commit also modifies several plugins which use dot-prefixed indices
to register indices they own as system indices, and adds a plugin to
register .tasks as a system index.
The docs tests have recently been running much slower than before (see #49753).
The gist here is that with ILM/SLM we do a lot of unnecessary setup / teardown work on each
test. Compounded with the slightly slower cluster state storage mechanism, this causes the
tests to run much slower.
In particular, on RAMDisk, docs:check is taking
ES 7.4: 6:55 minutes
ES master: 16:09 minutes
ES with this commit: 6:52 minutes
on SSD, docs:check is taking
ES 7.4: ??? minutes
ES master: 32:20 minutes
ES with this commit: 11:21 minutes
* Reload secure settings with password (#43197)
If a password is not set, we assume an empty string to be
compatible with previous behavior.
Only allow the reload to be broadcast to other nodes if TLS is
enabled for the transport layer.
* Add passphrase support to elasticsearch-keystore (#38498)
This change adds support for keystore passphrases to all subcommands
of the elasticsearch-keystore cli tool and adds a subcommand for
changing the passphrase of an existing keystore.
The work to read the passphrase in Elasticsearch when
loading, which will be addressed in a different PR.
Subcommands of elasticsearch-keystore can handle (open and create)
passphrase protected keystores
When reading a keystore, a user is only prompted for a passphrase
only if the keystore is passphrase protected.
When creating a keystore, a user is allowed (default behavior) to create one with an
empty passphrase
Passphrase can be set to be empty when changing/setting it for an
existing keystore
Relates to: #32691
Supersedes: #37472
* Restore behavior for force parameter (#44847)
Turns out that the behavior of `-f` for the add and add-file sub
commands where it would also forcibly create the keystore if it
didn't exist, was by design - although undocumented.
This change restores that behavior auto-creating a keystore that
is not password protected if the force flag is used. The force
OptionSpec is moved to the BaseKeyStoreCommand as we will presumably
want to maintain the same behavior in any other command that takes
a force option.
* Handle pwd protected keystores in all CLI tools (#45289)
This change ensures that `elasticsearch-setup-passwords` and
`elasticsearch-saml-metadata` can handle a password protected
elasticsearch.keystore.
For setup passwords the user would be prompted to add the
elasticsearch keystore password upon running the tool. There is no
option to pass the password as a parameter as we assume the user is
present in order to enter the desired passwords for the built-in
users.
For saml-metadata, we prompt for the keystore password at all times
even though we'd only need to read something from the keystore when
there is a signing or encryption configuration.
* Modify docs for setup passwords and saml metadata cli (#45797)
Adds a sentence in the documentation of `elasticsearch-setup-passwords`
and `elasticsearch-saml-metadata` to describe that users would be
prompted for the keystore's password when running these CLI tools,
when the keystore is password protected.
Co-Authored-By: Lisa Cawley <lcawley@elastic.co>
* Elasticsearch keystore passphrase for startup scripts (#44775)
This commit allows a user to provide a keystore password on Elasticsearch
startup, but only prompts when the keystore exists and is encrypted.
The entrypoint in Java code is standard input. When the Bootstrap class is
checking for secure keystore settings, it checks whether or not the keystore
is encrypted. If so, we read one line from standard input and use this as the
password. For simplicity's sake, we allow a maximum passphrase length of 128
characters. (This is an arbitrary limit and could be increased or eliminated.
It is also enforced in the keystore tools, so that a user can't create a
password that's too long to enter at startup.)
In order to provide a password on standard input, we have to account for four
different ways of starting Elasticsearch: the bash startup script, the Windows
batch startup script, systemd startup, and docker startup. We use wrapper
scripts to reduce systemd and docker to the bash case: in both cases, a
wrapper script can read a passphrase from the filesystem and pass it to the
bash script.
In order to simplify testing the need for a passphrase, I have added a
has-passwd command to the keystore tool. This command can run silently, and
exit with status 0 when the keystore has a password. It exits with status 1 if
the keystore doesn't exist or exists and is unencrypted.
A good deal of the code-change in this commit has to do with refactoring
packaging tests to cleanly use the same tests for both the "archive" and the
"package" cases. This required not only moving tests around, but also adding
some convenience methods for an abstraction layer over distribution-specific
commands.
* Adjust docs for password protected keystore (#45054)
This commit adds relevant parts in the elasticsearch-keystore
sub-commands reference docs and in the reload secure settings API
doc.
* Fix failing Keystore Passphrase test for feature branch (#50154)
One problem with the passphrase-from-file tests, as written, is that
they would leave a SystemD environment variable set when they failed,
and this setting would cause elasticsearch startup to fail for other
tests as well. By using a try-finally, I hope that these tests will fail
more gracefully.
It appears that our Fedora and Ubuntu environments may be configured to
store journald information under /var rather than under /run, so that it
will persist between boots. Our destructive tests that read from the
journal need to account for this in order to avoid trying to limit the
output we check in tests.
* Run keystore management tests on docker distros (#50610)
* Add Docker handling to PackagingTestCase
Keystore tests need to be able to run in the Docker case. We can do this
by using a DockerShell instead of a plain Shell when Docker is running.
* Improve ES startup check for docker
Previously we were checking truncated output for the packaged JDK as
an indication that Elasticsearch had started. With new preliminary
password checks, we might get a false positive from ES keystore
commands, so we have to check specifically that the Elasticsearch
class from the Bootstrap package is what's running.
* Test password-protected keystore with Docker (#50803)
This commit adds two tests for the case where we mount a
password-protected keystore into a Docker container and provide a
password via a Docker environment variable.
We also fix a logging bug where we were logging the identifier for an
array of strings rather than the contents of that array.
* Add documentation for keystore startup prompting (#50821)
When a keystore is password-protected, Elasticsearch will prompt at
startup. This commit adds documentation for this prompt for the archive,
systemd, and Docker cases.
Co-authored-by: Lisa Cawley <lcawley@elastic.co>
* Warn when unable to upgrade keystore on debian (#51011)
For Red Hat RPM upgrades, we warn if we can't upgrade the keystore. This
commit brings the same logic to the code for Debian packages. See the
posttrans file for gets executed for RPMs.
* Restore handling of string input
Adds tests that were mistakenly removed. One of these tests proved
we were not handling the the stdin (-x) option correctly when no
input was added. This commit restores the original approach of
reading stdin one char at a time until there is no more (-1, \r, \n)
instead of using readline() that might return null
* Apply spotless reformatting
* Use '--since' flag to get recent journal messages
When we get Elasticsearch logs from journald, we want to fetch only log
messages from the last run. There are two reasons for this. First, if
there are many logs, we might get a string that's too large for our
utility methods. Second, when we're looking for a specific message or
error, we almost certainly want to look only at messages from the last
execution.
Previously, we've been trying to do this by clearing out the physical
files under the journald process. But there seems to be some contention
over these directories: if journald writes a log file in between when
our deletion command deletes the file and when it deletes the log
directory, the deletion will fail.
It seems to me that we might be able to use journald's "--since" flag to
retrieve only log messages from the last run, and that this might be
less likely to fail due to race conditions in file deletion.
Unfortunately, it looks as if the "--since" flag has a granularity of
one-second. I've added a two-second sleep to make sure that there's a
sufficient gap between the test that will read from journald and the
test before it.
* Use new journald wrapper pattern
* Update version added in secure settings request
Co-authored-by: Lisa Cawley <lcawley@elastic.co>
Co-authored-by: Ioannis Kakavas <ikakavas@protonmail.com>
This commit sets `xpack.security.ssl.diagnose.trust` to false in all
of our tests when running in FIPS 140 mode and when settings objects
are used to create an instance of the SSLService. This is needed
in 7.x because setting xpack.security.ssl.diagnose.trust to true
wraps SunJSSE TrustManager with our own DiagnosticTrustManager and
this is not allowed when SunJSSE is in FIPS mode.
An alternative would be to set xpack.security.fips.enabled to
true which would also implicitly disable
xpack.security.ssl.diagnose.trust but would have additional effects
(would require that we set PBKDF2 for password hashing algorithm in
all test clusters, would prohibit using JKS keystores in nodes even
if relevant tests have been muted in FIPS mode etc.)
Relates: #49900Resolves: #51268
* Don't overwrite target field with SetSecurityUserProcessor
This change fix problem with `SetSecurityUserProcessor` which was overwriting
whole target field and not only fields really filled by the processor.
Closes#51428
* Unused imports removed
This change changes the way to run our test suites in
JVMs configured in FIPS 140 approved mode. It does so by:
- Configuring any given runtime Java in FIPS mode with the bundled
policy and security properties files, setting the system
properties java.security.properties and java.security.policy
with the == operator that overrides the default JVM properties
and policy.
- When runtime java is 11 and higher, using BouncyCastle FIPS
Cryptographic provider and BCJSSE in FIPS mode. These are
used as testRuntime dependencies for unit
tests and internal clusters, and copied (relevant jars)
explicitly to the lib directory for testclusters used in REST tests
- When runtime java is 8, using BouncyCastle FIPS
Cryptographic provider and SunJSSE in FIPS mode.
Running the tests in FIPS 140 approved mode doesn't require an
additional configuration either in CI workers or locally and is
controlled by specifying -Dtests.fips.enabled=true
The ApiKeyService would aggressively "close" ApiKeyCredentials objects
during processing. However, under rare circumstances, the verfication
of the secret key would be performed asychronously and may need access
to the SecureString after it had been closed by the caller.
The trigger for this would be if the cache already held a Future for
that ApiKey, but the future was not yet complete. In this case the
verification of the secret key would take place asynchronously on the
generic thread pool.
This commit moves the "close" of the credentials to the body of the
listener so that it only occurs after key verification is complete.
Backport of: #51244
API Key expiration value has millisecond precision as we use
{@link Instant#toEpoqueMilli()} when creating the API key
document.
It could often happen that `Instant.now()` Instant in the testCreateApiKey
was close enough to the ApiKeyService's `clock.instant()` Instant,
when the nanos were removed from the latter ( due to the call
to `toEpoqueMilli()` ) the result of comparing these two Instants
was a few nanos short of a 7 days.
Resolves: #47958
When not truncated, a long SAML response XML document can fill max
line length and mask the actual exception message that the trace
statement is meant to inform about.
The same XML Document is also printed in full on trace level in
SamlRequestHandler#parseSamlMessage() so there is no loss of
information
This change introduces a new feature for indices so that they can be
hidden from wildcard expansion. The feature is referred to as hidden
indices. An index can be marked hidden through the use of an index
setting, `index.hidden`, at creation time. One primary use case for
this feature is to have a construct that fits indices that are created
by the stack that contain data used for display to the user and/or
intended for querying by the user. The desire to keep them hidden is
to avoid confusing users when searching all of the data they have
indexed and getting results returned from indices created by the
system.
Hidden indices have the following properties:
* API calls for all indices (empty indices array, _all, or *) will not
return hidden indices by default.
* Wildcard expansion will not return hidden indices by default unless
the wildcard pattern begins with a `.`. This behavior is similar to
shell expansion of wildcards.
* REST API calls can enable the expansion of wildcards to hidden
indices with the `expand_wildcards` parameter. To expand wildcards
to hidden indices, use the value `hidden` in conjunction with `open`
and/or `closed`.
* Creation of a hidden index will ignore global index templates. A
global index template is one with a match-all pattern.
* Index templates can make an index hidden, with the exception of a
global index template.
* Accessing a hidden index directly requires no additional parameters.
Backport of #50452
This commit changes our behavior so that when we receive a
request with an invalid/expired/wrong access token or API Key
we do not fallback to authenticating as the anonymous user even if
anonymous access is enabled for Elasticsearch.
When we receive a request with an Authorization header that contains
a Bearer token that is not generated by us or that is malformed in
some way, attempting to decode it as one of our own might cause a
number of exceptions that are not IOExceptions. This commit ensures
that we catch and log these too and call onResponse with `null, so
that we can return 401 instead of 500.
Resolves: #50497
Replace DES with AES to align with modern encryption standards
Backport also fixs Files.readString API that is not available in Java 8
Resolves: #50843
This change adds a new `kibana_admin` role, and deprecates
the old `kibana_user` and`kibana_dashboard_only_user`roles.
The deprecation is implemented via a new reserved metadata
attribute, which can be consumed from the API and also triggers
deprecation logging when used (by a user authenticating to
Elasticsearch).
Some docs have been updated to avoid references to these
deprecated roles.
Backport of: #46456
Co-authored-by: Larry Gregory <lgregorydev@gmail.com>
This adds a new "http" sub-command to the certutil CLI tool.
The http command generates certificates/CSRs for use on the http
interface of an elasticsearch node/cluster.
It is designed to be a guided tool that provides explanations and
sugestions for each of the configuration options. The generated zip
file output includes extensive "readme" documentation and sample
configuration files for core Elastic products.
Backport of: #49827
Previously custom realms were limited in what services and components
they had easy access to. It was possible to work around this because a
security extension is packaged within a Plugin, so there were ways to
store this components in static/SetOnce variables and access them from
the realm, but those techniques were fragile, undocumented and
difficult to discover.
This change includes key services as an argument to most of the methods
on SecurityExtension so that custom realm / role provider authors can
have easy access to them.
Backport of: #50534
The Document Level Security BitSet cache stores a secondary "lookup
map" so that it can determine which cache entries to invalidate when
a Lucene index is closed (merged, etc).
There was a memory leak because this secondary map was not cleared
when entries were naturally evicted from the cache (due to size/ttl
limits).
This has been solved by adding a cache removal listener and processing
those removal events asyncronously.
Backport of: #50635
When creating a role, we do not check if the exceptions for
the field permissions are a subset of granted fields. If such
a role is assigned to a user then that user's authentication fails
for this reason.
We added a check to validate role query in #46275 and on the same lines,
this commit adds check if the exceptions for the field
permissions is a subset of granted fields when parsing the
index privileges from the role descriptor.
Backport of: #50212
Co-authored-by: Yogesh Gaikwad <bizybot@users.noreply.github.com>
This commit changes the default behavior for
xpack.security.ssl.diagnose.trust when running in a FIPS 140 JVM.
More specifically, when xpack.security.fips_mode.enabled is true:
- If xpack.security.ssl.diagnose.trust is not explicitly set, the
default value of it becomes false and a log message is printed
on info level, notifying of the fact that the TLS/SSL diagnostic
messages are not enabled when in a FIPS 140 JVM.
- If xpack.security.ssl.diagnose.trust is explicitly set, the value of
it is honored, even in FIPS mode.
This is relevant only for 7.x where we support Java 8 in which
SunJSSE can still be used as a FIPS 140 provider for TLS. SunJSSE
in FIPS mode, disallows the use of other TrustManager implementations
than the one shipped with SunJSSE.
Hide the `.async-search-*` in Security by making it a restricted index namespace.
The namespace is hard-coded.
To grant privileges on restricted indices, one must explicitly toggle the
`allow_restricted_indices` flag in the indices permission in the role definition.
As is the case with any other index, if a certain user lacks all permissions for an
index, that index is effectively nonexistent for that user.
The OpenIdConnectRealm had a bug which would cause it not to populate
User metadata for collections contained in the user JWT claims.
This commit fixes that bug.
Backport of: #50521
In security we currently monitor a set of files for changes:
- config/role_mapping.yml (or alternative configured path)
- config/roles.yml
- config/users
- config/users_roles
This commit prevents unnecessary reloading when the file change actually doesn't change the internal structure.
Backport of: #50207
Co-authored-by: Anton Shuvaev <anton.shuvaev91@gmail.com>
This drops all remaining references to `BaseRestHandler.logger` which
has been deprecated for something like a year now. I replaced all of the
references with locally declared loggers which is so much less spooky
action at a distance to me.
This adds a new cluster privilege `monitor_snapshot` which is a restricted
version of `create_snapshot`, granting the same privileges to view
snapshot and repository info and status but not granting the actual
privilege to create a snapshot.
Co-authored-by: j-bean <anton.shuvaev91@gmail.com>
XPackPlugin created an SSLService within the plugin contructor.
This has 2 negative consequences:
1. The service may be constructed based on a partial view of settings.
Other plugins are free to add setting values via the
additionalSettings() method, but this (necessarily) happens after
plugins have been constructed.
2. Any exceptions thrown during the plugin construction are handled
differently than exceptions thrown during "createComponents".
Since SSL configurations exceptions are relatively common, it is
far preferable for them to be thrown and handled as part of the
createComponents flow.
This commit moves the creation of the SSLService to
XPackPlugin.createComponents, and alters the sequence of some other
steps to accommodate this change.
Backport of: #49667
Our REST infrastructure will reject requests that have a body where the
body of the request is never consumed. This ensures that we reject
requests on endpoints that do not support having a body. This requires
cooperation from the REST handlers though, to actually consume the body,
otherwise the REST infrastructure will proceed with rejecting the
request. This commit addresses an issue in the has privileges API where
we would prematurely try to reject a request for not having a username,
before consuming the body. Since the body was not consumed, the REST
infrastructure would instead reject the request as a bad request.
* Remove BlobContainer Tests against Mocks
Removing all these weird mocks as asked for by #30424.
All these tests are now part of real repository ITs and otherwise left unchanged if they had
independent tests that didn't call the `createBlobStore` method previously.
The HDFS tests also get added coverage as a side-effect because they did not have an implementation
of the abstract repository ITs.
Closes#30424
This test was fixed as part of #49736 so that it used a
TokenService mock instance that was enabled, so that token
verification fails because the token is invalid and not because
the token service is not enabled.
When the randomly generated token we send, decodes to being of
version > 7.2 , we need to have mocked a GetResponse for the call
that TokenService#getUserTokenFromId will make, otherwise this
hangs and times out.
Return a 401 in all cases when a request is submitted with an
access token that we can't consume. Before this change, we would
throw a 500 when a request came in with an access token that we
had generated but was then invalidated/expired and deleted from
the tokens index.
Resolves: #38866
Backport of #49736
* Copying the request is not necessary here. We can simply release it once the response has been generated and a lot of `Unpooled` allocations that way
* Relates #32228
* I think the issue that preventet that PR that PR from being merged was solved by #39634 that moved the bulk index marker search to ByteBuf bulk access so the composite buffer shouldn't require many additional bounds checks (I'd argue the bounds checks we add, we save when copying the composite buffer)
* I couldn't neccessarily reproduce much of a speedup from this change, but I could reproduce a very measureable reduction in GC time with e.g. Rally's PMC (4g heap node and bulk requests of size 5k saw a reduction in young GC time by ~10% for me)
- Improves HTTP client hostname verification failure messages
- Adds "DiagnosticTrustManager" which logs certificate information
when trust cannot be established (hostname failure, CA path failure,
etc)
These diagnostic messages are designed so that many common TLS
problems can be diagnosed based solely (or primarily) on the
elasticsearch logs.
These diagnostics can be disabled by setting
xpack.security.ssl.diagnose.trust: false
Backport of: #48911
Add a mirror of the maven repository of the shibboleth project
and upgrade opensaml and related dependencies to the latest
version available version
Resolves: #44947
This change adds a dynamic cluster setting named `indices.id_field_data.enabled`.
When set to `false` any attempt to load the fielddata for the `_id` field will fail
with an exception. The default value in this change is set to `false` in order to prevent
fielddata usage on this field for future versions but it will be set to `true` when backporting
to 7x. When the setting is set to true (manually or by default in 7x) the loading will also issue
a deprecation warning since we want to disallow fielddata entirely when https://github.com/elastic/elasticsearch/issues/26472
is implemented.
Closes#43599
Authentication has grown more complex with the addition of new realm
types and authentication methods. When user authentication does not
behave as expected it can be difficult to determine where and why it
failed.
This commit adds DEBUG and TRACE logging at key points in the
authentication flow so that it is possible to gain addition insight
into the operation of the system.
Backport of: #49575
The AuthenticationService has a feature to "smart order" the realm
chain so that whicherver realm was the last one to successfully
authenticate a given user will be tried first when that user tries to
authenticate again.
There was a bug where the building of this realm order would
incorrectly drop the first realm from the default chain unless that
realm was the "last successful" realm.
In most cases this didn't cause problems because the first realm is
the reserved realm and so it is unusual for a user that authenticated
against a different realm to later need to authenticate against the
resevered realm.
This commit fixes that bug and adds relevant asserts and tests.
Backport of: #49473
All the implementations of `EsBlobStoreTestCase` use the exact same
bootstrap code that is also used by their implementation of
`EsBlobStoreContainerTestCase`.
This means all tests might as well live under `EsBlobStoreContainerTestCase`
saving a lot of code duplication. Also, there was no HDFS implementation for
`EsBlobStoreTestCase` which is now automatically resolved by moving the tests over
since there is a HDFS implementation for the container tests.
This commit adds a deprecation warning when starting
a node where either of the server contexts
(xpack.security.transport.ssl and xpack.security.http.ssl)
meet either of these conditions:
1. The server lacks a certificate/key pair (i.e. neither
ssl.keystore.path not ssl.certificate are configured)
2. The server has some ssl configuration, but ssl.enabled is not
specified. This new validation does not care whether ssl.enabled is
true or false (though other validation might), it simply makes it
an error to configure server SSL without being explicit about
whether to enable that configuration.
Backport of: #45892
This commit changes the ThreadContext to just use a regular ThreadLocal
over the lucene CloseableThreadLocal. The CloseableThreadLocal solves
issues with ThreadLocals that are no longer needed during runtime but
in the case of the ThreadContext, we need it for the runtime of the
node and it is typically not closed until the node closes, so we miss
out on the benefits that this class provides.
Additionally by removing the close logic, we simplify code in other
places that deal with exceptions and tracking to see if it happens when
the node is closing.
Closes#42577
Ensures that methods that are called from different threads ( i.e.
from the callbacks of org.apache.http.concurrent.FutureCallback )
catch `Exception` instead of only the expected checked exceptions.
This resolves a bug where OpenIdConnectAuthenticator#mergeObjects
would throw an IllegalStateException that was never caught causing
the thread to hang and the listener to never be called. This would
in turn cause Kibana requests to authenticate with OpenID Connect
to timeout and fail without even logging anything relevant.
This also guards against unexpected Exceptions that might be thrown
by invoked library methods while performing the necessary operations
in these callbacks.
Backport of #48849. Update `.editorconfig` to make the Java settings the
default for all files, and then apply a 2-space indent to all `*.gradle`
files. Then reformat all the files.
This commit introduces a consistent, and type-safe manner for handling
global build parameters through out our build logic. Primarily this
replaces the existing usages of extra properties with static accessors.
It also introduces and explicit API for initialization and mutation of
any such parameters, as well as better error handling for uninitialized
or eager access of parameter values.
Closes#42042
Previous behavior while copying HTTP headers to the ThreadContext,
would allow multiple HTTP headers with the same name, handling only
the first occurrence and disregarding the rest of the values. This
can be confusing when dealing with multiple Headers as it is not
obvious which value is read and which ones are silently dropped.
According to RFC-7230, a client must not send multiple header fields
with the same field name in a HTTP message, unless the entire field
value for this header is defined as a comma separated list or this
specific header is a well-known exception.
This commits changes the behavior in order to be more compliant to
the aforementioned RFC by requiring the classes that implement
ActionPlugin to declare if a header can be multi-valued or not when
registering this header to be copied over to the ThreadContext in
ActionPlugin#getRestHeaders.
If the header is allowed to be multivalued, then all such headers
are read from the HTTP request and their values get concatenated in
a comma-separated string.
If the header is not allowed to be multivalued, and the HTTP
request contains multiple such Headers with different values, the
request is rejected with a 400 status.
When we load a JSON Web Key (JWKSet) from the specified
file using JWKSet.load it internally uses IOUtils.readFileToString
but the opened FileInputStream is never closed after usage.
https://bitbucket.org/connect2id/nimbus-jose-jwt/issues/342
This commit reads the file and parses the JWKSet from the string.
This also fixes an issue wherein if the underlying file changed,
for every change event it would add another file watcher. The
change is to only add the file watcher at the start.
Closes#44942
The 1MB IO-buffer size per transport thread is causing trouble in
some tests, albeit at a low rate. Reducing the number of transport
threads was not enough to fully fix this situation.
Allowing to configure the size of the buffer and reducing it by
more than an order of magnitude should fix these tests.
Closes#46803
Backport of #48452.
The SAML tests have large XML documents within which various parameters
are replaced. At present, if these test are auto-formatted, the XML
documents get strung out over many, many lines, and are basically
illegible.
Fix this by using named placeholders for variables, and indent the
multiline XML documents.
The tests in `SamlSpMetadataBuilderTests` deserve a special mention,
because they include a number of certificates in Base64. I extracted
these into variables, for additional legibility.
* Extract remote "sniffing" to connection strategy (#47253)
Currently the connection strategy used by the remote cluster service is
implemented as a multi-step sniffing process in the
RemoteClusterConnection. We intend to introduce a new connection strategy
that will operate in a different manner. This commit extracts the
sniffing logic to a dedicated strategy class. Additionally, it implements
dedicated tests for this class.
Additionally, in previous commits we moved away from a world where the
remote cluster connection was mutable. Instead, when setting updates are
made, the connection is torn down and rebuilt. We still had methods and
tests hanging around for the mutable behavior. This commit removes those.
* Introduce simple remote connection strategy (#47480)
This commit introduces a simple remote connection strategy which will
open remote connections to a configurable list of user supplied
addresses. These addresses can be remote Elasticsearch nodes or
intermediate proxies. We will perform normal clustername and version
validation, but otherwise rely on the remote cluster to route requests
to the appropriate remote node.
* Make remote setting updates support diff strategies (#47891)
Currently the entire remote cluster settings infrastructure is designed
around the sniff strategy. As we introduce an additional conneciton
strategy this infrastructure needs to be modified to support it. This
commit modifies the code so that the strategy implementations will tell
the service if the connection needs to be torn down and rebuilt.
As part of this commit, we will wait 10 seconds for new clusters to
connect when they are added through the "update" settings
infrastructure.
* Make remote setting updates support diff strategies (#47891)
Currently the entire remote cluster settings infrastructure is designed
around the sniff strategy. As we introduce an additional conneciton
strategy this infrastructure needs to be modified to support it. This
commit modifies the code so that the strategy implementations will tell
the service if the connection needs to be torn down and rebuilt.
As part of this commit, we will wait 10 seconds for new clusters to
connect when they are added through the "update" settings
infrastructure.
FIPS 140 bootstrap checks should not be bootstrap checks as they
are always enforced. This commit moves the validation logic within
the security plugin.
The FIPS140SecureSettingsBootstrapCheck was not applicable as the
keystore was being loaded on init, before the Bootstrap checks
were checked, so an elasticsearch keystore of version < 3 would
cause the node to fail in a FIPS 140 JVM before the bootstrap check
kicked in, and as such hasn't been migrated.
Resolves: #34772
This PR adds an origin for the Enrich feature, and modifies the background
maintenance task to use the origin when executing client operations.
Without this fix, the maintenance task fails to execute when security is
enabled.
All internal searches (triggered by APIs) across the .security index
must be performed while "under the security origin". Otherwise,
the search is performed in the context of the caller which most
likely does not have privileges to search .security (hopefully).
This commit fixes this in the case of two methods in the
TokenService and corrects an overly done such context switch
in the ApiKeyService.
In addition, this makes all tests from the client/rest-high-level
module execute as an all mighty administrator,
but not a literal superuser.
Closes#47151
Especially in the snapshot code there's a lot
of logic chaining `ActionRunnables` in tricky
ways now and the code is getting hard to follow.
This change introduces two convinience methods that
make it clear that a wrapped listener is invoked with
certainty in some trickier spots and shortens the code a bit.
Use case:
User with `create_doc` index privilege will be allowed to only index new documents
either via Index API or Bulk API.
There are two cases that we need to think:
- **User indexing a new document without specifying an Id.**
For this ES auto generates an Id and now ES version 7.5.0 onwards defaults to `op_type` `create` we just need to authorize on the `op_type`.
- **User indexing a new document with an Id.**
This is problematic as we do not know whether a document with Id exists or not.
If the `op_type` is `create` then we can assume the user is trying to add a document, if it exists it is going to throw an error from the index engine.
Given these both cases, we can safely authorize based on the `op_type` value. If the value is `create` then the user with `create_doc` privilege is authorized to index new documents.
In the `AuthorizationService` when authorizing a bulk request, we check the implied action.
This code changes that to append the `:op_type/index` or `:op_type/create`
to indicate the implied index action.
This commit adds support to retrieve all API keys if the authenticated
user is authorized to do so.
This removes the restriction of specifying one of the
parameters (like id, name, username and/or realm name)
when the `owner` is set to `false`.
Closes#46887
When API key is invalidated we do two things first it tries to trigger `ExpiredApiKeysRemover` task
and second, we do index the invalidation for the API key. The index invalidation may happen
before the `ExpiredApiKeysRemover` task is run and in that case, the API key
invalidated will also get deleted. If the `ExpiredApiKeysRemover` runs before the
API key invalidation is indexed then the API key is not deleted and will be
deleted in the future run.
This behavior was not captured in the tests related to `ExpiredApiKeysRemover`
causing intermittent failures.
This commit fixes those tests by checking if the API key invalidated is reported
back when we get API keys after invalidation and perform the checks based on that.
Closes#41747
The changes introduced in #47179 made it so that we could try to
build an SSLContext with verification mode set to None, which is
not allowed in FIPS 140 JVMs. This commit address that
Fixes multiple Active Directory related tests that run against the
samba fixture. Some were failing since we changed the realm settings
format in 7.0 and a few were slightly broken in other ways.
We can move to cleanup the tests in a follow up but this work fits
better to be done with or after we move the tests from a Samba
based fixture to a real(-ish) Microsoft Active Directory based
fixture.
Resolves: #33425, #35738
Due to a regression bug the metadata Active Directory realm
setting is ignored (it works correctly for the LDAP realm type).
This commit redresses it.
Closes#45848