This is modelled on the qa test for TLS on basic.
It starts a cluster on basic with security & performs a number of
security related checks.
It also performs those same checks on a trial license.
This adds support for using security on a basic license.
It includes:
- AllowedRealmType.NATIVE realms (reserved, native, file)
- Roles / RBAC
- TLS (already supported)
It does not support:
- Audit
- IP filters
- Token Service & API Keys
- Advanced realms (AD, LDAP, SAML, etc)
- Advanced roles (DLS, FLS)
- Pluggable security
As with trial licences, security is disabled by default.
This commit does not include any new automated tests, but existing tests have been updated.
This commit introduces the `.security-tokens` and `.security-tokens-7`
alias-index pair. Because index snapshotting is at the index level granularity
(ie you cannot snapshot a subset of an index) snapshoting .`security` had
the undesirable effect of storing ephemeral security tokens. The changes
herein address this issue by moving tokens "seamlessly" (without user
intervention) to another index, so that a "Security Backup" (ie snapshot of
`.security`) would not be bloated by ephemeral data.
Today we allow adding entries from a file or from a string, yet we
internally maintain this distinction such that if you try to add a value
from a file for a setting that expects a string or add a value from a
string for a setting that expects a file, you will have a bad time. This
causes a pain for operators such that for each setting they need to know
this difference. Yet, we do not need to maintain this distinction
internally as they are bytes after all. This commit removes that
distinction and includes logic to upgrade legacy keystores.
This is related to #27260. Currently for the SSLDriver we allocate a
dedicated network write buffer and encrypt the data into that buffer one
buffer at a time. This requires constantly switching between encrypting
and flushing. This commit adds a dedicated outbound buffer for SSL
operations that will internally allocate new packet sized buffers as
they are need (for writing encrypted data). This allows us to totally
encrypt an operation before writing it to the network. Eventually it can
be hooked up to buffer recycling.
This commit also backports the following commit:
Handle WRAP ops during SSL read
It is possible that a WRAP operation can occur while decrypting
handshake data in TLS 1.3. The SSLDriver does not currently handle this
well as it does not have access to the outbound buffer during read call.
This commit moves the buffer into the Driver to fix this issue. Data
wrapped during a read call will be queued for writing after the read
call is complete.
This is related to #27260. Currently for the SSLDriver we allocate a
dedicated network write buffer and encrypt the data into that buffer one
buffer at a time. This requires constantly switching between encrypting
and flushing. This commit adds a dedicated outbound buffer for SSL
operations that will internally allocate new packet sized buffers as
they are need (for writing encrypted data). This allows us to totally
encrypt an operation before writing it to the network. Eventually it can
be hooked up to buffer recycling.
TLS 1.3 changes to the SSLEngine introduced a scenario where a UNWRAP
call during a handshake can consume a close notify alerty without
throwing an exception. This means that we continue down a codepath where
we assert that we are still in handshaking mode. Transitioning to closed
from handshaking is a valid scenario. This commit removes this
assertion.
Today the `_field_caps` API returns the list of indices where a field
is present only if this field has different types within the requested indices.
However if the request is an index pattern (or an alias, or both...) there
is no way to infer the indices if the response contains only fields that have
the same type in all indices. This commit changes the response to always return
the list of indices in the response. It also adds a way to retrieve unmapped field
in a specific section per field called `unmapped`. This section is created for each field
that is present in some indices but not all if the parameter `include_unmapped` is set to
true in the request (defaults to false).
The Has Privileges API allows to tap into the authorization process, to validate
privileges without actually running the operations to be authorized. This commit
fixes a bug, in which the Has Privilege API returned spurious results when checking
for index privileges over restricted indices (currently .security, .security-6,
.security-7). The actual authorization process is not affected by the bug.
hamcrest has some improvements in newer versions, like FileMatchers
that make assertions regarding file exists cleaner. This commit upgrades
to the latest version of hamcrest so we can start using new and improved
matchers.
The `DistinguishedNamePredicate`, used for matching users to role mapping
expressions, should handle users with null DNs. But it fails to do so (and this is
a NPE bug), if the role mapping expression contains a lucene regexp or a wildcard.
The fix simplifies `DistinguishedNamePredicate` to not handle null DNs at all, and
instead use the `ExpressionModel#NULL_PREDICATE` for the DN field, just like
any other missing user field.
When the same alias points to multiple indices we can write to only one index
with `is_write_index` value `true`. The special handling in case of the put
mapping request(to resolve authorized indices) has a check on indices size
for a concrete index. If multiple indices existed then it marked the request
as unauthorized.
The check has been modified to consider write index flag and only when the
requested index matches with the one with write index alias, the alias is considered
for authorization.
Closes#40831
This commit adds an OpenID Connect authentication realm to
elasticsearch. Elasticsearch (with the assistance of kibana or
another web component) acts as an OpenID Connect Relying
Party and supports the Authorization Code Grant and Implicit
flows as described in http://ela.st/oidc-spec. It adds support
for consuming and verifying signed ID Tokens, both RP
initiated and 3rd party initiated Single Sign on and RP
initiated signle logout.
It also adds an OpenID Connect Provider in the idp-fixture to
be used for the associated integration tests.
This is a backport of #40674
For pattern "n:localhost" PatternRule#isLocalhost() matches
any local address, loopback address.
[Note: I think for "localhost" this should not consider IP address
as a match when they are bound to network interfaces. It should just
be loopback address check unless the intent is to match all local addresses.
This class is adopted from Netty3 and I am not sure if this is intended
behavior or maybe I am missing something]
For now I have fixed this assuming the PatternRule#isLocalhost check is
correct by avoiding use of local address to check address denied.
Closes#40194
* moved hlrc parsing tests from xpack to hlrc module and removed dependency on hlrc from xpack core
* deprecated old base test class
* added deprecated jdoc tag
* split test between xpack-core part and hlrc part
* added lang-mustache test dependency, this previously came in via
hlrc dependency.
* added hlrc dependency on a qa module
* duplicated ClusterPrivilegeName class in xpack-core, since x-pack
core no longer has a dependency on hlrc.
* replace ClusterPrivilegeName usages with string literals
* moved tests to dedicated to hlrc packages in order to remove Hlrc part from the name and make sure to use imports instead of full qualified class where possible
* remove ESTestCase. from method invocation and use method directly,
because these tests indirectly extend from ESTestCase
This PR generates deprecation log entries for each Role Descriptor,
used for building a Role, when the Role Descriptor grants more privileges
for an alias compared to an index that the alias points to. This is done in
preparation for the removal of the ability to define privileges over aliases.
There is one log entry for each "role descriptor name"-"alias name" pair.
On such a notice, the administrator is expected to modify the Role Descriptor
definition so that the name pattern for index names does not cover aliases.
Caveats:
* Role Descriptors that are not used in any authorization process,
either because they are not mapped to any user or the user they are mapped to
is not used by clients, are not be checked.
* Role Descriptors are merged when building the effective Role that is used in
the authorization process. Therefore some Role Descriptors can overlap others,
so even if one matches aliases in a deprecated way, and it is reported as such,
it is not at risk from the breaking behavior in the current role mapping configuration
and index-alias configuration. It is still reported because it is a best practice to
change its definition, or remove offending aliases.
* Replace usages RandomizedTestingTask with built-in Gradle Test (#40978)
This commit replaces the existing RandomizedTestingTask and supporting code with Gradle's built-in JUnit support via the Test task type. Additionally, the previous workaround to disable all tasks named "test" and create new unit testing tasks named "unitTest" has been removed such that the "test" task now runs unit tests as per the normal Gradle Java plugin conventions.
(cherry picked from commit 323f312bbc829a63056a79ebe45adced5099f6e6)
* Fix forking JVM runner
* Don't bump shadow plugin version
This opt-out query cache has an unsafe publication issue, where the
cache is exposed to another thread (namely the cluster state update
thread) before the constructor has finished execution. This exposes the
opt-out query cache to concurrency bugs. This commit addresses this by
ensuring that the opt-out query cache is not registered as a listener
for license state changes until after the constructor has returned.
* Avoid sharing source directories as it breaks intellij
* Subprojects share main project output classes directory
* Fix jar hell
* Fix sql security with ssl integ tests
* Relax dependency ordering rule so we don't explode on cycles
This adds a new security/qa test for TLS on a basic license.
It starts a 2 node cluster with a basic license, and TLS enabled
on both HTTP and Transport, and verifies the license type, x-pack
SSL usage and SSL certificates API.
It also upgrades the cluster to a trial license and performs that
same set of checks (to ensure that clusters with basic license
and TLS enabled can be upgraded to a higher feature license)
Backport of: #40714
This change updates our version of httpclient to version 4.5.8, which
contains the fix for HTTPCLIENT-1968, which is a bug where the client
started re-writing paths that contained encoded reserved characters
with their unreserved form.
Many gradle projects specifically use the -try exclude flag, because
there are many cases where auto-closeable resource ignore is never
referenced in body of corresponding try statement. Suppressing this
warning specifically in each case that it happens using
`@SuppressWarnings("try")` would be very verbose.
This change removes `-try` from any gradle project and adds it to the
build plugin. Also this change removes exclude flags from gradle projects
that is already specified in build plugin (for example -deprecation).
Relates to #40366
It is possible to have SSL enabled but security disabled if security
was dynamically disabled by the license type (e.g. trial license).
e.g. In the following configuration:
xpack.license.self_generated.type: trial
# xpack.security not set, default to disabled on trial
xpack.security.transport.ssl.enabled: true
The security feature will be reported as
available: true
enabled: false
And in this case, SSL will be active even though security is not
enabled.
This commit causes the X-Pack feature usage to report the state of the
"ssl" features unless security was explicitly disabled in the
settings.
Backport of: #40672
This adds a new `role_templates` field to role mappings that is an
alternative to the existing roles field.
These templates are evaluated at runtime to determine which roles should be
granted to a user.
For example, it is possible to specify:
"role_templates": [
{ "template":{ "source": "_user_{{username}}" } }
]
which would mean that every user is assigned to their own role based on
their username.
You may not specify both roles and role_templates in the same role
mapping.
This commit adds support for templates to the role mapping API, the role
mapping engine, the Java high level rest client, and Elasticsearch
documentation.
Due to the lack of caching in our role mapping store, it is currently
inefficient to use a large number of templated role mappings. This will be
addressed in a future change.
Backport of: #39984, #40504
This commit introduces 2 changes to application privileges:
- The validation rules now accept a wildcard in the "suffix" of an application name.
Wildcards were always accepted in the application name, but the "valid filename" check
for the suffix incorrectly prevented the use of wildcards there.
- A role may now be defined against a wildcard application (e.g. kibana-*) and this will
be correctly treated as granting the named privileges against all named applications.
This does not allow wildcard application names in the body of a "has-privileges" check, but the
"has-privileges" check can test concrete application names against roles with wildcards.
Backport of: #40398
Replicated closed indices can't be indexed into or searched, and therefore don't need a shard with
full indexing and search capabilities allocated. We can save on a lot of heap memory for those
indices by not allocating a mapper service and caching infrastructure (which preallocates a constant
amount per instance). Before this change, a 1GB ES instance could host 250 replicated closed
metricbeat indices (each index with one shard). After this change, the same instance can host 7300
replicated closed metricbeat instances (not that this would be a recommended configuration). Most
of the remaining memory is in the cluster state and the IndexSettings object.
This refactoring is in the context of the work related to moving security
tokens to a new index. In that regard, the Token Service has to work with
token documents stored in any of the two indices, albeit only as a transient
situation. I reckoned the added complexity as unmanageable,
hence this refactoring.
This is incomplete, as it fails to address the goal of minimizing .security accesses,
but I have stopped because otherwise it would've become a full blown rewrite
(if not already). I will follow-up with more targeted PRs.
In addition to being a true refactoring, some 400 errors moved to 500. Furthermore,
more stringed validation of various return result, has been implemented, notably the
one of the token document creation.
When creating API keys we check for if API key with
the same key name already exists and fail the request if it does.
The check should have been performed with XPackSecurityUser
instead of the authenticated user. This caused the request to fail
in case of the non-super user trying to create an API key.
This commit fixes by executing search action with SECURITY_ORIGIN
so it can be executed with XPackSecurityUser.
Also fixed the Rest test to avoid using a user with `super_user` role.
Closes#40029
The setup-passwords tool gives cryptic messages in case where custom discovery providers are
used (see #33580). As the URL auto-detection logic should be seen as best effort, this commit
improves the exception message to make it clearer what needs to be done to fix the issue.
Relates #33580
`SecurityIndexManager` is hardcoded to handle only the `.security`-`.security-7` alias-index pair.
This commit removes the hardcoded bits, so that the `SecurityIndexManager` can be reused
for other indices, such as the planned security tokens index (`.security-tokens-7`).
The LDAP tests attempt to bind all interfaces,
but if for some reason an interface can't be bound
the tests will stall until the suite times out.
This modifies the tests to be a bit more lenient and allow
some binding to fail so long as at least one succeeds.
This allows the test to continue even in more antagonistic
environments.
A TLS handshake requires exchanging multiple messages to initiate a
session. If one side decides to close during the handshake, it is
supposed to send a close_notify alert (similar to closing during
application data exchange). The java SSLEngine engine throws an
exception when this happens. We currently log this at the warn level if
trace logging is not enabled. This level is too high for a valid
scenario. Additionally it happens all the time in tests (quickly closing
and opened transports). This commit changes this to be logged at the
debug level if trace is not enabled. Additionally, it extracts the
transport security exception handling to a common class.
Previously all the threads were writing the received tokens to a
HashSet. In cases with many threads, sometimes (1 every ~25 tests)
calling size() on the HashSet returned 2 even though it seemed to
contain only one String and there was no evidence from logging that
threadSecurityClient.refreshToken() ever returned a different
access or refresh token.
This commit changes the test to use a ConcurrentHashMap instead,
checking that we only received one pair of access token/refresh token
eventually. It also adds a check so that we won't take into consideration
tokens that are returned after 30s, hence not in the concurrent refresh
time window.
Fixes several errors of the token retry logic:
* not checking for backoff.hasNext() before calling backoff.next()
* checking for backoff.hasNext() without calling backoff.next()
* not preserving the context on the retry
* calling scheduleWithFixedDelay instead of schedule
Today the `GroupedActionListener` accepts a `defaults` parameter but all
callers pass an empty list. Also it is permitted to pass an empty group but
this is trappy because the delegated listener is never be called in that case.
This commit removes the `defaults` parameter and forbids an empty group.
As we are moving to single type indices,
we need to address this change in security-related indexes.
To address this, we are
- updating index templates to use preferred type name `_doc`
- updating the API calls to use preferred type name `_doc`
Upgrade impact:-
In case of an upgrade from 6.x, the security index has type
`doc` and this will keep working as there is a single type and `_doc`
works as an alias to an existing type. The change is handled in the
`SecurityIndexManager` when we load mappings and settings from
the template. Previously, we used to do a `PutIndexTemplateRequest`
with the mapping source JSON with the type name. This has been
modified to remove the type name from the source.
So in the case of an upgrade, the `doc` type is updated
whereas for fresh installs `_doc` is updated. This happens as
backend handles `_doc` as an alias to the existing type name.
An optional step is to `reindex` security index and update the
type to `_doc`.
Since we do not support the security audit log index,
that template has been deleted.
Relates: #38637
This is a backport of #39631
Co-authored-by: Jay Modi jaymode@users.noreply.github.com
This change adds support for the concurrent refresh of access
tokens as described in #36872
In short it allows subsequent client requests to refresh the same token that
come within a predefined window of 60 seconds to be handled as duplicates
of the original one and thus receive the same response with the same newly
issued access token and refresh token.
In order to support that, two new fields are added in the token document. One
contains the instant (in epoqueMillis) when a given refresh token is refreshed
and one that contains a pointer to the token document that stores the new
refresh token and access token that was created by the original refresh.
A side effect of this change, that was however also a intended enhancement
for the token service, is that we needed to stop encrypting the string
representation of the UserToken while serializing. ( It was necessary as we
correctly used a new IV for every time we encrypted a token in serialization, so
subsequent serializations of the same exact UserToken would produce
different access token strings)
This change also handles the serialization/deserialization BWC logic:
In mixed clusters we keep creating tokens in the old format and
consume only old format tokens
In upgraded clusters, we start creating tokens in the new format but
still remain able to consume old format tokens (that could have been
created during the rolling upgrade and are still valid)
When reading/writing TokensInvalidationResult objects, we take into
consideration that pre 7.1.0 these contained an integer field that carried
the attempt count
Resolves#36872
Previously, the security index could be wrongfully recreated. This might
happen if the index was interpreted as missing, as in the case of a fresh
install, but the index existed and the state did not yet recover.
This fix will return HTTP SERVICE_UNAVAILABLE (503) for requests that
try to write to the security index before the state has not been recovered yet.