The following updates were made:
- Add a new untyped endpoint `{index}/_explain/{id}`.
- Add deprecation warnings to Rest*Action, plus tests in Rest*ActionTests.
- For each REST yml test, make sure there is one version without types, and another legacy version that retains types (called *_with_types.yml).
- Deprecate relevant methods on the Java HLRC requests/ responses.
- Update documentation (for both the REST API and Java HLRC).
This is related to #27260. In Elasticsearch all of the messages that we
serialize to write to the network are composed of heap bytes. When you
read or write to a nio socket in java, the heap memory you passed down
must be copied to/from direct memory. The JVM internally does some
buffering of the direct memory, however it is essentially unbounded.
This commit introduces a simple mechanism of buffering and copying the
memory in transport-nio. Each network event loop is given a 64kb
DirectByteBuffer. When we go to read we use this buffer and copy the
data after the read. Additionally, when we go to write, we copy the data
to the direct memory before calling write. 64KB is chosen as this is the
default receive buffer size we use for transport-netty4
(NETTY_RECEIVE_PREDICTOR_SIZE).
Since we only have one buffer per thread, we could afford larger.
However, if we the buffer is large and not all of the data is flushed in
a write call, we will do excess copies. This is something we can
explore in the future.
* Add deprecation warnings to `Rest*TermVectorsAction`, plus tests in `Rest*TermVectorsActionTests`.
* Deprecate relevant methods on the Java HLRC requests/ responses.
* Update documentation (for both the REST API and Java HLRC).
* For each REST yml test, create one version without types, and another legacy version that retains types (called *_with_types.yml).
We have a few places where we register license state listeners on
transient components (i.e., resources that can be open and closed during
the lifecycle of the server). In one case (the opt-out query cache) we
were never removing the registered listener, effectively a terrible
memory leak. In another case, we were not un-registered the listener
that we registered, since we were not referencing the same instance of
Runnable. This commit does two things:
- introduces a marker interface LicenseStateListener so that it is
easier to identify these listeners in the codebase and avoid classes
that need to register a license state listener from having to
implement Runnable which carries a different semantic meaning than
we want here
- fixes the two places where we are currently leaking license state
listeners
This commit changes the format of the `hits.total` in the search response to be an object with
a `value` and a `relation`. The `value` indicates the number of hits that match the query and the
`relation` indicates whether the number is accurate (in which case the relation is equals to `eq`)
or a lower bound of the total (in which case it is equals to `gte`).
This change also adds a parameter called `rest_total_hits_as_int` that can be used in the
search APIs to opt out from this change (retrieve the total hits as a number in the rest response).
Note that currently all search responses are accurate (`track_total_hits: true`) or they don't contain
`hits.total` (`track_total_hits: true`). We'll add a way to get a lower bound of the total hits in a
follow up (to allow numbers to be passed to `track_total_hits`).
Relates #33028
Made credentials mandatory for xpack migrate tool.
Closes#29847.
The x-pack user and roles APIs aren't available unless security is enabled, so the tool should always be called with the -u and -p options specified.
This commit adds an empty CcrRepository snapshot/restore repository.
When a new cluster is registered in the remote cluster settings, a new
CcrRepository is registered for that cluster.
This is implemented using a new concept of "internal repositories".
RepositoryPlugin now allows implementations to return factories for
"internal repositories". The "internal repositories" are different from
normal repositories in that they cannot be registered through the
external repository api. Additionally, "internal repositories" are local
to a node and are not stored in the cluster state.
The repository will be unregistered if the remote cluster is removed.
This commit makes `document`, `update`, `explain`, `termvectors` and `mapping`
typeless APIs work on indices that have a type whose name is not `_doc`.
Unfortunately, this needs to be a bit of a hack since I didn't want calls with
random type names to see documents with the type name that the user had chosen
upon type creation.
The `explain` and `termvectors` do not support being called without a type for
now so the test is just using `_doc` as a type for now, we will need to fix
tests later but this shouldn't require further changes server-side since passing
`_doc` as a type name is what typeless APIs do internally anyway.
Relates #35190
Introduces a debug log message when a bind fails and a trace message
when a bind succeeds.
It may seem strange to only debug a bind failure, but failures of this
nature are relatively common in some realm configurations (e.g. LDAP
realm with multiple user templates, or additional realms configured
after an LDAP realm).
This is a follow-up to #35144. That commit made the underlying
connection opening process in TcpTransport asynchronous. However the
method still blocked on the process being complete before returning.
This commit moves the blocking to the ConnectionManager level. This is
another step towards the top-level TransportService api being async.
This commit upgrades netty. This will close#35360. Netty started
throwing an IllegalArgumentException if a CompositeByteBuf is
created with < 2 components. Netty4Utils was updated to reflect this
change.
This is related to #34405 and a follow-up to #34753. It makes a number
of changes to our current keepalive pings.
The ping interval configuration is moved to the ConnectionProfile.
The server channel now responds to pings. This makes the keepalive
pings bidirectional.
On the client-side, the pings can now be optimized away. What this
means is that if the channel has received a message or sent a message
since the last pinging round, the ping is not sent for this round.
Today the default for USE_ZEN2 is false and it is overridden in many places. By
defaulting it to true we can be sure that the only places in which Zen2 does
not work are those in which it is explicitly set to false.
In #30241 Realm settings were changed, but the Kerberos realm settings
were not registered correctly. This change fixes the registration of
those Kerberos settings.
Also adds a new integration test that ensures every internal realm can
be configured in a test cluster.
Also fixes the QA test for kerberos.
Resolves: #35942
Right now using the `GET /_tasks/<taskid>` API and causing a task to opt
in to saving its result after being completed requires permissions on
the `.tasks` index. When we built this we thought that that was fine,
but we've since moved towards not leaking details like "persisting task
results after the task is completed is done by saving them into an index
named `.tasks`." A more modern way of doing this would be to save the
tasks into the index "under the hood" and to have APIs to manage the
saved tasks. This is the first step down that road: it drops the
requirement to have permissions to interact with the `.tasks` index when
fetching task statuses and when persisting statuses beyond the lifetime
of the task.
In particular, this moves the concept of the "origin" of an action into
a more prominent place in the Elasticsearch server. The origin of an
action is ignored by the server, but the security plugin uses the origin
to make requests on behalf of a user in such a way that the user need
not have permissions to perform these actions. It *can* be made to be
fairly precise. More specifically, we can create an internal user just
for the tasks API that just has permission to interact with the `.tasks`
index. This change doesn't do that, instead, it uses the ubiquitus
"xpack" user which has most permissions because it is simpler. Adding
the tasks user is something I'd like to get to in a follow up change.
Instead, the majority of this change is about moving the "origin"
concept from the security portion of x-pack into the server. This should
allow any code to use the origin. To keep the change managable I've also
opted to deprecate rather than remove the "origin" helpers in the
security code. Removing them is almost entirely mechanical and I'd like
to that in a follow up as well.
Relates to #35573
Clients can use the Kerberos V5 security mechanism and when it
used this to establish security context it failed to do so as
Elasticsearch server only accepted Spengo mechanism.
This commit adds support to accept Kerberos V5 credentials
over spnego.
Closes#34763
- Add the authentication realm and lookup realm name and type in the response for the _authenticate API
- The authentication realm is set as the lookup realm too (instead of setting the lookup realm to null or empty ) when no lookup realm is used.
This commit is related to #32517. It allows an "sni_server_name"
attribute on a DiscoveryNode to be propagated to the server using
the TLS SNI extentsion. Prior to this commit, this functionality
was only support for the netty transport. This commit adds this
functionality to the security nio transport.
This commit adds a test for handling correctly all they possible
`SamlPrepareAuthenticationRequest` parameter combinations that
we might get from Kibana or a custom web application talking to the
SAML APIs.
We can match the correct SAML realm based either on the realm name
or the ACS URL. If both are included in the request then both need to
match the realm configuration.
This generates a synthesized "id" for each incoming request that is
included in the audit logs (file only).
This id can be used to correlate events for the same request (e.g.
authentication success with access granted).
This request.id is specific to the audit logs and is not used for any
other purpose
The request.id is consistent across nodes if a single request requires
execution on multiple nodes (e.g. search acros multiple shards).
When assertions are enabled, a Put User action that have no effect (a
noop update) would trigger an assertion failure and shutdown the node.
This change accepts "noop" as an update result, and adds more
diagnostics to the assertion failure message.
The RestHasPrivilegesAction previously handled its own XContent
generation. This change moves that into HasPrivilegesResponse and
makes the response implement ToXContent.
This allows HasPrivilegesResponseTests to be used to test
compatibility between HLRC and X-Pack internals.
A serialization bug (cluster privs) was also fixed here.
* The port assigned to all loopback interfaces doesn't necessarily have to be the same for ipv4 and ipv6
=> use actual address from profile instead of just port + loopback in test
* Closes#35584
Zen2 is now feature-complete enough to run most ESIntegTestCase tests. The changes in this PR
are as follows:
- ClusterSettingsIT is adapted to not be Zen1 specific anymore (it was using Zen1 settings).
- Some of the integration tests require persistent storage of the cluster state, which is not fully
implemented yet (see #33958). These tests keep running with Zen1 for now but will be switched
over as soon as that is fully implemented.
- Some very few integration tests are not running yet with Zen2 for other reasons, depending on
some of the other open points in #32006.
The DefaultAuthenticationFailureHandler has a deprecated constructor
that was present to prevent a breaking change to custom realm plugin
authors in 6.x. This commit removes the constructor and its uses.
For some time, the PutUser REST API has supported storing a pre-hashed
password for a user. The change adds validation and tests around that
feature so that it can be documented & officially supported.
It also prevents the request from containing both a "password" and a "password_hash".
Many realm tests were written to use separate setting objects for
"global settings" and "realm settings".
Since #30241 there is no distinction between these settings, so these
tests can be cleaned up to use a single Settings object.
This is related to #34483. It introduces a namespaced setting for
compression that allows users to configure compression on a per remote
cluster basis. The transport.tcp.compress remains as a fallback
setting. If transport.tcp.compress is set to true, then all requests
and responses are compressed. If it is set to false, only requests to
clusters based on the cluster.remote.cluster_name.transport.compress
setting are compressed. However, after this change regardless of any
local settings, responses will be compressed if the request that is
received was compressed.
- Introduces a transport API for bootstrapping a Zen2 cluster
- Introduces a transport API for requesting the set of nodes that a
master-eligible node has discovered and for waiting until this comprises the
expected number of nodes.
- Alters ESIntegTestCase to use these APIs when forming a cluster, rather than
injecting the initial configuration directly.
There is no longer a concept of non-global "realm settings". All realm
settings should be loaded from the node's settings using standard
Setting classes.
This change renames the "globalSettings" field and method to simply be
"settings".
The file realm has not supported custom filenames/locations since at
least 5.0, but this test still tried to configure them.
Remove all configuration of file locations, and cleaned up a few other
warnings and deprecations
With this change, `Version` no longer carries information about the qualifier,
we still need a way to show the "display version" that does have both
qualifier and snapshot. This is now stored by the build and red from `META-INF`.
This is related to #29023. Additionally at other points we have
discussed a preference for removing the need to unnecessarily block
threads for opening new node connections. This commit lays the groudwork
for this by opening connections asynchronously at the transport level.
We still block, however, this work will make it possible to eventually
remove all blocking on new connections out of the TransportService
and Transport.
This moves all Realm settings to an Affix definition.
However, because different realm types define different settings
(potentially conflicting settings) this requires that the realm type
become part of the setting key.
Thus, we now need to define realm settings as:
xpack.security.authc.realms:
file.file1:
order: 0
native.native1:
order: 1
- This is a breaking change to realm config
- This is also a breaking change to custom security realms (SecurityExtension)
Stop passing `Settings` to `AbstractComponent`'s ctor. This allows us to
stop passing around `Settings` in a *ton* of places. While this change
touches many files, it touches them all in fairly small, mechanical
ways, doing a few things per file:
1. Drop the `super(settings);` line on everything that extends
`AbstractComponent`.
2. Drop the `settings` argument to the ctor if it is no longer used.
3. If the file doesn't use `logger` then drop `extends
AbstractComponent` from it.
4. Clean up all compilation failure caused by the `settings` removal
and drop any now unused `settings` isntances and method arguments.
I've intentionally *not* removed the `settings` argument from a few
files:
1. TransportAction
2. AbstractLifecycleComponent
3. BaseRestHandler
These files don't *need* `settings` either, but this change is large
enough as is.
Relates to #34488
Drops the `Settings` member from `AbstractComponent`, moving it from the
base class on to the classes that use it. For the most part this is a
mechanical change that doesn't drop `Settings` accesses. The one
exception to this is naming threads where it switches from an invocation
that passes `Settings` and extracts the node name to one that explicitly
passes the node name.
This change doesn't drop the `Settings` argument from
`AbstractComponent`'s ctor because this change is big enough as is.
We'll do that in a follow up change.
The native roles store previously would issue a search if attempting to
retrieve more than a single role. This is fine when we are attempting
to retrieve all of the roles to list them in the api, but could cause
issues when attempting to find roles for a user. The search is not
prioritized over other search requests, so heavy aggregations/searches
or overloaded nodes could cause roles to be cached as missing if the
search is rejected.
When attempting to load specific roles, we know the document id for the
role that we are trying to load, which allows us to use the multi-get
api for loading these roles. This change makes use of the multi-get api
when attempting to load more than one role by name. This api is also
constrained by a threadpool but the tasks in the GET threadpool should
be quicker than searches.
See #33205
SSLTrustRestrictionsTests.testRestrictionsAreReloaded checks that the
SSL trust configuration is automatically updated reapplied if the
underlying "trust_restrictions.yml" file is modified.
Since the default resource watcher frequency is 5seconds, it could
take 10 second to run that test (as it waits for 2 reloaded).
Previously this test set that frequency to a very low value (3ms) so
that the elapsed time for the test would be reduced. However this
caused other problems, including that the resource watcher would
frequently run while the cluster was shutting down and files were
being cleaned up.
This change resets that watch frequency back to its default (5s) and
then manually calls the "notifyNow" method on the resource watcher
whenever the restrictions file is modified, so that the SSL trust
configuration is reloaded at exactly the right time.
Resolves: #34502
In order to remove Streamable from the codebase, Response objects need
to be read using the Writeable.Reader interface which this change
enables. This change enables the use of Writeable.Reader by adding the
`Action#getResponseReader` method. The default implementation simply
uses the existing `newResponse` method and the readFrom method. As
responses are migrated to the Writeable.Reader interface, Action
classes can be updated to throw an UnsupportedOperationException when
`newResponse` is called and override the `getResponseReader` method.
Relates #34389
This is related to #30876. The AbstractSimpleTransportTestCase initiates
many tcp connections. There are normally over 1,000 connections in
TIME_WAIT at the end of the test. This is because every test opens at
least two different transports that connect to each other with 13
channel connection profiles. This commit modifies the default
connection profile used by this test to 6. One connection for each
type, except for REG which gets 2 connections.
* NETWORKING: Add SSL Handler before other Handlers
* The only way to run into the issue in #33998 is for `Netty4MessageChannelHandler`
to be in the pipeline while the SslHandler is not. Adding the SslHandler before any
other handlers should ensure correct ordering here even when we handle upstream events
in our own thread pool
* Ensure that channels that were closed concurrently don't trip the assertion
* Closes#33998
* Adding stack_monitoring_agent role
* Fixing checkstyle issues
* Adding tests for new role
* Tighten up privileges around index templates
* s/stack_monitoring_user/remote_monitoring_collector/ + remote_monitoring_user
* Fixing checkstyle violation
* Fix test
* Removing unused field
* Adding missed code
* Fixing data type
* Update Integration Test for new builtin user
This fixes a bug about aliases authorization.
That is, a user might see aliases which he is not authorized to see.
This manifests when the user is not authorized to see any aliases
and the `GetAlias` request is empty which normally is a marking
that all aliases are requested. In this case, no aliases should be
returned, but due to this bug, all aliases will have been returned.
JDK11 introduced some changes with the SSLEngine. A number of error
messages were changed. Additionally, there were some behavior changes
in regard to how the SSLEngine handles closes during the handshake
process. This commit updates our tests and SSLDriver to support these
changes.
The security native stores follow a pattern where
`SecurityIndexManager#prepareIndexIfNeededThenExecute` wraps most calls
made for the security index. The reasoning behind this was to check if
the security index had been upgraded to the latest version in a
consistent manner. However, this has the potential side effect that a
read will trigger the creation of the security index or an updating of
its mappings, which can lead to issues such as failures due to put
mapping requests timing out even though we might have been able to read
from the index and get the data necessary.
This change introduces a new method, `checkIndexVersionThenExecute`,
that provides the consistent checking of the security index to make
sure it has been upgraded. That is the only check that this method
performs prior to running the passed in operation, which removes the
possible triggering of index creation and mapping updates for reads.
Additionally, areas where we do reads now check the availability of the
security index and can short circuit requests. Availability in this
context means that the index exists and all primaries are active.
This is the fixed version of #34246, which was reverted.
Relates #33205
For user/_has_privileges and user/_privileges, handle the case where
there is no user in the security context. This is likely to indicate
that the server is running with a basic license, in which case the
action will be rejected with a non-compliance exception (provided
we don't throw a NPE).
The implementation here is based on the _authenticate API.
Resolves: #34567
The logfile audit log format is no longer formed by prefix fields followed
by key value fields, it is all formed by key value fields only (JSON format).
Consequently, the following settings, which toggled some of the prefix
fields, have been renamed:
audit.logfile .prefix.emit_node_host_address
audit.logfile .prefix.emit_node_host_name
audit.logfile .prefix.emit_node_name
This API is intended as a companion to the _has_privileges API.
It returns the list of privileges that are held by the current user.
This information is difficult to reason about, and consumers should
avoid making direct security decisions based solely on this data.
For example, each of the following index privileges (as well as many
more) would grant a user access to index a new document into the
"metrics-2018-08-30" index, but clients should not try and deduce
that information from this API.
- "all" on "*"
- "all" on "metrics-*"
- "write" on "metrics-2018-*"
- "write" on "metrics-2018-08-30"
Rather, if a client wished to know if a user had "index" access to
_any_ index, it would be possible to use this API to determine whether
the user has any index privileges, and on which index patterns, and
then feed those index patterns into _has_privileges in order to
determine whether the "index" privilege had been granted.
The result JSON is modelled on the Role API, with a few small changes
to reflect how privileges are modelled when multiple roles are merged
together (multiple DLS queries, multiple FLS grants, multiple global
conditions, etc).
This reverts commit 0b4e8db1d3 as some
issues have been identified with the changed handling of a primary
shard of the security index not being available.
The token service has fairly strict validation and there are a range
of reasons why request may be rejected.
The detail is typically returned in the client exception / json body
but the ES admin can only debug that if they have access to detailed
logs from the client.
This commit adds debug & trace logging to the token service so that it
is possible to perform this debugging from the server side if
necessary.
The security native stores follow a pattern where
`SecurityIndexManager#prepareIndexIfNeededThenExecute` wraps most calls
made for the security index. The reasoning behind this was to check if
the security index had been upgraded to the latest version in a
consistent manner. However, this has the potential side effect that a
read will trigger the creation of the security index or an updating of
its mappings, which can lead to issues such as failures due to put
mapping requests timing out even though we might have been able to read
from the index and get the data necessary.
This change introduces a new method, `checkIndexVersionThenExecute`,
that provides the consistent checking of the security index to make
sure it has been upgraded. That is the only check that this method
performs prior to running the passed in operation, which removes the
possible triggering of index creation and mapping updates for reads.
Additionally, areas where we do reads now check the availability of the
security index and can short circuit requests. Availability in this
context means that the index exists and all primaries are active.
Relates #33205
Security caches the result of role lookups and negative lookups are
cached indefinitely. In the case of transient failures this leads to a
bad experience as the roles could truly exist. The CompositeRolesStore
needs to know if a failure occurred in one of the roles stores in order
to make the appropriate decision as it relates to caching. In order to
provide this information to the CompositeRolesStore, the return type of
methods to retrieve roles has changed to a new class,
RoleRetrievalResult. This class provides the ability to pass back an
exception to the roles store. This exception does not mean that a
request should be failed but instead serves as a signal to the roles
store that missing roles should not be cached and neither should the
combined role if there are missing roles.
As part of this, the negative lookup cache was also changed from an
unbounded cache to a cache with a configurable limit.
Relates #33205
PR #34290 made it impossible to use thread-context values to pass
authentication metadata out of a realm. The SAML realm used this
technique to allow the SamlAuthenticateAction to process the parsed
SAML token, and apply them to the access token that was generated.
This new method adds metadata to the AuthenticationResult itself, and
then the authentication service makes this result available on the
thread context.
Closes: #34332
ListenableFuture may run a listener on the same thread that called the
addListener method or it may execute on another thread after the future
has completed. Whenever the ListenableFuture stores the listener for
execution later, it should preserve the thread context which is what
this change does.
Since all calls to `ESLoggerFactory` outside of the logging package were
deprecated, it seemed like it'd simplify things to migrate all of the
deprecated calls and declare `ESLoggerFactory` to be package private.
This does that.
The "lookupUser" method on a realm facilitates the "run-as" and
"authorization_realms" features.
This commit allows a realm to be used for "lookup only", in which
case the "authenticate" method (and associated token methods) are
disabled.
It does this through the introduction of a new
"authentication.enabled" setting, which defaults to true.
Building automatons can be costly. For the most part we cache things
that use automatons so the cost is limited.
However:
- We don't (currently) do that everywhere (e.g. we don't cache role
mappings)
- It is sometimes necessary to clear some of those caches which can
cause significant CPU overhead and processing delays.
This commit introduces a new cache in the Automatons class to avoid
unnecesarily recomputing automatons.
There may be values in the thread context that ought to be preseved
for later use, even if one or more realms perform asynchronous
authentication.
This commit changes the AuthenticationService to wrap the potentially
asynchronous calls in a ContextPreservingActionListener that retains
the original thread context for the authentication.
The Security plugin authorizes actions on indices. Authorization
happens on a per index/alias basis. Therefore a request with a
Multi Index Expression (containing wildcards) has to be
first evaluated in the authorization layer, before the request is
handled. For authorization purposes, wildcards in expressions will
only be expanded to indices/aliases that are visible by the authenticated
user. However, this "constrained" evaluation has to be compatible with
the expression evaluation that a cluster without the Security plugin
would do. Therefore any change in the evaluation logic
in any of these sites has to be mirrored in the other site.
This commit mirrors the changes in core from #33518 that allowed
for Multi Index Expression in the Get Alias API, loosely speaking.
When the cluster.routing.allocation.disk.watermark.flood_stage watermark
is breached, DiskThresholdMonitor marks the indices as read-only. This
failed when x-pack security was present as system user does not have the privilege
for update settings action("indices:admin/settings/update").
This commit adds the required privilege for the system user. Also added missing
debug logs when access is denied to help future debugging.
An assert statement is added to catch any missed privileges required for
system user.
Closes#33119
In SessionFactoryLoadBalancingTests#testRoundRobinWithFailures()
we kill ldap servers randomly and immediately bind to that port
connecting to mock server socket. This is done to avoid someone else
listening to this port. As the creation of mock socket and binding to the
port is immediate, sometimes the earlier socket would be in TIME_WAIT state
thereby having problems with either bind or connect.
This commit sets the SO_REUSEADDR explicitly to true and also sets
the linger on time to 0(as we are not writing any data) so as to
allow re-use of the port and close immediately.
Note: I could not find other places where this might be problematic
but looking at test runs and netstat output I do see lot of sockets
in TIME_WAIT. If we find that this needs to be addressed we can
wrap ServerSocketFactory to set these options and use that with in
memory ldap server configuration during tests.
Closes#32190
This commit upgrades the unboundid ldapsdk to version 4.0.8. The
primary driver for upgrading is a fix that prevents this library from
rewrapping Error instances that would normally bubble up to the
UncaughtExceptionHandler and terminate the JVM. Other notable changes
include some fixes related to connection handling in the library's
connection pool implementation.
Closes#33175
The `DnRoleMapper` class is used to map distinguished names of groups
and users to role names. This mapper builds in an internal map that
maps from a `com.unboundid.ldap.sdk.DN` to a `Set<String>`. In cases
where a lot of distinct DNs are mapped to roles, this can consume quite
a bit of memory. The majority of the memory is consumed by the DN
object. For example, a 94 character DN that has 9 relative DNs (RDN)
will retain 4KB of memory, whereas the String itself consumes less than
250 bytes.
In order to reduce memory usage, we can map from a normalized DN string
to a List of roles. The normalized string is actually how the DN class
determines equality with another DN and we can drop the overhead of
needing to keep all of the other objects in memory. Additionally the
use of a List provides memory savings as each HashSet is backed by a
HashMap, which consumes a great deal more memory than an appropriately
sized ArrayList. The uniqueness we get from a Set is maintained by
first building a set when parsing the file and then converting to a
list upon completion.
Closes#34237
Today we reverse the initial order of the nested documents when we
index them in order to ensure that parents documents appear after
their children. This means that a query will always match nested documents
in the reverse order of their offsets in the source document.
Reversing all documents is not needed so this change ensures that parents
documents appear after their children without modifying the initial order
in each nested level. This allows to match children in the order of their
appearance in the source document which is a requirement to efficiently
implement #33587. Old indices created before this change will continue
to reverse the order of nested documents to ensure backwark compatibility.
In prior versions of Java, we expected to see a SSLHandshakeException
when starting a handshake with a server that we do not trust. In JDK11,
the exception has changed to a SSLException, which
SSLHandshakeException extends. This is most likely a side effect of the
TLS 1.3 changes in JDK11. This change updates the test to catch the
SSLException instead of the SSLHandshakeException and enables the test
to work on JDK8 through JDK11.
Closes#29989
In prior versions of Java, we expected to see a SSLHandshakeException
when starting a handshake with a server that we do not trust. In JDK11,
the exception has changed to a SSLException, which
SSLHandshakeException extends. This is most likely a side effect of the
TLS 1.3 changes in JDK11. This change updates the test to catch the
SSLException instead of the SSLHandshakeException and enables the test
to work on JDK8 through JDK11.
Closes#32293
`Settings` is no longer required to get a `Logger` and we went to quite
a bit of effort to pass it to the `Logger` getters. This removes the
`Settings` from all of the logger fetches in security and x-pack:core.
Security previously hardcoded a default scroll keepalive of 10 seconds,
but in some cases this is not enough time as there can be network
issues or overloading of host machines. After this change, security
will now use the default keepalive timeout, which is controllable using
a setting and the default value is 5 minutes.
In order to optimize the use of the role cache, when the roles.yml file
is reloaded we now calculate the names of removed, changed, and added
roles so that they may be passed to any listeners. This allows a
listener to selectively clear cache for only the roles that have been
modified. The CompositeRolesStore has been adapted to do exactly that
so that we limit the need to reload roles from sources such as the
native roles stores or external role providers.
See #33205
This change cleans up "unused variable" warnings. There are several cases were we
most likely want to suppress the warnings (especially in the client documentation test
where the snippets contain many unused variables). In a lot of cases the unused
variables can just be deleted though.
* TESTS: Stabilize Renegotiation Test
* The second `startHandshake` is not synchronous and a read of only
50ms may fail to trigger it entirely (the failure can be reproduced reliably by setting the socket timeout to `1`)
=> fixed by retrying the read until the handshake finishes (a longer timeout would've worked too,
but retrying seemed more stable)
* Closes#33772
This commit introduces an AbstractSimpleSecurityTransportTestCase for
security transports. This classes provides transport tests that are
specific for security transports. Additionally, it fixes the tests referenced in
#33285.
SearchGroupsResolverInMemoryTests was (rarely) fail in a way that
suggests that the server-side delay (100ms) was not enough to trigger
the client-side timeout (5ms).
The server side delay has been increased to try and overcome this.
Resolves: #32913
This change adds the OneStatementPerLineCheck to our checkstyle precommit
checks. This rule restricts the number of statements per line to one. The
resoning behind this is that it is very difficult to read multiple statements on
one line. People seem to mostly use it in short lambdas and switch statements in
our code base, but just going through the changes already uncovered some actual
problems in randomization in test code, so I think its worth it.
Changes the format of log events in the audit logfile.
It also changes the filename suffix from `_access` to `_audit`.
The new entry format is consistent with Elastic Common Schema.
Entries are formatted as JSON with no nested objects and field
names have a dotted syntax. Moreover, log entries themselves
are not spaced by commas and there is exactly one entry per line.
In addition, entry fields are ordered, unlike a typical JSON doc,
such that a human would not strain his eyes over jumbled
fields from one line to the other; the order is defined in the log4j2
properties file.
The implementation utilizes the log4j2's `StringMapMessage`.
This means that the application builds the log event as a map
and the log4j logic (the appender's layout) handle the format
internally. The layout, such as the set of printed fields and their
order, can be changed at runtime without restarting the node.
We have a Kerberos setting to remove realm part from the user
principal name (remove_realm_name). If this is true then
the realm name is removed to form username but in the process,
the realm name is lost. For scenarios like Kerberos cross-realm
authentication, one could make use of the realm name to determine
role mapping for users coming from different realms.
This commit adds user metadata for kerberos_realm and
kerberos_user_principal_name.
We have a test dependency on Apache Mina when using SimpleKdcServer
for testing Kerberos. When checking for LDAP backend connectivity,
the code checks for deadlocks which require additional security
permissions accessClassInPackage.sun.reflect. As this is only for
test and we do not want to add security permissions to production,
this commit moves these tests and related classes to
x-pack evil-tests where they can run with security manager disabled.
The plan is to handle the security manager exception in the upstream issue
DIRMINA-1093
and then once the release is available to run these tests with security
manager enabled.
Closes#32739
This change removes the wrapping of the created field in the put user
response. The created field was added as a top level field in #32332,
while also still being wrapped within the `user` object of the
response. Since the value is available in both formats in 6.x, we can
remove the wrapped version for 7.0.
Today we use a special unicast hosts provider, the `MockUncasedHostsProvider`,
in many integration tests, to deal with the dynamic nature of the allocation of
ports to nodes. However #33241 allows us to use file-based discovery to achieve
the same goal, so the special test-only `MockUncasedHostsProvider` is no longer
required.
This change removes `MockUncasedHostProvider` and replaces it with file-based
discovery in tests based on `EsIntegTestCase`.
This change addresses some issues regarding thread safety around
updates and method calls on the XPackLicenseState object. There exists
a possibility that there could be a concurrent update to the
XPackLicenseState when there is a scheduled check to see if the license
is expired and a cluster state update. In order to address this, the
update method now has a synchronized block where member variables are
updated. Each method that reads these variables is now also
synchronized.
Along with the above change, there was a consistency issue around
security calls to the license state. The majority of security checks
make two calls to the license state, which could result in incorrect
behavior due to the checks being made against different license states.
The majority of this behavior was introduced for 6.3 with the inclusion
of x-pack in the default distribution. In order to resolve the majority
of these cases, the `isSecurityEnabled` method is no longer public and
the logic is also included in individual methods about security such as
`isAuthAllowed`. There were a few cases where this did not remove
multiple calls on the license state, so a new method has been added
which creates a copy of the current license state that will not change.
Callers can use this copy of the license state to make decisions based
on a consistent view of the license state.
In some cases we want to deprecate a setting, and then automatically
upgrade uses of that setting to a replacement setting. This commit adds
infrastructure for this so that we can upgrade settings when recovering
the cluster state, as well as when such settings are dynamically applied
on cluster update settings requests. This commit only focuses on cluster
settings, index settings can build on this infrastructure in a
follow-up.
This commit ensures that we bootstrap a new history_uuid when force
allocating a stale primary. A stale primary should never be the source
of an operation-based recovery to another shard which exists before the
forced-allocation.
Closes#26712
Today when checking settings dependencies, we do not check if fallback
settings are present. This means, for example, that if
cluster.remote.*.seeds falls back to search.remote.*.seeds, and
cluster.remote.*.skip_unavailable and search.remote.*.skip_unavailable
depend on cluster.remote.*.seeds, and we have set search.remote.*.seeds
and search.remote.*.skip_unavailable, then validation will fail because
it is expected that cluster.ermote.*.seeds is set here. This commit
addresses this by also checking fallback settings when validating
dependencies. To do this, we adjust the settings exist method to also
check for fallback settings, a case that it was not handling previously.
The main benefit of the upgrade for users is the search optimization for top scored documents when the total hit count is not needed. However this optimization is not activated in this change, there is another issue opened to discuss how it should be integrated smoothly.
Some comments about the change:
* Tests that can produce negative scores have been adapted but we need to forbid them completely: #33309Closes#32899
With features like CCR building on the CCS infrastructure, the settings
prefix search.remote makes less sense as the namespace for these remote
cluster settings than does a more general namespace like
cluster.remote. This commit replaces these settings with cluster.remote
with a fallback to the deprecated settings search.remote.
This commit is related to #32517. It allows an "server_name"
attribute on a DiscoveryNode to be propagated to the server using
the TLS SNI extentsion. This functionality is only implemented for
the netty security transport.
This commit adds a security client to the high level rest client, which
includes an implementation for the put user api. As part of these
changes, a new request and response class have been added that are
specific to the high level rest client. One change here is that the response
was previously wrapped inside a user object. The plan is to remove this
wrapping and this PR adds an unwrapped response outside of the user
object so we can remove the user object later on.
See #29827
Solves all of the xpack line length suppressions and then merges the
remainder of the xpack checkstyle_suppressions.xml file into the core
checkstyle_suppressions.xml file. At this point that just means the
antlr generated files for sql.
It also adds an exclusion to the line length tests for javadocs that
are just a URL. We have one such javadoc and breaking up the line would
make the link difficult to use.