This commit adds licensing enforcement for FIPS mode through the use of
a bootstrap check, a node join validator, and a check in the license
service. The work done here is based on the current implementation of
the TLS enforcement with a production license.
The bootstrap check is always enforced since we need to enforce the
licensing and this is the best option to do so at the present time.
The default behaviour for "GetPrivileges" is to get all application
privileges. This should only be allowed if the user has access to
the "*" application.
Previously we had two patterns for naming of strict
and lenient parsers.
Some classes had CONFIG_PARSER and METADATA_PARSER,
and used an enum to pass the parser type to nested
parsers.
Other classes had STRICT_PARSER and LENIENT_PARSER
and used ternary operators to pass the parser type
to nested parsers.
This change makes all ML classes use the second of
the patterns described above.
This commit reverts to the pre-6.3 way of merging automata as the
change in 6.3 significantly impacts the performance for roles with a
large number of concrete indices. In addition, the maximum number of
states for security automata has been increased to 100,000 in order
to allow users to use roles that caused problems pre-6.3 and 6.3 fixed.
As an escape hatch, the maximum number of states is configurable with
a setting so that users with complex patterns in roles can increase
the states with the knowledge that there is more memory usage.
In the HL REST client we replace the License object with a string, because of
complexity of this class. It is also not really needed on the client side since
end-users are not interacting with the license besides passing it as a string
to the server.
Relates #29827
This commit introduces "Application Privileges" to the X-Pack security
model.
Application Privileges are managed within Elasticsearch, and can be
tested with the _has_privileges API, but do not grant access to any
actions or resources within Elasticsearch. Their purpose is to allow
applications outside of Elasticsearch to represent and store their own
privileges model within Elasticsearch roles.
Access to manage application privileges is handled in a new way that
grants permission to specific application names only. This lays the
foundation for more OLS on cluster privileges, which is implemented by
allowing a cluster permission to inspect not just the action being
executed, but also the request to which the action is applied.
To support this, a "conditional cluster privilege" is introduced, which
is like the existing cluster privilege, except that it has a Predicate
over the request as well as over the action name.
Specifically, this adds
- GET/PUT/DELETE actions for defining application level privileges
- application privileges in role definitions
- application privileges in the has_privileges API
- changes to the cluster permission class to support checking of request
objects
- a new "global" element on role definition to provide cluster object
level security (only for manage application privileges)
- changes to `kibana_user`, `kibana_dashboard_only_user` and
`kibana_system` roles to use and manage application privileges
Closes#29820Closes#31559
This commit adds support for Kerberos authentication with a platinum
license. Kerberos authentication support relies on SPNEGO, which is
triggered by challenging clients with a 401 response with the
`WWW-Authenticate: Negotiate` header. A SPNEGO client will then provide
a Kerberos ticket in the `Authorization` header. The tickets are
validated using Java's built-in GSS support. The JVM uses a vm wide
configuration for Kerberos, so there can be only one Kerberos realm.
This is enforced by a bootstrap check that also enforces the existence
of the keytab file.
In many cases a fallback authentication mechanism is needed when SPNEGO
authentication is not available. In order to support this, the
DefaultAuthenticationFailureHandler now takes a list of failure response
headers. For example, one realm can provide a
`WWW-Authenticate: Negotiate` header as its default and another could
provide `WWW-Authenticate: Basic` to indicate to the client that basic
authentication can be used in place of SPNEGO.
In order to test Kerberos, unit tests are run against an in-memory KDC
that is backed by an in-memory ldap server. A QA project has also been
added to test against an actual KDC, which is provided by the krb5kdc
fixture.
Closes#30243
In #29623 we added `Request` object flavored requests to the low level
REST client and in #30315 we deprecated the old `performRequest`s. This
changes all calls in the `x-pack:core` project to use the new versions.
* Remove BouncyCastle dependency from runtime
This commit introduces a new gradle project that contains
the classes that have a dependency on BouncyCastle. For
the default distribution, It builds a jar from those and
in puts it in a subdirectory of lib
(/tools/security-cli) along with the BouncyCastle jars.
This directory is then passed in the
ES_ADDITIONAL_CLASSPATH_DIRECTORIES of the CLI tools
that use these classes.
BouncyCastle is removed as a runtime dependency (remains
as a compileOnly one) from x-pack core and x-pack security.
Prior to 6.3 a trial license default to security enabled. Since 6.3
they default to security disabled. If a cluster is upgraded from <6.3
to >6.3, then we detect this and mimic the old behaviour with respect
to security.
Ensure our tests can run in a FIPS JVM
JKS keystores cannot be used in a FIPS JVM as attempting to use one
in order to init a KeyManagerFactory or a TrustManagerFactory is not
allowed.( JKS keystore algorithms for private key encryption are not
FIPS 140 approved)
This commit replaces JKS keystores in our tests with the
corresponding PEM encoded key and certificates both for key and trust
configurations.
Whenever it's not possible to refactor the test, i.e. when we are
testing that we can load a JKS keystore, etc. we attempt to
mute the test when we are running in FIPS 140 JVM. Testing for the
JVM is naive and is based on the name of the security provider as
we would control the testing infrastrtucture and so this would be
reliable enough.
Other cases of tests being muted are the ones that involve custom
TrustStoreManagers or KeyStoreManagers, null TLS Ciphers and the
SAMLAuthneticator class as we cannot sign XML documents in the
way we were doing. SAMLAuthenticator tests in a FIPS JVM can be
reenabled with precomputed and signed SAML messages at a later stage.
IT will be covered in a subsequent PR
Metric config already whitelist scaled_floats, but it wasn't added to
the histo group config. This centralizes the mapping types map
so that both metrics and histo (and any future configs) use the same
map.
Fixes#32035
We can leverage the composite agg's new `missing_bucket` feature on
terms groupings. This means the aggregation criteria used in the indexer
will now return null buckets for missing keys.
Because all buckets are now returned (even if a key is null),
we can guarantee correct doc counts with
"combined" jobs (where a job rolls up multiple schemas). This was
previously impossible since composite would ignore documents that
didn't have _all_ the keys, meaning non-overlapping schemas would
cause composite to return no buckets.
Note: date_histo does not use `missing_bucket`, since a timestamp is
always required.
The docs have been adjusted to recommend a single, combined job. It
also makes reference to the previous issue to help users that are upgrading
(rather than just deleting the sections).
Historically we have loaded SSL objects (such as SSLContext,
SSLIOSessionStrategy) by passing in the SSL settings, constructing a
new SSL configuration from those settings and then looking for a
cached object that matches those settings.
The primary issue with this approach is that it requires a fully
configured Settings object to be available any time the SSL context
needs to be loaded. If the Settings include SecureSettings (such as
passwords for keys or keystores) then this is not true, and the cached
SSL object cannot be loaded at runtime.
This commit introduces an alternative approach of naming every cached
ssl configuration, so that it is possible to load the SSL context for
a named configuration (such as "xpack.http.ssl"). This means that the
calling code does not need to have ongoing access to the secure
settings that were used to load the configuration.
This change also allows monitoring exporters to use SSL passwords
from secure settings, however an exporter that uses a secure SSL setting
(e.g. truststore.secure_password) may not have its SSL settings updated
dynamically (this is prevented by a settings validator).
Exporters without secure settings can continue to be defined and updated
dynamically.
The test failure in #31916 revealed that updating
rules on a job was modifying the detectors list
in-place. That meant the old cluster state and the
updated cluster state had no difference and thus the
change was not propagated to non-master nodes.
This commit fixes that and also reviews all of ML
metadata in order to ensure immutability.
Closes#31916
* Adding Beats x-pack plugin + index templates
* Adding built-in roles for Beats central management
* Fixing typo
* Refactoring: extract common code into method
* More refactoring for more code reuse
* Use a single index for Beats management
* Rename "fragment" to "block"
* Adding configuration block type
* Expand kibana_system role to include Beats management index privileges
* Fixing syntax
* Adding test
* Adding asserting for reserved role
* Fixing privileges
* Updating template
* Removing beats plugin
* Fixing tests
* Fixing role variable name
* Fixing assertions
* Switching to preferred syntax for boolean false checks
* Making class final
* Making variables final
* Updating Basic license message to be more accurate
Originally I put the X-Pack info object into the top level rest client
object. I did that because we thought we'd like to squash `xpack` from
the name of the X-Pack APIs now that it is part of the default
distribution. We still kind of want to do that, but at least for now we
feel like it is better to keep the high level rest client aligned with
the other language clients like C# and Python. This shifts the X-Pack
info API to align with its json spec file.
Relates to #31870
This is the first x-pack API we're adding to the high level REST client
so there is a lot to talk about here!
= Open source
The *client* for these APIs is open source. We're taking the previously
Elastic licensed files used for the `Request` and `Response` objects and
relicensing them under the Apache 2 license.
The implementation of these features is staying under the Elastic
license. This lines up with how the rest of the Elasticsearch language
clients work.
= Location of the new files
We're moving all of the `Request` and `Response` objects that we're
relicensing to the `x-pack/protocol` directory. We're adding a copy of
the Apache 2 license to the root fo the `x-pack/protocol` directory to
line up with the language in the root `LICENSE.txt` file. All files in
this directory will have the Apache 2 license header as well. We don't
want there to be any confusion. Even though the files are under the
`x-pack` directory, they are Apache 2 licensed.
We chose this particular directory layout because it keeps the X-Pack
stuff together and easier to think about.
= Location of the API in the REST client
We've been following the layout of the rest-api-spec files for other
APIs and we plan to do this for the X-Pack APIs with one exception:
we're dropping the `xpack` from the name of most of the APIs. So
`xpack.graph.explore` will become `graph().explore()` and
`xpack.license.get` will become `license().get()`.
`xpack.info` and `xpack.usage` are special here though because they
don't belong to any proper category. For now I'm just calling
`xpack.info` `xPackInfo()` and intend to call usage `xPackUsage` though
I'm not convinced that this is the final name for them. But it does get
us started.
= Jars, jars everywhere!
This change makes the `xpack:protocol` project a `compile` scoped
dependency of the `x-pack:plugin:core` and `client:rest-high-level`
projects. I intend to keep it a compile scoped dependency of
`x-pack:plugin:core` but I intend to bundle the contents of the protocol
jar into the `client:rest-high-level` jar in a follow up. This change
has grown large enough at this point.
In that followup I'll address javadoc issues as well.
= Breaking-Java
This breaks that transport client by a few classes around. We've
traditionally been ok with doing this to the transport client.
It seems that java 11 tightened some validations with regard to
time formats. The random instance creator was setting an odd
time format to the data description which is invalid when run
with java 11. This commit changes it to a valid format.
Today TransportService is tightly coupled with Transport since it
requires an instance of TransportService in order to receive responses
and send requests. This is mainly due to the Request and Response handlers
being maintained in TransportService but also because of the lack of a proper
callback interface.
This change moves request handler registry and response handler registration into
Transport and adds all necessary methods to `TransportConnectionListener` in order
to remove the `TransportService` dependency from `Transport`
Transport now accepts one or more `TransportConnectionListener` instances that are
executed sequentially in a blocking fashion.
Add hard limit to the number of items
a filter may have. This serves to protect
from excessive overhead due to the filters
taking too much memory or lookups becoming
too expensive.
This change adds stats about forecasts, to the jobstats api as well as xpack/_usage. The following
information is collected:
_xpack/ml/anomaly_detectors/{jobid|_all}/_stats:
- total number of forecasts
- memory statistics (mean/min/max)
- runtime statistics
- record statistics
- counts by status
_xpack/usage
- collected by job status as well as overall (_all):
- total number of forecasts
- number of jobs that have at least 1 forecast
- memory, runtime, record statistics
- counts by status
Fixes#31395
Support multiple system store types
When falling back to using the system keystore and - most usually -
truststore, do not assume that it will be a JKS store, but deduct
its type from {@code KeyStore#getDefaultKeyStoreType}. This allows
the use of any store type the Security Provider supports by setting
the keystore.type java security property.
Make password hashing algorithm/cost configurable for the
stored passwords of users for the realms that this applies
(native, reserved). Replaces predefined choice of bcrypt with
cost factor 10.
This also introduces PBKDF2 with configurable cost
(number of iterations) as an algorithm option for password hashing
both for storing passwords and for the user cache.
Password hash validation algorithm selection takes into
consideration the stored hash prefix and only a specific number
of algorithnm and cost factor options for brypt and pbkdf2 are
whitelisted and can be selected in the relevant setting.
This creates a YAML test "features" that indices if the cluster being
tested has xpack installed (`xpack`) or if it does *not* have xpack
installed (`no_xpack`). It uses those features to centralize skipping
a few tests that fail if xpack is installed.
The plan is to use this in a followup to skip docs tests that require
xpack when xpack is not installed. We *plan* to use the declaration
of required license level on the docs page to generate the required
`skip`.
Closes#30933.
TransportAction currently contains 2 doExecute methods, one which takes
a the task, and one that does not. The latter is what some subclasses
implement, while the first one just calls the latter, dropping the given
task. This commit combines these methods, in favor of just always
assuming a task is present.
This adds an api to allow updating a filter:
POST _xpack/ml/filters/{filter_id}/_update
The request body may have:
- description: setting a new description
- add_items: a list of the items to add
- remove_items: a list of the items to remove
This commit also changes the PUT filter api to
error when the filter_id is already used. As
now there is an api for updating filters, the
put api should only be used to create new ones.
Also, updating a filter results into a notification
message auditing the change for every job that is
using that filter.
Most transport actions don't need the node ThreadPool. This commit
removes the ThreadPool as a super constructor parameter for
TransportAction. The actions that do need the thread pool then have a
member added to keep it from their own constructor.
Most transport actions don't need to resolve index names. This commit
removes the index name resolver as a super constructor parameter for
TransportAction. The actions that do need the resolver then have a
member added to keep the resolver from their own constructor.
The changes made to disable security for trial licenses unless security
is explicitly enabled caused issues when a 6.3 node attempts to join a
cluster that already has a production license installed. The new node
starts off with a trial license and `xpack.security.enabled` is not
set for the node, which causes the security code to skip attaching the
user to the request. The existing cluster has security enabled and the
lack of a user attached to the requests causes the request to be
rejected.
This commit changes the security code to check if the state has been
recovered yet when making the decision on whether or not to attach a
user. If the state has not yet been recovered, the code will attach
the user to the request in case security is enabled on the cluster
being joined.
Closes#31332
This is a general cleanup of channels and exception handling in http.
This commit introduces a CloseableChannel that is a superclass of
TcpChannel and HttpChannel. This allows us to unify the closing logic
between tcp and http transports. Additionally, the normal http channels
are extracted to the abstract server transport.
Finally, this commit (mostly) unifies the exception handling between nio
and netty4 http server transports.
This commit makes it so that cluster state update tasks always run under the system context, only
restoring the original context when the listener that was provided with the task is called. A notable
exception is the clusterStatePublished(...) callback which will still run under system context,
because it's defined on the executor-level, and not the task level, and only called once for the
combined batch of tasks and can therefore not be uniquely identified with a task / thread context.
Relates #30603
This is related to #28898. This PR implements pooling of bytes arrays
when reading from the wire in the http server transport. In order to do
this, we must integrate with netty reference counting. That manner in
which this PR implements this is making Pages in InboundChannelBuffer
reference counted. When we accessing the underlying page to pass to
netty, we retain the page. When netty releases its bytebuf, it releases
the underlying pages we have passed to it.
This adds a `description` to ML filters in order
to allow users to describe their filters in a human
readable form which is also editable (filter updates
to be added shortly).
The parser for the Metric config was directly instantiating
the config object, rather than using the builder. That means it was
bypassing the validation logic built into the builder, and would allow
users to create invalid metric configs (like using unsupported metrics).
The job would later blow up and abort due to bad configs, but this isn't
immediately obvious to the user since the PutJob API succeeded.
This change prevents a datafeed using cross cluster search from starting if the remote cluster
does not have x-pack installed and a sufficient license. The check is made only when starting a
datafeed.
Rules allow users to supply a detector with domain
knowledge that can improve the quality of the results.
The model detects statistically anomalous results but it
has no knowledge of the meaning of the values being modelled.
For example, a detector that performs a population analysis
over IP addresses could benefit from a list of IP addresses
that the user knows to be safe. Then anomalous results for
those IP addresses will not be created and will not affect
the quantiles either.
Another example would be a detector looking for anomalies
in the median value of CPU utilization. A user might want
to inform the detector that any results where the actual
value is less than 5 is not interesting.
This commit introduces a `custom_rules` field to the `Detector`.
A detector may have multiple rules which are combined with `or`.
A rule has 3 fields: `actions`, `scope` and `conditions`.
Actions is a list of what should happen when the rule applies.
The current options include `skip_result` and `skip_model_update`.
The default value for `actions` is the `skip_result` action.
Scope is optional and allows for applying filters on any of the
partition/over/by field. When not defined the rule applies to
all series. The `filter_id` needs to be specified to match the id
of the filter to be used. Optionally, the `filter_type` can be specified
as either `include` (default) or `exclude`. When set to `include`
the rule applies to entities that are in the filter. When set to
`exclude` the rule only applies to entities not in the filter.
There may be zero or more conditions. A condition requires `applies_to`,
`operator` and `value` to be specified. The `applies_to` value can be
either `actual`, `typical` or `diff_from_typical` and it specifies
the numerical value to which the condition applies. The `operator`
(`lt`, `lte`, `gt`, `gte`) and `value` complete the definition.
Conditions are combined with `and` and allow to specify numerical
conditions for when a rule applies.
A rule must either have a scope or one or more conditions. Finally,
a rule with scope and conditions applies when all of them apply.
This commit upgrades us to Netty 4.1.25. This upgrade is more
challenging than past upgrades, all because of a new object cleaner
thread that they have added. This thread requires an additional security
permission (set context class loader, needed to avoid leaks in certain
scenarios). Additionally, there is not a clean way to shutdown this
thread which means that the thread can fail thread leak control during
tests. As such, we have to filter this thread from thread leak control.