Previously, we were using a simple CRC32 for the IDs of rollup documents.
This is a very poor choice however, since 32bit IDs leads to collisions
between documents very quickly.
This commit moves Rollups over to a 128bit ID. The ID is a concatenation
of all the keys in the document (similar to the rolling CRC before),
hashed with 128bit Murmur3, then base64 encoded. Finally, the job
ID and a delimiter (`$`) are prepended to the ID.
This gurantees that there are 128bits per-job. 128bits should
essentially remove all chances of collisions, and the prepended
job ID means that _if_ there is a collision, it stays "within"
the job.
BWC notes:
We can only upgrade the ID scheme after we know there has been a good
checkpoint during indexing. We don't rely on a STARTED/STOPPED
status since we can't guarantee that resulted from a real checkpoint,
or other state. So we only upgrade the ID after we have reached
a checkpoint state during an active index run, and only after the
checkpoint has been confirmed.
Once a job has been upgraded and checkpointed, the version increments
and the new ID is used in the future. All new jobs use the
new ID from the start
In #29623 we added `Request` object flavored requests to the low level
REST client and in #30315 we deprecated the old `performRequest`s. This
changes all calls in the `x-pack/plugin` project to use the new versions.
This commit introduces "Application Privileges" to the X-Pack security
model.
Application Privileges are managed within Elasticsearch, and can be
tested with the _has_privileges API, but do not grant access to any
actions or resources within Elasticsearch. Their purpose is to allow
applications outside of Elasticsearch to represent and store their own
privileges model within Elasticsearch roles.
Access to manage application privileges is handled in a new way that
grants permission to specific application names only. This lays the
foundation for more OLS on cluster privileges, which is implemented by
allowing a cluster permission to inspect not just the action being
executed, but also the request to which the action is applied.
To support this, a "conditional cluster privilege" is introduced, which
is like the existing cluster privilege, except that it has a Predicate
over the request as well as over the action name.
Specifically, this adds
- GET/PUT/DELETE actions for defining application level privileges
- application privileges in role definitions
- application privileges in the has_privileges API
- changes to the cluster permission class to support checking of request
objects
- a new "global" element on role definition to provide cluster object
level security (only for manage application privileges)
- changes to `kibana_user`, `kibana_dashboard_only_user` and
`kibana_system` roles to use and manage application privileges
Closes#29820Closes#31559
* Complete changes for running IT in a fips JVM
- Mute :x-pack:qa:sql:security:ssl:integTest as it
cannot run in FIPS 140 JVM until the SQL CLI supports key/cert.
- Set default JVM keystore/truststore password in top level build
script for all integTest tasks in a FIPS 140 JVM
- Changed top level x-pack build script to use keys and certificates
for trust/key material when spinning up clusters for IT
Now write operations like Index, Delete, Update rely on the write-index associated with
an alias to operate against. This means writes will be accepted even when an alias points to multiple indices, so long as one is the write index. Routing values will be used from the AliasMetaData for the alias in the write-index. All read operations are left untouched.
This introduces a new GetRollupIndexCaps API which allows the user to retrieve rollup capabilities of a specific rollup index (or index pattern). This is distinct from the existing RollupCaps endpoint.
- Multiple jobs can be stored in multiple indices and point to a single target data index pattern (logstash-*). The existing API finds capabilities/config of all jobs matching that data index pattern.
- One rollup index can hold data from multiple jobs, targeting multiple data index patterns. This new API finds the capabilities based on the concrete rollup indices.
We can leverage the composite agg's new `missing_bucket` feature on
terms groupings. This means the aggregation criteria used in the indexer
will now return null buckets for missing keys.
Because all buckets are now returned (even if a key is null),
we can guarantee correct doc counts with
"combined" jobs (where a job rolls up multiple schemas). This was
previously impossible since composite would ignore documents that
didn't have _all_ the keys, meaning non-overlapping schemas would
cause composite to return no buckets.
Note: date_histo does not use `missing_bucket`, since a timestamp is
always required.
The docs have been adjusted to recommend a single, combined job. It
also makes reference to the previous issue to help users that are upgrading
(rather than just deleting the sections).
The xcontent parameters were not passed to the xcontent serialization
of the chain input for each chain. This could lead to wrongly stored
watches, which did not contain passwords but only their redacted counterparts, when an input inside of a chain input contained a password.
It is useful to have a processor similar to
logstash-filter-fingerprint
in Elasticsearch. A processor that leverages a variety of hashing algorithms
to create cryptographically-secure one-way hashes of values in documents.
This processor introduces a pbkdf2hmac hashing scheme to fields in documents
for indexing
* remove left-over comment
* make sure of the property for plugins
* skip installing modules if these exist in the distribution
* Log the distrbution being ran
* Don't allow running with integ-tests-zip passed externally
* top level x-pack/qa can't run with oss distro
* Add support for matching objects in lists
Makes it possible to have a key that points to a list and assert that a
certain object is present in the list. All keys have to be present and
values have to match. The objects in the source list may have additional
fields.
example:
```
match: { 'nodes.$master.plugins': { name: ingest-attachment } }
```
* Update plugin and module tests to work with other distributions
Some of the tests expected that the integration tests will always be ran
with the `integ-test-zip` distribution so that there will be no other
plugins loaded.
With this change, we check for the presence of the plugin without
assuming exclusivity.
* Allow modules to run on other distros as well
To match the behavior of tets.distributions
* Add and use a new `contains` assertion
Replaces the previus changes that caused `match` to do a partial match.
* Implement PR review comments
This creates a YAML test "features" that indices if the cluster being
tested has xpack installed (`xpack`) or if it does *not* have xpack
installed (`no_xpack`). It uses those features to centralize skipping
a few tests that fail if xpack is installed.
The plan is to use this in a followup to skip docs tests that require
xpack when xpack is not installed. We *plan* to use the declaration
of required license level on the docs page to generate the required
`skip`.
Closes#30933.
If no version is specified when putting a watch, the index API should be
used instead of the update API, so that the whole watch gets overwritten
instead of being merged with the existing one.
Merging only happens when a version is specified, so that credentials can be omitted, which is important for the watcher UI.
This adds an api to allow updating a filter:
POST _xpack/ml/filters/{filter_id}/_update
The request body may have:
- description: setting a new description
- add_items: a list of the items to add
- remove_items: a list of the items to remove
This commit also changes the PUT filter api to
error when the filter_id is already used. As
now there is an api for updating filters, the
put api should only be used to create new ones.
Also, updating a filter results into a notification
message auditing the change for every job that is
using that filter.
This adds a `description` to ML filters in order
to allow users to describe their filters in a human
readable form which is also editable (filter updates
to be added shortly).
The parser for the Metric config was directly instantiating
the config object, rather than using the builder. That means it was
bypassing the validation logic built into the builder, and would allow
users to create invalid metric configs (like using unsupported metrics).
The job would later blow up and abort due to bad configs, but this isn't
immediately obvious to the user since the PutJob API succeeded.
Rules allow users to supply a detector with domain
knowledge that can improve the quality of the results.
The model detects statistically anomalous results but it
has no knowledge of the meaning of the values being modelled.
For example, a detector that performs a population analysis
over IP addresses could benefit from a list of IP addresses
that the user knows to be safe. Then anomalous results for
those IP addresses will not be created and will not affect
the quantiles either.
Another example would be a detector looking for anomalies
in the median value of CPU utilization. A user might want
to inform the detector that any results where the actual
value is less than 5 is not interesting.
This commit introduces a `custom_rules` field to the `Detector`.
A detector may have multiple rules which are combined with `or`.
A rule has 3 fields: `actions`, `scope` and `conditions`.
Actions is a list of what should happen when the rule applies.
The current options include `skip_result` and `skip_model_update`.
The default value for `actions` is the `skip_result` action.
Scope is optional and allows for applying filters on any of the
partition/over/by field. When not defined the rule applies to
all series. The `filter_id` needs to be specified to match the id
of the filter to be used. Optionally, the `filter_type` can be specified
as either `include` (default) or `exclude`. When set to `include`
the rule applies to entities that are in the filter. When set to
`exclude` the rule only applies to entities not in the filter.
There may be zero or more conditions. A condition requires `applies_to`,
`operator` and `value` to be specified. The `applies_to` value can be
either `actual`, `typical` or `diff_from_typical` and it specifies
the numerical value to which the condition applies. The `operator`
(`lt`, `lte`, `gt`, `gte`) and `value` complete the definition.
Conditions are combined with `and` and allow to specify numerical
conditions for when a rule applies.
A rule must either have a scope or one or more conditions. Finally,
a rule with scope and conditions applies when all of them apply.
Trying to post a new watch without any body currently results in a
NullPointerException. This change fixes that by validating that
Post and Put requests always have a body.
Closes#30057
We should not allow the user to configure index patterns that also match
the index which stores the rollup index.
For example, it is quite natural for a user to specify `metricbeat-*`
as the index pattern, and then store the rollups in `metricbeat-rolled`.
This will start throwing errors as soon as the rollup index is created
because the indexer will try to search it.
Note: this does not prevent the user from matching against existing
rollup indices. That should be prevented by the field-level validation
during job creation.
Extends ActionRequestValidationException with a rollup-specific version
to make it easier to handle mapping validation issues on the client
side.
The type will now be `rollup_action_request_validation_exception`
instead of `action_request_validation_exception`
ML has dedicated APIs for datafeeds and jobs yet base test classes and
some tests were relying on the cluster state for this state. This commit
removes this usage in favor of using the dedicated endpoints.
This commit fixes an issue with dynamic mapping updates when an index
operation is performed against an alias and when the user only has
permissions to the alias. Dynamic mapping updates resolve the concrete
index early to prevent issues so the information about the alias that
the triggering operation was being executed against is lost. When
security is enabled and a user only has privileges to the alias, this
dynamic mapping update would be rejected as it is executing against the
concrete index and not the alias. In order to handle this situation,
the security code needs to look at the concrete index and the
authorized indices of the user; if the concrete index is not authorized
the code will attempt to find an alias that the user has permissions to
update the mappings of.
Closes#30597
This commit adds the ability to configure how a docvalue field should be
formatted, so that it would be possible eg. to return a date field
formatted as the number of milliseconds since Epoch.
Closes#27740
These tests are both in the file `watcher/stats/10_basic`, and have been
failing fairly frequently over the last month with a start-up issue.
The issue is being tracked in #30298.
The current implementation starts/stops watcher using an executor. This
can result in our of order operations.
This commit reduces those executor calls to an absolute minimum in order
to be able to do state changes within the cluster state listener method,
which runs in sequence.
When a state change occurs that forces the watcher service to pause
(like no watcher index, no master node, no local shards), the service is
now in a paused state.
Pausing is a super lightweight operation, which marks the
ExecutionService as paused and waits for the currently executing watches
to finish in the background via an executor. The same applies for
stopping, the potentially long running operation is outsourced in to an
executor, as waiting for executed watches is decoupled from the current
state.
The only other long running operation is starting, where watches need to
be loaded. This is also done via an executor, but has an additional
protection by checking the cluster state version it was started with. If
another cluster state version was trying to load the watches, then this
loading will not take effect.
This PR also cleans up some unused states, like the a simple boolean in
the HistoryStore/TriggeredWatchStore marking it as started or stopped,
as this can now be caught in the execution service.
Another advantage of this approach is the fact, that now only triggered
watches are not getting executed, while watches that are run via the
Execute Watch API will still be executed regardless if watcher is
stopped or not.
Lastly the TickerScheduleTriggerEngine thread now only starts on data nodes.
Necessary changes so that the licensing functionality can be
used in a JVM in FIPS 140 approved mode.
* Uses adequate salt length in encryption
* Changes key derivation to PBKDF2WithHmacSHA512 from a custom
approach with SHA512 and manual key stretching
* Removes redundant manual padding
Other relevant changes:
* Uses the SAH512 hash instead of the encrypted key bytes as the
key fingerprint to be included in the license specification
* Removes the explicit verification check of the encryption key
as this is implicitly checked in signature verification.
We had a number of awaitsFix links that weren't updated after the xpack
merge.
Where possible I changed the links to the new locations, but in some
circumstances the original ticket was closed (suggesting the awaitsfix
should be removed) or was otherwise unclear the status.
This is related to #30134. It modifies the start_trial action to require
an acknowledgement parameter in the rest request to actually start the
trial license. There are backwards compatibility issues as prior ES
versions did not support this parameter. To handle this, it is assumed
that a request coming from a node prior to 6.3 is acknowledged. And
attempts to write a non-acknowledged request to a prior to 6.3 node will
throw an exception.
Additionally this PR adds messages about the trial license the user is
generating.
This commit makes x-pack a module and adds it to the default
distrubtion. It also creates distributions for zip, tar, deb and rpm
which contain only oss code.