If there are multiple jobs that are all the "best" (e.g. share the
best interval), we have no way of knowing which is actually the best.
Unfortunately, we cannot just filter for all the jobs in a single
search because their doc_counts can potentially overlap.
To solve this, we execute an msearch-per-job so that the results
stay isolated. When rewriting the response, we iteratively
unroll and reduce the independent msearch responses into a single
"working tree". This allows us to intervene if there are
overlapping buckets and manually choose a doc_count.
Job selection is found by recursively descending through the aggregation
tree and independently pruning the list of valid job caps in each branch.
When a leaf node is reached in the branch, the remaining jobs are
sorted by "best'ness" (see comparator in RollupJobIdentifierUtils for the
implementation) and added to a global set of "best jobs". Once
all branches have been evaluated, the final set is returned to the
calling code.
Job "best'ness" is, briefly, the job(s) that have
- The largest compatible date interval
- Fewer and larger interval histograms
- Fewer terms groups
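A rough sketch of that ordering, with hypothetical field and class names (the real comparator lives in RollupJobIdentifierUtils):

```java
// Illustrative only: field names are assumptions, not the actual RollupJobCaps API.
import java.util.Comparator;

class JobCapsSketch {
    long dateIntervalMillis;     // date_histogram interval of the job
    int histoGroupCount;         // number of interval histogram groups
    long smallestHistoInterval;  // smallest histogram interval in the job
    int termsGroupCount;         // number of terms groups

    // "Best" sorts first: largest date interval, then fewer/larger histograms,
    // then fewer terms groups.
    static final Comparator<JobCapsSketch> BEST_FIRST =
        Comparator.comparingLong((JobCapsSketch c) -> c.dateIntervalMillis).reversed()
            .thenComparingInt((JobCapsSketch c) -> c.histoGroupCount)
            .thenComparing(Comparator.comparingLong(
                (JobCapsSketch c) -> c.smallestHistoInterval).reversed())
            .thenComparingInt((JobCapsSketch c) -> c.termsGroupCount);
}
```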
Note: the final set of "best" jobs is not guaranteed to be minimal,
there may be redundant effort due to independent branches choosing
jobs that are subsets of other branches.
Related changes:
- We have to include the job's ID in the rollup doc's
hash, so that different jobs don't overwrite the same summary
document.
- Now that we iteratively reduce the agg tree, the agg framework
injects empty buckets while we're working. In most cases this
is harmless, but for `avg` aggs the empty bucket is a SumAgg while
any unrolled versions are converted into AvgAggs... causing a cast
exception. To get around this, avgs are renamed to
`{source_name}.value` to prevent a conflict.
- The job filtering has been pushed up into a query filter, since it
applies to the entire msearch rather than just individual agg components
- We no longer add a filter agg clause about the date_histo's interval, because
that is handled by the job validation and pruning.
Original commit: elastic/x-pack-elasticsearch@995be2a039
This changes `_xpack/monitoring/_bulk` to fundamentally behave in the same
way as `_bulk` and never return 202 when data is ignored (something
`_bulk` cannot do). Instead, anyone interested will have to inspect the
returned response for the ignored flag.
Original commit: elastic/x-pack-elasticsearch@07254a006d
* [Monitoring/Beats] Add new CPU fields, remove old CPU fields
* use long instead of double for cpu counters
* time => time.ms
Original commit: elastic/x-pack-elasticsearch@244b08a574
This change disables security for trial licenses unless security is
explicitly enabled in the settings. This is done to facilitate users
getting started and not having to deal with some of the complexities
involved in getting security configured. In order to do this and avoid
disabling security for existing users that have gold or platinum
licenses, we have to disable security after cluster formation so that
the license can be retrieved.
relates elastic/x-pack-elasticsearch#4078
Original commit: elastic/x-pack-elasticsearch@96bdb889fc
This commit moves the dev key into core and renames it to make it clear
it is for snapshots, and makes the production key a required parameter of
release builds.
Original commit: elastic/x-pack-elasticsearch@ea299bd5a2
This creates a new "beats_system" user and role with the same
privileges as the existing "logstash_system" user/role.
The "beat_system" user is also added as a managed user within
the "setup-passwords" command.
Users who upgrade from an earlier version of Elasticsearch/X-Pack
will need to manually set a password for the beats_system user via
the change password API (or Kibana UI).
Original commit: elastic/x-pack-elasticsearch@6087d3a18e
If a watch is not active, it should still be executed if it is called
via the execute watch API.
This commit adds an additional method to the execution context to check
for this, which returns true for a manual execution context but checks
the watch status for the triggered one.
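A minimal sketch of that check, using assumed class and method names rather than the actual Watcher code:

```java
// Sketch only: class and method names are assumptions, not the real Watcher execution contexts.
abstract class ExecutionContextSketch {
    // Decides whether the watch runs for this particular execution.
    abstract boolean shouldBeExecuted();
}

class ManualExecutionContextSketch extends ExecutionContextSketch {
    @Override
    boolean shouldBeExecuted() {
        return true; // execute watch API: always run, even if the watch is inactive
    }
}

class TriggeredExecutionContextSketch extends ExecutionContextSketch {
    private final boolean watchIsActive;

    TriggeredExecutionContextSketch(boolean watchIsActive) {
        this.watchIsActive = watchIsActive;
    }

    @Override
    boolean shouldBeExecuted() {
        return watchIsActive; // scheduled trigger: only run active watches
    }
}
```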
Original commit: elastic/x-pack-elasticsearch@18f3f9e84b
This commit fixes the Javadoc build for MonitoringTemplateUtils after
changes to core removed the string and bytes methods from
XContentBuilder.
Original commit: elastic/x-pack-elasticsearch@49f3b95b22
Add realm name to all authz audit events: accessDenied, accessGranted,
runAsDenied and runAsGranted.
These event types receive the following attributes: realm,
run_by_realm and run_as_realm to go along with the existing
attributes: principal, run_by_principal and run_as_principal. The
'effective realm name' (run_as_realm or run_by_realm) is certainly
filterable by ignore policies.
Original commit: elastic/x-pack-elasticsearch@cb3801e197
This commit adds a Samba4 test fixture that acts as a domain controller
and has the same contents as the cloud active directory instance that
we previously used for tests.
The tests also support reading information from environment variables
so that they can be run against a real active directory instance in our
CI builds.
In addition, this commit also fixes a few issues that surfaced when
making this change. The first is a change in the base DN that is
searched when performing down-level authentication. The base DN is
now the configuration object instead of the domain DN. This change was
required because the original base DN produced unnecessary referrals, which we
cannot easily follow when running against this test fixture. Referrals
cannot easily be followed as they are returned by the LDAP server with
an unresolvable DNS name unless the host points to the Samba4 instance
for DNS. The port returned in the referral URL is the one Samba is bound
to, which differs from the port that is forwarded to the host by the
test fixture.
The other issue that is resolved by this change is the addition of
settings that allow specifying non-standard ports for active directory.
This is needed for down-level authentication as we may need to query
the regular port of active directory instead of the global catalog
port as the configuration object is not replicated to the global
catalog.
relates elastic/x-pack-elasticsearch#185
Relates elastic/x-pack-elasticsearch#3800
Original commit: elastic/x-pack-elasticsearch@883c742fba
This adds an indicator to Monitoring's portion of X-Pack usage showing whether
or not collection is actually enabled. It's no longer enough to have an
exporter defined by default to know if monitoring is actually running.
Original commit: elastic/x-pack-elasticsearch@b2eb881d61
This adds a minimum compatible version to the model snapshot.
Nodes with a version earlier than that version cannot read
that model snapshot. Thus, such jobs are not assigned to
incompatible nodes.
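Roughly, the assignment rule amounts to the following sketch (the surrounding ML code and names are assumptions; Version.onOrAfter is the usual Elasticsearch version comparison):

```java
import org.elasticsearch.Version;

// Sketch of the assignment check; class and method names are assumptions.
final class SnapshotCompatibilitySketch {
    static boolean nodeCanLoadSnapshot(Version nodeVersion, Version snapshotMinVersion) {
        // Nodes older than the snapshot's minimum compatible version cannot read it,
        // so the job must not be assigned to them.
        return nodeVersion.onOrAfter(snapshotMinVersion);
    }
}
```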
relates elastic/x-pack-elasticsearch#4077
Original commit: elastic/x-pack-elasticsearch@2ffa6adce0
This adds back usage stats by piggybacking on the watcher stats, which
already run distributed across the cluster in order to collect and merge
watcher statistics.
In order to be able to track statistics, we need to add information for
each watch to an in-memory data structure that is processed whenever a
usage request comes in. This processing creates a number of counters
for each node, which are then merged together in the usage stats.
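The merge step boils down to summing counters across nodes, roughly like this sketch (names are illustrative):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: merges per-node watch counters into a single usage map.
final class WatcherUsageSketch {
    static Map<String, Long> mergeNodeCounters(List<Map<String, Long>> perNodeCounters) {
        Map<String, Long> merged = new HashMap<>();
        for (Map<String, Long> nodeCounters : perNodeCounters) {
            nodeCounters.forEach((name, count) -> merged.merge(name, count, Long::sum));
        }
        return merged;
    }
}
```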
relates elastic/x-pack-elasticsearch#4071
Original commit: elastic/x-pack-elasticsearch@c8bfed288f
* Decouple XContentBuilder from BytesReference
This commit handles the removal of all mentions of BytesReference from
XContentBuilder. This is needed so that we can completely decouple the XContent
code and move it into its own dependency.
This is the x-pack side of https://github.com/elastic/elasticsearch/pull/28972
Original commit: elastic/x-pack-elasticsearch@8ba2e97b26
This is related to elastic/x-pack-elasticsearch#3877. This commit adds a parameter `type` to the
start_trial API. This parameter allows the user to pass the type (trial,
gold, or platinum) of license that will be generated. No matter what
type is chosen, you can only generate one per major version.
Original commit: elastic/x-pack-elasticsearch@b42234cbb5
This commit replaces the usage of Lucene IOUtils with Elasticsearch
IOUtils, the former of which is now forbidden.
Original commit: elastic/x-pack-elasticsearch@8e0554001f
This is related to elastic/x-pack-elasticsearch#3877. This commit adds a route /start_basic that
will self-generate a basic license. The only validation that is
performed is to check that you do not already have a basic license
installed. Additionally, if you lose features from switching to a basic
license, you must acknowledge the changes.
Original commit: elastic/x-pack-elasticsearch@7b8eeb50b1
A small bug in the `IndexStatsCollector` can potentially return
statistics for newly created indices that do not exist yet in the
collector's local `ClusterState` instance.
It happens because an instance of the current `ClusterState` is
captured and passed to all the collectors before they are executed (so
that they all share the same view of the state of the cluster). On
some clusters, if an index is created after the `ClusterState` is
captured but before the `IndicesStatsRequest` is executed, then it can
appear in the index stats but have no corresponding entry in the
local cluster state.
This commit changes the IndexStatsCollector so that it only returns
statistics for indices that already exist in the cluster state. This
way a consistent view is possible between indices/index/shard stats.
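In other words, the collector now intersects the stats response with what the captured ClusterState knows about, roughly like this sketch (names are illustrative, not the actual collector code):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Sketch only: keep stats only for indices present in the locally captured ClusterState.
final class IndexStatsFilterSketch {
    static List<String> indicesToReport(List<String> indicesInStatsResponse, Set<String> indicesInClusterState) {
        List<String> result = new ArrayList<>();
        for (String index : indicesInStatsResponse) {
            if (indicesInClusterState.contains(index)) {
                result.add(index); // skip indices created after the state was captured
            }
        }
        return result;
    }
}
```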
Original commit: elastic/x-pack-elasticsearch@da173ae0b0
SslConfiguration can depend on SecureSettings, so it must be
constructed during the correct lifecycle phase.
For PkiRealmBootstrapCheck, the construction of SslConfiguration
objects is moved into the constructor rather than the check method.
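Roughly, the lifecycle change looks like this sketch (names and types are placeholders, not the actual bootstrap-check code):

```java
// Sketch only: build the SSL configuration while SecureSettings are still readable.
final class PkiRealmBootstrapCheckSketch {
    private final Object sslConfiguration; // placeholder for the real SSLConfiguration type

    PkiRealmBootstrapCheckSketch(Object settingsIncludingSecureSettings) {
        // Constructed up front, during the phase where secure settings are available.
        this.sslConfiguration = buildSslConfiguration(settingsIncludingSecureSettings);
    }

    boolean check() {
        // The check only reads the already-built configuration; secure settings
        // may no longer be accessible at this point in the node lifecycle.
        return sslConfiguration != null;
    }

    private static Object buildSslConfiguration(Object settings) {
        return settings; // placeholder for the real construction
    }
}
```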
Original commit: elastic/x-pack-elasticsearch@1a4d147216
This is related to elastic/x-pack-elasticsearch#4095. That test uses a basic license in a test
of the put license route. Occasionally, due to recent work related to
indefinite basic licenses, that license is extended before the test
assertions can be performed. This commit changes the test to use a gold
license.
Original commit: elastic/x-pack-elasticsearch@bf2550f044
This is related to elastic/x-pack-elasticsearch#3877. It modifies self-generated basic licenses to
(practically) never expire. Specifically, self-generated basic licenses
will be set with an expiration date 1 year before Long.MAX_VALUE.
Additionally, basic licenses with a different expiration date will be
replaced with a new self-generated basic license at startup.
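For illustration, "1 year before Long.MAX_VALUE" amounts to something like the following (the exact constant used by the license service may differ):

```java
import java.util.concurrent.TimeUnit;

// Illustration only: an expiry date one year before Long.MAX_VALUE, in epoch millis.
final class BasicLicenseExpirySketch {
    static long expiryMillis() {
        return Long.MAX_VALUE - TimeUnit.DAYS.toMillis(365);
    }
}
```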
Original commit: elastic/x-pack-elasticsearch@de8b343089
This properly registers the `XPackFeatureSetUsage` for Logstash and
tests it by invoking the Usage API in a Monitoring QA test.
Without it being properly registered, the test will consistently fail.
Original commit: elastic/x-pack-elasticsearch@2e8f2376fd
We had a Usage class before, but weren't registering it with XPack.
Would be nice to add more usage info in the future (like the running
jobs on each node), but it's unclear what the best way to do that is since we'd need
to filter through the list of allocated tasks.
Original commit: elastic/x-pack-elasticsearch@5207d2758b
Up to now a job update that reduces the model memory limit
was not allowed. However, there could definitely be cases
where reducing the limit is necessary and reasonable.
This commit makes it possible to decrease the limit as long
as it does not go below the current memory usage. We obtain
the latter from the model size stats.
The conditions under which updating the model_memory_limit
is not allowed are now:
- the job is open
- the new value is less than the latest model_size_stats.model_bytes
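A hedged sketch of that validation (names are illustrative, not the actual ML update code):

```java
// Sketch only: reject updates that violate the two conditions above.
final class ModelMemoryLimitUpdateSketch {
    static void validate(boolean jobIsOpen, long latestModelBytes, long newLimitBytes) {
        if (jobIsOpen) {
            throw new IllegalArgumentException("model_memory_limit cannot be updated while the job is open");
        }
        if (newLimitBytes < latestModelBytes) {
            throw new IllegalArgumentException("model_memory_limit cannot be decreased below the current usage of "
                + latestModelBytes + " bytes");
        }
    }
}
```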
relates elastic/x-pack-elasticsearch#2461
Original commit: elastic/x-pack-elasticsearch@5b35923590
This removes the readFrom and writeTo methods from XContentType, instead using
the more generic `readEnum` and `writeEnum` methods. Luckily they are both
encoded exactly the same way, so there is no compatibility layer needed for
backwards compatibility.
This is the x-pack side of https://github.com/elastic/elasticsearch/pull/28927
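Roughly what the serialization looks like after the change, assuming the 6.x StreamInput/StreamOutput helpers (the surrounding class is illustrative):

```java
import java.io.IOException;

import org.elasticsearch.common.io.stream.StreamInput;
import org.elasticsearch.common.io.stream.StreamOutput;
import org.elasticsearch.common.xcontent.XContentType;

// Sketch only: generic enum serialization replaces the bespoke readFrom/writeTo.
final class XContentTypeWireSketch {
    static void write(StreamOutput out, XContentType type) throws IOException {
        out.writeEnum(type); // same wire format as the old writeTo
    }

    static XContentType read(StreamInput in) throws IOException {
        return in.readEnum(XContentType.class);
    }
}
```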
Original commit: elastic/x-pack-elasticsearch@f1c0928490
This wraps the stream (`.streamInput()`) that is passed to many of the
`createParser` calls in the enclosing (or a new) try-with-resources block.
This ensures the stream returned by `BytesReference.streamInput()` is closed.
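The pattern, sketched under the assumption of a 6.x-era createParser overload that takes an InputStream (exact signatures vary between versions):

```java
import java.io.IOException;
import java.io.InputStream;

import org.elasticsearch.common.bytes.BytesReference;
import org.elasticsearch.common.xcontent.NamedXContentRegistry;
import org.elasticsearch.common.xcontent.XContentParser;
import org.elasticsearch.common.xcontent.XContentType;

// Sketch only: both the stream and the parser are opened in the try-with-resources
// header, so BytesReference.streamInput() is always closed.
final class ParserClosingSketch {
    static void parse(BytesReference source, NamedXContentRegistry registry) throws IOException {
        try (InputStream stream = source.streamInput();
             XContentParser parser = XContentType.JSON.xContent().createParser(registry, stream)) {
            parser.map(); // placeholder consumption of the parsed content
        }
    }
}
```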
Relates to elastic/x-pack-elasticsearch#28504
Original commit: elastic/x-pack-elasticsearch@7546e3b4d4
Analysis limits contain settings that affect the resources
used by ML jobs. Those limits always apply. However,
explicitly setting them is not required as they have reasonable
defaults. For a long time those defaults lived on the C++ side.
The job could simply not have any explicit limits, which meant the
defaults would be used on the C++ side. This has the disadvantage
that it is not obvious to users what these settings are set to.
Additionally, users might not be aware of the settings' existence.
On top of that, since 6.1 the default model_memory_limit was lowered
from 4GB to 1GB. For BWC, this meant that for jobs where model_memory_limit
is null, the default of 4GB applies. Jobs that were created from 6.1
onwards contain an explicit setting for model_memory_limit, which is
1GB unless the user sets it differently. This adds additional confusion.
This commit makes analysis limits an always explicit setting on the job.
Regardless of whether the user sets custom limits or not, the job object
(and response) will contain the full analysis limits values.
The possibilities for interpretation of missing values are:
- the entire analysis_limits is null: this may only happen for jobs
created prior to 6.1. Thus we set the model_memory_limit to 4GB.
- analysis_limits are non-null but model_memory_limit is null: this also
may only happen for jobs prior to 6.1. Again, we set memory limit to
4GB.
- model_memory_limit is non-null: this either means the user set an
explicit value or the job was created from 6.1 onwards and it has
the explicit default of 1GB. We simply keep the given value.
For categorization_examples_limit the default has always been 4, so
we fill that in when it's missing.
Finally, note that we still need to handle potential null values
for the situation of a mixed cluster.
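The defaulting rules can be summarized with a sketch like this (constants and names are illustrative, not the actual AnalysisLimits code):

```java
// Sketch only: how missing analysis limits are interpreted when reading a job.
final class AnalysisLimitsDefaultsSketch {
    static long effectiveModelMemoryLimitMb(boolean analysisLimitsPresent, Long modelMemoryLimitMb) {
        long preV61DefaultMb = 4L * 1024L; // pre-6.1 default of 4GB
        if (analysisLimitsPresent == false || modelMemoryLimitMb == null) {
            return preV61DefaultMb; // both cases can only come from jobs created before 6.1
        }
        return modelMemoryLimitMb; // explicit value, user-set or the 6.1+ default of 1GB
    }

    static long effectiveCategorizationExamplesLimit(Long categorizationExamplesLimit) {
        return categorizationExamplesLimit == null ? 4L : categorizationExamplesLimit;
    }
}
```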
Original commit: elastic/x-pack-elasticsearch@5b6994ef75
* Pass InputStream when creating XContent parser
Rather than passing the raw `BytesReference` in when creating the xcontent
parser, this passes the StreamInput (which is an InputStream), which allows us
to decouple XContent from BytesReference.
This is the x-pack side of https://github.com/elastic/elasticsearch/pull/28754
* Use the streamInput variant, not sourceAsString
Original commit: elastic/x-pack-elasticsearch@dd5d8b1654
This adds a new Rollup module to XPack, which allows users to configure periodic "rollup jobs" to pre-aggregate data. That data is then available later for search through a special RollupSearch API, which mimics the DSL and functionality of regular search.
Rollups are used to drastically reduce the on-disk footprint of metric-based data (e.g. timestamped documents with numeric and keyword fields). They can also be used to speed up aggregations over large datasets, since the rolled-up data will be considerably smaller and contain fewer documents to search.
The PR adds seven new endpoints to interact with Rollups: create/get/delete job, start/stop job, a capabilities API similar to field-caps, and a Rollup-enabled search.
Original commit: elastic/x-pack-elasticsearch@dcde91aacf