* ML: Adding missing datacheck to datafeedjob
* Adding client side and docs
* Making adjustments to validations
* Making values default to on, having more sensible limits
* Intermittent commit, still need to figure out interval
* Adjusting delayed data check interval
* updating docs
* Making parameter Boolean, so it is nullable
* bumping bwc to 7 before backport
* changing to version current
* moving delayed data check config its own object
* Separation of duties for delayed data detection
* fixing checkstyles
* fixing checkstyles
* Adjusting default behavior so that null windows are allowed
* Mentioning the default value
* Fixing comments, syncing up validations
This pull request replaces some blocks of code that must be run once
and that are currently based on AtomicBoolean by the convient RunOnce
class added in #35489.
The NPE would occur if should_trim_field was overridden to
true and any field value was completely blank. This change
defends against this situation.
Fixes#35462
This commit uses the index settings version so that a follower can
replicate index settings changes as needed from the leader.
Co-authored-by: Martijn van Groningen <martijn.v.groningen@gmail.com>
* Adding rollup support for datafeeds
* Fixing tests and adjusting formatting
* minor formatting chagne
* fixing some syntax and removing redundancies
* Refactoring and fixing failing test
* Refactoring, adding paranoid null check
* Moving rollup into the aggregation package
* making AggregationToJsonProcessor package private again
* Addressing test failure
* Fixing validations, chunking
* Addressing failing test
* rolling back RollupJobCaps changes
* Adding comment and cleaning up test
* Addressing review comments and test failures
* Moving builder logic into separate methods
* Addressing PR comments, adding test for rollup permissions
* Fixing test failure
* Adding rollup priv check on datafeed put
* Handling missing index when getting caps
* Fixing unused import
Stop passing `Settings` to `AbstractComponent`'s ctor. This allows us to
stop passing around `Settings` in a *ton* of places. While this change
touches many files, it touches them all in fairly small, mechanical
ways, doing a few things per file:
1. Drop the `super(settings);` line on everything that extends
`AbstractComponent`.
2. Drop the `settings` argument to the ctor if it is no longer used.
3. If the file doesn't use `logger` then drop `extends
AbstractComponent` from it.
4. Clean up all compilation failure caused by the `settings` removal
and drop any now unused `settings` isntances and method arguments.
I've intentionally *not* removed the `settings` argument from a few
files:
1. TransportAction
2. AbstractLifecycleComponent
3. BaseRestHandler
These files don't *need* `settings` either, but this change is large
enough as is.
Relates to #34488
Today when ESIntegTestCase starts some nodes it writes out the unicast hosts
files each time a node starts its transport service. This does mean that a
number of nodes can start and perform their first pinging round without any
unicast hosts which, if the timing is unlucky and a lot of nodes are all
started at the same time, can lead to a split brain as in #35052.
Prior to #33554 this was unlikely to happen since the MockUncasedHostsProvider
would always have yielded the existing hosts, so the timing would have to have
been implausibly unlucky. Since #33554, however, it's more likely because the
race occurs between the start of the first round of pinging and the writing of
the unicast hosts file. It is realistic that new nodes will be configured with
the existing nodes from startup, so this change reinstates that behaviour.
Closes#35052.
Drops the `Settings` member from `AbstractComponent`, moving it from the
base class on to the classes that use it. For the most part this is a
mechanical change that doesn't drop `Settings` accesses. The one
exception to this is naming threads where it switches from an invocation
that passes `Settings` and extracts the node name to one that explicitly
passes the node name.
This change doesn't drop the `Settings` argument from
`AbstractComponent`'s ctor because this change is big enough as is.
We'll do that in a follow up change.
Sometimes the cluster forming here will split-brain when it grows up to 5
nodes. This could be a timing issue or could be something going wrong in
discovery, so this asks for more logs. Relates #35052
The file structure finder endpoint can find the NDJSON
(newline-delimited JSON) file format, but called it
`json`. This change renames the `format` for this file
structure to `ndjson`, which is more precise and will
hopefully avoid confusion.
`AbstractComponent` is trouble because its name implies that
*everything* should extend from it. It *is* useful, but maybe too
broadly useful. The things it offers access too, the `Settings` instance
for the entire server and a logger are nice to have around, but not
really needed *everywhere*. The `Settings` instance especially adds a
fair bit of ceremony to testing without any value.
This removes the `nodeName` method from `AbstractComponent` so it is
more clear where we actually need the node name.
We currently have two different native processes:
autodetect & normalizer. There are plans for introducing
a new process. All these share many things in common.
This commit refactors the processes to extend an
`AbstractNativeProcess` class that encapsulates those
commonalities with the purpose of reusing the code
for new processes in the future.
This change ensures the `message` field is always included
in the `field_stats` for the semi-structured text log file
file structure. Previously it was not, as it will almost
certainly contain all distinct values. However, for
consistency in the UI it's useful to include it.
After discussing on the team's FixItFriday, we concluded that
static final instance variables that are mutable should be lowercased.
Historically, DeprecationLogger was uppercased more frequently than lowercased.
With this change, we apply the common test config automatically to all
newly created tasks instead of opting in specifically.
For plugin authors using the plugin externally this means that the
configuration will be applied to their RandomizedTestingTasks as well.
The purpose of the task is to simplify setup and make it easier to
change projects that use the `test` task but actually run integration
tests to use a task called `integTest` for clarity, but also because
we may want to configure and run them differently.
E.x. using different levels of concurrency.
#33708 introduced a strict deprecation mode that makes a REST request
fail if there is a warning header in the response returned by
Elasticsearch (usually a deprecation message signaling that a feature
or a field has been deprecated).
This change adds the strict deprecation mode into the REST integration
tests, and makes the tests fail if a deprecated feature is used. Also
any test using a deprecated feature has been modified to pass the build.
The YAML integration tests already analyzed HTTP warnings so they do
not use this mode, keeping their "expected vs actual" behavior.
All of the tests in PainlessDomainSplitIT have an awaitsfix, which
causes the build to fail since no tests are run. This adds an empty
test to get the build going again.
Relates #34683
Relates #32966
In some of our X-Pack REST tests we have to wait for pending tasks to
complete. We are now needing this functionality in ESRestTestCase for
the docs tests where we run against X-Pack features. This commit moves
the helper method that we have in X-Pack to ESRestTestCase, and removes
duplicate logic from waiting for rollup tasks to complete.
The setting that reduces the disk space requirement
for the forecasting integration tests was accidentally
removed in #31757 when files were moved around. This
change simply adds back the setting that existed before
that.
This commit moves the definition of domainSplit into java and exposes it
as a painless whitelist extension. The method also no longer needs
params, and version which ignores params is added and deprecated.
The ingest pipeline that is produced is very simple. It
contains a grok processor if the format is semi-structured
text, a date processor if the format contains a timestamp,
and a remove processor if required to remove the interim
timestamp field parsed out of semi-structured text.
Eventually the UI should offer the option to customize the
pipeline with additional processors to perform other data
preparation steps before ingesting data to an index.
In ccb9ab5717 we changed how we deal with time
fields to support the `DateTime`-format fields added in 6.0, but dropped
support for pre-6.x `Long`-format fields. This change reinstates this support
for cases where pre-6.x data is made available to ML (e.g. in a mixed-version
CCS setup or after an upgrade).
This changes the delete job API by adding
the choice to delete a job asynchronously.
The commit adds a `wait_for_completion` parameter
to the delete job request. When set to `false`,
the action returns immediately and the response
contains the task id.
This also changes the handling of subsequent
delete requests for a job that is already being
deleted. It now uses the task framework to check
if the job is being deleted instead of the cluster
state. This is a beneficial for it is going to also
be working once the job configs are moved out of the
cluster state and into an index. Also, force delete
requests that are waiting for the job to be deleted
will not proceed with the deletion if the first task
fails. This will prevent overloading the cluster. Instead,
the failure is communicated better via notifications
so that the user may retry.
Finally, this makes the `deleting` property of the job
visible (also it was renamed from `deleted`). This allows
a client to render a deleting job differently.
Closes#32836
This change fixes a potential deadlock problem in the unit
test introduced in #34117.
It also removes a piece of debug code and corrects a docs
formatting problem that were both added in that same PR.