This PR adds the ability to run the enrich policy execution task in the background,
returning a task id instead of waiting for the completed operation.
Prior to this change the `target_field` would always be a json array
field in the document being ingested. This to take into account that
multiple enrich documents could be inserted into the `target_field`.
However the default `max_matches` is `1`. Meaning that by default
only a single enrich document would be added to `target_field` json
array field.
This commit changes this; if `max_matches` is set to `1` then the single
document would be added as a json object to the `target_field` and
if it is configured to a higher value then the enrich documents will be
added as a json array (even if a single enrich document happens to be
enriched).
Currently, partial snapshots will eventually build up unless they are
manually deleted. Partial snapshots may be useful if there is not a more
recent successful snapshot, but should eventually be deleted if they are
no longer useful.
With this change, partial snapshots are deleted using the following
strategy: PARTIAL snapshots will be kept until the configured
expire_after period has passed, if present, and then be deleted. If
there is no configured expire_after in the retention policy, then they
will be deleted if there is at least one more recent successful snapshot
from this policy (as they may otherwise be useful for troubleshooting
purposes). Partial snapshots are not counted towards either min_count or
max_count.
Adds a new datafeed config option, max_empty_searches,
that tells a datafeed that has never found any data to stop
itself and close its associated job after a certain number
of real-time searches have returned no data.
Backport of #47922
Usually syslog timestamps have two spaces before a single
digit day-of-month. However, in some non-syslog cases
where syslog-like timestamps are used there is only one
space. The grok pattern supports this, so the timestamp
parser should too. This change makes the
find_file_structure endpoint do this.
Also fixes another problem that the same test case
exposed in the find_file_structure endpoint, which was
that the exclude_lines_pattern for delimited files was
always created on the assumption the delimiter was a
comma. Now it is based on the actual delimiter.
This commit adds two APIs that allow to pause and resume
CCR auto-follower patterns:
// pause auto-follower
POST /_ccr/auto_follow/my_pattern/pause
// resume auto-follower
POST /_ccr/auto_follow/my_pattern/resume
The ability to pause and resume auto-follow patterns can be
useful in some situations, including the rolling upgrades of
cluster using a bi-directional cross-cluster replication scheme
(see #46665).
This commit adds a new active flag to the AutoFollowPattern
and adapts the AutoCoordinator and AutoFollower classes so
that it stops to fetch remote's cluster state when all auto-follow
patterns associate to the remote cluster are paused.
When an auto-follower is paused, remote indices that match the
pattern are just ignored: they are not added to the pattern's
followed indices uids list that is maintained in the local cluster
state. This way, when the auto-follow pattern is resumed the
indices created in the remote cluster in the meantime will be
picked up again and added as new following indices. Indices
created and then deleted in the remote cluster will be ignored
as they won't be seen at all by the auto-follower pattern at
resume time.
Backport of #47510 for 7.x
Previously, Nullability was set to UNKNOWN instead of TRUE which
resulted on QueryFolder not correctly folding to NULL if any of the args
was null.
Remove the overriding nullable() also for DatePart/DateTrunc to allow
delegation the parent class.
(cherry picked from commit 05a7108e133b5ae7bec2257db5ae2d30ad926ee2)
Joda was using ResolverStyle.STRICT when parsing. This means that date will be validated to be a correct year, year-of-month, day-of-month
However, we also want to make it works with Year-Of-Era as Joda used to, hence custom temporalquery.localdate in DateFormatters.from
Within DateFormatters we use the correct uuuu year instead of yyyy year of era
worth noting: if yyyy(without an era) is used in code, the parsing result will be a TemporalAccessor which will fail to be converted into LocalDate. We mostly use DateFormatters.from so this takes care of this. If possible the uuuu format should be used.
* [ML][Analytics] fix bug where regression deleted early does not delete state (#47885)
* [ML][Analytics] fix bug where regression deleted early does not delete state
* Fixing ml with security test failure
* fixing for older java
Changes the execution logic to create a new task using the execute request,
and attaches the new task to the policy runner to be updated. Also, a new
response is now returned from the execute api, which contains either the task
id of the execution, or the completed status of the run. The fields are mutually
exclusive to make it easier to discern what type of response it is.
rename internal indexes of transform plugin
- rename audit index and create an alias for accessing it, BWC: add an alias for old indexes to
keep them working, kibana UI will switch to use the read alias
- rename config index and provide BWC to read from old and new ones
Batch transforms automatically stop after all data has processed, therefore tests can not reliable test the state. This change rewrites tests to remove the unreliable tests or use continuous transforms instead as they do not auto-stop.
fixes#47441
One of the tests in this suit stops a master node,
plus we're doing other node starts in this suit.
=> the internal test cluster should be TEST and not `SUITE`
scoped to avoid random failures like the one in #47834Closes#47834
The random timestamps were landing too close to the current time,
so an unlucky rollup interval would round such that the doc wasn't
included in the search range (and thus not "rolled up") which
would then fail the test.
The fix is to make sure the timestamp of all docs is sufficiently behind
'now' that the possible rounding intervals will always include them.
Backport of #38753 to 7.x where the test was still muted.
Refactor DateTrunc and DatePart to use separate Pipe classes which
allows the removal of the BinaryDateOperation enum.
(cherry picked from commit a6075e7718dff94a90dbc0795dd924dcb7641092)
Currently if the document being ingested contains another field value
than a string then the processor fails with an error.
This commit changes the match processor to handle number values
and array values correctly.
If a json array is detected then the `terms` query is used instead
of the `term` query.
Especially in the snapshot code there's a lot
of logic chaining `ActionRunnables` in tricky
ways now and the code is getting hard to follow.
This change introduces two convinience methods that
make it clear that a wrapped listener is invoked with
certainty in some trickier spots and shortens the code a bit.
Backport of (#47721) for 7.x.
Similarly to #47582, Auto-follow patterns creates following
indices as long as the remote index matches the pattern and
the remote primary shards are all started. But since 7.2 closed
indices are also replicated, and it does not play well with CCR
auto-follow patterns as they create following indices for closed
leader indices too.
This commit changes the getLeaderIndicesToFollow() so that closed
indices are excluded from auto-follow patterns.
If a cluster sending monitoring data is unhealthy and triggers an
alert, then stops sending data the following exception [1] can occur.
This exception stops the current Watch and the behavior is actually
correct in part due to the exception. Simply fixing the exception
introduces some incorrect behavior. Now that the Watch does not
error in the this case, it will result in an incorrectly "resolved"
alert. The fix here is two parts a) fix the exception b) fix the
following incorrect behavior.
a) fixing the exception is as easy as checking the size of the
array before accessing it.
b) fixing the following incorrect behavior is a bit more intrusive
- Note - the UI depends on the success/met state for each condition
to determine an "OK" or "FIRING"
In this scenario, where an unhealthy cluster triggers an alert and
then goes silent, it should keep "FIRING" until it hears back that
the cluster is green. To keep the Watch "FIRING" either the index
action or the email action needs to fire. Since the Watch is neither
a "new" alert or a "resolved" alert, we do not want to keep sending
an email (that would be non-passive too). Without completely changing
the logic of how an alert is resolved allowing the index action to
take place would result in the alert being resolved. Since we can
not keep "FIRING" either the email or index action (since we don't
want to resolve the alert nor re-write the logic for alert resolution),
we will introduce a 3rd action. A logging action that WILL fire when
the cluster is unhealthy. Specifically will fire when there is an
unresolved alert and it can not find the cluster state.
This logging action is logged at debug, so it should be noticed much.
This logging action serves as an 'anchor' for the UI to keep the state
in an a "FIRING" status until the alert is resolved.
This presents a possible scenario where a cluster starts firing,
then goes completely silent forever, the Watch will be "FIRING"
forever. This is an edge case that already exists in some scenarios
and requires manual intervention to remove that Watch.
This changes changes to use a template-like method to populate the
version_created for the default monitoring watches. The version is
set to 7.5 since that is where this is first introduced.
Fixes#43184
The currently logic shard selecting logic selects a random shard copy
instead of selecting the local shard copy and if local copy is not
available then selecting a random shard copy. The latter is desired
behaviour for enrich.
By reusing `OperationRouting#searchShards(...)` we get the desired
behaviour and reuse the same logic that the search api is using.
* Separate SLM stop/start/status API from ILM
This separates a start/stop/status API for SLM from being tied to ILM's
operation mode. These APIs look like:
```
POST /_slm/stop
POST /_slm/start
GET /_slm/status
```
This allows administrators to have fine-grained control over preventing
periodic snapshots and deletions while performing cluster maintenance.
Relates to #43663
* Allow going from RUNNING to STOPPED
* Align with the OperationMode rules
* Fix slmStopping method
* Make OperationModeUpdateTask constructor private
* Wipe snapshots better in test