* Adds a check to wait for active tasks for XPackRestIT
* uses test logger
* Change to use assertBusy instead of awaitBusy
* fixes failures with active tasks remaining
* Moves wait for pending tasks into MlRestTestStateCleaner
* remove unecessary log line
Original commit: elastic/x-pack-elasticsearch@1f098dbb64
By creating the watches via the exporter, we get to afford ourselves
with a much more automatic and simpler set of security permissions.
This does limit us in a few ways (e.g., every exporter has to deal with
cluster alerts itself, which means that newer releases of Kibana cannot
help by adding newer cluster alerts for older, still-monitored
clusters).
Original commit: elastic/x-pack-elasticsearch@448ef313c3
When Logstash 5.2 - 5.3 submit documents via the `_xpack/monitoring/_bulk`
endpoint, it sends its time-based documents with an explicit `_id` of
`""`.
This used to be automatically ignored by Monitoring, but we now accept the
_id that we are given (including `null`). ES, prior to 5.3.1, accepted
`""` as a valid `_id` through the `_bulk` endpoint, which means that it
blindly accepted and overwrote documents given that ID, meaning that all
Logstash instances "shared" the exact same document and therefore the UI
becomes useless.
This change allows `""` to be used and it simply replaces that value, and
only that value, with `null`. This enables backwards compatibility with LS
5.2 - 5.3.0.
Original commit: elastic/x-pack-elasticsearch@889578e61e
PersistentTasksCustomMetadata was using a generic param named `Params`. This conflicted with the imported interface `ToXContent.Params`. The java compiler was preferring the generic param over the interface so everything was fine but Eclipse apparently prefers the interface int his case which was screwing up the Hierarchy and causing compile errors in Eclipse. This changes fixes it by renaming the Generic param to `P`
Original commit: elastic/x-pack-elasticsearch@8528870684
- Mark all security indices (that is all indices managed by SecurityLifecycleService) as "superuser only" (only superuser role can have direct permissions)
- Add unit tests for IndexLifecycleManager
Original commit: elastic/x-pack-elasticsearch@e4478825e0
This commit removes the SecuredString class that was previously used throughout the security code
and replaces it with the SecureString class from core that was added as part of the new secure
settings infrastructure.
relates elastic/x-pack-elasticsearch#421
Original commit: elastic/x-pack-elasticsearch@e9cd117ca1
When a index name pattern contains both date math and wildcards, the name resolution does not
return the expected result. This change moves the date math resolution to before our attempts to
match wildcards so that both can be used in the same pattern.
relates elastic/x-pack-elasticsearch#1065
Original commit: elastic/x-pack-elasticsearch@9f48b42fad
let close job and stop datafeed apis redirect to elected master node.
This is for cluster state observation purposes, so that a subsequent open and then close job or
start and then stop datafeed see the same local cluster state and sanity validation doesn't fail.
Original commit: elastic/x-pack-elasticsearch@21a63184b9
introduced separate task names to register the persistent tasks executors and params.
Also renamed start and stop datafeed action names to be singular in order to be consistent with open and close action names.
Original commit: elastic/x-pack-elasticsearch@21f7b242cf
Support for default settings has been removed in core and so some
methods were refactored. This commit responds to this change in core.
Original commit: elastic/x-pack-elasticsearch@b22c612de4
The path has changed so it’s no longer possible to distinguish between data feed and job tasks.
The preceding test get_datafeed provides ample coverage anyway.
Original commit: elastic/x-pack-elasticsearch@780b1beb6b
When the execute watch API is called without recording the execution
in the watch history, the watch status is not updated, in order to not
divert the in-memory object status and the one persisted on disk.
In order to work around this issue, the execute watch API can simply
clone a new watch status and a new watch, which means the object in
the watch store is never updated. This allows for execution and changing
of the watch status, before it is returned to the client.
relates elastic/x-pack-elasticsearch#889
Original commit: elastic/x-pack-elasticsearch@6a0d9c9a78
Changes persistent task serialization and forces params and status to have the same writeable name as the task itself.
Original commit: elastic/x-pack-elasticsearch@59cf3dca39
remove `node.attr.max_running_jobs` node attribute and use `node.attr.ml.enabled` node attribute instead to know whether a node is a ml node or not.
Also renamed `max_running_jobs` setting to `xpack.ml.max_running_jobs`.
Original commit: elastic/x-pack-elasticsearch@798732886b
Removes the last pieces of ActionRequest from PersistentTaskRequest and renames it into PersistTaskParams, which is now just an interface that extends NamedWriteable and ToXContent.
Original commit: elastic/x-pack-elasticsearch@5a298b924f
Following this change, if the user runs on a platform that we don't ship
ML binaries for:
* If ML is enabled the node still refuses to start, but clearly says why
* If ML is disabled the node starts up without logging any errors
Original commit: elastic/x-pack-elasticsearch@af4fb8c411
Now that task id are strings instead of longs (elastic/x-pack-elasticsearch#1035), ml can use the job and datafeed as task id.
This removes logic that would otherwise iterate over all tasks and check if the task's request id was equal to the provided id and instead just do lookup in the task map.
Job and datafeed task ids are prefixed with either 'job-' or 'datafeed-', because job and datafeed ids don't have to be unique as they are stored separately from each other.
Original commit: elastic/x-pack-elasticsearch@b48c2b368a
This built-in watcher_admin role is able to execute all watcher actions,
read the watch history indices and read the watches index
index. The watcher_user role allows to GET a watch and to get the stats and thats it.
relates elastic/x-pack-elasticsearch#978
Original commit: elastic/x-pack-elasticsearch@11b33a413b
- stops the datafeed when post/flush throw a conflict exception.
A conflict exception signifies the job state is not opened, thus
we are better off stopping the datafeed.
- handles flushing the job the same way as posting to the job.
relates elastic/x-pack-elasticsearch#855
Original commit: elastic/x-pack-elasticsearch@49a54912c2
Makes the log more readable in editors not set to UTF-8.
Customers may well be in this situation on Linux/Windows.
Original commit: elastic/x-pack-elasticsearch@4e59fc90cf
The commit changes how LocalExporterTests stops: it now uses the
node_stats document collected on each node and check if it's older
than a given number of seconds (10). It also removes log traces.
Original commit: elastic/x-pack-elasticsearch@0384690b41
Before this change the persistent task operations related to opening
and closing jobs would time out a long time before the operations
related to native processes.
Original commit: elastic/x-pack-elasticsearch@23076b773b
Changes the logging of LDAP authentication failures from "always" to "only if the user failed to be authenticated"
Previously there were cases (such has having 2 AD realms) where successful user authentication would still cause an INFO message to be written to the log for every request.
Now that message is suppressed, but a WARN message is added _if-and-only-if_ the user cannot be authenticated by any realm.
This is implemented via a new value stored in the ThreadContext that the AuthenticationService choses to log (or not log) depending on the result of the authenticate process.
Closes: elastic/x-pack-elasticsearch#887
Original commit: elastic/x-pack-elasticsearch@b81b363729
The PR detects if SMILE is being provided, then correctly slices the stream such that each document is parsed individually. This is required because jackson's SMILE parser is stricter than it's JSON parser and will stop parsing when it hits a streamSeparator (unlike JSON, which will eagerly try to find more objects to parse).
Removes the forced-headers from the various REST tests.
relates elastic/x-pack-elasticsearch#642
Original commit: elastic/x-pack-elasticsearch@c0e97cd545
Instead of having a separate listener for indicating that the current task is finished, this commit is switching to use allocated object itself.
Original commit: elastic/x-pack-elasticsearch@7ad5362121
`PersistentTasksExecutor#getAssignment(...)` should be a cheap and side-effect free method,
but in case of `OpenJobPersistentTasksExecutor` and `StartDatafeedPersistentTasksExecutor` before this change it would index a document each time `getAssignment(...)` was invoked
Original commit: elastic/x-pack-elasticsearch@5ca5890baf
The change applies chunking by default on aggregated datafeeds.
The chunking is set to a manual mode with time_span being
1000 histogram buckets.
The motivation for the change is two-fold:
1. It helps to avoid memory pressure/blowing.
Users may perform a lookback on a very long period of time. In that
case, we may hold a search response for all that time which could
include too many buckets. By chunking, we avoid that situation
as we know we'll only keep results for 1000 buckets at a time.
2. It makes cancellation more responsive.
In elastic/x-pack-elasticsearch#862 we made the processing of a search response cancellable in a
responsive manner. However, the search phase cannot be cancelled at
the moment. Chunking makes the search phase shorter, which will
result to a better user experience when they stop an aggregated
datafeed.
Also note the change sets the default chunking_config on datafeed
creation so the setting is no longer hidden.
Relates to elastic/x-pack-elasticsearch#803
Original commit: elastic/x-pack-elasticsearch@ae8f120f5f
When a datafeed task is created but it cannot be assigned the task
has a null status. This means _stats report it as stopped, however
deleting it fails. In addition, it's a better experience to error
the start datafeed request all together and give the user the chance
to fix his data indices.
This change fails a datafeed-start if it cannot be assigned.
relates elastic/x-pack-elasticsearch#1018
Original commit: elastic/x-pack-elasticsearch@532288fda0
Retries should be already handled by TransportMasterNodeAction, there is no need to introduce another retry layer in Persistent Tasks code.
Original commit: elastic/x-pack-elasticsearch@967ac7f7fa
This commit changes how LocalExporterTests stops the monitoring
components: it first stops the monitoring service (but keeps the
local exporter enabled), deletes and checks if monitoring indices
are recreated, and then disables the local exporter.
Original commit: elastic/x-pack-elasticsearch@4c4809a660
Closing a job may take a while. In the meantime it is possible to start a datafeed, because before this change the job state remained OPENED.
With this change when the executor node receives the close job request, it will first set the status to CLOSING and after that closes the job (closing autodetect process, etc.).
relates elastic/x-pack-elasticsearch#990
Original commit: elastic/x-pack-elasticsearch@d8d89c0756
This commit removes the smoke-test-monitoring-with-security project
and replaces it with a REST test.
Original commit: elastic/x-pack-elasticsearch@f1665815c2
The execution has diverged too much from post data, flush and update process apis, since the close all jobs have been added.
The logic is now easier to understand as it exist in a single source file instead of in both CloseJobAction and TransportJobTaskAction.
Original commit: elastic/x-pack-elasticsearch@daf5fabad5
Users currently have difficulty diagnosing authentication failures.
Some logging messages mislead them, and in other cases there are unexpected behaviours that are not logged at all.
These additional DEBUG log messages and change some existing messages in an attempt to alleviate that problem.
Original commit: elastic/x-pack-elasticsearch@c6ea98b038
Increase the timeout to give enough time for a datafeed to
stop smoothly.
This is the second step to avoid hitting the default timeout.
The first was ensuring aggregated datafeed is cancellable in
a responsive manner. The third and final step will be to
apply chunking in aggregated datafeeds in order to shorten
the duration of the search, which will make cancellation even
more responsive.
Relates elastic/x-pack-elasticsearch#803
Original commit: elastic/x-pack-elasticsearch@db642330ec
This commit restores the ability to build x-pack-elasticsearch without issues when running without
access to the internet. When the `--offline` flag is used, we will not try to contact vault and the
aws apis to retrieve the ml-cpp binaries but instead gradle will use a cached version even though
it may be expired.
relates elastic/x-pack-elasticsearch#726
Original commit: elastic/x-pack-elasticsearch@b0915d8fa9
This is analagous of the bwc-zip for elasticsearch. The one caveat is
due to the structure of how ES+xpack must be checked out, we end up with
a third clone of elasticsearch (the second being in :distribution:bwc-zip).
But the rolling upgrade integ test passes with this change.
relates elastic/x-pack-elasticsearch#870
Original commit: elastic/x-pack-elasticsearch@34bdce6e99
This commit renames and moves the forked delete by query classes from being ml specific to being a
xpack common class since an upcoming security feature plans to make use of this. Additionally, this
commit fixes a issue where the dbq action was being executed by the calling user instead of the
xpack user for certain requests. This was found when adding a authorization change that restricts
this action's execution to the xpack user only.
Original commit: elastic/x-pack-elasticsearch@d5967e7255
There was a problem with the way CompositeBytesReference was used in the
StateProcessor. In the case of a large state document we ended up with a
deeply nested CompositeBytesReference that then caused a deep stack and N^2
processing in the bulk action processor.
This change uses an intermediate list of byte arrays that get combined into
a single CompositeBytesReference to avoid the deep nesting.
Additionally, errors in state processing now bubble up to close the state
stream, which will cause the C++ process to stop trying to persist more state.
Finally, the results processor also times out after a similar period (30 minutes)
to that used by the state processor.
Original commit: elastic/x-pack-elasticsearch@ceb31481d1
Rather than using an async call, this leverages
the Assignment logic while selecting nodes.
Now with 300% more tests!
Original commit: elastic/x-pack-elasticsearch@300d628f72
It has been observed that Amazon EBS volumes created from snapshots can
have very high latency the first time a given block is accessed. This
can lead to named pipes taking longer than 2 seconds to create.
Since the native processes create their named pipes immediately after
startup, and this only takes a fraction of a second on a local disk, 2
seconds was considered a generous timeout, but it seems that in the case
of a remote NAS with lazy provisioning it's not long enough. During
debugging a latency of just over 3 seconds was observed. The timeouts
have been increased to 10 seconds.
relates elastic/x-pack-elasticsearch#922
Original commit: elastic/x-pack-elasticsearch@c90434c948
Moves the direct management of the security index from SecurityLifecycleService to IndexLifecycleManager, so that the SecurityLifecycleService can take responsibility for several indices.
Multiple security indices are required as we move away from storing multiple types in a single index.
Original commit: elastic/x-pack-elasticsearch@fde3a42b4d
The IndexAuditTrailMutedTests have a threadpool but fail to set it on the test client, which causes
a NPE and tests to fail.
Original commit: elastic/x-pack-elasticsearch@d34a4ce080
As the snapshot that is loaded is an important operational
aspect of a job, this change adds a notification that displays
the loaded snapshot with its latest_record_timestamp and the
job's latest_record_timestamp. Having both allows us to discover
when a job is recovering after a node failure.
relates elastic/x-pack-elasticsearch#872
Original commit: elastic/x-pack-elasticsearch@c2dee495a2
The test fails on slow machines because of inflight bulk requests
that hit one node while the others are stopping. This commit adds
more time (10s), equivalent to 2 to 3 collection interval, to delete
the monitoring indices. It also add TRACE logging level for the test.
Original commit: elastic/x-pack-elasticsearch@b433937946
* [ML] Adds jobType to Job
This change adds `jobType` field to teh `Job` class so that when the job is written to the index a `job_type` field is written int he document. This will help separate this type of job from other new job types in the future so migrating the index to allow those new type of jobs will be easer
relates elastic/x-pack-elasticsearch#798
* Addresses review comments
Original commit: elastic/x-pack-elasticsearch@d9fd11edb3
When the LDAP SDK returns a SearchResult that has a non-success ResultCode, convert it to an exception and call onFailure
A configuration setting controls whether failures in referrals should be fatal (defaults to ignoring errors)
Closes: elastic/x-pack-elasticsearch#717
Original commit: elastic/x-pack-elasticsearch@4159758c2a
PersistentTasksService methods are not using ActionListener<PersistentTask<?>> instead of PersistentTaskOperationListener.
Original commit: elastic/x-pack-elasticsearch@f95d8bda3d
The `FieldPermissions` class incorrectly assumed that the `granted` and `denied` arrays were
sorted, so it could do a `binarySearch` to see if `_all` was in the arrays.
Original commit: elastic/x-pack-elasticsearch@49b5875602
This is a follow-on to elastic/x-pack-elasticsearch#939, which removes the use of Arrays.binarySearch in the FieldPermissions
class. This change removes other incorrect uses in the rest of the x-pack code and replaces them
with a stream based implementation.
Original commit: elastic/x-pack-elasticsearch@ccca7e9bad
Before this change, aggregation datafeeds used the histogram bucket
key as the record timestamp that is posted to the job. That meant
that the latest_record_timestamp at the end of a datafeed run was
the start of the latest seen histogram bucket. Upon continuing the
datafeed, the search starts from one millisecond after the
latest_record_timestamp. Hence, data may be fetched for a second time.
This change requires a max aggregation on the time_field nested in
the histogram bucket. It then reads the timestamp from that agg.
This ensures datafeed can restart without duplicating data.
relates elastic/x-pack-elasticsearch#874
Original commit: elastic/x-pack-elasticsearch@f820efa866
- include 'real-time' instead of now as the end time for real-time
datafeeds
- do not notify lookback is completed when datafeed was stopped
- do not notify datafeed switch to real-time when datafeed was stopped
Relates elastic/x-pack-elasticsearch#878
Original commit: elastic/x-pack-elasticsearch@aa22f9b86f
This commit is response to the renaming of the random ASCII helper
methods in ESTestCase. The name of this method was changed because these
methods only produce random strings generated from [a-zA-Z], not from
all ASCII characters.
Relates elastic/x-pack-elasticsearch#942
Original commit: elastic/x-pack-elasticsearch@a6085964d3
This commit reenables the Monitoring Bulk Api REST tests. The XPackRestIT
now enables/disables the local default exporter before executing the monitoring
tests, and also waits for the monitoring service to be started before executing
the test.
Original commit: elastic/x-pack-elasticsearch@10b696198c
State processing can take a lot longer than log processing, even after
the C++ process has closed its end of the pipe. The pipe has a buffer,
and indexing the state document(s) in that buffer can take more than a
second.
relates elastic/x-pack-elasticsearch#945
Original commit: elastic/x-pack-elasticsearch@65f5075028
* [ML] Set job create time on server
* Job.Builder serialisation tests
* Make setCreateTime package private
Original commit: elastic/x-pack-elasticsearch@d2d75e0d7b
Adds following validations:
- aggregations must contain date_histogram or histogram at the top level
- a date_histogram has to have its time_zone to UTC (or unset which
defaults to UTC)
- a date_histogram supports calendar intervals only up to 1 week
to avoid the length variability of longer intervals
- aggregation interval must be greater than zero
- aggregation interval must be less than or equal to the bucket_span
Original commit: elastic/x-pack-elasticsearch@404496a886
* Remove JobManagers dependency on JobResultsPerister
* Remove unneeded call to refresh the state index
Original commit: elastic/x-pack-elasticsearch@0b2351bba7
If jobs are being deleted then the operations required to get stats
could fail with unexpected exceptions. When stats for multiple jobs
were being requested, this would previously cause the whole operation
to fail.
This commit changes the stats endpoint to ignore jobs that are being
deleted.
Fixeselastic/prelert-legacy#837
Original commit: elastic/x-pack-elasticsearch@6ac141a987
When this happens it means the job has been deleted, which in turn means
the C++ process has been stopped, so there's no need to send it a message
and hence no problem worth logging a stack trace for.
This differs from elastic/x-pack-elasticsearch#896 because elastic/x-pack-elasticsearch#896 was for a similar situation with
closed jobs, whereas this one is for deleted jobs.
Original commit: elastic/x-pack-elasticsearch@9bb4e98fe7
The test is too rigid on checking the right number of node_stats documented that are collected. It happens if a node takes time to start, the node_stats count % numNodes will always be different than 0.
It also adds more logging for LocalBulk failures.
Original commit: elastic/x-pack-elasticsearch@1ebb20b6f6
In order to prevent tasks state updates by stale executors, this commit adds a check for correct allocation id during status update operation.
Original commit: elastic/x-pack-elasticsearch@b94eb0e863
This commit changes the LocalExporterTests so that it now test
various randomized cases in a single test. This should speed up
the test as well as minimize the failures due to multiple start
/stop of the exporter. It also uses the MonitoringBulk API
instead of calling the Exporter instances, which makes more sense
since it is the normal way to index monitoring documents.
Related elastic/x-pack-elasticsearch#416
Original commit: elastic/x-pack-elasticsearch@f8a4af15cd
Detector configs are validated both by our C++ and by our Java code.
If the C++ is stricter than the Java then error reporting is poor.
This commit adds two extra validation checks to the Java code that
were already present in the C++ validation.
relates elastic/x-pack-elasticsearch#856
Original commit: elastic/x-pack-elasticsearch@bd4ce2377c
It's possible for a C++ process to exit between the time when a
config update message for it is queued and the time that message
is processed. This commit ensures we don't spam the log with a
stack trace in this situation, as it's not a problem at all.
relates elastic/x-pack-elasticsearch#891
Original commit: elastic/x-pack-elasticsearch@81af8eaf70
Aggregated data extraction is done in 2 phases:
1. search
2. process response
The first phase cannot be currently cancelled. However, it usually
is the fastest of the two.
The second phase processes the histogram buckets in the search
response into flat JSON and then posts the result stream to the job.
This phase can be split into batches where a few buckets are posted
to the job at a time. Cancelling can then work between batches.
This commit changes the AggregationDataExtractor to process the
search response in batches. The definition of a batch is crucial
as it has to be short enough to allow for responsive cancelling,
yet long enough to minimise overhead due to multiple calls to the
post data action. The number of key-value pairs written by the
processor is a good candidate for a batch size measure. By testing,
1000 seems to be an effective number.
relates elastic/x-pack-elasticsearch#802
Original commit: elastic/x-pack-elasticsearch@ce3a172411
The native process can only handle one operation at a time, so in order the protect against multiple operation at a time (e.g. post data and flush or multiple post data operations) there should be protection in place to guarantee that at most only a single thread interacts with the native process. The current protection is broken when a job close is executed, more specifically the wait logic is broken here.
This commit changes the threading logic when interacting with the native process by using a custom `ExecutorService` that that uses a single worker thread from `ml_autodetect_process` thread pool to interact with the native process. Requests from the ml apis are initially being queued and this worker thread executes these requests one by one in the order they were specified.
Removed the general `ml` threadpool and replaced its usages with `ml_autodetect_process` or `management` threadpool.
Added a new threadpool just for (re)normalizer, so that these operations are isolated from other operations.
relates elastic/x-pack-elasticsearch#582
Original commit: elastic/x-pack-elasticsearch@ff0c8dce0b
The slack tests seem to fail periodically with not output
This commit tries to add some more verbose output by
making the query more broad and take failures into account
to uncover, what happens in this test.
Relates elastic/x-pack-elasticsearch#836
Original commit: elastic/x-pack-elasticsearch@e601b3a0df
This change adds a retain field to model snapshots.
A user can set retain to true/false via the update model snapshot API.
Model snapshots with retain set to true will not be deleted by
the daily maintenance service regardless of whether they expired.
This allows users to keep always keep certain snapshots around for
potentially reverting to in the future.
relates elastic/x-pack-elasticsearch#758
Original commit: elastic/x-pack-elasticsearch@2283989a33
* Removed OPENING and CLOSING job states. Instead when persistent task has been created and
status hasn't been set then this means we haven't yet started, when the executor changes it to STARTED we have.
The coordinating node will monitor cs for a period of time until that happens and then returns or times out.
* Refactored job close api to go to node running job task and close job there.
* Changed unexpected job and datafeed exception messages to not mention the state and instead mention that job/datafeed haven't yet started/stopped.
Original commit: elastic/x-pack-elasticsearch@37e778b585
Add the `.monitoring-alerts-2` index template via the exporter. This
avoids a very common problem where the user wipes out their monitoring
indices manually, which means that the watches would then create an index
with a dynamic mappings.
This adds a mechanism for posting a template that is not associated with a
Resolver (convenient for the forthcoming work _and_ for a future Logstash
index).
Original commit: elastic/x-pack-elasticsearch@a4cfc48191