* Methods to update the running process with new settings
* Task to update the running autodetect process
* Don’t start process update task if not config specified
Original commit: elastic/x-pack-elasticsearch@4364b141b5
Flush has the contract that when it is done results are up-to-date.
Thus, it adds no value to have it timeout. In most cases, the request
should be pretty responsive apart from when it advances time forward.
In the latter scenario, it could force results to be calculated for a
long period of time which could take long. The one use case for this
is the datafeeds and there is no issue with waiting flush to finish.
This PR changes flush to always wait to completion. However, it adds
checking that the c++ process is alive every second, to avoid long
waits in vain when something has gone horribly wrong.
Fixeselastic/elasticsearch#826
Original commit: elastic/x-pack-elasticsearch@de421ab843
After this change the build requires a github.token file in the root directory of the repository so that it can authenticate with the Vault service to get AWS credentials to download the ml-cpp artifacts
Original commit: elastic/x-pack-elasticsearch@630efadef8
Elasticsearch changed doc_values of date fields to return a
joda DateTime object. Thus, we need to call getMillis() to extract
the epoch millis value.
Original commit: elastic/x-pack-elasticsearch@b992882af5
A JobStorageDeletionTask is created, which supervises the physical deletion of the job. This
task is a child of the DeleteJob action. After the DBQ finishes, the normal flow
resumes (physical index deleted, job removed from CS)
Original commit: elastic/x-pack-elasticsearch@5d6f694408
* Removed getPersistStream() method from this interface and let the NativeAutodetectProcess implementation deal with this. The persist stream is an implementation detail and BlackHoleAutodetectProcess doesn't deal with this too.
* Replaced getProcessOutStream() method with readAutodetectResults() method. This method now returns a `Iterator<AutodetectResult>` instead of an inputstream. This makes the BlackHoleAutodetectProcess and future mocked implementations easier.
Original commit: elastic/x-pack-elasticsearch@086e7b40ab
* Reintroduce chunking to improve data extractor performance
Performing a sorted search/scroll over a period of time that matches
a lot of documents is very expensive because for each page all
documents are traversed.
The solution is to chunk the search time and perform separate
search/scrolls for each chunk.
This commit is introducing a new `chung` config in `datafeed_config`
whose mode can be set to either of AUTO, OFF, MANUAL, with the latter
allowing to specify an explicit chunk size.
When set to AUTO, a heuristic is used in order to determine the chunk
size. The heuristic is based on estimating the time interval within
which we expect `scroll_size` documents and then taking the 10x multiple
of that. Based on benchmarking, this method gives a dramatic performance
increase. For example, for the citizens dataset it improved the ingest
rate from 0.33M docs / minute to 13.6M docs / minute. Farequote is now
done in ~1 second.
Finally, note that when `chunk` is not specified, it defaults to AUTO
when aggregations are not set and to OFF otherwise. This is because
the chunk size heuristic does not lend itself great for aggregations
where one needs to chunk based on the cardinality of buckets rather
than simply time.
Relates to elastic/elasticsearch#734
Original commit: elastic/x-pack-elasticsearch@a738e86d21
Similarly to task status on normal tasks it's now possible to update task status on the persistent tasks. This should allow updating the state of the running tasks (such as loading, started, etc) as well as store intermediate state or progress.
Original commit: elastic/x-pack-elasticsearch@ed109cfa84
The ScrollDataExtractor needs to clear the scroll after
it is complete. Originally, it was thought that completing a scroll
leads to an automatic clearing of its context. That is not true,
thus manual clearing has to be requested.
- Also removes sorting in AggregationDataExtractor as it was redundant
Original commit: elastic/x-pack-elasticsearch@8f955da8ce
If `domainSplit(` is detected in an inline script, the function and params are injected into
the script.
The majority of this PR is actually test-related. Adds a unit test to check for the injected
script/params. Also adds another QA test which -- through a very round-about mechanism --
confirms that the injected script compiles and functions correctly. The QA test can
be simplified greatly once the Preview API is added.
Original commit: elastic/x-pack-elasticsearch@c7c35a982c
Changes are:
1. The detector validation endpoint is changed from /_xpack/ml/_validate/detector
to /_xpack/ml/anomaly_detectors/_validate/detector
2. A new endpoint is added for validating an entire job config:
/_xpack/ml/anomaly_detectors/_validate
Relates elastic/elasticsearch#630
Original commit: elastic/x-pack-elasticsearch@7b2031e746
* Store input fields for anomaly records and influencers
* Address review comments
* Remove DotNotationReverser
* Remove duplicated constants
* Can’t use the same date for all records as they will have equivalent Ids
Original commit: elastic/x-pack-elasticsearch@40796b5efc
This needs to be moved to the single-node-tests qa modules since integTests shouldn’t access modules.
Original commit: elastic/x-pack-elasticsearch@289b697eb8