* Handle manual aggregations in datafeeds
Adds a DataExtractor implementation that runs aggregated searches.
The manual aggregations supported have the following limitations:
- each aggregation can have 0 or 1 sub-aggregations
- the top aggregation has to be a histogram
- sub-aggregations have to be either terms aggregations or single value
metric aggregations.
The response is converted into flat JSON documents that contain only the
fields of interest and can be parsed without additional context from our
JSON parser. The fields in the JSON documents correspond to the names of the aggregations.
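
A minimal sketch of the flattening described above, assuming a simplified in-memory model rather than the real Elasticsearch aggregation classes. `AggregationFlattener`, `Bucket` and the `doc_count` field are hypothetical names used for illustration only; they are not taken from this commit.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model for illustration; not the Elasticsearch aggregation API.
final class AggregationFlattener {

    /** One histogram bucket with at most one sub-aggregation. */
    static final class Bucket {
        final long timestamp;
        final String subAggName;   // null when the bucket has no sub-aggregation
        final Object subAggValue;  // Double for a single value metric, Map<String, Long> for terms

        Bucket(long timestamp, String subAggName, Object subAggValue) {
            this.timestamp = timestamp;
            this.subAggName = subAggName;
            this.subAggValue = subAggValue;
        }
    }

    /** Converts histogram buckets into flat documents keyed by aggregation name. */
    static List<Map<String, Object>> flatten(String histogramName, List<Bucket> buckets) {
        List<Map<String, Object>> docs = new ArrayList<>();
        for (Bucket bucket : buckets) {
            if (bucket.subAggValue instanceof Map) {
                // Terms sub-aggregation: emit one flat document per term.
                for (Map.Entry<?, ?> term : ((Map<?, ?>) bucket.subAggValue).entrySet()) {
                    Map<String, Object> doc = new LinkedHashMap<>();
                    doc.put(histogramName, bucket.timestamp);
                    doc.put(bucket.subAggName, term.getKey());
                    doc.put("doc_count", term.getValue()); // assumed field name
                    docs.add(doc);
                }
            } else {
                // Single value metric, or no sub-aggregation at all.
                Map<String, Object> doc = new LinkedHashMap<>();
                doc.put(histogramName, bucket.timestamp);
                if (bucket.subAggName != null) {
                    doc.put(bucket.subAggName, bucket.subAggValue);
                }
                docs.add(doc);
            }
        }
        return docs;
    }
}
```

Each resulting map can be serialized as a standalone JSON document, so the parser needs no knowledge of the original aggregation tree.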
Closes elastic/elasticsearch#680
Original commit: elastic/x-pack-elasticsearch@7dfd2d31e6
The new constructor takes an Environment object. This is needed for migration to X-Pack since the environment instance is built by the XPackPlugin and then passed into the feature plugins.
Original commit: elastic/x-pack-elasticsearch@f25225bc6a
Most transforms will be replaced with Painless scripts.
The exception is the DateTransform, whose functionality is now simplified
to what existed before the other transforms were added.
The SINGLE_LINE format relied on transforms to extract fields, so has also
been removed, but this is reasonable as it strays into Logstash territory.
Relates elastic/elasticsearch#630
Closes elastic/elasticsearch#39
Original commit: elastic/x-pack-elasticsearch@a593d3e0ad
This matches the way tests that need to run without an Elasticsearch
bootstrap are run in core Elasticsearch. This should make merging to
x-pack easier.
Note that the no bootstrap tests now run after the integration tests, but
this doesn't really matter.
Original commit: elastic/x-pack-elasticsearch@5547f457b6
The bulk request needed resetting after it was executed. Otherwise, stale documents are persisted again after they have been updated, causing a versioning error.
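
A minimal sketch of the reset, assuming a hypothetical `BatchedPersister` that stands in for the code holding the bulk request. It does not use the Elasticsearch client API; the pending list plays the role of the bulk request.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the class that owns the bulk request.
final class BatchedPersister {
    private List<String> pending = new ArrayList<>();
    private final List<List<String>> executed = new ArrayList<>();

    /** Queues a document for the next bulk execution. */
    void add(String document) {
        pending.add(document);
    }

    /** Executes the queued documents, then starts a fresh batch. */
    void executeRequest() {
        if (pending.isEmpty()) {
            return;
        }
        executed.add(pending); // stand-in for sending the bulk request
        // Reset: without this, the next execution would resend the stale documents.
        pending = new ArrayList<>();
    }

    List<List<String>> executedBatches() {
        return executed;
    }
}
```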
Original commit: elastic/x-pack-elasticsearch@263fa9d25d
* Gets build to use elasticsearch-extras
Also adds ci script for building repo on CI servers
To use this change you need to:
1. Clone elasticsearch: `git@github.com:elastic/elasticsearch.git`
2. Create a directory at the same level as elasticsearch called `elasticsearch-extra`
3. Clone this repository into the `elasticsearch-extra` directory
4. Run `gradle build` from the `elasticsearch-extra/prelert-legacy` directory or run `gradle :prelert-legacy:build` from the `elasticsearch` directory
* Adds USE_SSH option to ci script
* iter
Original commit: elastic/x-pack-elasticsearch@ea127dfef0
The job open api starts a task and ties it to an AutodetectCommunicator.
The job close api is a sugar api that uses the list and cancel task apis to close an AutodetectCommunicator instance.
The flush job and post data api redirect to the node holding the job task and then delegate the flush or data to the AutodetectCommunicator instance.
Also:
* Added basic multi node cluster test.
* Fixed cluster state diff bugs: the ml metadata diffs had not been marked as named writeable.
* Moved the waiting-for-open-job logic into OpenJobAction.TransportAction and moved the logic that was originally there to a new action named InternalOpenJobAction.
Original commit: elastic/x-pack-elasticsearch@194a058dd2
* removes upload pack task from build
This is preventing us from being an elasticsearch-extra project and we cannot have this task when we move to x-pack. Once we are in X-Pack the unified build will be uploading the final artifact so for now we will change the CI build to add a build step to upload the pack artifact.
* Removes OS specific stuff from the build
The build now looks in the directory specified by CPP_LOCAL_DIST for any `ml-cpp` artifacts of the same version.
* review corrections
Original commit: elastic/x-pack-elasticsearch@be15e55ddb
This commit contains some more of the endpoint changes Sophie and Steve
agreed with Clint:
1. get_jobs_stats renamed to get_job_stats
2. Revert snapshot must now be done using an ID - other options removed
3. Renamed "categorydefinitions" to "categories" in endpoints
4. get_jobs now has an implicit _all if no job ID/wildcard is specified
5. There is an option to retrieve a specific model snapshot by ID in
get_model_snapshots
Relates elastic/elasticsearch#630
Original commit: elastic/x-pack-elasticsearch@9dd71c64a8
This change prepares for elastic/elasticsearch#22575, where we don't have ClusterService available in rest actions.
Original commit: elastic/x-pack-elasticsearch@87658c7fe8
This commit performs the following improvements:
- the time field is always requested as doc_value. This makes
specifying a time format for scheduled jobs unnecessary.
- adds DataDescription as a param to the PostDataAction. When set,
it overrides the job's DataDescription. This allows the scheduler to
override the job's DataDescription since it knows the data format (JSON)
and the time format (epoch_ms). This is not exposed in the REST API to
discourage users from using it.
- by default, data extractor search now requests doc_values for analysis fields. This is
expected to result in increased performance.
- a `_source` field is added to the scheduler config. This needs to be
set to true when one or more of the analysis fields do not have
doc_values.
- the ELASTICSEARCH data format is removed as it is now redundant.
- fixes the usage of `script_fields`. Previously, setting
`script_fields` would result in none of the source being returned. Thus,
if the analysis fields were a mixture of script and non-script fields it
would not work.
- ensures nested fields are handled properly
Closes elastic/elasticsearch#679, Closes elastic/elasticsearch#267
Original commit: elastic/x-pack-elasticsearch@fed35ed354
NB: The actual C++ code will be deleted in a separate commit to
avoid swamping this commit.
If you want to have the Java build pick up locally built C++ then:
export CPP_LOCAL_DISTS=$CPP_SRC_HOME/build/distributions
Otherwise, C++ artifacts will be downloaded from S3.
Original commit: elastic/x-pack-elasticsearch@246672e81d
If a scheduled job is stopped concurrently from within (e.g. when lookback ends) and externally via the stop scheduler api, make sure the stop logic is executed only once.
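
One common way to get this guarantee is an atomic compare-and-set guard. The sketch below shows that technique only; `StopOnce` and its counter are hypothetical and not necessarily how this commit implements it.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical guard for illustration; not the actual scheduler classes.
final class StopOnce {
    private final AtomicBoolean running = new AtomicBoolean(true);
    private final AtomicInteger stopExecutions = new AtomicInteger();

    /**
     * Runs the stop logic only for the caller that flips running from true to false.
     * Returns true if this call executed the stop logic.
     */
    boolean stop() {
        if (running.compareAndSet(true, false)) {
            stopExecutions.incrementAndGet(); // stand-in for the real stop logic
            return true;
        }
        return false; // another caller already stopped the job
    }

    int stopExecutions() {
        return stopExecutions.get();
    }
}
```

Whichever caller wins the compare-and-set, internal or external, performs the stop; every other caller returns without side effects.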
Original commit: elastic/x-pack-elasticsearch@505c44f515
The _all field is now deprecated and disabled by default in elasticsearch
6.0.0. We no longer need to disable it explicitly.
Original commit: elastic/x-pack-elasticsearch@c71465083a
When a user makes a GET request to retrieve all resources of a type
(e.g. anomaly_detectors) and none exists, the response should be an
empty array with 200 status code. This commit fixes this issue for:
* anomaly_detectors and _stats
* schedulers and _stats
* lists
* buckets
All other GETs work fine already.
Original commit: elastic/x-pack-elasticsearch@4daaa91aa4
I thought QUERY_AND_FETCH was the most efficient for the data extractor
but it does not work with sorting. It causes all shard results to be
returned before sorting and thus we may get out-of-order errors.
This commit switches to the default search type.
Original commit: elastic/x-pack-elasticsearch@d8a8155973
* Extract method ScheduledJob#postData
* Remove unreachable else statement
* Restrain usage of DataExtractor in a single thread
Original commit: elastic/x-pack-elasticsearch@5b9b310d9d
* prelert to ml
* Prelert to Ml
* PRELERT to ML
Exceptions:
* prelert.com - because it generally appears in links to our website, and
although these will eventually break it will be possible for people to see
what was there using https://archive.org/web/
* PRELERT_AWS_ACCESS_KEY_ID and PRELERT_AWS_SECRET_ACCESS_KEY - because it
creates a knock-on effect on infra that will be temporary anyway because once
we're in x-pack we'll use x-pack keys
* prelert-artifacts - this is the name of the s3 bucket we're currently using
and you cannot rename s3 buckets - as with the access keys it will become
obsolete when we merge to x-pack so there's no point changing it now
* prelert-legacy - the name of our legacy Git repo has not changed
Original commit: elastic/x-pack-elasticsearch@720e83c7f2
and re-enabled some quantiles persistence unit tests (which can remain blocking as they aren't used on a network thread)
Original commit: elastic/x-pack-elasticsearch@cf8e78f42d
* Replace http data extractor with a client extractor
This first implementation replaces the HTTP extractor
with a client extractor that uses search & scroll.
Note that this first implementation has some limitations:
- Only reads data that are in the _source
- Does not handle aggregated searches
These limitations will be addressed in follow up PRs.
Relates to elastic/elasticsearch#154
Original commit: elastic/x-pack-elasticsearch@f692ed961c
* Upgrades to ES 6.0.0-alpha1-SNAPSHOT
* Kibana changes to run upgrade to 6.0.0-alpha1-SNAPSHOT
* Other version changes to 6.0.0-alpha1-SNAPSHOT
Original commit: elastic/x-pack-elasticsearch@574d8573ab
This commit contains around half of the endpoint changes Sophie and Steve
agreed with Clint:
1) Automatic job ID generation is removed
2) Job IDs must now be specified in the URL when putting a job; to avoid
breaking many test configs, job IDs may also be specified in the job config
body, but in this case the value specified must match the URL argument
3) The endpoint name for posting data is now post_data instead of job_data
4) The post_data endpoint ends with _data instead of data
5) modelsnapshots is renamed to model_snapshots in all related endpoints
6) PUT model_snapshots/description is changed to POST model_snapshots/_update
Relates elastic/elasticsearch#630
Original commit: elastic/x-pack-elasticsearch@c379a23f3c
The `influencer_field_name` field was declared twice in the results mapping: once directly from `ElasticsearchMappings.resultsMapping()` and again from `addInfluencerFieldsToMapping(XContentBuilder)`, which the `resultsMapping()` method calls.
This change removes the duplicate.
Original commit: elastic/x-pack-elasticsearch@5707a5ee53
Allow deletes to proceed even if index is missing
Also adds some tests. All non-IndexNotFound exceptions will still abort the delete.
We can revisit this if we find other edge-cases.
Original commit: elastic/x-pack-elasticsearch@823d00d8a7
and FixBlockingClientOperations in two places where blocking client calls are ok,
because these methods aren't called from a network thread.
Original commit: elastic/x-pack-elasticsearch@a6dc34651c
Merged categoryDefinition(...) into categoryDefinitions(...) as the two did similar things. The get call has been replaced with a search with a query on the _uid field and routing on category id, so that the response handling code can be reused.
Original commit: elastic/x-pack-elasticsearch@4243917b00
The start scheduler waits until the scheduler state has been set to started before returning.
Before this change, the scheduler would link itself to the task created by the start scheduler api only after the scheduler state had been set to started.
If the stop scheduler api was called immediately after the start scheduler api, this could lead to the stop scheduler api cancelling the task without
stopping the scheduler, as the scheduler might not yet have been linked to the task.
Now the scheduler gets linked to the task before the scheduler state is set to started, fixing the problematic situation described above.
Original commit: elastic/x-pack-elasticsearch@8334ae1967
Also merged the JobProvider#getBucket(...) method into the JobProvider#getBuckets(...) method because
it contained a lot of similar logic; otherwise it would have had to be converted to use non-blocking client calls too.
Part of elastic/elasticsearch#127
Original commit: elastic/x-pack-elasticsearch@b1e66b62cb
There was an N-squared algorithm in the state processing code, leading
to large state persistence eventually timing out. Large state documents
are read from the network in 8KB chunks, and the old code was checking
ALL previously read chunks for separators every time a new chunk was read.
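
A minimal sketch of a linear-time alternative, in which each incoming chunk is scanned exactly once and earlier chunks are never revisited. `IncrementalSplitter` is a hypothetical name, and the single zero-byte separator is an assumption made for illustration; the actual separator and classes are not shown in this commit message.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical splitter for illustration; not the actual state processing code.
final class IncrementalSplitter {
    private static final byte SEPARATOR = 0; // assumed separator byte

    private ByteArrayOutputStream current = new ByteArrayOutputStream();
    private final List<byte[]> documents = new ArrayList<>();

    /** Scans only the new chunk, so total work is linear in the bytes read. */
    void onChunk(byte[] chunk, int length) {
        int start = 0;
        for (int i = 0; i < length; i++) {
            if (chunk[i] == SEPARATOR) {
                // Finish the document that may have begun in earlier chunks.
                current.write(chunk, start, i - start);
                documents.add(current.toByteArray());
                current = new ByteArrayOutputStream();
                start = i + 1;
            }
        }
        // Keep the trailing partial document for the next chunk.
        current.write(chunk, start, length - start);
    }

    List<byte[]> documents() {
        return documents;
    }
}
```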
Fixes elastic/elasticsearch#635
Original commit: elastic/x-pack-elasticsearch@c814858c2c
Deleting a job now starts a three-step process:
1. Job status updated to DELETING
2. Physical index is deleted
3. Job removed from cluster state
When jobs are in DELETING, they cannot be modified/updated/changed at all. Only jobs that are DELETING can actually be removed from the CS.
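
The rules above can be sketched as a small state guard. `JobRegistry` is a hypothetical in-memory stand-in for the cluster state, and the `OPENED` and `CLOSED` statuses are assumptions; only `DELETING` comes from this commit message.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for the job entries in the cluster state.
final class JobRegistry {
    enum Status { OPENED, CLOSED, DELETING } // OPENED and CLOSED are assumed names

    private final Map<String, Status> jobs = new HashMap<>();

    void put(String jobId, Status status) {
        jobs.put(jobId, status);
    }

    /** Step 1: mark the job as DELETING before any data is removed. */
    void markDeleting(String jobId) {
        jobs.put(jobId, Status.DELETING);
    }

    /** Any modification is rejected while the job is DELETING. */
    void update(String jobId, Status newStatus) {
        if (jobs.get(jobId) == Status.DELETING) {
            throw new IllegalStateException("job [" + jobId + "] is being deleted");
        }
        jobs.put(jobId, newStatus);
    }

    /** Step 3: only jobs that are DELETING may be removed. */
    void remove(String jobId) {
        if (jobs.get(jobId) != Status.DELETING) {
            throw new IllegalStateException("job [" + jobId + "] is not in DELETING");
        }
        jobs.remove(jobId);
    }

    boolean contains(String jobId) {
        return jobs.containsKey(jobId);
    }
}
```

Step 2, deleting the physical index, would happen between `markDeleting` and `remove` and is omitted here.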
Original commit: elastic/x-pack-elasticsearch@2cd99a240c
with the fix we also make sure that prelert metadata is taken into account when verifying the cluster state consistency
Original commit: elastic/x-pack-elasticsearch@1deaec3836