Commit Graph

307 Commits

Author SHA1 Message Date
David Kyle c84f227857 Set memory usage log message to trace (elastic/elasticsearch#829)
Original commit: elastic/x-pack-elasticsearch@13412cc4cf
2017-01-31 09:54:56 +00:00
David Roberts ab957b6d91 Adjust validation endpoints (elastic/elasticsearch#812)
Changes are:

1. The detector validation endpoint is changed from /_xpack/ml/_validate/detector
   to /_xpack/ml/anomaly_detectors/_validate/detector
2. A new endpoint is added for validating an entire job config:
   /_xpack/ml/anomaly_detectors/_validate

Relates elastic/elasticsearch#630

Original commit: elastic/x-pack-elasticsearch@7b2031e746
2017-01-30 17:10:22 +00:00
David Kyle 4eab74ce29 Store input fields for anomaly records and influencers (elastic/elasticsearch#799)
* Store input fields for anomaly records and influencers

* Address review comments

* Remove DotNotationReverser

* Remove duplicated constants

* Can’t use the same date for all records as they will have equivalent Ids

Original commit: elastic/x-pack-elasticsearch@40796b5efc
2017-01-30 14:05:18 +00:00
Colin Goodheart-Smithe 79d1a10a86 Mutes DataFeedJobIT test method that uses painless
This needs to be moved to the single-node-tests qa modules since integTests shouldn’t access modules.

Original commit: elastic/x-pack-elasticsearch@289b697eb8
2017-01-30 10:55:59 +00:00
Colin Goodheart-Smithe 618cb2a1a0 Make all action names cluster actions
Original commit: elastic/x-pack-elasticsearch@815d8f0aac
2017-01-30 10:00:23 +00:00
Igor Motov 827118e154 Adds support for persistent actions
A persistent action is a transport-like action that is using the cluster state instead of transport to start tasks. This allows persistent tasks to survive restart of executing nodes. A persistent action can be implemented by extending TransportPersistentAction. TransportPersistentAction will start the task by using PersistentActionService, which controls persistent tasks lifecycle.  See TestPersistentActionPlugin for an example implementing a persistent action.

Original commit: elastic/x-pack-elasticsearch@8ef4103cd6
2017-01-27 11:20:54 -05:00
Martijn van Groningen ff65c38253 [TEST] fixed mocking logic to include id
Original commit: elastic/x-pack-elasticsearch@7b20e92fdc
2017-01-27 17:20:18 +01:00
Martijn van Groningen 2059b91620 Workaround for index request without an id being retried that are tripping an assertion in internal engine. (2)
Original commit: elastic/x-pack-elasticsearch@22d5060deb
2017-01-27 17:07:51 +01:00
Martijn van Groningen ad4218320c Workaround for index request without an id being retried that are tripping an assertion in internal engine.
Original commit: elastic/x-pack-elasticsearch@ba44acc28b
2017-01-27 16:15:00 +01:00
David Roberts 64fdb039ab Reduce the controller connect timeout (elastic/elasticsearch#804)
This used to be 60 seconds, dating back to the days when the controller
had to be started manually after starting Elasticsearch.  However, now
Elasticsearch starts it automatically it should already be running when
we try to connect, so the timeout can be much lower.  It just needs to
be long enough to give the C++ process time to create its named pipes.
2 seconds seems reasonable, and matches what we use for autodetect and
normalize.

Original commit: elastic/x-pack-elasticsearch@7300d68482
2017-01-27 14:23:18 +00:00
Zachary Tong 9395ef81b1 Painless DomainSplit tests in new Single-Node QA Module (elastic/elasticsearch#787)
This contains the Painless-based DomainSplit function, generated static maps and basic tests.  Due to cross-module complications, the tests are run by executing searches with script_fields and checking the response


Original commit: elastic/x-pack-elasticsearch@c6c2942e01
2017-01-27 08:52:48 -05:00
Dimitris Athanasiou 91be1e719d Disable stored_fields when possible in ScrollDataExtractor (elastic/elasticsearch#801)
When source fields are not required, stored_fields can be disabled.
This can make the query faster as no stored fields have to be
decompressed. Note that this means no metadata (_id, _index, _type, etc.)
will be returned.

Original commit: elastic/x-pack-elasticsearch@b1ea526d83
2017-01-27 11:38:54 +00:00
Dimitris Athanasiou 5790a6f152 Handle shard failures in extractors (elastic/elasticsearch#794)
Even though a search response may return a 200 status code, things could
still have gone wrong. A search response may report shard failures.

The datafeed extractors should check for that and report an extraction
error accordingly.

Closes elastic/elasticsearch#775

Original commit: elastic/x-pack-elasticsearch@5d6d899738
2017-01-26 16:01:43 +00:00
David Kyle efc47c2a6f Remove Usage classes (elastic/elasticsearch#796)
* Delete usage class

* Delete usage reporter

* Remove unused constant

Original commit: elastic/x-pack-elasticsearch@c7a6c457bd
2017-01-26 11:50:08 +00:00
David Kyle db14d89358 Fix checkstyle
Original commit: elastic/x-pack-elasticsearch@05d59da705
2017-01-26 10:05:03 +00:00
David Kyle e3bb7cfea3 Split ml-int index into .ml-audit and .ml-meta (elastic/elasticsearch#752)
* Audit messages in .ml-audit

* Rename ml-int to .ml-meta

* Remove no release comment

* Fix compilation after classes moved to a different package

* Create the Audit, state and meta indices every time a job is created

* Revert change creating the audit index etc when the job is created

* Rename index .ml-audit -> .ml-notifications

Original commit: elastic/x-pack-elasticsearch@95168fa341
2017-01-26 09:44:54 +00:00
Martijn van Groningen 3a36f94a4a When timeout has been reached, check one more time if the job / datafeed status has the expected value.
Decreased wait timeout from 30s to 20s

Original commit: elastic/x-pack-elasticsearch@b46fb0abe3
2017-01-25 23:32:04 +01:00
Dimitris Athanasiou 86291c12e2 Handle manual aggregations in datafeeds (elastic/elasticsearch#784)
* Handle manual aggregations in datafeeds

Adds a DataExtractor implementation that runs aggregated searches.

The manual aggregations supported have the following limitations:

- each aggregation can hava 0 or 1 sub-aggregations
- the top aggregation has to be a histogram
- sub-aggregations have to be either terms aggregations or single value
metric aggregations.

The response is converted into flat JSON documents that contain only the
fields of interest and can be parsed without additional context from our
JSON parser. The fields in the JSON documents correspond to the names of the aggregations.

Closes elastic/elasticsearch#680

Original commit: elastic/x-pack-elasticsearch@7dfd2d31e6
2017-01-25 19:13:03 +00:00
Colin Goodheart-Smithe 716f543f7b Adds a new constructor to plugin
The new constructor takes an Environment object. This is needed for migration to X-Pack since the environment instance is built by the XPackPlugin and then passed into the feature plugins.

Original commit: elastic/x-pack-elasticsearch@f25225bc6a
2017-01-25 18:45:04 +00:00
David Roberts 4b366f8ef6 Removing transforms and the SINGLE_LINE input format (elastic/elasticsearch#790)
Most transforms will be replaced with Painless scripts.

The exception is the DateTransform, whose functionality is now simplified
to what existed before the other transforms were added.

The SINGLE_LINE format relied on transforms to extract fields, so has also
been removed, but this is reasonable as it strays into Logstash territory.

Relates elastic/elasticsearch#630

Closes elastic/elasticsearch#39

Original commit: elastic/x-pack-elasticsearch@a593d3e0ad
2017-01-25 15:51:50 +00:00
Colin Goodheart-Smithe 603fa47580 Adds an option to disable the ML plugin (elastic/elasticsearch#785)
Adds an `xpack.ml.enabled` node level setting that can be used to enable and disable the plugin. This will be important when we migrate to X-Pack

Closes elastic/elasticsearch#781

Original commit: elastic/x-pack-elasticsearch@e5c4969a96
2017-01-24 16:14:56 +00:00
Martijn van Groningen e9f899e57a Improved datafeed logging for stopping
Original commit: elastic/x-pack-elasticsearch@94bd5d6a00
2017-01-24 16:00:54 +01:00
Martijn van Groningen b636a4b829 Fixed timeout (de-)serialization for start and stop datafeeder and open job apis.
Original commit: elastic/x-pack-elasticsearch@be054db48c
2017-01-24 15:53:54 +01:00
Martijn van Groningen 8dbaef186e [TEST] use unique job ids to make debugging log files easier
Original commit: elastic/x-pack-elasticsearch@9f04e1b01f
2017-01-24 12:01:21 +01:00
Martijn van Groningen 29451bb7e3 [TEST] select timestamp differently for test documents
Original commit: elastic/x-pack-elasticsearch@679273012c
2017-01-24 11:07:24 +01:00
David Roberts 215410e93f Rename list to filter (elastic/elasticsearch#774)
Part of the endpoint rename Sophie and Steve agreed with Clint

Relates elastic/elasticsearch#630

Original commit: elastic/x-pack-elasticsearch@6ded117849
2017-01-24 10:01:24 +00:00
Martijn van Groningen a7d1918461 [TEST] added more logging
Original commit: elastic/x-pack-elasticsearch@062a0b41b8
2017-01-23 17:54:14 +01:00
David Roberts cd2332730b Move the named pipe no bootstrap test to a separate qa module (elastic/elasticsearch#769)
This matches the way tests that need to run without an Elasticsearch
bootstrap are run in core Elasticsearch.  This should make merging to
x-pack easier.

Note that the no bootstrap tests now run after the integration tests, but
this doesn't really matter.

Original commit: elastic/x-pack-elasticsearch@5547f457b6
2017-01-23 12:08:35 +00:00
Dimitris Athanasiou b3b8a7edc9 Restructure packages (elastic/elasticsearch#767)
Restructure packages according to plan described in elastic/elasticsearch#730.
Copying here for completeness.

* Move config classes
* DataCounts/Quantiles/state are process.autodetect.state
* AutodetectProcessManager move to process.autodetect
* lists -> config
* PageParams, QueryPage -> action.util
* StatusReporter -> process.DataCountsReporter
* CountingInputStream -> process
* JobManager -> job
* CppLog* -> process.logging
* DataProcessor -> collapse into implementation
* job.audit -> ml.notifications

Closes elastic/elasticsearch#730

Original commit: elastic/x-pack-elasticsearch@769ea1ed69
2017-01-23 10:12:21 +00:00
Martijn van Groningen 0c40317ed2 Remove unneeded inject annotations, because they are no longer needed since rest controllers have been de-guiced.
Original commit: elastic/x-pack-elasticsearch@d272ae1a1c
2017-01-23 10:05:09 +01:00
polyfractal ae7446f78e De-guice getRestHandlers() to follow upstream changes
Original commit: elastic/x-pack-elasticsearch@04bd9e688f
2017-01-20 17:13:35 -05:00
David Kyle ecd462bf89 Fix bug updating normalised results (elastic/elasticsearch#765)
The bulk request needed resetting after it was executed otherwise stale documents are persisted repeatedly after they have been updated causing a versioning error

Original commit: elastic/x-pack-elasticsearch@263fa9d25d
2017-01-20 17:33:37 +00:00
Colin Goodheart-Smithe 4c6989212a Gets build to use elasticsearch-extras (elastic/elasticsearch#758)
* Gets build to use elasticsearch-extras

Also adds ci script for building repo on CI servers

To use this change you need to:
1. Clone elasticsearch: `git@github.com:elastic/elasticsearch.git`
2. create a directory at the same level as elasticsearch called `elasticsearch-extra`
3. Clone this repository into the `elasticsearch-extra` directory
4. Run `gradle build` from the `elasticsearch-extra/prelert-legacy` directory or run `gradle :prelert-legacy:build` from the `elasticsearch directory

* Adds USE_SSH option to ci script

* iter

Original commit: elastic/x-pack-elasticsearch@ea127dfef0
2017-01-20 15:11:21 +00:00
David Roberts bddfac59ed Minor adjustments to datafeed endpoints (elastic/elasticsearch#761)
1. get_datafeeds_stats -> get_datafeed_stats

2. get_datafeeds now accepts implicit _all

Relates elastic/elasticsearch#630

Original commit: elastic/x-pack-elasticsearch@1fc0f69ee2
2017-01-20 10:57:06 +00:00
David Kyle 2eb0499454 Remove JobDataDeleterFactory and OldDataDeleter (elastic/elasticsearch#759)
Original commit: elastic/x-pack-elasticsearch@ac5b75eb58
2017-01-20 09:49:24 +00:00
Martijn van Groningen 9665368755 Changed job lifecycle to be task oriented.
The job open api starts a task and ties that AutodetectCommunicator.
The job close api is a sugar api, that uses the list and cancel task api to close a AutodetectCommunicator instance.
The flush job and post data api redirect to the node holding the job task and then delegate the flush or data to the AutodetectCommunicator instance.

Also:
* Added basic multi node cluster test.
* Fixed cluster state diffs bugs, forgot to mark ml metadata diffs as named writeable.
* Moved waiting for open job logic into OpenJobAction.TransportAction and moved the logic that was original there to a new action named InternalOpenJobAction.

Original commit: elastic/x-pack-elasticsearch@194a058dd2
2017-01-19 23:15:00 +01:00
Martijn van Groningen f20f56e2e1 fixed compile errors due upstream change
Original commit: elastic/x-pack-elasticsearch@0a9d73d2be
2017-01-19 20:42:44 +01:00
Colin Goodheart-Smithe 3fa87c0994 Removes upload pack task from build (elastic/elasticsearch#757)
* removes upload pack task from build

This is preventing us from being an elasticsearch-extra project and we cannot have this task when we move to x-pack. Once we are in X-Pack the unified build will be uploading the final artifact so for now we will change the CI build to add a build step to upload the pack artifact.

* Removes OS specific stuff from the build

the CPP_LOCAL_DIST will now look for any `ml-cpp` artifacts for the same version in the specified directory.

* review corrections

Original commit: elastic/x-pack-elasticsearch@be15e55ddb
2017-01-19 16:14:08 +00:00
Colin Goodheart-Smithe 55dd438557 Fixes typo on cpp dependencies
Original commit: elastic/x-pack-elasticsearch@1bf51ac6f5
2017-01-19 15:36:45 +00:00
Colin Goodheart-Smithe 62cb7a17c5 Changes build to get c++ lib as a standard dependency (elastic/elasticsearch#756)
Original commit: elastic/x-pack-elasticsearch@d46990da49
2017-01-19 15:22:55 +00:00
Colin Goodheart-Smithe 33800bae5e Changes build to make cpp artifact download correct with ml-cpp changes (elastic/elasticsearch#754)
https://github.com/elastic/machine-learning-cpp/pull/3 changes the artifact names and paths for the ml-cpp build. This change makes it so the machine learning build references the artifacts in their new location.

Original commit: elastic/x-pack-elasticsearch@d3916b6a7f
2017-01-19 13:51:01 +00:00
Dimitris Athanasiou 0b084ea0e6 Treat timestamps without timezone as UTC (elastic/elasticsearch#753)
Original commit: elastic/x-pack-elasticsearch@33ab2fb781
2017-01-19 13:49:14 +00:00
David Roberts 36bdcaff5d Rename scheduler/scheduled to datafeed (elastic/elasticsearch#755)
Relates elastic/elasticsearch#630

The more subtle changes to the datafeed endpoints required by elastic/elasticsearch#630
are NOT in this commit, as they would be drowned out by the rename

Original commit: elastic/x-pack-elasticsearch@3318971da9
2017-01-19 13:44:19 +00:00
David Roberts 10441a3e38 More endpoint adjustments (elastic/elasticsearch#750)
This commit contains some more of the endpoint changes Sophie and Steve
agreed with Clint:

1. get_jobs_stats renamed to get_job_stats

2. Revert snapshot must now be done using an ID - other options removed

3. Renamed "categorydefinitions" to "categories" in endpoints

4. get_jobs now has an implicit _all if no job ID/wildcard is specified

5. There is an option to retrieve a specific model snapshot by ID in
   get_model_snapshots

Relates elastic/elasticsearch#630

Original commit: elastic/x-pack-elasticsearch@9dd71c64a8
2017-01-19 11:41:35 +00:00
David Kyle e826a56212 Make document Ids unique if in a shared index (elastic/elasticsearch#749)
Original commit: elastic/x-pack-elasticsearch@ecc7e876ce
2017-01-19 09:31:03 +00:00
Martijn van Groningen d3c589c33d Moved waiting for scheduler started logic into StartSchedulerAction.TransportAction and moved the logic that was original there to a new action named InternalStartSchedulerAction.
This change prepares for elastic/elasticsearch/elastic/elasticsearch#22575, where we don't have ClusterService available in rest actions.

Original commit: elastic/x-pack-elasticsearch@87658c7fe8
2017-01-19 09:38:10 +01:00
Dimitris Athanasiou c33f26976d Improve field extraction in scheduler (elastic/elasticsearch#748)
This commit performs the following improvements:

- the time field is always requested as doc_value. This makes
specifying a time format for scheduled jobs unnecessary.
- adds DataDescription as a param to the PostDataAction. When set,
it overrides the job's DataDescription. This allows the scheduler to
override the job's DataDescription since it knows the data format (JSON)
and the time format (epoch_ms). This is not exposed in the REST API to
discourage users from using it.
- by default, data extractor search now requests doc_values for analysis fields. This is
expected to result in increased performance.
- a `_source` field is added to the scheduler config. This needs to be
set to true when one or more of the analysis fields do not have
doc_values.
- the ELASTICSEARCH data format is removed as is now redundant.
- fixes the usage of `script_fields`. Previously, setting
`script_fields` would result to none of the source to be returned. Thus,
is the analysis fields were a mixture of script and non-script fields it
would not work.
- ensures nested fields are handled properly

Closes elastic/elasticsearch#679, Closes elastic/elasticsearch#267 

Original commit: elastic/x-pack-elasticsearch@fed35ed354
2017-01-18 18:46:43 +00:00
David Kyle 4c0d2a492d Refactor get methods (elastic/elasticsearch#747)
Original commit: elastic/x-pack-elasticsearch@d300be2dde
2017-01-18 13:35:25 +00:00
Martijn van Groningen 40332c7e1c use client instead of transport action directly in rest actions
Original commit: elastic/x-pack-elasticsearch@4c3380ceb9
2017-01-17 20:38:53 +01:00
Martijn van Groningen d9a75424d0 fixed wrong mockito import in test
Original commit: elastic/x-pack-elasticsearch@c6a7232a87
2017-01-17 20:22:30 +01:00