418 Commits

Author SHA1 Message Date
Colin Goodheart-Smithe 1b2381d355 Migrates Elasticsearch files
Original commit: elastic/x-pack-elasticsearch@370af88d14
2017-02-08 16:58:55 +00:00
Colin Goodheart-Smithe fa7a82a945 Removes files no longer needed
Original commit: elastic/x-pack-elasticsearch@8f197075a3
2017-02-08 16:58:55 +00:00
Martijn van Groningen 14a677396e [TEST] Reduce size of large documents to be more heap memory friendly in xpack build
Original commit: elastic/x-pack-elasticsearch@d3864a5021
2017-02-08 17:45:50 +01:00
Zachary Tong 1591003c7d Fix rest of mocking issues, remove awaitsFix
Original commit: elastic/x-pack-elasticsearch@d5e876e867
2017-02-08 10:57:17 -05:00
David Kyle 6e929fb290 Fix test
Original commit: elastic/x-pack-elasticsearch@0e656d8906
2017-02-08 15:52:23 +00:00
Zachary Tong 99ce9be6ca Fix (some) mocking issues due to upstream changes.
Two suites marked as awaitsFix while being worked on.

Original commit: elastic/x-pack-elasticsearch@06eb352b1e
2017-02-08 10:36:15 -05:00
Colin Goodheart-Smithe 30b6425b3a Convert ml-cpp repo to be part of elasticsearch-extra (elastic/elasticsearch#890)
This means we can reference the local build from within the prelert-legacy build script and build it directly

Original commit: elastic/x-pack-elasticsearch@14024841ab
2017-02-08 14:39:22 +00:00
Zachary Tong 91883ad57b Upstream fixes: use getter methods instead of (now) private members
Original commit: elastic/x-pack-elasticsearch@80786e4f84
2017-02-08 09:37:10 -05:00
David Kyle 9dc4a2f31c Online updates to the running autodetect process (elastic/elasticsearch#886)
* Methods to update the running process with new settings

* Task to update the running autodetect process

* Don’t start process update task if no config is specified

Original commit: elastic/x-pack-elasticsearch@4364b141b5
2017-02-08 14:19:24 +00:00
David Roberts 639c02a45e Change variable name
Original commit: elastic/x-pack-elasticsearch@5576ec2196
2017-02-07 17:16:45 +00:00
David Roberts 3eec3ab42a Increase time allowed for large state test (elastic/elasticsearch#871)
Previously it would fail on some old/slow development machines

Closes elastic/elasticsearch#805

Original commit: elastic/x-pack-elasticsearch@6f182ed125
2017-02-07 15:39:59 +00:00
David Roberts af10f880fb Allow vault to work on Windows (elastic/elasticsearch#878)
Original commit: elastic/x-pack-elasticsearch@a404f4793a
2017-02-07 14:59:55 +00:00
Dimitris Athanasiou 15160e41a2 Fix datafeed with date_histogram aggregation (elastic/elasticsearch#876)
date_histogram buckets return the key as a DateTime object.
This PR checks if the key is DateTime and returns the epoch
millis when suitable.

Fixes elastic/elasticsearch#869

Original commit: elastic/x-pack-elasticsearch@8e39760dad
2017-02-07 14:45:02 +00:00
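The key check described in this commit might look roughly like the sketch below. It is only an illustration: `java.time.ZonedDateTime` stands in for the Joda `DateTime` mentioned in the message so that the example needs no external library, and the class and method names (`BucketKeySketch`, `toEpochMillisIfDate`) are hypothetical, not taken from the x-pack code.

```java
import java.time.ZonedDateTime;

public class BucketKeySketch {
    /**
     * Returns the epoch millis (as a Long) when the bucket key is a date-time
     * object, and the key unchanged otherwise.
     * Assumption: ZonedDateTime is a stand-in for Joda's DateTime here.
     */
    public static Object toEpochMillisIfDate(Object key) {
        if (key instanceof ZonedDateTime) {
            return ((ZonedDateTime) key).toInstant().toEpochMilli();
        }
        return key;
    }
}
```

Non-date keys pass through untouched, so callers that already receive a numeric key see no change in behaviour.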
Dimitris Athanasiou 678ae53596 Make flush wait to completion (elastic/elasticsearch#875)
Flush has the contract that when it is done, results are up to date.
Thus, it adds no value to have it time out. In most cases, the request
should be fairly responsive, except when it advances time forward.
In that scenario, it could force results to be calculated for a
long period of time, which could take a while. The one use case for this
is datafeeds, and there is no issue with waiting for flush to finish.

This PR changes flush to always wait for completion. However, it adds
a check every second that the C++ process is alive, to avoid long
waits in vain when something has gone horribly wrong.

Fixes elastic/elasticsearch#826

Original commit: elastic/x-pack-elasticsearch@de421ab843
2017-02-07 14:28:01 +00:00
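The waiting strategy described in this commit (no overall timeout, but a periodic liveness check) could be sketched as follows. This is a minimal illustration under assumptions: the class `FlushWaiter`, the method `awaitFlush`, and the use of a `CountDownLatch` plus a `BooleanSupplier` liveness probe are all hypothetical and are not taken from the actual change.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

public class FlushWaiter {
    /**
     * Blocks until flushDone is counted down, with no overall timeout.
     * After every pollMillis without completion, the liveness probe is
     * consulted so that a dead process ends the wait with an exception
     * instead of blocking forever.
     */
    public static void awaitFlush(CountDownLatch flushDone, BooleanSupplier processAlive, long pollMillis) {
        try {
            while (!flushDone.await(pollMillis, TimeUnit.MILLISECONDS)) {
                if (!processAlive.getAsBoolean()) {
                    throw new IllegalStateException("process died while waiting for flush");
                }
            }
        } catch (InterruptedException e) {
            // Restore the interrupt flag and surface the failure to the caller.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for flush", e);
        }
    }
}
```

With `pollMillis` set to 1000 this matches the one-second check mentioned in the message; the interval is a parameter here only to keep the sketch easy to exercise.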
Colin Goodheart-Smithe 0c64c22883 Fixed vault URL for ci build
Original commit: elastic/x-pack-elasticsearch@c9cb05bf0e
2017-02-07 12:32:38 +00:00
Colin Goodheart-Smithe 9ed50211d1 fixed CI env variable for vault
Original commit: elastic/x-pack-elasticsearch@623ea83217
2017-02-07 12:28:00 +00:00
Colin Goodheart-Smithe 7dc4adf238 Adds vault access to build to get aws creds (elastic/elasticsearch#874)
After this change, the build requires a github.token file in the root directory of the repository so that it can authenticate with the Vault service and obtain the AWS credentials needed to download the ml-cpp artifacts

Original commit: elastic/x-pack-elasticsearch@630efadef8
2017-02-07 12:22:34 +00:00
Dimitris Athanasiou ccb9ab5717 Fix time field extraction after upstream change (elastic/elasticsearch#873)
Elasticsearch changed doc_values of date fields to return a
joda DateTime object. Thus, we need to call getMillis() to extract
the epoch millis value.

Original commit: elastic/x-pack-elasticsearch@b992882af5
2017-02-07 12:04:01 +00:00
Zachary Tong 0b71e015d8 Remove incorrect/unused parameter
Original commit: elastic/x-pack-elasticsearch@4f33186b5c
2017-02-06 14:50:10 -05:00
Zachary Tong a1a5d590b6 Integrate DBQ into job deletion process (elastic/elasticsearch#691)
A JobStorageDeletionTask is created, which supervises the physical deletion of the job.  This
task is a child of the DeleteJob action.  After the DBQ finishes, the normal flow
resumes (physical index deleted, job removed from CS)

Original commit: elastic/x-pack-elasticsearch@5d6f694408
2017-02-06 14:34:36 -05:00
Igor Motov a0b37a2510 Replace List with Map in PersistentTasksInProgress
Store currently running persistent tasks in a map instead of a list.

Original commit: elastic/x-pack-elasticsearch@f383b0bbed
2017-02-06 12:18:42 -05:00
Dimitrios Athanasiou aa86d57487 Include start/end in log message for starting datafeed
Original commit: elastic/x-pack-elasticsearch@7b88bb27c1
2017-02-06 13:49:03 +00:00
David Roberts f594030c9e Fix some mappings on the .ml indexes (elastic/elasticsearch#870)
Closes elastic/elasticsearch#814

Original commit: elastic/x-pack-elasticsearch@206efacc4c
2017-02-06 12:26:03 +00:00
David Kyle 3f9741b85f Make config and result objects with dates human readable (elastic/elasticsearch#863)
Original commit: elastic/x-pack-elasticsearch@9c0c306741
2017-02-06 09:46:21 +00:00
David Roberts 50c4090541 Remove timeout setting from Job (elastic/elasticsearch#866)
This setting was related to auto-close, and Jobs no longer auto-close.

Closes elastic/elasticsearch#832

Original commit: elastic/x-pack-elasticsearch@fef81f9c3b
2017-02-06 09:39:55 +00:00
David Kyle 5a1cd69a6a More checkstyle fixes
Original commit: elastic/x-pack-elasticsearch@4c454c5061
2017-02-03 16:58:19 +00:00
David Roberts fb0ccde8d8 More checkstyle fixes
Original commit: elastic/x-pack-elasticsearch@f68e57835b
2017-02-03 16:32:09 +00:00
David Kyle 55482ca65c Fix check style error after upgrade
Original commit: elastic/x-pack-elasticsearch@db802d1837
2017-02-03 16:08:07 +00:00
Martijn van Groningen 1b65366478 Simplified AutodetectProcess interface:
* Removed getPersistStream() method from this interface and let the NativeAutodetectProcess implementation deal with this. The persist stream is an implementation detail and BlackHoleAutodetectProcess doesn't deal with this either.
* Replaced getProcessOutStream() method with readAutodetectResults() method. This method now returns an `Iterator<AutodetectResult>` instead of an input stream. This makes the BlackHoleAutodetectProcess and future mocked implementations easier to write.

Original commit: elastic/x-pack-elasticsearch@086e7b40ab
2017-02-03 16:52:51 +01:00
Dimitris Athanasiou 9d9572e2b2 Reintroduce chunking to improve data extractor performance (elastic/elasticsearch#849)
* Reintroduce chunking to improve data extractor performance

Performing a sorted search/scroll over a period of time that matches
a lot of documents is very expensive because for each page all
documents are traversed.

The solution is to chunk the search time and perform separate
search/scrolls for each chunk.

This commit introduces a new `chunk` config in `datafeed_config`
whose mode can be set to one of AUTO, OFF, or MANUAL, with the latter
allowing an explicit chunk size to be specified.

When set to AUTO, a heuristic is used in order to determine the chunk
size. The heuristic is based on estimating the time interval within
which we expect `scroll_size` documents and then taking the 10x multiple
of that. Based on benchmarking, this method gives a dramatic performance
increase. For example, for the citizens dataset it improved the ingest
rate from 0.33M docs / minute to 13.6M docs / minute. Farequote is now
done in ~1 second.

Finally, note that when `chunk` is not specified, it defaults to AUTO
when aggregations are not set and to OFF otherwise. This is because
the chunk size heuristic does not lend itself well to aggregations
where one needs to chunk based on the cardinality of buckets rather
than simply time.

Relates to elastic/elasticsearch#734

Original commit: elastic/x-pack-elasticsearch@a738e86d21
2017-02-03 15:50:01 +00:00
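The AUTO heuristic and the default-mode rule from this commit can be sketched as below. The message states only that the chunk size is 10x the estimated time interval expected to hold `scroll_size` documents. How that interval is estimated is not described, so this sketch assumes documents are spread evenly over the searched time range. The class and method names (`ChunkSpanSketch`, `estimateChunkSpanMs`, `defaultMode`) are hypothetical.

```java
public class ChunkSpanSketch {
    public enum Mode { AUTO, OFF, MANUAL }

    // Multiple of the per-page interval used as the chunk size (from the commit message).
    static final int MULTIPLIER = 10;

    /**
     * Estimates a chunk span in milliseconds.
     * Assumption: documents are uniformly distributed between earliestMs and latestMs,
     * so the interval holding scrollSize documents is span * scrollSize / totalDocs.
     */
    public static long estimateChunkSpanMs(long earliestMs, long latestMs, long totalDocs, int scrollSize) {
        if (totalDocs <= 0 || scrollSize <= 0 || latestMs <= earliestMs) {
            throw new IllegalArgumentException("need a non-empty time range, documents and scroll size");
        }
        double intervalPerPage = (double) (latestMs - earliestMs) * scrollSize / totalDocs;
        return Math.max(1L, (long) (MULTIPLIER * intervalPerPage));
    }

    /** Default when `chunk` is not specified: AUTO without aggregations, OFF with them. */
    public static Mode defaultMode(boolean hasAggregations) {
        return hasAggregations ? Mode.OFF : Mode.AUTO;
    }
}
```

For instance, 100,000 documents spread over 1,000,000 ms with a scroll size of 1,000 give an interval of 10,000 ms per page and therefore a chunk span of 100,000 ms.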
David Kyle 21adb19b22 Checkstyle fix
Original commit: elastic/x-pack-elasticsearch@1d0eaed282
2017-02-03 15:35:36 +00:00
Martijn van Groningen a7d95951a6 Removed forgotten blocking call when opening a job.
Original commit: elastic/x-pack-elasticsearch@e1dfa54240
2017-02-03 16:24:43 +01:00
Dimitris Athanasiou 9b0344cd90 Write enum values in lowercase (elastic/elasticsearch#861)
Original commit: elastic/x-pack-elasticsearch@6788ad3304
2017-02-03 15:10:11 +00:00
David Kyle e7dcab48ab Test was testing the wrong endpoint
Original commit: elastic/x-pack-elasticsearch@ca7a1a1097
2017-02-03 14:50:08 +00:00
David Kyle 70b8129b78 Add job update endpoint (elastic/elasticsearch#854)
* Remove redundant code

* Add job update endpoint

* Support updating detector description & rules

* Fix merge conflicts

* Use toStrings and fix race condition in update

* Revert to using xpack.ml.support.AbstractSerializingTestCase

Original commit: elastic/x-pack-elasticsearch@771ada0572
2017-02-03 14:22:36 +00:00
Dimitrios Athanasiou 2883b00b7c Also rename some *Status*Tests to *State*Tests
Original commit: elastic/x-pack-elasticsearch@6e1d3e2bba
2017-02-03 11:08:02 +00:00
Dimitris Athanasiou ca4badeb46 Rename {Job|Datafeed}Status to {Job|Datafeed}State (elastic/elasticsearch#856)
This is more consistent with elasticsearch where an index
has state [open, close], etc.

Original commit: elastic/x-pack-elasticsearch@30bf720c3e
2017-02-03 10:43:05 +00:00
David Kyle b940dbf6d9 Remove the overwrite option from PUT job (elastic/elasticsearch#855)
Original commit: elastic/x-pack-elasticsearch@0f7e0d35a9
2017-02-03 09:54:47 +00:00
Igor Motov 53a5e19c70 Add support for task status on persistent tasks
Similarly to task status on normal tasks, it's now possible to update task status on persistent tasks. This should allow updating the state of running tasks (such as loading, started, etc.) as well as storing intermediate state or progress.

Original commit: elastic/x-pack-elasticsearch@ed109cfa84
2017-02-02 13:35:37 -05:00
Martijn van Groningen 06d688eb74 AutodetectProcessManager#getStatistics(...) can now just return stats for a single job as the _all expansion is done on the transport layer
Original commit: elastic/x-pack-elasticsearch@02d5272a4e
2017-02-02 13:11:39 +01:00
Dimitris Athanasiou 5ba9a6cfcc Clear scroll after it is complete (elastic/elasticsearch#847)
The ScrollDataExtractor needs to clear the scroll after
it is complete. Originally, it was thought that completing a scroll
leads to an automatic clearing of its context. That is not true,
thus manual clearing has to be requested.

- Also removes sorting in AggregationDataExtractor as it was redundant

Original commit: elastic/x-pack-elasticsearch@8f955da8ce
2017-02-02 10:18:36 +00:00
polyfractal 3504608a1e [TEST] more robust regex for tests missing subdomains
Original commit: elastic/x-pack-elasticsearch@28e5d14c22
2017-02-01 11:04:55 -05:00
Zachary Tong a11ddd1e04 Integrate domainSplit function into datafeeds (elastic/elasticsearch#841)
If `domainSplit(` is detected in an inline script, the function and params are injected into
the script.

The majority of this PR is actually test-related.  Adds a unit test to check for the injected
script/params.  Also adds another QA test which -- through a very round-about mechanism --
confirms that the injected script compiles and functions correctly.  The QA test can
be simplified greatly once the Preview API is added.

Original commit: elastic/x-pack-elasticsearch@c7c35a982c
2017-02-01 10:20:00 -05:00
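The detection-and-injection step described in this commit could be sketched as follows. This is a hedged illustration only: the class `DomainSplitInjector`, the placeholder `FUNCTION_SOURCE`, and the way parameters are merged are assumptions, and the real function body and parameters are not shown in the commit message.

```java
import java.util.Map;

public class DomainSplitInjector {
    // Marker the commit message says is looked for in inline scripts.
    static final String MARKER = "domainSplit(";

    // Placeholder only: the actual function source is not part of this log.
    static final String FUNCTION_SOURCE = "/* domainSplit function definition */\n";

    /**
     * If the script calls domainSplit(, prepends the function source and adds the
     * supporting parameters to scriptParams; otherwise leaves both untouched.
     */
    public static String inject(String script, Map<String, Object> scriptParams, Map<String, Object> domainSplitParams) {
        if (!script.contains(MARKER)) {
            return script;
        }
        scriptParams.putAll(domainSplitParams);
        return FUNCTION_SOURCE + script;
    }
}
```

A plain substring check keeps the sketch short; it would also match the marker inside a string literal or comment, which a real implementation may or may not want.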
Martijn van Groningen ec902c4dc3 test: change assertion to be more lenient to platform specifics
Original commit: elastic/x-pack-elasticsearch@2131e7f0c7
2017-02-01 14:49:45 +01:00
Martijn van Groningen 051d8d8fdf Moved start and stop datafeeder apis over to the persistent task infrastructure
Original commit: elastic/x-pack-elasticsearch@8e15578fb7
2017-01-31 22:50:00 +01:00
Martijn van Groningen 22282e9d56 cleanup toString() methods
Original commit: elastic/x-pack-elasticsearch@17a10ea68f
2017-01-31 22:35:54 +01:00
Martijn van Groningen ce6dc4a506 Make job stats api task aware.
This will allow the job stats api to redirect the request to the node where the job is running.

Original commit: elastic/x-pack-elasticsearch@9f1d12dfcb
2017-01-31 22:35:54 +01:00
Martijn van Groningen b07e9bbd07 Fixed AOBE caused by fetching model state when opening a job.
This error only occurred for jobs that have been opened before and persisted model state.

Closes elastic/elasticsearch#836

Original commit: elastic/x-pack-elasticsearch@ad76f4167f
2017-01-31 19:56:24 +01:00
David Roberts 7ff3b707a8 Make renormalization thread-safe (elastic/elasticsearch#840)
Each ScoresUpdater needs its own JobRenormalizedResultsPersister, because
each JobRenormalizedResultsPersister has a single BulkRequest that various
methods update.

Fixes elastic/elasticsearch#838

Original commit: elastic/x-pack-elasticsearch@90f4bbd5a0
2017-01-31 16:51:26 +00:00
Colin Goodheart-Smithe f804bb1917 Removes ensureGreen from PlainlessDomainSplitIT (elastic/elasticsearch#839)
This shouldn’t be needed as the cluster no longer goes red when an index is created.

Original commit: elastic/x-pack-elasticsearch@b554ea9caf
2017-01-31 16:09:25 +00:00