OpenSearch

mirror of https://github.com/honeymoose/OpenSearch.git synced 2025-02-06 13:08:29 +00:00

Author	SHA1	Message	Date
Martijn van Groningen	1b65366478	Simplified AutodetectProcess interface: * Removed getPersistStream() method from this interface and let the NativeAutodetectProcess implementation deal with this. The persist stream is an implementation detail and BlackHoleAutodetectProcess doesn't deal with this too. * Replaced getProcessOutStream() method with readAutodetectResults() method. This method now returns a `Iterator<AutodetectResult>` instead of an inputstream. This makes the BlackHoleAutodetectProcess and future mocked implementations easier. Original commit: elastic/x-pack-elasticsearch@086e7b40ab	2017-02-03 16:52:51 +01:00
Dimitris Athanasiou	9d9572e2b2	Reintroduce chunking to improve data extractor performance (elastic/elasticsearch#849 ) * Reintroduce chunking to improve data extractor performance Performing a sorted search/scroll over a period of time that matches a lot of documents is very expensive because for each page all documents are traversed. The solution is to chunk the search time and perform separate search/scrolls for each chunk. This commit is introducing a new `chunk` config in `datafeed_config` whose mode can be set to either of AUTO, OFF, MANUAL, with the latter allowing to specify an explicit chunk size. When set to AUTO, a heuristic is used in order to determine the chunk size. The heuristic is based on estimating the time interval within which we expect `scroll_size` documents and then taking the 10x multiple of that. Based on benchmarking, this method gives a dramatic performance increase. For example, for the citizens dataset it improved the ingest rate from 0.33M docs / minute to 13.6M docs / minute. Farequote is now done in ~1 second. Finally, note that when `chunk` is not specified, it defaults to AUTO when aggregations are not set and to OFF otherwise. This is because the chunk size heuristic does not lend itself great for aggregations where one needs to chunk based on the cardinality of buckets rather than simply time. Relates to elastic/elasticsearch#734 Original commit: elastic/x-pack-elasticsearch@a738e86d21	2017-02-03 15:50:01 +00:00
David Kyle	21adb19b22	Checkstyle fix Original commit: elastic/x-pack-elasticsearch@1d0eaed282	2017-02-03 15:35:36 +00:00
Martijn van Groningen	a7d95951a6	Removed forgotten blocking call when opening a job. Original commit: elastic/x-pack-elasticsearch@e1dfa54240	2017-02-03 16:24:43 +01:00
Dimitris Athanasiou	9b0344cd90	Write enum values in lowercase (elastic/elasticsearch#861 ) Original commit: elastic/x-pack-elasticsearch@6788ad3304	2017-02-03 15:10:11 +00:00
David Kyle	e7dcab48ab	Test was testing the wrong endpoint Original commit: elastic/x-pack-elasticsearch@ca7a1a1097	2017-02-03 14:50:08 +00:00
David Kyle	70b8129b78	Add job update endpoint (elastic/elasticsearch#854 ) * Remove redundant code * Add job update endpoint * Support updating detector description & rules * Fix merge conflicts * Use toStrings and fix race condition in update * Revert to using xpack.ml.support.AbstractSerializingTestCase Original commit: elastic/x-pack-elasticsearch@771ada0572	2017-02-03 14:22:36 +00:00
Dimitrios Athanasiou	2883b00b7c	Also rename some StatusTests to StateTests Original commit: elastic/x-pack-elasticsearch@6e1d3e2bba	2017-02-03 11:08:02 +00:00
Dimitris Athanasiou	ca4badeb46	Rename {Job|Datafeed}Status to {Job|Datafeed}State (elastic/elasticsearch#856 ) This is more consistent with elasticsearch where an index has state [open, close], etc. Original commit: elastic/x-pack-elasticsearch@30bf720c3e	2017-02-03 10:43:05 +00:00
David Kyle	b940dbf6d9	Remove the overwrite option from PUT job (elastic/elasticsearch#855 ) Original commit: elastic/x-pack-elasticsearch@0f7e0d35a9	2017-02-03 09:54:47 +00:00
Igor Motov	53a5e19c70	Add support for task status on persistent tasks Similarly to task status on normal tasks it's now possible to update task status on the persistent tasks. This should allow updating the state of the running tasks (such as loading, started, etc) as well as store intermediate state or progress. Original commit: elastic/x-pack-elasticsearch@ed109cfa84	2017-02-02 13:35:37 -05:00
Martijn van Groningen	06d688eb74	AutodetectProcessManager#getStatistics(...) can now just return stats for a single job, as the _all expansion is done on the transport layer Original commit: elastic/x-pack-elasticsearch@02d5272a4e	2017-02-02 13:11:39 +01:00
Dimitris Athanasiou	5ba9a6cfcc	Clear scroll after it is complete (elastic/elasticsearch#847 ) The ScrollDataExtractor needs to clear the scroll after it is complete. Originally, it was thought that completing a scroll leads to an automatic clearing of its context. That is not true, thus manual clearing has to be requested. - Also removes sorting in AggregationDataExtractor as it was redundant Original commit: elastic/x-pack-elasticsearch@8f955da8ce	2017-02-02 10:18:36 +00:00
Zachary Tong	a11ddd1e04	Integrate domainSplit function into datafeeds (elastic/elasticsearch#841 ) If `domainSplit(` is detected in an inline script, the function and params are injected into the script. The majority of this PR is actually test-related. Adds a unit test to check for the injected script/params. Also adds another QA test which -- through a very round-about mechanism -- confirms that the injected script compiles and functions correctly. The QA test can be simplified greatly once the Preview API is added. Original commit: elastic/x-pack-elasticsearch@c7c35a982c	2017-02-01 10:20:00 -05:00
Martijn van Groningen	ec902c4dc3	test: change assertion to be more lenient to platform specifics Original commit: elastic/x-pack-elasticsearch@2131e7f0c7	2017-02-01 14:49:45 +01:00
Martijn van Groningen	051d8d8fdf	Moved start and stop datafeeder apis over the persistent task infrastructure Original commit: elastic/x-pack-elasticsearch@8e15578fb7	2017-01-31 22:50:00 +01:00
Martijn van Groningen	22282e9d56	cleanup toString() methods Original commit: elastic/x-pack-elasticsearch@17a10ea68f	2017-01-31 22:35:54 +01:00
Martijn van Groningen	ce6dc4a506	Make job stats api task aware. This will allow the job stats api to redirect the request to node where job is running. Original commit: elastic/x-pack-elasticsearch@9f1d12dfcb	2017-01-31 22:35:54 +01:00
Martijn van Groningen	b07e9bbd07	Fixed AOBE caused by fetching model state when opening a job. This error only occurred for jobs that have been opened before and persisted model state. Closes elastic/elasticsearch#836 Original commit: elastic/x-pack-elasticsearch@ad76f4167f	2017-01-31 19:56:24 +01:00
David Roberts	7ff3b707a8	Make renormalization thread-safe (elastic/elasticsearch#840 ) Each ScoresUpdater needs its own JobRenormalizedResultsPersister, because each JobRenormalizedResultsPersister has a single BulkRequest that various methods update. Fixes elastic/elasticsearch#838 Original commit: elastic/x-pack-elasticsearch@90f4bbd5a0	2017-01-31 16:51:26 +00:00
David Kyle	97970b94cd	Remove the ignoreDowntime parameter from the _data endpoint (elastic/elasticsearch#834 ) The parameter only applies when a job is opened Original commit: elastic/x-pack-elasticsearch@37b902aa2a	2017-01-31 11:48:58 +00:00
David Roberts	34274a30ed	Make transport layer names consistent with corresponding endpoints (elastic/elasticsearch#822 ) Closes elastic/elasticsearch#630 Original commit: elastic/x-pack-elasticsearch@32aae3e1d9	2017-01-31 11:42:06 +00:00
David Kyle	c84f227857	Set memory usage log message to trace (elastic/elasticsearch#829 ) Original commit: elastic/x-pack-elasticsearch@13412cc4cf	2017-01-31 09:54:56 +00:00
David Roberts	ab957b6d91	Adjust validation endpoints (elastic/elasticsearch#812 ) Changes are: 1. The detector validation endpoint is changed from /_xpack/ml/_validate/detector to /_xpack/ml/anomaly_detectors/_validate/detector 2. A new endpoint is added for validating an entire job config: /_xpack/ml/anomaly_detectors/_validate Relates elastic/elasticsearch#630 Original commit: elastic/x-pack-elasticsearch@7b2031e746	2017-01-30 17:10:22 +00:00
David Kyle	4eab74ce29	Store input fields for anomaly records and influencers (elastic/elasticsearch#799 ) * Store input fields for anomaly records and influencers * Address review comments * Remove DotNotationReverser * Remove duplicated constants * Can’t use the same date for all records as they will have equivalent Ids Original commit: elastic/x-pack-elasticsearch@40796b5efc	2017-01-30 14:05:18 +00:00
Colin Goodheart-Smithe	79d1a10a86	Mutes DataFeedJobIT test method that uses painless This needs to be moved to the single-node-tests qa modules since integTests shouldn’t access modules. Original commit: elastic/x-pack-elasticsearch@289b697eb8	2017-01-30 10:55:59 +00:00
Colin Goodheart-Smithe	618cb2a1a0	Make all action names cluster actions Original commit: elastic/x-pack-elasticsearch@815d8f0aac	2017-01-30 10:00:23 +00:00
Igor Motov	827118e154	Adds support for persistent actions A persistent action is a transport-like action that is using the cluster state instead of transport to start tasks. This allows persistent tasks to survive restart of executing nodes. A persistent action can be implemented by extending TransportPersistentAction. TransportPersistentAction will start the task by using PersistentActionService, which controls persistent tasks lifecycle. See TestPersistentActionPlugin for an example implementing a persistent action. Original commit: elastic/x-pack-elasticsearch@8ef4103cd6	2017-01-27 11:20:54 -05:00
Martijn van Groningen	ff65c38253	[TEST] fixed mocking logic to include id Original commit: elastic/x-pack-elasticsearch@7b20e92fdc	2017-01-27 17:20:18 +01:00
Martijn van Groningen	2059b91620	Workaround for index request without an id being retried that are tripping an assertion in internal engine. (2) Original commit: elastic/x-pack-elasticsearch@22d5060deb	2017-01-27 17:07:51 +01:00
Martijn van Groningen	ad4218320c	Workaround for index request without an id being retried that are tripping an assertion in internal engine. Original commit: elastic/x-pack-elasticsearch@ba44acc28b	2017-01-27 16:15:00 +01:00
David Roberts	64fdb039ab	Reduce the controller connect timeout (elastic/elasticsearch#804 ) This used to be 60 seconds, dating back to the days when the controller had to be started manually after starting Elasticsearch. However, now Elasticsearch starts it automatically it should already be running when we try to connect, so the timeout can be much lower. It just needs to be long enough to give the C++ process time to create its named pipes. 2 seconds seems reasonable, and matches what we use for autodetect and normalize. Original commit: elastic/x-pack-elasticsearch@7300d68482	2017-01-27 14:23:18 +00:00
Zachary Tong	9395ef81b1	Painless DomainSplit tests in new Single-Node QA Module (elastic/elasticsearch#787 ) This contains the Painless-based DomainSplit function, generated static maps and basic tests. Due to cross-module complications, the tests are run by executing searches with script_fields and checking the response Original commit: elastic/x-pack-elasticsearch@c6c2942e01	2017-01-27 08:52:48 -05:00
Dimitris Athanasiou	91be1e719d	Disable stored_fields when possible in ScrollDataExtractor (elastic/elasticsearch#801 ) When source fields are not required, stored_fields can be disabled. This can make the query faster as no stored fields have to be decompressed. Note that this means no metadata (_id, _index, _type, etc.) will be returned. Original commit: elastic/x-pack-elasticsearch@b1ea526d83	2017-01-27 11:38:54 +00:00
Dimitris Athanasiou	5790a6f152	Handle shard failures in extractors (elastic/elasticsearch#794 ) Even though a search response may return a 200 status code, things could still have gone wrong. A search response may report shard failures. The datafeed extractors should check for that and report an extraction error accordingly. Closes elastic/elasticsearch#775 Original commit: elastic/x-pack-elasticsearch@5d6d899738	2017-01-26 16:01:43 +00:00
David Kyle	efc47c2a6f	Remove Usage classes (elastic/elasticsearch#796 ) * Delete usage class * Delete usage reporter * Remove unused constant Original commit: elastic/x-pack-elasticsearch@c7a6c457bd	2017-01-26 11:50:08 +00:00
David Kyle	db14d89358	Fix checkstyle Original commit: elastic/x-pack-elasticsearch@05d59da705	2017-01-26 10:05:03 +00:00
David Kyle	e3bb7cfea3	Split ml-int index into .ml-audit and .ml-meta (elastic/elasticsearch#752 ) * Audit messages in .ml-audit * Rename ml-int to .ml-meta * Remove no release comment * Fix compilation after classes moved to a different package * Create the Audit, state and meta indices every time a job is created * Revert change creating the audit index etc when the job is created * Rename index .ml-audit -> .ml-notifications Original commit: elastic/x-pack-elasticsearch@95168fa341	2017-01-26 09:44:54 +00:00
Martijn van Groningen	3a36f94a4a	When timeout has been reached, check one more time if the job / datafeed status has the expected value. Decreased wait timeout from 30s to 20s Original commit: elastic/x-pack-elasticsearch@b46fb0abe3	2017-01-25 23:32:04 +01:00
Dimitris Athanasiou	86291c12e2	Handle manual aggregations in datafeeds (elastic/elasticsearch#784 ) * Handle manual aggregations in datafeeds Adds a DataExtractor implementation that runs aggregated searches. The manual aggregations supported have the following limitations: - each aggregation can have 0 or 1 sub-aggregations - the top aggregation has to be a histogram - sub-aggregations have to be either terms aggregations or single value metric aggregations. The response is converted into flat JSON documents that contain only the fields of interest and can be parsed without additional context from our JSON parser. The fields in the JSON documents correspond to the names of the aggregations. Closes elastic/elasticsearch#680 Original commit: elastic/x-pack-elasticsearch@7dfd2d31e6	2017-01-25 19:13:03 +00:00
Colin Goodheart-Smithe	716f543f7b	Adds a new constructor to plugin The new constructor takes an Environment object. This is needed for migration to X-Pack since the environment instance is built by the XPackPlugin and then passed into the feature plugins. Original commit: elastic/x-pack-elasticsearch@f25225bc6a	2017-01-25 18:45:04 +00:00
David Roberts	4b366f8ef6	Removing transforms and the SINGLE_LINE input format (elastic/elasticsearch#790 ) Most transforms will be replaced with Painless scripts. The exception is the DateTransform, whose functionality is now simplified to what existed before the other transforms were added. The SINGLE_LINE format relied on transforms to extract fields, so has also been removed, but this is reasonable as it strays into Logstash territory. Relates elastic/elasticsearch#630 Closes elastic/elasticsearch#39 Original commit: elastic/x-pack-elasticsearch@a593d3e0ad	2017-01-25 15:51:50 +00:00
Colin Goodheart-Smithe	603fa47580	Adds an option to disable the ML plugin (elastic/elasticsearch#785 ) Adds an `xpack.ml.enabled` node level setting that can be used to enable and disable the plugin. This will be important when we migrate to X-Pack Closes elastic/elasticsearch#781 Original commit: elastic/x-pack-elasticsearch@e5c4969a96	2017-01-24 16:14:56 +00:00
Martijn van Groningen	e9f899e57a	Improved datafeed logging for stopping Original commit: elastic/x-pack-elasticsearch@94bd5d6a00	2017-01-24 16:00:54 +01:00
Martijn van Groningen	b636a4b829	Fixed timeout (de-)serialization for start and stop datafeeder and open job apis. Original commit: elastic/x-pack-elasticsearch@be054db48c	2017-01-24 15:53:54 +01:00
Martijn van Groningen	8dbaef186e	[TEST] use unique job ids to make debugging log files easier Original commit: elastic/x-pack-elasticsearch@9f04e1b01f	2017-01-24 12:01:21 +01:00
Martijn van Groningen	29451bb7e3	[TEST] select timestamp differently for test documents Original commit: elastic/x-pack-elasticsearch@679273012c	2017-01-24 11:07:24 +01:00
David Roberts	215410e93f	Rename list to filter (elastic/elasticsearch#774 ) Part of the endpoint rename Sophie and Steve agreed with Clint Relates elastic/elasticsearch#630 Original commit: elastic/x-pack-elasticsearch@6ded117849	2017-01-24 10:01:24 +00:00
Martijn van Groningen	a7d1918461	[TEST] added more logging Original commit: elastic/x-pack-elasticsearch@062a0b41b8	2017-01-23 17:54:14 +01:00
David Roberts	cd2332730b	Move the named pipe no bootstrap test to a separate qa module (elastic/elasticsearch#769 ) This matches the way tests that need to run without an Elasticsearch bootstrap are run in core Elasticsearch. This should make merging to x-pack easier. Note that the no bootstrap tests now run after the integration tests, but this doesn't really matter. Original commit: elastic/x-pack-elasticsearch@5547f457b6	2017-01-23 12:08:35 +00:00

329 Commits