OpenSearch

Commit Graph

Author	SHA1	Message	Date
Przemysław Witek	40d3c60d7a	Make testDatafeedTimingStats_DatafeedJobIdUpdated test easier to debug (#44206 ) (#44268 )	2019-07-12 13:52:26 +02:00
Przemysław Witek	44781e415e	[7.x] [ML] Add DatafeedTimingStats to datafeed GetDatafeedStatsAction.Response (#43045 ) (#44118 )	2019-07-10 11:51:44 +02:00
Alpar Torok	1b6109517a	Mute failing test Tracking in #43960	2019-07-04 12:13:02 +03:00
Dimitris Athanasiou	8f49d01113	[7.x][ML] Rename df-analytics `_id_copy` to `ml__id_copy` (#43754 ) (#43783 ) Renames `_id_copy` to `ml__id_copy` as field names starting with underscore are deprecated. The new field name `ml__id_copy` was chosen as an obscure enough field that users won't have in their data. Otherwise, this field is only intended to be used by df-analytics.	2019-06-30 19:37:00 +03:00
David Roberts	b599c68d23	[ML] Assert that a no-op job creates no results nor state (#43681 ) If a job is opened and then closed and does nothing in between then it should not persist any results or state documents. This change adapts the no-op job test to assert no results in addition to no state, and to log any documents that cause this assertion to fail. Relates elastic/ml-cpp#512 Relates #43680	2019-06-29 14:57:49 +01:00
Dimitris Athanasiou	cab879118d	[7.x][ML] Support multiple source indices for df-analytics (#43702 ) (#43731 ) This commit adds support for multiple source indices. In order to deal with multiple indices having different mappings, it attempts a best-effort approach to merge the mappings assuming there are no conflicts. If conflicts exist, an error will be returned. To allow users to create custom mappings for special use cases, the destination index is now allowed to exist before the analytics job runs. In addition, settings are no longer copied except for `index.number_of_shards` and `index.number_of_replicas`.	2019-06-28 13:28:03 +03:00
Dimitris Athanasiou	126c2fd2d5	[7.x][ML] Machine learning data frame analytics (#43544 ) (#43592 ) This merges the initial work that adds a framework for performing machine learning analytics on data frames. The feature is currently experimental and requires a platinum license. Note that the original commits can be found in the `feature-ml-data-frame-analytics` branch. A new set of APIs is added which allows the creation of data frame analytics jobs. Configuration allows specifying different types of analysis to be performed on a data frame. At first there is support for outlier detection. The APIs are: - PUT _ml/data_frame/analysis/{id} - GET _ml/data_frame/analysis/{id} - GET _ml/data_frame/analysis/{id}/_stats - POST _ml/data_frame/analysis/{id}/_start - POST _ml/data_frame/analysis/{id}/_stop - DELETE _ml/data_frame/analysis/{id} When a data frame analytics job is started a persistent task is created and started. The main steps of the task are: 1. reindex the source index into the dest index 2. analyze the data through the data_frame_analyzer c++ process 3. merge the results of the process back into the destination index In addition, an evaluation API is added which packages commonly used metrics that provide evaluation of various analyses: - POST _ml/data_frame/_evaluate	2019-06-25 20:29:11 +03:00
Jason Tedor	fa09113080	Remove trace logging for ML native multi-node tests This trace logging looks like it was copy/pasted from another test, where the logging in that test was only added to investigate a test failure. This commit removes the trace logging.	2019-06-18 22:28:27 -04:00
Alpar Torok	94930d0e84	Testclusters: convert ml qa tests (#43229 ) * Testclusters: convert ml qa tests This PR converts the ML tests to use testclusters.	2019-06-18 11:55:11 +03:00
Benjamin Trent	79052050bf	[ML] Adding support for geo_shape, geo_centroid, geo_point in datafeeds (#42969 ) (#43069 ) * [ML] Adding support for geo_shape, geo_centroid, geo_point in datafeeds * only supporting doc_values for geo_point fields * moving validation into GeoPointField ctor	2019-06-10 21:52:53 -05:00
David Turner	68339f90e9	Mute AutodetectMemoryLimitIT#testTooManyPartitions Relates #43013	2019-06-10 09:20:36 +01:00
Mark Vieira	e44b8b1e2e	[Backport] Remove dependency substitutions 7.x (#42866 ) * Remove unnecessary usage of Gradle dependency substitution rules (#42773) (cherry picked from commit 12d583dbf6f7d44f00aa365e34fc7e937c3c61f7)	2019-06-04 13:50:23 -07:00
Ed Savage	d97f4d5e28	[ML][TEST] Fix limits in AutodetectMemoryLimitIT (#42279 ) Re-enable muted tests and accommodate recent backend changes that result in higher memory usage being reported for a job at the start of its life-cycle	2019-05-21 18:44:47 +01:00
Zachary Tong	6ae6f57d39	[7.x Backport] Force selection of calendar or fixed intervals (#41906 ) The date_histogram accepts an interval which can be either a calendar interval (DST-aware, leap seconds, arbitrary length of months, etc) or fixed interval (strict multiples of SI units). Unfortunately this is inferred by first trying to parse as a calendar interval, then falling back to fixed if that fails. This leads to a confusing arrangement where `1d` == calendar, but `2d` == fixed. And if you want a day of fixed time, you have to specify `24h` (i.e. the next smallest unit). This arrangement is very error-prone for users. This PR adds `calendar_interval` and `fixed_interval` parameters to any code that uses intervals (date_histogram, rollup, composite, datafeed, etc). Calendar only accepts calendar intervals, fixed accepts any combination of units (meaning `1d` can be used to specify `24h` in fixed time), and both are mutually exclusive. The old interval behavior is deprecated and will throw a deprecation warning. It is also mutually exclusive with the two new parameters. In the future the old dual-purpose interval will be removed. The change applies to both REST and java clients.	2019-05-20 12:07:29 -04:00
Ed Savage	840af87a74	[ML] Temporarily muting failing tests Muting a number of AutoDetectMemoryLimitIT tests to give CI a chance to settle before easing in required backend changes. relates elastic/ml-cpp#486 relates #42086	2019-05-19 08:29:50 -04:00
Benjamin Trent	bf5a40c754	[ML] relax set upgrade mode test to match what is guaranteed (#41958 ) (#41979 ) * [ML] relax set upgrade mode test to match what is guaranteed * removing unused import	2019-05-09 14:28:50 -05:00
Tom Veasey	b3f4533e1c	[ML] Update for model selection change and disable temporarily (#41482 ) (#41682 )	2019-04-30 15:47:54 -05:00
Ed Savage	c20ea9a2dd	[ML][TEST] Fix failing test testPersistJobOnGracefulShutdown_givenTimeAdvancedAfterNoNewData (#40363 ) Ensure that there is at least a 1s delay between the time that state is persisted by each of the two jobs in the test. Model snapshot IDs use the current time in epoch seconds to distinguish themselves, hence a snapshot will be overwritten by another if both occur in the same 1s window. Closes #40347	2019-03-25 17:55:10 +00:00
David Turner	1265a15b75	Mute testPersistJobOnGracefulShutdown_givenTimeAdvancedAfterNoNewData	2019-03-22 08:46:51 +00:00
Ed Savage	23d5f7babf	[ML] Add integration tests to check persistence (#40272 ) (#40315 ) Additional checks to exercise the behaviour of persistence on graceful close of an anomaly job. Related to elastic/ml-cpp#393 Backports #40272	2019-03-21 17:01:10 +00:00
Benjamin Trent	2016e23285	[ML] Refactor common utils out of ML plugin to XPack.Core (#39976 ) (#40009 ) * [ML] Refactor common utils out of ML plugin to XPack.Core * implementing GET filters with abstract transport * removing added rest param * adjusting how defaults can be supplied	2019-03-13 17:08:43 -05:00
Dimitris Athanasiou	79e414df86	[ML] Fix datafeed skipping first bucket after lookback when aggs are … (#39859 ) (#39958 ) The problem here was that `DatafeedJob` was updating the last end time searched based on the `now` even though when there are aggregations, the extractor will only search up to the floor of `now` against the histogram interval. This commit fixes the issue by using the end time as calculated by the extractor. It also adds an integration test that uses aggregations. This test would fail before this fix. Unfortunately the test is slow as we need to wait for the datafeed to work in real time. Closes #39842	2019-03-13 09:09:07 +02:00
Benjamin Trent	4da04616c9	[ML] refactoring lazy query and agg parsing (#39776 ) (#39881 ) * [ML] refactoring lazy query and agg parsing * Clean up and addressing PR comments * removing unnecessary try/catch block * removing bad call to logger * removing unused import * fixing bwc test failure due to serialization and config migrator test * fixing style issues * Adjusting DatafeedUpdate class serialization * Adding todo for refactor in v8 * Making query non-optional so it does not write a boolean byte	2019-03-10 14:54:02 -05:00
Dimitris Athanasiou	5c023770d2	[ML] Disable security audit trail in native integ tests suite (#39683 ) Investigating how to make DeleteExpiredDataIT faster, it was revealed that the security audit trail threads were quite hot. Disabling that seems to be helping quite a bit with making this test faster. This commit also unmutes the test to see how it goes with the audit trail disabled. Relates #39658 Closes #39575	2019-03-05 12:43:15 +02:00
David Kyle	a58145f9e6	[ML] Transition to typeless (mapping) APIs (#39573 ) ML has historically used doc as the single mapping type but reindex in 7.x will change the mapping to _doc. Switching to the typeless APIs handles case where the mapping type is either doc or _doc. This change removes deprecated typed usages.	2019-03-04 13:52:05 +00:00
David Roberts	085ff38122	Mute DeleteExpiredDataIT.testDeleteExpiredData Due to https://github.com/elastic/elasticsearch/issues/39575	2019-03-03 18:34:30 +00:00
Dimitris Athanasiou	8843832039	[ML] Shave off DeleteExpiredDataIT runtime (#39557 ) This commit parallelizes some parts of the test and removes an unnecessary refresh call. On my local machine it shaves off about 15 seconds for a test execution time of ~64s (down from ~80s). This test is still slow but progress over perfection. Relates #37339	2019-03-01 19:10:00 +02:00
Dimitris Athanasiou	8122650a55	[ML] Add integration test for interim results after advancing bucket (#39447 ) This is an integration test that captures the issue described in elastic/ml-cpp#324	2019-02-28 11:12:08 +02:00
Mehran Koushkebaghi	1d0097b5e8	[ML] Refactoring scheduled event to store instant instead of zoned time zone (#39380 ) The ScheduledEvent class has never preserved the time zone so it makes more sense for it to store the start and end time using Instant rather than ZonedDateTime. Closes #38620	2019-02-27 09:27:04 +00:00
David Roberts	4f2bd238d2	[ML] Increase datafeed integration test timeout for slow machines (#39311 ) The assertBusy() that waits the default 10 seconds for a datafeed to complete very occasionally times out on slow machines. This commit increases the timeout to 60 seconds. It will almost never actually take this long, but it's better to have a timeout that will prevent time being wasted looking at spurious test failures.	2019-02-22 15:35:32 +00:00
Dimitris Athanasiou	1c6818fe74	[ML] Improve DeleteExpiredDataIT failure message (#39298 ) (#39310 ) This test failed once in a very long time with the assertion that there is no document for the `non_existing_job` in the state index. I could not see how that is possible and I cannot reproduce it. With this commit the failure message will reveal some examples of the left-behind docs, which might shed light on what could go wrong.	2019-02-22 16:15:11 +02:00
Benjamin Trent	109b6451fd	ML refactor DatafeedsConfig(Update) so defaults are not populated in queries or aggs (#38822 ) (#39119 ) * ML refactor DatafeedsConfig(Update) so defaults are not populated in queries or aggs * Addressing pr feedback	2019-02-19 12:45:56 -06:00
Dimitris Athanasiou	21f76aba28	[ML] Extract base class for integ tests with native processes (#38850 ) (#38860 )	2019-02-14 12:15:00 +02:00
Benjamin Trent	7e4c0e6991	ML: Adds set_upgrade_mode API endpoint (#37837 ) * ML: Add MlMetadata.upgrade_mode and API * Adding tests * Adding wait conditionals for the upgrade_mode call to return * Adding tests * adjusting format and tests * Adjusting wait conditions for api return and msgs * adjusting doc tests * adding upgrade mode tests to black list	2019-01-28 09:07:30 -06:00
David Kyle	c0409fb9f0	[ML] Marginal gains in slow multi node QA tests (#37825 ) Move 2 tests that are simple REST tests out of the QA suite and cut the number of post data calls in ForecastIT	2019-01-28 10:00:59 +00:00
David Roberts	57d321ed5f	[ML] Tighten up use of aliases rather than concrete indices (#37874 ) We have read and write aliases for the ML results indices. However, the job still had methods that purported to reliably return the name of the concrete results index being used by the job. After reindexing prior to upgrade to 7.x this will be wrong, so the method has been renamed and the comments made more explicit to say the returned index name may not be the actual concrete index name for the lifetime of the job. Additionally, the selection of indices when deleting the job has been changed so that it works regardless of concrete index names. All these changes are nice-to-have for 6.7 and 7.0, but will become critical if we add rolling results indices in the 7.x release stream as 6.7 and 7.0 nodes may have to operate in a mixed version cluster that includes a version that can roll results indices.	2019-01-28 09:38:46 +00:00
Benjamin Trent	9e932f4869	ML: removing unnecessary upgrade code (#37879 )	2019-01-25 13:57:41 -06:00
Christoph Büscher	b4b4cd6ebd	Clean codebase from empty statements (#37822 ) * Remove empty statements There are a couple of instances of undocumented empty statements all across the code base. While they are mostly harmless, they make the code hard to read and are potentially error-prone. Removing most of these instances and marking blocks that look empty by intention as such. * Change test, slightly more verbose but less confusing	2019-01-25 14:23:02 +01:00
Lee Hinman	427bc7f940	Use ILM for Watcher history deletion (#37443 ) * Use ILM for Watcher history deletion This commit adds an index lifecycle policy for the `.watch-history-*` indices. This policy is automatically used for all new watch history indices. This does not yet remove the automatic cleanup that the monitoring plugin does for the .watch-history indices, and it does not touch the `xpack.watcher.history.cleaner_service.enabled` setting. Relates to #32041	2019-01-23 10:18:08 -07:00
Benjamin Trent	12cdf1cba4	ML: Add support for single bucket aggs in Datafeeds (#37544 ) Single bucket aggs are now supported in datafeed aggregation configurations.	2019-01-18 15:08:53 -06:00
Benjamin Trent	5384162a42	ML: creating ML State write alias and pointing writes there (#37483 ) * ML: creating ML State write alias and pointing writes there * Moving alias check to openJob method * adjusting concrete index lookup for ml-state	2019-01-18 14:32:34 -06:00
Julie Tibshirani	36a3b84fc9	Update the default for include_type_name to false. (#37285 ) * Default include_type_name to false for get and put mappings. * Default include_type_name to false for get field mappings. * Add a constant for the default include_type_name value. * Default include_type_name to false for get and put index templates. * Default include_type_name to false for create index. * Update create index calls in REST documentation to use include_type_name=true. * Some minor clean-ups around the get index API. * In REST tests, use include_type_name=true by default for index creation. * Make sure to use 'expression == false'. * Clarify the different IndexTemplateMetaData toXContent methods. * Fix FullClusterRestartIT#testSnapshotRestore. * Fix the ml_anomalies_default_mappings test. * Fix GetFieldMappingsResponseTests and GetIndexTemplateResponseTests. We make sure to specify include_type_name=true during xContent parsing, so we continue to test the legacy typed responses. XContent generation for the typeless responses is currently only covered by REST tests, but we will be adding unit test coverage for these as we implement each typeless API in the Java HLRC. This commit also refactors GetMappingsResponse to follow the same approach as the other mappings-related responses, where we read include_type_name out of the xContent params, instead of creating a second toXContent method. This gives better consistency in the response parsing code. * Fix more REST tests. * Improve some wording in the create index documentation. * Add a note about types removal in the create index docs. * Fix SmokeTestMonitoringWithSecurityIT#testHTTPExporterWithSSL. * Make sure to mention include_type_name in the REST docs for affected APIs. * Make sure to use 'expression == false' in FullClusterRestartIT. * Mention include_type_name in the REST templates docs.	2019-01-14 13:08:01 -08:00
Benjamin Trent	19a7e0f4eb	ML: update .ml-state actions to support > 1 index (#37307 ) * ML: Updating .ml-state calls to be able to support > 1 index * Matching bulk delete behavior with dbq * Adjusting state name * refreshing indices before search * fixing line length * adjusting index expansion options	2019-01-11 08:03:41 -06:00
markharwood	434430506b	Type removal - added deprecation warnings to _bulk apis (#36549 ) Added warnings checks to existing tests Added “defaultTypeIfNull” to DocWriteRequest interface so that Bulk requests can override a null choice of document type with any global custom choice. Related to #35190	2019-01-10 21:35:19 +00:00
Benjamin Trent	df3b58cb04	ML: add migrate anomalies assistant (#36643 ) * ML: add migrate anomalies assistant * adjusting failure handling for reindex * Fixing request and tests * Adding tests to blacklist * adjusting test * test fix: posting data directly to the job instead of relying on datafeed * adjusting API usage * adding Todos and adjusting endpoint * Adding types to reindexRequest * removing unreliable "live" data test * adding index refresh to test * adding index refresh to test * adding index refresh to yaml test * fixing bad exists call * removing todo * Addressing remove comments * Adjusting rest endpoint name * making service have its own logger * adjusting validity check for newindex names * fixing typos * fixing renaming	2019-01-09 14:25:35 -06:00
Benjamin Trent	1780ced82d	ML: changing JobResultsProvider.getForecastRequestStats to support > 1 index (#37157 ) * ML: changing JobResultsProvider.getForecastRequestStats to support more than one index * moving to use idsQuery()	2019-01-07 10:58:55 -06:00
Armin Braun	31c33fdb9b	MINOR: Remove some Deadcode in Gradle (#37160 )	2019-01-07 09:21:25 +01:00
Dimitris Athanasiou	0fd27d4d6f	[ML] Unused state remover should also account for jobs in index (#37119 ) The unused state remover was never adjusted to account for jobs stored in the config index. The result was that when triggered it removed state for all jobs stored in the config index. This commit fixes the issue. Closes #37109	2019-01-04 12:43:44 +02:00
Dimitris Athanasiou	586453fef1	[ML] Remove types from datafeed (#36538 ) Closes #34265	2019-01-04 09:43:44 +02:00
David Roberts	13649aa70a	[TEST] Revert "Mute ForecastIT.testSingleSeries" (#37110 ) The problem that caused the test to be muted was fixed in https://github.com/elastic/ml-cpp/pull/332 Closes #36258	2019-01-03 16:23:18 +00:00
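
The datafeed fix in 79e414df86 turns on one piece of arithmetic: with aggregations, the extractor searches only up to `now` floored to the histogram interval, so the job should record that floored value rather than `now` itself. Below is a minimal Python sketch of that idea. The function names `floor_to_interval` and `recorded_end_time` are hypothetical and chosen for illustration only; this is not the actual `DatafeedJob` code.

```python
def floor_to_interval(timestamp_ms: int, interval_ms: int) -> int:
    """Round an epoch-millisecond timestamp down to a multiple of interval_ms."""
    if interval_ms <= 0:
        raise ValueError("interval_ms must be positive")
    return timestamp_ms - (timestamp_ms % interval_ms)


def recorded_end_time(now_ms: int, interval_ms: int, use_extractor_end: bool) -> int:
    """Return the 'last end time searched' that the job would remember.

    use_extractor_end=False models the behaviour described as the bug
    (remember `now`); True models the fix (remember the end time the
    extractor actually searched up to).
    """
    if use_extractor_end:
        return floor_to_interval(now_ms, interval_ms)
    return now_ms


if __name__ == "__main__":
    now_ms, interval_ms = 10_500, 1_000
    searched_up_to = floor_to_interval(now_ms, interval_ms)
    buggy = recorded_end_time(now_ms, interval_ms, use_extractor_end=False)
    fixed = recorded_end_time(now_ms, interval_ms, use_extractor_end=True)
    # With the buggy value, the next search would start past data that was
    # never actually searched (the gap between searched_up_to and buggy).
    print("extractor searched up to:", searched_up_to)
    print("unsearched gap with bug: ", buggy - searched_up_to)
    print("unsearched gap with fix: ", fixed - searched_up_to)
```

When the recorded end time is ahead of what the extractor really covered, the next real-time search starts too late, which matches the "skipping first bucket after lookback" symptom named in the commit title.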

74 Commits