OpenSearch

Commit Graph

Author	SHA1	Message	Date
Benjamin Trent	4275a715c9	[ML] adjusting inference processor to support foreach usage (#60915 ) (#61022 ) `foreach` processors store information within the `_ingest` metadata object. This commit adds the contents of the `_ingest` metadata (if it is not empty) and appends new inference results if the result field already exists. This allows a `foreach` to execute and write multiple inference results to the same result field. closes https://github.com/elastic/elasticsearch/issues/60867	2020-08-12 08:34:18 -04:00
markharwood	66098e0bf4	Search fix: query_string regex/wildcard searches not working on wildcard fields (#60959 ) (#61010 ) The Query string parser was not delegating the construction of wildcard/regex queries to the underlying field type. The wildcard field has special data structures and queries that operate on them so cannot rely on the basic regex/wildcard queries that were being used for other fields. Closes #60957	2020-08-12 10:44:52 +01:00
Armin Braun	32423a486d	Simplify and Speed up some Compression Usage (#60953 ) (#61008 ) Use thread-local buffers and deflater and inflater instances to speed up compressing and decompressing from in-memory bytes. Not manually invoking `end()` on these should be safe since their off-heap memory will eventually be reclaimed by the finalizer thread which should not be an issue for thread-locals that are not instantiated at a high frequency. This significantly reduces the amount of byte copying and object creation relative to the previous approach which had to create a fresh temporary buffer (that was then resized multiple times during operations), copied bytes out of that buffer to a freshly allocated `byte[]`, used 4k stream buffers needlessly when working with bytes that are already in arrays (`writeTo` handles efficient writing to the compression logic now) etc. Relates #57284 which should be helped by this change to some degree. Also, I expect this change to speed up mapping/template updates a little as those make heavy use of these code paths.	2020-08-12 11:06:23 +02:00
Andrei Dan	35423a75af	Tests: don't fail if ILM executed the action already (#60916 ) (#60982 ) (cherry picked from commit 8c970ad20f4f55a9c0d6a256aa643ea037281e75) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-08-12 09:04:04 +01:00
Dimitris Athanasiou	2e18c0f2ac	[7.x][ML] Audit force stopping data frame analytics (#60973 ) (#61004 ) Audits a message when a data frame analytics job is force stopped. Backport of #60973	2020-08-12 07:45:26 +03:00
Nhat Nguyen	ceaa28e97b	Increase timeout in testFollowIndexWithConcurrentMappingChanges (#60534 ) The test failed because the leader was taking a lot of CPUs to process many mapping updates. This commit reduces the mapping updates, increases timeout, and adds more debug info. Closes #59832	2020-08-11 17:03:22 -04:00
Nhat Nguyen	bf7eecf1dc	Fix synchronization in ShardFollowNodeTask (#60490 ) The leader mapping, settings, and aliases versions in a shard follow-task are updated without proper synchronization and can go backward.	2020-08-11 14:52:52 -04:00
Francisco Fernández Castaño	d544528c7b	Increase information on assertRecoveryStats assertion (#60960 ) Backport of #60952	2020-08-11 15:30:59 +02:00
Dimitris Athanasiou	6062672148	[7.x][ML] Monitor reindex response in DF analytics (#60911 ) (#60958 ) Examines the reindex response in order to report potential problems that occurred during the reindexing phase of data frame analytics jobs. Backport of #60911	2020-08-11 16:17:37 +03:00
Mark Tozzi	ab8518fb5b	[7.x] Extensibility for Composite Agg #59648 (#60842 )	2020-08-11 09:14:33 -04:00
Dan Hermann	839c6cdfc0	Un-mute data stream REST test (#60120 ) (#60939 )	2020-08-11 08:10:04 -05:00
David Kyle	18a65c5b9a	DFA Get Stats can return multiple responses if more than one error occurs (#60950 ) If the search for get stats with multiple job Ids fails the listener is called for each failure. This change waits for all responses then returns the first error if there was one.	2020-08-11 10:28:05 +01:00
Alan Woodward	54279212cf	Make MetadataFieldMapper extend ParametrizedFieldMapper (#59847 ) (#60924 ) This commit cuts over all metadata field mappers to parametrized format.	2020-08-11 09:02:28 +01:00
Benjamin Trent	66b3e89482	[ML] enable logging for test failures (#60902 ) (#60910 )	2020-08-10 12:36:30 -04:00
Francisco Fernández Castaño	2a4fd8329b	Avoid a race condition while waiting for pre warm to finish on SearchableSnapshotDirectoryTests (#60906 ) Backport of #60885. Closes #60813	2020-08-10 17:29:16 +02:00
Jim Ferenczi	f30f1f04e2	Replace AggregatorTestCase#search with AggregatorTestCase#searchAndReduce (#60816 ) This commit removes the ability to test the top level result of an aggregator before it runs the final reduce. All aggregator tests that use AggregatorTestCase#search are rewritten with AggregatorTestCase#searchAndReduce in order to ensure that we test the final output (the one sent to the end user) rather than an intermediary result that could be different. This change also removes spurious commits triggered on top of a random index writer. These commits slow down the tests and are redundant with the commits that the random index writer performs.	2020-08-10 17:23:00 +02:00
David Roberts	dd02e9f31a	[TEST] Mute SearchableSnapshotActionIT testSearchableSnapshotForceMergesIndexToOneSegment (#60904 ) Due to https://github.com/elastic/elasticsearch/issues/60901	2020-08-10 15:25:39 +01:00
Henning Andersen	a155315ceb	Autoscaling decider and decision service (#59005 ) (#60884 ) Split the autoscaling decider into a service and configuration in order to enable having additional context information available in the service. Added AutoscalingDeciderContext holding generic information all deciders are expected to need. Implemented GET _autoscaling/decision	2020-08-10 15:28:52 +02:00
Andrei Dan	235e5ed3ea	[7.x] ILM: add force-merge step to searchable snapshots action (#60819 ) (#60882 ) This adds a force-merge step to the searchable snapshot action, enabled by default, but parameterizable using the `force_merge_index` optional boolean. e.g. ``` PUT _ilm/policy/my_policy { "policy": { "phases": { "cold": { "actions": { "searchable_snapshot" : { "snapshot_repository" : "backing_repo", "force_merge_index": true } } } } } } ``` (cherry picked from commit d0a17b2d35f1b083b574246bdbf3e1929471a4a9) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-08-10 13:45:11 +01:00
Martijn van Groningen	64bb082f9b	Improve error message for non append-only writes that target data stream (#60874 ) Backport of #60809 to 7.x branch. Closes #60581	2020-08-10 13:18:59 +02:00
David Kyle	6b2ddf4453	Fix typo in DataHistogramGroupByIT name (#60880 ) (#60883 )	2020-08-10 11:55:01 +01:00
David Turner	f168bdac7d	Change transitive -> transient in ILM log message (#60871 ) "Transitive" is technically ok here but it's an overloaded word and it's not immediately clear which meaning is intended so this log message always makes me do a double-take. I think both "transient" and "transitory" are clearer, with "transient" being the usual choice.	2020-08-10 11:37:49 +01:00
David Turner	a2d5bfca2f	Even longer timeout for XPackRestIT (#60812 ) This suite is still occasionally failing with a timeout on macOS. Suggest further increasing this timeout until this suite is broken up. Relates #58071	2020-08-10 10:26:21 +01:00
Benjamin Trent	bc17afc535	[7.x] [ML] have DELETE analytics ignore stats failures and clean up unused stats (#60776 ) (#60784 ) * [ML] have DELETE analytics ignore stats failures and clean up unused stats (#60776) When deleting an analytics configuration, the request MIGHT fail if the .ml-stats index does not exist or is in a strange state (shards unallocated). Instead of making the request fail, we should log that we were unable to delete the stats docs and then have them cleaned up in the 'delete_expire_data' janitorial process	2020-08-06 08:55:35 -04:00
David Turner	05b2a2db8b	AwaitsFix for #60781	2020-08-06 12:28:53 +01:00
David Turner	f24a3a4e81	AwaitsFix for 60781	2020-08-06 11:35:44 +01:00
Hendrik Muhs	b210aaf666	[Transform] remove wrong test (#60807 ) remove test, scripts are excluded in the change collector, the test is a leftover from a previous solution of #57332, which has been discarded relates #60724 fixes #60794	2020-08-06 11:56:19 +02:00
Dimitris Athanasiou	cedbe6968b	[7.x][ML] Include cause in logging during test inference (#60749 ) (#60805 ) When an exception is thrown during test inference we are not including the cause message in our logging. This commit addresses this issue. Backport of #60749	2020-08-06 11:45:59 +03:00
Ryan Ernst	d88098c1d5	Mute flaky transform pivot test see https://github.com/elastic/elasticsearch/issues/60794	2020-08-05 14:53:25 -07:00
Francisco Fernández Castaño	b4044004aa	Add recovery state tracking for Searchable Snapshots (#60751 ) This pull request adds recovery state tracking for Searchable Snapshots. In order to track recoveries for searchable snapshot backed indices, this pull request adds a new type of RecoveryState. This new RecoveryState instance is able to deal with the small differences that arise during Searchable snapshots recoveries. Those differences can be summarized as follows: - The Directory implementation that's provided by SearchableSnapshots marks the snapshot files as reused during recovery. In order to keep track of the recovery process as the cache is pre-warmed, those files shouldn't be marked as reused. - Once the shard is created, the cache starts its pre-warming phase, meaning that we should keep track of those downloads during that process and tie the recovery to this pre-warming phase. The shard is considered recovered once this pre-warming phase has finished. Backport of #60505	2020-08-05 17:41:49 +02:00
Hendrik Muhs	08f94c914b	[Transform] disable optimizations when using scripts in group_by (#60724 ) disable optimizations when using scripts in group_by; when using scripts we cannot predict the outcome and we have no query counterpart. Optimizations for other group_by's are not affected. fixes #57332	2020-08-05 17:27:19 +02:00
Hendrik Muhs	2b6891b584	[7.x][Transform] implement test suite to test continuous transforms (#60725 ) implements a test suite for testing continuous transform with randomization in terms of mappings, index settings, transform configuration. Add a test case for terms and date histogram. The test covers: - continuous mode with several checkpoints created - correctness of results - optimizations (minimal necessary writes) - permutations of features (index settings, aggs, data types, index or data stream)	2020-08-05 16:56:01 +02:00
Albert Zaharovits	e5dce5e805	Use the Index Access Control from the scroll search context (#60640 ) When the RBACEngine authorizes scroll searches it sets the index access control to the very limiting IndicesAccessControl.ALLOW_NO_INDICES value. This change will set it to the value for the index access control that was produced during the authorization of the initial search that created the scroll, which is now stored in the scroll context.	2020-08-05 15:37:37 +03:00
Przemysław Witek	0afa1bd972	Deprecate allow_no_jobs and allow_no_datafeeds in favor of allow_no_match (#60601 ) (#60727 )	2020-08-05 13:39:40 +02:00
Yannick Welsch	9f6f66f156	Fail searchable snapshot shards on invalid license (#60722 ) Implements license degradation behavior for searchable snapshots. Snapshot-backed shards are failed when the license becomes invalid, and shards won't be reallocated. After valid license is put in place again, shards are allocated again.	2020-08-05 13:14:15 +02:00
Adrien Grand	67f6f34c23	Remove dataset.* fields. (#60720 ) These are being replaced by the `data_stream.*` fields.	2020-08-05 11:35:05 +02:00
Rory Hunter	43762f69d1	Move deprecation HTTP tests to deprecation plugin (#60523 ) Backport of #60298. This PR moves the deprecation HTTP tests under the deprecation plugin, as a precursor to adding further tests as part of #58924.	2020-08-05 09:54:34 +01:00
Adrien Grand	602d269059	Rename `datastream` to `data_stream`. (#60714 ) The name of the feature having a space: "data stream", the key should have an underscore.	2020-08-05 09:55:02 +02:00
Russ Cam	e9c0bf1566	Remove body from indices.create_data_stream REST spec (#60705 ) This commit removes the body property from the indices.create_data_stream.json REST API spec as the API does not support sending a body. Update the description of the API to remove that a data stream can be updated with the API - data streams can only be created with this API and attempting to update yields a `resource_already_exists_exception`. Closes #60704 (cherry picked from commit 2cab2e0ee094769852df31566dbe22b5df59d900)	2020-08-05 17:01:28 +10:00
Igor Motov	959690a64a	Refactor extendedBounds to use DoubleBounds (#60556 ) (#60681 ) Refactors extendedBounds to use DoubleBounds instead of 2 variables. This is a follow up for #59175	2020-08-04 16:45:47 -04:00
Francisco Fernández Castaño	b500b3d55a	Decrease restore rate limit value to enforce its usage on SearchableSnapshotsIntegTests#testMaxRestoreBytesPerSecIsUsed (#60650 ) Fixes #59287. Backport of #59592	2020-08-04 17:44:47 +02:00
Alan Woodward	b3ae5d26bd	Move mapper validation to the mappers themselves (#60072 ) (#60649 ) Currently, validation of mappers (checking that cross-references are correct, limits on field name lengths and object depths, multiple definitions, etc) is performed by the MapperService. This means that any mapper-specific validation, for example that done on the CompletionFieldMapper, needs to be called specifically from core server code, and so we can't add validation to mappers that live in plugins. This commit reworks the validation framework so that mapper-specific validation is done on the Mapper itself. Mapper gets a new `validate(MappingLookup)` method (already present on `MetadataFieldMapper` and now pulled up to the parent interface), which is called from a new `DocumentMapper.validate()` method. All the validation code currently living on `MapperService` moves either to individual mapper implementations (FieldAliasMapper, CompletionFieldMapper) or into `MappingLookup`, an altered `DocumentFieldMappers` which now knows about object fields and can check for duplicate definitions, or into DocumentMapper which handles soft limit checks.	2020-08-04 14:39:20 +01:00
Rene Groeschke	bdd7347bbf	Merge test runner task into RestIntegTest (7.x backport) (#60600 ) * Merge test runner task into RestIntegTest (#60261) * Merge test runner task into RestIntegTest * Reorganizing Standalone runner and RestIntegTest task * Rework general test task configuration and extension * Fix merge issues * use former 7.x common test configuration	2020-08-04 14:46:32 +02:00
Adrien Grand	20ae1b75bd	Rename dataset to datastream (#60638 ) Co-authored-by: ruflin <spam@ruflin.com>	2020-08-04 09:58:54 +02:00
Armin Braun	7ae9dc2092	Unify Stream Copy Buffer Usage (#56078 ) (#60608 ) We have various ways of copying between two streams and handling thread-local buffers throughout the codebase. This commit unifies a number of them and removes buffer allocations in many spots.	2020-08-04 09:54:52 +02:00
Yang Wang	54aaadade7	API key name should always be required for creation (#59836 ) (#60636 ) The name is now required when creating or granting API keys.	2020-08-04 13:28:47 +10:00
Tim Vernum	c58e32bb27	Improve assertion failure when error is not empty (#60572 ) This commit changes TokenAuthIntegTests so all occurrences of assertThat(x.size(), equalTo(0)); become assertThat(x, empty()); This means that the assertion failure message will include the contents of the list (`x`) instead of just its size, which facilitates easier failure diagnosis. Relates: #56903 Backport of: #60496	2020-08-04 11:05:18 +10:00
Jake Landis	bcb9d06bb6	[7.x] Cleanup xpack build.gradle (#60554 ) (#60603 ) This commit does three things: * Removes all Copyright/license headers for the build.gradle files under x-pack. (implicit Apache license) * Removes evaluationDependsOn(xpackModule('core')) from build.gradle files under x-pack * Removes a place holder test in favor of disabling the test task (in the async plugin)	2020-08-03 13:11:43 -05:00
Hendrik Muhs	1e01832b0c	fix possible NPE introduced in #60591	2020-08-03 16:40:38 +02:00
Hendrik Muhs	cd6492fc11	[Transform] fix regression of date histogram optimization (#60591 ) fixes mix up of input and output field name for date histogram optimization. minimal fix, more tests to be added with #60469 fixes #60590	2020-08-03 15:52:08 +02:00
Yannick Welsch	b0d601fa63	Adjust searchable snapshot license (#60578 ) No longer needs Platinum license for testing on staging.	2020-08-03 13:19:53 +02:00
Yannick Welsch	9e24a54382	Clean existing index folder when loading searchable snapshot (#60122 ) Closing a regular index and mounting a snapshot-backed index into that existing index does not clean the existing index folders of those preexisting shards. This PR removes the existing Lucene / translog files once the searchable snapshot shard is starting up. Future PRs will make reuse of the existing index files to populate the cache.	2020-08-03 13:19:11 +02:00
Yang Wang	a76fc324d4	Fix get-license test failure by ensure cluster is ready (#60498 ) (#60569 ) When a new cluster starts, the HTTP layer becomes ready to accept incoming requests while the basic license is still being populated in the background. When a get license request comes in before the license is ready, it can get a 404 error. This PR fixes it by either wrapping the license check in assertBusy or ensuring the license is ready before performing the check. This is a backport for both #60498 and #60573	2020-08-03 19:40:03 +10:00
Tim Vernum	1a373b0c21	Only call listener once (SP template registration) (#60567 ) This fixes a bug in the IdP's template registration that would sometimes call the listener twice. Resolves: #54285 Resolves: #54423 Backport of: #60497	2020-08-03 13:45:16 +10:00
Andrei Dan	ac258f10d6	Data streams: throw ResourceAlreadyExists exception (#60518 ) (#60536 ) For consistency reasons (and reducing the overload of IllegalArgumentException) this changes the exception thrown when trying to create a data stream that already exists. (cherry picked from commit ac2184c4614bba0f3ee377da49aea0daed98bab4) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-08-01 16:31:09 +01:00
Julie Tibshirani	f1d4fd8c3e	Correct name of IndexFieldData#loadGlobalDirect. (#60492 ) It seems 'localGlobalDirect' was just a typo.	2020-07-31 10:53:21 -07:00
Rene Groeschke	ed4b70190b	Replace immediate task creations by using task avoidance api (#60071 ) (#60504 ) - Replace immediate task creations by using task avoidance api - One step closer to #56610 - Still many tasks are created during configuration phase. Tackled in separate steps	2020-07-31 13:09:04 +02:00
Hendrik Muhs	a721d6d19b	[Transform] use correct version in BWC serialization test (#60500 ) use correct version in BWC serialization test fixes #60464	2020-07-31 11:23:05 +02:00
Julie Tibshirani	8ac81a3447	Remove IndexFieldData#clear since it is unused. (#60475 ) This method was never called. It also seemed tricky that calling a method on `IndexFieldData` could clear the contents of a shared cache.	2020-07-30 14:07:55 -07:00
Mark Tozzi	970a0c8957	[7.x] Aggregation tests for Wildcard Field (#58507 ) (#60423 )	2020-07-30 08:56:21 -04:00
Przemysław Witek	9e27f7474c	Make MlDailyMaintenanceService delete jobs that are in deleting state anyway (#60121 ) (#60439 )	2020-07-30 09:53:11 +02:00
Hendrik Muhs	aaed6b59d6	[7.x][Transform] add support for missing bucket (#59591 ) (#60390 ) add support for "missing_bucket" in group_by fixes #42941 fixes #55102 backport #59591	2020-07-30 08:26:51 +02:00
Bogdan Pintea	8c22adc447	SQL: Add option to provide the delimiter for the CSV format (#59907 ) (#60420 ) * SQL: Add option to provide the delimiter for the CSV format (#59907) * Add option to provide the delimiter to the CSV fmt This adds the option to provide the desired character as the separator for the CSV format (the default remains comma). A set of characters are excluded though - like CR, LF, `"` - to avoid slipping onto the CSV-dialects slope. The tab is also forbidden, the user needs to choose the "tsv" format explicitly. Update the doc to make it clear that the textual CSV, TSV and TXT formats pass the cursor back to the user through the Cursor HTTP header. (cherry picked from commit 3a8b00cc7480f7ada57fcea3cbac957facac08fc) * Java8 fixes - replace Set#of(); - URLDecoder#decode() requires a string (vs a charset) as 2nd arg.	2020-07-29 21:40:11 +02:00
Bogdan Pintea	30610d962a	Fix SYS COLUMNS schema in ODBC mode (#59513 ) (#60418 ) * Fix SYS COLUMNS schema in ODBC mode (#59513) * Fix SYS COLUMNS schema in ODBC mode This fixes a regression when certain ODBC-specific columns that need to be of the short type were returned as the integer type. This also fixes the stubbing for the -indices SYS COLUMN commands. (cherry picked from commit 96d89dc9b1fd731e736ef804a16bd05496c1dea6) Java8 fix: avoid diamond notation in test. Qualify anonymous class in test.	2020-07-29 21:19:32 +02:00
Bogdan Pintea	4c771485f6	SQL: fix NPE on ambiguous GROUP BY (#59370 ) (#60416 ) * fix npe on ambiguous group by * add tests for aggregates and group by, add quotes to error message * add more cases for Group By ambiguity test * change error messages for field ambiguity * change collection aliases approach * add locations of attributes for ambiguous grouping error * Address review comments - remove Comparable implementations from Attribute and Location; - add ad-hoc comparator for sorting locations in ambiguity message; - remove added AttributeAlias class with Tuple; - add code comment to explain issue with Location overwriting. * Fix c&p error in location ref generation comparator Fix copy&paste error in dedicated comparator used for sorting ambiguity location references. Slightly increase its readability. Co-authored-by: Nikita Verkhovin <verkhovin13@gmail.com> (cherry picked from commit 9ba70a3483f0f4987229bec231cdc004f51b88a5)	2020-07-29 20:44:28 +02:00
Bogdan Pintea	79ef263fc2	Add test with alias reuse and grouping (#60396 ) (#60421 ) Add test with alias reuse and grouping. (cherry picked from commit 37ee819eb98fd10c1b16a61e4e1d446d0ee859de)	2020-07-29 20:43:04 +02:00
Mark Vieira	39fa1c4df0	Add compatibility testing for JDBC driver (#60409 ) This commit adds compatibility testing of our JDBC driver against different Elasticsearch versions. Although we are really testing the forwards compatibility nature of the JDBC driver we model the testing the same as we do existing BWC tests, that is, with the current branch fetching the earlier versions of the artifact that is to be tested. In this case, that's the JDBC driver itself. Because the tests include the JDBC driver jar on its classpath we had to change the packaging of the driver jar in order to avoid jarhell and other conflicting dependency issues when using an old JDBC driver with later branches. For this we simply relocate all driver dependencies in the shadow jar under a "shadowed" package. This allows the JDBC driver to use the correct version of Elasticsearch libs classes, while the tests themselves use their versions. Since this required a change to the driver jar, compatibility testing can only go back as far as that version, which at the time of this commit is 7.8.1.	2020-07-29 10:45:11 -07:00
David Roberts	2a0116f51b	[ML] Take more care that memory estimation uses unique named pipes (#60405 ) Prior to this change ML memory estimation processes for a given job would always use the same named pipe names. This would often cause one of the processes to fail. This change avoids this risk by adding an incrementing counter value into the named pipe names used for memory estimation processes. Backport of #60395	2020-07-29 17:29:55 +01:00
Armin Braun	bfee7b91ff	Increase Timeouts in SLMBlockingIntegTests (#60356 ) (#60403 ) The retention run goes through a number of steps and can randomly take more than 10s. => increased timeout to 30s like we did in other spots in this test Also, noticed that we had a hard wait of 10s in this test, removed it and adjusted following busy assert in a way that can deal with a missing snapshot (from when the assert runs before the snapshot was put into the CS). Closes #60336	2020-07-29 17:34:49 +02:00
Benjamin Trent	76359aaa53	[ML] always write prediction_[score\|probability] for classification inference (#60335 ) (#60397 ) In order to unify model inference and analytics results we need to write the same fields. prediction_probability and prediction_score are now written for inference calls against classification models.	2020-07-29 10:58:14 -04:00
Nhat Nguyen	9d4a64e749	Allow CCR on nodes with legacy roles only (#60093 ) CCR will stop functioning if the master node is on 7.8, but data nodes are before that version because the master node considers that all data nodes do not have the remote cluster client role. This commit allows CCR to work on data nodes with legacy roles only. Relates #54146 Relates #59375	2020-07-29 10:57:31 -04:00
Benjamin Trent	a6da1fd73e	[ML] require alias when indexing to an alias that should be created (#60315 ) (#60394 ) This sets up all indexing to one of our write aliases to require it actually be an alias. This allows failures scenarios to be captured quickly, loudly, and then potentially recovered.	2020-07-29 10:52:36 -04:00
Jim Ferenczi	578749a5e8	Fix AsyncResultsServiceTests#testRetrieveFromMemoryWithExpiration (#60337 ) This change ensures that the expiration time that is set in the test is long enough to not be triggered by a slow execution. Closes #60255	2020-07-29 09:47:47 +02:00
Hendrik Muhs	5eb04fb413	[Transform] fix performance regression introduced in #60196 (#60276 ) re-work #60196, to not skip building change collectors as otherwise date histogram only pivots would run slow relates #60125	2020-07-29 09:44:03 +02:00
Armin Braun	753fd4f6bc	Cleanup and optimize More Serialization Spots (#59959 ) (#60331 ) Same as #59626 for a few more spots.	2020-07-29 07:20:44 +02:00
Benjamin Trent	54c8936508	[ML] do not summarize importance for custom features (#60198 ) (#60333 ) If a feature is created via a custom pre-processor, we should return the importance for that feature. This means we will not return the importance for the original document field for custom processed features. closes https://github.com/elastic/elasticsearch/issues/59330	2020-07-28 15:58:20 -04:00
Julie Tibshirani	c7bfb5de41	Add search `fields` parameter to support high-level field retrieval. (#60258 ) This feature adds a new `fields` parameter to the search request, which consults both the document `_source` and the mappings to fetch fields in a consistent way. The PR merges the `field-retrieval` feature branch. Addresses #49028 and #55363.	2020-07-28 10:58:20 -07:00
Nhat Nguyen	416e51980c	Relax ShardFollowTasksExecutor validation (#60054 ) If a primary shard of a follower index is being relocated, then we will fail to create a follow-task. This validation is too restrictive. We should ensure that all primaries of the follower index are active instead. Closes #59625	2020-07-28 13:46:49 -04:00
Nhat Nguyen	6ece629ec3	Set timeout of master requests on follower to unbounded (#60070 ) Today, a follow task will fail if the master node of the follower cluster is temporarily overloaded and unable to process master node requests (such as update mapping, setting, or alias) from a follow-task within the default timeout. This error is transient, and follow-tasks should not abort. We can avoid this problem by setting the timeout of master node requests on the follower cluster to unbounded. Closes #56891	2020-07-28 13:46:49 -04:00
Zachary Tong	9f8ec3e3fb	Mute SSLDriverTests#testCloseDuringHandshakePreJDK11 Tracking issue: https://github.com/elastic/elasticsearch/issues/59992	2020-07-28 13:20:53 -04:00
markharwood	e0286e9bd3	Search - remove allow-expensive-query checks from wildcard field. (#60273 ) (#60308 ) Removing allow-expensive-query checks because we think this field type is fast enough. Closes #60139	2020-07-28 17:12:33 +01:00
Dimitris Athanasiou	ed7dcff7c4	[7.x][ML] Audit updates on data frame analytics jobs (#60126 ) (#60287 ) Closes #59652 Backport of #60126	2020-07-28 16:33:35 +03:00
Dimitris Athanasiou	16ffcfb9f6	[7.x][ML] Ensure bulk requests are not over memory limit (#60219 ) (#60283 ) Data frame analytics jobs that work with very large datasets may produce bulk requests that are over the memory limit for indexing. This commit adds a helper class that bundles index requests in bulk requests that steer away from the memory limit. We then use this class both from the results joiner and the inference runner ensuring data frame analytics jobs do not generate bulk requests that are too large. Note the limit was implemented in #58885. Backport of #60219	2020-07-28 16:04:03 +03:00
Dimitris Athanasiou	981e436d6c	[7.x][ML] Improve assertion on regression alias field test (#60221 ) (#60264 ) Previously the test was asserting the prediction on each document was within 10.0 of the expected value. It turned out that was not enough, as we occasionally saw the test failing by a small margin. Instead of relaxing that assertion, this commit changes it to assert the mean prediction error is less than 10.0. This should significantly reduce the chances of the test failing. Fixes #60212 Backport of #60221	2020-07-28 11:48:00 +03:00
Dan Hermann	b98caf58ee	Mark data stream APIs as stable (#59860 ) (#60206 )	2020-07-27 10:37:52 -05:00
Benjamin Trent	ea3c49979e	Test mute for issue 60212 (#60214 )	2020-07-27 10:10:40 -04:00
Hendrik Muhs	95c99ca887	[Transform] Fix Regression: continuous transform can fail for (date) histogram group_by(#60196 ) do not create change collector if group_by configuration does not support change detection fixes #60125	2020-07-27 14:50:03 +02:00
Dimitris Athanasiou	439b7f7e59	[7.x][ML] DFA result processor should only skip rows and model chunks on cancel (#60113 ) (#60193 ) When the job is force-closed or shutting down due to a fatal error we clean up all cancellable job operations. This includes cancelling the results processor. However, this means that we might not persist objects that are written from the process like stats, memory usage, etc. In hindsight, we do not gain from cancelling the results processor in its entirety. It makes more sense to skip row results and model chunks but keep stats and instrumentation about the job as the latter may contain useful information to understand what happened to the job. Backport of #60113	2020-07-27 13:42:46 +03:00
David Roberts	89466eefa5	Don't require separate privilege for internal detail of put pipeline (#60190 ) Putting an ingest pipeline used to require that the user calling it had permission to get nodes info as well as permission to manage ingest. This was due to an internal implementation detail that was not visible to the end user. This change alters the behaviour so that a user with the manage_pipeline cluster privilege can put an ingest pipeline regardless of whether they have the separate privilege to get nodes info. The internal implementation detail now runs as the internal _xpack user when security is enabled. Backport of #60106	2020-07-27 10:44:48 +01:00
Nhat Nguyen	bc65b3a590	Increase timeout in AutoFollowIT (#60004 ) It can take more than 10 seconds to auto-follow and create a follow-task on a slow CI. This commit increases timeout in AutoFollowIT by replacing assertBusy with assertLongBusy. Closes #59952	2020-07-23 16:36:53 -04:00
Nhat Nguyen	0fe4d5df67	Increase timeout testFollowIndexWithConcurrentMappingChanges Fixes #59273	2020-07-23 16:22:58 -04:00
Dimitris Athanasiou	6b9a362ec2	[7.x][ML] Skip test inference if DFA task has been stopped (#60116 ) (#60127 ) If the job is stopped before starting inference on test data, we should skip inference entirely. Backport of #60116	2020-07-23 18:34:09 +03:00
Dan Hermann	ca25f6ae6f	Include the resolve index action in the view_index_metadata privilege (#59785 ) (#60112 )	2020-07-23 08:13:56 -05:00
Dan Hermann	fe12217c7f	[7.x] Move REST specs for data streams (#60111 )	2020-07-23 08:10:54 -05:00
Armin Braun	ebb6677815	Formalize and Streamline Buffer Sizes used by Repositories (#59771 ) (#60051 ) Due to complicated access checks (reads and writes execute in their own access context) on some repositories (GCS, Azure, HDFS), using a hard coded buffer size of 4k for restores was needlessly inefficient. By the same token, the use of stream copying with the default 8k buffer size for blob writes was inefficient as well. We also had dedicated, undocumented buffer size settings for HDFS and FS repositories. For these two we would use a 100k buffer by default. We did not have such a setting for e.g. GCS though, which would only use an 8k read buffer which is needlessly small for reading from a raw `URLConnection`. This commit adds an undocumented setting that sets the default buffer size to `128k` for all repositories. It removes wasteful allocation of such a large buffer for small writes and reads in case of HDFS and FS repositories (i.e. still using the smaller buffer to write metadata) but uses a large buffer for doing restores and uploading segment blobs. This should speed up Azure and GCS restores and snapshots in a non-trivial way as well as save some memory when reading small blobs on FS and HDFS repositories.	2020-07-22 21:06:31 +02:00
Larry Gregory	a686ccc9b2	[Backport][7.x] Introduce reserved_ml_apm_user kibana privilege (#59854 ) (#60047 )	2020-07-22 11:06:10 -04:00
Jay Modi	c8ef2e18f7	Thread safe clean up of LocalNodeModeListeners (#60007 ) This commit continues on the work in #59801 and makes other implementors of the LocalNodeMasterListener interface thread safe in that they will no longer allow the callbacks to run on different threads and possibly race each other. This also helps address other issues where these events could be queued to wait for execution while the service keeps moving forward thinking it is the master even when that is not the case. In order to accomplish this, the LocalNodeMasterListener no longer has the executorName() method to prevent future uses that could encounter this surprising behavior. Each use was inspected and if the class was also a ClusterStateListener, the implementation of LocalNodeMasterListener was removed in favor of a single listener that combined the logic. A single listener is used and there is currently no guarantee on execution order between ClusterStateListeners and LocalNodeMasterListeners, so a future change there could cause undesired consequences. For other classes, the implementations of the callbacks were inspected and if the operations were lightweight, the overridden executorName method was removed to use the default, which runs on the same thread. Backport of #59932	2020-07-22 08:02:18 -06:00
Dimitris Athanasiou	7e652ca873	[7.x][ML] Include same fields during test inference as in training (#… (#60034 ) In #58877, when we switched test inference on java, we just use the doc's `_source` as features. However, this could be missing out on features that were used during training, e.g. alias fields, etc. This commit addresses this by extracting fields to use as features during inference the same way they are extracted in `DataFrameDataExtractor` when they are used for training. Backport of #59963	2020-07-22 12:54:13 +03:00
David Roberts	7358f9fb05	[ML] Mute ForecastIT.testOverflowToDisk in EAR builds (#60040 ) Due to https://github.com/elastic/elasticsearch/issues/58806	2020-07-22 10:17:37 +01:00
James Baiera	1c1a4297e0	Track backing indices in data streams stats from cluster state (#59817 ) (#60015 ) If shard level results are incomplete in the data streams stats call, it is possible to get inaccurate counts of the number of backing indices, despite this data being accurate and available in the cluster state.	2020-07-21 23:21:33 -04:00

5381 Commits