OpenSearch

Commit Graph

Author	SHA1	Message	Date
Tanguy Leroux	8669766a81	Reduce contention in CacheFile.fileLock() method (#55662 ) The CacheFile.fileLock() method is used to acquire a lock on a cache file so that the file can't be deleted (or its file handle closed) during the execution of a read or a write operation. Today this lock is obtained by first acquiring the eviction lock (the write lock of the readwrite lock), then by checking if the cache file is evicted and the file channel still open, and finally by obtaining the file lock (the read lock of the readwrite lock). Acquiring the read lock while the eviction lock is held ensures that the cache file eviction cannot start in the meantime. But starting (and finishing) an eviction also acquires the eviction lock, and this lock cannot be obtained while a read lock is held (the write lock of a readwrite lock is exclusive). If we instead acquire a read lock and check the eviction flag and file channel existence while holding the read lock, we know that no eviction can start or finish until the read lock is released.	2020-04-23 14:40:27 +02:00
Rory Hunter	d66af46724	Always use deprecateAndMaybeLog for deprecation warnings (#55319 ) Backport of #55115. Replace calls to deprecate(String,Object...) with deprecateAndMaybeLog(...), with an appropriate key, so that all messages can potentially be deduplicated.	2020-04-23 09:20:54 +01:00
David Roberts	87f4751eca	[ML] Make find_file_structure recognize Kibana CSV report timestamps (#55609 ) The Kibana CSV export feature uses a non-standard timestamp format. This change adds it to the formats the find_file_structure endpoint recognizes out-of-the-box, to make round-tripping data from Kibana back to Kibana via CSV files easier. Fixes #55586	2020-04-23 08:39:07 +01:00
Jake Landis	25ea6a74f0	[7.x] Validate REST specs against schema (#55117 ) (#55563 ) A JSON schema was recently introduced for the REST API specification. #54252 This PR introduces a 3rd party validation tool to ensure that the REST specification conforms to the schema. The task is applied to the 3 projects that contain REST API specifications. The plugin wires this task into the precommit task, and should be considered as part of the public API for the build tools for any plugin developer to contribute their plugin's specification. An ignore parameter has been introduced for the task to allow specific files to be ignored from the validation. The ignored files in this PR will soon get issues logged and a link so they can be fixed. Closes #54314	2020-04-22 14:14:03 -05:00
Albert Zaharovits	82ed0ab420	Update the audit logfile list of system users (#55578 ) Out of the box "access granted" audit events are not logged for system users. The list of system users was stale and included only the _system and _xpack users. This commit expands this list with _xpack_security and _async_search, effectively reducing the auditing noise by not logging the audit events of these system users out of the box. Closes #37924	2020-04-22 21:59:31 +03:00
Tal Levy	c370b83bd7	Fix locale lowercase test issue in GenerateSnapshotNameStepTests (#55597 ) (#55605 ) The testPerformAction test has been failing periodically due to how Hamcrest's containsStringIgnoringCase does not lowercase using the same Locale set in the test infrastructure. This commit falls back to explicitly lowercasing using the root locale	2020-04-22 11:29:57 -07:00
Tal Levy	f27ce69f0c	[backport] Add geo_bounds aggregation support for geo_shape (#55328 ) (#55600 ) This commit adds a new GeoShapeBoundsAggregator to the spatial plugin and registers it with the GeoShapeValuesSourceType. This enables geo_bounds aggregations on geo_shape fields	2020-04-22 11:29:35 -07:00
Tal Levy	0844455505	Add geo_shape mapper supporting doc-values in Spatial Plugin (#55037 ) (#55500 ) After #53562, the `geo_shape` field mapper is registered within a module. This opens the door for introducing a new `geo_shape` field mapper into the Spatial Plugin that has doc-values support. This is very much an extension of server's GeoShapeFieldMapper, but with the addition of the doc values implementation.	2020-04-22 08:12:54 -07:00
Dimitris Athanasiou	50a5afed15	[7.x][ML] Prepare parsing phase_progress from DFA process (#55580 ) (#55587 ) Data frame analytics process currently reports progress as an integer `progress_percent`. We parse that and report it from the _stats API as the progress of the `analyzing` phase. However, we want to allow the DFA process to report progress for more than one phase. This commit prepares for this by parsing `phase_progress` from the process, an object that contains the `phase` name plus the `progress_percent` for that phase. Backport of #55580	2020-04-22 16:38:32 +03:00
Benjamin Trent	7c81cd7833	[ML] explicitly disallow partial results in datafeed extractors (#55537 ) (#55585 ) Instead of doing our own checks against REST status, shard counts, and shard failures, this commit changes all our extractor search requests to set `.setAllowPartialSearchResults(false)`. - Scrolls are automatically cleared when a search failure occurs with `.setAllowPartialSearchResults(false)` set. - Code error handling is simplified closes https://github.com/elastic/elasticsearch/issues/40793	2020-04-22 09:07:44 -04:00
David Roberts	810caf5ffe	[ML] Test that audit message is written when closing unassigned job (#55582 ) Issue #55521 suggested that audit messages were not written when closing an unassigned job. This is not the case, but we didn't have a test to prove it. Backport of #55571	2020-04-22 13:23:43 +01:00
David Roberts	2dc5586afe	[ML] Add effective max model memory limit to ML info (#55581 ) The ML info endpoint returns the max_model_memory_limit setting if one is configured. However, it is still possible to create a job that cannot run anywhere in the current cluster because no node in the cluster has enough memory to accommodate it. This change adds an extra piece of information, limits.effective_max_model_memory_limit, to the ML info response that returns the biggest model memory limit that could be run in the current cluster assuming no other jobs were running. The idea is that the ML UI will be able to warn users who try to create jobs with higher model memory limits that their jobs will not be able to start unless they add a bigger ML node to their cluster. Backport of #55529	2020-04-22 12:28:50 +01:00
David Roberts	da5aeb8be7	[ML] Return assigned node in start/open job/datafeed response (#55570 ) Adds a "node" field to the response from the following endpoints: 1. Open anomaly detection job 2. Start datafeed 3. Start data frame analytics job If the job or datafeed is assigned to a node immediately then this field will return the ID of that node. In the case where a job or datafeed is opened or started lazily the node field will contain an empty string. Clients that want to test whether a job or datafeed was opened or started lazily can therefore check for this. Backport of #55473	2020-04-22 12:06:53 +01:00
David Kyle	e99ef3542c	Mute ModelLoadingServiceTests::testMaxCachedLimitReached	2020-04-22 11:53:07 +01:00
Tim Vernum	8b566aea47	Fix use of password protected PKCS#8 keys for SSL (#55567 ) PEMUtils would incorrectly fill the encryption password with zeros (the '\0' character) after decrypting a PKCS#8 key. Since PEMUtils did not take ownership of this password it should not zero it out because it does not know whether the caller will use that password array again. This is actually what PEMKeyConfig does - it uses the key encryption password as the password for the ephemeral keystore that it creates in order to build a KeyManager. Backport of: #55457	2020-04-22 16:38:51 +10:00
Yang Wang	32e46bf552	Fix certutil http for empty password with JDK 11 and lower (#55437 ) (#55565 ) Fix elasticsearch-certutil http command so that it correctly accepts empty keystore password with JDK version 11 and lower.	2020-04-22 15:03:10 +10:00
David Kyle	8e8c6b4aee	Fix accounting in ModelLoadingServiceTests (#55307 ) (#55547 ) In the test, after the first load event it is not known which models are cached, as loading a later one will evict an earlier one and the order is not known. The models could have been loaded 1 or 2 times, not exactly twice.	2020-04-21 19:25:06 +01:00
Armin Braun	db7eb8e8ff	Remove Redundant CS Update on Snapshot Finalization (#55276 ) (#55528 ) This change folds the removal of the in-progress snapshot entry into setting the safe repository generation. Outside of removing an unnecessary cluster state update, this also has the advantage of removing a somewhat inconsistent cluster state where the safe repository generation points at `RepositoryData` that contains a finished snapshot while it is still in-progress in the cluster state, making it easier to reason about the state machine of upcoming concurrent snapshot operations.	2020-04-21 15:33:17 +02:00
David Turner	be60d50452	Allow searching of snapshot taken while indexing (#55511 ) Today a read-only engine requires a complete history of operations, in the sense that its local checkpoint must equal its maximum sequence number. This is a valid check for read-only engines that were obtained by closing an index since closing an index waits for all in-flight operations to complete. However a snapshot may not have this property if it was taken while indexing was ongoing, but that's ok. This commit weakens the check for a complete history to exclude the case of a searchable snapshot. Relates #50999	2020-04-21 13:21:38 +01:00
Ignacio Vera	e4c65b4388	mute test SSLReloadDuringStartupIntegTests.testReloadDuringStartup (#55525 )	2020-04-21 14:13:13 +02:00
Jim Ferenczi	0b3bdfcc3e	Fix expiration time in async search response (#55435 ) This change ensures that we return the latest expiration time when retrieving the response from the index. This commit also fixes a bug that stops the garbage collection of saved responses if the async search index is deleted.	2020-04-21 14:04:29 +02:00
Przemysław Witek	59d377462f	Apply default timeout in StopDataFrameAnalyticsAction.Request (#55512 ) (#55517 )	2020-04-21 13:05:48 +02:00
Nhat Nguyen	3cc4e0dd09	Retry follow task when remote connection queue full (#55314 ) If more than 100 shard-follow tasks are trying to connect to the remote cluster, then some of them will abort with "connect listener queue is full". This is because we retry on ESRejectedExecutionException, but not on RejectedExecutionException.	2020-04-20 22:43:05 -04:00
Stuart Tettemer	93a2e9b0f9	Test: MockScoreScript can be cacheable. (#55499 ) Backport: 0ed1eb5	2020-04-20 17:09:58 -06:00
Benjamin Trent	cabff65aec	[ML] Fixing inference stats race condition (#55163 ) (#55486 ) `updateAndGet` could actually call the internal method more than once on contention. If I read the JavaDocs, it says: ```* @param updateFunction a side-effect-free function``` So, it could be getting multiple updates on contention, thus having a race condition where stats are double counted. To fix, I am going to use a `ReadWriteLock`. The `LongAdder` objects allow fast thread safe writes in high contention environments. These can be protected by the `ReadWriteLock::readLock`. When stats are persisted, I need to call reset on all these adders. This is NOT thread safe if additions are taking place concurrently. So, I am going to protect with `ReadWriteLock::writeLock`. This should prevent race conditions while allowing high (ish) throughput in the high-contention paths in inference. I did some simple throughput tests and this change is not significantly slower and is simpler to grok (IMO). closes https://github.com/elastic/elasticsearch/issues/54786	2020-04-20 16:21:18 -04:00
Benjamin Trent	24d41eb695	[ML] partitions model definitions into chunks (#55260 ) (#55484 ) This paves the data layer way so that exceptionally large models are partitioned across multiple documents. This change means that nodes before 7.8.0 will not be able to use trained inference models created on nodes on or after 7.8.0. I chose the definition document limit to be 100. This SHOULD be plenty for any large model. One of the largest models that I have created so far had the following stats: ~314MB of inflated JSON, ~66MB when compressed, ~177MB of heap. With the chunking sizes of `16 * 1024 * 1024` its compressed string could be partitioned to 5 documents. Supporting models 20 times this size (compressed) seems adequate for now.	2020-04-20 16:08:54 -04:00
Benjamin Trent	fa0373a19f	[7.x] [ML] Fix log spam and disable ILM/SLM history for native ML tests (#55475 ) * [ML] fix native ML test log spam (#55459) This adds a dependency to ingest common. This removes the log spam resulting from basic plugins being enabled that require the common ingest processors. * removing unnecessary changes * removing unused imports * removing unnecessary java setting	2020-04-20 15:41:30 -04:00
Lee Hinman	9eddd2bcc9	[7.x] Add prefer_v2_templates flag and index setting (#55411 ) (#55476 ) This commit adds a new querystring parameter on the following APIs: - Index - Update - Bulk - Create Index - Rollover These APIs now support a `?prefer_v2_templates=true\|false` flag. This flag changes the preference creation to use either V2 index templates or V1 templates. This flag defaults to `false` and will be changed to `true` for 8.0+ in subsequent work. Additionally, setting this flag internally sets the `index.prefer_v2_templates` index-level setting. This setting is used so that actions that automatically create a new index (things like rollover initiated by ILM) will inherit the preference from the original index. This setting is dynamic so that a transition from v1 to v2 templates can occur for long-running indices grouped by an alias performing periodic rollover. This also adds support for sending this parameter to the High Level Rest Client. Relates to #53101	2020-04-20 12:05:42 -06:00
Armin Braun	a0763d958d	Make RepositoryData Less Memory Heavy (#55293 ) (#55468 ) We don't really need `LinkedHashSet` here. We can assume that all the entries are unique and just use a list and use the list utilities to create the cheapest possible version of the list. Also, this fixes a bug in `addSnapshot` which would mutate the existing linked hash set on the current instance (fortunately this never caused a real world bug) and brings the collection in line with the java docs on its getter that claim immutability.	2020-04-20 18:28:06 +02:00
William Brafford	7817948926	Disable monitoring in ML multinode tests (#55461 ) Removing the deprecated "xpack.monitoring.enabled" setting introduced log spam and potentially some failures in ML tests. It's possible to use a different, non-deprecated setting to disable monitoring, so we do that here.	2020-04-20 10:51:16 -04:00
David Turner	0df329dde7	Use soft deletes for searchable snapshots tests (#55453 ) This allows us to perform some dummy indexing including updates/deletes.	2020-04-20 14:37:51 +01:00
Przemysław Witek	7d5f74e964	Fix and unmute testSetUpgradeMode_ExistingTaskGetsUnassigned (#55368 ) (#55452 )	2020-04-20 13:29:29 +02:00
Yannick Welsch	b9da307cd1	Add GCS support for searchable snapshots (#55403 ) Adds ranged read support for GCS repositories in order to enable searchable snapshot support for GCS. As part of this PR, I've extracted some of the test infrastructure to make sure that GoogleCloudStorageBlobContainerRetriesTests and S3BlobContainerRetriesTests are covering similar tests (as I saw those diverging in what they cover)	2020-04-20 13:02:59 +02:00
Jason Tedor	9ecb222bfa	Remove unneeded validation in feature set usage This validation is not needed, as we have discovered the source of the serialization error that was leading to some usage instances appearing to not have a name.	2020-04-18 14:29:59 -04:00
Jason Tedor	23049391be	Upgrade feature aware check usage of ASM to 7.3.1 (#54577 ) This commit upgrades the ASM dependency used in the feature aware check to 7.3.1. This gives support for JDK 14. Additionally, now that Gradle understands JDK 13, it means we can remove a restriction on running the feature aware check to JDK 12 and lower.	2020-04-18 10:49:57 -04:00
Jay Modi	405ff0ce27	Handle TLS file updates during startup (#55330 ) This change reworks the loading and monitoring of files that are used for the construction of SSLContexts so that updates to these files are not lost if the updates occur during startup. Previously, the SSLService would parse the settings, build the SSLConfiguration objects, and construct the SSLContexts prior to the SSLConfigurationReloader starting to monitor these files for changes. This allowed for a small window where updates to these files may never be observed until the node restarted. To remove the potential miss of a change to these files, the code now parses the settings and builds SSLConfiguration instances prior to the construction of the SSLService. The files backing the SSLConfiguration instances are then registered for monitoring and finally the SSLService is constructed from the previously parsed SSLConfiguration instances. As the SSLService is not constructed when the code starts monitoring the files for changes, a CompletableFuture is used to obtain a reference to the SSLService; this allows for construction of the SSLService to complete and ensures that we do not miss any file updates during the construction of the SSLService. While working on this change, the SSLConfigurationReloader was also refactored to reflect how it is currently used. When the SSLConfigurationReloader was originally written the files that it monitored could change during runtime. This is no longer the case as we stopped the monitoring of files that back dynamic SSLContext instances. In order to support the ability for items to change during runtime, the class made use of concurrent data structures. The use of these concurrent data structures has been removed. Closes #54867 Backport of #54999	2020-04-17 20:10:33 -06:00
Zachary Tong	f46b567563	Convert InternalAggTestCase to AbstractNamedWriteableTestCase (#55250 ) Some aggregations, such as the Terms* family, will use an alternate class to represent unmapped shard results (while the rest of the aggs use the same object but with some form of "empty" or "nullish" values to represent unmapped). This was problematic with AbstractWireSerializingTestCase because it expects the instanceReader to always match the original class. Instead, we need to use the NamedWriteable version so that the registry can be consulted for the proper deserialization reader.	2020-04-17 16:39:38 -04:00
Ryan Ernst	66071b2f6e	Remove combo security and license helper from license state (#55366 ) (#55417 ) Security features in the license state currently do a dynamic check on whether security is enabled. This is because the license level can change the default security enabled state. This commit splits out the check on security being enabled, so that the combo method of security enabled plus license allowed is no longer necessary.	2020-04-17 13:07:02 -07:00
William Brafford	49e30b15a2	Deprecate disabling basic-license features (#54816 ) (#55405 ) We believe there's no longer a need to be able to disable basic-license features completely using the "xpack..enabled" settings. If users don't want to use those features, they simply don't need to use them. Having such features always available lets us build more complex features that assume basic-license features are present. This commit deprecates settings of the form "xpack..enabled" for basic-license features, excluding "security", which is a special case. It also removes deprecated settings from integration tests and unit tests where they're not directly relevant; e.g. monitoring and ILM are no longer disabled in many integration tests.	2020-04-17 15:04:17 -04:00
Benjamin Trent	4be3663968	[7.x] [ML] fix bugs with prediction field value settings (#55333 ) (#55394 ) * [ML] fix bugs with prediction field value settings (#55333) This fixes two unreleased bugs: 1. Prediction value type of `number` might show unexpected classes Analytics created models may have class labels like `1, 5, 10` (or some collection of discrete, whole numbers). These labels are passed to the inference model config in the `classification_labels` field. When the predicted value format is `numeric` it should attempt to see if the classification labels are provided and are numeric. If so, use those. If not, use the underlying value. 2. When supplying an update overwrite, inference was losing the default prediction field value. This is because it was not copied over in the copy ctor in the ClassificationConfig.Builder class. closes #55332	2020-04-17 14:45:02 -04:00
Jake Landis	eb30cf5c89	[7.x] Move Watcher config out of RestResourcesPlugin (#55136 ) (#55336 )	2020-04-17 12:38:01 -05:00
Benjamin Trent	8c581c3388	[ML] fixing and unmuting testHRDSplit test (#55349 ) (#55393 ) This fixes the long muted testHRDSplit. Some minor adjustments for modern day elasticsearch changes :). The cause of the failure is that a new `by` field entering the model with an exceptionally high count does not cause an anomaly. We have since stopped combining the `rare` and `by` in this manner. New entries in a `by` field are not anomalous because we have no history on them yet. closes https://github.com/elastic/elasticsearch/issues/32966	2020-04-17 09:55:52 -04:00
Tanguy Leroux	eb52df6652	Mute GraphTests.testTimedoutQueryCrawl (#55397 ) Relates #55396 Relates #53913	2020-04-17 15:31:48 +02:00
Benjamin Trent	65e0084120	[ML] do not start stopping tasks on reassignment (#55315 ) (#55388 ) When anomaly jobs, datafeeds, and analytics tasks are stopped, they enter an ephemeral state called `STOPPING`. If the node executing the task fails while this is occurring, they could be stuck in the limbo state of `STOPPING`. It is best to mark the tasks as completed if they get reassigned to a node.	2020-04-17 08:57:12 -04:00
Tanguy Leroux	290361c63b	Mute MlConfigIndexMappingsFullClusterRestartIT.testMlConfigIndexMappingsAfterMigration (#55389 ) Relates #54415	2020-04-17 14:54:17 +02:00
Costin Leau	fc6261967b	SQL: Streamline declaration of LeafAggs (#55380 ) Avoid repetition of the aggregation builder setup Relates #55241 (cherry picked from commit 6cfe130e5da4aac11bad64f187fecc411139f5e2)	2020-04-17 15:04:54 +03:00
markharwood	7761b01a33	Remove normalizer support from wildcard field while we decide on approach for handling case insensitivity (#55294 ) (#55375 ) Closes #55288	2020-04-17 11:43:26 +01:00
Marios Trivyzas	f958e9abdc	SQL: Implement scripting inside aggs (#55241 ) (#55371 ) Implement the use of scalar functions inside aggregate functions. This allows for complex expressions inside aggregations, with or without GROUP BY as well as with or without a HAVING clause. e.g.: ``` SELECT MAX(CASE WHEN a IS NULL then -1 ELSE abs(a * 10) + 1 END) AS max, b FROM test GROUP BY b HAVING MAX(CASE WHEN a IS NULL then -1 ELSE abs(a * 10) + 1 END) > 5 ``` Scalar functions are still not allowed for `KURTOSIS` and `SKEWNESS` as this is currently not implemented on the Elasticsearch side. Fixes: #29980 Fixes: #36865 Fixes: #37271 (cherry picked from commit 506d1beea7abb2b45de793bba2e349090a78f2f9)	2020-04-17 12:41:22 +02:00
Tanguy Leroux	71855fbfe0	Mute testSupportedFieldTypes in HDRPreAggregatedPercentile tests (#55369 ) Relates #55360	2020-04-17 10:49:43 +02:00
Martijn van Groningen	417d5f2009	Make data streams in APIs resolvable. (#55337 ) Backport from: #54726 The INCLUDE_DATA_STREAMS indices option controls whether data streams can be resolved in an api for both concrete names and wildcard expressions. If data streams cannot be resolved then a 400 error is returned indicating that data streams cannot be used. In this pr, the INCLUDE_DATA_STREAMS indices option is enabled in the following APIs: search, msearch, refresh, index (op_type create only) and bulk (index requests with op type create only). In a subsequent change, we will determine which other APIs need to be able to resolve data streams and enable the INCLUDE_DATA_STREAMS indices option for these APIs. Whether an api resolves all backing indices of a data stream or the latest index of a data stream (write index) depends on the IndexNameExpressionResolver.Context.isResolveToWriteIndex(). If isResolveToWriteIndex() returns true then data streams resolve to the latest index (for example: index api) and otherwise a data stream resolves to all backing indices of a data stream (for example: search api). Relates to #53100	2020-04-17 08:33:37 +02:00
Jason Tedor	9a9c1a721c	Add validation to feature set usage name (#55350 ) We do not validate the name is not null, and not empty. Even though it never should be, we had a build failure where it appears that somehow this did happen. We add some validation here, in case this really is happening, we will have a more clear indication where this is coming from, and of course, validation that name fits the implicit assumptions that it is not null and not empty.	2020-04-16 18:16:53 -04:00
Mark Tozzi	22c55180c1	[7.x] Backport ValuesSourceRegistry and related work (#54922 ) * Add ValuesSource Registry and associated logic (#54281) * Remove ValuesSourceType argument to ValuesSourceAggregationBuilder (#48638) * ValuesSourceRegistry Prototype (#48758) * Remove generics from ValuesSource related classes (#49606) * fix percentile aggregation tests (#50712) * Basic thread safety for ValuesSourceRegistry (#50340) * Remove target value type from ValuesSourceAggregationBuilder (#49943) * Cleanup default values source type (#50992) * CoreValuesSourceType no longer implements Writable (#51276) * Remove genereics & hard coded ValuesSource references from Matrix Stats (#51131) * Put values source types on fields (#51503) * Remove VST Any (#51539) * Rewire terms agg to use new VS registry (#51182) Also adds some basic AggTestCases for untested code paths (and boilerplate for future tests once the IT are converted over) * Wire Cardinality aggregation to work with the ValuesSourceRegistry (#51337) * Wire Percentiles aggregator into new VS framework (#51639) This required a bit of a refactor to percentiles itself. Before, the Builder would switch on the chosen algo to generate an algo-specific factory. This doesn't work (or at least, would be difficult) in the new VS framework. This refactor consolidates both factories together and introduces a PercentilesConfig object to act as a standardized way to pass algo-specific parameters through the factory. This object is then used when deciding which kind of aggregator to create Note: CoreValuesSourceType.HISTOGRAM still lives in core, and will be moved in a subsequent PR. 
* Remove generics and target value type from MultiVSAB (#51647) * fix checkstyle after merge (#52008) * Plumb ValuesSourceRegistry through to QuerySearchContext (#51710) * Convert RareTerms to new VS registry (#52166) * Wire up Value Count (#52225) * Wire up Max & Min aggregations (#52219) * ValuesSource refactoring: Wire up Sum aggregation (#52571) * ValuesSource refactoring: Wire up SigTerms aggregation (#52590) * Soft immutability for VSConfig (#52729) * Unmute testSupportedFieldTypes, fix Percentiles/Ranks/Terms tests (#52734) Also fixes Percentiles which was incorrectly specified to only accept numeric, but in fact also accepts Boolean and Date (because those are numeric on master - thanks `testSupportedFieldTypes` for catching it!) * VS refactoring: Wire up stats aggregation (#52891) * ValuesSource refactoring: Wire up string_stats aggregation (#52875) * VS refactoring: Wire up median (MAD) aggregation (#52945) * fix valuesourcetype issue with constant_keyword field (#53041)x-pack/plugin/rollup/src/main/java/org/elasticsearch/xpack/rollup/job/RollupIndexer.java this commit implements `getValuesSourceType` for the ConstantKeyword field type. master was merged into feature/extensible-values-source introducing a new field type that was not implementing `getValuesSourceType`. * ValuesSource refactoring: Wire up Avg aggregation (#52752) * Wire PercentileRanks aggregator into new VS framework (#51693) * Add a VSConfig resolver for aggregations not using the registry (#53038) * Vs refactor wire up ranges and date ranges (#52918) * Wire up geo_bounds aggregation to ValuesSourceRegistry (#53034) This commit updates the geo_bounds aggregation to depend on registering itself in the ValuesSourceRegistry relates #42949. 
* VS refactoring: convert Boxplot to new registry (#53132) * Wire-up geotile_grid and geohash_grid to ValuesSourceRegistry (#53037) This commit updates the geo_grid aggregations to depend on registering itself in the ValuesSourceRegistry relates to the values-source refactoring meta issue #42949. Wire-up geo_centroid agg to ValuesSourceRegistry (#53040) This commit updates the geo_centroid aggregation to depend on registering itself in the ValuesSourceRegistry. relates to the values-source refactoring meta issue #42949. * Fix type tests for Missing aggregation (#53501) * ValuesSource Refactor: move histo VSType into XPack module (#53298) - Introduces a new API (`getBareAggregatorRegistrar()`) which allows plugins to register aggregations against existing agg definitions defined in Core. - This moves the histogram VSType over to XPack where it belongs. `getHistogramValues()` still remains as a Core concept - Moves the histo-specific bits over to xpack (e.g. the actual aggregator logic). This requires extra boilerplate since we need to create a new "Analytics" Percentile/Rank aggregators to deal with the histo field. Doubly-so since percentiles/ranks are extra boiler-plate'y... should be much lighter for other aggs * Wire up DateHistogram to the ValuesSourceRegistry (#53484) * Vs refactor parser cleanup (#53198) Co-authored-by: Zachary Tong <polyfractal@elastic.co> Co-authored-by: Zachary Tong <zach@elastic.co> Co-authored-by: Christos Soulios <1561376+csoulios@users.noreply.github.com> Co-authored-by: Tal Levy <JubBoy333@gmail.com> * First batch of easy fixes * Remove List.of from ValuesSourceRegistry Note that we intend to have a follow up PR dealing with the mutability of the registry, so I didn't even try to address that here. 
* More compiler fixes * More compiler fixes * More compiler fixes * Precommit is happy and so am I * Add new Core VSTs to tests * Disabled supported type test on SigTerms until we can backport its fix * fix checkstyle * Fix test failure from semantic merge issue * Fix some metaData->metadata replacements that got lost * Fix list of supported types for MinAggregator * Fix list of supported types for Avg * remove unused import Co-authored-by: Zachary Tong <polyfractal@elastic.co> Co-authored-by: Zachary Tong <zach@elastic.co> Co-authored-by: Christos Soulios <1561376+csoulios@users.noreply.github.com> Co-authored-by: Tal Levy <JubBoy333@gmail.com>	2020-04-16 16:54:46 -04:00
Marios Trivyzas	8abdf7c7d3	SQL: Fix ODBC metadata for DATE & TIME data types (#55316 ) (#55345 ) Fix MINIMUM_SCALE, MAXIMUM_SCALE and SQL_DATETIME_SUB ODBC metadata for the DATE & TIME data types. Fixes: #41086 (cherry picked from commit c23677cd2955e25bb952c8e7ff8ca3151ee0df98)	2020-04-16 22:41:39 +02:00
Rory Hunter	49f8f66a41	Revert "Use LTS version of Ubuntu in Dockerfiles (#55327 )" This reverts commit `dd76fbac60`.	2020-04-16 20:05:22 +01:00
Rory Hunter	dd76fbac60	Use LTS version of Ubuntu in Dockerfiles (#55327 ) We have some Dockerfiles that reference Ubuntu 19.04, which is not an LTS version and now appears to have been retired from the Ubuntu repositories. Switch to 18.04, which is the current long-term support version. Also change a usage of 16.04 to 18.04, for consistency.	2020-04-16 19:47:18 +01:00
Ioannis Kakavas	b27f23a80d	Rest spec and documentation (#54664 ) (#55305 ) This change adds the spec for the new REST APIs that we introduce for the IDP and documentation for each of the APIs. The documentation pages are intentionally not included in the API reference so as to minimize unnecessary exposure. supersedes: #53858	2020-04-16 20:18:05 +03:00
Lee Hinman	8b7bdae6cb	Ensure error handler is called during SLM retention callback failure (#55252 ) (#55321 ) When retrieving the snapshots for a set of repos or deleting a single snapshot, it's possible for the body of the `ActionListener`'s `onResponse` method to throw an Exception. In this case, the `errHandler` passed in may not be executed, resulting in the `running` boolean not being reset back to false. This commit uses `ActionListener.wrap(...)` instead of creating a new ActionListener, which ensures that if the `onResponse` fails in any way, the `onFailure` handler is still called. Resolves #55217	2020-04-16 10:50:15 -06:00
David Turner	7941f4a47e	Add RepositoriesService to createComponents() args (#54814 ) Today we pass the `RepositoriesService` to the searchable snapshots plugin during the initialization of the `RepositoryModule`, forcing the plugin to be a `RepositoryPlugin` even though it does not implement any repositories. After discussion we decided it best for now to pass this in via `Plugin#createComponents` instead, pending some future work in which plugins can depend on services more dynamically.	2020-04-16 16:27:36 +01:00
Marios Trivyzas	327d268673	SQL: [Test] Add test for a fixed bug for string scalars on aggs (#55304 ) (#55309 ) Added an integration test to validate behaviour of string scalars on top of aggregate functions. The behaviour was fixed with #49570. Relates to: #41597 (cherry picked from commit 35f964154850e3f02b6c7f9ca238da98ad83ebb3)	2020-04-16 16:41:54 +02:00
Benjamin Trent	2b68aa3471	muting test for issue 55068 (#55312 )	2020-04-16 10:32:12 -04:00
David Kyle	643ecf68b5	Remove InferenceConfigUpdate generic parameter (#55249 ) (#55301 ) Simplify the code by removing the generic type from InferenceConfigUpdate which meant wildcard types were used in many places. Instead check the class type is appropriate where used.	2020-04-16 13:44:53 +01:00
Ioannis Kakavas	ac87c10039	[7.x] Fix responses for the token APIs (#54532 ) (#55278 ) This commit fixes our behavior regarding the responses we return in various cases for the use of token related APIs. More concretely: - In the Get Token API with the `refresh` grant, when an invalid (already deleted, malformed, unknown) refresh token is used in the body of the request, we respond with `400` HTTP status code and an `error_description` header with the message "could not refresh the requested token". Previously we would erroneously return a `401` with "token malformed" message. - In the Invalidate Token API, when using an invalid (already deleted, malformed, unknown) access or refresh token, we respond with `404` and a body that shows that no tokens were invalidated: ``` { "invalidated_tokens":0, "previously_invalidated_tokens":0, "error_count":0 } ``` The previous behavior would be to erroneously return a `400` or `401` ( depending on the case ). - In the Invalidate Token API, when the tokens index doesn't exist or is closed, we return `400` because we assume this is a user issue either because they tried to invalidate a token when there is no tokens index yet ( i.e. no tokens have been created yet or the tokens index has been deleted ) or the index is closed. - In the Invalidate Token API, when the tokens index is unavailable, we return a `503` status code because we want to signal to the caller of the API that the token they tried to invalidate was not invalidated and we can't be sure if it is still valid or not, and that they should try the request again. Resolves: #53323	2020-04-16 14:05:55 +03:00
David Roberts	8489f8c121	[ML] Add test to prove categorization state written after lookback (#55297 ) When a datafeed transitions from lookback to real-time we request that state is persisted from the autodetect process in the background. This PR adds a test to prove that for a categorization job the state that is persisted includes the categorization state. Without the fix from elastic/ml-cpp#1137 this test fails. After that C++ fix is merged this test should pass. Backport of #55243	2020-04-16 11:55:18 +01:00
Dimitris Athanasiou	6c9e1fecc5	[7.x][ML] Mark task as completed when DFA job is stopped while reindexing (#55286 ) (#55290 ) After #54650 we catch `TaskCancelledException` when we wait for reindexing to complete as it may be thrown. However, when that happens we do not mark the task as completed. This results in the stop request never returning and the failures we saw in #55068. Closes #55068 Backport of #55286	2020-04-16 13:08:54 +03:00
David Roberts	ac11dd619c	Only ship Linux binaries for the correct architecture (#55280 ) Following elastic/ml-cpp#1135 there are now Linux binaries for both x86_64 and aarch64. The code that finds the correct binaries to ship with each distribution was including both on every Linux distribution. This change alters that logic to consider the architecture as well as the operating system. Also, there is no need to disable ML on aarch64 now that we have the native binaries available. ML is still not supported on aarch64, but the processes at least run up and work at a superficial level. Backport of #55256	2020-04-16 09:45:52 +01:00
David Roberts	5de6ddfef2	Mute ClassificationIT.testSetUpgradeMode_ExistingTaskGetsUnassigned Due to https://github.com/elastic/elasticsearch/issues/55221	2020-04-16 09:03:46 +01:00
Jay Modi	2d9e3c7794	Start resource watcher service early (#55275 ) The ResourceWatcherService enables watching of files for modifications and deletions. During startup various consumers register the files that should be watched by this service. There is behavior that might be unexpected in that the service may not start polling until later in the startup process due to the use of lifecycle states to control when the service actually starts the jobs to monitor resources. This change removes this unexpected behavior so that upon construction the service has already registered its tasks to poll resources for changes. In making this modification, the service no longer extends AbstractLifecycleComponent and instead implements the Closeable interface so that the polling jobs can be terminated when the service is no longer required. Relates #54867 Backport of #54993	2020-04-15 20:45:39 -06:00
Jason Tedor	cad1a3b0ad	Fix imports in CCRFeatureSet This commit fixes some imports that were mixed up during a backport. Because, backports.	2020-04-15 19:37:25 -04:00
Jason Tedor	a18faacf1b	Make feature usage version aware (#55246 ) Today we indiscriminately serialize these independent of the version on the stream, even though the other side might not understand a new feature set usage that we have added. For example, if we add feature set usage in 7.7 for EQL, in a mixed cluster context if a request is sent to an old coordinating node, but the master is a new version, then it would attempt to serialize the usage information for the new feature back to the old coordinating node, who will blow up on the unrecognized named writeable. This commit addresses this by making feature usage version aware, and only serializing those that the other side would understand.	2020-04-15 19:24:47 -04:00
William Brafford	2ba3be9db6	Remove deprecated third-party methods from tests (#55255 ) (#55269 ) I've noticed that a lot of our tests are using deprecated static methods from the Hamcrest matchers. While this is not a big deal in any objective sense, it seems like a small good thing to reduce compilation warnings and be ready for a new release of the matcher library if we need to upgrade. I've also switched a few other methods in tests that have drop-in replacements.	2020-04-15 17:54:47 -04:00
Ryan Ernst	29b70733ae	Use task avoidance with forbidden apis (#55034 ) Currently forbidden apis accounts for 800+ tasks in the build. These tasks are aggressively created by the plugin. In forbidden apis 3.0, we will get task avoidance (https://github.com/policeman-tools/forbidden-apis/pull/162), but we need to ourselves use the same task avoidance mechanisms to not trigger these task creations. This commit does that for our forbidden apis usages, in preparation for upgrading to 3.0 when it is released.	2020-04-15 13:27:53 -07:00
Henning Andersen	b3eb57a094	CCR: Test follow on top of closed index (#54956 ) Added testing of following on top of a closed index. This could for instance be the old leader index in cases where leader and follower clusters have been swapped.	2020-04-15 20:13:32 +02:00
Mark Vieira	5d4bc8aea6	Mute ModelLoadingServiceTests.testMaxCachedLimitReached	2020-04-15 10:25:51 -07:00
Ignacio Vera	a677b63daa	Upgrade to lucene 8.5.1 release (#55229 ) (#55235 ) Upgrade to lucene 8.5.1 release that contains a bug fix for a bug that might introduce index corruption when deleting data from an index that was previously shrunk.	2020-04-15 17:35:42 +02:00
Benjamin Trent	8ff2cbf1a3	[7.x] [ML] adding prediction_field_type to inference config (#55128 ) (#55230 ) * [ML] adding prediction_field_type to inference config (#55128) Data frame analytics dynamically determines the classification field type. This field type then dictates the encoded JSON that is written to Elasticsearch. Inference needs to know about this field type so that it may provide the EXACT SAME predicted values as analytics. Here is added a new field `prediction_field_type` which indicates the desired type. Options are: `string` (DEFAULT), `number`, `boolean` (where close_to(1.0) == true, false otherwise). Analytics provides the default `prediction_field_type` when the model is created from the process.	2020-04-15 09:45:22 -04:00
Armin Braun	2f91e2aab7	Fix Race in Snapshot Abort (#54873 ) (#55233 ) We can be a little more efficient when aborting a snapshot. Since we know the new repository data after finalizing the aborted snapshot we can pass it down to the snapshot completion listeners. This way, we don't have to fork off to the snapshot threadpool to get the repository data when the listener completes and can directly submit the delete task with high priority straight from the cluster state thread.	2020-04-15 15:42:15 +02:00
Przemysław Witek	b5fe565c89	Add log line that will help debug item failures during multi search request (#55220 ) (#55227 )	2020-04-15 15:01:17 +02:00
Hendrik Muhs	9ec9866acb	[Transform] simplify TransformConfigUpdate (#55224 ) removes the unnecessary ToXContent method in TransformConfigUpdate	2020-04-15 13:22:50 +02:00
Yannick Welsch	d1123281b1	Use unlimited cache size by default (#55218 ) Sets the default cache size for searchable snapshots to unlimited, which, for testing purposes, is a better default than the 1GB that we currently have.	2020-04-15 12:20:51 +02:00
David Kyle	bdf0eab78d	[7.x] Fix non-deterministic behaviour in ModelLoadingServiceTests (#55008 ) (#55213 )	2020-04-15 11:09:12 +01:00
Ioannis Kakavas	0f51934bcf	[7.x] Add support for more named curves (#55179 ) (#55211 ) We implicitly only supported the prime256v1 ( aka secp256r1 ) curve for the EC keys we read as PEM files to be used in any SSL Context. We would not fail when trying to read a key pair using a different curve but we would silently assume that it was using `secp256r1` which would lead to strange TLS handshake issues if the curve was actually another one. This commit fixes that behavior in that it supports parsing EC keys that use any of the named curves defined in rfc5915 and rfc5480 making no assumptions about whether the security provider in use supports them (JDK8 and higher support all the curves defined in rfc5480).	2020-04-15 12:33:40 +03:00
Dimitris Athanasiou	4000138105	[7.x][ML] Add debug logging for outlier detection stop and restart integ test (#55169 ) (#55202 ) To understand the failures in #55068 Backport of #55169	2020-04-15 10:40:38 +03:00
Lee Hinman	36f6e542a2	Ignore ILM indices in the TerminalPolicyStep (#55184 ) Prior to the change in #51631 indices were moved to the `TerminalPolicyStep` when their ILM actions had completed. Once we switched ILM to stop in the last policy configured, these steps became inaccessible from the policy's perspective. This meant that indices upgraded from ES prior to 7.7.0 could see the following error spammed in their logs every 10 minutes (by default) for every index in this state: ``` [2020-04-14T15:52:23,764][ERROR][o.e.x.i.IndexLifecycleRunner] [midgar] current step [{"phase":"completed","action":"completed","name":"completed"}] for index [foo] with policy [full] is not recognized ``` This changes the runner to ignore these steps, which is what is desired anyway since the index is already in the terminal phase.	2020-04-14 16:45:03 -06:00
Igor Motov	1754e50cbd	[7.x] Add analytics plugin usage stats to _xpack/usage (#54911 ) (#55162 ) Adds analytics plugin usage stats to _xpack/usage. Closes #54847	2020-04-14 17:03:14 -04:00
Mark Vieira	ce85063653	[7.x] Re-add origin url information to publish POM files (#55173 )	2020-04-14 13:24:15 -07:00
Albert Zaharovits	7f35b927d1	Preserve parent task id for ml transform (#55124 ) This change ensures that internal client requests spawned by the transform persistent task executor and that use the end user security credentials, have the parent task id assigned. The objective here is to permit auditing (as well as tracking for debugging purposes) of all the end-user requests executed on its behalf by persistent tasks. Because transform tasks already implement graceful shutdown of the child tasks, this change does not interfere with that by opting out of the persistent task cancellation of child tasks. Relates #55046 #54943 #52314 Closes #54957	2020-04-14 18:43:47 +03:00
Andrei Dan	d918ef0da9	[Tests] Enable searchable_snapshots for non-snapshot builds (#55151 ) (#55157 ) Fixes https://github.com/elastic/elasticsearch/issues/55050 (cherry picked from commit 13391ceff1cbf6db69706c5f46127b6ff8850a1f) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-04-14 16:13:39 +01:00
Albert Zaharovits	5998486ce8	Refactor AuditTrail for TransportRequests instead of TransportMessage (#55141 ) This commit refactors the `AuditTrail` to use the `TransportRequest` as a parameter for all its audit methods, instead of the current `TransportMessage` super class. The goal is to gain access to the `TransportRequest#parentTaskId` member, so that it can be audited. The `parentTaskId` is used internally when spawning tasks that handle transport requests; in this way tasks across nodes are related by the same parent task. Relates #52314	2020-04-14 16:53:59 +03:00
Yannick Welsch	a610513ec7	Provide repository-level stats for searchable snapshots (#55051 ) Provides basic repository-level stats that will allow us to get some insight into how many requests are actually being made by the underlying SDK. Currently only tracks GET and LIST calls for S3 repositories. Most of the code is unfortunately boilerplate to add a new endpoint that will help us better understand some of the low-level dynamics of searchable snapshots.	2020-04-14 14:34:08 +02:00
Przemysław Witek	d5bb574e1e	[7.x] Unassign DFA tasks in SetUpgradeModeAction (#54523 ) (#55143 )	2020-04-14 14:09:02 +02:00
Igor Motov	8a669dc9b7	EQL: Add cascading search cancellation (#54843 ) EQL search cancellation now propagates cancellation to underlying search operations. Relates to #49638	2020-04-14 08:06:02 -04:00
David Turner	87e8367ece	Fix testCreateAndRestoreSearchableSnapshot (#55147 ) Fixes a couple of related failures in SearchableSnapshotsIntegTests. Firstly, we were not correctly accounting for the case where the cache was so small that some/all files were read directly; fixed this by only asserting that the cache is definitely used if the corresponding node has a cache that's large enough to hold the whole index. Secondly, we were not permitting shards to be completely empty, which might be the case (rarely) if there were not many documents indexed and the distribution of IDs was a bit unlucky; fixed this by asserting that we get stats for at least one file for the whole index, rather than for each shard separately. Closes #55126	2020-04-14 11:54:46 +01:00
Ioannis Kakavas	70cc1d57fb	Mute failing test (#54734 )	2020-04-14 10:18:33 +01:00
Mark Vieira	cb58725164	Mute InferenceIngestIT.testPipelineIngest	2020-04-14 09:27:56 +01:00
debadair	e8fa539bea	[DOCS] Removed obsolete warning about no way to securely store passwords (#55133 ) (#55140 ) * [DOCS] Removed obsolete warning about no way to securely store passwords. * Update x-pack/docs/en/watcher/actions/email.asciidoc Co-Authored-By: James Rodewig <james.rodewig@elastic.co>	2020-04-13 21:38:32 -07:00
William Brafford	52bebec51f	NodeInfo response should use a collection rather than fields (#54460 ) (#55132 ) This is a first cut at giving NodeInfo the ability to carry a flexible list of heterogeneous info responses. The trick is to be able to serialize and deserialize an arbitrary list of blocks of information. It is convenient to be able to deserialize into usable Java objects so that we can aggregate nodes stats for the cluster stats endpoint. In order to provide a little bit of clarity about which objects can and can't be used as info blocks, I've introduced a new interface called "ReportingService." I have removed the hard-coded getters (e.g., getOs()) in favor of a flexible method that can return heterogeneous kinds of info blocks (e.g., getInfo(OsInfo.class)). Taking a class as an argument removes the need to cast in the client code.	2020-04-13 17:18:39 -04:00
Ryan Ernst	ae14d1661e	Replace license check isAuthAllowed with isSecurityEnabled (#54547 ) (#55082 ) The isAuthAllowed() method for license checking is used by code that wants to ensure security is both enabled and available. The enabled state is dynamic and provided by isSecurityEnabled(). But since security is available with all license types, a check on the license level is not necessary. Thus, this change replaces isAuthAllowed() with calling isSecurityEnabled().	2020-04-13 12:26:39 -07:00
Benjamin Trent	d32f6fed1d	[ML] inference only persist if there are stats (#54752 ) (#55121 ) We needlessly send documents to be persisted. If there are no stats added, then we should not attempt to persist them. Also, this PR fixes the race condition that caused issue: https://github.com/elastic/elasticsearch/issues/54786	2020-04-13 14:03:05 -04:00
Igor Motov	51c6f69e02	[7.x] Add support for filters to T-Test aggregation (#54980 ) (#55066 ) Adds support for filters to T-Test aggregation. The filters can be used to select populations based on some criteria and use values from the same or different fields. Closes #53692	2020-04-13 12:28:58 -04:00
Jake Landis	a2fafa6af4	[7.x] Lazy test cluster module and plugins (#54852 ) (#55087 ) This change converts the module and plugin parameters for testClusters to be lazy. Meaning that the values are not resolved until they are actually used. This removes the requirement to use project.afterEvaluate to be able to resolve the bundle artifact. Note - this does not completely remove the need for afterEvaluate since it is still needed for the custom resource extension.	2020-04-13 10:53:35 -05:00

5337 Commits