OpenSearch

Commit Graph

Author	SHA1	Message	Date
David Kyle	2905a2f623	Use Search After job iterators (#57875 ) (#57923 ) Search after is a better choice for the delete expired data iterators where processing takes a long time as unlike scroll a context does not have to be kept alive. Also changes the delete expired data endpoint to 404 if the job is unknown	2020-06-11 10:06:18 +01:00
Costin Leau	ff0ea62cb8	EQL: Fix casing for tiebreaker field (#57943 ) Use tiebreaker instead of tieBreaker (cherry picked from commit 3c774948a5d5e10fac267cb9a54f5d0559a00c1d)	2020-06-11 00:10:19 +03:00
Albert Zaharovits	c57ccd99f7	Just log 401 stacktraces (#55774 ) Ensure stacktraces of 401 errors for unauthenticated users are logged but not returned in the response body.	2020-06-10 20:39:32 +03:00
Valeriy Khakhutskyy	c0f368bbf3	[7.x][ML] Adjust assertion for job case memory usage estimates (#57929 ) Since we changed the memory estimates for data frame analytics jobs from worst case to a realistic case, the strict less-than assertion in the test does not hold anymore. I replaced it with a less-than-or-equal assertion. Backport of #57882	2020-06-10 15:17:16 +02:00
Aleksandr Maus	ec60335496	EQL: implement case sensitivity for indexOf and endsWith string functions (#57707 ) (#57908 ) * EQL: implement case sensitivity for indexOf and endsWith string functions	2020-06-10 08:55:49 -04:00
Andrei Dan	9f280621ba	[7.x] ILM add data stream support to searchable snapshot action (#57873 ) (#57916 ) (cherry picked from commit 34856a90532c6c62a53817bb395399c8a8c17c0f) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-06-10 10:16:57 +01:00
Yannick Welsch	80f221e920	Use clean thread context for transport and applier service (#57792 ) (#57914 ) Adds assertions to Netty to make sure that its threads are not polluted by thread contexts (and also that thread contexts are not leaked). Moves the ClusterApplierService to use the system context (same as we do for MasterService), which allows removing a hack from TemplateUpgradeService and makes it clearer that applying CS updates is fully executing under system context.	2020-06-10 10:30:28 +02:00
Hendrik Muhs	95bd7b63b0	[Transform] fix page size return in cat transform, add dps (#57871 ) fixes the page size reported after moving page size to settings(#56007) and adds documents per second(throttling) to the output. fixes #56498	2020-06-10 08:10:25 +02:00
Yang Wang	72a6441a88	Revert "Resolve anonymous roles and deduplicate roles during authentication (#53453 ) (#55995 )" (#57858 ) This reverts commit `84a2f1adf2`.	2020-06-10 10:42:52 +10:00
Jake Landis	a370d5eead	[7.x] Ensure Joni warnings are logged at debug (#57302 ) (#57897 ) When Joni, the regex engine that powers grok, emits a warning, it does so by default to System.err. System.err logs are all bucketed together in the server log at WARN level. When Joni emits a warning, it can be extremely verbose, logging a message for each execution against that pattern. For an ingest node, that means for every document that is run through Grok. Fortunately, Joni provides a callback hook to push these warnings to a custom location. This commit implements Joni's callback hook to push the Joni warnings to the Elasticsearch server logger (logger.org.elasticsearch.ingest.common.GrokProcessor) at debug level. Generally these warnings indicate a possible issue with the regular expression, and upon creation the Grok processor will do a "test run" of the expression and log the result (if any) at WARN level. This WARN level log should only occur on pipeline creation, which is a much lower frequency than every document. Additionally, the documentation is updated with instructions for how to set the logger to debug level.	2020-06-09 17:06:29 -05:00
Yannick Welsch	9eec819c5b	Revert "Use clean thread context for transport and applier service (#57792 )" This reverts commit `259be236cf`.	2020-06-09 22:24:54 +02:00
Costin Leau	439205d1ea	EQL: Introduce tie breaker support (#57787 ) Allow a field inside the data to be used as a tie breaker for events that have the same timestamp. The field is optional by default. If used, the tie-breaker always requires a non-null value since it is used inside `search_after` which requires a non-null value. Fix #56824 (cherry picked from commit e5719ecb474b32730d93afdbb6834a32b0b2df8b)	2020-06-09 22:50:19 +03:00
Andrei Dan	3945712c72	[7.x] ILM add data stream support to the Shrink action (#57616 ) (#57884 ) The shrink action creates a shrunken index with the target number of shards. This makes the shrink action data stream aware. If the ILM managed index is part of a data stream the shrink action will make sure to swap the original managed index with the shrunken one as part of the data stream's backing indices and then delete the original index. (cherry picked from commit 99aeed6acf4ae7cbdd97a3bcfe54c5d37ab7a574) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-06-09 19:45:22 +01:00
Nik Everett	44a79d1739	Deprecate Rounding#round (#57845 ) (#57893 ) This deprecates `Rounding#round` and `Rounding#nextRoundingValue` in favor of calling ``` Rounding.Prepared prepared = rounding.prepare(min, max); ... prepared.round(val) ``` because it is always going to be faster to prepare once. There are going to be some cases where we won't know what to prepare for and in those cases you can call `prepareForUnknown` and still be faster than calling the deprecated method over and over and over again. Ultimately, this is important because it doesn't look like there is an easy way to cache `Rounding.Prepared` or any of its precursors like `LocalTimeOffset.Lookup`. Instead, we can just build it at most once per request. Relates to #56124	2020-06-09 14:30:56 -04:00
Dan Hermann	b501b282f8	Change default backing index naming scheme	2020-06-09 09:31:34 -05:00
Hossein Dehghan	2c6bd978d8	[Docs] Fix missing closing bracket for watcher webhook.asciidoc (#57803 )	2020-06-09 13:59:51 +02:00
Yannick Welsch	259be236cf	Use clean thread context for transport and applier service (#57792 ) Adds assertions to Netty to make sure that its threads are not polluted by thread contexts (and also that thread contexts are not leaked). Moves the ClusterApplierService to use the system context (same as we do for MasterService), which allows removing a hack from TemplateUpgradeService and makes it clearer that applying CS updates is fully executing under system context.	2020-06-09 12:32:28 +02:00
Andrei Stefan	3cc8166946	SQL: handle MIN and MAX functions on dates in Painless scripts (#57605 ) (#57863 ) * Convert to date/datetime the result of numeric aggregations (min, max) in Painless scripts (cherry picked from commit f1de99e2a6fbf3806c4f2b6b809738aa8faa2d75)	2020-06-09 10:09:01 +03:00
Benjamin Trent	d5522c2747	[ML] add new circuit breaker for inference model caching (#57731 ) (#57830 ) This adds new plugin level circuit breaker for the ML plugin. `model_inference` is the circuit breaker qualified name. Right now it simply adds to the breaker when the model is loaded (and possibly breaking) and removing from the breaker when the model is unloaded.	2020-06-08 16:02:48 -04:00
Armin Braun	0987c0a5f3	Fix Broken Numeric Shard Generations in RepositoryData (#57813 ) (#57821 ) Fix broken numeric shard generations when reading them from the wire or physically from the physical repository. This should be the cheapest way to clean up broken shard generations in a BwC and safe-to-backport manner for now. We can potentially further optimize this by also not doing the checks on the generations based on the versions we see in the `RepositoryData` but I don't think it matters much since we will read `RepositoryData` from cache in almost all cases. Closes #57798	2020-06-08 18:36:56 +02:00
Przemysław Witek	7a1300a09e	[7.x] Make ModelPlotConfig.annotations_enabled default to ModelPlotConfig.enabled if unset (#57808 ) (#57815 )	2020-06-08 17:41:12 +02:00
Mayya Sharipova	70e63a365a	Refactor how to determine if a field is metafield (#57378 ) (#57771 ) Before to determine if a field is meta-field, a static method of MapperService isMetadataField was used. This method was using an outdated static list of meta-fields. This PR instead changes this method to the instance method that is also aware of meta-fields in all registered plugins. Related #38373, #41656 Closes #24422	2020-06-08 09:16:18 -04:00
Andrei Dan	1b84e93d83	[7.x] DataStream creation validation allows for prefixed indices (#57750 ) (#57799 ) We want to validate the DataStreams on creation to make sure the future backing indices would not clash with existing indices in the system (so we can always rollover the data stream). This changes the validation logic to allow for a DataStream to be created with a backing index that has a prefix (eg. `shrink-foo-000001`) even if the former backing index (`foo-000001`) exists in the system. The new validation logic will look for potential index conflicts with indices in the system that have the counter in the name greater than the data stream's generation. This ensures that the `DataStream`'s future rollovers are safe because for a `DataStream` `foo` of generation 4, we will look for standalone indices in the form of `foo-%06d` with the counter greater than 4 (ie. validation will fail if `foo-000006` exists in the system), but will also allow replacing a backing index with an index named by prefixing the backing index it replaces. (cherry picked from commit 695b242d69f0dc017e732b63737625adb01fe595) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-06-08 13:31:52 +01:00
David Kyle	08d1286de7	[7.x] Delete expired data by job (#57337 ) (#57796 ) Deleting expired data can take a long time leading to timeouts if there are many jobs. Often the problem is due to a few large jobs which prevent the regular maintenance of the remaining jobs. This change adds a job_id parameter to the delete expired data endpoint to help clean up those problematic jobs.	2020-06-08 13:00:23 +01:00
Luca Cavanna	7a06a13d99	Add description to submit and get async search, as well as cancel tasks (#57745 ) This makes it easier to debug where such tasks come from in case they are returned from the get tasks API. Also renamed the last occurrence of waitForCompletion to waitForCompletionTimeout in get async search request.	2020-06-08 11:17:29 +02:00
Luca Cavanna	06ef3042c1	Specify reason whenever async search gets cancelled (#57761 ) This allows to trace where the cancel tasks request came from given that it may be triggered for multiple reasons.	2020-06-08 10:25:31 +02:00
David Roberts	1d64d55a86	[7.x][ML] Add per-partition categorization option (#57723 ) This PR adds the initial Java side changes to enable use of the per-partition categorization functionality added in elastic/ml-cpp#1293. There will be a followup change to complete the work, as there cannot be any end-to-end integration tests until elastic/ml-cpp#1293 is merged, and also elastic/ml-cpp#1293 does not implement some of the more peripheral functionality, like stop_on_warn and per-partition stats documents. The changes so far cover REST APIs, results object formats, HLRC and docs. Backport of #57683	2020-06-06 08:15:17 +01:00
Benjamin Trent	9666a895f7	[ML] inference performance optimizations and refactor (#57674 ) (#57753 ) This is a major refactor of the underlying inference logic. The main change is that we now separate the model configuration and the inference interfaces. This has the following benefits: - we can store extra things with the model that are not necessary for inference (i.e. treenode split information gain) - we can optimize inference separate from model serialization and storage. - The user is oblivious to the optimizations (other than seeing the benefits). A major part of this commit is removing all inference related methods from the trained model configurations (ensemble, tree, etc.) and moving them to a new class. This new class satisfies a new interface that is ONLY for inference. The optimizations applied currently are: - feature maps are flattened once - feature extraction only happens once at the highest level (improves inference + feature importance throughput) - Only storing what we need for inference + feature importance on heap	2020-06-05 14:20:58 -04:00
Jake Landis	459ab9a0b2	[7.x] Ensure type exists for all monitoring configuration (#57399 ) (#57704 ) #47711 and #47246 helped to validate that monitoring settings are rejected at time of setting the monitoring settings. Otherwise an invalid monitoring setting can find its way into the cluster state and result in an exception thrown [1] on the cluster state application (thereby causing significant issues). Some additional monitoring settings have been identified that can result in invalid cluster state that also result in exceptions thrown on cluster state application. All settings require a type of either http or local to be applicable. When a setting is changed, the exporters are automatically updated with the new settings. However, if the old or new settings lack a type setting, an exception will be thrown (since exporters are always of type 'http' or 'local'). Arguably we shouldn't blindly create and destroy new exporters on each monitoring setting update, but the lifecycle of the exporters is a bit outside the scope this PR is trying to address. This commit introduces a similar methodology to check for validity as #47711 and #47246 but this time for ALL (including non-http) settings. Monitoring settings are not useful unless there is an exporter with a type defined. The type is used as a dependent setting, such that it must exist to set the value. This ensures that when any monitoring settings change, they can only get added to cluster state if the type exists. If the type exists (and the other validations pass) then the exporters will get re-built and the cluster state remains valid. Tests have been included to ensure that all dynamic monitoring settings have the type as dependent settings. [1] org.elasticsearch.common.settings.SettingsException: missing exporter type for [found-user-defined] exporter at org.elasticsearch.xpack.monitoring.exporter.Exporters.initExporters(Exporters.java:126) ~[?:?]	2020-06-05 10:47:11 -05:00
Dimitris Athanasiou	f49a14ce6f	[7.x][ML] Fix race condition when force stopping DF analytics job (#57680 ) (#57717 ) When we force delete a DF analytics job, we currently first force stop it and then we proceed with deleting the job config. This may result in logging errors if the job config is deleted before it is retrieved while the job is starting. Instead of force stopping the job, it would make more sense to try to stop the job gracefully first. So we now try that out first. If normal stop fails, then we resort to force stopping the job to ensure we can go through with the delete. In addition, this commit introduces `timeout` for the delete action and makes use of it in the child requests. Backport of #57680	2020-06-05 17:50:01 +03:00
Tanguy Leroux	0e57528d5d	Remove more //NORELEASE (#57517 ) We agreed on removing the following //NORELEASE tags.	2020-06-05 15:34:06 +02:00
Hendrik Muhs	61c496d320	[Transform] use old roles only together with old endpoints (#57710 ) avoids a CI failure if new endpoints used together with old roles and warnings are asserted.	2020-06-05 10:08:05 +02:00
Hendrik Muhs	e91b975878	[Transform] mark old data frame transform roles deprecated (#57655 ) mark old data frame transform roles deprecated fixes #50087	2020-06-05 09:20:35 +02:00
Hendrik Muhs	c1c8817eae	[7.x][Transform] improve update API (#57685 ) rewrite config on update if either version is outdated, credentials change, the update changes the config or deprecated settings are found. Deprecated settings get migrated to the new format. The upgrade can be easily extended to do any necessary re-writes. fixes #56499 backport #57648	2020-06-05 08:48:47 +02:00
Jake Landis	f4a3d969ad	[7.x] Ensure default watches are updated for rolling upgrades. (#57185 ) (#57563 ) For a rolling/mixed cluster upgrade (add new version to existing cluster then shutdown old instances), the watches that ship by default with monitoring may not get properly updated to the new version. Monitoring watches can only get published if the internal state is marked as dirty. If a node is not master, the internal state will also get marked as clean (i.e. not dirty). For a mixed cluster upgrade, it is possible for the new node to be added, not as master, and for the internal state to get marked as clean so that no more attempts can be made to publish the watches. This happens on all new nodes. Once the old nodes are de-commissioned, one of the new version nodes in the cluster gets promoted to master. However, that new master node (without intervention like restarting the node or removing/adding exporters) will never attempt to re-publish since the internal state was already marked as clean. This commit adds a cluster state listener to mark the resource dirty when a node is promoted to master. This will allow the new resource to be published without any intervention.	2020-06-04 16:44:36 -05:00
William Brafford	dfb6def3da	Revert "Restore xpack.ilm.enabled and xpack.slm.enabled settings (#57383 )" This reverts commit `7a67fb2d04`.	2020-06-04 16:25:05 -04:00
Ioannis Kakavas	8afd55ebe6	Disable testing conventions for idp in fips (#57663 ) (#57676 ) Since we disable both integTest and test tasks. This should have been part of #57048 but we missed it.	2020-06-04 20:51:38 +03:00
Ioannis Kakavas	af9f9d7f03	[7.x] Add http proxy support for OIDC realm (#57039 ) (#57584 ) This change introduces support for using an http proxy for egress communication of the OpenID Connect realm.	2020-06-04 20:51:00 +03:00
William Brafford	7a67fb2d04	Restore xpack.ilm.enabled and xpack.slm.enabled settings (#57383 ) In #55592 and #55416, we deprecated the settings for enabling and disabling basic license features and turned those settings into no-ops. Since doing so, we've had feedback that this change may not give users enough time to cleanly switch from non-ILM index management tools to ILM. If two index managers operate simultaneously, results could be strange and difficult to reconstruct. We don't know of any cases where SLM will cause a problem, but we are restoring that setting as well, to be on the safe side. This PR is not a strict commit reversion. First, we are keeping the new xpack.watcher.use_ilm_index_management setting, introduced when xpack.ilm.enabled was made a no-op, so that users can begin migrating to using it. Second, the SLM setting was modified in the same commit as a group of other settings, so I have taken just the changes relating to SLM.	2020-06-04 13:38:22 -04:00
Mark Vieira	9b0f5a1589	Include vendored code notices in distribution notice files (#57017 ) (#57569 ) (cherry picked from commit 627ef279fd29f8af63303bcaafd641aef0ffc586)	2020-06-04 10:34:24 -07:00
Przemysław Witek	6b5f49d097	[7.x] Introduce ModelPlotConfig.annotations_enabled setting (#57539 ) (#57641 )	2020-06-04 15:15:35 +02:00
Benjamin Trent	ea9b8b9d41	[ML] fix setting forecasts to failed method (#57654 ) (#57656 )	2020-06-04 08:54:46 -04:00
Rene Groeschke	751f16858b	Remove duplicate ssl setup in sql/qa projects (#57319 ) (#57643 ) * Remove duplicate ssl setup in sql/qa projects * Fix enforcement of task instances * Use static data for cert generation * Move ssl testing logic into a plugin * Document test cert creation	2020-06-04 14:53:23 +02:00
Marios Trivyzas	5f8442d1f4	SQL: Improve performance of LTRIM/RTRIM (#57603 ) Change the custom implementation for stripping leading and trailing whitespace to substantially improve performance: ``` Benchmark Mode Cnt Score Error Units StringTrim.testWithStringBuilder avgt 25 82547.575 ± 66.244 ns/op (existing impl) StringTrim.testWithSubstring avgt 25 1398.762 ± 101.152 ns/op (new impl) StringTrim.testWithJavaStrip avgt 25 1186.120 ± 10.374 ns/op (for reference) ``` Java's string stripLeading()/stripTrailing() are not available on all supported JDKs. Enhanced LENGTH unit tests and combined a couple of LTRIM/RTRIM integ tests. Relates to: #57594 (partially cherry picked from commit ee7868d68733f195dc46926a7eab3d9dd7033ef4) Co-authored-by: Bogdan Pintea <bogdan.pintea@elastic.co>	2020-06-04 13:43:49 +02:00
Igor Motov	8d7f389f3a	Increase search.max_buckets to 65,535 (#57042 ) Increases the default search.max_buckets limit to 65,535, and only counts buckets during reduce phase. Closes #51731	2020-06-03 15:35:41 -04:00
Julie Tibshirani	e0a15e8dc4	Remove the 'array value parser' marker interface. (#57571 ) (#57622 ) This PR replaces the marker interface with the method FieldMapper#parsesArrayValue. I find this cleaner and it will help with the fields retrieval work (#55363). The refactor also ensures that only field mappers can declare they parse array values. Previously other types like ObjectMapper could implement the marker interface and be passed array values, which doesn't make sense.	2020-06-03 11:30:14 -07:00
Marios Trivyzas	a674844893	SQL: Implement TRIM function (#57518 ) (#57593 ) Add `TRIM` function which combines the functionality of both `LTRIM` and `RTRIM` by stripping both leading and trailing whitespaces. Refers to #41195 (cherry picked from commit 6c86c919e12f0c4cb5e39d129aa65ab3e274268f)	2020-06-03 15:19:48 +02:00
Ioannis Kakavas	64583f7ec4	Mute EmailSslTests test case in fips (#57576 ) (#57577 ) We test expected TLS failures by catching SSLException, but other security providers ( i.e. BCFIPS ) might throw a different one. In this case, BCFIPS throws org.bouncycastle.tls.TlsFatalAlert	2020-06-03 11:23:31 +03:00
Marios Trivyzas	634936e3be	SQL: [Tests] Enable tests which have been fixed (#57526 ) (#57538 ) Enable integration tests for issues that have been fixed over time. (cherry picked from commit 117759ee152bcfb0043e5af3a784302ca31f6b8c)	2020-06-02 23:38:33 +02:00
Nik Everett	2a27c411fb	Save memory when geo aggregations are not on top (#57483 ) (#57551 ) Saves memory when the `geotile_grid` and `geohash_grid` are not on the top level by using the `LongKeyedBucketOrds` we built in #55873.	2020-06-02 16:21:50 -04:00
Dan Hermann	97a51272b0	Fix incorrect log warning when exporting monitoring via HTTP without authentication (#57552 )	2020-06-02 15:03:55 -05:00
Mark Tozzi	e50f514092	IndexFieldData should hold the ValuesSourceType (#57373 ) (#57532 )	2020-06-02 12:16:53 -04:00
Armin Braun	ba2d70d8eb	Serialize Outbound Messages on IO Threads (#56961 ) (#57080 ) Almost every outbound message is serialized to buffers of 16k pagesize. We were serializing these messages off the IO loop (and retaining the concrete message instance as well) and would then enqueue it on the IO loop to be dealt with as soon as the channel is ready. 1. This would cause buffers to be held onto for longer than necessary, causing less reuse on average. 2. If a channel was slow for some reason, not only would concrete message instances queue up for it, but also 16k of buffers would be reserved for each message until it would be written+flushed physically. With this change, the serialization happens on the event loop which effectively limits the number of buffers that `N` IO-threads will ever use so long as messages are small and channels writable. Also, this change dereferences the reference to the concrete outbound message as soon as it has been serialized to save some more on GC. This reduces the GC time for a default PMC run by about 50% in experiments (3 nodes, 2G heap each, loopback ... obvious caveat is that GC isn't that heavy in the first place with recent changes but still a measurable gain). I also expect it to be helpful for master node stability by causing less of a spike if master is e.g. hit by a large number of requests that are processed batched (e.g. shard snapshot status updates) and responded to in a short time frame all at once. Obviously, the downside to this change is that it introduces more latency on the IO loop for the serialization. But since we read all of these messages on the IO loop as well I don't see it as much of a qualitative change really and the more predictable buffer use seems much more valuable relatively.	2020-06-02 16:15:18 +02:00
Rene Groeschke	8584da40af	Move classes from build scripts to buildSrc (#57197 ) (#57512 ) * Move classes from build scripts to buildSrc - move Run task - move duplicate SanEvaluator * Remove :run workaround * Some little cleanup on build scripts on the way	2020-06-02 15:33:53 +02:00
Andrei Dan	bd188f4a21	[7.x] ILM: add support for rolling over data streams (#57295 ) (#57515 ) As the datastream information is stored in the `ClusterState.Metadata` we exposed the `Metadata` to the `AsyncWaitStep#evaluateCondition` method in order for the steps to be able to identify when a managed index is part of a DataStream. If a managed index is part of a DataStream the rollover target is the DataStream name and the highest generation index is the write index (ie. the rolled index). (cherry picked from commit 6b410dfb78f3676fce1b7401f1628c1ca6fbd45a) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-06-02 11:55:23 +01:00
Przemysław Witek	ea6cfb7c3d	[7.x] Make Annotation a result type (#56342 ) (#57508 )	2020-06-02 11:56:41 +02:00
Tanguy Leroux	b4a2cd810a	Use 3rd party task to run integration tests on external service (#56588 ) Backport of #56587 for 7.x	2020-06-02 11:26:58 +02:00
Marios Trivyzas	52c555e286	SQL: Make CASTing string to DATETIME more lenient (#57451 ) (#57509 ) Some BI tools (i.e. Tableau) would try to cast strings where the time part is separated from the date part with a whitespace instead of `T`. Adjust type conversion used by CAST to support this. (cherry picked from commit 0e18321e7ad9f779c42855efbf93f171b9128a5e)	2020-06-02 10:54:03 +02:00
Marios Trivyzas	b8a13de20f	SQL: Implement TOP as an alternative to LIMIT (#57428 ) (#57507 ) Add basic support for `TOP X` as a synonym to LIMIT X which is used by [MS-SQL server](https://docs.microsoft.com/en-us/sql/t-sql/queries/top-transact-sql?view=sql-server-ver15), e.g.: ``` SELECT TOP 5 a, b, c FROM test ``` TOP in SQL server also supports the `PERCENTAGE` and `WITH TIES` keywords which this implementation doesn't. Don't allow usage of both TOP and LIMIT in the same query. Refers to #41195 (cherry picked from commit 2f5ab81b9ad884434d1faa60f4391f966ede73e8)	2020-06-02 10:53:42 +02:00
Przemysław Witek	ceb4b29b98	Introduce Annotation.event field (#57144 ) (#57453 )	2020-06-01 20:42:25 +02:00
Mark Tozzi	1f500583b1	Clean up Aggregator Supplier Boiler Plate (#57442 ) (#57452 )	2020-06-01 14:21:07 -04:00
Zachary Tong	daaf5a3dcc	Fix assertion catching in aggregation supported type test (#56466 ) (#57382 ) At some point, we changed the supported-type test to also catch assertion errors. This has the side effect of also catching the `fail()` call inside the try-catch, which silently smothered some failures. This modifies the test to throw at the end of the try-catch block to prevent from accidentally catching itself. Catching the AssertionError is convenient because there are other locations that do throw an assertion in tests (due to hitting an assertion before the exception is thrown) so I think we should keep it around. Also includes a variety of fixes to other tests which were failing but being silently smothered.	2020-06-01 12:10:05 -04:00
David Kyle	064093c4d4	Fix compilation after backport of #57278	2020-06-01 12:03:13 +01:00
Przemysław Witek	72ad9a4548	[7.x] Make AnnotationPersister use bulk requests instead of indexing individual documents (#57278 ) (#57354 )	2020-06-01 12:05:09 +02:00
David Roberts	9fdf1722e6	[TEST] Fix more allowed warnings for composable template rename (#57398 ) Should have been done in #57232	2020-05-31 18:14:48 +01:00
Benjamin Trent	34f1e0b6bb	[7.x] [ML] mark forecasts for force closed/failed jobs as failed (#57143 ) (#57374 ) * [ML] mark forecasts for force closed/failed jobs as failed (#57143) forecasts that are still running should be marked as failed/finished in the following scenarios: - Job is force closed - Job is re-assigned to another node. Forecasts are not "resilient". Their execution does not continue after a node failure. Consequently, forecasts marked as STARTED or SCHEDULED should be flagged as failed. These forecasts can then be deleted. Additionally, force closing a job kills the native task directly. This means that if a forecast was running, it is not allowed to complete and could still have the status of `STARTED` in the index. relates to https://github.com/elastic/elasticsearch/issues/56419	2020-05-29 14:48:10 -04:00
Benjamin Trent	35d5126cea	[7.x] [ML] adds new for_export flag to GET _ml/inference API (#57351 ) (#57368 ) * [ML] adds new for_export flag to GET _ml/inference API (#57351) Adds a new boolean flag, `for_export` to the `GET _ml/inference/<model_id>` API. This flag is useful for moving models between clusters.	2020-05-29 14:01:08 -04:00
Benjamin Trent	15aba60c02	[7.x] Add new circuitbreaker plugin and refactor CircuitBreakerService (#55695 ) (#57359 ) * Add new circuitbreaker plugin and refactor CircuitBreakerService (#55695) This commit lays the ground work for plugins supplying their own circuit breakers. It adds a new interface: `CircuitBreakerPlugin`. This interface provides methods for providing custom child CircuitBreaker objects. There are also facilities for allowing dynamic settings for the custom breakers. With the refactor, circuit breakers are no longer replaced on setting changes. Instead, the two mutable settings themselves are `volatile`. Plugins that want to use their custom circuit breaker should keep a reference of their constructed breaker.	2020-05-29 12:13:46 -04:00
Benjamin Trent	c8374dc9f3	[ML] add max_model_memory parameter to forecast request (#57254 ) (#57355 ) This adds a max_model_memory setting to forecast requests. This setting can take a string value that is formatted according to byte sizes (i.e. "50mb", "150mb"). The default value is `20mb`. There is a HARD limit at `500mb` which will throw an error if used. If the limit is larger than 40% the anomaly job's configured model limit, the forecast limit is reduced to be strictly lower than that value. This reduction is logged and audited. related native change: https://github.com/elastic/ml-cpp/pull/1238 closes: https://github.com/elastic/elasticsearch/issues/56420	2020-05-29 11:16:08 -04:00
Marios Trivyzas	b2651323fd	SQL: Implement TIME_PARSE function for parsing strings into TIME values (#55223 ) (#57342 ) Implement TIME_PARSE(<time_str>, <pattern_str>) function which allows to parse a time string according to the specified pattern into a time object. The patterns allowed are those of java.time.format.DateTimeFormatter. Closes #54963 Co-authored-by: Andrei Stefan <astefan@users.noreply.github.com> Co-authored-by: Patrick Jiang(白泽) <patrickjiang0530@gmail.com> (cherry picked from commit 1fe1188d449cad7d0782a202372edc52a4014135)	2020-05-29 15:48:37 +02:00
Dan Hermann	6b0d707671	[7.x] Do not report negative values for swap sizes (#57353 )	2020-05-29 08:11:47 -05:00
Martijn van Groningen	04ef39da77	Change cluster info actions to be able to resolve data streams. (#57343 ) Backport of #56878 to 7.x branch. With this change the following APIs will be able to resolve data streams: get index, get mappings and ilm explain APIs. Relates to #53100	2020-05-29 12:17:53 +02:00
Dimitris Athanasiou	322f953060	[7.x][ML] Anomaly detection jobs should allow missing values for geo fields (#57300 ) (#57338 ) Allows geo fields (`geo_point`, `geo_shape`) to have missing values. Fixes a bug where such missing values would result in an error. Closes #57299 Backport of #57300	2020-05-29 13:06:16 +03:00
Benjamin Trent	24d605e41e	[ML] fixing GET _ml/inference so size param is respected (#57303 ) (#57308 ) `size` was previously ignored when grabbing full trained model configs. closes https://github.com/elastic/elasticsearch/issues/57298	2020-05-28 15:45:26 -04:00
Martijn van Groningen	225ccd1cfa	Ensure template exists when creating data stream (#57275 ) Backporting #56888 to 7.x branch. Limit the creation of data streams only for namespaces that have a composable template with a data stream definition. This way we ensure that mappings/settings have been specified and will be used at data stream creation and data stream rollover. Also remove `timestamp_field` parameter from create data stream request and let the create data stream api resolve the timestamp field from the data stream definition snippet inside a composable template. Relates to #53100	2020-05-28 15:08:25 +02:00
Marios Trivyzas	fdac9e99fa	SQL: Fix unnecessary evaluation for CASE/IIF (#57159 ) (#57262 ) Previously, when `CASE` and `IIF` were translated to painless scripts (used in GROUP BY, HAVING, WHERE), a custom `caseFunction` registered in the `InternalSqlScriptUtils` was used. This function received an array of arbitrary length: ```[condition1, result1, condition2, result2, ... elseResult]``` Painless doesn't know of the context and therefore evaluates all conditions and results before invoking the `caseFunction` on them. As a consequence, erroneous result expressions (e.g. division by 0) were always evaluated despite the guarding condition. Replace the `caseFunction` with painless `<cond> ? <res1> : <res2>` expressions to properly guard the result expressions and only evaluate the one for which its guarding condition evaluates to true (or of course the elseResult). As a bonus, this approach includes performance benefits since we avoid unnecessary evaluations of both conditions and result expressions. Fixes: #49672 (cherry picked from commit 9584b345d89f797bfb658212b928b9812804f02f)	2020-05-28 11:30:14 +02:00
Tim Vernum	408250dcc4	Fix smtp.ssl.trust setting for watcher email (#57268 ) The ssl.trust setting for Watcher provides a list of hostnames that should be automatically trusted for SSL hostname verification. It was accidentally broken when we added the full ssl.* settings for email notifications (see #45272) This commit corrects this, so the setting is once again respected, as long as none of the other ssl settings are configured for email notifications. Resolves: #52153 Backport of: #56090	2020-05-28 17:34:13 +10:00
Ryan Ernst	fdb8573413	Convert remaining compilerJavaHome reference	2020-05-27 17:04:04 -07:00
Ryan Ernst	beb1d0c338	Remove compiler java version flag (#57237 ) This commit removes the compiler.java setting from the build. It was originally added when Gradle was far behind support for the latest jdk, but is no longer applicable as we don't have any need to update the supported compile version before gradle supports the newer version. Note that the runtime version changing support still exists here, this only ensures we use the same jdk to compile as we use to run gradle.	2020-05-27 16:33:38 -07:00
David Roberts	d139a79ef6	[7.x][ML] Fix monitoring if orphaned anomaly detector persistent tasks exist (#57240 ) Since #51888 the ML job stats endpoint has returned entries for jobs that have a persistent task but no job config. Such orphaned tasks caused monitoring to fail. This change ignores any such corrupt jobs for monitoring purposes. Backport of #57235	2020-05-27 22:59:11 +01:00
James Baiera	3b73ce3112	Fix enrich coordinator to reject documents instead of deadlocking (#56247 ) (#57179 ) This PR removes the blocking call to insert ingest documents into a queue in the coordinator. It replaces it with an offer call which will throw a rejection exception in the event that the queue is full. This prevents deadlocks of the write threads when the queue fills to capacity and there is more than one enrich processor in a pipeline.	2020-05-27 15:32:13 -04:00
Lee Hinman	c0f732b9f6	[7.x] Rename template V2 classes to ComposableTemplate (#57183 ) (#57232 ) Backports the following commits to 7.x: Rename template V2 classes to ComposableTemplate (#57183)	2020-05-27 11:01:59 -06:00
AndyHunt66	6760c69783	[DOCS] Fix formatting of create API key API docs (#57138 )	2020-05-27 08:34:51 -04:00
Tal Levy	81060820e9	Fix NormalizerAgg test searcher wrapping (#57171 ) The searcher was randomly wrapping its reader as slow, parallel, or filtered. This was causing casting issues in the normalizer tests. By removing the wrapping, the problem goes away. Closes #57164	2020-05-26 13:25:19 -07:00
Benjamin Trent	decc6277f9	[ML] allow unran/incomplete forecasts to be deleted for stopped/failed jobs (#57152 ) (#57172 ) If a job is NOT opened, forecasts should be able to be deleted, no matter their state. This also fixes a bug with expanding forecast IDs. We should check for wildcard `*` and `_all` when expanding the ids closes https://github.com/elastic/elasticsearch/issues/56419	2020-05-26 15:44:22 -04:00
Bogdan Pintea	74b2c8a770	Change error message for comp against fields (#57126 ) Change the error message wording for comparisons against fields in filtering (s/variables/fields). (cherry picked from commit d9a1cb50940d0a98fd75b9c0123ca6e1d862f65d)	2020-05-26 17:57:51 +02:00
Bogdan Pintea	0c379e334a	SQL: update the JLine dependency to 3.14.1 (#57111 ) * Update the JLine dependency to 3.14.1 Update the JLine dependency from 3.10.0 to 3.14.1. (cherry picked from commit c2d9b74046fa5ddb54604da3afa7887cc38548a1)	2020-05-26 17:56:34 +02:00
markharwood	b2bc6071fd	Add regex query support to wildcard field (approach 2) (#55548 ) (#57141 ) Backport of #55548 Adds equivalence for keyword field to the wildcard field. Regex, fuzzy, wildcard and prefix queries are all supported. All queries use an approximation query backed by an automaton-based verification query. Closes #54275	2020-05-26 16:55:59 +01:00
markharwood	1d74549d7f	Wildcard field - add support for null field with test (#57047 ) (#57139 ) Backport of #57047	2020-05-26 16:07:49 +01:00
David Kyle	571477d0ad	[7.x] Fix delete_expired_data/nightly maintenance when many model snapshots need deleting (#57041 ) (#57136 ) Fix delete_expired_data/nightly maintenance when many model snapshots need deleting (#57041) The queries performed by the expired data removers pull back entire documents when only a few fields are required. For ModelSnapshots in particular this is a problem as they contain quantiles which may be 100s of KB and the search size is set to 10,000. This change makes the search more efficient by only requesting the fields needed to work out which expired data should be deleted.	2020-05-26 10:56:42 +01:00
Ioannis Kakavas	6984b3ef6f	Adjust reload keystore test to pass in FIPS (#57050 ) (#57133 ) In KeystoreWrapper class we determine if the error to decrypt a given keystore is caused by a wrong password based on the exception that the SunJCE implementation of AES is throwing (AEADBadTagException). Other implementations from other Security Providers might cause decryption to fail in a different way and cause us to throw a generic error message. We handle this in this test by matching both possible exception messages. Relates: #56889	2020-05-26 11:21:50 +03:00
Ioannis Kakavas	1e03de4999	Fix key usage in SamlAuthenticatorTests (#57124 ) (#57129 ) In #51089 where SamlAuthenticatorTests were refactored, we failed to update one test case, which meant that a single key would be used both for signing and encryption in the same run. As explained in #51089, and due to FIPS 140 requirements, BouncyCastle FIPS provider will block RSA keys that have been used for signing from being used for encryption and vice versa. This commit changes testNoAttributesReturnedWhenTheyCannotBeDecrypted to always use the specific keys we have added for encryption.	2020-05-26 10:51:47 +03:00
Jim Ferenczi	52443d41cf	Stop async search maintenance service on restart (#56982 ) This change ensures that we stop the maintenance service on all nodes when a data node is restarted. This ensures that we don't send update_by_query requests on the node that is restarted. This commit also raises the log level to trace for some packages in order to investigate the failures to acquire a shard lock after a restart. Relates #56765	2020-05-26 09:30:33 +02:00
Przemysław Witek	ea2012778e	Mute failing test (#57112 ) (#57113 )	2020-05-25 14:06:29 +02:00
Ioannis Kakavas	174af2bb1a	[7.x] Refactor SamlAuthenticatorTests (#51089 ) (#57105 ) - Use opensaml to sign and encrypt responses/assertions/attributes instead of doing this manually - Use opensaml to build response and assertion objects instead of parsing xml strings - Always use different keys for signing and encryption. Due to FIPS 140 requirements, BouncyCastle FIPS provider will block RSA keys that have been used for signing from being used for encryption and vice versa. This change adds new encryption specific keys to be used throughout the tests.	2020-05-25 14:09:42 +03:00
Ioannis Kakavas	6c832fe4e3	Don't run IDP tests in FIPS 140 mode (#57048 ) (#57098 ) We don't support this for now so there is no need to handle all the test logic/exceptions to run this in FIPS 140 mode.	2020-05-25 14:08:48 +03:00
Armin Braun	9fa60f7367	Add History UUID Index Setting (#56930 ) (#57104 ) Prerequisite for #50278 to be able to uniquely identify index metadata by its version fields and UUIDs when restoring into closed indices.	2020-05-25 11:26:03 +02:00
Rene Groeschke	28920a45f1	Improvement usage of gradle task avoidance api (#56627 ) (#56981 ) Use gradle task avoidance api wherever it is possible as a drop in replacement in the es build	2020-05-25 09:37:33 +02:00
Marios Trivyzas	b91bae30b1	SQL: [Tests] Move JDBC integration tests to new module (#56872 ) (#57072 ) Move the JDBC functionality integration tests from `:sql:qa` to a separate module `:sql:qa:jdbc`. This way the tests are isolated from the rest of the integration tests and they only depend on the `:sql:jdbc` module, thus removing the danger of accidentally pulling in some dependency that may hide bugs. Moreover this is a preparation for #56722, so that we can run those tests between different JDBC and ES node versions and ensure forward compatibility. Move the rest of the existing tests inside a new `:sql:qa:server` project, so that `:sql:qa` becomes the parent project for both and one can run all the integration tests by using this parent project. (cherry picked from commit c09f4a04484b8a43934fe58fbc41bd90b7dbcc76)	2020-05-22 17:49:36 +02:00
Ioannis Kakavas	6c90727166	Fix custom policy in plugins in FIPS 140 (#52046 ) (#57049 ) Our FIPS 140 testing depends on setting the appropriate java policy in order to configure the JVM in FIPS mode. Some tests ( discovery-ec2 and ccr qa ) also needed to set a custom policy file to grant a specific permission, which overwrote the FIPS related policy and tests would fail. This change ensures that when a custom policy needs to be set in these tests, the permissions that are necessary for FIPS are also set. Resolves: #51685, #52034	2020-05-21 19:26:56 +03:00
Benjamin Trent	f00dfb2d5f	[ML] adds WKT support in filestructurefinder (#57014 ) (#57032 ) Field mapping detection is done via grok patterns. This commit adds well-known text (WKT) formatted geometry detection. If everything is a `POINT`, then a `geo_point` mapping is preferred. Otherwise, if all the fields are WKT geometries a `geo_shape` mapping is preferred. This does NOT detect other types of formatted geometries (geohash, comma delimited points, etc.) closes https://github.com/elastic/elasticsearch/issues/56967	2020-05-21 08:22:51 -04:00
markharwood	eb8cb31d46	Update Lucene version to 8.6.0-snapshot-9d6c738ffce (#57024 ) Same version as master	2020-05-21 11:28:16 +01:00
James Rodewig	37e2bb7057	[DOCS] Add watcher multi-doc index ex (#52040 ) (#57011 ) Adds an example snippet for creating a `_doc` payload field with the Watcher `index` action. Co-authored-by: Luiz Guilherme Pais dos Santos <luiz.santos@elastic.co>	2020-05-20 16:57:45 -04:00
Brandon Morelli	ec41d36c62	docs: update links to beats security docs (#56875 ) (#56953 )	2020-05-20 11:28:39 -07:00
Bogdan Pintea	ec4a6aa1c6	SQL: JDBC: fix temporary directory locked test errors in Windows (#56917 ) * Fix temp dir locked errors The tests involving a temporary directory (containing the JDBC JAR) fail on Windows because they can't be deleted, due to still being in use. This commit forces a premature closing of the JAR file, which mitigates the failure by giving the JVM more time to collect any open FDs. (Calling the System.gc() in the tests is another working alternative fix.) The stream-based JAR access is taken care of by disabling the cache usage. (cherry picked from commit 04f97333a015404a68e8f19223f33aadeb396687)	2020-05-20 19:46:57 +02:00
Florian Kelbert	edada6bc39	[Docs] Insert missing colon (#56980 )	2020-05-20 15:49:17 +02:00
Benjamin Trent	ee4ce8ecec	Fix geotile_grid group_by field mapping (#56939 ) (#56990 ) The original implementation utilized `bbox` as the index mapping type. This would not work as it would have to be `envelope`. But, given that `envelope` and `polygon` are tessellated in the same way, we choose to use `polygon` as the geo_shape type. This is for easier support in other places in the stack (a la kibana maps).	2020-05-20 08:22:13 -04:00
Alan Woodward	18bfbeda29	Move merge compatibility logic from MappedFieldType to FieldMapper (#56915 ) Merging logic is currently split between FieldMapper, with its merge() method, and MappedFieldType, which checks for merging compatibility. The compatibility checks are called from a third class, MappingMergeValidator. This makes it difficult to reason about what is or is not compatible in updates, and even what is in fact updateable - we have a number of tests that check compatibility on changes in mapping configuration that are not in fact possible. This commit refactors the compatibility logic so that it all sits on FieldMapper, and makes it called at merge time. It adds a new FieldMapperTestCase base class that FieldMapper tests can extend, and moves the compatibility testing machinery from FieldTypeTestCase to here. Relates to #56814	2020-05-20 09:43:13 +01:00
Marios Trivyzas	644ae49817	SQL: Fix behaviour of COUNT(DISTINCT <literal>) (#56869 ) (#56932 ) Previously `COUNT(DISTINCT <literal>)` was returning the same result as `COUNT(<literal>)` which is not correct as it should always return 1 if there is at least one matching row (bucket if there is a GROUP BY), or 0 otherwise. (cherry picked from commit 7f7d7562d43034907f432d39d0d66f490d78f4a8)	2020-05-19 11:19:06 +02:00
Yannick Welsch	f296c08021	Increase timeout for assertLongBusy in AutoFollowIT (#56910 ) Closes #56891	2020-05-18 16:20:46 +02:00
Benjamin Trent	297f864884	[ML] relax throttling on expired data cleanup (#56711 ) (#56895 ) Throttling nightly cleanup as much as we do has been overcautious. Nightly cleanup should be more lenient in its throttling. We still keep the same batch size, but now the requests per second scale with the number of data nodes. If we have more than 5 data nodes, we don't throttle at all. Additionally, the API now has `requests_per_second` and `timeout` set. So users calling the API directly can set the throttling. This commit also adds a new setting `xpack.ml.nightly_maintenance_requests_per_second`. This will allow users to adjust throttling of the nightly maintenance.	2020-05-18 08:46:42 -04:00
David Kyle	0fac152188	Mute AsyncSearchActionIT (#56897 ) For #56765	2020-05-18 13:36:33 +01:00
Ioannis Kakavas	bb852ab2e7	Cause is tracked in #49094 (#56887 )	2020-05-18 15:03:38 +03:00
David Kyle	52a329fa12	Mute sql.client.VersionTests suite (#56883 ) For #56882	2020-05-18 10:15:30 +01:00
Bogdan Pintea	de7dd6154e	Fix range of version number generation in test (#56849 ) The version number component can't equal or exceed the revision multiplier. This fixes the VersionTests unit test. (cherry picked from commit 7d2331a2818ae20024c5c3617cd4433f90e9c098)	2020-05-16 08:59:45 +02:00
Andrei Stefan	4d47d63f55	SQL: implement SUM, MIN, MAX, AVG over literals (#56786 ) (#56850 ) * Adds support for MIN, MAX, AVG, SUM aggregates acting on literals. SELECT SUM(1) FROM index and SELECT SUM(1), AVG(2) work both on indices and as local execution. (cherry picked from commit efb72907c0391612c4a2b6256e327060b4167912)	2020-05-16 02:13:55 +03:00
Jake Landis	813609b47c	Ensure that .watcher-history-11* template is installed prior to use (#56734 ) WatcherIndexTemplateRegistry as of https://github.com/elastic/elasticsearch/pull/52962 requires all nodes to be on 7.7.0 before it allows the version 11 index template to be installed. While in a mixed cluster, nothing prevents Watcher from running on the new host before all of the nodes are on 7.7.0. This will result in the .watcher-history-11* index being created without the proper mappings. Without the proper mapping a single document (for a large watch) can exceed the default 1000 field limit and cause errors to show in the logs. This commit ensures the same logic for writing to the index is applied as for installing the template. In a mixed cluster, the `10` index template will continue to be written. Only once all of the nodes are on 7.7.0+ will the `11` index template be installed and used. closes #56732	2020-05-15 16:29:04 -05:00
Dimitris Athanasiou	54d3cc74ec	[7.x][ML] Ensure class is represented when its cardinality is low (#56783 ) (#56829 ) In DF analytics classification, it is possible to use no samples of a class if its cardinality is too low. This commit fixes this by ensuring the target sample count can never be zero. Backport of #56783	2020-05-15 20:52:06 +03:00
Bogdan Pintea	14ad733bd1	SQL: JDBC: fix access to the Manifest for non-entry JAR URLs (#56797 ) (#56839 ) * JDBC: fix access to the Manifest for non-entry JAR The JDBC driver will attempt to read its version from the Manifest file embedded into its JAR. The URL pointing to the JAR can be provided in a few ways. So far, accessing the Manifest was attempted by getting a URLConnection out of the URL and then getting an input stream out of this connection. For file JAR URLs, this only works however if the URL points to the driver as a JAR file entry (i.e. <sub-url>!/jdbc-driver.jar!/). If that's not the case, the JarURLConnection will throw an IOException. This commit fixes that: in case the URL points to a JAR entry (jar:file:<path>/jdbc-driver.jar!/), the manifest is read directly with JarURLConnection#getManifest(). (cherry picked from commit 2175b7b01cf5fcf3ab2bb21404a9bd454a8df3f0) Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-05-15 19:35:54 +02:00
James Baiera	4809db3ff9	EnrichProcessorFactory should not throw NPE if missing metadata (#55977 ) (#56793 ) In some cases the Enrich processor factory may be called before it is ready to create processors. While these calls are usually made in error, the response from the Enrich processor is an NPE which is almost always an unhelpful error when debugging an issue.	2020-05-15 12:02:13 -04:00
Ioannis Kakavas	239ada1669	Test adjustments for FIPS 140 (#56526 ) This change aims to fix our setup in CI so that we can run 7.x in FIPS 140 mode. The major issue that we have in 7.x and did not have in master is that we can't use the diagnostic trust manager in FIPS mode in Java 8 with SunJSSE in FIPS approved mode as it explicitly disallows the wrapping of X509TrustManager. Previous attempts like #56427 and #52211 focused on disabling the setting in all of our tests when creating a Settings object or on setting fips_mode.enabled accordingly (which implicitly disables the diagnostic trust manager). The attempts weren't future proof though as nothing would forbid someone to add new tests without setting the necessary setting and forcing this would be very inconvenient for any other case ( see #56427 (comment) for the full argumentation). This change introduces a runtime check in SSLService that overrides the configuration value of xpack.security.ssl.diagnose.trust and disables the diagnostic trust manager when we are running in Java 8 and the SunJSSE provider is set in FIPS mode.	2020-05-15 18:10:45 +03:00
Benjamin Trent	f71c305090	[7.x] [Transform] add support for terms agg in transforms (#56696 ) (#56809 ) * [Transform] add support for terms agg in transforms (#56696) This adds support for `terms` and `rare_terms` aggs in transforms. The default behavior is that the results are collapsed in the following manner: `<AGG_NAME>.<BUCKET_NAME>.<SUBAGGS...>...` Or if no sub aggs exist `<AGG_NAME>.<BUCKET_NAME>.<_doc_count>` The mapping is also defined as `flattened` by default. This is to avoid field explosion while still providing (limited) search and aggregation capabilities.	2020-05-15 08:08:43 -04:00
David Roberts	270a23e422	[TEST] Fix log tail mocking in native process unit tests (#56804 ) This is a followup to #56632. Tests that had to be changed to mock the C++ log handler more accurately need to be more careful about when that stream ends, as ending of that stream is used to detect crashes in the production system. Fixes #56796	2020-05-15 12:46:37 +01:00
Alan Woodward	d33d13f2be	Simplify generics on Mapper.Builder (#56747 ) Mapper.Builder currently has some complex generics on it to allow fluent builder construction. However, the second parameter, a return type from the build() method, is unnecessary, as we can use covariant return types. This commit removes this second generic parameter.	2020-05-15 12:14:49 +01:00
David Turner	27a090232e	Suppress Kerberos tests on JDK15 (#56767 ) Somewhat convoluted AwaitsFix for #56507 that only applies on JDK15.	2020-05-15 07:41:04 +01:00
Yang Wang	c66e7ecbfe	Fix test failure of file role store auto-reload (#56398 ) (#56802 ) Ensure assertion is only performed when we can be sure that the desired changes are picked up by the file watcher.	2020-05-15 15:10:45 +10:00
Ryan Ernst	9fb80d3827	Move publishing configuration to a separate plugin (#56727 ) This is another part of the breakup of the massive BuildPlugin. This PR moves the code for configuring publications to a separate plugin. Most of the time these publications are jar files, but this also supports the zip publication we have for integ tests.	2020-05-14 20:23:07 -07:00
Tal Levy	5e90ff32f7	Add Normalize Pipeline Aggregation (#56399 ) (#56792 ) This aggregation will perform normalizations of metrics for a given series of data in the form of bucket values. The aggregations supports the following normalizations - rescale 0-1 - rescale 0-100 - percentage of sum - mean normalization - z-score normalization - softmax normalization To specify which normalization is to be used, it can be specified in the normalize agg's `normalizer` field. For example: ``` { "normalize": { "buckets_path": <>, "normalizer": "percent" } } ```	2020-05-14 17:40:15 -07:00
Mark Vieira	0fd756d511	Enforce strict license distribution requirements (#56642 )	2020-05-14 13:57:56 -07:00
Jake Landis	a22aabcc15	[7.x] Reduce chance for test failure due to schedule (#56633 ) (#56695 ) If CI is running tests at exactly 0 or 5 minutes past the hour the ack-watch docs tests may fail with a 409 error if the ack test happens to run at the exact time that the schedule watch is running. This commit changes the public documentation (and the test) for the ack to a feb 29th at noon schedule. Test doc or tests do not really care about the schedule date and this is chosen since it is a valid date, but one that is extremely unlikely to cause issues.	2020-05-14 15:52:04 -05:00
Costin Leau	6f4af43405	EQL: Skip execution for filters with empty results (#56718 ) Optimize away events queries and joins/sequence that cannot match any results without having to query the backend. (cherry picked from commit 69c8ef8cfefd8fc6dcb6d1a566bfcd537068e3e4)	2020-05-14 22:38:23 +03:00
Mark Tozzi	b718193a01	Clean up DocValuesIndexFieldData (#56372 ) (#56684 )	2020-05-14 12:42:37 -04:00
Dimitris Athanasiou	ac5902624c	[7.x][ML] Improve error upon DF analytics mappings conflict (#56700 ) (#56776 ) Adds the conflicting types and an example of an index which specifies them in order to make it easier for the user to understand the conflict. Backport of #56700	2020-05-14 19:16:10 +03:00
Jim Ferenczi	fb5e6329b7	Stop/Start async search maintenance service in tests (#56673 ) This change ensures that the maintenance service that is responsible for deleting the expired responses is stopped between each test. This is needed since we check that no search contexts are in flight after each test method. Fixes #55988	2020-05-14 15:13:01 +02:00
David Turner	bec6821fe6	AwaitsFix for #56755	2020-05-14 11:46:05 +01:00
Alexander Reelsen	3a263d91f6	Ensure watcher email action message ids are always unique (#56574 ) If an email action is used in a foreach loop, message ids could have been duplicated, which then get rejected by the mail server. This commit introduces an additional static counter in the email action in order to ensure that every message id is unique.	2020-05-14 10:36:00 +02:00
Przemysław Witek	98fbd85290	[7.x] Add scope-related fields to Annotation (#56417 ) (#56681 )	2020-05-14 10:23:13 +02:00
Andrei Stefan	ddf4e47e86	EQL: fix QueryFolderOkTests (#56714 ) (#56728 ) (cherry picked from commit 8b21ccd0eac3b3d0fbd090152b3dff6ae5217b52)	2020-05-14 10:58:25 +03:00
David Roberts	3051c37f92	[ML] Tail the C++ logging pipe before connecting other pipes (#56701 ) Prior to this change the named pipes that connect the ML C++ processes to the Elasticsearch JVM were all opened before any of them were read from or written to. This created a problem, where if the C++ process logged more messages between opening the log pipe and opening the last pipe to be connected than there was space for in the named pipe's buffer then the C++ process would block. This would mean it never got as far as opening the last named pipe, so the JVM would never get as far as reading from the log pipe, hence a deadlock. This change alters the connection order so that the JVM starts reading from the logging pipe immediately after opening it so that if the C++ process logs messages while opening the other named pipes they are captured in a timely manner and there is no danger of a deadlock. Backport of #56632	2020-05-14 07:10:30 +01:00
Aleksandr Maus	87a10806ab	EQL: Fix cidrMatch function fails to match when used in scripts (#56246 ) (#56735 ) EQL: Fix cidrMatch function fails to match when used in scripts (#56246) Addresses https://github.com/elastic/elasticsearch/issues/55709	2020-05-13 22:41:24 -04:00
Nik Everett	b98b260048	Merge significant_terms into the terms package (backport of #56699 ) (#56715 ) This merges the code for the `significant_terms` agg into the package for the code for the `terms` agg. They are super entangled already, this mostly just admits that to ourselves. Precondition for the terms work in #56487	2020-05-13 17:36:21 -04:00
Ross Wolf	61e2cf89b5	EQL: Add number function (#55084 ) * EQL: Add number function * EQL: Fix the locale used for number for deterministic functionality * EQL: Add more ToNumber tests * EQL: Add more number ToNumberProcessor unit tests * EQL: Remove unnecessary overrides, fix processor methods * EQL: Remove additional unnecessary overrides * EQL: Lint fixes for ToNumber * EQL: ToNumber renames from PR feedback * EQL: Remove NumberFormat locale handling * EQL: Removed NumberFormat from ToNumber * EQL: Add number function tests * EQL: ToNumberProcessorTests formatting * EQL: Remove newline in ToNumberProcessorTests * EQL: Add number(..., null) test * EQL: Create expression.function.scalar.math package * EQL: Remove painless whitespace for ToNumber.asScript * EQL: Add Long support	2020-05-13 14:09:06 -06:00
Costin Leau	9f1ecd52eb	EQL: Introduce support for sequences (#56300 ) Initial support for EQL sequences The current algorithm is focused on correctness and does not contain any optimization which is left for the future. The current implementation uses a state machine approach which moves ascending and runs each query one after the other working on computing sequences as the data comes in. For each result, the key and its timestamp are being extracted which are then used for matching/building a sequence. (cherry picked from commit 4f3e18c894a1841d333022361ad9d1fdf1477dc3)	2020-05-13 15:42:31 +03:00
Ignacio Vera	b4521d5183	upgrade to Lucene 8.6.0 snapshot (#56661 )	2020-05-13 14:25:16 +02:00
Marios Trivyzas	cbbbd499bf	SQL/EQL: Add support for scalars within LIKE/RLIKE (#56495 ) (#56674 ) - Add support for scalar functions on the field of SQL's LIKE/RLIKE - Add support for scalar functions on the field of EQL's match/matchLite Closes: #55058 (cherry picked from commit 51c14e2dbb7fb29004a23369c449d425b3ac8fe2)	2020-05-13 13:40:24 +02:00
Luca Cavanna	30e9a1b8c7	Improve error handling when decoding async execution ids (#56285 ) When decoding async execution ids, exceptions thrown from the decode method itself were not caught, leading to cryptic errors like "Input byte array has incorrect ending byte at 68" being returned. With this commit we return "invalid id: [abcdef]". Added tests coverage for a couple of these scenarios and also added tests for equals/hashcode methods.	2020-05-13 12:26:17 +02:00
Marios Trivyzas	e781193cf9	SQL: Fix JDBC url pattern in docs and error message (#56612 ) The docs pattern url was using `*` which means zero or many instead of `?` which means zero or one. The pattern url returned in error messages was not in sync with the one in the docs. Fixes: #56476 (cherry picked from commit 1a5945c3962cdda21482f4b0b3e0ca508534c2c4)	2020-05-13 12:13:58 +02:00
David Turner	c10b4ae15a	Support cloning of searchable snapshot indices (#56595 ) Today you can convert a searchable snapshot index back into a regular index by restoring the underlying snapshot, but this is somewhat wasteful if the shards are already in cache since it copies the whole index from the repository again. Instead, we can make use of the locally-cached data by using the clone API to copy the contents of the cache into the layout expected by a regular shard. This commit marks the searchable snapshot's private index settings as `NotCopyableOnResize` so that they are removed by resize operations such as cloning. Cloning a regular index typically hard-links the underlying files rather than copying them, but this is tricky to support in the case of a searchable snapshot so this commit takes the simpler approach of always copying the underlying files.	2020-05-13 11:05:14 +01:00
Ioannis Kakavas	cc119c3853	Expose idp.metadata.http.refresh for SAML realm (#56354 ) (#56593 ) This setting was not returned in the SamlRealmSettings#getSettings so it was not possible for users to set this in the realm config in our configuration.	2020-05-13 11:51:18 +03:00
debadair	6de6ec68f2	[DOCS] Extract the cron docs from Watcher docs and add to the API conventions. (#56313 ) (#56651 ) * [DOCS] Promote cron expressions info from Watcher to a separate topic. * Fix table error * Fixed xref * Apply suggestions from code review Co-authored-by: James Rodewig <james.rodewig@elastic.co> * Incorporated review feedback Co-authored-by: James Rodewig <james.rodewig@elastic.co> Co-authored-by: James Rodewig <james.rodewig@elastic.co>	2020-05-12 16:36:18 -07:00
Jake Landis	a010f4f624	[7.x] Watcher don't add watches post index if stopped (#56556 ) (#56629 ) Watcher adds watches to the trigger service on the postIndex action for the .watches index. This has the (intentional) side effect of also adding the watches to the stats. The tests rely on these stats for their assertions. The tests also start and stop Watcher between each test for a clean slate. When Watcher executes, it updates the .watches index, and upon this update it will go through the postIndex method and end up adding that watch to the trigger service (and stats). Functionally this is not a problem if Watcher is stopping or stopped, since Watcher is also paused and will not execute the watch. However, specific timing combined with the expectation of a clean slate can cause issues with the test assertions against the stats. This commit ensures that the postIndex action only adds to the trigger service if the Watcher state is not stopping or stopped. When started back up it will re-read the .watches index. This commit also un-mutes the tests related to #53177 and #56534	2020-05-12 16:30:27 -05:00
James Rodewig	cf76a932fb	[DOCS] Correct watcher event data example (#56469 ) * Swaps outdated index patterns for the default `logstash` index alias. Adds some related information about Logstash ILM defaults to the callout. * Swaps `.raw` fields for `.keyword` fields. The Logstash template uses `keyword` fields by default since 6.x. * Swaps instances of `ctx.payload.hits.total.value` with `ctx.payload.hits.total`	2020-05-12 16:33:33 -04:00
Jake Landis	9c76ee47c4	[7.x] json spec: allow null for documentation url (#55749 ) (#56625 ) This commit allows the JSON schema's documentation.url property to have a null value. This can be useful for cases where a feature is under development and does not have documentation published yet. This commit also adds a documentation.url for two ml resources.	2020-05-12 14:49:02 -05:00
Armin Braun	0a879b95d1	Save Bounds Checks in BytesReference (#56577 ) (#56621 ) Two spots that allow for some optimization: * We are often creating a composite reference of just a single item in the transport layer => special cased via static constructor to make sure we never do that * Also removed the pointless case of an empty composite bytes ref * `ByteBufferReference` is practically always created from a heap buffer these days so there is no point of dealing with all the bounds checks and extra references to sliced buffers from that and we can just use the underlying array directly	2020-05-12 20:33:45 +02:00
Armin Braun	c104c9a11b	Fix Missing IgnoredUnavailable Flag in 7.x SLM Retention Task (#56616 ) Without the flag we run into the situation where a broken repository (broken by some old 6.x version of ES that is missing some snap-${uuid}.dat blobs fails to run the SLM retention task since it always errors out).	2020-05-12 18:07:58 +02:00
Marios Trivyzas	4240b97d0e	SQL: [Test] Fix JdbcPreparedStatement date test Use `ORDER BY` to ensure order of the rows since more than are returned in the testDate(). Follows: #56492 (cherry picked from commit 0053a1cb515b4db160d7b0bed5cf3f13c1050687)	2020-05-12 17:08:16 +02:00
Martijn van Groningen	0c61bc63e4	Backport: auto create data streams using index templates v2 (#56596 ) Backport: #55377 This commit adds the ability to auto create data streams using index templates v2. Index templates (v2) now have a data_stream field that includes a timestamp field. If it is provided and the index name matches that template, then a data stream (plus first backing index) is auto created. Relates to #53100	2020-05-12 17:01:15 +02:00
Andrei Stefan	f0074e93a0	QL: case sensitive support in EQL (#56404 ) (#56597 ) * QL: case sensitive support in EQL (#56404) * adds a generic startsWith function to QL * modifies the existent EQL startsWith function to be case sensitive aware * improves the existent EQL startsWith function to use a prefix query when the function is used in a case sensitive context. Same improvement is used in SQL's newly added STARTS_WITH function. * adds case sensitivity to EQL configuration through a case_sensitive parameter in the eql request, as established in #54411. The case_sensitive parameter can be specified when running queries (default is case insensitive) (cherry picked from commit ee5a09ea840167566e34c28c8225dc38bc6a7ae8)	2020-05-12 16:56:18 +03:00
Hendrik Muhs	a9425a0240	[7.x][Transform] fix count when matching exact ids(#56544 ) (#56582 ) fix count in get and get stats if explicit ids are given and ids might be duplicated when configuration are stored in different index (versions). fixes #56196	2020-05-12 14:23:13 +02:00
Marios Trivyzas	575cafb8da	SQL: Fix serialization of JDBC prep statement date/time params (#56492 ) (#56579 ) The Date/Time related query params of a JDBC prepared statement serialized using java.util.Date. The rules for serializing `java.util.Date` objects though reside in `XContentElasticsearchExtension` which is not available in the jdbc jar as this class is in `server` module. Therefore, a custom extension of the `XContentBuilderExtension` iface has been added to the jdbc module/jar. Moreover the sql's `qa` project had as dependency the `sql-action` module which depends on `server` so the `XContentBuilderExtension` was available for the integ tests hiding the real problem. Previously, when a user was setting a `java.sql.Time` to the prepStmt, the DataType used was `DATETIME` instead of `TIME` and therefore prevented from filtering with a `TIME` casted field: ``` SELECT * FROM test WHERE date::TIME = ? ``` Fixes: #56084 (cherry picked from commit f8d8e971bd2c85fa4aea44b5b3ba0cdcc950a4ed)	2020-05-12 13:25:02 +02:00
Martijn van Groningen	2e86801f61	Backport: enable searchable snapshots feature flag for xpack rest tests. Backport of: #56569 A data stream test, which tests data stream resolvability in xpack apis failed in release builds. A invocation of a searchable snapshot api failed, because the corresponding feature flag wasn't enabled for xpack rest tests. Closes #56531	2020-05-12 12:18:24 +02:00
Ignacio Vera	222ee721ec	Add moving percentiles pipeline aggregation (#55441 ) (#56575 ) Similar to what the moving function aggregation does, except merging windows of percentiles sketches together instead of cumulatively merging final metrics	2020-05-12 11:35:23 +02:00
Marios Trivyzas	5c0f26de1d	SQL: [Docs] Fix example for DATETIME_PARSE (#56409 ) When no timezone is specified the session timezone is used without conversion, fix the docs test accordingly. Follows: #56158 (cherry picked from commit 4b79b19ea5c3d17e05cb8130f3c754ac9bfd2382)	2020-05-12 09:23:00 +02:00
Ryan Ernst	902fc546bd	Migrate remaining ESIntegTestCases to internalClusterTest (#56479 ) (#56563 ) This commit migrates the ESIntegTestCase tests in x-pack to the internalClusterTest source set.	2020-05-11 21:06:04 -07:00
Nick Knize	9b64149ad2	[Geo] Refactor Point Field Mappers (#56060 ) (#56540 ) This commit refactors the following: * GeoPointFieldMapper and PointFieldMapper to AbstractPointGeometryFieldMapper derived from AbstractGeometryFieldMapper. * .setupFieldType moved up to AbstractGeometryFieldMapper * lucene indexing moved up to AbstractGeometryFieldMapper.parse * new addStoredFields, addDocValuesFields abstract methods for implementing stored field and doc values field indexing in the concrete field mappers This refactor is the next phase for setting up a framework for extending spatial field mapper functionality in x-pack.	2020-05-11 17:11:36 -05:00
Tim Brooks	760ab726c2	Share netty event loops between transports (#56553 ) Currently Elasticsearch creates independent event loop groups for each transport (http and internal) transport type. This is unnecessary and can lead to contention when different threads access shared resources (ex: allocators). This commit moves to a model where, by default, the event loops are shared between the transports. The previous behavior can be attained by specifically setting the http worker count.	2020-05-11 15:43:43 -06:00
Benjamin Trent	1d6b2f074e	[Transform] adds geotile_grid support in group_by (#56514 ) (#56549 ) This adds support for grouping by geo points. This uses the agg [geotile_grid](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-geotilegrid-aggregation.html). I am opting to store the tile results of group_by as a `geo_shape` so that users can query the results. Additionally, the shapes could be visualized and filtered in the kibana maps app. relates to https://github.com/elastic/elasticsearch/issues/56121	2020-05-11 17:02:40 -04:00
Lee Hinman	1337b35572	Remove prefer_v2_templates query string parameter (#56545 ) This commit removes the `prefer_v2_templates` flag and setting. This was a brief setting that allowed specifying whether V1 or V2 template should be used when an index is created. It has been removed in favor of V2 templates always having priority. Relates to #53101 Resolves #56528 This is not a breaking change because this flag was never in a released version.	2020-05-11 14:56:42 -06:00
Brandon Morelli	659edb92ff	docs: [7.x][apm] link to master in n.x branches (#56539 )	2020-05-11 13:42:37 -07:00
zhenxianyimeng	8e96e5c936	Use CollectionUtils.isEmpty where appropriate (#55910 ) This commit uses the isEmpty utility method for arrays in place of null and greater than zero checks.	2020-05-11 09:55:57 -07:00
Armin Braun	3ab6eba6bc	Fix RollupJobTaskTests Leaking Threads on Slowness (#56438 ) (#56518 ) We are ensuring order in the two tests changed by waiting on latches. The problem is, that 3s is a pretty short wait and on CI can randomly be exceeded by pure chance. If that happened we wouldn't have visibility on it since we didn't assert that the waits actually worked. => Fixed by asserting that the waits work and upping the timeout to our standard 10s Also, moved to a per-test threadpool to make it simpler to identify which test failed, should an unexpected task run on a closed client's pool afterall.	2020-05-11 17:24:10 +02:00
Jim Ferenczi	02ab9112a9	Fix spurious failures in AsyncSearchIntegTestCase (#56026 ) Async search integration tests are subject to random failures when: * The test index has more than one replica. * The request cache is used. * Some shards are empty. * The maintenance service starts a garbage collection when node is closing. They are also slow because the test index is created/populated on each test method. This change refactors these integration tests in order to: * Create the index once for the entire test suite. * Fix the usage of the request cache and replicas. * Ensures that all shards have at least one document. * Increase the delay of the maintenance service garbage collection. Closes #55895 Closes #55988	2020-05-11 15:03:03 +02:00
Martijn van Groningen	9ae09570d8	Allow a number of broadcast transport actions to resolve data streams (#55726 ) (#56502 ) Change TransportBroadcastByNodeAction and TransportBroadcastReplicationAction to be able to resolve data streams by default. Implementations can change this ability. This change allows to following APIs to resolve data streams: flush, refresh (already supported data streams), force merge, clear indices cache, indices stats (already supported data streams), segments, upgrade stats, upgrade, validate query, searchable snapshots stats, clear searchable snapshots cache and reload analyzers APIs. Relates to #53100	2020-05-11 12:48:35 +02:00
Rene Groeschke	c29bc87040	Move bwcVersions extension property to BuildParams (back port) (#56381 ) * Move bwcVersions extension property to BuildParams (#56206) * Fix :qa Task Using Broken BwC Versions Resolution (#56332) Co-authored-by: Armin Braun <me@obrown.io>	2020-05-11 09:39:13 +02:00
Nik Everett	2f38aeb5e2	Save memory when numeric terms agg is not top (#55873 ) (#56454 ) Right now all implementations of the `terms` agg allocate a new `Aggregator` per bucket. This uses a bunch of memory. Exactly how much isn't clear but each `Aggregator` ends up making its own objects to read doc values which have non-trivial buffers. And it forces all of it sub-aggregations to do the same. We allocate a new `Aggregator` per bucket for two reasons: 1. We didn't have an appropriate data structure to track the sub-ordinals of each parent bucket. 2. You can only make a single call to `runDeferredCollections(long...)` per `Aggregator` which was the only way to delay collection of sub-aggregations. This change switches the method that builds aggregation results from building them one at a time to building all of the results for the entire aggregator at the same time. It also adds a fairly simplistic data structure to track the sub-ordinals for `long`-keyed buckets. It uses both of those to power numeric `terms` aggregations and removes the per-bucket allocation of their `Aggregator`. This fairly substantially reduces memory consumption of numeric `terms` aggregations that are not the "top level", especially when those aggregations contain many sub-aggregations. It also is a pretty big speed up, especially when the aggregation is under a non-selective aggregation like the `date_histogram`. I picked numeric `terms` aggregations because those have the simplest implementation. At least, I could kind of fit it in my head. And I haven't fully understood the "bytes"-based terms aggregations, but I imagine I'll be able to make similar optimizations to them in follow up changes.	2020-05-08 20:38:53 -04:00
Mark Vieira	0fb9bc5379	Always use archive base name as the pom artifact id (#56447 ) (#56467 )	2020-05-08 16:11:19 -07:00
Armin Braun	0a254cf223	Serialize Monitoring Bulk Request Compressed (#56410 ) (#56442 ) Even with changes from #48854 we're still seeing significant (as in tens and hundreds of MB) buffer usage for bulk exports in some cases which destabilizes master nodes. Since we need to know the serialized length of the bulk body we can't do the serialization in a streaming manner. (also it's not easily doable with the HTTP client API we're using anyway). => let's at least serialize on heap in compressed form and decompress as we're streaming to the HTTP connection. For small requests this adds negligible overhead but for large requests this reduces the size of the payload field by about an order of magnitude (empirically determined) which is a massive reduction in size when considering O(100MB) bulk requests.	2020-05-08 23:16:07 +02:00
Dimitris Athanasiou	44ffa388ac	[7.x][ML] Use non-zero timeout when force stopping DF analytics (#56423 ) (#56428 ) We have been using a zero timeout in the case that DF analytics is stopped. This may cause a timeout when we cancel, for example, the reindex task. This commit fixes this by using the default timeout instead. Backport of #56423	2020-05-08 21:12:11 +03:00
David Roberts	9a3924a641	[ML] Adjust list of platforms that have ML native code (#56426 ) Native code is now available for linux-aarch64. Note that it is _not_ currently supported!	2020-05-08 16:22:45 +01:00
Dimitris Athanasiou	c117ae7a6e	[7.x][ML] Force stopping stopped DF analytics should succeed (#56421 ) (#56424 ) Force stopping a DF analytics job whose config exists and that is stopped should succeed. This was broken by #56360. Closes #56414 Backport of #56421	2020-05-08 18:04:24 +03:00
Tanguy Leroux	8e9b69bfd7	Use snapshot information to build searchable snapshot store MetadataSnapshot (#56289 ) (#56403 ) While investigating possible optimizations to speed up searchable snapshots shard restores, we noticed that Elasticsearch builds the list of shard files on local disk in order to compare it with the list of files contained in the snapshot to restore. This list of files is materialized with a MetadataSnapshot object whose construction involves reading the footer checksum of every file of the shard using the Store.checksumFromLuceneFile() method. Further investigation shows that a MetadataSnapshot object is also created for other types of operations like building the list of files to recover in a peer recovery (and primary shard relocation) or in order to assign a shard to a node. These operations use the Store.getMetadata(IndexCommit) method to build the list of files and checksums. In the case of searchable snapshots, building the MetadataSnapshot object can potentially trigger cache misses, which in turn can cause the download and the writing in cache of the last range of the file in order to check the 16 bytes footer. This in turn can cause more evictions. Since searchable snapshots already contain the footer information of every file in BlobStoreIndexShardSnapshot, they can read the checksum directly from it and avoid using the cache at all to create a MetadataSnapshot for the operations mentioned above. This commit adds a shortcut to the SearchableSnapshotDirectory.openInput() method - similarly to what already exists for segment infos - so that it creates a specific IndexInput for checksum reading operation.	2020-05-08 14:16:19 +02:00
Dimitris Athanasiou	60b1c67409	[7.x][ML] Allow stopping DF analytics whose config is missing (#56360 ) (#56408 ) It is possible that the config document for a data frame analytics job is deleted from the config index. If that is the case the user is unable to stop a running job because we attempt to retrieve the config and that will throw. This commit changes that. When the request is forced, we do not expand the requested ids based on the existing configs but from the list of running tasks instead. Backport of #56360	2020-05-08 13:54:44 +03:00
Hendrik Muhs	cc35d37788	[Transform] unmute transform upgrade tests (#56296 ) the transform upgrade tests broke due to #56238, but got fixed with #56274 fixes #56269 fixes #56250	2020-05-08 10:48:58 +02:00
Dimitris Athanasiou	d064eda2b0	[7.x][ML] Ensure phase progress may only increase (#56339 ) (#56357 ) Due to multi-threading it is possible that phase progress updates written from the c++ process arrive reordered. We can address this by ensuring that progress may only increase. Closes #56282 Backport of #56339	2020-05-07 19:46:58 +03:00
William Brafford	691044e67b	Add xpack setting deprecations to deprecation API (#56290 ) * Add xpack setting deprecations to deprecation API The deprecated settings showed up in the deprecation log file by default, but I did not add them to the deprecation API. This commit fixes that. Now if you use one of the deprecated basic feature enablement settings, calling _monitoring/deprecations will inform you of that fact. * Remove incorrectly backported settings documents It seems that I backported these docs to the wrong place in #56061, in #55980, and in #56167. I hope they're in the right place now. Co-authored-by: debadair <debadair@elastic.co>	2020-05-07 10:28:17 -04:00
Nik Everett	e35919d3b8	Optimize date_histograms across daylight savings time (backport of #55559 ) (#56334 ) Rounding dates on a shard that contains a daylight savings time transition is currently something like 1400% slower than when a shard contains dates only on one side of the DST transition. And it makes a ton of short lived garbage. This replaces that implementation with one that benchmarks to having around 30% overhead instead of the 1400%. And it doesn't generate any garbage per search hit. Some background: There are two ways to round in ES: * Round to the nearest time unit (Day/Hour/Week/Month/etc) * Round to the nearest time interval (3 days/2 weeks/etc) I'm only optimizing the first one in this change and plan to do the second in a follow up. It turns out that rounding to the nearest unit really is two problems: when the unit rounds to midnight (day/week/month/year) and when it doesn't (hour/minute/second). Rounding to midnight is consistently about 25% faster than rounding to individual hours or minutes. This optimization relies on being able to usually figure out what the minimum and maximum dates are on the shard. This is similar to an existing optimization where we rewrite time zones that aren't fixed (think America/New_York and its daylight savings time transitions) into fixed time zones so long as there isn't a daylight savings time transition on the shard (UTC-5 or UTC-4 for America/New_York). Once I implement time interval rounding the time zone rewriting optimization should no longer be needed. This optimization doesn't come into play for `composite` or `auto_date_histogram` aggs because neither have been migrated to the new `DATE` `ValuesSourceType` which is where that range lookup happens. When they are they will be able to pick up the optimization without much work. I expect this to be substantial for `auto_date_histogram` but less so for `composite` because it deals with fewer values. Note: My 30% overhead figure comes from small numbers of daylight savings time transitions. That overhead gets higher when there are more transitions in logarithmic fashion. When there are two thousand years worth of transitions my algorithm ends up being 250% slower than rounding without a time zone, but java time is 47000% slower at that point, allocating memory as fast as it possibly can.	2020-05-07 09:10:51 -04:00
Tanguy Leroux	6233e32ab3	Fix SearchableSnapshotDirectoryTests.testIndexSearcher() (#56275 ) Closes #56233	2020-05-07 11:12:35 +02:00
Tanguy Leroux	65a061e33a	Fix SearchableSnapshotDirectoryTests.testClearCache (#56277 ) This test sometimes fails when prewarming is enabled because it's possible that some files are cached in background while the test tries to clear the cache. This commit disables prewarming for this test.	2020-05-07 10:59:33 +02:00
Andrei Stefan	980f175222	EQL: simplify equals/not-equals TRUE/FALSE expressions (#56191 ) (#56306 ) * Simplify equals/not-equals TRUE/FALSE expressions, by returning them as is (TRUE variant) or negating them (FALSE variant) (cherry picked from commit 17858afbe6da5fa0b3ecfc537cabb337e4baaffe)	2020-05-07 03:02:04 +03:00
Jason Tedor	c775e47054	Fix missing SHAs for Jackson 2.10.4 This was not picked up on a backport, so this commit adds the missing SHAs, and removes the old ones.	2020-05-06 17:28:24 -04:00
Jason Tedor	33669c0420	Upgrade to Jackson 2.10.4 (#56188 ) Another Jackson release is available. There are some CVEs addressed, none of which impact us, but since we can now bump Jackson easily, let us move along with the train to avoid the false positives from security scanners.	2020-05-06 17:20:23 -04:00
Przemysław Witek	0cd0ab276e	Introduce Annotation.Builder class and use it to create instances of Annotation class (#56276 ) (#56286 )	2020-05-06 20:47:03 +02:00
Julie Tibshirani	e852bb29b7	Simplify signature of FieldMapper#parseCreateField. (#56144 ) `FieldMapper#parseCreateField` accepts the parse context, plus a list of fields as an output parameter. These fields are immediately added to the document through `ParseContext#doc()`. This commit simplifies the signature by removing the list of fields, and having the mappers add the fields directly to `ParseContext#doc()`. I think this is nicer for implementors, because previously fields could be added either through the list, or the context (through `add`, `addWithKey`, etc.)	2020-05-06 11:12:09 -07:00
Dimitris Athanasiou	011e995165	[7.x][ML] Unmute ClssificationIT.testDependentVariableCardinalityTooHighButWithQueryMakesItWithinRange (#56268 ) (#56287 ) Closes #56240	2020-05-06 18:20:46 +03:00
Navneet Kumar	a649f85358	[DOCS] Create API key API requires `name` request body param (#56262 ) Fixes #56164. A minor update in the documentation, API key name is required when creating API key. If the API key name is not provided then the request will fail.	2020-05-06 08:52:45 -04:00
Luca Cavanna	9a9cb68e83	Async Search: correct shards counting (#55758 ) Async search allows users to retrieve partial results for a running search. For partial results, the number of successful shards does not include the skipped shards, while the response returned to users should. Also, we recently had a bug where async search would miss tracking shard failures, which would have been caught if we had assertions in place that verified that whenever we get the last response, the number of failures included in it is the same as the failures that were tracked through the listener notifications.	2020-05-06 12:13:30 +02:00
Tanguy Leroux	07ad742b60	Enable prewarming by default for searchable snapshots (#56201 ) Now searchable snapshots directories respect the repository rate limitations (#55952) we can enable prewarming by default for shards.	2020-05-06 10:18:34 +02:00
Tanguy Leroux	131a3911eb	Replace BlobContainerWrapper by FilterBlobContainer (#56200 ) A FilterBlobContainer class was introduced in #55952 and it delegates its behavior to a given BlobContainer while allowing to override only necessary methods. This commit replaces the existing BlobContainerWrapper class from the test framework with the new FilterBlobContainer from core.	2020-05-06 10:05:43 +02:00
Julie Tibshirani	bd7a2d2b01	Mute the geogrid agg circuit breaker tests.	2020-05-05 18:09:07 -07:00
Julie Tibshirani	dc738e34d2	Mute the mixed cluster 80_transform_jobs_crud test.	2020-05-05 17:58:17 -07:00

5749 Commits