OpenSearch

Commit Graph

Author	SHA1	Message	Date
Mark Vieira	cb58725164	Mute InferenceIngestIT.testPipelineIngest	2020-04-14 09:27:56 +01:00
William Brafford	52bebec51f	NodeInfo response should use a collection rather than fields (#54460 ) (#55132 ) This is a first cut at giving NodeInfo the ability to carry a flexible list of heterogeneous info responses. The trick is to be able to serialize and deserialize an arbitrary list of blocks of information. It is convenient to be able to deserialize into usable Java objects so that we can aggregate nodes stats for the cluster stats endpoint. In order to provide a little bit of clarity about which objects can and can't be used as info blocks, I've introduced a new interface called "ReportingService." I have removed the hard-coded getters (e.g., getOs()) in favor of a flexible method that can return heterogeneous kinds of info blocks (e.g., getInfo(OsInfo.class)). Taking a class as an argument removes the need to cast in the client code.	2020-04-13 17:18:39 -04:00
Ryan Ernst	ae14d1661e	Replace license check isAuthAllowed with isSecurityEnabled (#54547 ) (#55082 ) The isAuthAllowed() method for license checking is used by code that wants to ensure security is both enabled and available. The enabled state is dynamic and provided by isSecurityEnabled(). But since security is available with all license types, a check on the license level is not necessary. Thus, this change replaces isAuthAllowed() with calling isSecurityEnabled().	2020-04-13 12:26:39 -07:00
Benjamin Trent	d32f6fed1d	[ML] inference only persist if there are stats (#54752 ) (#55121 ) We needlessly send documents to be persisted. If there are no stats added, then we should not attempt to persist them. Also, this PR fixes the race condition that caused issue: https://github.com/elastic/elasticsearch/issues/54786	2020-04-13 14:03:05 -04:00
Igor Motov	51c6f69e02	[7.x] Add support for filters to T-Test aggregation (#54980 ) (#55066 ) Adds support for filters to T-Test aggregation. The filters can be used to select populations based on some criteria and use values from the same or different fields. Closes #53692	2020-04-13 12:28:58 -04:00
Jake Landis	a2fafa6af4	[7.x] Lazy test cluster module and plugins (#54852 ) (#55087 ) This change converts the module and plugin parameters for testClusters to be lazy. Meaning that the values are not resolved until they are actually used. This removes the requirement to use project.afterEvaluate to be able to resolve the bundle artifact. Note - this does not completely remove the need for afterEvaluate since it is still needed for the custom resource extension.	2020-04-13 10:53:35 -05:00
Igor Motov	6861295706	Further improve InternalTTestTests (#55081 ) A small follow-up to #54910. Now that we can generate a consistent set of internal aggs to reduce, we no longer need to keep agg parameters as class variables. Related to #54910	2020-04-13 10:26:23 -04:00
Benjamin Trent	c5c7ee9d73	[7.x] [ML] Start gathering and storing inference stats (#53429 ) (#54738 ) * [ML] Start gathering and storing inference stats (#53429) This PR enables stats on inference to be gathered and stored in the `.ml-stats-` indices. Each node + model_id will have its own running stats document and these will later be summed together when returning _stats to the user. `.ml-stats-` is ILM managed (when possible). So, at any point the underlying index could change. This means that a stats document that is read in and then later updated will actually be a new doc in a new index. This complicates matters as this means that having a running knowledge of seq_no and primary_term is complicated and almost impossible. This is because we don't know the latest index name. We should also strive for throughput, as this code sits in the middle of an ingest pipeline (or even a query).	2020-04-13 08:15:46 -04:00
Andrei Dan	c0406f78b7	ILM add cluster update timeout on step retry (#54878 ) (#55022 ) This commit adds a timeout when moving ILM back on to a failed step. In case the master is struggling with processing the cluster update requests these ones will expire (as we'll send them again anyway on the next ILM loop run) ILM more descriptive source messages for cluster updates Use the configured ILM step master timeout setting (cherry picked from commit ff6c5ed16616eadfcddd9c95317d370f0d126583) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-04-11 10:13:31 +01:00
Andrei Dan	b8df265b42	[7.x] ILM use Priority.IMMEDIATE for stop ILM cluster update (#54909 ) (#55018 ) * ILM use Priority.IMMEDIATE for stop ILM cluster update (#54909) This changes the priority of the cluster state update that stops ILM altogether to `IMMEDIATE`. We've chosen to change this as it can be useful to temporarily stop ILM if a cluster is overwhelmed, but a `NORMAL` priority can see the "stop ILM update" not make it up the tasks queue. On the same note, we're keeping the `start ILM` cluster update priority to `NORMAL` on purpose such that we only start `ILM` if the cluster can handle it. (cherry picked from commit d67df3a7cd2a8619c2c9efac4dde3ba83271f2fa) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-04-11 10:12:35 +01:00
Albert Zaharovits	f22004a262	Preserve parent task id for data frame analytics (#55046 ) This change makes sure that all internal client requests spawned by the data frame analytics persistent task executor and that use the end user security credentials, have the parent task id assigned. The objective here is to permit auditing (as well as tracking for debugging purposes) of all the end-user requests executed on its behalf by persistent tasks. Because data frame analytics task already implements graceful shutdown of child tasks, this change does not interfere with it by opting out of the persistent task cancellation of child tasks. Relates #54943 #52314	2020-04-10 22:27:21 +03:00
Mark Vieira	5d4ddf9146	Fixes for IntelliJ IDEA 2020.1 support (#55077 )	2020-04-10 11:57:48 -07:00
Nik Everett	c00811f3a3	Make some agg tests easier to read (#54954 ) (#55079 ) We added a fancy method to provide random realistic test data to the reduction tests in #54910. This uses that to remove some of the more esoteric machinations in the agg tests. This will marginally increase the coverage of the serialization tests and, more importantly, remove some mysterious value generation code that only really made sense for random reduction tests but was used all over the place. It doesn't, on the other hand, make the tests shorter. Just hopefully more clear. I only cleaned up a few tests this way. If we like this it'd probably be worth grabbing others.	2020-04-10 14:15:30 -04:00
Luca Cavanna	93c39ad4e7	Async search: create internal index only before storing initial response (#54619 ) We currently create the .async-search index if necessary before performing any action (index, update or delete). Truth is that this is needed only before storing the initial response. The other operations are either update or delete, which will anyways not find the document to update/delete even if the index gets created when missing. This also caused `testCancellation` failures as we were trying to delete the document twice from the .async-search index, once from `TransportDeleteAsyncSearchAction` and once as a consequence of the search task being completed. The latter may be called after the test is completed, but before the cluster is shut down and causing problems to the after test checks, for instance if it happens after all the indices have been cleaned up. It is totally fine to try to delete a response that is no longer found, but not quite so if such call will also trigger an index creation. With this commit we remove all the calls to createIndexIfNecessary from the update/delete operation, and we leave one call only from storeInitialResponse which is where the index is expected to be created. Closes #54180	2020-04-10 18:24:05 +02:00
Ross Wolf	96a903b17f	EQL: Add string function (#54470 ) * EQL: Add string() function * EQL: Reorder queryfolder_tests * EQL: Add test queries * EQL: Fix InternalEqlScriptUtils.string and test case * EQL: Fix testStringFunctionWithText error message * EQL: Flatten ToStringFunctionPipe.equals * EQL: Reorder painless whitelist * EQL: Address feedback and remove string(null) handling * EQL: Move string(pid) test over * EQL: Rename source -> value	2020-04-10 09:48:29 -06:00
Przemysław Witek	17101d86d9	[7.x] Do not execute ML CRUD actions when upgrade mode is enabled (#54437 ) (#55049 )	2020-04-10 16:07:11 +02:00
Dimitrios Liappis	b062535e27	Mute testSearchableSnapshotAction in TimeSeriesLifecycleActions tests (#55055 ) Backport of #55052 Details in #55050	2020-04-10 16:03:09 +03:00
Jason Tedor	a370668fcc	Clean up even more instances of "metaData" We recently cleaned up the use of the word "metadata" across the codebase. Even more additional uses have trickled in, likely from in-progress work. This commit cleans up these last few additional instances. Relates #54519	2020-04-10 08:52:37 -04:00
Jason Tedor	9eeae59a83	Clarify available processors (#54907 ) The use of available processors, the terminology, and the settings around it have evolved over time. This commit cleans up some places in the code and in the docs to adjust to the current terminology.	2020-04-10 08:48:27 -04:00
Costin Leau	a7e4f79e8f	EQL: Deprecate lenient sequence declaration (#55032 ) Deprecate alternative sequence parameter declaration (with then by) Disallow lack of time units inside maxspan Fix #55023 Relate #54680 (cherry picked from commit 201adafba9def1de4bf843760defb9def3394f63)	2020-04-10 10:30:07 +03:00
Marios Trivyzas	bf0cadb602	SQL: Implement DATETIME_PARSE function for parsing strings (#54960 ) (#55035 ) Implement DATETIME_PARSE(<datetime_str>, <pattern_str>) function which allows parsing a datetime string according to the specified pattern into a datetime object. The patterns allowed are those of java.time.format.DateTimeFormatter. Relates to #53714 (cherry picked from commit 3febcd8f3cdf9fdda4faf01f23a5f139f38b57e0)	2020-04-10 01:16:29 +02:00
Nhat Nguyen	c9f8fb2dd0	Clear recent errors when auto-follow successfully (#54997 ) Today, we do not clear the recent errors in AutoFollowCoordinator when we successfully auto-follow indices. This can lead to confusion for the operators.	2020-04-09 14:35:16 -04:00
Albert Zaharovits	f55a361b64	Preserve Task Id for ML Datafeed (#54943 ) This change preserves the task id for internal requests for the `StartDatafeedPersistentTask`. Task ids are a way to express a relationship between related internal requests. In this particular case, the task ids are used for debugging and (soon) security auditing, but not for task cancellation, because there is already a graceful-shutdown of child internal requests (given a task id) in place.	2020-04-09 13:22:29 +03:00
Hendrik Muhs	223fbb2ae7	[Transform] fix sporadic test failure due to unavailable notif… (#54939 ) move no initializing shards check before dumping audit messages fixes #54810	2020-04-09 08:04:42 +02:00
Andrei Stefan	85f129a50a	EQL: indexOf function implementation (#54543 ) (#54989 ) (cherry picked from commit a4b1d6e52d9ba22d541dd86d69861b1efee83604)	2020-04-09 02:41:01 +03:00
Mark Vieira	1552f2fa3e	Enable searchable snapshots for release tests (#54987 )	2020-04-08 14:41:03 -07:00
Mark Vieira	0fa8a14bcb	Mute SamlServiceProviderDocumentTests.testStreamRoundTripWithAllFields	2020-04-08 12:56:36 -07:00
Jay Modi	3600c9862f	Reintroduce system index APIs for Kibana (#54935 ) This change reintroduces the system index APIs for Kibana without the changes made for marking what system indices could be accessed using these APIs. In essence, this is a partial revert of #53912. The changes for marking what system indices should be allowed access will be handled in a separate change. The APIs introduced here are wrapped versions of the existing REST endpoints. A new setting is also introduced since the Kibana system indices' names are allowed to be changed by a user in case multiple instances of Kibana use the same instance of Elasticsearch. Relates #52385 Backport of #54858	2020-04-08 09:08:49 -06:00
Bogdan Pintea	8d6d7b88d8	SQL: drop BASE TABLE type in favour for just TABLE (#54836 ) (#54951 ) * Drop BASE TABLE type in favour for just TABLE This commit drops the table type 'BASE TABLE' and replaces all occurrences with just 'TABLE', since this type is more widely used and friendlier to the client applications that query for certain table types in their discovery mode. The 'TABLE' type is also explicitly mentioned by the JDBC and ODBC standards and although other data source-specific types are permitted, older apps will not work well with them. * Refactor table type constants out of IndexType Move SQL_TABLE/_ALIAS out of IndexType, so that they can also be used in that Enum definition. (cherry picked from commit 70241b52697ac2cf71004040042123c1ec050299)	2020-04-08 16:02:12 +02:00
Marios Trivyzas	6afd60b082	SQL: Implement DATETIME_FORMAT function for date/time formatting (#54832 ) (#54942 ) Implement DATETIME_FORMAT(<date/datetime/time>, ) function which allows for formatting a timestamp to the specified format. The patterns allowed are those of java.time.format.DateTimeFormatter. Related to #53714 (cherry picked from commit 72be0b54a9299e87e785469cdc9aafac2a48c046)	2020-04-08 13:45:47 +02:00
David Turner	0d2195191d	Allocate searchable snapshots with the balancer (#54889 ) Today the shards of searchable snapshots are allocated with a naive `ExistingShardsAllocator` which selects the first valid node for each shard. Thanks to #54729 we can now allow these shards to fall through to the balanced shards allocator so that they are allocated in a more balanced fashion. Relates #50999	2020-04-08 10:02:42 +01:00
Ryan Ernst	37795d259a	Remove guava from transitive compile classpath (#54309 ) (#54695 ) Guava was removed from Elasticsearch many years ago, but remnants of it remain due to transitive dependencies. When a dependency pulls guava into the compile classpath, devs can inadvertently begin using methods from guava without realizing it. This commit moves guava to a runtime dependency in the modules where it is needed. Note that one special case is the html sanitizer in watcher. The third party dep uses guava in the PolicyFactory class signature. However, only calling a method on the PolicyFactory actually causes the class to be loaded, a reference alone does not trigger compilation to look at the class implementation. There we utilize a MethodHandle for invoking the relevant method at runtime, where guava will continue to exist.	2020-04-07 23:20:17 -07:00
Aleksandr Maus	d02f774cb6	EQL: implement cidrMatch function (#54186 ) (#54928 ) Related to https://github.com/elastic/elasticsearch/issues/54132	2020-04-07 22:07:28 -04:00
Tal Levy	254d1e3543	[7.x] Create new `geo` module and migrate geo_shape registration (#53562 ) (#54924 ) This commit introduces a new `geo` module that is intended to contain all the geo-spatial-specific features in server. As a first step, the responsibility of registering the geo_shape field mapper is moved to this module. Co-authored-by: Nicholas Knize <nknize@gmail.com>	2020-04-07 16:30:58 -07:00
Aleksandr Maus	de381271f1	EQL: implement stringContains function (#54380 ) (#54923 )	2020-04-07 17:55:13 -04:00
Nik Everett	ce7ae4a7d1	Remove pipeline aggs from agg result tree (backport of #54716 ) (#54920 ) This removes pipeline aggregators from the aggregation result tree except for a single field used for backwards compatibility with pre-7.8 versions of Elasticsearch. That field isn't populated unless we are serializing to pre-7.8 Elasticsearch. So, good news! We no longer build pipeline aggregators on the data node. Most of the time.	2020-04-07 17:22:23 -04:00
Nik Everett	100f7258c7	Improve agg reduce tests (#54910 ) (#54914 ) This allows subclasses of `InternalAggregationTestCase` to make a `List` of values to reduce so that it can make values that are realistic together. The first use of this is with `InternalTTest` which uses it to make results that don't cause their `sum` field to wrap. It'd likely be useful for a ton of other aggs but just one for now.	2020-04-07 17:22:04 -04:00
Aleksandr Maus	868798e4db	EQL: implement between function (#54277 ) (#54913 )	2020-04-07 16:52:30 -04:00
Costin Leau	8b1e87cb61	EQL: Change query folding spec from new lines to ; (#54882 ) The usage of blank lines as separator between tests can be tricky to deal with in case of merges where such lines can be added by accident. Furthermore, counting non-consecutive lines is non-intuitive. The tests have been aligned to use ; at the end of the query and exceptions so that the presence or absence of empty lines is irrelevant. The parsing of the spec has been changed to perform validation to not allow invalid/incomplete specs to cause exceptions. (cherry picked from commit 192ad88d3a51e1e1f1f82830526518720ec88217)	2020-04-07 21:57:06 +03:00
Tanguy Leroux	b8d2b952b8	Only one of azure key or token can be specified in 3rd party tests (#54876 ) #54803 introduces more QA tests for Azure storage service, but they fail the build if one of the key or token is missing. It should instead work like repository-azure:qa tests.	2020-04-07 19:36:48 +02:00
Nik Everett	faa687c0ae	Fix InternalTTestTests `testReduceRandom` was bumping up against the serialization that I added in #54776. This makes it use random values that reduce in ways that don't cause the randomized serialization to fail.	2020-04-07 11:51:54 -04:00
Larry Gregory	8c8baa10f4	[Backport] Add reserved_ml_user and reserved_ml_admin kibana p… (#54837 ) * add reserved_ml_user and reserved_ml_admin kibana privileges * address feedback, update dataframe roles * fix checkstyle failure	2020-04-07 11:42:11 -04:00
Dimitris Athanasiou	9b4ac60b53	[7.x][ML] Cancel reindex task from correct thread context (#54874 ) (#54898 ) When a data frame analytics job is stopped, if the reindexing task was still in progress we cancel it. Cancelling it should be done from the same context as when we executed the reindexing task. That means from a thread context with ML origin. Backport of #54874	2020-04-07 18:11:58 +03:00
Andrei Dan	bbc57828c4	ILM fix retry delete action test (#54809 ) (#54895 ) Asserting on the failed_step field from the explainAPI can produce flakiness because the ILM state is moved back and forth between the (failing) step and the ERROR step (as the workflow is retry, fail then move to ERROR step, move back to the (failing) step, retry, fail, etc) and the failed_step information is only available whilst in the ERROR state. Unmute other tests as they were collateral failures A read-only index could not be deleted in the wipeCluster phase and caused these failures (cherry picked from commit 99a6d57aeb3cf11abc38b514f38a96bb1612e357) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-04-07 15:55:56 +01:00
Nik Everett	3c56e0de42	Fix scripted metric in ccs (backport of #54776 ) (#54888 ) `scripted_metric` did not work with cross cluster search because it assumed that you'd never perform a partial reduction, serialize the results, and then perform a final reduction. That serialized-after-partial-reduction step was broken. This is also required to support #54758.	2020-04-07 10:43:00 -04:00
Ignacio Vera	076c199484	Add new point field. (#53804 ) (#54879 ) This commit adds a new point field that is able to index arbitrary pair of values (x/y) in the cartesian space. It only supports filtering using shape queries at the moment.	2020-04-07 15:28:50 +02:00
Tanguy Leroux	4d36917e52	Merge feature/searchable-snapshots branch into 7.x (#54803 ) (#54825 ) This is a backport of #54803 for 7.x. This pull request cherry picks the squashed commit from #54803 with the additional commits: 6f50c92 which adjusts master code to 7.x a114549 to mute a failing ILM test (#54818) 48cbca1 and 50186b2 that clean up and fix the previous test aae12bb that adds a missing feature flag (#54861) 6f330e3 that adds missing serialization bits (#54864) bf72c02 that adjusts the version in YAML tests a51955f that adds some plumbing for the transport client used in integration tests Co-authored-by: David Turner <david.turner@elastic.co> Co-authored-by: Yannick Welsch <yannick@welsch.lu> Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com> Co-authored-by: Andrei Dan <andrei.dan@elastic.co>	2020-04-07 13:28:53 +02:00
David Roberts	8f2ddaee1a	[TEST] Allow kb or mb for data frame analytics memory estimate (#54869 ) This change is to support the switch from kb to mb being made in https://github.com/elastic/ml-cpp/pull/1126	2020-04-07 11:28:29 +01:00
David Roberts	df4ae79b41	[TEST] Unmute CategorizationIT.testNumMatchesAndCategoryPreference (#54868 ) Should work again now that https://github.com/elastic/ml-cpp/issues/1121 is resolved. Backport of #54768	2020-04-07 11:04:31 +01:00
Armin Braun	1039cae2cc	Fix Repository Consistency TODOs from SLM Tests (#54767 ) (#54860 ) These TODOs don't apply any longer with the repository generation now being tracked consistently so we can remove the workarounds.	2020-04-07 09:27:50 +02:00
Jim Ferenczi	c7ff67ddef	Preserve final response headers in asynchronous search (#54349 ) This change adds the response headers of the original search request in the stored response in order to be able to restore them when retrieving a result from the async-search index. It also ensures that response headers are preserved for users that retrieve a final response on a running search task. Partial response can eventually return response headers too but this change only ensures that they are present when the response is final. Relates #33936	2020-04-07 08:37:03 +02:00
Jim Ferenczi	d57a047ab7	Fix transport serialization of AsyncSearchUser (#54761 ) This change ensures that the AsyncSearchUser is correctly (de)serialized when an action executed by this user is sent to a remote node internally (via transport client).	2020-04-07 08:25:58 +02:00
Costin Leau	99846f47b7	QL: Introduce infrastructure for surrogate functions (#54795 ) Some functions act as shortcuts for more verbose declarations (sometimes with certain constraints). This PR removes the boilerplate around declaring such functions as well as a dedicated rule for the optimizer to perform the actual substitution. Fix #54334 (cherry picked from commit 3231d01b0c583deb89252fafe84db48878da3246)	2020-04-07 00:46:50 +03:00
Costin Leau	36121117f0	EQL: Sequence/Join parsing and model (#54227 ) Add parsing and (logical) domain model for sequence and join (cherry picked from commit 9e9632d41a39877256c68634ab18e441f4b67fe8)	2020-04-06 23:15:35 +03:00
Igor Motov	1aa87cd4a9	EQL: Make EQL search task cancellable (#54598 ) First step towards async search execution. At the moment we don't try to cancel the underlying search requests, and just check if the task is canceled before performing network operation (such as field caps and search) Relates to #49638	2020-04-06 13:38:03 -04:00
Igor Motov	2794572a35	[7.x] Add Student's t-test aggregation support (#54469 ) (#54737 ) Adds t_test metric aggregation that can perform paired and unpaired two-sample t-tests. In this PR support for filters in unpaired is still missing. It will be added in a follow-up PR. Relates to #53692	2020-04-06 11:36:47 -04:00
Dimitris Athanasiou	0049e9467b	[7.x][ML] Fix node serialization on GET df-analytics stats without id (#54808 ) (#54812 ) Previously, the id of the `GetDataFrameAnalyticsStatsAction.Request` could be `null` which caused NPE on serialization as `writeString` is used (it doesn't accept null values). This commit ensures the id is never null. Closes #54807 Backport of #54808	2020-04-06 18:13:16 +03:00
David Kyle	03bc368c14	Wait for ML templates after creating a new cluster in TooManyJobsIT (#54801 )	2020-04-06 13:45:56 +01:00
Dimitris Athanasiou	ed4ef78330	[7.x][ML] Increase open job wait time in MlDistributedFailureIT (#54792 ) (#54798 ) It seems the 20 seconds timeout is occasionally not enough. We still get sporadic failures where the logs reveal the job wasn't opened within 20 seconds. I'm increasing the wait time to 30 seconds. Closes #54448 Backport of #54792	2020-04-06 14:51:46 +03:00
Tim Vernum	30b01fe00d	Resolve SSO roles by pattern (#54777 ) This changes a SamlServiceProvider to have a function that maps from an "action-name" to set of role-names instead of a Map that does so. The on-disk representation of this mapping is a set of Java Regexp Patterns, for which the first matching group is the role name. For example "sso:(\w+)" would map any action that started with "sso:" to the corresponding role name (e.g. "sso:superuser" -> "superuser"). Backport of: #54440	2020-04-06 14:10:30 +10:00
Jason Tedor	b939b47b77	Add wire tests for get autoscaling decision objects This commit adds wire serializing tests for the get autoscaling decision request and response objects.	2020-04-05 21:34:36 -04:00
Jason Tedor	8f520f0a9c	Add wire tests to delete autoscaling policy request This commit adds some wire serializing tests for delete autoscaling policy requests.	2020-04-05 21:34:35 -04:00
Jason Tedor	98c4165348	Add wire tests for put autoscaling policy request This commit adds some wire serializing tests for put autoscaling policy requests.	2020-04-05 21:34:34 -04:00
Tim Vernum	cf442aae38	Resolve SERVICE_UNAVAILABLE in IdP IntegTest (#54700 ) The SamlIdentityProviderTests IntegTests would sometimes encounter a service unavailable exception when registering a new service provider. This change ensures that there is a data node, and that the cluster state is recovered before registering providers Backport of: #54622	2020-04-06 11:23:08 +10:00
Jason Tedor	b2cd858f29	Return 404s when autoscaling policies do not exist (#54774 ) This commit updates the autoscaling get and delete policy APIs to return 404s when the named policy does not exist.	2020-04-05 21:05:11 -04:00
Jason Tedor	184c038f59	Add get autoscaling policy API (#54762 ) This commit adds the get autoscaling policy API.	2020-04-04 18:04:25 -04:00
Nhat Nguyen	73d24203e7	Handle no such remote cluster exception in ccr (#53415 ) A remote client can throw a NoSuchRemoteClusterException while fetching the cluster state from the leader cluster. We also need to handle that exception when retrying to add a retention lease to the leader shard. Closes #53225	2020-04-04 13:55:06 -04:00
Jason Tedor	2a94672c32	Separate autoscaling REST test cases A couple of the autoscaling REST tests combine multiple tests into a single REST test. This commit separates them into single tests.	2020-04-04 10:21:21 -04:00
Jason Tedor	d5a195ab3d	Rename the policies in put autoscaling REST tests The autoscaling REST tests use policies named "hot" in their test cases. Instead, this commit changes the name of these policies to "my_autoscaling_policy".	2020-04-04 10:14:11 -04:00
Jason Tedor	79c72cd398	Migrate common autoscaling test code to base class This commit moves some code repeated in a few autoscaling tests related to writeable and x-content registries to the autoscaling tests base class.	2020-04-04 09:57:20 -04:00
Jason Tedor	dd99e6d951	Simplify name of delete autoscaling policy handler The name here is unnecessarily long, containing the word "action" when it does not need to. This commit simplifies the name.	2020-04-03 21:46:48 -04:00
Ross Wolf	022f829d84	EQL: Add wildcard function (#54020 ) * EQL: Add wildcard function * EQL: Cleanup Wildcard.getArguments * EQL: Cleanup Wildcard and rearrange methods * EQL: Wildcard newline lint * EQL: Make StringUtils function final * EQL: Make Wildcard.asLikes return ScalarFunction * QL: Restore BinaryLogic.java * EQL: Add Wildcard PR feedback * EQL: Add Wildcard verification tests * EQL: Switch wildcard to isFoldable test * EQL: Change wildcard test to numeric field * EQL: Remove Wildcard.get_arguments	2020-04-03 10:15:43 -06:00
Ioannis Kakavas	8e255337f8	Fix SamlServiceProviderDocumentTests (#54718 ) (#54723 ) Don't assume byte for byte equality because internal structures do not guarantee order	2020-04-03 18:46:36 +03:00
Christoph Büscher	8c9ac14a98	Rename field name constants in AbstractBuilderTestCase (#53234 ) Some field name constants were not updated when we moved from "string" to "text" and "keyword" fields. Renaming them makes it easier and faster to know which field type is used in tests subclassing this base test case.	2020-04-03 17:28:22 +02:00
David Roberts	470aa9a5f1	[TEST] Mute CategorizationIT.testNumMatchesAndCategoryPreference (#54717 ) The test results are affected by the off-by-one error that is fixed by https://github.com/elastic/ml-cpp/pull/1122 This test can be unmuted once that fix is merged and has been built into ml-cpp snapshots.	2020-04-03 14:40:47 +01:00
Dimitris Athanasiou	e8c0351fd8	[7.x][ML] Allow force stopping failed and stopping DF analytics (#54650 ) (#54712 ) Force stopping a failed job used to work but it now puts the job in `stopping` state and hangs. In addition, force stopping a `stopping` job is not handled. This commit addresses those issues with force stopping data frame analytics. It inlines the approach with that followed for anomaly detection jobs. Backport of #54650	2020-04-03 16:08:06 +03:00
Maria Ralli	aa697346c4	Remove Xlint exclusions from gradle files (part 2) Backport of #54576. This commit is part of issue #40366 to remove disabled Xlint warnings from gradle files. Remove the Xlint exclusions from the following files: - x-pack/plugin/rollup/build.gradle - x-pack/plugin/monitoring/build.gradle - x-pack/qa/rolling-upgrade-basic/build.gradle Add type parameters to parameterized types. Add wildcard-type parameters or bounded wildcard-type parameters. Suppress `unchecked` and `rawtypes` warnings at method level.	2020-04-03 12:15:42 +01:00
Christoph Büscher	9f22c0d37c	Fix Eclipse compile problem in ModelLoadingService (#54670 ) Current Eclipse 4.14.0 cannot deal with the direct lambda notation, changing to an explicit one.	2020-04-03 11:56:30 +02:00
Julie Tibshirani	5fb7602227	Disallow changing 'enabled' on the root mapper. (#54681 ) In #33933 we disallowed changing the `enabled` parameter in object mappings. However, the fix didn't cover the root object mapper. This PR adjusts the change to also include the root mapper and clarifies the error message.	2020-04-02 15:28:48 -07:00
Benjamin Trent	6e73f67f3b	[ML] unmute categorization test for native backport (#54679 )	2020-04-02 17:08:19 -04:00
Benjamin Trent	7fe38935f6	[ML] add training_percent to analytics process params (#54605 ) (#54678 ) This adds training_percent parameter to the analytics process for Classification and Regression. This parameter is then used to give more accurate memory estimations. See native side pr: elastic/ml-cpp#1111	2020-04-02 17:08:06 -04:00
Nik Everett	54ea4f4f50	Begin to drop pipeline aggs from the result tree (backport of #54311 ) (#54659 ) Removes pipeline aggregations from the aggregation result tree as they are no longer used. This stops us from building the pipeline aggregators at all on data nodes except for backwards compatibility serialization. This will save a tiny bit of space in the aggregation tree which is lovely, but the biggest benefit is that it is a step towards simplifying pipeline aggregators. This only does about half of the work to remove the pipeline aggs from the tree. Removing all of it would, well, double the size of the change and make it harder to review.	2020-04-02 16:45:12 -04:00
Benjamin Trent	4a1610265f	[7.x] [ML] add new inference_config field to trained model config (#54421 ) (#54647 ) * [ML] add new inference_config field to trained model config (#54421) A new field called `inference_config` is now added to the trained model config object. This new field allows for default inference settings from analytics or some external model builder. The inference processor can still override whatever is set as the default in the trained model config. * fixing for backport	2020-04-02 12:25:10 -04:00
Jason Tedor	2113c1ffb6	Fix autoscaling internal cluster release tests This commit addresses an issue with the autoscaling feature flag not being registered in release builds of the internal cluster tests. This commit addresses this by enabling the system property that is needed, but only in release builds.	2020-04-02 11:48:02 -04:00
Benjamin Trent	65233383f6	[7.x] [ML] prefer secondary authorization header for data[feed\|frame] authz (#54121 ) (#54645 ) * [ML] prefer secondary authorization header for data[feed\|frame] authz (#54121) Secondary authorization headers are to be used to facilitate Kibana spaces support + ML jobs/datafeeds. Now on PUT/Update/Preview datafeed, and PUT data frame analytics the secondary authorization is preferred over the primary (if provided). closes https://github.com/elastic/elasticsearch/issues/53801 * fixing for backport	2020-04-02 11:20:25 -04:00
Zachary Tong	20d67720aa	Refactor Percentiles/Ranks aggregation builders and factories (#51887 ) (#54537 ) - Consolidates HDR/TDigest factories into a single factory - Consolidates most HDR/TDigest builder into an abstract builder - Deprecates method(), compression(), numSigFig() in favor of a new unified PercentileConfig object - Disallows setting algo options that don't apply to current algo The unified config method carries both the method and algo-specific setting. This provides a mechanism to reject settings that apply to the wrong algorithm. For BWC the old methods are retained but marked as deprecated, and can be removed in future versions. Co-authored-by: Mark Tozzi <mark.tozzi@gmail.com>	2020-04-02 10:39:41 -04:00
Jason Tedor	7467cc04ec	Remove toXContent from autoscaling request classes (#54643 ) These methods are not needed, we were only following a pattern in the rest of the codebase, but it's legacy from the HLRC sharing request/response objects with the server.	2020-04-02 10:30:20 -04:00
David Roberts	4b4800e096	[ML] Take more care that normalize processes use unique named pipes (#54641 ) When one of ML's normalize processes fails to connect to the JVM quickly enough and another normalize process for the same job starts shortly afterwards it is possible that their named pipes can get mixed up. This change avoids the risk of that by adding an incrementing counter value into the named pipe names used for normalize processes. Backport of #54636	2020-04-02 14:25:31 +01:00
Benjamin Trent	eb31be0e71	[7.x] [ML] add num_matches and preferred_to_categories to category definition objects (#54214 ) (#54639 ) * [ML] add num_matches and preferred_to_categories to category definition objects (#54214) This adds two new fields to category definitions. - `num_matches` indicating how many documents have been seen by this category - `preferred_to_categories` indicating which other categories this particular category supersedes when messages are categorized. These fields are only guaranteed to be up to date after a `_flush` or `_close` native change: https://github.com/elastic/ml-cpp/pull/1062 * adjusting for backport	2020-04-02 09:09:19 -04:00
Jason Tedor	54ecb009bb	Add delete autoscaling policy API (#54601 ) This commit adds an API for deleting autoscaling policies.	2020-04-02 09:05:12 -04:00
Martijn Laarman	0ed20cc349	Rename cat.transform => cat.transforms (#54438 ) * Rename cat.transform => cat.transforms To match the url. We typically prefer singular url nouns but _cat tends to use plural and this API does in fact use `/_cat/transforms` * also rename the api in the spec and tests (cherry picked from commit c495d220ac8fedba7f70f82387cd6d6a672b8b14)	2020-04-02 09:40:51 +02:00
Tim Vernum	c40ec6a577	Turn on trace logging for failing test (#54623 ) SamlIdentityProviderTests is failing with 409 conflicts that have not been reproducible outside of CI. This change turns on additional logging in this test to determine why these conflicts occur. Relates: #54423 Backport of: #54475	2020-04-02 16:15:12 +11:00
Russ Cam	2978024375	Update rest API specs (#54252 ) This commit updates the rest API specs to validate against a JSON schema for the specifications. Most updates are to add a description, whilst others fix typos and unify conventions e.g. deprecations, descriptions, urls starting with /. The schema conforms to draft-07 JSON schema. (cherry picked from commit da37e01d32f9764c3937736ef0c7d3ab40af9a77)	2020-04-02 10:53:32 +10:00
Russ Cam	a2f59a2744	Add hidden value to expand_wildcards params (#54551 ) This commit adds the hidden enum value to all expand_wildcards params (cherry picked from commit 581b8cdabe11444105edb62226b439ba4c7e908a)	2020-04-02 09:01:20 +10:00
William Brafford	958e9d1b78	Refactor nodes stats request builders to match requests (#54363 ) (#54604 ) * Refactor nodes stats request builders to match requests (#54363) * Remove hard-coded setters from NodesInfoRequestBuilder * Remove hard-coded setters from NodesStatsRequest * Use static imports to reduce clutter * Remove uses of old info APIs	2020-04-01 17:03:04 -04:00
Jason Tedor	fd729a6509	Fix the name of an autoscaling policy test The test name says it is testing the put autoscaling decision API, but that is not right, since no such API exists (nor will exist). This commit corrects the name of this test to reflect the fact that the test is about the put autoscaling policy API.	2020-04-01 16:36:47 -04:00
Mayya Sharipova	bf4857d9e0	Search hit refactoring (#41656 ) (#54584 ) Refactor SearchHit to have separate document and meta fields. This is a part of bigger refactoring of issue #24422 to remove dependency on MapperService to check if a field is metafield. Relates to PR: #38373 Relates to issue #24422 Co-authored-by: sandmannn <bohdanpukalskyi@gmail.com>	2020-04-01 15:19:00 -04:00
Jason Tedor	8ed1a6cdb6	Use List.of convenience methods in 7.x autoscaling We do not have access to JDK 9 collection convenience methods in 7.x because we are compatible with JDK 8 there. Yet, we have recently added a substitute for these convenience methods that even delegate to the right places when running on JDK 9, to make backporting easier. This commit utilizes these new methods in the autoscaling codebase.	2020-04-01 12:17:56 -04:00
Ioannis Kakavas	c9ffa379ba	[7.x] Add end to end QA authentication test (#54215 ) (#54567 ) Use the same ES cluster as both an SP and an IDP and perform IDP initiated and SP initiated SSO. The REST client plays the role of both the Cloud UI and Kibana in these flows Backport of #54215 * fix compilation issues	2020-04-01 18:35:21 +03:00
Jason Tedor	a039f45604	Fix autoscaling metadata not adding X-Pack mix-in This commit addresses an issue with the autoscaling metadata not implementing a required interface, used in the feature aware checks.	2020-04-01 08:42:01 -04:00
Jason Tedor	f670ae0bc8	Introduce autoscaling policies (#54473 ) This commit is the first in a series of commits that introduces autoscaling policies, and APIs for working with them. For now, we introduce the basic infrastructure, and a single API for putting an autoscaling policy. We will follow in rapid succession with APIs for getting, and deleting autoscaling policies.	2020-04-01 08:12:26 -04:00
Przemysław Witek	1fe2705826	Skip daily maintenance activity if upgrade mode is enabled (#54565 ) (#54571 )	2020-04-01 13:29:34 +02:00
Ioannis Kakavas	1cff6897f3	Add error message in JSON response (#54389 ) (#54562 ) When the SAML authentication is not successful, we return a SAML Response with a status that indicates a failure. This commit adds an error message in the REST API response along with the SAML Response XML string so that the caller of the API can identify that this is an unsuccessful response without needing to parse the XML.	2020-04-01 13:02:52 +03:00
Luca Cavanna	d75571ff0f	[TEST] rename AsyncSearchActionTests to IT and move it out of unit tests (#54520 ) `AsyncSearchActionTests` currently fails quite often. That is since the introduction of `RestSubmitAsyncSearchActionTests` which indirectly manipulates the channels being tracked in `RestCancellableNodeClient`. There are channels left in the map after `RestSubmitAsyncSearchActionTests` is run, and later `AsyncSearchActionTests` checks that there are no channels in the map which makes each test method fail. This is particularly hard to reproduce as the order in which tests are run appears to be platform dependent. The test cluster assertion that there are no channels in the map only makes sense in the context of internal cluster tests, while there may be collisions with unit tests that register http channels as part of their testing. This can be solved by renaming `AsyncSearchActionTests` to `AsyncSearchActionIT`. This way it won't be run as part of unit tests but rather within another JVM where the number of channels is `0` and such assumption holds, because there is no expected manual manipulation of the channels. Relates to #54180	2020-04-01 11:23:27 +02:00
Ioannis Kakavas	74eeecf91b	Fix testGenerateAndSignMetadata in FIPS mode (#54115 ) (#54387 ) BC provider throws different error message on signature validation failure	2020-04-01 12:04:20 +03:00
Jason Tedor	63e5f2b765	Rename META_DATA to METADATA This is a follow up to a previous commit that renamed MetaData to Metadata in all of the places. In that commit in master, we renamed META_DATA to METADATA, but lost this on the backport. This commit addresses that.	2020-03-31 17:30:51 -04:00
Jason Tedor	5fcda57b37	Rename MetaData to Metadata in all of the places (#54519 ) This is a simple naming change PR, to fix the fact that "metadata" is a single English word, and for too long we have not followed general naming conventions for it. We are also not consistent about it, for example, METADATA instead of META_DATA if we were trying to be consistent with MetaData (although METADATA is correct when considered in the context of "metadata"). This was a simple find and replace across the code base, only taking a few minutes to fix this naming issue forever.	2020-03-31 17:24:38 -04:00
Zachary Tong	c9db2de41d	[7.x] Comprehensively test supported/unsupported field type:agg combinations (#54451 ) * Comprehensively test supported/unsupported field type:agg combinations (#52493) This adds a test to AggregatorTestCase that allows us to programmatically verify that an aggregator supports or does not support a particular field type. It fetches the list of registered field type parsers, creates a MappedFieldType from the parser and then attempts to run a basic agg against the field. A supplied list of supported VSTypes is then compared against the output (success or exception) and succeeds or fails the test accordingly. Co-Authored-By: Mark Tozzi <mark.tozzi@gmail.com> * Skip fields that are not aggregatable * Use newIndexSearcher() to avoid incompatible readers (#52723) Lucene's `newSearcher()` can generate readers like ParallelCompositeReader which we can't use. We need to instead use our helper `newIndexSearcher`	2020-03-31 14:35:03 -04:00
David Roberts	b8f06df53f	[ML] Fix bug, add tests, improve estimates for estimate_model_memory (#54508 ) This PR: 1. Fixes the bug where a cardinality estimate of zero could cause a 500 status 2. Adds tests for that scenario and a few others 3. Adds sensible estimates for the cases that were previously TODO Backport of #54462	2020-03-31 17:59:38 +01:00
David Kyle	9150e77269	[7.x] Remove unused environment from anomaly detector classes (#54399 ) (#54456 )	2020-03-31 16:55:37 +01:00
Dimitris Athanasiou	e4230c533c	[7.x][ML] Move DFA MemoryUsage to stats.common pkg (#54492 ) (#54512 ) This belongs in stats.common Backport of #54492	2020-03-31 18:36:05 +03:00
Andrei Stefan	977302e46c	EQL: startsWith and endsWith functions implementation (#54504 ) * EQL: startsWith function implementation (#54400) (cherry picked from commit 666719fcfc40f6fc0535609577791369123320ab) * EQL: endsWith function implementation (#54442) (cherry picked from commit 554a4c8ef04b67eed107d29b57185e9af25d9d4f)	2020-03-31 18:06:03 +03:00
Dimitris Athanasiou	6d96ca9bc8	[7.x][ML] Reenable classification and regression integ tests (#54489 ) (#54494 ) Relates #54401 Backport of #54489	2020-03-31 17:50:08 +03:00
Andrei Stefan	364ea0a3c0	EQL: Length function implementation (#54209 ) (#54490 ) (cherry picked from commit 18493467e55e014be2c9e0ebdf734e9d7fc4beaa)	2020-03-31 16:49:18 +03:00
Ioannis Kakavas	349293da6d	Mute failing test (#54446 ) (#54487 ) see #54445	2020-03-31 15:56:10 +03:00
Tim Vernum	a0853628cd	Add wildcard service providers to IdP (#54477 ) This adds the ability for the IdP to define wildcard service providers in a JSON file within the ES node's config directory. If a request is made for a service provider that has not been registered, then the set of wildcard services is consulted. If the SP entity-id and ACS match one of the wildcard patterns, then a dynamic service provider is defined from the associated mustache template. Backport of: #54148	2020-03-31 16:53:13 +11:00
Jason Tedor	5d760051a9	Clarify autoscaling feature flag registration (#54427 ) This commit clarifies the autoscaling feature flag registration system property. The intention is that this system property is: - unset in snapshot builds - unset, true, or false in release builds - in release builds, unset behaves the same as false - therefore, we only register the enabled flag if the build is a snapshot build, or the build is a release build and the system property is set to true This commit clarifies that intention, and removes a confusing situation where the AUTOSCALING_FEATURE_FLAG_REGISTERED field would be set to false in a snapshot build, even though we were going to register the setting.	2020-03-30 21:37:25 -04:00
Ross Wolf	d11e977b1f	EQL: Use In from QL (#53244 ) * EQL: Use In from QL * EQL: Add more In tests * EQL: Test In duplicates * EQL: Add test for In mixed types * EQL: Copy In translation to QL * SQL: Use InComparisons from QL * EQL: Remove boost checks from QueryFolderOkTests * QL: Add TranslatorHandler.convert	2020-03-30 15:19:23 -06:00
Dimitris Athanasiou	b4b54efa73	[7.x][ML] Hyperparameter names should match config (#54401 ) (#54435 ) Java side of elastic/ml-cpp#1096 Backport of #54401	2020-03-30 23:32:40 +03:00
Ryan Ernst	c9421594bf	Remove allowTrial flag in license checking (#54293 ) The allowTrial flag is always true, since trial licenses act as though everything is licensed. This commit removes the allowTrial flag in license checking helper methods.	2020-03-30 12:22:38 -07:00
Nik Everett	e58ad9fed3	Clean up how pipeline aggs check for multi-bucket (backport of #54161 ) (#54379 ) Pipeline aggregations like `stats_bucket`, `sum_bucket`, and `percentiles_bucket` only operate on aggregations that have multiple buckets. This adds support for those aggregations to `geo_distance`, `ip_range`, `auto_date_histogram`, and `rare_terms`. This all happened because we used a marker interface to mark compatible aggs, `MultiBucketAggregationBuilder` and it was fairly easy to forget to implement the interface. This replaces the marker interface with an abstract method in `AggregationBuilder`, `bucketCardinality` which makes you return `NONE`, `ONE`, or `MANY`. The `bucket` aggregations can check for `MANY`. At this point `ONE` and `NONE` amount to about the same thing, but I suspect that'll be a useful distinction when validating bucket sorts. Closes #53215	2020-03-30 10:44:55 -04:00
Jason Tedor	39b3010578	Add node local storage deprecation check (#54383 ) The node.local_storage setting has been deprecated and will be removed in 8.0.0. This commit adds a deprecation check to 7.x.	2020-03-30 10:23:43 -04:00
Christoph Büscher	67b9b68c66	[Docs] Add HLRC Async Search API documentation (#54353 ) Adds documentation and a corresponding test case containing typical API usage for the Async Search API to the High Level Rest Client.	2020-03-30 15:37:22 +02:00
Przemysław Witek	3c604da7f6	[7.x] Create an annotation when a model snapshot is stored (#53783 ) (#54405 )	2020-03-30 15:17:08 +02:00
Benjamin Trent	374e76d7cd	[Transform] fixing naming in HLRC and _cat to match API content (#54300 ) (#54408 ) Fixing the naming of the HLRC values to match the ToXContent field names (i.e. the field names returned from an API call). Also fixes the names in the _cat API as well. closes #53946	2020-03-30 08:57:02 -04:00
Martijn van Groningen	4b4fbc160d	Refactor AliasOrIndex abstraction. (#54394 ) Backport of #53982 In order to prepare the `AliasOrIndex` abstraction for the introduction of data streams, the abstraction needs to be made more flexible, because currently it really can be only an alias or an index. * Renamed `AliasOrIndex` to `IndexAbstraction`. * Introduced a `IndexAbstraction.Type` enum to indicate what a `IndexAbstraction` instance is. * Replaced the `isAlias()` method that returns a boolean with the `getType()` method that returns the new Type enum. * Moved `getWriteIndex()` up from the `IndexAbstraction.Alias` to the `IndexAbstraction` interface. * Moved `getAliasName()` up from the `IndexAbstraction.Alias` to the `IndexAbstraction` interface and renamed it to `getName()`. * Removed unnecessary casting to `IndexAbstraction.Alias` by just checking the `getType()` method. Relates to #53100	2020-03-30 10:12:16 +02:00
Jason Tedor	d2aced810d	Add assertion for get autoscaling decision API test This commit adds a match assertion to the get autoscaling decision REST test.	2020-03-29 14:36:38 -04:00
Jason Tedor	512a318b4b	Do not stash environment in security (#54372 ) Today the security plugin stashes a copy of the environment in its constructor, and uses the stashed copy to construct its components even though it is provided with an environment to create these components. What is more, the environment it creates in its constructor is not fully initialized, as it does not have the final copy of the settings, but the environment passed in while creating components does. This commit removes that stashed copy of the environment.	2020-03-28 12:47:16 -04:00
Jason Tedor	cf68ac8a2c	Do not stash environment in machine learning (#54371 ) Today the machine learning plugin stashes a copy of the environment in its constructor, and uses the stashed copy to construct its components even though it is provided with an environment to create these components. What is more, the environment it creates in its constructor is not fully initialized, as it does not have the final copy of the settings, but the environment passed in while creating components does. This commit removes that stashed copy of the environment.	2020-03-28 12:46:16 -04:00
Tim Brooks	2ccddbfa88	Move transport decoding and aggregation to server (#54360 ) Currently all of our transport protocol decoding and aggregation occurs in the individual transport modules. This means that each implementation (test, netty, nio) must implement this logic. Additionally, it means that the entire message has been read from the network before the server package receives it. This commit creates a pipeline in server which can be passed arbitrary bytes to handle. Internally, the pipeline will decode, decompress, and aggregate the messages. Additionally, this allows us to run many megabytes of bytes through the pipeline in tests to ensure that the logic works. This work will enable future work: Circuit breaking or backoff logic based on message type and byte in the content aggregator. Sharing bytes with the application layer using the ref counted releasable network bytes. Improved network monitoring based specifically on channels. Finally, this fixes the bug where we do not circuit break on the correct message size when compression is enabled.	2020-03-27 14:13:10 -06:00
Stuart Tettemer	1630de4a42	Scripting: stats per context in nodes stats (#54008 ) (#54357 ) Adds script cache stats to `_node/stats`. If using the general cache: ``` "script_cache": { "sum": { "compilations": 12, "cache_evictions": 9, "compilation_limit_triggered": 5 } } ``` If using context caches: ``` "script_cache": { "sum": { "compilations": 13, "cache_evictions": 9, "compilation_limit_triggered": 5 }, "contexts": [ { "context": "aggregation_selector", "compilations": 8, "cache_evictions": 6, "compilation_limit_triggered": 3 }, { "context": "aggs", "compilations": 5, "cache_evictions": 3, "compilation_limit_triggered": 2 }, ``` Backport of: 32f46f2 Refs: #50152	2020-03-27 12:26:00 -06:00
Lee Hinman	f2cc2b1127	[7.x] Add REST APIs for IndexTemplateV2Metadata CRUD (#54039 ) (#54347 ) * Add REST APIs for IndexTemplateV2Metadata CRUD (#54039) * Add REST APIs for IndexTemplateV2Metadata CRUD This commit adds the get/put/delete APIs for interacting with the now v2 versions of index templates. These APIs are behind the existing `es.itv2_feature_flag_registered` system property feature flag. Relates to #53101 * Add exceptions for HLRC tests * Add skips for 7.x versions * Use index_template instead of template_v2 in action names * Add test for MetaDataIndexTemplateService.addIndexTemplateV2 * Move removal to static method and add test * Add unit tests for request classes (implement hashCode & equals) Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> * Fix compilation Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-03-27 10:47:22 -06:00
Christoph Büscher	0d17295601	[Docs] Minor fix for SubmitAsyncSearchRequest.keepOnCompletion javadoc (#54325 ) The semantics and the default value for this parameter have changed, adapting the javadoc accordingly.	2020-03-27 16:02:03 +01:00
Przemysław Witek	2eb079b67f	Add version guards around ML hidden indices settings (#54322 )	2020-03-27 14:50:57 +01:00
Ioannis Kakavas	5983f6aceb	Mute testSpInitiatedSsoFailsForMalformedRequest (#54328 ) (#54339 ) see #54285	2020-03-27 15:46:08 +02:00
Yannick Welsch	8126ad0ab1	Increase timeout on testUpdateAnalysisLeaderIndexSettings Closes #54204	2020-03-27 13:41:47 +01:00
Przemysław Witek	d40afc7871	[7.x] Do not fail Evaluate API when the actual and predicted fields' types differ (#54255 ) (#54319 )	2020-03-27 10:05:19 +01:00
Jason Tedor	c547fabb2b	Put CCR tasks on (data && remote cluster clients) (#54146 ) Today we assign CCR persistent tasks to nodes with the data role. It could be that the data node is not capable of connecting to remote clusters, in which case the task will fail since it can not connect to the remote cluster with the leader shard. Instead, we need to assign such tasks to nodes that are capable of connecting to remote clusters. This commit addresses this by enabling such persistent tasks to only be assigned to nodes that have the data role, and also have the remote cluster client role.	2020-03-26 23:50:16 -04:00
Hendrik Muhs	4ecf9904d5	[Transform] Transform optimize date histogram (#54068 ) optimize transform for group_by on date_histogram by injecting an additional range query. This limits the number of search and index requests and avoids unnecessary updates. Only recent buckets get re-written. fixes #54254	2020-03-26 21:39:50 +01:00
Gordon Brown	0d30b48613	Disallow negative TimeValues (#53913 ) This commit causes negative TimeValues, other than -1 which is sometimes used as a sentinel value, to be rejected during parsing. Also introduces a hack to allow ILM to load policies which were written to the cluster state with a negative min_age, treating those values as 0, which should match the behavior of prior versions.	2020-03-26 13:30:35 -06:00
William Brafford	14204f8381	Use set-based interface for NodesStatsRequest (#53637 ) (#54141 ) The NodesStatsRequest class uses a set of strings for its internal serialization. This commit updates the class's interface so that we no longer use hard-coded getters and setters, but rather methods that add strings directly. For example, the old way of adding "os" metrics to a request would be to call request.os(true). The new way of doing this is to call request.addMetric("os"). For the time being, the canonical list of metrics is an enum in NodesStatsRequest. This will eventually be replaced with something pluggable.	2020-03-26 14:41:49 -04:00
Dimitris Athanasiou	13368aae37	[7.x][ML] DF Analytics should always display operational stats (#54210 ) (#54290 ) This commit populates the _stats API response with sensible "empty" `data_counts` and `memory_usage` objects when the job itself has not started reporting them. Backport of #54210	2020-03-26 20:03:14 +02:00
Christoph Büscher	da404bbce2	HLRC: Don't send defaults for SubmitAsyncSearchRequest (#54200 ) (#54266 ) Currently we set the defaults for ccsMinimizeRoundtrips, preFilterShardSize and requestCache on the HLRC SubmitAsyncSearchRequest in the constructor. This is no longer needed since we now only send the parameters along with the rest request that are supported (omitting e.g. ccsMinimizeRoundtrips) and the correct defaults are set on the client side. This change removes setting and sending these defaults where possible, leaving only the overwrite of batchedReduceSize with a default value of 5, since the default used in the vanilla SearchRequest is 512. However, we don't need to send this value along as a request parameter if it's the default since the correct one will be set on the receiving end if no value is specified. Also adding tests for RestSubmitAsyncSearchAction that check the correct defaults are set when parameters are missing on the server side. Backport of #54200	2020-03-26 19:01:17 +01:00
David Turner	fc92bf4208	assertBusy in XPackRestIT#awaitCallApi (#54264 ) Retries in this method were lost in #45794. This commit reinstates them.	2020-03-26 16:16:05 +00:00
Dimitris Athanasiou	cc981fa377	[7.x][ML] Get ML filters size should default to 100 (#54207 ) (#54278 ) When get filters is called without setting the `size` parameter only up to 10 filters are returned. However, 100 filters should be returned. This commit fixes this and adds an integ test to guard it. It seems this was accidentally broken in #39976. Closes #54206 Backport of #54207	2020-03-26 17:51:43 +02:00
David Turner	f48e8f31b9	AwaitsFix for #54180	2020-03-26 15:35:36 +00:00
David Turner	ad3c96e250	AwaitsFix for #54093	2020-03-26 13:24:33 +00:00
David Turner	53e2fec93d	AwaitsFix for #53612	2020-03-26 10:41:37 +00:00
Yannick Welsch	1ba6783780	Schedule commands in current thread context (#54187 ) Changes ThreadPool's schedule method to run the schedule task in the context of the thread that scheduled the task. This is the more sensible default for this method, and eliminates a range of bugs where the current thread context is mistakenly dropped. Closes #17143	2020-03-26 10:07:59 +01:00
Luca Cavanna	ff269160af	Async search: rename REST parameters (#54198 ) This commit renames wait_for_completion to wait_for_completion_timeout in submit async search and get async search. Also it renames clean_on_completion to keep_on_completion and turns around its behaviour. Closes #54069	2020-03-26 09:40:50 +01:00
Yang Wang	1afd510721	Check authentication type using enum instead of string (#54145 ) (#54246 ) Avoid string comparison when we can use safer enums. This refactor is a follow up for #52178. Resolves: #52511	2020-03-26 15:45:10 +11:00
Tim Vernum	1fc518c25e	Improve stability of SamlServiceProviderIndexTests (#54241 ) This test assumed cluster events would be processed quickly which is not always true Backport of: #54166	2020-03-26 13:07:42 +10:00
Ryan Ernst	5a5d6e9ef2	Invert license security disabled helper method (#54043 ) (#54239 ) Xpack license state contains a helper method to determine whether security is disabled due to license level defaults. Most code needs to know whether security is enabled, not disabled, but this method exists so that the security being explicitly disabled can be distinguished from license level defaulting to disabled. However, in the case that security is explicitly disabled, the handlers in question are never registered, so security is implicitly not disabled explicitly, and thus we can share a single method to know whether licensing is enabled.	2020-03-25 19:20:10 -07:00
Benjamin Trent	6d68cf809c	[Transform] Remove node.attr.transform.remote_connect and use new remote cluster client node role (#54217 ) (#54224 ) With the addition of a formal role for nodes indicating remote cluster connection, the transform specific attribute `node.attr.transform.remote_connect` is no longer necessary. closes https://github.com/elastic/elasticsearch/issues/54179	2020-03-25 16:29:02 -04:00
Nik Everett	8f40f1435a	Save a little space in agg tree (backport of #53730 ) (#54213 ) This drops the "top level" pipeline aggregators from the aggregation result tree which should save a little memory and a few serialization bytes. Perhaps more importantly, this provides a mechanism by which we can remove all pipelines from the aggregation result tree. This will save quite a bit of space when pipelines are deep in the tree. Sadly, doing this isn't simple because of backwards compatibility. Nodes before 7.7.0 need those pipelines. We provide them by passing a `Supplier<PipelineTree>` into the root of the aggregation tree that we only call if we need to serialize to a version before 7.7.0. This solution works for cross cluster search because we always reduce the aggregations in each remote cluster and then forward them back to the coordinating node. It's quite possible that the coordinating node needs the pipeline (say it is version 7.1.0) and the gateway node in the remote cluster doesn't (version 7.7.0). In that case the data nodes won't send the pipeline aggregations back to the gateway node. Critically, the gateway node will send the pipeline aggregations back to the coordinating node. This is all managed with that `Supplier<PipelineTree>`, but how it is managed is a bit tricky.	2020-03-25 15:51:16 -04:00
Jason Tedor	d14f170093	Add cluster.remote.connect to deprecation info API (#54142 ) This setting was recently deprecated in favor of node.remote_cluster_client. This commit adds this setting to the deprecation info API.	2020-03-25 15:11:59 -04:00
Hendrik Muhs	cb0ecafdd8	[Transform] fix transform failure case for percentiles and spa… (#54202 ) index null if percentiles could not be calculated due to sparse data fixes #54201	2020-03-25 19:28:51 +01:00
Martijn Laarman	077bf52acc	transform.cat should live in the cat namespace. (#54196 ) * transform.cat should live in the cat namespace. Similar to the ml cat APIs also living in the `cat` namespace. Clients treat the `cat` namespace differently than other APIs (return types, content types). This introduces an exception to this rule. * rename the specification file as well (cherry picked from commit 0a98904b1a73a30bbaebc32bd16a238c8d03c329)	2020-03-25 18:16:01 +01:00
Mark Vieira	7728ccd920	Enforce consistent compile options across all projects (#54120 ) (cherry picked from commit ddd068a7e92dc140774598664efdc15155ab05c2)	2020-03-25 08:24:21 -07:00
Dimitris Athanasiou	ba09a778dc	[7.x][ML] Unmute classification cardinality integ test (#54165 ) (#54173 ) Adjusts test to work for new cardinality limit. Backport of #54165	2020-03-25 15:00:34 +02:00
Benjamin Trent	ef05a4f416	[ML] relaxing parameters on stratified split test (#54127 ) (#54168 ) Relaxing the error rate a bit on two of the tests. Ran 1000s of times locally and never had a failure after these changes. closes https://github.com/elastic/elasticsearch/issues/54122	2020-03-25 08:06:15 -04:00
Tanguy Leroux	3a3930c7ec	Mute TooManyJobsIT.testCloseFailedJob on 7.x (#54163 ) Relates #54162	2020-03-25 12:44:41 +01:00
Tanguy Leroux	4a2db4651e	Mute ReadActionsTests (#54153 ) Relates #53340	2020-03-25 10:35:58 +01:00
Jason Tedor	381d7586e4	Introduce formal role for remote cluster client (#54138 ) This commit introduce a formal role for identifying nodes that are capable of making connections to remote clusters. Relates #53924	2020-03-24 21:59:43 -04:00
Oliver Gupte	96f0c668a8	[APM] Allow kibana to collect APM telemetry in background task (#52917 ) (#54106 ) * Required for elastic/kibana#50757. Allows the kibana user to collect APM telemetry in a background task. * removed unnecessary privileges on `.ml-anomalies-*` for the `kibana_system` reserved role	2020-03-24 18:11:19 -07:00
David Roberts	7667004b20	[ML] Add a model memory estimation endpoint for anomaly detection (#54129 ) A new endpoint for estimating anomaly detection job model memory requirements: POST _ml/anomaly_detectors/estimate_model_memory Backport of #53507	2020-03-24 22:55:11 +00:00
Ioannis Kakavas	7c0123d6f3	Add SAML IdP plugin for internal use (#54046 ) (#54124 ) This change merges the "feature-internal-idp" branch into Elasticsearch. This introduces a small identity-provider plugin as a child of the x-pack module. This allows ES to act as a SAML IdP, for users who are authenticated against the Elasticsearch cluster. This feature is intended for internal use within Elastic Cloud environments and is not supported for any other use case. It falls under an enterprise license tier. The IdP is disabled by default. Co-authored-by: Ioannis Kakavas <ioannis@elastic.co> Co-authored-by: Tim Vernum <tim.vernum@elastic.co>	2020-03-25 09:45:13 +11:00
Gordon Brown	82e041442e	Add version guards around Transform hidden index settings (#54036 ) This commit ensures that the hidden index settings are only applied to the Transform index templates when the cluster can support those settings. Also unmutes the tests which were failing due to the previous behavior.	2020-03-24 15:52:56 -06:00
Ross Wolf	627ca03c72	EQL: Remove parser handling for functions (#54028 ) * EQL: Remove parser handling for functions * EQL: Comment out array functions in queries-unsupported.eql	2020-03-24 14:03:02 -06:00
Costin Leau	68f74cf593	EQL: Fix custom scripting for functions (#53935 ) (#54114 ) Improve separation of scripting between EQL and SQL by delegating common methods to QL. The context detection is determined based on the package to avoid having repetitive class hierarchies. The Painless whitelists have been improved so that the declaring class is used instead of the inherited one. Relates #53688 (cherry picked from commit 6d46033e736c64ac9255c5d6964600d2a931430a) EQL: Add Substring function with Python semantics (#53688) Does not reuse substring from SQL due to the difference in semantics and the accepted arguments. Currently it is missing full integration tests as, due to the usage of scripting, requires an actual integration test against a proper cluster (and likely its own QA project). (cherry picked from commit f58680bad33d5ce4139157a69a4d9f5f286bc3c4)	2020-03-24 20:54:19 +02:00
markharwood	6a60f85bba	Wildcard field - add normalizer support (#53851 ) (#54109 ) Backport support for normalisation to wildcard field Closes #53603	2020-03-24 17:37:47 +00:00
Dimitris Athanasiou	c141c1dd89	[7.x][ML] Stratified cross validation split for classification (#54087 ) (#54104 ) As classification now works for multiple classes, randomly picking training/test data frame rows is not good enough. This commit introduces a stratified cross validation splitter that maintains the proportion of each class in the dataset in the sample that is used for training the model. Backport of #54087	2020-03-24 18:47:36 +02:00
Yannick Welsch	e006d1f6cf	Use special XContent registry for node tool (#54050 ) Fixes an issue where the elasticsearch-node command-line tools would not work correctly because PersistentTasksCustomMetaData contains named XContent from plugins. This PR makes it so that the parsing for all custom metadata is skipped, even if the core system would know how to handle it. Closes #53549	2020-03-24 17:40:51 +01:00
Luca Cavanna	6b457abbd3	Async search: prevent users from overriding pre_filter_shard_size (#54088 ) Submit async search forces pre_filter_shard_size for the underlying search that it creates. With this commit we also prevent users from overriding such default as part of request validation.	2020-03-24 17:06:04 +01:00
Luca Cavanna	3c67762f1b	Async search response: output start and expiration time as time fields (#54084 ) This commit makes start_time and expiration_time time fields, so that their date variant will be printed out when human readable output is requested.	2020-03-24 17:05:56 +01:00
Jim Ferenczi	0330bef409	Improve async search's tasks cancellation (#53799 ) This commit adds an explicit cancellation of the search task if the initial async search submit task is cancelled (connection closed by the user). This was previously done through the cancellation of the parent task but we don't handle grand-children cancellation yet so we have to manually cancel the search task in order to ensure that shard actions are cancelled too. This change can be considered as a workaround until #50990 is fixed.	2020-03-24 15:51:10 +01:00
Andrei Stefan	3234b50e95	SQL: jdbc debugging enhancement (#53880 ) (#54081 ) * add flush always output option that will flush the output printer after each debug message when enabled (disabled by default) * at debug output initialization time, log debug output information about OS, JVM and default JVM timezone (cherry picked from commit b5db9657d1eadce9902041e5b128bf32c02d302a)	2020-03-24 16:09:53 +02:00
Alan Woodward	39d7d0dc10	Upgrade to lucene 8.5.0 release (#54077 ) Upgrades our lucene dependency to the released 8.5.0 version.	2020-03-24 13:45:50 +00:00
David Roberts	1421471556	[ML] Introduce a "starting" datafeed state for lazy jobs (#54065 ) It is possible for ML jobs to open lazily if the "allow_lazy_open" option in the job config is set to true. Such jobs wait in the "opening" state until a node has sufficient capacity to run them. This commit fixes the bug that prevented datafeeds for jobs lazily waiting assignment from being started. The state of such datafeeds is "starting", and they can be stopped by the stop datafeed API while in this state with or without force. Backport of #53918	2020-03-24 13:00:04 +00:00
Peter Schretlen	92acb2859b	Allow kibana_system to create and invalidate API keys on behalf of other users	2020-03-24 08:38:12 -04:00
Dimitris Athanasiou	be20bb5755	[7.x][ML] No refresh on indexing DFA stats (#53977 ) (#54064 ) When we index data frame analytics stats docs we do not need to refresh immediately. Backport of #53977	2020-03-24 13:13:03 +02:00
Yang Wang	d33d20bfdc	Validate role templates before saving role mapping (#52636 ) (#54059 ) Role names are now compiled from role templates before role mapping is saved. This serves as validation for role templates to prevent malformed and invalid scripts from being persisted, which could later break authentication. Resolves: #48773	2020-03-24 20:43:59 +11:00
Dimitris Athanasiou	5ce7c99e74	[7.x][ML] Data frame analytics data counts (#53998 ) (#54031 ) This commit instruments data frame analytics with stats for the data that are being analyzed. In particular, we count training docs, test docs, and skipped docs. In order to account docs with missing values as skipped docs for analyses that do not support missing values, this commit changes the extractor so that it only ignores docs with missing values when it collects the data summary, which is used to estimate memory usage. Backport of #53998	2020-03-24 11:30:43 +02:00
Hendrik Muhs	7dcacf531f	[7.x][Transform][Rollup] add processing stats to record the ti… (#54027 ) add 2 additional stats: processing time and processing total which capture the time spent for processing results and how often it ran. The 2 new stats correspond to the existing indexing and search stats. Together with indexing and search this now allows the user to see the full picture, all 3 stages.	2020-03-24 09:22:02 +01:00
Jason Tedor	e3ca124537	Introduce autoscaling decisions (#53934 ) This is the first in a series of commits that will introduce the autoscaling deciders framework. This commit introduces the basic framework for representing autoscaling decisions.	2020-03-23 23:08:06 -04:00
Tim Vernum	4bd853a6f2	Add "grant_api_key" cluster privilege (#54042 ) This change adds a new cluster privilege "grant_api_key" that allows the use of the new /_security/api_key/grant endpoint Backport of: #53527	2020-03-24 13:17:45 +11:00
Benjamin Trent	19af869243	[ML] adds multi-class feature importance support (#53803 ) (#54024 ) Adds multi-class feature importance calculation. Feature importance objects are now mapped as follows (logistic) Regression: ``` { "feature_name": "feature_0", "importance": -1.3 } ``` Multi-class [class names are `foo`, `bar`, `baz`] ``` { "feature_name": "feature_0", "importance": 2.0, // sum(abs()) of class importances "foo": 1.0, "bar": 0.5, "baz": -0.5 }, ``` For users to get the full benefit of aggregating and searching for feature importance, they should update their index mapping as follows (before turning this option on in their pipelines) ``` "ml.inference.feature_importance": { "type": "nested", "dynamic": true, "properties": { "feature_name": { "type": "keyword" }, "importance": { "type": "double" } } } ``` The mapping field name is as follows `ml.<inference.target_field>.<inference.tag>.feature_importance` if `inference.tag` is not provided in the processor definition, it is not part of the field path. `inference.target_field` is defaulted to `ml.inference`. //cc @lcawl ^ Where should we document this? If this makes it in for 7.7, there shouldn't be any feature_importance at inference BWC worries as 7.7 is the first version to have it.	2020-03-23 18:49:07 -04:00
Mark Vieira	70cfedf542	Refactor global build info plugin to leverage JavaInstallationRegistry (#54026 ) This commit removes the configuration time vs execution time distinction with regards to certain BuildParams properties. Because of the cost of determining Java versions for configuration JDK locations we deferred this until execution time. This had two main downsides. First, we had to implement all this build logic in tasks, which required a bunch of additional plumbing and complexity. Second, because some information wasn't known during configuration time, we had to nest any build logic that depended on this in awkward callbacks. We now defer to the JavaInstallationRegistry recently added in Gradle. This utility uses a much more efficient method for probing Java installations vs our jrunscript implementation. This, combined with some optimizations to avoid probing the current JVM as well as deferring some evaluation via Providers when probing installations for BWC builds, means we can maintain effectively the same configuration time performance while removing a bunch of complexity and runtime cost (snapshotting inputs for the GenerateGlobalBuildInfoTask was very expensive). The end result should be a much more responsive build execution in almost all scenarios. (cherry picked from commit ecdbd37f2e0f0447ed574b306adb64c19adc3ce1)	2020-03-23 15:30:10 -07:00
Nik Everett	b9bfba2c8b	Move pipeline agg validation to coordinating node (backport of #53669 ) (#54019 ) This moves the pipeline aggregation validation from the data node to the coordinating node so that we, eventually, can stop sending pipeline aggregations to the data nodes entirely. In fact, it moves it into the "request validation" stage so multiple errors can be accumulated and sent back to the requester for the entire request. We can't always take advantage of that, but it'll be nice for folks not to have to play whack-a-mole with validation. This is implemented by replacing `PipelineAggregationBuilder#validate` with: ``` protected abstract void validate(ValidationContext context); ``` The `ValidationContext` handles the accumulation of validation failures, provides access to the aggregation's siblings, and implements a few validation utility methods.	2020-03-23 17:22:56 -04:00
Marios Trivyzas	3a3e964956	Reduce performance impact of ExitableDirectoryReader (#53978 ) (#54014 ) Benchmarking showed that the effect of the ExitableDirectoryReader is reduced considerably when checking every 8191 docs. Moreover, set the cancellable task before calling QueryPhase#preProcess() and make sure we don't wrap with an ExitableDirectoryReader at all when lowLevelCancellation is set to false, to completely avoid any performance impact. Follows: #52822 Follows: #53166 Follows: #53496 (cherry picked from commit cdc377e8e74d3ca6c231c36dc5e80621aab47c69)	2020-03-23 21:30:34 +01:00
Christoph Büscher	286c3660bd	Add async_search get and delete APIs to HLRC (#53828 ) (#53980 ) This commit adds the "_async_search" get and delete APIs to the AsyncSearchClient in the High Level Rest Client. Relates to #49091 Backport of #53828	2020-03-23 21:21:36 +01:00
Benjamin Trent	d276058c6c	[ML] adjusting feature importance mapping for multi-class support (#53821 ) (#54013 ) Feature importance storage format is changing to encompass multi-class. Feature importance objects are now mapped as follows (logistic) Regression: ``` { "feature_name": "feature_0", "importance": -1.3 } ``` Multi-class [class names are `foo`, `bar`, `baz`] ``` { "feature_name": "feature_0", "importance": 2.0, // sum(abs()) of class importances "foo": 1.0, "bar": 0.5, "baz": -0.5 }, ``` This change adjusts the mapping creation for analytics so that the field is mapped as a `nested` type. Native side change: https://github.com/elastic/ml-cpp/pull/1071	2020-03-23 15:50:12 -04:00
Przemysław Witek	88c5d520b3	[7.x] Verify that the field is aggregatable before attempting cardinality aggregation (#53874 ) (#54004 )	2020-03-23 19:36:33 +01:00
Luca Cavanna	932a7e3112	Backport of async search changes (#53976 ) * Get Async Search: omit _clusters section when empty (#53907) The _clusters section is omitted by the search API whenever no remote clusters are searched. Async search should do the same, but Get Async Search returns a deserialized response, hence a weird `_clusters` section with all values set to `0` gets returned instead. In fact the recreated Clusters object is not the same object as the EMPTY constant, yet it has the same content. This commit addresses this by changing the comparison in the `toXContent` method to not print out the section if the number of total clusters is `0`. * Async search: remove version from response (#53960) The goal of the version field was to quickly show when you can expect to find something new in the search response, compared to when nothing has changed. This can also be done by looking at the `_shards` section and `num_reduce_phases` returned with the search response. In fact when there has been one or more additional reduction of the results, you can expect new results in the search response. Otherwise, the `_shards` section could notify of additional failures of shards that have completed the query, but that is not a guarantee that their results will be exposed (only when the following partial reduction is performed their results will be available). That said this commit clarifies this in the docs and removes the version field from the async search response * Async Search: replicas to auto expand from 0 to 1 (#53964) This way single node clusters that are green don't go yellow once async search is used, while all the others still have one replica. * [DOCS] address timing issue in async search docs tests (#53910) The docs snippets for submit async search have proven difficult to test as it is not possible to guarantee that you get a response that is not final, even when providing `wait_for_completion=0`. 
In the docs we want to show though a proper long-running query, and its first response should be partial rather than final. With this commit we adapt the docs snippets to show a partial response, and replace under the hood all that's needed to make the snippets tests succeed when we get a final response. Also, increased the timeout so we always get a final response. Closes #53887 Closes #53891	2020-03-23 19:13:31 +01:00
Dimitris Athanasiou	965af3a68b	[7.x][ML] Delete DF analytics stats upon job deletion (#53933 ) (#53997 ) Since a data frame analytics job may have associated docs in the .ml-stats-* indices, when the job is deleted we should delete those docs too. Backport of #53933	2020-03-23 19:55:36 +02:00
Dimitris Athanasiou	08a8345269	[7.x][ML] Fix typo in outlier detection timing stats (#53988 ) (#53995 ) The field holding the timing stats was mistakenly called `timings_stats`. Backport of #53988	2020-03-23 19:46:39 +02:00
Ryan Ernst	960d1fb578	Revert "Introduce system index APIs for Kibana (#53035 )" (#53992 ) This reverts commit `c610e0893d`. backport of #53912	2020-03-23 10:29:35 -07:00
Armin Braun	5b9864db2c	Better Incrementality for Snapshots of Unchanged Shards (#52182 ) (#53984 ) Use sequence numbers and force merge UUID to determine whether a shard has changed before falling back to comparing files, to get incremental snapshots on primary fail-over.	2020-03-23 16:43:41 +01:00
Dimitris Athanasiou	3873510332	[7.x][ML] Refactor DFA custom processor to cross validation splitter (#53915 ) (#53956 ) While `CustomProcessor` is generic and allows for flexibility, there are new requirements that make cross validation a concept that is hard to abstract behind a custom processor. In particular, we would like to add data_counts to the DFA jobs stats. Counting training vs. test docs would be a useful statistic. We would also want to add a different cross validation strategy for multiclass classification. This commit renames custom processors to cross validation splitters which allows for those enhancements without cryptically doing things as a side effect of the abstract custom processing. Backport of #53915	2020-03-23 17:15:14 +02:00
Marios Trivyzas	af03200ad6	SQL: Extend DATE_TRUNC to also operate on intervals(elastic - #46632 ) (#47720 ) (#53972 ) The function is extended to operate on intervals according to the PostgreSQL: https://www.postgresql.org/docs/9.1/functions-datetime.html#FUNCTIONS-DATETIME-TRUNC Closes : #46632 (cherry picked from commit 2dc79505825fa75e0711dcfa8e9c69e8028fc979) Co-authored-by: musteaf <gs_mustea@hotmail.com>	2020-03-23 15:05:16 +01:00
Martijn van Groningen	aef7b89219	Backport: initial data stream commit (#53959 ) This commit adds a data stream feature flag, initial definition of a data stream and the stubs for the data stream create, delete and get APIs. Also simple serialization tests are added and a rest test to test the data stream API stubs. This is a large amount of code and mainly mechanical, but this commit should be straightforward to review, because there isn't any real logic. The data stream transport and rest action are behind the data stream feature flag and are only initialized if the feature flag is enabled. The feature flag is enabled if elasticsearch is built as snapshot or a release build and the 'es.datastreams_feature_flag_registered' is enabled. The integ-test-zip sets the feature flag if building a release build, otherwise rest tests would fail. Relates to #53100	2020-03-23 12:58:09 +01:00
Yannick Welsch	060c72c799	Only link fd* files during source-only snapshot (#53463 ) Source-only snapshots currently create a second full source-only copy of the shard on disk to support incrementality during upload. Given that stored fields are occupying a substantial part of a shard's storage, this means that clusters with source-only snapshots can require up to 50% more local storage. Ideally we would only generate source-only parts of the shard for the things that need to be uploaded (i.e. do incrementality checks on original file instead of trimmed-down source-only versions), but that requires much bigger changes to the snapshot infrastructure. This here is an attempt to dramatically cut down on the storage used by the source-only copy of the shard by soft-linking the stored-fields files (fd*) instead of copying them. Relates #50231	2020-03-23 11:04:53 +01:00
Tim Vernum	cde8725e3c	Create API Key on behalf of other user (#53943 ) This change adds a "grant API key action" POST /_security/api_key/grant that creates a new API key using the privileges of one user ("the system user") to execute the action, but creates the API key with the roles of the second user ("the end user"). This allows a system (such as Kibana) to create API keys representing the identity and access of an authenticated user without requiring that user to have permission to create API keys on their own. This also creates a new QA project for security on trial licenses and runs the API key tests there Backport of: #52886	2020-03-23 18:50:07 +11:00
Tim Vernum	f003a419a5	Add exception metadata for disabled features (#53941 ) This change adds a new exception with consistent metadata for when security features are not enabled. This allows clients to be able to tell that an API failed due to a configuration option, and respond accordingly. Relates: kibana#55255 Resolves: #52311, #47759 Backport of: #52811	2020-03-23 14:13:15 +11:00
Jason Tedor	27c8bcbbd1	Introduce aarch64 packaging (#53914 ) (#53926 ) This commit introduces aarch64 packaging, including bundling an aarch64 JDK distribution. We had to make some interesting choices here: - ML binaries are not compiled for aarch64, so for now we disable ML on aarch64 - depending on underlying page sizes, we have to disable class data sharing	2020-03-22 11:58:11 -04:00
Gordon Brown	10cabbbade	Transition Transforms to using hidden indices for notifications index (#53773 ) This commit changes the Transforms notifications index to be a hidden index, with a hidden alias. This commit also removes the temporary hack in MetaDataCreateIndexService that prevents deprecation warnings for known dot-prefixed index names which are not hidden/system indices, as this was the last index pattern to need that hack.	2020-03-20 15:40:58 -06:00
Ryan Ernst	caa4e0dc18	Use boolean methods for allowed realm types in license state (#53456 ) (#53834 ) In xpack the license state contains methods to determine whether a particular feature is allowed to be used. The one exception is allowsRealmTypes() which returns an enum of the types of realms allowed. This change converts the enum values to boolean methods. There are 2 notable changes: NONE is removed as we always fall back to basic license behavior, and NATIVE is not needed because it would always return true since we should always have a basic license.	2020-03-20 14:30:31 -07:00
Aleksandr Maus	fd0cdde38c	EQL: EqlActionIT improvements (#53780 ) (#53888 ) Related to https://github.com/elastic/elasticsearch/issues/53598	2020-03-20 17:28:15 -04:00
Nik Everett	c2a2fcb5a1	Clean up eclipse build (backport of #53831 ) (#53870 ) Fixes up the "forbidden" warnings that you get when you import Elasticsearch using "import gradle projects". With this, and the manual step of switching circular project definitions to warnings, this gets most things compiling.	2020-03-20 12:12:05 -04:00
Aleksandr Maus	83bef862e0	EQL: Extract query folder tests definitions into resources (#53802 ) (#53869 )	2020-03-20 10:39:35 -04:00
Luca Cavanna	03fca61fcb	[DOCS] add docs for async search (#53675 ) Relates to #49091 Co-Authored-By: James Rodewig <james.rodewig@elastic.co>	2020-03-20 14:46:38 +01:00
Christoph Büscher	8eacb153df	Add async_search.submit to HLRC #53592 (#53852 ) This commit adds a new AsyncSearchClient to the High Level Rest Client which initially supports the submitAsyncSearch in its blocking and non-blocking flavour. Also adding client side request and response objects and parsing code to parse the xContent output of the client side AsyncSearchResponse together with parsing roundtrip tests and a simple roundtrip integration test. Relates to #49091 Backport of #53592	2020-03-20 13:15:58 +01:00
Przemysław Witek	a68071dbba	[7.x] Delete empty .ml-state* indices during nightly maintenance task. (#53587 ) (#53849 )	2020-03-20 13:08:36 +01:00
Alan Woodward	d23112f441	Report parser name and location in XContent deprecation warnings (#53805 ) It's simple to deprecate a field used in an ObjectParser just by adding deprecation markers to the relevant ParseField objects. The warnings themselves don't currently have any context - they simply say that a deprecated field has been used, but not where in the input xcontent it appears. This commit adds the parent object parser name and XContentLocation to these deprecation messages. Note that the context is automatically stripped from warning messages when they are asserted on by integration tests and REST tests, because randomization of xcontent type during these tests means that the XContentLocation is not constant	2020-03-20 11:52:55 +00:00
Dimitris Athanasiou	60153c5433	[7.x][ML] Data frame analytics analysis stats (#53788 ) (#53844 ) Adds parsing and indexing of analysis instrumentation stats. The latest one is also returned from the get-stats API. Note that we chose to duplicate objects even where they are currently similar. There are already ideas on how these will diverge in the future and while the duplication looks ugly at the moment, it is the option that offers the highest flexibility. Backport of #53788	2020-03-20 12:11:53 +02:00
Ryan Ernst	b8ef830c0a	Decouple AuditTrailService from AuditTrail (#53450 ) (#53760 ) The AuditTrailService has historically been an AuditTrail itself, acting as a composite of the configured audit trails. This commit removes that interface from the service and instead builds a composite delegating implementation internally. The service now has a single get() method to get an AuditTrail implementation which may be called. If auditing is not allowed by the license, an empty noop version is returned.	2020-03-19 14:39:01 -07:00
Christoph Büscher	d846ea43f4	Fix ReloadSynonymAnalyzerIT failure (#53663 ) (#53806 ) There is an assertion in ReloadAnalyzersResponse.merge that compares index names of merged responses that was falsely using object equality instead of String.equals(). In the past this didn't seem to matter but with changes in the test setup we started to see failures. Correcting this and also simplifying test a bit to be able to run it repeatedly if needed. Backport of #53663	2020-03-19 19:00:14 +01:00
Benjamin Trent	433952b595	[7.x] [ML] only retry persistence failures when the failure is intermittent and stop retrying when analytics job is stopping (#53725 ) (#53808 ) * [ML] only retry persistence failures when the failure is intermittent and stop retrying when analytics job is stopping (#53725) This fixes two issues: - Results persister would retry actions even if they are not intermittent. An example of a persistent failure is a doc mapping problem. - Data frame analytics would continue to retry to persist results even after the job is stopped. closes https://github.com/elastic/elasticsearch/issues/53687	2020-03-19 13:56:41 -04:00
Jake Landis	cce60215d8	[7.x] Add Watcher to available rest resources (#53620 ) (#53764 ) Prior to this commit Watcher explicitly copied tests between two projects with a copy task. This commit removes the explicit copy in favor of adding the Watcher tests to the available restResources that may be copied between projects. This is how inter-project dependencies should be modeled. However, only Watcher is included here since it is (currently) the only project with inter-project test dependencies.	2020-03-19 12:29:36 -05:00
Jake Landis	db3420d757	[7.x] Optimize which Rest resources are used by the Rest tests… (#53766 ) This should help with Gradle's incremental compile such that projects only depend upon the resources they use. related #52114	2020-03-19 12:28:59 -05:00
Ignacio Vera	dfc1d79ddf	Add support for distance queries on shape queries (#53468 ) (#53796 ) With the upgrade to Lucene 8.5, XYShape field has support for distance queries. This change implements this new feature and removes the limitation.	2020-03-19 15:32:09 +01:00
Dominic Page	b0884baf46	Geo shape query vs geo point backport (#53774 ) Backport to 7x Enable geo_shape query to work on geo_point fields for shapes: circle, polygon, multipolygon, rectangle see: #48928 Co-Authored-By: @iverase	2020-03-19 13:00:36 +01:00
Ioannis Kakavas	4a36894a48	Mute failing tests (#53781 ) See #53738	2020-03-19 08:16:23 +02:00
Benjamin Trent	415d73c27d	[Transform] renamed _cat/transform to _cat/transforms (#53743 ) (#53771 ) renaming _cat/transform to _cat/transforms for uniformity with the other _cat apis.	2020-03-18 19:54:03 -04:00
Stuart Tettemer	cdbee32f55	Scripting: Per-context script cache, default off (#52855 ) (#53756 ) * Adds per context settings: `script.context.${CONTEXT}.cache_max_size` ~ `script.cache.max_size` `script.context.${CONTEXT}.cache_expire` ~ `script.cache.expire` `script.context.${CONTEXT}.max_compilations_rate` ~ `script.max_compilations_rate` * Context cache is used if: `script.max_compilations_rate=use-context`. This value is dynamically updatable, so users can switch back to the general cache if desired. * Settings for context caches take the first value that applies: 1) Context specific settings if set, eg `script.context.ingest.cache_max_size` 2) Correlated general setting is set to the non-default value, eg `script.cache.max_size` 3) Context default The reason for 2's inclusion is to allow an easy transition for users who've customized their general cache settings. Using the general cache settings for the context caches results in higher effective settings, since they are multiplied across the number of contexts. So a general cache max size of 200 will become 200 * # of contexts. However, this behavior will avoid users snapping to a value that is too low for them. Backport of: #52855 Refs: #50152	2020-03-18 14:44:04 -06:00
Ioannis Kakavas	af519cccff	Revert "Mute TimeSeriesLifecycleActionsIT (#53741 )" This reverts commit `df0ad7569b`.	2020-03-18 18:51:06 +02:00
markharwood	ae19802e29	Fix highlighter support in PinnedQuery and added test (#53716 ) (#53729 ) CappedScoreQuery was not delegating queryVisitor calls Closes #53699	2020-03-18 15:39:17 +00:00
Ioannis Kakavas	df0ad7569b	Mute TimeSeriesLifecycleActionsIT (#53741 ) see #53738	2020-03-18 17:38:24 +02:00
Luca Cavanna	75c367de13	[TEST] Replace agg key in async search yaml test (#53727 ) Some clients have problems running this test as a numeric key is treated like an array index by default. We can work around this by renaming the aggregation key to not be a numeric.	2020-03-18 16:16:15 +01:00
Benjamin Trent	2ccb963f1d	Create GET _cat/transforms API Issue (#53643 ) (#53726 ) Adds new `_cat/transform` and `_cat/transform/{transform_id}` endpoints.	2020-03-18 10:45:28 -04:00
Alan Woodward	580bc40c0c	Make it possible to deprecate all variants of a ParseField with no replacement (#53722 ) Sometimes we want to deprecate and remove a ParseField entirely, without replacement; for example, the various places where we specify a _type field in 7x. Currently we can tell users only that a particular field name should not be used, and that another name should be used in its place. This commit adds the ability to say that a field should not be used at all.	2020-03-18 14:16:19 +00:00
Ioannis Kakavas	e5aa0906f7	Mute testHistoryIsWrittenWithDeletion (#53721 ) see #53718	2020-03-18 14:49:57 +02:00
Christoph Büscher	2384c1359d	Revert "Fix ReloadSynonymAnalyzerIT failure (#53663 )" This reverts commit `2c32173fce`.	2020-03-18 12:44:23 +01:00
Christoph Büscher	2c32173fce	Fix ReloadSynonymAnalyzerIT failure (#53663 ) There is an assertion in ReloadAnalyzersResponse.merge that compares index names of merged responses that was falsely using object equality instead of String.equals(). In the past this didn't seem to matter but with changes in the test setup we started to see failures. Correcting this and also simplifying test a bit to be able to run it repeatedly if needed. Closes #53443	2020-03-18 11:55:37 +01:00
Przemysław Witek	ec13c093df	Make ML index aliases hidden (#53160 ) (#53710 )	2020-03-18 10:28:45 +01:00
Ioannis Kakavas	873d0ecd09	Fix potential bug in concurrent token refresh support (#53668 ) (#53705 ) Ensure that we do not proceed with execution after calling the listener's onFailure	2020-03-18 09:43:26 +02:00
Hendrik Muhs	7a12300ce6	[7.x][Transform] enhance the output of preview to return full… (#53695 ) changes the output format of preview regarding deduced mappings and enhances it to return all the details about auto-index creation. This allows the user to customize the index creation. Using HLRC you can create an index request from the output of the response. backport #53572	2020-03-18 08:37:56 +01:00
Hendrik Muhs	a6dca577e5	[Transform] date nanos/date histogram IT (#53654 ) add an integration test for date nanos in combination with date_histogram	2020-03-17 20:58:57 +01:00
Ryan Ernst	5c472fcb47	Upgrade jackson to 2.10.3 and GeoIP to 2.13.1 (#53642 ) Re-applies the change from #53523 along with test fixes. closes #53626 closes #53624 closes #53622 closes #53625 Co-authored-by: Nik Everett <nik9000@gmail.com> Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com> Co-authored-by: Jake Landis <jake.landis@elastic.co>	2020-03-17 10:28:51 -07:00
David Kyle	2b635737e1	[ML] Parse single named object in config classes (#53472 ) (#53542 )	2020-03-17 13:59:52 +00:00
Alan Woodward	71b703edd1	Rename AtomicFieldData to LeafFieldData (#53554 ) This conforms with lucene's LeafReader naming convention, and matches other per-segment structures in elasticsearch.	2020-03-17 12:30:12 +00:00
Andrei Stefan	79600eb38b	SQL: add support for index aliases for SYS COLUMNS command (#53525 ) (#53653 ) (cherry picked from commit f65e4d6ff7b2e00eb6f9c985fbe7cb24de00f045)	2020-03-17 12:49:08 +02:00
Hendrik Muhs	a0314ad015	[Transform] add transform discovery node role (#53616 ) Enhancement of #52712: Add a discovery node role using the letter t for transform. Fixes #53156	2020-03-17 11:39:20 +01:00
Ioannis Kakavas	23af171cf8	Disallow Password Change when authenticated by Token (#49694 ) (#53614 ) Password changes are only allowed when the user is currently authenticated by a realm (that permits the password to be changed) and not when authenticated by a bearer token or an API key.	2020-03-17 09:45:35 +02:00
Yang Wang	7f21ade924	Explicitly require that derived API keys have no privileges (#53647 ) (#53648 ) The current implicit behaviour is that when an API key is used to create another API key, the child key is created without any privilege. This implicit behaviour is surprising and is a source of confusion for users. This change makes that behaviour explicit.	2020-03-17 17:56:37 +11:00
Tim Vernum	74dbdb991c	Avoid NPE in set_security_user without security (#53543 ) If security was disabled (explicitly), then the SecurityContext would be null, but the set_security_user processor was still registered. Attempting to define a pipeline that used that processor would fail with an (intentional) NPE. This behaviour, introduced in #52032, is a regression from previous releases where the pipeline was allowed, but was not usable. This change restores the previous behaviour (with a new warning). Backport of: #52691	2020-03-17 13:30:07 +11:00
Ryan Ernst	e7f38674ed	Add internalClusterTest to check task (#53444 ) This commit adds internalClusterTest in xpack core to run as part of check. This was accidentally removed in a refactoring. Other xpack modules already do this, but core was left out. This commit also mutes 2 tests that currently fail. closes #53407	2020-03-16 18:55:01 -07:00
Luca Cavanna	c3d2417448	Cumulative backport of async search changes (#53635 ) * Submit async search to work only with POST (#53368) Currently the submit async search API can be called using both GET and POST at REST, but given that it submits a call and creates internal state, POST should be the only allowed method. * Refine SearchProgressListener internal API (#53373) The following cumulative improvements have been made: - rename `onReduce` and `notifyReduce` to `onFinalReduce` and `notifyFinalReduce` - add unit test for `SearchShard` - on* methods in `SearchProgressListener` shouldn't need to be public as they should never be called directly, they only need to be overridden hence they can be made protected. They are actually called directly from a test which required some adapting, like making `AsyncSearchTask.Listener` class package private instead of private - Instead of overriding `getProgressListener` in `AsyncSearchTask`, as it feels weird to override a getter method, added a specific method that allows to retrieve the Listener directly without needing to cast it. Made the getter and setter for the listener final in the base class. - rename `SearchProgressListener#searchShards` methods to `buildSearchShards` and make it static given that it accesses no instance members - make `SearchShard` and `SearchShardTask` classes final * Move async search yaml tests to x-pack yaml test folder (#53537) The yaml tests for async search currently sit in its qa folder. There is no reason though for them to live in a separate folder as they don't require particular setup. This commit moves them to the main folder together with the other x-pack yaml tests so that they will be run by the client test runners too. 
* [DOCS] Add temporary redirect for async-search (#53454) The following API spec files contain a link to a not-yet-created async search docs page: * [async_search.delete.json][0] * [async_search.get.json][1] * [async_search.submit.json][2] The Elasticsearch-js client uses these spec files to create their docs. This created a broken link in the Elasticsearch-js docs, which has broken the docs build. This PR adds a temporary redirect for the docs page. This redirect should be removed when the actual API docs are added. [0]: https://github.com/elastic/elasticsearch/blob/master/x-pack/plugin/src/test/resources/rest-api-spec/api/async_search.delete.json [1]: https://github.com/elastic/elasticsearch/blob/master/x-pack/plugin/src/test/resources/rest-api-spec/api/async_search.get.json [2]: https://github.com/elastic/elasticsearch/blob/master/x-pack/plugin/src/test/resources/rest-api-spec/api/async_search.submit.json Co-authored-by: James Rodewig <james.rodewig@elastic.co>	2020-03-17 00:08:17 +01:00
Nik Everett	f0beab4041	Stop using round-tripped PipelineAggregators (backport of #53423 ) (#53629 ) This begins to clean up how `PipelineAggregator`s are executed. Previously, we would create the `PipelineAggregator`s on the data nodes and embed them in the aggregation tree. When it came time to execute the pipeline aggregation we'd use the `PipelineAggregator`s that were on the first shard's results. This is inefficient because: 1. The data node needs to make the `PipelineAggregator` only to serialize it and then throw it away. 2. The coordinating node needs to deserialize all of the `PipelineAggregator`s even though it only needs one of them. 3. You end up with many `PipelineAggregator` instances when you only really need one per pipeline. 4. `PipelineAggregator` needs to implement serialization. This begins to undo these by building the `PipelineAggregator`s directly on the coordinating node and using those instead of the `PipelineAggregator`s in the aggregation tree. In a follow up change we'll stop serializing the `PipelineAggregator`s to node versions that support this behavior. And, one day, we'll be able to remove `PipelineAggregator` from the aggregation result tree entirely. Importantly, this doesn't change how pipeline aggregations are declared or parsed or requested. They are still part of the `AggregationBuilder` tree because that makes sense.	2020-03-16 16:15:23 -04:00
Gordon Brown	880cc3ca7e	Hide I/SLM history aliases (#53564 ) This commit adjusts the aliases used for the ILM and SLM history indices to be hidden aliases. Also tweaks the configuration of the `IndexTemplateRegistry`s used by these history systems to only upgrade the template from the master node, as documents are indexed from the master node, so the template version should only be upgraded from the master node.	2020-03-16 13:07:26 -06:00
Gordon Brown	031932b32f	Allow _cat indices & aliases to use indices options (#53248 ) This commit adjusts the _cat/indices and _cat/aliases APIs to allow specifying indices options, so that these APIs can handle hidden indices/aliases in the same way as other APIs. Also adds the hidden option to the expand_wildcards parameter in the YAML spec for every API that accepts it.	2020-03-16 11:25:05 -06:00
Alexander Reelsen	7571ca437a	Disable Watcher script optimization for stored scripts (#53497 ) The watcher TextTemplateEngine uses a fast path mechanism where it checks for the existence of `{{` to decide if a mustache script required compilation. This does not work for stored scripts, as the field that is checked contains the id of the script, which means, the name of the script is returned as its value. This commit checks for the script type and does not involve this fast path check if a stored script is used. Closes #40212	2020-03-16 18:07:54 +01:00
Andrei Stefan	91ca9c5c33	QL: constant_keyword support (#53241 ) (#53602 ) (cherry picked from commit d6cd4ce7849ba215407c8c5fa815c9b373fb8480)	2020-03-16 18:06:31 +02:00
jimczi	dc2edc97f0	Fix sporadic failures in AsyncSearchActionTests (take 2) This change removes the need to always get a new version when iterating on an async search. This is needed since we cannot guarantee that shards will be queried exactly in order. Relates #53360	2020-03-16 16:52:23 +01:00
markharwood	2c74f3e22c	Backport of new wildcard field type (#53590 ) * New wildcard field optimised for wildcard queries (#49993) Indexes values using size 3 ngrams and also stores the full original as a binary doc value. Wildcard queries operate by using a cheap approximation query on the ngram field followed up by a more expensive verification query using an automaton on the binary doc values. Also supports aggregations and sorting.	2020-03-16 15:07:13 +00:00
Przemysław Witek	376b2ae735	[7.x] Make classification evaluation metrics work when there is field mapping type mismatch (#53458 ) (#53601 )	2020-03-16 15:38:56 +01:00
Jim Ferenczi	e6680be0b1	Add new x-pack endpoints to track the progress of a search asynchronously (#49931 ) (#53591 ) This change introduces a new API in x-pack basic that allows tracking the progress of a search. Users can submit an asynchronous search through a new endpoint called `_async_search` that works exactly the same as the `_search` endpoint but instead of blocking and returning the final response when available, it returns a response after a provided `wait_for_completion` time. ```` GET my_index_pattern/_async_search?wait_for_completion=100ms { "aggs": { "date_histogram": { "field": "@timestamp", "fixed_interval": "1h" } } } ```` If after 100ms the final response is not available, a `partial_response` is included in the body: ```` { "id": "9N3J1m4BgyzUDzqgC15b", "version": 1, "is_running": true, "is_partial": true, "response": { "_shards": { "total": 100, "successful": 5, "failed": 0 }, "total_hits": { "value": 1653433, "relation": "eq" }, "aggs": { ... } } } ```` The partial response contains the total number of requested shards, the number of shards that successfully returned and the number of shards that failed. It also contains the total hits as well as partial aggregations computed from the successful shards. To continue to monitor the progress of the search users can call the get `_async_search` API like the following: ```` GET _async_search/9N3J1m4BgyzUDzqgC15b/?wait_for_completion=100ms ```` That returns a new response that can contain the same partial response as the previous call if the search didn't progress; in that case the returned `version` should be the same. If new partial results are available, the version is incremented and the `partial_response` contains the updated progress. 
Finally, if the response is fully available while or after waiting for completion, the `partial_response` is replaced by a `response` section that contains the usual _search response: ```` { "id": "9N3J1m4BgyzUDzqgC15b", "version": 10, "is_running": false, "response": { "is_partial": false, ... } } ```` Asynchronous searches are stored in a restricted index called `.async-search` if they survive (still running) after the initial submit. Each request has a keep alive that defaults to 5 days but this value can be changed/updated any time: ````` GET my_index_pattern/_async_search?wait_for_completion=100ms&keep_alive=10d ````` The default can be changed when submitting the search, the example above raises the default value for the search to `10d`. ````` GET _async_search/9N3J1m4BgyzUDzqgC15b/?wait_for_completion=100ms&keep_alive=10d ````` The time to live for a specific search can be extended when getting the progress/result. In the example above we extend the keep alive to 10 more days. A background service that runs only on the node that holds the first primary shard of the `async-search` index is responsible for deleting the expired results. It runs every hour but the expiration is also checked by running queries (if they take longer than the keep_alive) and when getting a result. Like a normal `_search`, if the http channel that is used to submit a request is closed before getting a response, the search is automatically cancelled. Note that this behavior is only for the submit API, subsequent GET requests will not cancel if they are closed. Asynchronous searches are not persistent: if the coordinator node crashes or is restarted during the search, the asynchronous search will stop. To know if the search is still running or not the response contains a field called `is_running` that indicates if the task is up or not. It is the responsibility of the user to resume an asynchronous search that didn't reach a final response by re-submitting the query. 
However, final responses and failures are persisted in a system index that allows to retrieve a response even if the task finishes. ```` DELETE _async_search/9N3J1m4BgyzUDzqgC15b ```` The response is also not stored if the initial submit action returns a final response. This allows to not add any overhead to queries that complete within the initial `wait_for_completion`. The `.async-search` index is a restricted index (should be migrated to a system index in +8.0) that is accessible only through the async search APIs. These APIs also ensure that only the user that submitted the initial query can retrieve or delete the running search. Note that admins/superusers would still be able to cancel the search task through the task manager like any other tasks. Relates #49091 Co-authored-by: Luca Cavanna <javanna@users.noreply.github.com>	2020-03-16 15:31:27 +01:00
Marios Trivyzas	723034001c	SQL: Fix NPE for parameterized LIKE/RLIKE (#53573 ) Fix NPE when `null` is passed as a parameter for a parameterized pattern of LIKE/RLIKE. e.g.: `field LIKE ?` with `params=[null]`. Check for null pattern in LIKE/RLIKE as for RLIKE (RegexpQuery) we get an IllegalArgumentException from Lucene but for LIKE (WildcardQuery) we get an NPE. Fixes: #53557 (cherry picked from commit ec3481ed13254ecdec32acf7a0fafd536ec77aff)	2020-03-16 14:44:48 +01:00
Dimitris Athanasiou	94da4ca3fc	[7.x][ML] Extend classification to support multiple classes (#53539 ) (#53597 ) Prepares classification analysis to support more than just two classes. It introduces a new parameter to the process config which dictates the `num_classes` to the process. It also changes the max classes limit to `30` provisionally. Backport of #53539	2020-03-16 15:00:54 +02:00
David Kyle	a38e5ca8e7	Mute TimeSeriesLifecycleActionsIT.testHistoryIsWrittenWithFailure (#53595 ) Failure tracked in #50353	2020-03-16 12:30:56 +00:00
Marios Trivyzas	1272ae411e	SQL: Fix issue with LIKE/RLIKE as painless script (#53495 ) Add missing asScript() implementation for LIKE/RLIKE expressions. When LIKE/RLIKE are used for example in GROUP BY or are wrapped with scalar functions in a WHERE clause, the translation must produce a painless script which will be executed to implement the correct behaviour and previously this was completely missing, and as a consequence wrong results were silently (no error) returned. Fixes: #53486 (cherry picked from commit eaa8ead6742a8e7dcf343bcbaff8de031550fd77)	2020-03-16 12:27:45 +01:00
Martijn van Groningen	3b9545848f	Reenable watcher rest tests (#53532 ) Also log a message instead of failing if there are active watches at a beginning of a test. Relates to #53177	2020-03-16 10:24:14 +01:00
Mark Vieira	2f0aca992b	Revert "Upgrade to Jackson 2.10.3 and GeoIP2 to 2.13.1 (#53576 )" This reverts commit `b7dbadeea0`.	2020-03-15 18:10:40 -07:00
Jason Tedor	b7dbadeea0	Upgrade to Jackson 2.10.3 and GeoIP2 to 2.13.1 (#53576 ) This commit upgrades our Jackson dependency to 2.10.3 and our GeoIP2 dependency to 2.13.1. Relates #53523	2020-03-14 13:28:06 -04:00
Benjamin Trent	1262ab2762	[ML] [Inference] fix number inference models returned in x-pack info call (#53540 ) (#53560 ) the ML portion of the x-pack info API was erroneously counting configuration documents and definition documents. The underlying implementation of our storage separates the two out. This PR filters the query so that only trained model config documents are counted.	2020-03-13 16:53:34 -04:00
Benjamin Trent	4e43ede735	[ML] renaming inference processor field field_mappings to new name field_map (#53433 ) (#53502 ) This renames the `inference` processor configuration field `field_mappings` to `field_map`. `field_mappings` is now deprecated.	2020-03-13 15:40:57 -04:00
Tom Veasey	690099553c	[7.x][ML] Adds the class_assignment_objective parameter to classification (#53552 ) Adds a new parameter for classification that enables choosing whether to assign labels to maximise accuracy or to maximise the minimum class recall. Fixes #52427.	2020-03-13 17:35:51 +00:00
Tim Vernum	a8677499d7	[Backport] Add support for secondary authentication (#53530 ) This change makes it possible to send secondary authentication credentials to select endpoints that need to perform a single action in the context of two users. Typically this need arises when a server process needs to call an endpoint that users should not (or might not) have direct access to, but some part of that action must be performed using the logged-in user's identity. Backport of: #52093	2020-03-13 16:30:20 +11:00
Tim Vernum	bac1740d44	Support authentication without anonymous user (#53528 ) This change adds a new parameter to the authenticate methods in the AuthenticationService to optionally exclude support for the anonymous user (if an anonymous user exists). Backport of: #52094	2020-03-13 14:27:29 +11:00
Nik Everett	9dcd64c110	Preserve metric types in top_metrics (backport of #53288 ) (#53440 ) This changes the `top_metrics` aggregation to return metrics in their original type. Since it only supports numerics, that means that dates, longs, and doubles will come back as stored, with their appropriate formatter applied.	2020-03-12 17:17:09 -04:00
Jason Tedor	5b08ea84c9	Add deprecation check for listener thread pool (#53438 ) This commit adds a deprecation check for the listener thread pool settings as these will be removed in 8.0.0.	2020-03-12 14:32:41 -04:00
Jay Modi	af36665b08	Deprecate the logstash enabled setting (#53487 ) The setting, `xpack.logstash.enabled`, exists to enable or disable the logstash extensions found within x-pack. In practice, this setting had no effect on the functionality of the extension. Given this, the setting is now deprecated in preparation for removal. Backport of #53367	2020-03-12 10:18:39 -06:00
Dan Hermann	34adfd9611	Validate SSL settings at parse time (#49196 ) (#53473 )	2020-03-12 10:14:51 -05:00
Aleksandr Maus	31d45b3c95	EQL: Improve query folder test suite (#53187 ) (#53476 ) Related to https://github.com/elastic/elasticsearch/issues/52775	2020-03-12 10:58:07 -04:00
Yannick Welsch	48124807d5	Fix SourceOnlySnapshotIT (#53462 ) The tests in this class had been failing for a while, but went unnoticed as not tested by CI (see #53442). The reason the tests fail is that the can-match phase is smarter now, and filters out access to a non-existing field. Closes #53442	2020-03-12 14:15:03 +01:00
Jason Tedor	d8e70d4688	Enable deprecation checks for removed settings (#53317 ) Today we do not have any infrastructure for adding a deprecation check for settings that are removed. This commit enables this by adding such infrastructure. Note that this infrastructure is unused in this commit, which is deliberate. However, the primary target for this commit is 7.x where this infrastructure will be used, in a follow-up.	2020-03-11 16:49:16 -04:00
Benjamin Trent	89668c5ea0	[ML][Inference] adds new default_field_map field to trained models (#53294 ) (#53419 ) Adds a new `default_field_map` field to trained model config objects. This allows the model creator to supply a field map if it knows that there should be some map for inference to work directly against the training data. The use case internally is having analytics jobs supply a field mapping for multi-field fields. This allows us to use the model "out of the box" on data where we trained on `foo.keyword` but the `_source` only references `foo`.	2020-03-11 13:49:39 -04:00
Jay Modi	9a21a8abf2	Opt-in logstash plugin to formatting (#53413 ) This change opts-in the logstash plugin for enforced formatting. Backport of #53370	2020-03-11 09:58:37 -06:00
Nhat Nguyen	6665ebe7ab	Harden search context id (#53143 ) Using a Long alone is not strong enough for the id of search contexts because we reset the id generator whenever a data node is restarted. This can lead to two issues: 1. Fetch phase can fetch documents from another index 2. A scroll search can return documents from another index This commit avoids these issues by adding a UUID to SearchContextId.	2020-03-11 11:48:11 -04:00
Przemysław Witek	8c4c19d310	Perform evaluation in multiple steps when necessary (#53295 ) (#53409 )	2020-03-11 15:36:38 +01:00
Przemysław Witek	063957b7d8	Simplify "refresh" calls. (#53385 ) (#53393 )	2020-03-11 12:26:11 +01:00
Dimitris Athanasiou	cc7751eb16	[7.x][ML] Add ILM policy to ml stats indices (#53349 ) (#53392 ) Adds a size based ILM policy to automatically rollover ml stats indices. Backport of #53349	2020-03-11 13:01:34 +02:00
Dimitris Athanasiou	0fd0516d0d	[7.x][ML] Rename data frame analytics maximum_number_trees to max_trees (#53300 ) (#53390 ) Deprecates `maximum_number_trees` parameter of classification and regression and replaces it with `max_trees`. Backport of #53300	2020-03-11 12:45:27 +02:00
David Roberts	532a720e1b	[ML] Skeleton estimate_model_memory endpoint for anomaly detection (#53386 ) This is a partial implementation of an endpoint for anomaly detector model memory estimation. It is not complete, lacking docs, HLRC and sensible numbers for many anomaly detector configurations. These will be added in a followup PR in time for 7.7 feature freeze. A skeleton endpoint is useful now because it allows work on the UI side of the change to commence. The skeleton endpoint handles the same cases that the old UI code used to handle, and produces very similar estimates for these cases. Backport of #53333	2020-03-11 10:20:00 +00:00
Jake Landis	2ab502afc4	[7.x] Remove dead 'beats' code (#53312 ) (#53376 )	2020-03-10 20:57:29 -05:00
Nhat Nguyen	24f114766f	Fix doc_stats and segment_stats of ReadOnlyEngine (#53345 ) We can't always have the same segment stats and doc stats between InternalEngine and ReadOnlyEngine if there are some fully deleted segments. ReadOnlyEngine always filters them out. InternalEngine, however, will keep them if peer recovery retention leases exist or the number of the retaining operations is non-zero. This change reverts the fix in #51331 and uses the wrapped reader to calculate the segment stats and doc stats. For the test, we need to disable the extra retaining soft-deletes operations. Closes #51303	2020-03-10 21:51:33 -04:00
Nhat Nguyen	cad02d4a31	Increase timeout testFollowIndexMaxOperationSizeInBytes (#53014 ) Replicating 1000 documents one by one (as we cap the request size at 1 byte) can take more than 10 seconds on a slow CI. Closes #52812	2020-03-10 21:51:33 -04:00
William Brafford	3494c73c8d	Mute failing tests (#53362 ) (#53363 )	2020-03-10 16:01:31 -04:00
Przemko Robakowski	847ac9c7d7	Fix null config in SnapshotLifecyclePolicy.toRequest (#53328 ) (#53355 ) This avoids NPE when executing SLM policy when no config was provided. Related to #44465 Closes #53171 Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-03-10 20:44:30 +01:00
Przemysław Witek	d54d7f2be0	[7.x] Implement ILM policy for .ml-state* indices (#52356 ) (#53327 )	2020-03-10 14:24:18 +01:00
Benjamin Trent	856d9bfbc1	[ML] fixing data frame analysis test when two jobs are started in succession quickly (#53192 ) (#53332 ) A previous change (#53029) is causing analysis jobs to wait for certain indices to be made available. While it is good for jobs to wait, they could fail early on _start. This change will cause the persistent task to continually retry node assignment when the failure is due to shards not being available. If the shards are not available by the time `timeout` is reached by the predicate, it is treated as a _start failure and the task is canceled. For tasks seeking a new assignment after a node failure, that behavior is unchanged. closes #53188	2020-03-10 08:30:47 -04:00
Hendrik Muhs	5912895838	[Transform] wait for transform templates in Rest integration t… (#53330 ) add transform templates to the list of templates to be installed before executing tests	2020-03-10 13:22:12 +01:00
Hendrik Muhs	696aa4ddaf	[7.x][Transform] add support for script in group_by (#53167 ) (#53324 ) add the possibility to base the group_by on the output of a script. closes #43152 backport #53167	2020-03-10 11:12:58 +01:00
Alan Woodward	5c861cfe6e	Upgrade to final lucene 8.5.0 snapshot (#53293 ) Lucene 8.5.0 release candidates are imminent. This commit upgrades master to use the latest snapshot to check that there are no last-minute bugs or regressions.	2020-03-10 09:32:59 +00:00
Cauê Marcondes	b68d7b1c33	giving kibana user privileges to create custom link index (#53221 ) (#53278 )	2020-03-10 09:50:38 +01:00
Henning Andersen	a4d481f2bb	ILM Freeze step retry when not acknowledged (#53287 ) A freeze operation can partially fail in multiple places, including the close verification step. This left the index in an unfrozen but partially closed state. Now throw an exception to retry the freeze step instead.	2020-03-10 08:03:39 +01:00
Gordon Brown	1cb0a4399d	Fix Get Alias API handling of hidden indices with visible aliases (#53147 ) This commit changes the Get Aliases API to include hidden indices by default - this is slightly different from other APIs, but is necessary to make this API work intuitively.	2020-03-09 16:16:29 -06:00
Przemko Robakowski	f075d70cf8	[7.x] Avoid race condition in ILMHistoryStore (#53039 ) (#53094 ) * Avoid race condition in ILMHistoryStore (#53039) * Avoid race condition in ILMHistoryStore This change modifies ILMHistoryStore to always apply correct settings and mappings, even if template is deleted and not yet recreated. This ensures that ILM history index is correctly managed by ILM and also fixes flaky history tests that were prone to triggering this race. This commit also refactors and simplifies ILM history tests. Closes #50353 and #52853 * Review comment Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> * fixed tests * backport #53306 Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-03-09 22:24:15 +01:00
Bogdan Pintea	62c8ac9993	SQL: transfer version compatibility decision to the server (#53082 ) (#53302 ) This commit adds a new request object field, "version", containing the version of the requesting client. This parameter is now accepted - and for certain clients required - by the server and the request is validated against it. Currently server's and client's versions still need to be equal in order for the request to be accepted. Relaxing this check is going to be part of future work. On the clients' side, the only check remaining is to ensure that the peer server is supporting version backwards compatibility (i.e. is on, or newer than a certain release). (cherry picked from commit a8f413a20fb023bec83af0de1211a2936a7f558c)	2020-03-09 21:16:57 +01:00
Aleksandr Maus	d064846416	EQL: Test infrastructure improvements (#53253 ) (#53297 ) Update CommonEqlRestTestCase code to simplify making changes as requested. Update EqlActionIT to simplify the test code as requested. Replace Jackson parser with XContent in EqlActionIT. Whitelist more EQL tests specs that are now supported.	2020-03-09 14:11:54 -04:00
Ross Wolf	f5f922c6f6	EQL: Add IsNull/IsNotNull checks (#52791 ) * EQL: Add IsNull/IsNotNull checks * EQL: Simplify IsNull/IsNotNull optimization * EQL: Split string tests over multiple lines	2020-03-09 10:41:04 -06:00
Jason Tedor	8ad0080a59	Fork CCR checkpoint listeners on CCR thread pool (#53265 ) This commit moves the global checkpoint listeners used in CCR to the CCR thread pool. This removes the last use of the listener thread pool in the codebase.	2020-03-09 08:56:30 -04:00
Martijn van Groningen	7775ddbc9c	Verify watch_count before a test starts and not after a test. This check was added as part of: `0f2d26bdca` Checking this before the test starts makes more sense, because the watches index has then also been removed. Relates to #53177	2020-03-09 07:45:44 +01:00
Jason Tedor	5e96d3e59a	Use given executor for global checkpoint listener (#53260 ) Today when notifying a global checkpoint listener, we use the listener thread pool. This commit turns this inside out so that the global checkpoint listener must provide an executor on which to notify the listener.	2020-03-08 13:51:05 -04:00
Gordon Brown	ff9b8bda63	Implement hidden aliases (#52547 ) This commit introduces hidden aliases. These are similar to hidden indices, in that they are not visible by default, unless explicitly specified by name or by indicating that hidden indices/aliases are desired. The new alias property, `is_hidden` is implemented similarly to `is_write_index`, except that it must be consistent across all indices with a given alias - that is, all indices with a given alias must specify the alias as either hidden, or all specify it as non-hidden, either explicitly or by omitting the `is_hidden` property.	2020-03-06 16:02:38 -07:00
Ross Wolf	d6813cb348	EQL: Convert wildcards to LIKE in analyzer (#51901 ) * EQL: Convert wildcard comparisons to Like * EQL: Simplify wildcard handling, update tests * EQL: Lint fixes for Optimizer.java	2020-03-06 13:13:07 -07:00
Mayya Sharipova	f96ad5c32d	Mute testSingleNumericFeatureAndMixedTrainingAndNonTrainingRows	2020-03-06 12:48:05 -05:00
Jay Modi	a81460dbf5	Make watch history indices hidden (#52974 ) This commit updates the template used for watch history indices with the hidden index setting so that new indices will be created as hidden. Relates #50251 Backport of #52962	2020-03-06 09:47:03 -07:00
Mark Vieira	09a3f45880	Mute ClassificationIT.testTwoJobsWithSameRandomizeSeedUseSameTrainingSet Signed-off-by: Mark Vieira <portugee@gmail.com>	2020-03-06 07:38:04 -08:00
James Baiera	01f00df5cd	Mute RegressionIT.testTwoJobsWithSameRandomizeSeedUseSameTrainingSet	2020-03-06 07:37:57 -08:00
Dimitris Athanasiou	9abf537527	[7.x][ML] Improve DF analytics audits and logging (#53179 ) (#53218 ) Adds audits for when the job starts reindexing, loading data, analyzing, writing results. Also adds some info logging. Backport of #53179	2020-03-06 13:47:27 +02:00
Nhat Nguyen	5476a49833	Revert "upgrade to lucene-snapshot-fa75139efea (#53150 ) (#53151 )" This reverts commit `058113aa42`.	2020-03-05 17:33:00 -05:00
Benjamin Trent	af0b1c2860	[ML] Fix minor race condition in dataframe analytics _stop (#53029 ) (#53164 ) Tests have been periodically failing due to a race condition on checking a recently `STOPPED` task's state. The `.ml-state` index is not created until the task has already been transitioned to `STARTED`. This allows the `_start` API call to return. But, if a user (or test) immediately attempts to `_stop` that job, the job could stop and the task removed BEFORE the `.ml-state|stats` indices are created/updated. This change moves towards the task cleaning itself up in its main execution thread. `stop` flips the flag of the task to `isStopping` and now we check `isStopping` at every necessary method, allowing the task to stop gracefully. closes #53007	2020-03-05 09:59:18 -05:00
Benjamin Trent	181ee3ae0b	[ML] specifying missing_field_value value and using it instead of empty_string (#53108 ) (#53165 ) For analytics, we need a consistent way of indicating when a value is missing. Inheriting from anomaly detection, analysis sent `""` when a field is missing. This works fine with numbers, but the underlying analytics process actually treats `""` as a category in categorical values. Consequently, you end up with this situation in the resulting model ``` { "frequency_encoding" : { "field" : "RainToday", "feature_name" : "RainToday_frequency", "frequency_map" : { "" : 0.009844409027270245, "No" : 0.6472019970785184, "Yes" : 0.6472019970785184 } } } ``` For inference this is a problem, because inference will treat missing values as `null`. And thus not include them on the infer call against the model. This PR takes advantage of our new `missing_field_value` option and supplies `\0` as the value.	2020-03-05 09:50:52 -05:00
Aleksandr Maus	2dc872f052	EQL: Add HLRC for EQL stats (#53043 ) (#53148 )	2020-03-05 09:20:38 -05:00
Adrien Grand	360ac1997f	Fix test failures with the new `constant_keyword` field. (#53153 ) This test failed because YAML tests randomly install an index template that updates the default number of shards to 2. Closes #53131	2020-03-05 14:29:13 +01:00
Nik Everett	28df7ae5ed	Support multiple metrics in `top_metrics` agg (backport of #52965 ) (#53163 ) This adds support for returning multiple metrics to the `top_metrics` agg. It looks like: ``` POST /test/_search?filter_path=aggregations { "aggs": { "tm": { "top_metrics": { "metrics": [ {"field": "v"}, {"field": "m"} ], "sort": {"s": "desc"} } } } } ```	2020-03-05 08:12:01 -05:00
David Roberts	01504df876	[TEST] Force close failed job before skipping test (#53128 ) The assumption added in #52631 skips a problematic test if it fails to create the required conditions for the scenario it is supposed to be testing. (This happens very rarely.) However, before skipping the test it needs to remove the failed job it has created because the standard test cleanup code treats failed jobs as fatal errors. Closes #52608	2020-03-05 10:52:41 +00:00
Ignacio Vera	058113aa42	upgrade to lucene-snapshot-fa75139efea (#53150 ) (#53151 )	2020-03-05 10:04:05 +01:00
Ross Wolf	a5e82d7fd6	EQL: Add explicit 'any where ...' handling (#52526 )	2020-03-04 10:11:03 -07:00
Nik Everett	609c61f75c	Formalize usage stats for analytics (backport of #52966 ) (#53077 ) This moves the usage statistics gathering from the `AnalyticsPlugin` into an `AnalyicsUsage`, removing the static state. It also checks the license level when parsing all analytics aggregations. This is how we were checking them before but we did it in an easy to forget way. This way is slightly simpler, I think.	2020-03-04 10:29:11 -05:00
Martijn van Groningen	3fa5395ac8	Use correct issue number: #52453	2020-03-04 16:17:55 +01:00
Martijn van Groningen	2e325e24cb	Mute testMonitorClusterHealth test (#53109 ) Relates to #36782	2020-03-04 16:08:19 +01:00
Martijn van Groningen	b77f6746d1	unmute watcher single node test case relates to #36782	2020-03-04 15:25:17 +01:00
Aleksandr Maus	b47bffba24	EQL: consistent naming for event type vs event category (#53073 ) (#53090 ) Related to https://github.com/elastic/elasticsearch/issues/52941	2020-03-04 08:02:38 -05:00
Marios Trivyzas	e180e2738a	SQL: [Tests] Add tests for optimization of aliased expressions (#53048 ) Add a unit test to verify that the optimization of expression (e.g. COALESCE) is applied to all instances of the expression: SELECT, WHERE, GROUP BY and HAVING. Relates to #35270 (cherry picked from commit 2ceedc7f2019fad92cd86679af1a9c6fa594aa8d)	2020-03-04 11:48:06 +01:00
Marios Trivyzas	1d5c842700	SQL: Fix column size for IP data type (#53056 ) Set size/displaySize to 45 which is the maximum string for an IP (v6), since IPs are returned as strings. Fixes: #52762 (cherry picked from commit 815f01747a4d54a274ca248af6fc08e5ea0728c1)	2020-03-04 10:36:44 +01:00
Jay Modi	c610e0893d	Introduce system index APIs for Kibana (#53035 ) This commit introduces a module for Kibana that exposes REST APIs that will be used by Kibana for access to its system indices. These APIs are wrapped versions of the existing REST endpoints. A new setting is also introduced since the Kibana system indices' names are allowed to be changed by a user in case multiple instances of Kibana use the same instance of Elasticsearch. Additionally, the ThreadContext has been extended to indicate that the use of system indices may be allowed in a request. This will be built upon in the future for the protection of system indices. Backport of #52385	2020-03-03 14:11:36 -07:00
Andrei Stefan	9ad9ad7a6b	SQL: update SqlNodeSubclassTests list of min-two-parameters functions list (#53045 ) (#53058 ) (cherry picked from commit c741e49d9f5e7b78c1a78e1af97eb19354fe6864)	2020-03-03 19:37:37 +02:00
Adrien Grand	cb868d2f5e	Introduce a `constant_keyword` field. (#49713 ) (#53024 ) This field is a specialization of the `keyword` field for the case when all documents have the same value. It typically performs more efficiently than keywords at query time by figuring out whether all or none of the documents match at rewrite time, like `term` queries on `_index`. The name is up for discussion. I liked including `keyword` in it, so that we still have room for a `singleton_numeric` in the future. However I'm unsure whether to call it `singleton`, `constant` or something else, any opinions? For this field there is a choice between 1. accepting values in `_source` when they are equal to the value configured in mappings, but rejecting mapping updates 2. rejecting values in `_source` but then allowing updates to the value that is configured in the mapping This commit implements option 1, so that it is possible to reindex from/to an index that has the field mapped as a keyword with no changes to the source. Backport of #49713	2020-03-03 16:01:47 +01:00
Yang Wang	70814daa86	Allow _rollup_search with read privilege (#52043 ) (#53047 ) Currently _rollup_search requires manage privilege to access. It should really be a read only operation. This PR changes the requirement to be read indices privilege. Resolves: #50245	2020-03-03 22:29:54 +11:00
Martijn van Groningen	510db25dd0	Simplify watcher indexing listener.(#53046 ) Backport: #52627 Add watcher to trigger server after index operation has succeeded, instead of adding a watch to trigger service before the actual index operation has performed on the shard level. This logic is simpler to reason about in the case that a failure does occur during the execution of an index operation on the shard level. Relates to #52453, but I think doesn't fix it, but makes it easier to debug.	2020-03-03 11:01:57 +01:00
Hendrik Muhs	844f350774	[Transform] restructure transform yaml tests (#52956 ) restructure transform yaml tests to run cleanup in teardown phase relates #52428	2020-03-03 10:31:22 +01:00
Hendrik Muhs	d9258e210e	[Transform] fix sporadic race condition in TransformUsageIT (#52946 ) relax the test for trigger count fixes #52931	2020-03-03 10:27:36 +01:00
Costin Leau	712e0c05cd	EQL: Add implicit ordering on timestamp (#53004 ) QL: Move Sort base class from SQL to QL (cherry picked from commit 798015b7bbd565e9c4222724614baeb432c7c2b3)	2020-03-02 22:41:36 +02:00
Mark Vieira	f8396e8d15	Mute RunDataFrameAnalyticsIT.testStopOutlierDetectionWithEnoughDocumentsToScroll Signed-off-by: Mark Vieira <portugee@gmail.com>	2020-03-02 09:21:55 -08:00
Mark Vieira	5b5e92c71d	Mute NodeSubclassTests.testReplaceChildren Signed-off-by: Mark Vieira <portugee@gmail.com>	2020-03-02 09:21:54 -08:00
Lisa Cawley	4fbe1b0550	[DOCS] Adds cat anomaly detectors API (#52866 ) (#52970 )	2020-03-02 07:28:55 -08:00
Hendrik Muhs	a328a8eaf1	[7.x][Transform] implement node.transform to control where to… (#52998 ) implement transform node attributes to disable transform on certain nodes and test which nodes are allowed to do remote connections closes #52200 closes #50033 closes #48734 backport #52712	2020-03-02 16:10:57 +01:00
Aleksandr Maus	89ed857c79	EQL: Change request parameter query to filter and rule to query (#52971 ) (#53006 ) Related to https://github.com/elastic/elasticsearch/issues/52911	2020-03-02 09:26:23 -05:00
Andrei Stefan	6fecc1db84	Issue a different error message in case an index doesn't have a mapping (#52967 ) (#53003 ) (cherry picked from commit a0bd83a0579cf196a1d727de2a46b3b101d5a73b)	2020-03-02 14:04:49 +02:00
Andrei Stefan	69383acecf	Define list of Nodes that have minimum two children in tests (#52957 ) (#52994 ) (cherry picked from commit c1e43e694f02edf3e197abbab7c21008c022b516)	2020-03-02 11:26:50 +02:00
Hendrik Muhs	49f41d127b	[Transform] fix NPE in derive stats if shouldStopAtNextCheckpo… (#52940 ) fixes a NPE in _stats in case shouldStopAtNextCheckpoint is set.	2020-03-02 08:11:01 +01:00
Martijn van Groningen	d102158e6f	Improve closing mock webserver when failed to start (#52943 ) Fix NPE when closing a webserver that hasn't started correctly. This can happen when ssl context isn't initialized. The server instance is then never set, which causes an NPE that masks the actual failure. Example stacktrace that would mask an actual failure: ``` java.lang.NullPointerException at org.elasticsearch.test.http.MockWebServer.close(MockWebServer.java:271) at org.elasticsearch.xpack.watcher.test.integration.HttpSecretsIntegrationTests.cleanup(HttpSecretsIntegrationTests.java:70) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) ```	2020-03-02 07:19:08 +01:00
Nhat Nguyen	e6755afeeb	Upgrade to Lucene 8.5.0-snapshot-c4475920b08 (#52950 ) (#52977 ) To give LUCENE-9228 more CI cycles	2020-02-29 09:29:16 -05:00
Dimitris Athanasiou	85b4e45093	[7.x][ML] Parse and report memory usage for DF Analytics (#52778 ) (#52980 ) Adds reporting of memory usage for data frame analytics jobs. This commit introduces a new index pattern `.ml-stats-*` whose first concrete index will be `.ml-stats-000001`. This index serves to store instrumentation information for those jobs. Backport of #52778 and #52958	2020-02-29 13:03:40 +02:00
Luca Cavanna	090bdf69c0	Mute NodeSubclassTests#testReplaceChildren (#52952 ) Relates #52951	2020-02-28 16:13:17 +01:00
Andrei Stefan	c3a167830f	SQL: refactor In predicate moving it to QL project (#52870 ) (#52938 ) * Move In, InPipe and InProcessor out of SQL to the common QL project. * Move tests classes to the QL project. * Create SQL dedicated In class to handle SQL specific data types. * Update SQL classes to use the InPipe and InProcessor QL classes. * Extract common Foldables methods in QL project. * Be more explicit when folding and converting a foldable value, by removing most of the code inside Foldables class. (cherry picked from commit 7425042f86f66df8c207c5e96f9b9848bda2b4c3)	2020-02-28 14:04:10 +02:00
Costin Leau	a674085903	EQL: Disable field extraction for returned events (#52884 ) Return the whole source of matching events (cherry picked from commit 79ca586ab1d89d645fb58142b82202f14ce5d361)	2020-02-28 13:48:15 +02:00
Yang Wang	82553524af	Respect runas realm for ApiKey security operations (#52178 ) (#52932 ) When user A runs as user B and performs any API key related operations, user B's realm should always be used to associate with the API key. Currently user A's realm is used when getting or invalidating API keys and owner=true. The PR is to fix this bug. resolves: #51975	2020-02-28 10:53:52 +11:00
Nik Everett	866b08716c	Fix test for top_metrics (#52927 ) I added the wrong skips and the wrong error message. Ooops.	2020-02-27 18:30:37 -05:00
Nik Everett	1d1956ee93	Add size support to `top_metrics` (backport of #52662 ) (#52914 ) This adds support for returning the top "n" metrics instead of just the very top. Relates to #51813	2020-02-27 16:12:52 -05:00
Benjamin Trent	19a6c5d980	[7.x] [ML][Inference] Add support for multi-value leaves to the tree model (#52531 ) (#52901 ) * [ML][Inference] Add support for multi-value leaves to the tree model (#52531) This adds support for multi-value leaves. This is a prerequisite for multi-class boosted tree classification.	2020-02-27 14:05:28 -05:00
Benjamin Trent	eac38e9847	[ML] Add indices_options to datafeed config and update (#52793 ) (#52905 ) This adds a new configurable field called `indices_options`. This allows users to create or update the indices_options used when a datafeed reads from an index. This is necessary for the following use cases: - Reading from frozen indices - Allowing certain indices in multiple index patterns to not exist yet These index options are available on datafeed creation and update. Users may specify them as URL parameters or within the configuration object. closes https://github.com/elastic/elasticsearch/issues/48056	2020-02-27 13:43:25 -05:00
Henning Andersen	09fe4b42db	Disable ILM history in x-pack rest tests (#52868 ) The creation of the ILM history index can be delayed from one test into the next, which can cause issues for tests using `_all`. Closes #52209	2020-02-27 17:20:33 +01:00
David Kyle	d8bdf31110	Revert "Mute RunDataFrameAnalyticsIT.testOutlierDetectionStopAndRestart" This reverts commit `ad3a3b1af9`.	2020-02-27 12:38:13 +00:00
David Kyle	6e5e64559a	Unwrap cause from remote ActionTransportExceptions (#52842 ) (#52878 ) And log the cause	2020-02-27 11:58:28 +00:00
István Zoltán Szabó	4a33352a94	[DOCS] Adds cat trained model API documentation (#52824 )	2020-02-27 12:54:11 +01:00
Costin Leau	40bc06f6ad	EQL: Hook engine to Elasticsearch (#52828 ) Add query execution and return actual results returned from Elasticsearch inside the tests (cherry picked from commit 3e039282bf991af87604a6d4f8eada19d5e33842)	2020-02-27 11:22:22 +02:00
Yang Wang	14c21aedd2	Simplify ml license checking with XpackLicenseState internals (#52684 ) (#52863 ) This change replaces the TrainedModelConfig#isAvailableWithLicense method with calls to XPackLicenseState#isAllowedByLicense. Please note there are subtle changes to the code logic. But they are the right changes: * Instead of Platinum license, Enterprise license now guarantees availability. * No explicit check when the license requirement is basic. Since basic license is always available, this check is unnecessary. * Trial license is always allowed.	2020-02-27 14:14:16 +11:00
Yang Wang	f5c4e92558	Refactor license checking (#52118 ) (#52859 ) Improve code reuse and readability. Add convenience checking method which covers most use cases without having to pass many boolean arguments.	2020-02-27 13:04:19 +11:00
Jake Landis	b4179a8814	[7.x] Refactor watcher tests (#52799 ) (#52844 ) This PR moves the majority of the Watcher REST tests under the Watcher x-pack plugin. Specifically, moves the Watcher tests from: x-pack/plugin/test x-pack/qa/smoke-test-watcher x-pack/qa/smoke-test-watcher-with-security x-pack/qa/smoke-test-monitoring-with-watcher to: x-pack/plugin/watcher/qa/rest (/test and /qa/smoke-test-watcher) x-pack/plugin/watcher/qa/with-security x-pack/plugin/watcher/qa/with-monitoring Additionally, this disables Watcher from the main x-pack test cluster and consolidates the stop/start logic for the tests listed. No changes to the tests (beyond moving them) are included. 3rd party tests and doc tests (which also touch Watcher) are not included in the changes here.	2020-02-26 15:57:10 -06:00
Jay Modi	07ef8ccff4	Allow dynamic updates for index.hidden setting (#52837 ) This commit changes the `index.hidden` setting from being final to a dynamic setting. While the setting being final allows for easier reasoning about an index, making this setting update-able has more benefits in that we can upgrade existing indices to be hidden and it will enable future features that would dynamically make indices hidden. Backport of #52772	2020-02-26 11:46:29 -07:00
Nik Everett	bfaa487757	Switch pipeline agg parsing to ContextParser (#52776 ) (#52832 ) We've pretty well settled on `ContextParser` for a generic interface to `ObjectParser`-like-things. This switches the interface used for building parsing pipeline aggregations to `ContextParser` which saves a couple of little wrappers around `ObjectParser`.	2020-02-26 12:57:20 -05:00
Lisa Cawley	b788ec7157	[DOCS] Adds cat datafeeds API (#52738 )	2020-02-26 09:28:57 -08:00
Ioannis Kakavas	2d01c005ba	Update commons-collections test dependency to 3.2.2 (#52808 ) (#52817 ) This is only a test dependency but it trips scanners so upgrade to 3.2.2 which doesn't suffer from the issues mentioned in i.e. https://snyk.io/vuln/SNYK-JAVA-COMMONSCOLLECTIONS-472711	2020-02-26 17:03:45 +02:00
Adrien Grand	1807f86751	Generalize how queries on `_index` are handled at rewrite time (#52815 ) Generalize how queries on `_index` are handled at rewrite time (#52486) Since this change refactors rewrites, I also took it as an opportunity to address #49254: instead of returning the same queries you would get on a keyword field when a field is unmapped, queries get rewritten to a MatchNoDocsQueryBuilder. This change exposed a couple bugs, like the fact that the percolator doesn't rewrite queries at query time, or that the significant_terms aggregation doesn't rewrite its inner filter, which I fixed. Closes #49254	2020-02-26 15:37:43 +01:00
David Kyle	ad3a3b1af9	Mute RunDataFrameAnalyticsIT.testOutlierDetectionStopAndRestart	2020-02-26 14:31:00 +00:00
Jake Landis	8d311297ca	[7.x] Smarter copying of the rest specs and tests (#52114 ) (#52798 ) * Smarter copying of the rest specs and tests (#52114) This PR addresses the unnecessary copying of the rest specs and allows for better semantics for which specs and tests are copied. By default the rest specs will get copied if the project applies `elasticsearch.standalone-rest-test` or `esplugin` and the project has rest tests or you configure the custom extension `restResources`. This PR also removes the need for dozens of places where the x-pack specs were copied by supporting copying of the x-pack rest specs too. The plugin/task introduced here can also copy the rest tests to the local project through a similar configuration. The new plugin/task allows a user to minimize the surface area of which rest specs are copied. Per project can be configured to include only a subset of the specs (or tests). Configuring a project to only copy the specs when actually needed should help with build cache hit rates since we can better define what is actually in use. However, project level optimizations for build cache hit rates are not included with this PR. Also, with this PR you can no longer use the includePackaged flag on integTest task. The following items are included in this PR: * new plugin: `elasticsearch.rest-resources` * new tasks: CopyRestApiTask and CopyRestTestsTask - performs the copy * new extension 'restResources' ``` restResources { restApi { includeCore 'foo' , 'bar' //will include the core specs that start with foo and bar includeXpack 'baz' //will include x-pack specs that start with baz } restTests { includeCore 'foo', 'bar' //will include the core tests that start with foo and bar includeXpack 'baz' //will include the x-pack tests that start with baz } } ```	2020-02-26 08:13:41 -06:00
Ioannis Kakavas	2a6c3bea3f	Update oauth2-oidc-sdk to 7.0 (#52489 ) (#52806 ) Resolves: #48409 Other changes: https://bitbucket.org/connect2id/oauth-2.0-sdk-with-openid-connect -extensions/src/7.0.2/CHANGELOG.txt	2020-02-26 16:02:10 +02:00
István Zoltán Szabó	f57422bbfd	[DOCS] Adds cat data frame analytics API (#52764 ) Co-authored-by: Lisa Cawley <lcawley@elastic.co>	2020-02-26 11:10:42 +01:00
David Kyle	37be695d5c	[ML] Handle failed datafeed in MlDistributedFailureIT (#52631 ) (#52789 )	2020-02-26 08:18:37 +00:00
Lisa Cawley	05f1cd74a6	[DOCS] Fixes monitoring links (#52790 )	2020-02-25 18:08:23 -08:00
Tim Brooks	6669e53f08	Do not lock on reads of XPackLicenseState (#52492 ) XPackLicenseState reads are necessary to validate a number of cluster operations. These reads occasionally occur on transport threads which should not be blocked. Currently we synchronize when reading. However, this is unnecessary as only a single piece of state is updateable. This commit makes this state volatile and removes the locking.	2020-02-25 15:38:35 -07:00
Andrei Stefan	51c6aefa55	SQL: Use calendar_interval of 1d for HISTOGRAMs with 1 DAY intervals (#52749 ) (#52771 ) (cherry picked from commit 556f5fa33be88570c4f8550cb8f784323d26a707)	2020-02-25 18:44:02 +02:00
Costin Leau	a8911802d3	EQL: transform query AST into queryDSL (#52432 ) (cherry picked from commit 94cef29df259319dfe2a3bf92d3f1a42d7e45781)	2020-02-25 17:53:59 +02:00
Nik Everett	02b23c37d1	Another test fix Another attempt to fix a test that fails rarely and randomly. This time try locking the query to just a single index.	2020-02-25 10:22:12 -05:00
Aleksandr Maus	a6f5b4bb78	Unmute EqlActionIT (#52757 ) Related to https://github.com/elastic/elasticsearch/issues/52737	2020-02-25 10:22:07 -05:00
David Roberts	cf122d13b8	[ML] Use event.timezone in file_structure_finder ingest pipeline (#52720 ) This is because beat.timezone was renamed to event.timezone in elastic/beats#9458	2020-02-25 12:33:53 +00:00
Aleksandr Maus	b2cb38ccf5	EQL: Expand verification tests (#52664 ) (#52725 ) * EQL: Expand verification tests (#52664) Expand verification tests Fix some error messaging consistency in EqlParser Related to https://github.com/elastic/elasticsearch/issues/51873 * Adjust for 7.x compatibility	2020-02-25 07:19:33 -05:00
Mark Vieira	025352f0a4	Mute EqlActionIT	2020-02-24 16:06:30 -08:00
Andrei Stefan	ed6b10bc03	SQL: use a calendar interval for histograms over 1 month intervals (#52586 ) (#52715 ) (cherry picked from commit 928b11a34ec92d90d082abdf4fa09f7ce1d7c0c4)	2020-02-25 01:41:51 +02:00
Nik Everett	d48870ef94	Try to fix test another way..... Explicitly create the index rather than skip adding the default template....	2020-02-24 17:17:41 -05:00
Nik Everett	a7fe3329cb	Fix some top_metrics tests (#52575 ) (#52726 ) These tests didn't work properly when run against multi-shard indices. The `_score` based sorting test expects fairly specific scores which isn't going to happen with multiple shards so this disables multiple shards for that test. The other tests were failing due to a fairly sneaky race condition around `_bulk` and type inference. This fixes them by always sending metric values as floating point numbers so Elasticsearch always infers them to be doubles.	2020-02-24 14:30:37 -05:00
Ryan Ernst	8c295cdc87	Fix sql cli sourcing of x-pack-env (#52613 ) The sql-cli script sources x-pack-env, but it does so assuming the current directory is ES_HOME. This commit alters the source command to use ES_HOME which is available after running elasticsearch-env. closes #47803	2020-02-24 11:13:31 -08:00
Aleksandr Maus	a7bdb0b456	EQL: Add integration tests harness to test EQL feature parity with original implementation (#52248 ) (#52675 ) The tests use the original test queries from https://github.com/endgameinc/eql/blob/master/eql/etc/test_queries.toml for EQL implementation correctness validation. The file test_queries_unsupported.toml serves as a "blacklist" for the queries that we do not support. Currently all of the queries are blacklisted. Over time, the expectation is to have an empty "blacklist" when all of the queries are fully supported. The tests use the original test vector from https://raw.githubusercontent.com/endgameinc/eql/master/eql/etc/test_data.json. Only one EQL query and its response are stubbed for now to match the expected output from that query. This part would need some tweaking after EQL is fully wired. Related to https://github.com/elastic/elasticsearch/issues/49581	2020-02-24 12:46:59 -05:00
Adrien Grand	f993ef80f8	Move the terms index of `_id` off-heap. (#52518 ) In #42838 we moved the terms index of all fields off-heap except the `_id` field because we were worried it might make indexing slower. In general, the indexing rate is only affected if explicit IDs are used, as otherwise Elasticsearch almost never performs lookups in the terms dictionary for the purpose of indexing. So it's quite wasteful to require the terms index of `_id` to be loaded on-heap for users who have append-only workloads. Furthermore I've been conducting benchmarks when indexing with explicit ids on the http_logs dataset that suggest that the slowdown is low enough that it's probably not worth forcing the terms index to be kept on-heap. Here are some numbers for the median indexing rate in docs/s: \| Run \| Master \| Patch \| \| --- \| ------- \| ------- \| \| 1 \| 45851.2 \| 46401.4 \| \| 2 \| 45192.6 \| 44561.0 \| \| 3 \| 45635.2 \| 44137.0 \| \| 4 \| 46435.0 \| 44692.8 \| \| 5 \| 45829.0 \| 44949.0 \| And now heap usage in MB for segments: \| Run \| Master \| Patch \| \| --- \| ------- \| -------- \| \| 1 \| 41.1720 \| 0.352083 \| \| 2 \| 45.1545 \| 0.382534 \| \| 3 \| 41.7746 \| 0.381285 \| \| 4 \| 45.3673 \| 0.412737 \| \| 5 \| 45.4616 \| 0.375063 \| Indexing rate decreased by 1.8% on average, while memory usage decreased by more than 100x. The `http_logs` dataset contains small documents and has a simple indexing chain. More complex indexing chains, e.g. with more fields, ingest pipelines, etc. would see an even lower decrease of indexing rate.	2020-02-24 18:14:12 +01:00
David Kyle	de3d674bb7	Revert "Mute RunDataFrameAnalyticsIT.testOutlierDetectionStopAndRestart" This reverts commit `c4d91143ac`.	2020-02-24 15:22:49 +00:00
David Kyle	044a4e127a	[ML] Add reason to DataFrameAnalyticsTask setFailed log message (#52659 ) (#52707 )	2020-02-24 15:21:51 +00:00
Albert Zaharovits	33131e2dcd	Logfile audit settings validation (#52537 ) Add validation for the following logfile audit settings: xpack.security.audit.logfile.events.include xpack.security.audit.logfile.events.exclude xpack.security.audit.logfile.events.ignore_filters..users xpack.security.audit.logfile.events.ignore_filters..realms xpack.security.audit.logfile.events.ignore_filters..roles xpack.security.audit.logfile.events.ignore_filters..indices Closes #52357 Relates #47711 #47038 Follows the example from #47246	2020-02-24 16:38:16 +02:00
Ignacio Vera	ba9d3c6389	Add support for multipoint shape queries (#52564 ) (#52705 )	2020-02-24 13:46:51 +01:00
Martijn van Groningen	225d841212	Improve watcher test by preventing a npe when closing the http client.	2020-02-24 10:23:45 +01:00
Yang Wang	7cefba78c5	License removal leads back to a basic license (#52407 ) (#52683 ) A new basic license will be generated when existing license is deleted. In addition, deleting an existing basic license is a no-op. Resolves: #45022	2020-02-24 11:02:40 +11:00
Jason Tedor	1685cbe504	Add messages for CCR on license state changes (#52470 ) When a license expires, or license state changes, functionality might be disabled. This commit adds messages for CCR to inform users that CCR functionality will be disabled when a license expires, or when license state changes to a license level lower than trial/platinum/enterprise.	2020-02-22 09:09:42 -05:00
Benjamin Trent	afd90647c9	[ML] Adds feature importance to option to inference processor (#52218 ) (#52666 ) This adds machine learning model feature importance calculations to the inference processor. The new flag in the configuration matches the analytics parameter name: `num_top_feature_importance_values` Example: ``` "inference": { "field_mappings": {}, "model_id": "my_model", "inference_config": { "regression": { "num_top_feature_importance_values": 3 } } } ``` This will write to the document as follows: ``` "inference" : { "feature_importance" : { "FlightTimeMin" : -76.90955548511226, "FlightDelayType" : 114.13514762158526, "DistanceMiles" : 13.731580450792187 }, "predicted_value" : 108.33165831875137, "model_id" : "my_model" } ``` This is done through calculating the [SHAP values](https://arxiv.org/abs/1802.03888). It requires that models have populated `number_samples` for each tree node. This is not available to models that were created before 7.7. Additionally, if the inference config is requesting feature_importance, and not all nodes have been upgraded yet, it will not allow the pipeline to be created. This is to safe-guard in a mixed-version environment where only some ingest nodes have been upgraded. NOTE: the algorithm is a Java port of the one laid out in ml-cpp: https://github.com/elastic/ml-cpp/blob/master/lib/maths/CTreeShapFeatureImportance.cc usability blocked by: https://github.com/elastic/ml-cpp/pull/991	2020-02-21 18:42:31 -05:00
Jay Modi	8abfda0b59	Rename assertThrows to prevent naming clash (#52651 ) This commit renames ElasticsearchAssertions#assertThrows to assertRequestBuilderThrows and assertFutureThrows to avoid a naming clash with JUnit 4.13+ and static imports of these methods. Additionally, these methods have been updated to make use of expectThrows internally to avoid duplicating the logic there. Relates #51787 Backport of #52582	2020-02-21 13:30:11 -07:00
Jack Conradson	c4d91143ac	Mute RunDataFrameAnalyticsIT.testOutlierDetectionStopAndRestart Relates: #52654	2020-02-21 09:32:19 -08:00
Lisa Cawley	4ff78e8a00	[7.x][DOCS] Adds X-Pack usage API (#52592 )	2020-02-21 06:57:11 -08:00
Jay Modi	f3f6ff97ee	Single instance of the IndexNameExpressionResolver (#52604 ) This commit modifies the codebase so that our production code uses a single instance of the IndexNameExpressionResolver class. This change is being made in preparation for allowing name expression resolution to be augmented by a plugin. In order to remove some instances of IndexNameExpressionResolver, the single instance is added as a parameter of Plugin#createComponents and PersistentTaskPlugin#getPersistentTasksExecutor. Backport of #52596	2020-02-21 07:50:02 -07:00
Nik Everett	ed957f35a9	Cover missing case in top_metrics test (#52517 ) The top_metrics test assumed that it'd never end up only reducing unmapped results. But, rarely, it does. This handles that case in the test. Closes #52462	2020-02-21 09:49:17 -05:00

4914 Commits