OpenSearch

Commit Graph

Author	SHA1	Message	Date
Yang Wang	84a2f1adf2	Resolve anonymous roles and deduplicate roles during authentication (#53453 ) (#55995 ) Anonymous roles resolution and user role deduplication are now performed during authentication instead of authorization. The change ensures: * If anonymous access is enabled, user will be able to see the anonymous roles added in the roles field in the /_security/_authenticate response. * Any duplication in user roles are removed and will not show in the above authenticate response. * In any other case, the response is unchanged. It also introduces a behaviour change: the anonymous role resolution is now authentication node specific, previously it was authorization node specific. Details can be found at #47195 (comment)	2020-04-30 17:34:14 +10:00
Andrei Dan	6a0e1e161b	ILM stop step execution if writeIndex is false (#54805 ) (#55923 ) (cherry picked from commit 47a9fd760f7bf2cc6cd778485dc057b6aaf07709) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-04-29 13:39:37 +01:00
David Roberts	61ac09ae21	[ML] Add daily_model_snapshot_retention_after_days to job config (#55891 ) This change adds a new setting, daily_model_snapshot_retention_after_days, to the anomaly detection job config. Initially this has no effect, the effect will be added in a followup PR. This PR gets the complexities of making changes that interact with BWC over well before feature freeze. Backport of #55878	2020-04-29 09:12:53 +01:00
Larry Gregory	47d252424b	Backport: Deprecate the kibana reserved user (#54967 ) (#55822 )	2020-04-28 10:30:25 -04:00
Tal Levy	6ba5148ead	Add geo_shape support for the geo_centroid aggregation (#55602 ) (#55819 ) this commit leverages the new geo_shape doc values to register a new geo_centroid aggregator that works on geo_shape field.	2020-04-27 12:16:10 -07:00
Dimitris Athanasiou	7f100c1196	[7.x][ML] Allow analytics process define its own progress phases (#55763 ) (#55791 ) This is a continuation from #55580. Now that we're parsing phase progresses from the analytics process we change `ProgressTracker` to allow for custom phases between the `loading_data` and `writing_results` phases. Each `DataFrameAnalysis` may declare its own phases. This commit sets things in place for the analytics process to start reporting different phases per analysis type. However, this is still preserving existing behaviour as all analyses currently declare a single `analyzing` phase. Backport of #55763	2020-04-27 13:30:05 +03:00
David Roberts	3ba44a5af8	[ML] Adding failed_category_count to model_size_stats (#55761 ) The failed_category_count statistic records the number of times categorization wanted to create a new category but couldn't because the job had reached its model_memory_limit. Backport of #55716	2020-04-25 10:36:49 +01:00
Tanguy Leroux	41ddbd4188	Allow to prewarm the cache for searchable snapshot shards (#55322 ) Relates #50999	2020-04-24 18:03:34 +02:00
Jim Ferenczi	0a6c74b7d3	AsyncSearchMaintenanceService should stop when closing a node (#55651 ) This change turns the AsyncSearchMaintenanceService into an AbstractLifecycleComponent and ensures that the service is stopped when a node is closing. Closes #55646	2020-04-24 09:38:40 +02:00
Ryan Ernst	97c4b64fb1	Add isAllowed license utility (#55424 ) (#55700 ) License state is currently made up of boolean methods that check whether a particular feature is allowed by the current license state. Each new feature must copy/paste boilerplate code. While that has gotten easier with utilities like isAllowedByLicense, this is still more cumbersome than should be necessary. This commit adds a general purpose isAllowed method which takes a new Feature enum, where each value of the enum defines the minimum license mode and whether the license must be active to be allowed. Only security features are converted in this PR, in order to keep the commit size relatively small. The rest of the features will be converted in a followup.	2020-04-23 16:28:28 -07:00
jimczi	c857adf603	Fix AsyncSearchTaskTests#testWithFetchFailures Fix usage of a possible invalid random range [1, 0]. Relates #55688	2020-04-24 00:45:17 +02:00
Jim Ferenczi	31d1727698	Fix (de)serialization of async search failures (#55688 ) The (de)serialization code of the async search response cannot handle exceptions that extend ElasticsearchException (e.g. ScriptException). This commit fixes this bug by serializing the error with the more generic StreamInput#writeException.	2020-04-24 00:44:43 +02:00
Igor Motov	8c7ef2417f	Make AsyncSearchIndexService reusable (#55598 ) EQL will require very similar functionality to async search. This PR refactors AsyncSearchIndexService to make it reusable for EQL. Supersedes #55119 Relates to #49638	2020-04-23 18:02:17 -04:00
Dan Hermann	dd5c96c2ed	[7.x] Rollover for data streams	2020-04-23 12:04:34 -05:00
Rory Hunter	d66af46724	Always use deprecateAndMaybeLog for deprecation warnings (#55319 ) Backport of #55115. Replace calls to deprecate(String,Object...) with deprecateAndMaybeLog(...), with an appropriate key, so that all messages can potentially be deduplicated.	2020-04-23 09:20:54 +01:00
Albert Zaharovits	82ed0ab420	Update the audit logfile list of system users (#55578 ) Out of the box "access granted" audit events are not logged for system users. The list of system users was stale and included only the _system and _xpack users. This commit expands this list with _xpack_security and _async_search, effectively reducing the auditing noise by not logging the audit events of these system users out of the box. Closes #37924	2020-04-22 21:59:31 +03:00
Tal Levy	c370b83bd7	Fix locale lowercase test issue in GenerateSnapshotNameStepTests (#55597 ) (#55605 ) The testPerformAction test has been failing periodically due to how Hamcrest's containsStringIgnoringCase does not lowercase using the same Locale set in the test infrastructure. This commit falls back to explicitly lowercasing using the root locale	2020-04-22 11:29:57 -07:00
Benjamin Trent	7c81cd7833	[ML] explicitly disallow partial results in datafeed extractors (#55537 ) (#55585 ) Instead of doing our own checks against REST status, shard counts, and shard failures, this commit changes all our extractor search requests to set `.setAllowPartialSearchResults(false)`. - Scrolls are automatically cleared when a search failure occurs with `.setAllowPartialSearchResults(false)` set. - Code error handling is simplified closes https://github.com/elastic/elasticsearch/issues/40793	2020-04-22 09:07:44 -04:00
David Roberts	da5aeb8be7	[ML] Return assigned node in start/open job/datafeed response (#55570 ) Adds a "node" field to the response from the following endpoints: 1. Open anomaly detection job 2. Start datafeed 3. Start data frame analytics job If the job or datafeed is assigned to a node immediately then this field will return the ID of that node. In the case where a job or datafeed is opened or started lazily the node field will contain an empty string. Clients that want to test whether a job or datafeed was opened or started lazily can therefore check for this. Backport of #55473	2020-04-22 12:06:53 +01:00
Tim Vernum	8b566aea47	Fix use of password protected PKCS#8 keys for SSL (#55567 ) PEMUtils would incorrectly fill the encryption password with zeros (the '\0' character) after decrypting a PKCS#8 key. Since PEMUtils did not take ownership of this password it should not zero it out because it does not know whether the caller will use that password array again. This is actually what PEMKeyConfig does - it uses the key encryption password as the password for the ephemeral keystore that it creates in order to build a KeyManager. Backport of: #55457	2020-04-22 16:38:51 +10:00
Armin Braun	db7eb8e8ff	Remove Redundant CS Update on Snapshot Finalization (#55276 ) (#55528 ) This change folds the removal of the in-progress snapshot entry into setting the safe repository generation. Outside of removing an unnecessary cluster state update, this also has the advantage of removing a somewhat inconsistent cluster state where the safe repository generation points at `RepositoryData` that contains a finished snapshot while it is still in-progress in the cluster state, making it easier to reason about the state machine of upcoming concurrent snapshot operations.	2020-04-21 15:33:17 +02:00
David Turner	be60d50452	Allow searching of snapshot taken while indexing (#55511 ) Today a read-only engine requires a complete history of operations, in the sense that its local checkpoint must equal its maximum sequence number. This is a valid check for read-only engines that were obtained by closing an index since closing an index waits for all in-flight operations to complete. However a snapshot may not have this property if it was taken while indexing was ongoing, but that's ok. This commit weakens the check for a complete history to exclude the case of a searchable snapshot. Relates #50999	2020-04-21 13:21:38 +01:00
Jim Ferenczi	0b3bdfcc3e	Fix expiration time in async search response (#55435 ) This change ensures that we return the latest expiration time when retrieving the response from the index. This commit also fixes a bug that stops the garbage collection of saved responses if the async search index is deleted.	2020-04-21 14:04:29 +02:00
Przemysław Witek	59d377462f	Apply default timeout in StopDataFrameAnalyticsAction.Request (#55512 ) (#55517 )	2020-04-21 13:05:48 +02:00
Stuart Tettemer	93a2e9b0f9	Test: MockScoreScript can be cacheable. (#55499 ) Backport: 0ed1eb5	2020-04-20 17:09:58 -06:00
Benjamin Trent	cabff65aec	[ML] Fixing inference stats race condition (#55163 ) (#55486 ) `updateAndGet` could actually call the internal method more than once on contention. If I read the JavaDocs, it says: ```* @param updateFunction a side-effect-free function``` So, it could be getting multiple updates on contention, thus having a race condition where stats are double counted. To fix, I am going to use a `ReadWriteLock`. The `LongAdder` objects allow fast thread safe writes in high contention environments. These can be protected by the `ReadWriteLock::readLock`. When stats are persisted, I need to call reset on all these adders. This is NOT thread safe if additions are taking place concurrently. So, I am going to protect with `ReadWriteLock::writeLock`. This should prevent race conditions while allowing high (ish) throughput in the high-contention paths in inference. I did some simple throughput tests and this change is not significantly slower and is simpler to grok (IMO). closes https://github.com/elastic/elasticsearch/issues/54786	2020-04-20 16:21:18 -04:00
Przemysław Witek	7d5f74e964	Fix and unmute testSetUpgradeMode_ExistingTaskGetsUnassigned (#55368 ) (#55452 )	2020-04-20 13:29:29 +02:00
Jason Tedor	9ecb222bfa	Remove unneeded validation in feature set usage This validation is not needed, as we have discovered the source of the serialization error that was leading to some usage instances appearing to not have a name.	2020-04-18 14:29:59 -04:00
Jay Modi	405ff0ce27	Handle TLS file updates during startup (#55330 ) This change reworks the loading and monitoring of files that are used for the construction of SSLContexts so that updates to these files are not lost if the updates occur during startup. Previously, the SSLService would parse the settings, build the SSLConfiguration objects, and construct the SSLContexts prior to the SSLConfigurationReloader starting to monitor these files for changes. This allowed for a small window where updates to these files may never be observed until the node restarted. To remove the potential miss of a change to these files, the code now parses the settings and builds SSLConfiguration instances prior to the construction of the SSLService. The files backing the SSLConfiguration instances are then registered for monitoring and finally the SSLService is constructed from the previously parsed SSLConfiguration instances. As the SSLService is not constructed when the code starts monitoring the files for changes, a CompletableFuture is used to obtain a reference to the SSLService; this allows for construction of the SSLService to complete and ensures that we do not miss any file updates during the construction of the SSLService. While working on this change, the SSLConfigurationReloader was also refactored to reflect how it is currently used. When the SSLConfigurationReloader was originally written the files that it monitored could change during runtime. This is no longer the case as we stopped the monitoring of files that back dynamic SSLContext instances. In order to support the ability for items to change during runtime, the class made use of concurrent data structures. The use of these concurrent data structures has been removed. Closes #54867 Backport of #54999	2020-04-17 20:10:33 -06:00
Ryan Ernst	66071b2f6e	Remove combo security and license helper from license state (#55366 ) (#55417 ) Security features in the license state currently do a dynamic check on whether security is enabled. This is because the license level can change the default security enabled state. This commit splits out the check on security being enabled, so that the combo method of security enabled plus license allowed is no longer necessary.	2020-04-17 13:07:02 -07:00
William Brafford	49e30b15a2	Deprecate disabling basic-license features (#54816 ) (#55405 ) We believe there's no longer a need to be able to disable basic-license features completely using the "xpack..enabled" settings. If users don't want to use those features, they simply don't need to use them. Having such features always available lets us build more complex features that assume basic-license features are present. This commit deprecates settings of the form "xpack..enabled" for basic-license features, excluding "security", which is a special case. It also removes deprecated settings from integration tests and unit tests where they're not directly relevant; e.g. monitoring and ILM are no longer disabled in many integration tests.	2020-04-17 15:04:17 -04:00
Benjamin Trent	4be3663968	[7.x] [ML] fix bugs with prediction field value settings (#55333 ) (#55394 ) * [ML] fix bugs with prediction field value settings (#55333) This fixes two unreleased bugs: 1. Prediction value type of `number` might show unexpected classes Analytics created models may have class labels like `1, 5, 10` (or some collection of discrete, whole numbers). These labels are passed to the inference model config in the `classification_labels` field. When the predicted value format is `numeric` it should attempt to see if the classification labels are provided and are numeric. If so, use those. If not, use the underlying value. 2. When supplying an update overwrite, inference was losing the default prediction field value. This is because it was not copied over in the copy ctor in the ClassificationConfig.Builder class. closes #55332	2020-04-17 14:45:02 -04:00
Martijn van Groningen	417d5f2009	Make data streams in APIs resolvable. (#55337 ) Backport from: #54726 The INCLUDE_DATA_STREAMS indices option controls whether data streams can be resolved in an api for both concrete names and wildcard expressions. If data streams cannot be resolved then a 400 error is returned indicating that data streams cannot be used. In this pr, the INCLUDE_DATA_STREAMS indices option is enabled in the following APIs: search, msearch, refresh, index (op_type create only) and bulk (index requests with op type create only). In a subsequent change, we will determine which other APIs need to be able to resolve data streams and enable the INCLUDE_DATA_STREAMS indices option for these APIs. Whether an api resolves all backing indices of a data stream or the latest index of a data stream (write index) depends on the IndexNameExpressionResolver.Context.isResolveToWriteIndex(). If isResolveToWriteIndex() returns true then data streams resolve to the latest index (for example: index api) and otherwise a data stream resolves to all backing indices of a data stream (for example: search api). Relates to #53100	2020-04-17 08:33:37 +02:00
Jason Tedor	9a9c1a721c	Add validation to feature set usage name (#55350 ) We do not validate the name is not null, and not empty. Even though it never should be, we had a build failure where it appears that somehow this did happen. We add some validation here, in case this really is happening, we will have a more clear indication where this is coming from, and of course, validation that name fits the implicit assumptions that it is not null and not empty.	2020-04-16 18:16:53 -04:00
Mark Tozzi	22c55180c1	[7.x] Backport ValuesSourceRegistry and related work (#54922 ) * Add ValuesSource Registry and associated logic (#54281) * Remove ValuesSourceType argument to ValuesSourceAggregationBuilder (#48638) * ValuesSourceRegistry Prototype (#48758) * Remove generics from ValuesSource related classes (#49606) * fix percentile aggregation tests (#50712) * Basic thread safety for ValuesSourceRegistry (#50340) * Remove target value type from ValuesSourceAggregationBuilder (#49943) * Cleanup default values source type (#50992) * CoreValuesSourceType no longer implements Writable (#51276) * Remove genereics & hard coded ValuesSource references from Matrix Stats (#51131) * Put values source types on fields (#51503) * Remove VST Any (#51539) * Rewire terms agg to use new VS registry (#51182) Also adds some basic AggTestCases for untested code paths (and boilerplate for future tests once the IT are converted over) * Wire Cardinality aggregation to work with the ValuesSourceRegistry (#51337) * Wire Percentiles aggregator into new VS framework (#51639) This required a bit of a refactor to percentiles itself. Before, the Builder would switch on the chosen algo to generate an algo-specific factory. This doesn't work (or at least, would be difficult) in the new VS framework. This refactor consolidates both factories together and introduces a PercentilesConfig object to act as a standardized way to pass algo-specific parameters through the factory. This object is then used when deciding which kind of aggregator to create Note: CoreValuesSourceType.HISTOGRAM still lives in core, and will be moved in a subsequent PR. 
* Remove generics and target value type from MultiVSAB (#51647) * fix checkstyle after merge (#52008) * Plumb ValuesSourceRegistry through to QuerySearchContext (#51710) * Convert RareTerms to new VS registry (#52166) * Wire up Value Count (#52225) * Wire up Max & Min aggregations (#52219) * ValuesSource refactoring: Wire up Sum aggregation (#52571) * ValuesSource refactoring: Wire up SigTerms aggregation (#52590) * Soft immutability for VSConfig (#52729) * Unmute testSupportedFieldTypes, fix Percentiles/Ranks/Terms tests (#52734) Also fixes Percentiles which was incorrectly specified to only accept numeric, but in fact also accepts Boolean and Date (because those are numeric on master - thanks `testSupportedFieldTypes` for catching it!) * VS refactoring: Wire up stats aggregation (#52891) * ValuesSource refactoring: Wire up string_stats aggregation (#52875) * VS refactoring: Wire up median (MAD) aggregation (#52945) * fix valuesourcetype issue with constant_keyword field (#53041)x-pack/plugin/rollup/src/main/java/org/elasticsearch/xpack/rollup/job/RollupIndexer.java this commit implements `getValuesSourceType` for the ConstantKeyword field type. master was merged into feature/extensible-values-source introducing a new field type that was not implementing `getValuesSourceType`. * ValuesSource refactoring: Wire up Avg aggregation (#52752) * Wire PercentileRanks aggregator into new VS framework (#51693) * Add a VSConfig resolver for aggregations not using the registry (#53038) * Vs refactor wire up ranges and date ranges (#52918) * Wire up geo_bounds aggregation to ValuesSourceRegistry (#53034) This commit updates the geo_bounds aggregation to depend on registering itself in the ValuesSourceRegistry relates #42949. 
* VS refactoring: convert Boxplot to new registry (#53132) * Wire-up geotile_grid and geohash_grid to ValuesSourceRegistry (#53037) This commit updates the geo_grid aggregations to depend on registering itself in the ValuesSourceRegistry relates to the values-source refactoring meta issue #42949. Wire-up geo_centroid agg to ValuesSourceRegistry (#53040) This commit updates the geo_centroid aggregation to depend on registering itself in the ValuesSourceRegistry. relates to the values-source refactoring meta issue #42949. * Fix type tests for Missing aggregation (#53501) * ValuesSource Refactor: move histo VSType into XPack module (#53298) - Introduces a new API (`getBareAggregatorRegistrar()`) which allows plugins to register aggregations against existing agg definitions defined in Core. - This moves the histogram VSType over to XPack where it belongs. `getHistogramValues()` still remains as a Core concept - Moves the histo-specific bits over to xpack (e.g. the actual aggregator logic). This requires extra boilerplate since we need to create a new "Analytics" Percentile/Rank aggregators to deal with the histo field. Doubly-so since percentiles/ranks are extra boiler-plate'y... should be much lighter for other aggs * Wire up DateHistogram to the ValuesSourceRegistry (#53484) * Vs refactor parser cleanup (#53198) Co-authored-by: Zachary Tong <polyfractal@elastic.co> Co-authored-by: Zachary Tong <zach@elastic.co> Co-authored-by: Christos Soulios <1561376+csoulios@users.noreply.github.com> Co-authored-by: Tal Levy <JubBoy333@gmail.com> * First batch of easy fixes * Remove List.of from ValuesSourceRegistry Note that we intend to have a follow up PR dealing with the mutability of the registry, so I didn't even try to address that here. 
* More compiler fixes * More compiler fixes * More compiler fixes * Precommit is happy and so am I * Add new Core VSTs to tests * Disabled supported type test on SigTerms until we can backport its fix * fix checkstyle * Fix test failure from semantic merge issue * Fix some metaData->metadata replacements that got lost * Fix list of supported types for MinAggregator * Fix list of supported types for Avg * remove unused import Co-authored-by: Zachary Tong <polyfractal@elastic.co> Co-authored-by: Zachary Tong <zach@elastic.co> Co-authored-by: Christos Soulios <1561376+csoulios@users.noreply.github.com> Co-authored-by: Tal Levy <JubBoy333@gmail.com>	2020-04-16 16:54:46 -04:00
David Turner	7941f4a47e	Add RepositoriesService to createComponents() args (#54814 ) Today we pass the `RepositoriesService` to the searchable snapshots plugin during the initialization of the `RepositoryModule`, forcing the plugin to be a `RepositoryPlugin` even though it does not implement any repositories. After discussion we decided it best for now to pass this in via `Plugin#createComponents` instead, pending some future work in which plugins can depend on services more dynamically.	2020-04-16 16:27:36 +01:00
David Kyle	643ecf68b5	Remove InferenceConfigUpdate generic parameter (#55249 ) (#55301 ) Simplify the code by removing the generic type from InferenceConfigUpdate which meant wildcard types were used in many places. Instead check the class type is appropriate where used.	2020-04-16 13:44:53 +01:00
Ioannis Kakavas	ac87c10039	[7.x] Fix responses for the token APIs (#54532 ) (#55278 ) This commit fixes our behavior regarding the responses we return in various cases for the use of token related APIs. More concretely: - In the Get Token API with the `refresh` grant, when an invalid (already deleted, malformed, unknown) refresh token is used in the body of the request, we respond with `400` HTTP status code and an `error_description` header with the message "could not refresh the requested token". Previously we would erroneously return a `401` with "token malformed" message. - In the Invalidate Token API, when using an invalid (already deleted, malformed, unknown) access or refresh token, we respond with `404` and a body that shows that no tokens were invalidated: ``` { "invalidated_tokens":0, "previously_invalidated_tokens":0, "error_count":0 } ``` The previous behavior would be to erroneously return a `400` or `401` ( depending on the case ). - In the Invalidate Token API, when the tokens index doesn't exist or is closed, we return `400` because we assume this is a user issue either because they tried to invalidate a token when there is no tokens index yet ( i.e. no tokens have been created yet or the tokens index has been deleted ) or the index is closed. - In the Invalidate Token API, when the tokens index is unavailable, we return a `503` status code because we want to signal to the caller of the API that the token they tried to invalidate was not invalidated and we can't be sure if it is still valid or not, and that they should try the request again. Resolves: #53323	2020-04-16 14:05:55 +03:00
David Roberts	ac11dd619c	Only ship Linux binaries for the correct architecture (#55280 ) Following elastic/ml-cpp#1135 there are now Linux binaries for both x86_64 and aarch64. The code that finds the correct binaries to ship with each distribution was including both on every Linux distribution. This change alters that logic to consider the architecture as well as the operating system. Also, there is no need to disable ML on aarch64 now that we have the native binaries available. ML is still not supported on aarch64, but the processes at least run up and work at a superficial level. Backport of #55256	2020-04-16 09:45:52 +01:00
Jay Modi	2d9e3c7794	Start resource watcher service early (#55275 ) The ResourceWatcherService enables watching of files for modifications and deletions. During startup various consumers register the files that should be watched by this service. There is behavior that might be unexpected in that the service may not start polling until later in the startup process due to the use of lifecycle states to control when the service actually starts the jobs to monitor resources. This change removes this unexpected behavior so that upon construction the service has already registered its tasks to poll resources for changes. In making this modification, the service no longer extends AbstractLifecycleComponent and instead implements the Closeable interface so that the polling jobs can be terminated when the service is no longer required. Relates #54867 Backport of #54993	2020-04-15 20:45:39 -06:00
Jason Tedor	cad1a3b0ad	Fix imports in CCRFeatureSet This commit fixes some imports that were mixed up during a backport. Because, backports.	2020-04-15 19:37:25 -04:00
Jason Tedor	a18faacf1b	Make feature usage version aware (#55246 ) Today we indiscriminately serialize these independent of the version on the stream, even though the other side might not understand a new feature set usage that we have added. For example, if we add feature set usage in 7.7 for EQL, in a mixed cluster context if a request is sent to an old coordinating node, but the master is a new version, then it would attempt to serialize the usage information for the new feature back to the old coordinating node, who will blow up on the unrecognized named writeable. This commit addresses this by making feature usage version aware, and only serializing those that the other side would understand.	2020-04-15 19:24:47 -04:00
William Brafford	2ba3be9db6	Remove deprecated third-party methods from tests (#55255 ) (#55269 ) I've noticed that a lot of our tests are using deprecated static methods from the Hamcrest matchers. While this is not a big deal in any objective sense, it seems like a small good thing to reduce compilation warnings and be ready for a new release of the matcher library if we need to upgrade. I've also switched a few other methods in tests that have drop-in replacements.	2020-04-15 17:54:47 -04:00
Ryan Ernst	29b70733ae	Use task avoidance with forbidden apis (#55034 ) Currently forbidden apis accounts for 800+ tasks in the build. These tasks are aggressively created by the plugin. In forbidden apis 3.0, we will get task avoidance (https://github.com/policeman-tools/forbidden-apis/pull/162), but we need to ourselves use the same task avoidance mechanisms to not trigger these task creations. This commit does that for our forbidden apis usages, in preparation for upgrading to 3.0 when it is released.	2020-04-15 13:27:53 -07:00
Benjamin Trent	8ff2cbf1a3	[7.x] [ML] adding prediction_field_type to inference config (#55128 ) (#55230 ) * [ML] adding prediction_field_type to inference config (#55128) Data frame analytics dynamically determines the classification field type. This field type then dictates the encoded JSON that is written to Elasticsearch. Inference needs to know about this field type so that it may provide the EXACT SAME predicted values as analytics. Here is added a new field `prediction_field_type` which indicates the desired type. Options are: `string` (DEFAULT), `number`, `boolean` (where close_to(1.0) == true, false otherwise). Analytics provides the default `prediction_field_type` when the model is created from the process.	2020-04-15 09:45:22 -04:00
Armin Braun	2f91e2aab7	Fix Race in Snapshot Abort (#54873 ) (#55233 ) We can be a little more efficient when aborting a snapshot. Since we know the new repository data after finalizing the aborted snapshot we can pass it down to the snapshot completion listeners. This way, we don't have to fork off to the snapshot threadpool to get the repository data when the listener completes and can directly submit the delete task with high priority straight from the cluster state thread.	2020-04-15 15:42:15 +02:00
Hendrik Muhs	9ec9866acb	[Transform] simplify TransformConfigUpdate (#55224 ) removes the unnecessary ToXContent method in TransformConfigUpdate	2020-04-15 13:22:50 +02:00
Ioannis Kakavas	0f51934bcf	[7.x] Add support for more named curves (#55179 ) (#55211 ) We implicitly only supported the prime256v1 ( aka secp256r1 ) curve for the EC keys we read as PEM files to be used in any SSL Context. We would not fail when trying to read a key pair using a different curve but we would silently assume that it was using `secp256r1` which would lead to strange TLS handshake issues if the curve was actually another one. This commit fixes that behavior in that it supports parsing EC keys that use any of the named curves defined in rfc5915 and rfc5480 making no assumptions about whether the security provider in use supports them (JDK8 and higher support all the curves defined in rfc5480).	2020-04-15 12:33:40 +03:00
Igor Motov	1754e50cbd	[7.x] Add analytics plugin usage stats to _xpack/usage (#54911 ) (#55162 ) Adds analytics plugin usage stats to _xpack/usage. Closes #54847	2020-04-14 17:03:14 -04:00
Mark Vieira	ce85063653	[7.x] Re-add origin url information to publish POM files (#55173 )	2020-04-14 13:24:15 -07:00
David Turner	87e8367ece	Fix testCreateAndRestoreSearchableSnapshot (#55147 ) Fixes a couple of related failures in SearchableSnapshotsIntegTests. Firstly, we were not correctly accounting for the case where the cache was so small that some/all files were read directly; fixed this by only asserting that the cache is definitely used if the corresponding node has a cache that's large enough to hold the whole index. Secondly, we were not permitting shards to be completely empty, which might be the case (rarely) if there were not many documents indexed and the distribution of IDs was a bit unlucky; fixed this by asserting that we get stats for at least one file for the whole index, rather than for each shard separately. Closes #55126	2020-04-14 11:54:46 +01:00
Ryan Ernst	ae14d1661e	Replace license check isAuthAllowed with isSecurityEnabled (#54547 ) (#55082 ) The isAuthAllowed() method for license checking is used by code that wants to ensure security is both enabled and available. The enabled state is dynamic and provided by isSecurityEnabled(). But since security is available with all license types, a check on the license level is not necessary. Thus, this change replaces isAuthAllowed() with calling isSecurityEnabled().	2020-04-13 12:26:39 -07:00
Benjamin Trent	d32f6fed1d	[ML] inference only persist if there are stats (#54752 ) (#55121 ) We needlessly send documents to be persisted. If there are no stats added, then we should not attempt to persist them. Also, this PR fixes the race condition that caused issue: https://github.com/elastic/elasticsearch/issues/54786	2020-04-13 14:03:05 -04:00
Benjamin Trent	c5c7ee9d73	[7.x] [ML] Start gathering and storing inference stats (#53429 ) (#54738 ) * [ML] Start gathering and storing inference stats (#53429) This PR enables stats on inference to be gathered and stored in the `.ml-stats-` indices. Each node + model_id will have its own running stats document and these will later be summed together when returning _stats to the user. `.ml-stats-` is ILM managed (when possible). So, at any point the underlying index could change. This means that a stats document that is read in and then later updated will actually be a new doc in a new index. This complicates matters as this means that having a running knowledge of seq_no and primary_term is complicated and almost impossible. This is because we don't know the latest index name. We should also strive for throughput, as this code sits in the middle of an ingest pipeline (or even a query).	2020-04-13 08:15:46 -04:00
Albert Zaharovits	f22004a262	Preserve parent task id for data frame analytics (#55046 ) This change makes sure that all internal client requests spawned by the data frame analytics persistent task executor and that use the end user security credentials, have the parent task id assigned. The objective here is to permit auditing (as well as tracking for debugging purposes) of all the end-user requests executed on its behalf by persistent tasks. Because the data frame analytics task already implements graceful shutdown of child tasks, this change does not interfere with it by opting out of the persistent task cancellation of child tasks. Relates #54943 #52314	2020-04-10 22:27:21 +03:00
Jason Tedor	a370668fcc	Clean up even more instances of "metaData" We recently cleaned up the use of the word "metadata" across the codebase. Even more additional uses have trickled in, likely from in-progress work. This commit cleans up these last few additional instances. Relates #54519	2020-04-10 08:52:37 -04:00
Larry Gregory	8c8baa10f4	[Backport] Add reserved_ml_user and reserved_ml_admin kibana p… (#54837 ) * add reserved_ml_user and reserved_ml_admin kibana privileges * address feedback, update dataframe roles * fix checkstyle failure	2020-04-07 11:42:11 -04:00
Tanguy Leroux	4d36917e52	Merge feature/searchable-snapshots branch into 7.x (#54803 ) (#54825 ) This is a backport of #54803 for 7.x. This pull request cherry picks the squashed commit from #54803 with the additional commits: 6f50c92 which adjusts master code to 7.x a114549 to mute a failing ILM test (#54818) 48cbca1 and 50186b2 that clean up and fix the previous test aae12bb that adds a missing feature flag (#54861) 6f330e3 that adds missing serialization bits (#54864) bf72c02 that adjusts the version in YAML tests a51955f that adds some plumbing for the transport client used in integration tests Co-authored-by: David Turner <david.turner@elastic.co> Co-authored-by: Yannick Welsch <yannick@welsch.lu> Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com> Co-authored-by: Andrei Dan <andrei.dan@elastic.co>	2020-04-07 13:28:53 +02:00
Jim Ferenczi	d57a047ab7	Fix transport serialization of AsyncSearchUser (#54761 ) This change ensures that the AsyncSearchUser is correctly (de)serialized when an action executed by this user is sent to a remote node internally (via transport client).	2020-04-07 08:25:58 +02:00
Igor Motov	2794572a35	[7.x] Add Student's t-test aggregation support (#54469 ) (#54737 ) Adds t_test metric aggregation that can perform paired and unpaired two-sample t-tests. In this PR support for filters in unpaired is still missing. It will be added in a follow-up PR. Relates to #53692	2020-04-06 11:36:47 -04:00
Dimitris Athanasiou	0049e9467b	[7.x][ML] Fix node serialization on GET df-analytics stats without id (#54808 ) (#54812 ) Previously, the id of the `GetDataFrameAnalyticsStatsAction.Request` could be `null` which caused NPE on serialization as `writeString` is used (it doesn't accept null values). This commit ensures the id is never null. Closes #54807 Backport of #54808	2020-04-06 18:13:16 +03:00
Tim Vernum	30b01fe00d	Resolve SSO roles by pattern (#54777 ) This changes a SamlServiceProvider to have a function that maps from an "action-name" to set of role-names instead of a Map that does so. The on-disk representation of this mapping is a set of Java Regexp Patterns, for which the first matching group is the role name. For example "sso:(\w+)" would map any action that started with "sso:" to the corresponding role name (e.g. "sso:superuser" -> "superuser"). Backport of: #54440	2020-04-06 14:10:30 +10:00
Dimitris Athanasiou	e8c0351fd8	[7.x][ML] Allow force stopping failed and stopping DF analytics (#54650 ) (#54712 ) Force stopping a failed job used to work but it now puts the job in `stopping` state and hangs. In addition, force stopping a `stopping` job is not handled. This commit addresses those issues with force stopping data frame analytics. It aligns the approach with that followed for anomaly detection jobs. Backport of #54650	2020-04-03 16:08:06 +03:00
Julie Tibshirani	5fb7602227	Disallow changing 'enabled' on the root mapper. (#54681 ) In #33933 we disallowed changing the `enabled` parameter in object mappings. However, the fix didn't cover the root object mapper. This PR adjusts the change to also include the root mapper and clarifies the error message.	2020-04-02 15:28:48 -07:00
Benjamin Trent	7fe38935f6	[ML] add training_percent to analytics process params (#54605 ) (#54678 ) This adds training_percent parameter to the analytics process for Classification and Regression. This parameter is then used to give more accurate memory estimations. See native side pr: elastic/ml-cpp#1111	2020-04-02 17:08:06 -04:00
Benjamin Trent	4a1610265f	[7.x] [ML] add new inference_config field to trained model config (#54421 ) (#54647 ) * [ML] add new inference_config field to trained model config (#54421) A new field called `inference_config` is now added to the trained model config object. This new field allows for default inference settings from analytics or some external model builder. The inference processor can still override whatever is set as the default in the trained model config. * fixing for backport	2020-04-02 12:25:10 -04:00
Benjamin Trent	eb31be0e71	[7.x] [ML] add num_matches and preferred_to_categories to category definition objects (#54214 ) (#54639 ) * [ML] add num_matches and preferred_to_categories to category definition objects (#54214) This adds two new fields to category definitions. - `num_matches` indicating how many documents have been seen by this category - `preferred_to_categories` indicating which other categories this particular category supersedes when messages are categorized. These fields are only guaranteed to be up to date after a `_flush` or `_close` native change: https://github.com/elastic/ml-cpp/pull/1062 * adjusting for backport	2020-04-02 09:09:19 -04:00
Jason Tedor	f670ae0bc8	Introduce autoscaling policies (#54473 ) This commit is the first in a series of commits that introduces autoscaling policies, and APIs for working with them. For now, we introduce the basic infrastructure, and a single API for putting an autoscaling policy. We will follow in rapid succession with APIs for getting, and deleting autoscaling policies.	2020-04-01 08:12:26 -04:00
Jason Tedor	63e5f2b765	Rename META_DATA to METADATA This is a follow up to a previous commit that renamed MetaData to Metadata in all of the places. In that commit in master, we renamed META_DATA to METADATA, but lost this on the backport. This commit addresses that.	2020-03-31 17:30:51 -04:00
Jason Tedor	5fcda57b37	Rename MetaData to Metadata in all of the places (#54519 ) This is a simple naming change PR, to fix the fact that "metadata" is a single English word, and for too long we have not followed general naming conventions for it. We are also not consistent about it, for example, METADATA instead of META_DATA if we were trying to be consistent with MetaData (although METADATA is correct when considered in the context of "metadata"). This was a simple find and replace across the code base, only taking a few minutes to fix this naming issue forever.	2020-03-31 17:24:38 -04:00
Dimitris Athanasiou	e4230c533c	[7.x][ML] Move DFA MemoryUsage to stats.common pkg (#54492 ) (#54512 ) This belongs in stats.common Backport of #54492	2020-03-31 18:36:05 +03:00
Dimitris Athanasiou	b4b54efa73	[7.x][ML] Hyperparameter names should match config (#54401 ) (#54435 ) Java side of elastic/ml-cpp#1096 Backport of #54401	2020-03-30 23:32:40 +03:00
Ryan Ernst	c9421594bf	Remove allowTrial flag in license checking (#54293 ) The allowTrial flag is always true, since trial licenses act as though everything is licensed. This commit removes the allowTrial flag in license checking helper methods.	2020-03-30 12:22:38 -07:00
Nik Everett	e58ad9fed3	Clean up how pipeline aggs check for multi-bucket (backport of #54161 ) (#54379 ) Pipeline aggregations like `stats_bucket`, `sum_bucket`, and `percentiles_bucket` only operate on aggregations that have multiple buckets. This adds support for those aggregations to `geo_distance`, `ip_range`, `auto_date_histogram`, and `rare_terms`. This all happened because we used a marker interface to mark compatible aggs, `MultiBucketAggregationBuilder` and it was fairly easy to forget to implement the interface. This replaces the marker interface with an abstract method in `AggregationBuilder`, `bucketCardinality` which makes you return `NONE`, `ONE`, or `MANY`. The `bucket` aggregations can check for `MANY`. At this point `ONE` and `NONE` amount to about the same thing, but I suspect that'll be a useful distinction when validating bucket sorts. Closes #53215	2020-03-30 10:44:55 -04:00
Przemysław Witek	3c604da7f6	[7.x] Create an annotation when a model snapshot is stored (#53783 ) (#54405 )	2020-03-30 15:17:08 +02:00
Martijn van Groningen	4b4fbc160d	Refactor AliasOrIndex abstraction. (#54394 ) Backport of #53982 In order to prepare the `AliasOrIndex` abstraction for the introduction of data streams, the abstraction needs to be made more flexible, because currently it really can be only an alias or an index. * Renamed `AliasOrIndex` to `IndexAbstraction`. * Introduced an `IndexAbstraction.Type` enum to indicate what an `IndexAbstraction` instance is. * Replaced the `isAlias()` method that returns a boolean with the `getType()` method that returns the new Type enum. * Moved `getWriteIndex()` up from the `IndexAbstraction.Alias` to the `IndexAbstraction` interface. * Moved `getAliasName()` up from the `IndexAbstraction.Alias` to the `IndexAbstraction` interface and renamed it to `getName()`. * Removed unnecessary casting to `IndexAbstraction.Alias` by just checking the `getType()` method. Relates to #53100	2020-03-30 10:12:16 +02:00
Lee Hinman	f2cc2b1127	[7.x] Add REST APIs for IndexTemplateV2Metadata CRUD (#54039 ) (#54347 ) * Add REST APIs for IndexTemplateV2Metadata CRUD (#54039) * Add REST APIs for IndexTemplateV2Metadata CRUD This commit adds the get/put/delete APIs for interacting with the now v2 versions of index templates. These APIs are behind the existing `es.itv2_feature_flag_registered` system property feature flag. Relates to #53101 * Add exceptions for HLRC tests * Add skips for 7.x versions * Use index_template instead of template_v2 in action names * Add test for MetaDataIndexTemplateService.addIndexTemplateV2 * Move removal to static method and add test * Add unit tests for request classes (implement hashCode & equals) Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> * Fix compilation Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-03-27 10:47:22 -06:00
Christoph Büscher	0d17295601	[Docs] Minor fix for SubmitAsyncSearchRequest.keepOnCompletion javadoc (#54325 ) The semantics and the default value for this parameter have changed, adapting the javadoc accordingly.	2020-03-27 16:02:03 +01:00
Przemysław Witek	2eb079b67f	Add version guards around ML hidden indices settings (#54322 )	2020-03-27 14:50:57 +01:00
Przemysław Witek	d40afc7871	[7.x] Do not fail Evaluate API when the actual and predicted fields' types differ (#54255 ) (#54319 )	2020-03-27 10:05:19 +01:00
Hendrik Muhs	4ecf9904d5	[Transform] Transform optimize date histogram (#54068 ) Optimize transform for group_by on date_histogram by injecting an additional range query. This limits the number of search and index requests and avoids unnecessary updates. Only recent buckets get re-written. Fixes #54254	2020-03-26 21:39:50 +01:00
Gordon Brown	0d30b48613	Disallow negative TimeValues (#53913 ) This commit causes negative TimeValues, other than -1 which is sometimes used as a sentinel value, to be rejected during parsing. Also introduces a hack to allow ILM to load policies which were written to the cluster state with a negative min_age, treating those values as 0, which should match the behavior of prior versions.	2020-03-26 13:30:35 -06:00
Dimitris Athanasiou	13368aae37	[7.x][ML] DF Analytics should always display operational stats (#54210 ) (#54290 ) This commit populates the _stats API response with sensible "empty" `data_counts` and `memory_usage` objects when the job itself has not started reporting them. Backport of #54210	2020-03-26 20:03:14 +02:00
Dimitris Athanasiou	cc981fa377	[7.x][ML] Get ML filters size should default to 100 (#54207 ) (#54278 ) When get filters is called without setting the `size` parameter only up to 10 filters are returned. However, 100 filters should be returned. This commit fixes this and adds an integ test to guard it. It seems this was accidentally broken in #39976. Closes #54206 Backport of #54207	2020-03-26 17:51:43 +02:00
Luca Cavanna	ff269160af	Async search: rename REST parameters (#54198 ) This commit renames wait_for_completion to wait_for_completion_timeout in submit async search and get async search. Also it renames clean_on_completion to keep_on_completion and turns around its behaviour. Closes #54069	2020-03-26 09:40:50 +01:00
Yang Wang	1afd510721	Check authentication type using enum instead of string (#54145 ) (#54246 ) Avoid string comparison when we can use safer enums. This refactor is a follow up for #52178. Resolves: #52511	2020-03-26 15:45:10 +11:00
Ryan Ernst	5a5d6e9ef2	Invert license security disabled helper method (#54043 ) (#54239 ) Xpack license state contains a helper method to determine whether security is disabled due to license level defaults. Most code needs to know whether security is enabled, not disabled, but this method exists so that the security being explicitly disabled can be distinguished from license level defaulting to disabled. However, in the case that security is explicitly disabled, the handlers in question are never registered, so security is implicitly not disabled explicitly, and thus we can share a single method to know whether licensing is enabled.	2020-03-25 19:20:10 -07:00
Jason Tedor	381d7586e4	Introduce formal role for remote cluster client (#54138 ) This commit introduces a formal role for identifying nodes that are capable of making connections to remote clusters. Relates #53924	2020-03-24 21:59:43 -04:00
Oliver Gupte	96f0c668a8	[APM] Allow kibana to collect APM telemetry in background task (#52917 ) (#54106 ) * Required for elastic/kibana#50757. Allows the kibana user to collect APM telemetry in a background task. * removed unnecessary privileges on `.ml-anomalies-*` for the `kibana_system` reserved role	2020-03-24 18:11:19 -07:00
Ioannis Kakavas	7c0123d6f3	Add SAML IdP plugin for internal use (#54046 ) (#54124 ) This change merges the "feature-internal-idp" branch into Elasticsearch. This introduces a small identity-provider plugin as a child of the x-pack module. This allows ES to act as a SAML IdP, for users who are authenticated against the Elasticsearch cluster. This feature is intended for internal use within Elastic Cloud environments and is not supported for any other use case. It falls under an enterprise license tier. The IdP is disabled by default. Co-authored-by: Ioannis Kakavas <ioannis@elastic.co> Co-authored-by: Tim Vernum <tim.vernum@elastic.co>	2020-03-25 09:45:13 +11:00
Dimitris Athanasiou	c141c1dd89	[7.x][ML] Stratified cross validation split for classification (#54087 ) (#54104 ) As classification now works for multiple classes, randomly picking training/test data frame rows is not good enough. This commit introduces a stratified cross validation splitter that maintains the proportion of each class in the dataset in the sample that is used for training the model. Backport of #54087	2020-03-24 18:47:36 +02:00
Yannick Welsch	e006d1f6cf	Use special XContent registry for node tool (#54050 ) Fixes an issue where the elasticsearch-node command-line tools would not work correctly because PersistentTasksCustomMetaData contains named XContent from plugins. This PR makes it so that the parsing for all custom metadata is skipped, even if the core system would know how to handle it. Closes #53549	2020-03-24 17:40:51 +01:00
Luca Cavanna	6b457abbd3	Async search: prevent users from overriding pre_filter_shard_size (#54088 ) Submit async search forces pre_filter_shard_size for the underlying search that it creates. With this commit we also prevent users from overriding such default as part of request validation.	2020-03-24 17:06:04 +01:00
Luca Cavanna	3c67762f1b	Async search response: output start and expiration time as time fields (#54084 ) This commit makes start_time and expiration_time time fields, so that their date variant will be printed out when human readable output is requested.	2020-03-24 17:05:56 +01:00
Jim Ferenczi	0330bef409	Improve async search's tasks cancellation (#53799 ) This commit adds an explicit cancellation of the search task if the initial async search submit task is cancelled (connection closed by the user). This was previously done through the cancellation of the parent task but we don't handle grand-children cancellation yet so we have to manually cancel the search task in order to ensure that shard actions are cancelled too. This change can be considered as a workaround until #50990 is fixed.	2020-03-24 15:51:10 +01:00
David Roberts	1421471556	[ML] Introduce a "starting" datafeed state for lazy jobs (#54065 ) It is possible for ML jobs to open lazily if the "allow_lazy_open" option in the job config is set to true. Such jobs wait in the "opening" state until a node has sufficient capacity to run them. This commit fixes the bug that prevented datafeeds for jobs lazily waiting assignment from being started. The state of such datafeeds is "starting", and they can be stopped by the stop datafeed API while in this state with or without force. Backport of #53918	2020-03-24 13:00:04 +00:00
Peter Schretlen	92acb2859b	Allow kibana_system to create and invalidate API keys on behalf of other users	2020-03-24 08:38:12 -04:00
Yang Wang	d33d20bfdc	Validate role templates before saving role mapping (#52636 ) (#54059 ) Role names are now compiled from role templates before role mapping is saved. This serves as validation for role templates to prevent malformed and invalid scripts from being persisted, which could later break authentication. Resolves: #48773	2020-03-24 20:43:59 +11:00
Dimitris Athanasiou	5ce7c99e74	[7.x][ML] Data frame analytics data counts (#53998 ) (#54031 ) This commit instruments data frame analytics with stats for the data that are being analyzed. In particular, we count training docs, test docs, and skipped docs. In order to account docs with missing values as skipped docs for analyses that do not support missing values, this commit changes the extractor so that it only ignores docs with missing values when it collects the data summary, which is used to estimate memory usage. Backport of #53998	2020-03-24 11:30:43 +02:00
Hendrik Muhs	7dcacf531f	[7.x][Transform][Rollup] add processing stats to record the ti… (#54027 ) add 2 additional stats: processing time and processing total which capture the time spent for processing results and how often it ran. The 2 new stats correspond to the existing indexing and search stats. Together with indexing and search this now allows the user to see the full picture, all 3 stages.	2020-03-24 09:22:02 +01:00

1830 Commits