OpenSearch

Commit Graph

Author	SHA1	Message	Date
Hendrik Muhs	a9425a0240	[7.x][Transform] fix count when matching exact ids(#56544 ) (#56582 ) fix count in get and get stats if explicit ids are given and ids might be duplicated when configuration are stored in different index (versions). fixes #56196	2020-05-12 14:23:13 +02:00
Ignacio Vera	222ee721ec	Add moving percentiles pipeline aggregation (#55441 ) (#56575 ) Similar to what the moving function aggregation does, except merging windows of percentiles sketches together instead of cumulatively merging final metrics	2020-05-12 11:35:23 +02:00
Ryan Ernst	902fc546bd	Migrate remaining ESIntegTestCases to internalClusterTest (#56479 ) (#56563 ) This commit migrates the ESIntegTestCase tests in x-pack to the internalClusterTest source set.	2020-05-11 21:06:04 -07:00
Tim Brooks	760ab726c2	Share netty event loops between transports (#56553 ) Currently Elasticsearch creates independent event loop groups for each transport type (http and internal). This is unnecessary and can lead to contention when different threads access shared resources (ex: allocators). This commit moves to a model where, by default, the event loops are shared between the transports. The previous behavior can be attained by specifically setting the http worker count.	2020-05-11 15:43:43 -06:00
Benjamin Trent	1d6b2f074e	[Transform] adds geotile_grid support in group_by (#56514 ) (#56549 ) This adds support for grouping by geo points. This uses the agg [geotile_grid](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-geotilegrid-aggregation.html). I am opting to store the tile results of group_by as a `geo_shape` so that users can query the results. Additionally, the shapes could be visualized and filtered in the kibana maps app. relates to https://github.com/elastic/elasticsearch/issues/56121	2020-05-11 17:02:40 -04:00
Jim Ferenczi	02ab9112a9	Fix spurious failures in AsyncSearchIntegTestCase (#56026 ) Async search integration tests are subject to random failures when: * The test index has more than one replica. * The request cache is used. * Some shards are empty. * The maintenance service starts a garbage collection when node is closing. They are also slow because the test index is created/populated on each test method. This change refactors these integration tests in order to: * Create the index once for the entire test suite. * Fix the usage of the request cache and replicas. * Ensures that all shards have at least one document. * Increase the delay of the maintenance service garbage collection. Closes #55895 Closes #55988	2020-05-11 15:03:03 +02:00
Dimitris Athanasiou	60b1c67409	[7.x][ML] Allow stopping DF analytics whose config is missing (#56360 ) (#56408 ) It is possible that the config document for a data frame analytics job is deleted from the config index. If that is the case the user is unable to stop a running job because we attempt to retrieve the config and that will throw. This commit changes that. When the request is forced, we do not expand the requested ids based on the existing configs but from the list of running tasks instead. Backport of #56360	2020-05-08 13:54:44 +03:00
Dimitris Athanasiou	d064eda2b0	[7.x][ML] Ensure phase progress may only increase (#56339 ) (#56357 ) Due to multi-threading it is possible that phase progress updates written from the c++ process arrive reordered. We can address this by ensuring that progress may only increase. Closes #56282 Backport of #56339	2020-05-07 19:46:58 +03:00
Przemysław Witek	0cd0ab276e	Introduce Annotation.Builder class and use it to create instances of Annotation class (#56276 ) (#56286 )	2020-05-06 20:47:03 +02:00
Tal Levy	e4f2c3105d	Add geo_shape support for geotile_grid and geohash_grid (#55966 ) (#56228 ) this commit adds aggregation support for the geo_shape field type on geo*_grid aggregations. it introduces a Tiler for both tiles and hashes that enables a new type of ValuesSource to replace the GeoPoint's CellIdSource. This makes it possible for the existing Aggregator to be re-used, so no new implementations of the grid aggregators are added.	2020-05-05 09:54:14 -07:00
William Brafford	3499fa917c	Deprecated xpack "enable" settings should be no-ops (#55416 ) (#56167 ) The following settings are now no-ops: * xpack.flattened.enabled * xpack.logstash.enabled * xpack.rollup.enabled * xpack.slm.enabled * xpack.sql.enabled * xpack.transform.enabled * xpack.vectors.enabled Since these settings no longer need to be checked, we can remove settings parameters from a number of constructors and methods, and do so in this commit. We also update documentation to remove references to these settings.	2020-05-05 10:40:49 -04:00
David Roberts	7aa0daaabd	[7.x][ML] More advanced model snapshot retention options (#56194 ) This PR implements the following changes to make ML model snapshot retention more flexible in advance of adding a UI for the feature in an upcoming release. - The default for `model_snapshot_retention_days` for new jobs is now 10 instead of 1 - There is a new job setting, `daily_model_snapshot_retention_after_days`, that defaults to 1 for new jobs and `model_snapshot_retention_days` for pre-7.8 jobs - For days that are older than `model_snapshot_retention_days`, all model snapshots are deleted as before - For days that are in between `daily_model_snapshot_retention_after_days` and `model_snapshot_retention_days` all but the first model snapshot for that day are deleted - The `retain` setting of model snapshots is still respected to allow selected model snapshots to be retained indefinitely Backport of #56125	2020-05-05 14:31:58 +01:00
Dimitris Athanasiou	2d7899c83c	[7.x][ML] Adjust DF Analytics process phases (#56107 ) (#56177 ) As of elastic/ml-cpp#1179, the analytics process reports phases depending on the analysis type. This commit adjusts the phases of current analyses from `analyzing` to the following: - outlier_detection: [`computing_outlier`] - regression/classification: [`feature_selection`, `coarse_parameter_search`, `fine_tuning_parameters`, `final_training`] Backport of #56107	2020-05-05 15:00:07 +03:00
Dimitris Athanasiou	75dadb7a6d	[7.x][ML] Add loss_function to regression (#56118 ) (#56187 ) Adds parameters `loss_function` and `loss_function_parameter` to regression. Backport of #56118	2020-05-05 14:59:51 +03:00
Hendrik Muhs	e177a38504	[7.x][Transform] add throttling (#56007 ) (#56184 ) add throttling to transform, throttling will slow down search requests by delaying the execution based on a documents per second metric. fixes #54862	2020-05-05 13:09:02 +02:00
Martijn van Groningen	2ac32db607	Move includeDataStream flag from IndicesOptions to IndexNameExpressionResolver.Context (#56151 ) Backport of #56034. Move includeDataStream flag from an IndicesOptions to IndexNameExpressionResolver.Context as a dedicated field that callers to IndexNameExpressionResolver can set. Also alter indices stats api to support data streams. The rollover api uses this api and otherwise rolling over a data stream no longer works. Relates to #53100	2020-05-04 22:38:33 +02:00
Martijn van Groningen	6d03081560	Add auto create action (#56122 ) Backport of #55858 to 7.x branch. Currently the TransportBulkAction detects whether an index is missing and then decides whether it should be auto created. The coordination of the index creation also happens in the TransportBulkAction on the coordinating node. This change adds a new transport action that the TransportBulkAction delegates to if missing indices need to be created. The reasons for this change: * Auto creation of data streams can't occur on the coordinating node. Based on the index template (v2) either a regular index or a data stream should be created. However if the coordinating node is slow in processing cluster state updates then it may be unaware of the existence of certain index templates, which then can lead to the TransportBulkAction creating an index instead of a data stream. Therefore the coordination of creating an index or data stream should occur on the master node. See #55377 * From a security perspective it is useful to know whether index creation originates from the create index api or from auto creating a new index via the bulk or index api. For example a user would be allowed to auto create an index, but not to use the create index api. The auto create action will allow security to distinguish these two different patterns of index creation. This change adds the following new transport actions: AutoCreateAction, the TransportBulkAction redirects to this action and this action will actually create the index (instead of the TransportCreateIndexAction). Later via #55377, can improve the AutoCreateAction to also determine whether an index or data stream should be created. The create_index index privilege is also modified, so that if this permission is granted then a user is also allowed to auto create indices. This change does not yet add an auto_create index privilege. A future change can introduce this new index privilege or modify an existing index / write index privilege. Relates to #53100	2020-05-04 19:10:09 +02:00
Dimitris Athanasiou	76fa5a2397	[7.x][ML] Improve cleanup for DF Analytics HLRC tests (#56101 ) (#56109 ) Adds the step of stopping all data frame analytics before deleting them to the cleanup of the corresponding HLRC tests. Closes #56097 Backport of #56101	2020-05-04 16:08:08 +03:00
Przemysław Witek	44f5a8ccd3	Use snapshot's latest result time rather than snapshot's creation time when creating an annotation (#56093 ) (#56103 )	2020-05-04 12:36:12 +02:00
Armin Braun	0860d1dc74	Remove Dead Code in SLM Delete Handling (#56081 ) (#56098 ) The delete response is always acknowledged. No need to handle anything else.	2020-05-04 12:22:06 +02:00
Armin Braun	3a64ecb6bf	Allow Deleting Multiple Snapshots at Once (#55474 ) (#56083 ) * Allow Deleting Multiple Snapshots at Once (#55474) Adds deleting multiple snapshots in one go without significantly changing the mechanics of snapshot deletes otherwise. This change does not yet allow mixing snapshot delete and abort. Abort is still only allowed for a single snapshot delete by exact name.	2020-05-03 20:30:58 +02:00
William Brafford	d53c941c41	Make xpack.monitoring.enabled setting a no-op (#55617 ) (#56061 ) * Make xpack.monitoring.enabled setting a no-op This commit turns xpack.monitoring.enabled into a no-op. Mostly, this involved removing the setting from the setup for integration tests. Monitoring may introduce some complexity for test setup and teardown, so we should keep an eye out for turbulence and failures * Docs for making deprecated setting a no-op	2020-05-01 16:42:11 -04:00
Ryan Ernst	52b9d8d15e	Convert remaining license methods to isAllowed (#55908 ) (#55991 ) This commit converts the remaining isXXXAllowed methods to instead use isAllowed with a Feature value. There are a couple other methods that are static, as well as some licensed features that check the license directly, but those will be dealt with in other followups.	2020-04-30 15:52:22 -07:00
Igor Motov	d8f9df771d	Expose agg usage in Feature Usage API (#55732 ) (#56048 ) Counts usage of the aggs and exposes them on the _nodes/usage/. Closes #53746	2020-04-30 12:53:36 -04:00
Benjamin Trent	04b1f6498b	[ML] using new fixed interval in ml tests (#56021 ) (#56031 ) This commit removes deprecated references to DateHistogram.interval from ml tests	2020-04-30 10:26:39 -04:00
William Brafford	273ff6a105	Make xpack.ilm.enabled setting a no-op (#55592 ) (#55980 ) * Make xpack.ilm.enabled setting a no-op * Add watcher setting to not use ILM * Update documentation for no-op setting * Remove NO_ILM ml index templates * Remove unneeded setting from test setup * Inline variable definitions for ML templates * Use identical parameter names in templates * New ILM/watcher setting falls back to old setting * Add fallback unit test for watcher/ilm setting	2020-04-30 09:50:18 -04:00
Hendrik Muhs	d3bcef2962	[7.x][Transform] implement throttling in indexer (#55011 ) (#56002 ) implement throttling in async-indexer used by rollup and transform. The added docs_per_second parameter is used to calculate a delay before the next search request is sent. With re-throttle it's possible to change the parameter at runtime. When stopping a running job, it is ensured that despite throttling the indexer stops in reasonable time. This change contains the groundwork, but does not expose the new functionality. relates #54862 backport: #55011	2020-04-30 11:20:35 +02:00
Ioannis Kakavas	3c7c9573b4	Fix PemKeyConfigTests (#55577 ) (#55996 ) We were creating PemKeyConfig objects using different private keys but always using testnode.crt certificate that uses the RSA public key. The PemKeyConfig was built but we would then later fail to handle SSL connections during the TLS handshake either way. This became obvious in FIPS tests where the consistency checks that FIPS 140 mandates kick in and failed early because the private key was of different type than the public key	2020-04-30 12:05:27 +03:00
Yang Wang	84a2f1adf2	Resolve anonymous roles and deduplicate roles during authentication (#53453 ) (#55995 ) Anonymous roles resolution and user role deduplication are now performed during authentication instead of authorization. The change ensures: * If anonymous access is enabled, user will be able to see the anonymous roles added in the roles field in the /_security/_authenticate response. * Any duplication in user roles is removed and will not show in the above authenticate response. * In any other case, the response is unchanged. It also introduces a behaviour change: the anonymous role resolution is now authentication node specific, previously it was authorization node specific. Details can be found at #47195 (comment)	2020-04-30 17:34:14 +10:00
Andrei Dan	6a0e1e161b	ILM stop step execution if writeIndex is false (#54805 ) (#55923 ) (cherry picked from commit 47a9fd760f7bf2cc6cd778485dc057b6aaf07709) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-04-29 13:39:37 +01:00
David Roberts	61ac09ae21	[ML] Add daily_model_snapshot_retention_after_days to job config (#55891 ) This change adds a new setting, daily_model_snapshot_retention_after_days, to the anomaly detection job config. Initially this has no effect, the effect will be added in a followup PR. This PR gets the complexities of making changes that interact with BWC over well before feature freeze. Backport of #55878	2020-04-29 09:12:53 +01:00
Larry Gregory	47d252424b	Backport: Deprecate the kibana reserved user (#54967 ) (#55822 )	2020-04-28 10:30:25 -04:00
Tal Levy	6ba5148ead	Add geo_shape support for the geo_centroid aggregation (#55602 ) (#55819 ) this commit leverages the new geo_shape doc values to register a new geo_centroid aggregator that works on geo_shape field.	2020-04-27 12:16:10 -07:00
Dimitris Athanasiou	7f100c1196	[7.x][ML] Allow analytics process define its own progress phases (#55763 ) (#55791 ) This is a continuation from #55580. Now that we're parsing phase progresses from the analytics process we change `ProgressTracker` to allow for custom phases between the `loading_data` and `writing_results` phases. Each `DataFrameAnalysis` may declare its own phases. This commit sets things in place for the analytics process to start reporting different phases per analysis type. However, this is still preserving existing behaviour as all analyses currently declare a single `analyzing` phase. Backport of #55763	2020-04-27 13:30:05 +03:00
David Roberts	3ba44a5af8	[ML] Adding failed_category_count to model_size_stats (#55761 ) The failed_category_count statistic records the number of times categorization wanted to create a new category but couldn't because the job had reached its model_memory_limit. Backport of #55716	2020-04-25 10:36:49 +01:00
Tanguy Leroux	41ddbd4188	Allow to prewarm the cache for searchable snapshot shards (#55322 ) Relates #50999	2020-04-24 18:03:34 +02:00
Jim Ferenczi	0a6c74b7d3	AsyncSearchMaintenanceService should stop when closing a node (#55651 ) This change turns the AsyncSearchMaintenanceService into an AbstractLifecycleComponent and ensures that the service is stopped when a node is closing. Closes #55646	2020-04-24 09:38:40 +02:00
Ryan Ernst	97c4b64fb1	Add isAllowed license utility (#55424 ) (#55700 ) License state is currently made up of boolean methods that check whether a particular feature is allowed by the current license state. Each new feature must copy/paste boilerplate code. While that has gotten easier with utilities like isAllowedByLicense, this is still more cumbersome than should be necessary. This commit adds a general purpose isAllowed method which takes a new Feature enum, where each value of the enum defines the minimum license mode and whether the license must be active to be allowed. Only security features are converted in this PR, in order to keep the commit size relatively small. The rest of the features will be converted in a followup.	2020-04-23 16:28:28 -07:00
jimczi	c857adf603	Fix AsyncSearchTaskTests#testWithFetchFailures Fix usage of a possible invalid random range [1, 0]. Relates #55688	2020-04-24 00:45:17 +02:00
Jim Ferenczi	31d1727698	Fix (de)serialization of async search failures (#55688 ) The (de)serialization code of the async search response cannot handle exceptions that extend ElasticsearchException (e.g. ScriptException). This commit fixes this bug by serializing the error with the more generic StreamInput#writeException.	2020-04-24 00:44:43 +02:00
Igor Motov	8c7ef2417f	Make AsyncSearchIndexService reusable (#55598 ) EQL will require very similar functionality to async search. This PR refactors AsyncSearchIndexService to make it reusable for EQL. Supersedes #55119 Relates to #49638	2020-04-23 18:02:17 -04:00
Dan Hermann	dd5c96c2ed	[7.x] Rollover for data streams	2020-04-23 12:04:34 -05:00
Rory Hunter	d66af46724	Always use deprecateAndMaybeLog for deprecation warnings (#55319 ) Backport of #55115. Replace calls to deprecate(String,Object...) with deprecateAndMaybeLog(...), with an appropriate key, so that all messages can potentially be deduplicated.	2020-04-23 09:20:54 +01:00
Albert Zaharovits	82ed0ab420	Update the audit logfile list of system users (#55578 ) Out of the box "access granted" audit events are not logged for system users. The list of system users was stale and included only the _system and _xpack users. This commit expands this list with _xpack_security and _async_search, effectively reducing the auditing noise by not logging the audit events of these system users out of the box. Closes #37924	2020-04-22 21:59:31 +03:00
Tal Levy	c370b83bd7	Fix locale lowercase test issue in GenerateSnapshotNameStepTests (#55597 ) (#55605 ) The testPerformAction test has been failing periodically due to how Hamcrest's containsStringIgnoringCase does not lowercase using the same Locale set in the test infrastructure. This commit falls back to explicitly lowercasing using the root locale	2020-04-22 11:29:57 -07:00
Benjamin Trent	7c81cd7833	[ML] explicitly disallow partial results in datafeed extractors (#55537 ) (#55585 ) Instead of doing our own checks against REST status, shard counts, and shard failures, this commit changes all our extractor search requests to set `.setAllowPartialSearchResults(false)`. - Scrolls are automatically cleared when a search failure occurs with `.setAllowPartialSearchResults(false)` set. - Code error handling is simplified closes https://github.com/elastic/elasticsearch/issues/40793	2020-04-22 09:07:44 -04:00
David Roberts	da5aeb8be7	[ML] Return assigned node in start/open job/datafeed response (#55570 ) Adds a "node" field to the response from the following endpoints: 1. Open anomaly detection job 2. Start datafeed 3. Start data frame analytics job If the job or datafeed is assigned to a node immediately then this field will return the ID of that node. In the case where a job or datafeed is opened or started lazily the node field will contain an empty string. Clients that want to test whether a job or datafeed was opened or started lazily can therefore check for this. Backport of #55473	2020-04-22 12:06:53 +01:00
Tim Vernum	8b566aea47	Fix use of password protected PKCS#8 keys for SSL (#55567 ) PEMUtils would incorrectly fill the encryption password with zeros (the '\0' character) after decrypting a PKCS#8 key. Since PEMUtils did not take ownership of this password it should not zero it out because it does not know whether the caller will use that password array again. This is actually what PEMKeyConfig does - it uses the key encryption password as the password for the ephemeral keystore that it creates in order to build a KeyManager. Backport of: #55457	2020-04-22 16:38:51 +10:00
Armin Braun	db7eb8e8ff	Remove Redundant CS Update on Snapshot Finalization (#55276 ) (#55528 ) This change folds the removal of the in-progress snapshot entry into setting the safe repository generation. Outside of removing an unnecessary cluster state update, this also has the advantage of removing a somewhat inconsistent cluster state where the safe repository generation points at `RepositoryData` that contains a finished snapshot while it is still in-progress in the cluster state, making it easier to reason about the state machine of upcoming concurrent snapshot operations.	2020-04-21 15:33:17 +02:00
David Turner	be60d50452	Allow searching of snapshot taken while indexing (#55511 ) Today a read-only engine requires a complete history of operations, in the sense that its local checkpoint must equal its maximum sequence number. This is a valid check for read-only engines that were obtained by closing an index since closing an index waits for all in-flight operations to complete. However a snapshot may not have this property if it was taken while indexing was ongoing, but that's ok. This commit weakens the check for a complete history to exclude the case of a searchable snapshot. Relates #50999	2020-04-21 13:21:38 +01:00
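
The model snapshot retention rules listed in commit 7aa0daaabd can be read as a small decision procedure. The following is a minimal Python sketch of that reading, not the Elasticsearch implementation. The `Snapshot` class, the `snapshots_to_delete` function, the explicit `now` reference time, the grouping of snapshots by whole days of age, the boundary handling (`>=`), and the choice that retained snapshots do not count as a day's first snapshot are all assumptions made for illustration; only the two setting names and their defaults for new jobs (10 and 1) come from the commit message.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical stand-in for a model snapshot record."""
    snapshot_id: str
    timestamp: datetime
    retain: bool = False


def snapshots_to_delete(snapshots, now,
                        model_snapshot_retention_days=10,
                        daily_model_snapshot_retention_after_days=1):
    """Return ids of snapshots that the described rules would delete.

    Assumption: age is measured in whole days relative to `now`, and
    "day" means one such whole-day age bucket.
    """
    to_delete = []
    days_with_first_snapshot = set()
    # Oldest first, so the earliest snapshot in each day bucket is kept.
    for snap in sorted(snapshots, key=lambda s: s.timestamp):
        if snap.retain:
            # `retain` is respected: never delete, whatever the age.
            continue
        age_days = (now - snap.timestamp).days
        if age_days >= model_snapshot_retention_days:
            # Older than the overall retention window: delete.
            to_delete.append(snap.snapshot_id)
        elif age_days >= daily_model_snapshot_retention_after_days:
            # In between the two settings: keep only the first per day.
            if age_days in days_with_first_snapshot:
                to_delete.append(snap.snapshot_id)
            else:
                days_with_first_snapshot.add(age_days)
        # Newer than the daily threshold: keep everything.
    return to_delete


now = datetime(2020, 5, 12, 12, 0)
example = [
    Snapshot("a", now - timedelta(days=11)),
    Snapshot("b", now - timedelta(days=3, hours=2)),
    Snapshot("c", now - timedelta(days=3, hours=1)),
    Snapshot("d", now - timedelta(hours=2)),
    Snapshot("e", now - timedelta(days=20), retain=True),
]
print(snapshots_to_delete(example, now))
```

In this example, `a` falls outside the 10-day window, `c` is the second snapshot in the same day bucket as `b`, `d` is newer than the daily threshold, and `e` is protected by `retain`.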


1808 Commits