OpenSearch

mirror of https://github.com/honeymoose/OpenSearch.git synced 2025-02-17 18:35:25 +00:00

Author	SHA1	Message	Date
Adrien Grand	f993ef80f8	Move the terms index of `_id` off-heap. (#52518 ) In #42838 we moved the terms index of all fields off-heap except the `_id` field because we were worried it might make indexing slower. In general, the indexing rate is only affected if explicit IDs are used, as otherwise Elasticsearch almost never performs lookups in the terms dictionary for the purpose of indexing. So it's quite wasteful to require the terms index of `_id` to be loaded on-heap for users who have append-only workloads. Furthermore I've been conducting benchmarks when indexing with explicit ids on the http_logs dataset that suggest that the slowdown is low enough that it's probably not worth forcing the terms index to be kept on-heap. Here are some numbers for the median indexing rate in docs/s: \| Run \| Master \| Patch \| \| --- \| ------- \| ------- \| \| 1 \| 45851.2 \| 46401.4 \| \| 2 \| 45192.6 \| 44561.0 \| \| 3 \| 45635.2 \| 44137.0 \| \| 4 \| 46435.0 \| 44692.8 \| \| 5 \| 45829.0 \| 44949.0 \| And now heap usage in MB for segments: \| Run \| Master \| Patch \| \| --- \| ------- \| -------- \| \| 1 \| 41.1720 \| 0.352083 \| \| 2 \| 45.1545 \| 0.382534 \| \| 3 \| 41.7746 \| 0.381285 \| \| 4 \| 45.3673 \| 0.412737 \| \| 5 \| 45.4616 \| 0.375063 \| Indexing rate decreased by 1.8% on average, while memory usage decreased by more than 100x. The `http_logs` dataset contains small documents and has a simple indexing chain. More complex indexing chains, e.g. with more fields, ingest pipelines, etc. would see an even lower decrease of indexing rate.	2020-02-24 18:14:12 +01:00
David Kyle	de3d674bb7	Revert "Mute RunDataFrameAnalyticsIT.testOutlierDetectionStopAndRestart" This reverts commit c4d91143acc8edaf2895b1d464510e92eb7e16a2.	2020-02-24 15:22:49 +00:00
David Kyle	044a4e127a	[ML] Add reason to DataFrameAnalyticsTask setFailed log message (#52659 ) (#52707 )	2020-02-24 15:21:51 +00:00
Albert Zaharovits	33131e2dcd	Logfile audit settings validation (#52537 ) Add validation for the following logfile audit settings: xpack.security.audit.logfile.events.include xpack.security.audit.logfile.events.exclude xpack.security.audit.logfile.events.ignore_filters..users xpack.security.audit.logfile.events.ignore_filters..realms xpack.security.audit.logfile.events.ignore_filters..roles xpack.security.audit.logfile.events.ignore_filters..indices Closes #52357 Relates #47711 #47038 Follows the example from #47246	2020-02-24 16:38:16 +02:00
Ignacio Vera	ba9d3c6389	Add support for multipoint shape queries (#52564 ) (#52705 )	2020-02-24 13:46:51 +01:00
Martijn van Groningen	225d841212	Improve watcher test by preventing a npe when closing the http client.	2020-02-24 10:23:45 +01:00
Yang Wang	7cefba78c5	License removal leads back to a basic license (#52407 ) (#52683 ) A new basic license will be generated when existing license is deleted. In addition, deleting an existing basic license is a no-op. Resolves: #45022	2020-02-24 11:02:40 +11:00
Mark Vieira	72a2d0f9d8	Skip 'setupPorts' tasks when Docker is unavailable (#52679 )	2020-02-22 18:31:36 -08:00
Jason Tedor	1685cbe504	Add messages for CCR on license state changes (#52470 ) When a license expires, or license state changes, functionality might be disabled. This commit adds messages for CCR to inform users that CCR functionality will be disabled when a license expires, or when license state changes to a license level lower than trial/platinum/enterprise.	2020-02-22 09:09:42 -05:00
Benjamin Trent	afd90647c9	[ML] Adds feature importance option to inference processor (#52218 ) (#52666 ) This adds machine learning model feature importance calculations to the inference processor. The new flag in the configuration matches the analytics parameter name: `num_top_feature_importance_values` Example: ``` "inference": { "field_mappings": {}, "model_id": "my_model", "inference_config": { "regression": { "num_top_feature_importance_values": 3 } } } ``` This will write to the document as follows: ``` "inference" : { "feature_importance" : { "FlightTimeMin" : -76.90955548511226, "FlightDelayType" : 114.13514762158526, "DistanceMiles" : 13.731580450792187 }, "predicted_value" : 108.33165831875137, "model_id" : "my_model" } ``` This is done through calculating the [SHAP values](https://arxiv.org/abs/1802.03888). It requires that models have populated `number_samples` for each tree node. This is not available to models that were created before 7.7. Additionally, if the inference config is requesting feature_importance, and not all nodes have been upgraded yet, it will not allow the pipeline to be created. This is to safeguard in a mixed-version environment where only some ingest nodes have been upgraded. NOTE: the algorithm is a Java port of the one laid out in ml-cpp: https://github.com/elastic/ml-cpp/blob/master/lib/maths/CTreeShapFeatureImportance.cc usability blocked by: https://github.com/elastic/ml-cpp/pull/991	2020-02-21 18:42:31 -05:00
Jay Modi	8abfda0b59	Rename assertThrows to prevent naming clash (#52651 ) This commit renames ElasticsearchAssertions#assertThrows to assertRequestBuilderThrows and assertFutureThrows to avoid a naming clash with JUnit 4.13+ and static imports of these methods. Additionally, these methods have been updated to make use of expectThrows internally to avoid duplicating the logic there. Relates #51787 Backport of #52582	2020-02-21 13:30:11 -07:00
Lisa Cawley	56efd8b44d	[DOCS] Adds certutil http command to TLS setup steps (#51241 ) Co-Authored-By: Ioannis Kakavas <ikakavas@protonmail.com> Co-Authored-By: Tim Vernum <tim@adjective.org>	2020-02-21 10:11:59 -08:00
Jack Conradson	c4d91143ac	Mute RunDataFrameAnalyticsIT.testOutlierDetectionStopAndRestart Relates: #52654	2020-02-21 09:32:19 -08:00
Lisa Cawley	4ff78e8a00	[7.x][DOCS] Adds X-Pack usage API (#52592 )	2020-02-21 06:57:11 -08:00
Jay Modi	f3f6ff97ee	Single instance of the IndexNameExpressionResolver (#52604 ) This commit modifies the codebase so that our production code uses a single instance of the IndexNameExpressionResolver class. This change is being made in preparation for allowing name expression resolution to be augmented by a plugin. In order to remove some instances of IndexNameExpressionResolver, the single instance is added as a parameter of Plugin#createComponents and PersistentTaskPlugin#getPersistentTasksExecutor. Backport of #52596	2020-02-21 07:50:02 -07:00
Nik Everett	ed957f35a9	Cover missing case in top_metrics test (#52517 ) The top_metrics test assumed that it'd never end up only reducing unmapped results. But, rarely, it does. This handles that case in the test. Closes #52462	2020-02-21 09:49:17 -05:00
Igor Motov	e5b21a3fc6	Add HLRC for EQL search (#52550 ) Adds EQL HLRC client with the search method. Relates to #51961	2020-02-21 08:44:08 -05:00
Hendrik Muhs	288ccae23b	[Transform] add support for filter aggregation (#52483 ) add support for filter aggregations, refactor code for sub-aggregation support in mapping deduction fixes #52151	2020-02-21 14:05:11 +01:00
markharwood	96d603979b	Upgrade Lucene to 8.5.0-snapshot-b01d7cb (#52584 ) Upgrading 7x to same Lucene 8.5 version used in master	2020-02-21 10:25:03 +00:00
Przemko Robakowski	aff693bc9f	Make FreezeStep retryable (#52540 ) (#52559 ) * Make FreezeStep retryable This change marks `FreezeStep` as retryable and adds test to make sure we can really run it again. * refactor tests Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-02-21 10:11:35 +01:00
Armin Braun	4bb780bc37	Refactor Inflexible Snapshot Repository BwC (#52365 ) (#52557 ) * Refactor Inflexible Snapshot Repository BwC (#52365) Transport the version to use for a snapshot instead of whether to use shard generations in the snapshots in progress entry. This allows making upcoming repository metadata changes in a flexible manner in an analogous way to how we handle serialization BwC elsewhere. Also, exposing the version at the repository API level will make it easier to do BwC relevant changes in derived repositories like source only or encrypted.	2020-02-21 09:14:34 +01:00
Przemysław Witek	b84e8db7b5	[7.x] Rename .ml-state index to .ml-state-000001 to support rollover (#52510 ) (#52595 )	2020-02-21 08:55:59 +01:00
Andrei Stefan	c9b7bb282a	Move IsNull/IsNotNull predicates to QL project (#52502 ) (#52546 ) (cherry picked from commit b7d534e20c005f1c3565e52c0d0e0273f4a4cece)	2020-02-21 09:21:44 +02:00
Yang Wang	4bc7545e43	Add enterprise mode and refactor license check (#51864 ) (#52115 ) Add enterprise operation mode to properly map enterprise license. Also refactor the XPackLicenseState class to consolidate license status and mode checks. This class has many synchronized methods to check basically three things: * Minimum operation mode required * Whether security is enabled * Whether current license needs to be active Depending on the actual feature, either 1, 2 or all of the above checks are performed. These are now consolidated into 3 helper methods (2 of them are new). The synchronization is pushed down to the helper methods so actual checking methods no longer need to worry about it. resolves: #51081	2020-02-21 14:18:18 +11:00
Benjamin Trent	2a5c181dda	[ML][Inference] don't return inflated definition when storing trained models (#52573 ) (#52580 ) When `PUT` is called to store a trained model, it is useful to return the newly created model config. But, it is NOT useful to return the inflated definition. These definitions can be large and returning the inflated definition causes undue work on the server and client side. Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-02-20 19:47:29 -05:00
Benjamin Trent	013d5c2d24	[ML] Adds support for a global calendar via `_all` (#50372 ) (#52578 ) This adds `_all` to Calendar searches. This enables users to supply the `_all` string in the `job_ids` array when creating a Calendar. That calendar will now be applied to all jobs (existing and newly created). Closes #45013 Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-02-20 17:22:59 -05:00
Maria Ralli	ba8d6d1fb5	Remove Xlint exclusions from gradle files Backport of #52542. This commit is part of issue #40366 to remove disabled Xlint warnings from gradle files. In particular, it removes the Xlint exclusions from the following files: - benchmarks/build.gradle - client/client-benchmark-noop-api-plugin/build.gradle - x-pack/qa/rolling-upgrade/build.gradle - x-pack/qa/third-party/active-directory/build.gradle - modules/transport-netty4/build.gradle For the first three files no code adjustments were needed. For x-pack/qa/third-party/active-directory, move the suppression to the code level. For transport-netty4, replace the variable arguments with ArrayLists and remove any redundant casts.	2020-02-20 14:12:05 +00:00
Russ Cam	62da077beb	Specify name on enrich.get_policy as list type (#50217 ) This commit updates the enrich.get_policy API to specify name as a list, in line with other URL parts that accept a comma-separated list of values. In addition, update the get enrich policy API docs to align the URL part name in the documentation with the name used in the REST API specs. (cherry picked from commit 94f6f946ef283dc93040e052b4676c5bc37f4bde)	2020-02-20 11:39:28 +10:00
Ryan Ernst	3c3a0b2f37	Mute additional failing top_metrics test (#52545 ) Most top_metrics tests were muted in #52468, but the scaled float can also fail. This commit mutes that test as well. relates #52418	2020-02-19 16:14:26 -08:00
Przemko Robakowski	88bb06f055	Make DeleteStep retryable (#52494 ) (#52532 ) * Make DeleteStep retryable This change marks `DeleteStep` as retryable and adds test to make sure we really can invoke it again. * Fix unused import * revert unneeded changes * test reworked	2020-02-19 21:16:59 +01:00
Lee Hinman	22cf1140eb	[7.x] Add additional logging to SLM retention task (#52343 ) (#52535 ) This commit adds more logging to the actions that the SLM retention task does. It will help in the event that we need to diagnose any additional issues or problems while running retention.	2020-02-19 13:15:01 -07:00
David Kyle	7bbe5c8464	[ML] Validate tree feature index is within range (#52514 ) This changes the tree validation code to ensure no node in the tree has a feature index that is beyond the bounds of the feature_names array. Specifically this handles the situation where the C++ emits a tree containing a single node and an empty feature_names list. This is a valid tree used to centre the data in the ensemble, but the validation code would reject it because feature_names is empty. This meant a broken workflow, as you cannot GET the model and PUT it back.	2020-02-19 14:41:43 +00:00
Nik Everett	8796cdce4b	Modernize boxplot's parser (backport of #52361 ) (#52372 ) Uses a newer way to build `ObjectParser` in `boxplot` that allows us to drop a mostly ceremonial method.	2020-02-19 09:20:49 -05:00
Przemysław Witek	7cd997df84	[ML] Make ml internal indices hidden (#52423 ) (#52509 )	2020-02-19 14:02:32 +01:00
Hendrik Muhs	4d006f09d2	[Transform] fix XPackRestIT continuous transform stats test failure do not match explicit number but only test existence for duration test (#52504) fixes #52429	2020-02-19 12:32:54 +01:00
Przemysław Witek	5acee761eb	Implement unit tests for AnomalyDetectorsIndex class (#52417 ) (#52508 )	2020-02-19 12:24:59 +01:00
Tim Brooks	b5e191fa57	Use thread local random for request id generation (#52344 ) Currently we use the secure random number generator when generating http request ids in the security AuditUtil. We do not need to be using this level of randomness for this use case. Additionally, this random number generator involves locking that blocks the http worker threads at high concurrency loads. This commit modifies this randomness generator to use our reproducible randomness generator for Elasticsearch. This generator will fall back to thread local random when used in production.	2020-02-18 09:32:14 -07:00
Ioannis Kakavas	09773efb41	[7.x] Return realm name in SAML Authenticate API (#52188 ) (#52465 ) This is useful in cases where the caller of the API needs to know the name of the realm that consumed the SAML Response and authenticated the user and this is not self evident (i.e. because there are many saml realms defined in ES). Currently, the way to learn the realm name would be to make a subsequent request to the `_authenticate` API.	2020-02-18 17:16:24 +02:00
Henning Andersen	84de601551	Mute failing top_metrics tests (#52468 ) These tests fail when the global template is added, which changes number_of_shards to 2. Relates #52409 and #52418	2020-02-18 13:29:28 +01:00
Martijn van Groningen	606bc8037f	Adjusted assertion for watcher rolling upgrade test. (#52463 ) Relates to #33185	2020-02-18 13:28:15 +01:00
Ioannis Kakavas	d9ce0e6733	Update BouncyCastle to 1.64 (#52185 ) (#52464 ) This commit upgrades the bouncycastle dependency from 1.61 to 1.64.	2020-02-18 14:11:34 +02:00
David Roberts	9c49868bc5	[TEST] Use busy asserts in ML distributed failure test (#52461 ) When changing a job state using a mechanism that doesn't wait for the desired state to be reached within the production code, the test code needs to loop until the cluster state has been updated. Closes #52451	2020-02-18 11:17:37 +00:00
Przemysław Witek	6fa067a2a0	Relax assertions on memory_estimation.* fields (#52452 ) (#52458 )	2020-02-18 11:57:03 +01:00
Przemko Robakowski	d467c50e90	Make TimeSeriesLifecycleActionsIT.testWaitForSnapshot and testWaitForSnapshotSlmExecutedBefore wait for snapshot (#51892 ) (#52419 ) * waitForSnapshot tests rework * Refactor assertBusy Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-02-18 11:01:42 +01:00
Martijn van Groningen	d17ecb5936	Change the delete policy api to not pass wildcard expressions to the delete index api (#52448 ) Backport from #52179 Don't rely on the delete index api to resolve all the enrich indices for a particular enrich policy using a '[policy_name]-*' wildcard expression. With this change, the delete policy api will resolve the indices to remove and pass them directly to the delete index api. This resolves a bug where, if the `action.destructive_requires_name` setting has been set to true, the delete policy api is unable to remove the enrich indices related to the policy being deleted. Closes #51228 Co-authored-by: bellengao <gbl_long@163.com>	2020-02-18 10:53:39 +01:00
Hendrik Muhs	2071f85e1a	forward audits to logs (#52394 ) audit messages are stored in the notifications index, so audit information is lost for integration tests. This change forwards audit messages to logs, so they can help to debug issues. relates: #51627	2020-02-18 08:47:27 +01:00
Nhat Nguyen	bdb2e72ea4	Fix timeout in testDowngradeRemoteClusterToBasic (#52322 ) - ESCCRRestTestCase#ensureYellow does not work well with assertBusy - Increases timeout to 60s Closes #52036	2020-02-17 15:05:42 -05:00
David Roberts	48ccf36db9	[ML] Increase assertBusy timeout in ML node failure tests (#52425 ) Following the change to store cluster state in Lucene indices (#50907) it can take longer for all the cluster state updates associated with node failure scenarios to be processed during internal cluster tests where several nodes all run in the same JVM.	2020-02-17 17:04:18 +00:00
Costin Leau	20862fe64f	Break QueryTranslator into QL and SQL (#52397 ) Refactor the code to allow contextual parameterization of dateFormat and name. Separate aggs/query implementation though there's room for improvement in the future (cherry picked from commit e086f81b688875b33d01e4504ce7377031c8cf28)	2020-02-17 17:30:15 +02:00
Martijn van Groningen	81e47e9cab	Improve watcher rolling upgrade tests (#52404 ) Relates to #33185	2020-02-17 12:35:07 +01:00

4782 Commits