OpenSearch

Commit Graph

Author	SHA1	Message	Date
Tim Vernum	901c64ebbf	Add Debug/Trace logging for authentication (#49619 ) Authentication has grown more complex with the addition of new realm types and authentication methods. When user authentication does not behave as expected it can be difficult to determine where and why it failed. This commit adds DEBUG and TRACE logging at key points in the authentication flow so that it is possible to gain additional insight into the operation of the system. Backport of: #49575	2019-11-27 16:39:07 +11:00
Tim Vernum	e9ad1a7fcd	Fix iterate-from-1 bug in smart realm order (#49614 ) The AuthenticationService has a feature to "smart order" the realm chain so that whichever realm was the last one to successfully authenticate a given user will be tried first when that user tries to authenticate again. There was a bug where the building of this realm order would incorrectly drop the first realm from the default chain unless that realm was the "last successful" realm. In most cases this didn't cause problems because the first realm is the reserved realm and so it is unusual for a user that authenticated against a different realm to later need to authenticate against the reserved realm. This commit fixes that bug and adds relevant asserts and tests. Backport of: #49473	2019-11-27 13:46:52 +11:00
Armin Braun	3862400270	Remove Redundant EsBlobStoreTestCase (#49603 ) (#49605 ) All the implementations of `EsBlobStoreTestCase` use the exact same bootstrap code that is also used by their implementation of `EsBlobStoreContainerTestCase`. This means all tests might as well live under `EsBlobStoreContainerTestCase` saving a lot of code duplication. Also, there was no HDFS implementation for `EsBlobStoreTestCase` which is now automatically resolved by moving the tests over since there is a HDFS implementation for the container tests.	2019-11-26 20:57:19 +01:00
Marios Trivyzas	b0cb7bf229	SQL: Fix issue with GROUP BY YEAR() (#49559 ) Grouping By YEAR() is translated to a histogram aggregation, but previously if there was a scalar function involved (e.g.: `YEAR(date + INTERVAL 2 YEARS)`), there was no proper script created and the histogram was applied on a field with name: `date + INTERVAL 2 YEARS` which doesn't make sense, and resulted in null result. Check the underlying field of YEAR() and if it's a function call `asScript()` to properly get the painless script on which the histogram is applied. Fixes: #49386 (cherry picked from commit 93c37abc943d00d3a14ba08435d118a6d48874c7)	2019-11-26 14:11:11 +01:00
Marios Trivyzas	3c69d4d0bd	SQL: Add TRUNC alias for TRUNCATE (#49571 ) Add TRUNC as alias to already implemented TRUNCATE numeric function which is the flavour supported by Oracle and PostgreSQL. Relates to: #41195 (cherry picked from commit f2aa7f0779bc5cce40cc0c1f5e5cf1a5bb7d84f0)	2019-11-26 12:32:54 +01:00
j-bean	048b9dbb14	Fix expired job results deletion audit message (#49560 ) The PR fixes #49549	2019-11-26 10:48:12 +00:00
Dimitris Athanasiou	c23a2187da	[7.x][ML] Only report complete writing_results progress after completion (#49551 ) (#49577 ) We depend on the number of data frame rows in order to report progress for the writing of results, the last phase of a job run. However, results include other objects than just the data frame rows (e.g., progress, inference model, etc.). The problem this commit fixes is that if we receive the last data frame row results we'll report that progress is complete even though we still have more results to process potentially. If the job gets stopped for any reason at this point, we will not be able to restart the job properly as we'll think that the job was completed. This commit addresses this by limiting the max progress we can report for the writing_results phase before the results processor completes to 98. At the end, when the process is done we set the progress to 100. The commit also improves failure capturing and reporting in the results processor. Backport of #49551	2019-11-26 12:20:37 +02:00
Marios Trivyzas	5d306ae3b2	SQL: Fix issue with CASE/IIF pre-calculating results (#49553 ) Previously, CaseProcessor was pre-calculating (called `process()`) on all the building elements of a CASE/IIF expression, not only the conditions involved but also the results, as well as the final else result. In case one of those results had an erroneous calculation (e.g.: division by zero) this was executed and resulted in an Exception being thrown, even if this result was not used because of the condition guarding it. e.g.: ``` SELECT CASE WHEN myField1 = 0 THEN NULL ELSE myField2 / myField1 END FROM test; ``` Fixes: #49388 (cherry picked from commit dbd169afc98686cae1bc72024fad0ca32b272efd)	2019-11-26 10:48:07 +01:00
Tim Brooks	416178c7c8	Enable simple remote connection strategy (#49561 ) This commit back ports three commits related to enabling the simple connection strategy. Allow simple connection strategy to be configured (#49066) Currently the simple connection strategy only exists in the code. It cannot be configured. This commit moves in the direction of allowing it to be configured. It introduces settings for the addresses and socket count. Additionally it introduces new settings for the sniff strategy so that the more generic number of connections and seed node settings can be deprecated. The simple settings are not yet registered as the registration is dependent on follow-up work to validate the settings. Ensure at least 1 seed configured in remote test (#49389) This fixes #49384. Currently when we select a random subset of seed nodes from a list, it is possible for 0 seeds to be selected. This test depends on at least 1 seed being selected. Add the simple strategy to cluster settings (#49414) This is related to #49067. This commit adds the simple connection strategy settings and strategy mode setting to the cluster settings registry. With these changes, the simple connection mode can be used. Additionally, it adds validation to ensure that settings cannot be misconfigured.	2019-11-25 16:53:07 -07:00
Benjamin Trent	688c78c589	[ML] Stop timing stats failure propagation (#49495 ) (#49501 )	2019-11-25 10:09:30 -05:00
David Roberts	62811c2272	[ML] Add default categorization analyzer definition to ML info (#49545 ) The categorization job wizard in the ML UI will use this information when showing the effect of the chosen categorization analyzer on a sample of input.	2019-11-25 13:39:16 +00:00
Dimitris Athanasiou	aca38f6882	[7.x][ML] DFA jobs should accept excluding an unsupported field (#49535 ) (#49544 ) Before this change excluding an unsupported field resulted in an error message that explained the excluded field could not be detected as if it doesn't exist. This error message is confusing. This commit changes this so that there is no error in this scenario. When excluding a field that does exist but has automatically been excluded from the analysis there is no harm (unlike excluding a missing field which could be a typo). Backport of #49535	2019-11-25 15:13:00 +02:00
Armin Braun	af0f97d50a	Fix SLMSnapshotBlockingIntegTests.testSnapshotInProgress (#49533 ) (#49542 ) This test must check for state `SUCCESS` as well. `SUCCESS` in `SnapshotsInProgress` means "all data nodes finished snapshotting successfully but master must still finalize the snapshot in the repo". `SUCCESS` does not mean that the snapshot is actually fully finished in this object. You can easily reproduce the scenario in #49303 that has an in-progress snapshot in `SUCCESS` state by waiting 20s before running the busy assert loop on the snapshot status so that all steps but the blocked finalization can finish. Closes #49303	2019-11-25 13:31:45 +01:00
Dimitris Athanasiou	c149c64dc4	[7.x][ML] Apply source query on data frame analytics memory estimation (#49517 ) (#49532 ) Closes #49454 Backport of #49517	2019-11-25 12:51:57 +02:00
Hendrik Muhs	5256756879	[Transform] add debug log for configuration index (#49484 ) add debug log for transform creation and disallow partial results for retrieval	2019-11-25 09:49:17 +01:00
debadair	2ec047db04	[DOCS] Rename auditing topic. Closes #49012 (#49013 ) * [DOCS] Rename auditing topic. Closes #49012 * Fixed file name, fixed settings link. * Add link to settings	2019-11-22 14:16:58 -08:00
Dimitris Athanasiou	8eaee7cbdc	[7.x][ML] Explain data frame analytics API (#49455 ) (#49504 ) This commit replaces the _estimate_memory_usage API with a new API, the _explain API. The API consolidates information that is useful before creating a data frame analytics job. It includes: - memory estimation - field selection explanation Memory estimation is moved here from what was previously calculated in the _estimate_memory_usage API. Field selection is a new feature that explains to the user whether each available field was selected to be included or not in the analysis. In the case it was not included, it also explains the reason why. Backport of #49455	2019-11-22 22:06:10 +02:00
Jason Tedor	71bcfbf1e3	Replace required pipeline with final pipeline (#49470 ) This commit enhances the required pipeline functionality by changing it so that default/request pipelines can also be executed, but the required pipeline is always executed last. This gives users the flexibility to execute their own indexing pipelines, but also ensure that any required pipelines are also executed. Since such pipelines are executed last, we change the name of required pipelines to final pipelines.	2019-11-22 14:37:36 -05:00
Marios Trivyzas	0c4491964b	SQL: Fix issue with folding of CASE/IIF (#49449 ) Add extra checks to prevent ConstantFolding rule to try to fold the CASE/IIF functions early before the SimplifyCase rule gets applied. Fixes: #49387 (cherry picked from commit f35c9725350e35985d8dd3001870084e1784a5ca)	2019-11-22 18:29:49 +01:00
Benjamin Trent	276b6c67f4	[ML][Inference] Fixing pre-processor value handling and size estimate (#49270 ) (#49489 ) * [ML][Inference] Fixing pre-processor value handling and size estimate * fixing npe	2019-11-22 08:14:33 -05:00
Jim Ferenczi	ed4eecc00e	Pre-sort shards based on the max/min value of the primary sort field (#49092 ) This change automatically pre-sorts search shards on search requests that use a primary sort based on the value of a field. When possible, the can_match phase will extract the min/max (depending on the provided sort order) values of each shard and use them to pre-sort the shards prior to running the subsequent phases. This feature can be useful to ensure that shards that contain recent data are executed first so that intermediate merges have a better chance of containing contiguous data (think of date_histogram for instance) but it could also be used in a follow-up to terminate early those sorted top-hits queries that don't require the total hit count. The latter could significantly speed up the retrieval of the most/least recent documents from time-based indices. Relates #49091	2019-11-22 11:02:12 +01:00
Hendrik Muhs	1fbb248cb7	reenable warning checks in pivot tests (#49436 )	2019-11-22 08:50:10 +01:00
Tim Vernum	2e5f2dd1e1	Deprecate misconfigured SSL server config (#49280 ) This commit adds a deprecation warning when starting a node where either of the server contexts (xpack.security.transport.ssl and xpack.security.http.ssl) meet either of these conditions: 1. The server lacks a certificate/key pair (i.e. neither ssl.keystore.path nor ssl.certificate are configured) 2. The server has some ssl configuration, but ssl.enabled is not specified. This new validation does not care whether ssl.enabled is true or false (though other validation might), it simply makes it an error to configure server SSL without being explicit about whether to enable that configuration. Backport of: #45892	2019-11-22 12:14:55 +11:00
Benjamin Trent	a7477ad7c3	[7.x] [ML][Inference] compressing model definition and lazy parsing (#49269 ) (#49446 ) * [ML][Inference] compressing model definition and lazy parsing (#49269) * [ML][Inference] compressing model definition and lazy parsing * addressing PR comments * adding commons io * implementing simplified bounded stream * adjusting for type inclusion	2019-11-21 15:32:32 -05:00
Benjamin Trent	d9835f7fb4	[ML] Fix r_squared eval when variance is 0 (#49439 ) (#49445 )	2019-11-21 11:22:16 -05:00
Benjamin Trent	d41b2e3f38	[ML][Inference] allowing per-model licensing (#49398 ) (#49435 ) * [ML][Inference] allowing per-model licensing * changing to internal action + removing pre-mature opt	2019-11-21 09:46:34 -05:00
Przemysław Witek	c7ac2011eb	[7.x] Implement accuracy metric for multiclass classification (#47772 ) (#49430 )	2019-11-21 15:01:18 +01:00
Martijn van Groningen	d59ea64ccd	Monitoring should wait with collecting data when cluster service is started. (#49426 ) Backport of #48277 Otherwise integration tests may fail if the monitoring interval is low: ``` [2019-10-21T09:57:25,527][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [integTest-0] fatal error in thread [elasticsearch[integTest-0][generic][T#4]], exiting java.lang.AssertionError: initial cluster state not set yet at org.elasticsearch.cluster.service.ClusterApplierService.state(ClusterApplierService.java:208) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT] at org.elasticsearch.cluster.service.ClusterService.state(ClusterService.java:125) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT] at org.elasticsearch.xpack.monitoring.MonitoringService$MonitoringExecution$1.doRun(MonitoringService.java:231) ~[?:?] at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[?:?] at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?] at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:703) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?] at java.lang.Thread.run(Thread.java:835) [?:?] ``` I ran into this when lowering the monitoring interval when investigating enrich monitoring test: #48258	2019-11-21 14:22:41 +01:00
Hendrik Muhs	c3e4405ddf	[7.x][Transform] Transform fix force stop race condition (#49249 ) (#49420 ) fix force stopping transform if indexer state hasn't been written and/or is set to STOPPED. In certain situations the transform could not be stopped, which means the task could not be removed. Introduces improved abstraction in order to better test state handling in future.	2019-11-21 13:52:14 +01:00
Andrei Dan	010c3de47e	Slm set operation mode to RUNNING on first run (#49236 ) (#49425 ) * SLM set the operation mode to RUNNING on first run Set the SLM operation mode to RUNNING when setting the first SLM lifecycle policy. Historically, SLM was not decoupled from ILM but now they are independent components. Setting the SLM operation mode to what the ILM running mode was when we set the first SLM lifecycle policy was a remnant of those times. * SLM update package info * SLM suppress unused warning * SLM use logger for the correct class * SLM Add integration test for operation mode * Use ESSingleNodeTestCase instead of ESIntegTestCase (cherry picked from commit 4ad3d93f89d03bf9a25685a990d1a439f33ce0e6) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2019-11-21 11:41:32 +00:00
István Zoltán Szabó	5b10fd301e	[DOCS] Fixes endpoint schema in PUT app privileges API docs. (#49390 )	2019-11-21 09:52:44 +01:00
Lisa Cawley	61c54fd617	[DOCS] Qualifies Watcher transforms (#47482 )	2019-11-20 16:44:18 -08:00
Nhat Nguyen	fec22130c2	Improve error message when pausing index (#48915 ) Throw an appropriate error message when the follower index is not found or is a regular index.	2019-11-20 15:58:44 -05:00
Hendrik Muhs	06c2689802	rename data frame tests to transform tests (#49361 ) rename files and tests in rolling upgrade tests to transform	2019-11-20 18:51:11 +01:00
Bogdan Pintea	8c2ab8bb72	SQL:Docs: add the PIVOT clause to SELECT section (#49129 ) The PR adds the documentation on the PIVOT clause. (cherry picked from commit a55b36065e6496c44b6e3191296931d477a8e5f5)	2019-11-20 18:21:06 +01:00
David Roberts	20558cf61c	[ML] Fix simultaneous stop and force stop datafeed (#49367 ) If a datafeed is stopped normally and force stopped at the same time then it is possible that the force stop removes the persistent task while the normal stop is performing actions. Currently this causes the normal stop to error, but since stopping a stopped datafeed is not an error this doesn't make sense. Instead the force stop should just take precedence. This is a followup to #49191 and should really have been included in the changes in that PR.	2019-11-20 12:52:47 +00:00
Mayya Sharipova	e3da60c23d	Increase the number of vector dims to 2048 (#46895 )	2019-11-20 07:47:33 -05:00
Przemysław Witek	9c0ec7ce23	[7.x] Make AnalyticsProcessManager class more robust (#49282 ) (#49356 )	2019-11-20 10:08:16 +01:00
Dimitris Athanasiou	4d6e037e90	[7.x][ML] Extract creation of DFA field extractor into a factory (#49315 ) (#49329 ) This commit moves the async calls required to retrieve the components that make up `ExtractedFieldsExtractor` out of `DataFrameDataExtractorFactory` and into a dedicated `ExtractorFieldsExtractorFactory` class. A few more refactorings are performed: - The detector no longer needs the results field. Instead, it knows whether to use it or not based on whether the task is restarting. - We pass more accurately whether the task is restarting or not. - The validation of whether fields that have a cardinality limit are valid is now performed in the detector after retrieving the respective cardinalities. Backport of #49315	2019-11-20 10:02:42 +02:00
Lisa Cawley	2b9fb7ebe2	[DOCS] Merges security overview pages (#49342 )	2019-11-19 16:19:02 -08:00
Przemysław Witek	42bb8ae525	[7.x] Extract indexData method out of RegressionIT tests (#49306 ) (#49313 )	2019-11-19 22:47:12 +01:00
Mark Tozzi	17358b5af7	(refactor) Extract Empty/Script/Missing ValuesSource behavior to an interface (#48320 ) (#49330 ) This is a pure code rearrangement refactor. Logic for what specific ValuesSource instance to use for a given type (e.g. script or field) moved out of ValuesSourceConfig and into CoreValuesSourceType (previously just ValuesSourceType; we extract an interface for future extensibility). ValuesSourceConfig still selects which case to use, and then the ValuesSourceType instance knows how to construct the ValuesSource for that case.	2019-11-19 16:44:29 -05:00
Lisa Cawley	75f1f612c2	[DOCS] Merges duplicate pages for Active Directory realms (#49205 )	2019-11-19 13:18:01 -08:00
Jay Modi	eed4cd25eb	ThreadPool and ThreadContext are not closeable (#43249 ) (#49273 ) This commit changes the ThreadContext to just use a regular ThreadLocal over the lucene CloseableThreadLocal. The CloseableThreadLocal solves issues with ThreadLocals that are no longer needed during runtime but in the case of the ThreadContext, we need it for the runtime of the node and it is typically not closed until the node closes, so we miss out on the benefits that this class provides. Additionally by removing the close logic, we simplify code in other places that deal with exceptions and tracking to see if it happens when the node is closing. Closes #42577	2019-11-19 13:15:16 -07:00
Lisa Cawley	c4c8a7a43c	[DOCS] Merges duplicate pages for PKI realms (#49206 )	2019-11-19 10:51:09 -08:00
Lisa Cawley	2f5acae4a9	[DOCS] Groups pages related to encrypting communications (#49324 )	2019-11-19 10:10:39 -08:00
Lisa Cawley	62bbe419d3	[DOCS] Removes Beats security page (#49276 )	2019-11-19 09:15:30 -08:00
Andrei Dan	19780e20ba	Handle failure to retrieve ILM policy step better (#49193 ) (#49316 ) This commit wraps the calls to retrieve the current step in a try/catch so that the exception does not bubble up. Instead, step info is added containing the exception to the existing step. Semi-related to #49128 (cherry picked from commit 72530f8a7f40ae1fca3704effb38cf92daf29057) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2019-11-19 17:14:46 +00:00
Armin Braun	0acba44a2e	Make Repository.getRepositoryData an Async API (#49299 ) (#49312 ) This API call in most implementations is fairly IO heavy and slow so it is more natural to be async in the first place. Concretely though, this change is a prerequisite of #49060 since determining the repository generation from the cluster state introduces situations where this call would have to wait for other operations to finish. Doing so in a blocking manner would break `SnapshotResiliencyTests` and waste a thread. Also, this sets up the possibility to in the future make use of async IO where provided by the underlying Repository implementation. In a follow-up `SnapshotsService#getRepositoryData` will be made async as well (did not do it here, since it's another huge change to do so). Note: This change for now does not alter the threading behaviour in any way (since `Repository#getRepositoryData` isn't forking) and is purely mechanical.	2019-11-19 16:49:12 +01:00
Marios Trivyzas	fd1bb4a33a	SQL: Fix issue with mins & hours for DATEDIFF (#49252 ) Previously, DATEDIFF for minutes and hours was doing a rounding calculation using all the time fields (secs, msecs/micros/nanos). Instead it should first truncate the 2 dates to the respective field (mins or hours) zeroing out all the more detailed time fields and then make the subtraction. (cherry picked from commit 124cd18e20429e19d52fd8dc383827ea5132d428)	2019-11-19 14:25:28 +01:00
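
The truncate-then-subtract rule described in fd1bb4a33a can be illustrated with a minimal Python sketch. This is not the Elasticsearch SQL implementation (which is written in Java); the function names `datediff_minutes` and `datediff_hours` are hypothetical, and time zones are left out by assuming naive timestamps.

```python
from datetime import datetime, timedelta


def datediff_minutes(start: datetime, end: datetime) -> int:
    """Count minute boundaries crossed between start and end (hypothetical helper)."""
    # Zero out seconds and sub-second fields first, as the commit message describes.
    s = start.replace(second=0, microsecond=0)
    e = end.replace(second=0, microsecond=0)
    # After truncation the difference is an exact multiple of one minute.
    return (e - s) // timedelta(minutes=1)


def datediff_hours(start: datetime, end: datetime) -> int:
    """Count hour boundaries crossed between start and end (hypothetical helper)."""
    # Zero out minutes, seconds and sub-second fields before subtracting.
    s = start.replace(minute=0, second=0, microsecond=0)
    e = end.replace(minute=0, second=0, microsecond=0)
    return (e - s) // timedelta(hours=1)
```

With this rule, 14:25:59 and 14:26:01 on the same day are one minute apart even though only two seconds elapsed, because the two values fall in different minutes; swapping the arguments gives -1.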

4290 Commits