OpenSearch

Commit Graph

Author	SHA1	Message	Date
Dimitris Athanasiou	25d64508f6	[7.x][ML] Support boolean fields for DF analytics (#46037 ) (#46054 ) This commit adds support for `boolean` fields in data frame analytics (currently both outlier detection and regression). The analytics process expects `boolean` fields to be encoded as integers with a value of 0 or 1.	2019-08-28 12:02:29 +03:00
Jake Landis	154d1dd962	Watcher max_iterations with foreach action execution (#45715 ) (#46039 ) Prior to this commit the foreach action execution had a hard coded limit to 100 iterations. This commit allows the max number of iterations to be a configuration ('max_iterations') on the foreach action. The default remains 100.	2019-08-27 16:57:20 -05:00
Armin Braun	fdef293c81	Fix RegressionTests#fromXContent (#46029 ) * The `trainingPercent` must be between `1` and `100`, not `0` and `100` which is causing test failures	2019-08-27 18:24:26 +03:00
Dimitris Athanasiou	873ad3f942	[7.x][ML] Add option to regression to randomize training set (#45969 ) (#46017 ) Adds a parameter `training_percent` to regression. The default value is `100`. When the parameter is set to a value less than `100`, from the rows that can be used for training (i.e. those that have a value for the dependent variable) we randomly choose which ones to actually use for training. This enables splitting the data into a training set and the rest, usually called testing, validation or holdout set, which allows for validating the model on data that have not been used for training. Technically, the analytics process considers as training the data that have a value for the dependent variable. Thus, when we decide a training row is not going to be used for training, we simply clear the row's dependent variable.	2019-08-27 17:53:11 +03:00
Yogesh Gaikwad	7b6246ec67	Add `manage_own_api_key` cluster privilege (#45897 ) (#46023 ) The existing privilege model for API keys, with privileges like `manage_api_key`, `manage_security` etc., is too permissive and we would want finer-grained control over the cluster privileges for API keys. Previously, API keys created would also need these privileges to get their own information. This commit adds support for the `manage_own_api_key` cluster privilege, which only allows API key cluster actions on API keys owned by the currently authenticated user. It also adds support for retrieval of the API key self-information when authenticating via API key without the need for the additional API key privileges. To support this privilege, we are introducing additional authentication context along with the request context such that it can be used to authorize cluster actions based on the current user authentication. The API key get and invalidate APIs introduce an `owner` flag that can be set to true if the API key request (Get or Invalidate) is for the API keys owned by the currently authenticated user only. In that case, `realm` and `username` cannot be set as they are assumed to be the currently authenticated ones. The changes cover HLRC changes and documentation for the API changes. Closes #40031	2019-08-28 00:44:23 +10:00
Dimitris Athanasiou	dd6c13fdf9	[ML] Add description to DF analytics (#45774 ) (#46019 )	2019-08-27 15:48:59 +03:00
Albert Zaharovits	1ebee5bf9b	PKI realm authentication delegation (#45906 ) This commit introduces PKI realm delegation. This feature supports the PKI authentication feature in Kibana. In essence, this creates a new API endpoint which Kibana must call to authenticate clients that use certificates in their TLS connection to Kibana. The API call passes to Elasticsearch the client's certificate chain. The response contains an access token to be further used to authenticate as the client. The client's certificates are validated by the PKI realms that have been explicitly configured to permit certificates from the proxy (Kibana). The user calling the delegation API must have the delegate_pki privilege. Closes #34396	2019-08-27 14:42:46 +03:00
Ioannis Kakavas	b249e25bb4	Partly revert globalInfo.ready check (#45960 ) This check was introduced in #41392 but had the unwanted side-effect that the keystore settings in such blocks would not be added to the node's keystore. Given that we have a mid-term plan for FIPS testing that would make such checks unnecessary, and that the conditional in these two cases is not really that important, this change removes this conditional logic so that full-cluster-restart and rolling upgrade tests will run with PEM files for key/certificate material no matter if we're in a FIPS JVM or not. Resolves: #45475	2019-08-27 13:01:56 +03:00
Zachary Tong	943a016bb2	Add Cumulative Cardinality agg (and Data Science plugin) (#45990 ) This adds a pipeline aggregation that calculates the cumulative cardinality of a field. It does this by iteratively merging in the HLL sketch from consecutive buckets and emitting the cardinality up to that point. This is useful for things like finding the total "new" users that have visited a website (as opposed to "repeat" visitors). This is a Basic+ aggregation and adds a new Data Science plugin to house it and future advanced analytics/data science aggregations.	2019-08-26 16:19:55 -04:00
Benjamin Trent	a3a4ae0ac2	[ML] fixing bug where analytics process starts with 0 rows (#45879 ) (#45988 ) The native process requires that there be a non-zero number of rows to analyze. If the flag --rows 0 is passed to the executable, it throws and does not start. When building the configuration for the process we should not start the native process if there are no rows. Adding some logging to indicate what is occurring.	2019-08-26 14:18:17 -05:00
Benjamin Trent	d64018f8e1	[ML] add supported types to no fields error message (#45926 ) (#45987 ) * [ML] add supported types to no fields error message * adding supported types to logger debug	2019-08-26 14:18:00 -05:00
Jake Landis	767f648f8e	Watcher add email warning if CSV attachment contains formulas (#44460 ) (#45557 ) * Watcher add email warning if CSV attachment contains formulas (#44460) This commit introduces a Warning message to the emails generated by Watcher's reporting action. This change complements Kibana's CSV formula notifications (see elastic/kibana#37930). This is implemented by reading a header (kbn-csv-contains-formulas) provided by Kibana to notify to attach the Warning to the email. The wording of the warning is borrowed from Kibana's UI and may be overridden by a dynamic setting xpack.notification.reporting.warning.kbn-csv-contains-formulas.text. This warning is enabled by default, but may be disabled via a dynamic setting xpack.notification.reporting.warning.enabled.	2019-08-26 08:35:33 -05:00
Jake Landis	f2241a152f	watcher tests - increase stop timeout to 60s (#45679 ) (#45934 ) As of #43939 Watcher tests now correctly block until all Watch executions kicked off by that test are finished. Prior we allowed tests to finish with outstanding watch executions. It was known that this would increase the time needed to finish a test. However, running the tests on CI can be slow and on at least 1 occasion it took 60s to actually finish. This PR simply increases the max allowable timeout for Watcher tests to clean up after themselves.	2019-08-26 08:34:54 -05:00
Andrey Ershov	479ab9b8db	Fix plaintext on TLS port logging (#45852 ) Today, if a non-TLS record is received on a TLS port, a generic exception is logged with the stack trace. The SSLExceptionHelper.isNotSslRecordException method does not work because it assumes that NonSslRecordException would be top-level. This commit addresses the issue so that the log is more concise. (cherry picked from commit 6b83527bf0c23d4d5b97fab7f290c43432945d4f)	2019-08-26 12:32:35 +02:00
Ioannis Kakavas	2bee27dd54	Allow Transport Actions to indicate authN realm (#45946 ) This commit allows the Transport Actions for the SSO realms to indicate the realm that should be used to authenticate the constructed AuthenticationToken. This is useful when many authentication realms of the same type have been configured and the caller of the API (Kibana or a custom web app) already knows which realm should be used, so there is no need to iterate over all the realms of the same type. The realm parameter is added in the relevant REST APIs as optional so as not to introduce any breaking change.	2019-08-25 19:36:41 +03:00
Jason Tedor	040a810b3c	Add deprecation check for pidfile setting (#45939 ) The pidfile setting is deprecated. This commit adds a deprecation check for usage of this setting.	2019-08-24 17:19:20 -04:00
Jason Tedor	43ca652d11	Add deprecation check for processors (#45925 ) The processors setting is deprecated. This commit adds a deprecation check for the use of the processors setting.	2019-08-23 20:16:40 -04:00
Jason Tedor	6b116a48f3	Skip feature aware check on JDK 14 (#45928 ) ASM can not currently handle classes compiled with JDK 14. This commit skips these checks on JDK 14, for now.	2019-08-23 17:38:15 -04:00
Dimitris Athanasiou	be554fe5f0	[7.x][ML] Improve progress reportings for DF analytics (#45856 ) (#45910 ) Previously, the stats API reported a progress percentage for DF analytics tasks that are running and are in the `reindexing` or `analyzing` state. This means that when the task is `stopped` there is no progress reported. Thus, one cannot distinguish a task that never ran from one that completed. In addition, there are blind spots in the progress reporting. In particular, we do not account for when data is loaded into the process. We also do not account for when results are written. This commit addresses the above issues. It changes progress to being a list of objects, each one describing the phase and its progress as a percentage. We currently have 4 phases: reindexing, loading_data, analyzing, writing_results. When the task stops, progress is persisted as a document in the state index. The stats API now reports progress from in-memory if the task is running, or returns the persisted document (if there is one).	2019-08-23 23:04:39 +03:00
Benjamin Trent	b756e1b9be	[ML][Transforms] adjusting when and what to audit (#45876 ) (#45916 ) * [ML][Transforms] adjusting when and what to audit * Update DataFrameTransformTask.java * removing unnecessary audit message	2019-08-23 13:53:02 -05:00
Benjamin Trent	94c2de65b9	[ML][Transforms] fix doSaveState check (#45882 ) (#45902 ) * [ML][Transforms] fix doSaveState check * removing unnecessary log statement	2019-08-23 09:38:52 -05:00
Alexander Reelsen	ecafe4f4ad	Update joda to 2.10.3 (#45495 )	2019-08-23 10:39:39 +02:00
markharwood	217e41ab6c	Search - added HLRC support for PinnedQueryBuilder (#45779 ) (#45853 ) Added HLRC support for PinnedQueryBuilder Related #44074	2019-08-23 09:22:17 +01:00
Przemysław Witek	85d55e30d0	Add test that proves _timing_stats document is deleted when the job is deleted (#45840 ) (#45854 )	2019-08-23 07:03:09 +02:00
Przemysław Witek	2ed19b2c81	Put error message from inside the process into the exception that is thrown when the process doesn't start correctly. (#45846 ) (#45875 )	2019-08-23 07:02:50 +02:00
Tim Vernum	f94e4a9151	Set security index refresh interval to 1s (#45888 ) The security indices were being created without specifying the refresh interval, which means they would inherit a value from any templates that exist. However, certain security functionality depends on being able to wait_for refresh, and causes errors (e.g. in Kibana) if that time exceeds 30s. This commit changes the security indices configuration to always be created with a 1s refresh interval. This prevents any templates from inadvertently interfering with the proper functioning of security. It is possible for an administrator to explicitly change the refresh interval after the indices have been created. Backport of: #45434	2019-08-23 12:41:37 +10:00
Tim Vernum	029725fc35	Add SSL/TLS settings for watcher email (#45836 ) This change adds a new SSL context xpack.notification.email.ssl.* that supports the standard SSL configuration settings (truststore, verification_mode, etc). This SSL context is used when configuring outbound SMTP properties for watcher email notifications. Backport of: #45272	2019-08-23 10:13:51 +10:00
Nhat Nguyen	3393f9599e	Ignore translog retention policy if soft-deletes enabled (#45473 ) Since #45136, we use soft-deletes instead of translog in peer recovery. There's no need to retain extra translog to increase a chance of operation-based recoveries. This commit ignores the translog retention policy if soft-deletes is enabled so we can discard translog more quickly. Backport of #45473 Relates #45136	2019-08-22 16:40:06 -04:00
Benjamin Trent	8e3c54fff7	[7.x] [ML] Adding data frame analytics stats to _usage API (#45820 ) (#45872 ) * [ML] Adding data frame analytics stats to _usage API (#45820) * [ML] Adding data frame analytics stats to _usage API * making the size of analytics stats 10k * adjusting backport	2019-08-22 15:15:41 -05:00
Benjamin Trent	dff3e636c2	[ML][Transforms] unifying logging, adding some more logging (#45788 ) (#45859 ) * [ML][Transforms] unifying logging, adding some more logging * using parameterizedMessage instead of string concat * fixing bracket closure	2019-08-22 13:15:07 -05:00
Benjamin Trent	e50a78cf50	[ML-DataFrame] version data frame transform internal index (#45375 ) (#45837 ) Adds index versioning for the internal data frame transform index. Allows for new indices to be created and referenced, `GET` requests now query over the index pattern and takes the latest doc (based on INDEX name).	2019-08-22 11:46:30 -05:00
Jake Landis	1dab73929f	Watcher add stopped listener (#43939 ) (#45670 ) When Watcher is stopped and there are still outstanding watches running, Watcher will report itself as stopped. In normal cases, this is not problematic. However, for integration tests Watcher is started and stopped between each test to help ensure a clean slate for each test. The tests are blocking only on the stopped state and make an implicit assumption that all watches are finished if the Watcher is stopped. This is an incorrect assumption since Stopped really means, "I will not accept any more watches". This can lead to unpredictable behavior in the tests such as message : "Watch is already queued in thread pool" and state: "not_executed_already_queued". This can also change the .watcher-history if watches linger between tests. This commit changes the semantics of manually stopping Watcher to now mean: "I will not accept any more watches AND all running watches are complete". There is now an intermediary step "Stopping" and a callback to allow transition to a "Stopped" state when all Watches have completed. Additionally, since this impacts how long the tests will block waiting for a "Stopped" state, the timeout has been increased. Related: #42409	2019-08-22 10:54:29 -05:00
Armin Braun	bfddaaa2ae	Acknowledge Indices Were Wiped Successfully in REST Tests (#45832 ) (#45842 ) In internal test clusters tests we check that wiping all indices was acknowledged but in REST tests we didn't. This aligns the behavior in both kinds of tests. Relates #45605 which might be caused by unacked deletes that were just slow.	2019-08-22 17:19:51 +02:00
Przemysław Witek	7512337922	[7.x] Allow the user to specify 'query' in Evaluate Data Frame request (#45775 ) (#45825 )	2019-08-22 11:14:26 +02:00
Benjamin Trent	3ebeaa2557	Fixing rollup state tests after onFailure ordering change (#45784 ) (#45814 ) After the PR #45676 onFailure is now called before the indexer state has transitioned out of indexing. To fix these tests, I added a new check to make sure that we don't mark it as failed until AFTER doSaveState is called with a STARTED indexer.	2019-08-21 14:46:09 -05:00
Gordon Brown	47b1e2b3d0	[7.x] Use rollover for SLM's history indices (#45686 ) Following our own guidelines, SLM should use rollover instead of purely time-based indices to keep shard counts low. This commit implements lazy index creation for SLM's history indices, indexing via an alias, and rollover in the built-in ILM policy.	2019-08-21 13:42:11 -06:00
Henning Andersen	c3296d3251	Unmute testBiDirectionalIndexFollowing (#45641 ) (#45792 ) Cause is believed to be in build system caching so unmuting.	2019-08-21 20:53:14 +02:00
William Brafford	2b549e7342	CLI tools: write errors to stderr instead of stdout (#45586 ) Most of our CLI tools use the Terminal class, which previously did not provide methods for writing to standard error. When all output goes to standard out, there are two basic problems. First, errors and warnings are "swallowed" in pipelines, making it hard for a user to know when something's gone wrong. Second, errors and warnings are intermingled with legitimate output, making it difficult to pass the results of interactive scripts to other tools. This commit adds a second set of print commands to Terminal for printing to standard error, with errorPrint corresponding to print and errorPrintln corresponding to println. This leaves it to developers to decide which output should go where. It also adjusts existing commands to send errors and warnings to stderr. Usage is printed to standard output when it's correctly requested (e.g., bin/elasticsearch-keystore --help) but goes to standard error when a command is invoked incorrectly (e.g. bin/elasticsearch-keystore list-with-a-typo | sort).	2019-08-21 14:46:07 -04:00
Przemysław Witek	bf701b83d2	Shorten field names in EstimateMemoryUsageResponse (#45719 ) (#45772 )	2019-08-21 12:45:09 +02:00
Zachary Tong	6b391cd0d5	Mute ShapeQueryTests#testFieldAlias() Tracking issue: https://github.com/elastic/elasticsearch/issues/45628	2019-08-21 10:31:13 +01:00
David Kyle	982560afeb	Mute RollupIndexerStateTests See #45770	2019-08-21 10:05:15 +01:00
Przemysław Witek	c6709f0979	Mute tests affected by renaming fields in Estimate memory usage response (#45743 ) (#45766 )	2019-08-21 09:57:23 +02:00
Dimitris Athanasiou	d5c3d9b50f	[7.x][ML] Do not skip rows with missing values for regression (#45751 ) (#45754 ) Regression analysis supports missing fields. Moreover, the dependent variable is expected to have missing values for the part of the data frame that is not used for training. This commit allows an analysis to declare that it supports missing values. For such analyses, rows with missing values are not skipped. Instead, they are written as normal with empty strings used for the missing values. This also contains a fix to the integration test. Closes #45425	2019-08-21 08:15:38 +03:00
Benjamin Trent	ba7b677618	[ML] better handle empty results when evaluating regression (#45745 ) (#45759 ) * [ML] better handle empty results when evaluating regression * adding new failure test to ml_security black list * fixing equality check for regression results	2019-08-20 17:37:04 -05:00
Armin Braun	a01bd6c5a3	Stop Executing SLM Policy Transport Action on Snapshot Pool (#45727 ) (#45748 ) * Executing SLM policies on the snapshot thread will block until a snapshot finishes if the pool is completely busy executing that snapshot * Fixes #45594	2019-08-20 19:15:36 +02:00
Nhat Nguyen	99b21d50b8	Include leases in ccr errmsg when ops no longer available (#45681 ) The setting index.soft_deletes.retention.operations is no longer needed nor recommended in CCR. We, therefore, should hint users about the retention leases period setting instead when operations are no longer available for replicating.	2019-08-20 10:40:12 -04:00
Benjamin Trent	43bb5924e6	[ML][Data Frame] fixing _start?force=true bug (#45660 ) (#45734 ) * [ML][Data Frame] fixing _start?force=true bug * removing unused import * removing old TODO	2019-08-20 09:23:07 -05:00
Dimitris Athanasiou	49edf9e5b5	[7.x][ML] Remove timeout on waiting for DF analytics result processor to complete (#45724 ) (#45733 ) We cannot know how long the analysis will take to complete thus we should not have a timeout. Note that if the process crashes, the result processor will pick the exception due to the stream closing. Closes #45723	2019-08-20 17:21:40 +03:00
Przemysław Witek	b37ebd1adf	Prepare the codebase for new Auditor subclasses (#45716 ) (#45731 )	2019-08-20 16:03:50 +02:00
Przemysław Witek	80dd0a0948	Get rid of EstimateMemoryUsageRequest and EstimateMemoryUsageAction.Request. (#45718 ) (#45725 )	2019-08-20 15:49:17 +02:00


3677 Commits