OpenSearch

mirror of https://github.com/honeymoose/OpenSearch.git synced 2025-02-07 21:48:39 +00:00

Author	SHA1	Message	Date
Hendrik Muhs	42ada88c5d	cleanup static member	2019-09-06 08:55:57 +02:00
Hendrik Muhs	ab5f09d29b	[ML-DataFrame] improve error message for timeout case in stop (#46131 ) improve error message if stopping of transform times out. related #45610	2019-09-06 08:36:01 +02:00
Hendrik Muhs	78824cce81	reuse mock client to avoid probles with thread context closed errors (#46398 )	2019-09-06 08:17:25 +02:00
Benjamin Trent	457ff3e2fb	7.x/ml fix instance serialization bwc (#46404 ) * [ML] Fixing instance serialization version for bwc * fixing CppLogMessage	2019-09-05 13:23:26 -05:00
Benjamin Trent	caf3e4d654	[7.x] [ML][Transforms] fixing rolling upgrade continuous transform test (#45823 ) (#46347 ) (#46337 ) * [ML][Transforms] fixing rolling upgrade continuous transform test (#45823) * [ML][Transforms] fixing rolling upgrade continuous transform test * adjusting wait assert logic * adjusting wait conditions * [ML][Transforms] allow executor to call start on started task (#46347) * making sure we only upgrade from 7.4.0 in test	2019-09-05 10:31:11 -05:00
Benjamin Trent	5201386232	[ML] testFullClusterRestart waiting for stable cluster (#46280 ) (#46335 ) * [ML] waiting for ml indices before waiting task assignment testFullClusterRestart * waiting for a stable cluster after fullrestart * removing unused imports	2019-09-05 06:57:33 -05:00
Yogesh Gaikwad	d5acb15a71	[Backport] Initialize document subset bit set cache used for DLS (#46211 ) (#46359 ) This commit initializes DocumentSubsetBitsetCache even if DLS is disabled. Previously it would throw null pointer when querying usage stats if we explicitly disabled DLS as there would be no instance of DocumentSubsetBitsetCache to query. It is okay to initialize DocumentSubsetBitsetCache which will be empty as the license enforcement would prevent usage of DLS feature and it will not fail when accessing usage stats. Closes #45147	2019-09-05 14:34:19 +10:00
Julie Tibshirani	40c3225d26	First round of optimizations for vector functions. (#46294 ) This PR merges the `vectors-optimize-brute-force` feature branch, which makes the following changes to how vector functions are computed: * Precompute the L2 norm of each vector at indexing time. (#45390) * Switch to ByteBuffer for vector encoding. (#45936) * Decode vectors and while computing the vector function. (#46103) * Use an array instead of a List for the query vector. (#46155) * Precompute the normalized query vector when using cosine similarity. (#46190) Co-authored-by: Mayya Sharipova <mayya.sharipova@elastic.co>	2019-09-04 14:45:57 -07:00
Andrey Ershov	ece9eb4acd	Remove stack trace logging in Security(Transport\|Http)ExceptionHandler (#45966 ) As per #45852 comment we no longer need to log stack-traces in SecurityTransportExceptionHandler and SecurityHttpExceptionHandler even if trace logging is enabled. (cherry picked from commit c99224a32d26db985053b7b36e2049036e438f97)	2019-09-04 11:50:35 +03:00
Dimitris Athanasiou	8fca5b5204	[7.x][ML] Unmute testStopOutlierDetectionWithEnoughDocumentsToScroll (#46271 ) (#46282 ) The test seems to have been failing due to a race condition between stopping the task and refreshing the destination index. In particular, we were going forward with refreshing the destination index even though the task stopped in the meantime. This was fixed in request. Closes #43960 Backport of #46271	2019-09-04 10:57:01 +03:00
Marios Trivyzas	fd0affb503	SQL: Fix issue with IIF function when condition folds (#46290 ) Previously, when the condition (1st argument) of the IIF function could be evaluated (folded) to false, the `IfConditional` was eliminated which caused `IndexOutOfBoundsException` to be thrown when `info()` and `resolveType()` methods where called. Fixes: #46268 (cherry picked from commit 9a885a3ac47bc8f52c07770d1d8d670ce0af1e59)	2019-09-04 10:32:49 +03:00
Benjamin Trent	8a4ff7d57d	[ML][Transforms] protecting doSaveState with optimistic concurrency (#46156 ) (#46281 ) * [ML][Transforms] protecting doSaveState with optimistic concurrency * task code cleanup	2019-09-03 15:27:24 -05:00
Benjamin Trent	c3c1dcd2ac	[ML][Transforms] fixing listener being called twice (#46284 ) (#46292 )	2019-09-03 15:26:56 -05:00
Benjamin Trent	53df54c703	[ML][Transforms] fixing stop on changes check bug (#46162 ) (#46273 ) * [ML][Transforms] fixing stop on changes check bug * Adding new method finishAndCheckState to cover race conditions in early terminations * changing stopping conditions in `onStart` * allow indexer to finish when exiting early	2019-09-03 11:04:18 -05:00
Lee Hinman	3d4b8e01c7	Validate SLM policy ids strictly (#45998 ) (#46145 ) This uses strict validation for SLM policy ids, similar to what we use for index names. Resolves #45997	2019-09-03 09:20:02 -06:00
David Roberts	ab045744ac	[ML-DataFrame] Fix off-by-one error in checkpoint operations_behind (#46235 ) Fixes a problem where operations_behind would be one less than expected per shard in a new index matched by the data frame transform source pattern. For example, if a data frame transform had a source of foo* and a new index foo-new was created with 2 shards and 7 documents indexed in it then operations_behind would be 5 prior to this change. The problem was that an empty index has a global checkpoint number of -1 and the sequence number of the first document that is indexed into an index is 0, not 1. This doesn't matter for indices included in both the last and next checkpoints, as the off-by-one errors cancelled, but for a new index it affected the observed result.	2019-09-03 12:45:02 +01:00
Ignacio Vera	59c474e675	reset queryGeometry in ShapeQueryTests (#45974 ) (#46251 )	2019-09-03 10:24:46 +02:00
markharwood	a9e3871fc7	Test fix for PinnedQueryBuilderIT (#46187 ) (#46227 ) Fix test issue to stabilise scoring through use of DFS search mode. Randomised index-then-delete docs introduced by the test framework likely caused an imbalance in IDF scores across shards. Also made number of shards used in test a random number for added test coverage. Closes #46174	2019-09-02 13:31:22 +01:00
Marios Trivyzas	3bee647e5b	SQL: Fix issue with DataType for CASE with NULL (#46173 ) Previously, if the DataType of all the WHEN conditions of a CASE statement is NULL, then it was set to NULL even if the ELSE clause has a non-NULL data type, e.g.: ``` CASE WHEN a = 1 THEN NULL WHEN a = 5 THEN NULL ELSE 'foo' ``` Fixes: #46032 (cherry picked from commit 8c1012efbbd3a300afd0dfb9b18250f15ea753f9)	2019-09-02 11:17:24 +03:00
Benjamin Trent	d0c5573a51	[ML] Throw an error when a datafeed needs CCS but it is not enabled for the node (#46044 ) (#46096 ) Though we allow CCS within datafeeds, users could prevent nodes from accessing remote clusters. This can cause mysterious errors and difficult to troubleshoot. This commit adds a check to verify that `cluster.remote.connect` is enabled on the current node when a datafeed is configured with a remote index pattern.	2019-08-30 09:27:07 -05:00
Alexander Reelsen	98c32c7846	Fix wrong URL encoding in watcher HTTP client (#45894 ) The test assumption was calling the wrong method resulting in a URL encoding before returning the data. Closes #44970	2019-08-30 14:02:49 +02:00
Dimitris Athanasiou	5921ae53d8	[7.x][ML] Regression dependent variable must be numeric (#46072 ) (#46136 ) * [ML] Regression dependent variable must be numeric This adds a validation that the dependent variable of a regression analysis must be numeric. * Address review comments and fix some problems In addition to addressing the review comments, this commit fixes a few issues I found during testing. In particular: - if there were mappings for required fields but they were not included we were not reporting the error - if explicitly included fields had unsupported types we were not reporting the error Unfortunately, I couldn't get those fixed without refactoring the code in `ExtractedFieldsDetector`.	2019-08-30 09:57:43 +03:00
Zachary Tong	cf8a4171e1	Rename `data-science` plugin to `analytics` (#46133 ) Rename `data-science` plugin to `analytics`. Also removes enabled flag. Backport of #46092	2019-08-29 12:45:39 -04:00
Simon Willnauer	9b2ea07b17	Flush engine after big merge (#46066 ) (#46111 ) Today we might carry on a big merge uncommitted and therefore occupy a significant amount of diskspace for quite a long time if for instance indexing load goes down and we are not quickly reaching the translog size threshold. This change will cause a flush if we hit a significant merge (512MB by default) which frees diskspace sooner.	2019-08-29 17:54:15 +02:00
Nhat Nguyen	028e792e1d	Remove already exist assertion while renew ccr lease (#46009 ) If a CCR lease is disappeared while we are renewing it, then we will issue asyncAddRetentionLease to add that lease. And if asyncAddRetentionLease takes longer than retentionLeaseRenewInterval, then we can issue another asyncAddRetentionLease request. One of asyncAddRetentionLease requests will fail with RetentionLeaseAlreadyExistsException, hence trip the assertion. Closes #45192	2019-08-29 09:44:40 -04:00
Przemysław Witek	b8a0379057	Refactor auditor-related classes (#45893 ) (#46120 )	2019-08-29 14:21:03 +02:00
Przemysław Witek	fbe9e8a530	Do not throw an exception if the process finished quickly but without any error. (#46073 ) (#46113 )	2019-08-29 10:47:17 +02:00
Gordon Brown	47bbd9d9a9	[7.x] Fix rollover alias in SLM history index template (#46001 ) This commit adds the `rollover_alias` setting required for ILM to work correctly to the SLM history index template and adds assertions to the SLM integration tests to ensure that it works correctly.	2019-08-28 14:50:22 -07:00
Tal Levy	a356bcff41	Add Circle Processor (#43851 ) (#46097 ) add circle-processor that translates circles to polygons	2019-08-28 14:44:08 -07:00
Julie Tibshirani	d94c4dcffb	Use float instead of double for query vectors. (#46004 ) Currently, when using script_score functions like cosineSimilarity, the query vector is treated as an array of doubles. Since the stored document vectors use floats, it seems like the least surprising behavior for the query vectors to also be float arrays. In addition to improving consistency, this change may help with some optimizations we have been considering around vector dot product.	2019-08-28 11:03:14 -07:00
Mark Tozzi	9ac85a4a2b	Fix compilation in CumulativeCardinalityAggregatorTests	2019-08-28 09:31:48 -04:00
Dimitris Athanasiou	25d64508f6	[7.x][ML] Support boolean fields for DF analytics (#46037 ) (#46054 ) This commit adds support for `boolean` fields in data frame analytics (and currently both outlier detection and regression). The analytics process expects `boolean` fields to be encoded as integers with 0 or 1 value.	2019-08-28 12:02:29 +03:00
Jake Landis	154d1dd962	Watcher max_iterations with foreach action execution (#45715 ) (#46039 ) Prior to this commit the foreach action execution had a hard coded limit to 100 iterations. This commit allows the max number of iterations to be a configuration ('max_iterations') on the foreach action. The default remains 100.	2019-08-27 16:57:20 -05:00
Armin Braun	fdef293c81	Fix RegressionTests#fromXContent (#46029 ) * The `trainingPercent` must be between `1` and `100`, not `0` and `100` which is causing test failures	2019-08-27 18:24:26 +03:00
Dimitris Athanasiou	873ad3f942	[7.x][ML] Add option to regression to randomize training set (#45969 ) (#46017 ) Adds a parameter `training_percent` to regression. The default value is `100`. When the parameter is set to a value less than `100`, from the rows that can be used for training (ie. those that have a value for the dependent variable) we randomly choose whether to actually use for training. This enables splitting the data into a training set and the rest, usually called testing, validation or holdout set, which allows for validating the model on data that have not been used for training. Technically, the analytics process considers as training the data that have a value for the dependent variable. Thus, when we decide a training row is not going to be used for training, we simply clear the row's dependent variable.	2019-08-27 17:53:11 +03:00
Yogesh Gaikwad	7b6246ec67	Add `manage_own_api_key` cluster privilege (#45897 ) (#46023 ) The existing privilege model for API keys with privileges like `manage_api_key`, `manage_security` etc. are too permissive and we would want finer-grained control over the cluster privileges for API keys. Previously APIs created would also need these privileges to get its own information. This commit adds support for `manage_own_api_key` cluster privilege which only allows api key cluster actions on API keys owned by the currently authenticated user. Also adds support for retrieval of the API key self-information when authenticating via API key without the need for the additional API key privileges. To support this privilege, we are introducing additional authentication context along with the request context such that it can be used to authorize cluster actions based on the current user authentication. The API key get and invalidate APIs introduce an `owner` flag that can be set to true if the API key request (Get or Invalidate) is for the API keys owned by the currently authenticated user only. In that case, `realm` and `username` cannot be set as they are assumed to be the currently authenticated ones. The changes cover HLRC changes, documentation for the API changes. Closes #40031	2019-08-28 00:44:23 +10:00
Dimitris Athanasiou	dd6c13fdf9	[ML] Add description to DF analytics (#45774 ) (#46019 )	2019-08-27 15:48:59 +03:00
Albert Zaharovits	1ebee5bf9b	PKI realm authentication delegation (#45906 ) This commit introduces PKI realm delegation. This feature supports the PKI authentication feature in Kibana. In essence, this creates a new API endpoint which Kibana must call to authenticate clients that use certificates in their TLS connection to Kibana. The API call passes to Elasticsearch the client's certificate chain. The response contains an access token to be further used to authenticate as the client. The client's certificates are validated by the PKI realms that have been explicitly configured to permit certificates from the proxy (Kibana). The user calling the delegation API must have the delegate_pki privilege. Closes #34396	2019-08-27 14:42:46 +03:00
Zachary Tong	943a016bb2	Add Cumulative Cardinality agg (and Data Science plugin) (#45990 ) This adds a pipeline aggregation that calculates the cumulative cardinality of a field. It does this by iteratively merging in the HLL sketch from consecutive buckets and emitting the cardinality up to that point. This is useful for things like finding the total "new" users that have visited a website (as opposed to "repeat" visitors). This is a Basic+ aggregation and adds a new Data Science plugin to house it and future advanced analytics/data science aggregations.	2019-08-26 16:19:55 -04:00
Benjamin Trent	a3a4ae0ac2	[ML] fixing bug where analytics process starts with 0 rows (#45879 ) (#45988 ) The native process requires that there be a non-zero number of rows to analyze. If the flag --rows 0 is passed to the executable, it throws and does not start. When building the configuration for the process we should not start the native process if there are no rows. Adding some logging to indicate what is occurring.	2019-08-26 14:18:17 -05:00
Benjamin Trent	d64018f8e1	[ML] add supported types to no fields error message (#45926 ) (#45987 ) * [ML] add supported types to no fields error message * adding supported types to logger debug	2019-08-26 14:18:00 -05:00
Jake Landis	767f648f8e	Watcher add email warning if CSV attachment contains formulas (#44460 ) (#45557 ) * Watcher add email warning if CSV attachment contains formulas (#44460) This commit introduces a Warning message to the emails generated by Watcher's reporting action. This change complements Kibana's CSV formula notifications (see elastic/kibana#37930). This is implemented by reading a header (kbn-csv-contains-formulas) provided by Kibana to notify to attach the Warning to the email. The wording of the warning is borrowed from Kibana's UI and may be overridden by a dynamic setting xpack.notification.reporting.warning.kbn-csv-contains-formulas.text. This warning is enabled by default, but may be disabled via a dynamic setting xpack.notification.reporting.warning.enabled.	2019-08-26 08:35:33 -05:00
Jake Landis	f2241a152f	watcher tests - increase stop timeout to 60s (#45679 ) (#45934 ) As of #43939 Watcher tests now correctly block until all Watch executions kicked off by that test are finished. Prior we allowed tests to finish with outstanding watch executions. It was known that this would increase the time needed to finish a test. However, running the tests on CI can be slow and on at least 1 occasion it took 60s to actually finish. This PR simply increases the max allowable timeout for Watcher tests to clean up after themselves.	2019-08-26 08:34:54 -05:00
Andrey Ershov	479ab9b8db	Fix plaintext on TLS port logging (#45852 ) Today if non-TLS record is received on TLS port generic exception will be logged with the stack-trace. SSLExceptionHelper.isNotSslRecordException method does not work because it's assuming that NonSslRecordException would be top-level. This commit addresses the issue and the log would be more concise. (cherry picked from commit 6b83527bf0c23d4d5b97fab7f290c43432945d4f)	2019-08-26 12:32:35 +02:00
Ioannis Kakavas	2bee27dd54	Allow Transport Actions to indicate authN realm (#45946 ) This commit allows the Transport Actions for the SSO realms to indicate the realm that should be used to authenticate the constructed AuthenticationToken. This is useful in the case that many authentication realms of the same type have been configured and where the caller of the API(Kibana or a custom web app) already know which realm should be used so there is no need to iterate all the realms of the same type. The realm parameter is added in the relevant REST APIs as optional so as not to introduce any breaking change.	2019-08-25 19:36:41 +03:00
Jason Tedor	040a810b3c	Add deprecation check for pidfile setting (#45939 ) The pidfile setting is deprecated. This commit adds a deprecation check for usage of this setting.	2019-08-24 17:19:20 -04:00
Jason Tedor	43ca652d11	Add deprecation check for processors (#45925 ) The processors setting is deprecated. This commit adds a deprecation check for the use of the processors setting.	2019-08-23 20:16:40 -04:00
Jason Tedor	6b116a48f3	Skip feature aware check on JDK 14 (#45928 ) ASM can not currently handle classes compiled with JDK 14. This commit skips these checks on JDK 14, for now.	2019-08-23 17:38:15 -04:00
Dimitris Athanasiou	be554fe5f0	[7.x][ML] Improve progress reportings for DF analytics (#45856 ) (#45910 ) Previously, the stats API reports a progress percentage for DF analytics tasks that are running and are in the `reindexing` or `analyzing` state. This means that when the task is `stopped` there is no progress reported. Thus, one cannot distinguish between a task that never run to one that completed. In addition, there are blind spots in the progress reporting. In particular, we do not account for when data is loaded into the process. We also do not account for when results are written. This commit addresses the above issues. It changes progress to being a list of objects, each one describing the phase and its progress as a percentage. We currently have 4 phases: reindexing, loading_data, analyzing, writing_results. When the task stops, progress is persisted as a document in the state index. The stats API now reports progress from in-memory if the task is running, or returns the persisted document (if there is one).	2019-08-23 23:04:39 +03:00
Benjamin Trent	b756e1b9be	[ML][Transforms] adjusting when and what to audit (#45876 ) (#45916 ) * [ML][Transforms] adjusting when and what to audit * Update DataFrameTransformTask.java * removing unnecessary audit message	2019-08-23 13:53:02 -05:00

1 2 3 4 5 ...

3228 Commits