OpenSearch

mirror of https://github.com/honeymoose/OpenSearch.git synced 2025-02-18 19:05:06 +00:00

Author	SHA1	Message	Date
Jason Tedor	52b97ec539	Allow setting validation against arbitrary types (#47264 ) Today when settings validate, they can only validate against settings that are of the same type. While this strong-type is convenient from a development perspective, it is too limiting in that some settings need to validate against settings of a different type. For example, the list setting xpack.monitoring.exporters.<namespace>.host wants to validate that it is non-empty if and only if the string setting xpack.monitoring.exporters.<namespace>.type is "http". Today this is impossible since the settings validation framework only allows that setting to validate against other list settings. This commit increases the flexibility here to validate against settings of arbitrary type, at the expense of losing strong-typing during development.	2019-10-02 16:31:06 -04:00
Lee Hinman	2e3eb4b24e	Add API to execute SLM retention on-demand (#47405 ) (#47463 ) * Add API to execute SLM retention on-demand (#47405) This is a backport of #47405 This commit adds the `/_slm/_execute_retention` API endpoint. This endpoint kicks off SLM retention and then returns immediately. This in particular allows us to run retention without scheduling it (for entirely manual invocation) or perform a one-off cleanup. This commit also includes HLRC for the new API, and fixes an issue in SLMSnapshotBlockingIntegTests where retention invoked prior to the test completing could resurrect an index the internal test cluster cleanup had already deleted. Resolves #46508 Relates to #43663	2019-10-02 12:29:04 -06:00
Mark Vieira	6acc5ca8d1	Remove groovy test code from buildSrc (#47416 )	2019-10-02 11:05:04 -07:00
Lee Hinman	013d87d716	Fix AllocationRoutedStepTests.testConditionMetOnlyOneCopyAlloc… (#47313 ) * Fix AllocationRoutedStepTests.testConditionMetOnlyOneCopyAllocated These tests were using randomly generated includes/excludes/requires for routing, however, it was possible to generate mutually exclusive allocation settings (about 1 out of 50,000 times for my runs). This splits the test into three different tests, and removes the randomization (it doesn't add anything to the testing here) to fix the issue. Resolves #47142	2019-10-02 10:01:23 -06:00
Jim Ferenczi	c340814b34	Fix highlighting of overlapping terms in the unified highlighter (#47227 ) The passage formatter that the unified highlighter uses doesn't handle terms with overlapping offsets. For tokenizers that provide multiple segmentations of the same term (edge ngram, for instance), the formatter should select the largest span in order to highlight the term only once. This change implements this logic.	2019-10-02 16:34:12 +02:00
Ioannis Kakavas	4f722f0f53	Fix Active Directory tests (#47358 ) (#47440 ) Fixes multiple Active Directory related tests that run against the samba fixture. Some were failing since we changed the realm settings format in 7.0 and a few were slightly broken in other ways. We can move to cleanup the tests in a follow up but this work fits better to be done with or after we move the tests from a Samba based fixture to a real(-ish) Microsoft Active Directory based fixture. Resolves: #33425, #35738	2019-10-02 17:18:12 +03:00
Benjamin Trent	2228a7dd8d	[ML][Inference] adding ensemble model objects (#47241 ) (#47438 ) * [ML][Inference] adding ensemble model objects * addressing PR comments * Update TreeTests.java * addressing PR comments * fixing test	2019-10-02 09:49:46 -04:00
Dimitris Athanasiou	b9541eb3af	[7.x][ML] Make PUT data frame analytics action a master node action (… (#47433 ) While it seemed like the PUT data frame analytics action did not have to be a master node action as the config is stored in an index rather than the cluster state, there are other subtle nuances which make it worthwhile to convert it. In particular, it helps maintain order of execution for put actions which are anyhow user driven and are expected to have low volume. This commit converts `TransportPutDataFrameAnalyticsAction` from a handled transport action to a master node action. Note this means that the action might fail in a mixed cluster but as the API is still experimental and not widely used there will be few moments more suitable to make this change than now.	2019-10-02 16:24:21 +03:00
Yannick Welsch	f7980e9745	Adapt version constants after backport (#47353 )	2019-10-02 14:26:23 +02:00
Yannick Welsch	99d2fe295d	Use optype CREATE for single auto-id index requests (#47353 ) Changes auto-id index requests to use optype CREATE, making it compliant with our docs. This will also make these auto-id index requests compatible with the new "create-doc" index privilege (which is based on the optype). The default optype is changed to create, just as it is already documented.	2019-10-02 14:16:52 +02:00
Yannick Welsch	0024695dd8	Disallow externally generated autoGeneratedTimestamp (#47341 ) The autoGeneratedTimestamp field is internally used to speed up indexing of operations with auto-ids, as we can rule out duplicates. Setting this field externally can make the index inconsistent, resulting in duplicate documents with same id.	2019-10-02 14:16:52 +02:00
Yannick Welsch	8c11fe610e	Use standard semantics for retried auto-id requests (#47311 ) Adds support for handling auto-id requests with optype CREATE. Also simplifies the code handling this by using the standard indexing path when dealing with possible retry conflicts. Relates #47169	2019-10-02 14:16:52 +02:00
Yannick Welsch	7b2613db55	Allow optype CREATE for append-only indexing operations (#47169 ) Bulk requests currently do not allow adding "create" actions with auto-generated IDs. This commit allows using the optype CREATE for append-only indexing operations. This is mainly the user facing aspect of it.	2019-10-02 14:16:52 +02:00
Alpar Torok	a032f9b2d5	Backport testclusters fix bwc (#47363 ) * Add support for bwc for testclusters and convert full cluster restart (#45374) * Testclusters fix bwc (#46740) Additions to make testclusters work with later versions of ES * Do common node config on bwc tests Before this PR we only ever ran `ElasticsearchCluster.start` once, and the common node config was never done. This becomes apparent in upgrading from `6.x` to `7.x` as the new config is missing, preventing the cluster from starting. * Fix logic to pick up snapshot from 6.x * Make sure ports are cleared * Fix test * Don't clear all the config as we rely on it * Fix removal of keys	2019-10-02 14:37:00 +03:00
Henning Andersen	42453aec96	Fix XPackPlugin usages in tests (#47252 ) XPackPlugin holds data in statics and can only be initialized once. This caused tests to fail primarily when running with a low max-workers. Replaced usages with the LocalStateCompositeXPackPlugin, which handles this properly for testing.	2019-10-02 12:36:02 +02:00
Henning Andersen	b5a2afccb2	MockSearchService concurrency fix (#47139 ) Fixed MockSearchService concurrency, assertNoInFlightContext could have false negative result (rarely). Split out from #46060 Closes #47048	2019-10-02 12:33:18 +02:00
Alan Woodward	697c693ee7	Reset Token position on reuse in scripted analysis (#47424 ) Most of the information in AnalysisPredicateScript.Token is pulled directly from its underlying AttributeSource, but we also keep track of the token position, and this state is held directly on the Token. This information needs to be reset when the containing ScriptFilteringTokenFilter or ScriptedConditionTokenFilter is re-used. Fixes #47197	2019-10-02 11:27:04 +01:00
David Roberts	4379a3c52b	[ML] Throttle the delete-by-query of expired results (#47177 ) Due to #47003 many clusters will have built up a large backlog of expired results. On upgrading to a version where that bug is fixed, users could find that the first ML daily maintenance task deletes a very large number of documents. This change introduces throttling to the delete-by-query that the ML daily maintenance uses to delete expired results, limiting it to deleting an average of 200 documents per second. (There is no throttling for state/forecast documents as these are expected to be lower volume.) Additionally a rough time limit of 8 hours is applied to the whole delete expired data action. (This is only rough as it won't stop part way through a single operation - it only checks the timeout between operations.) Relates #47103	2019-10-02 11:16:34 +01:00
Jim Ferenczi	42c5054e52	Fix alias field resolution in match query (#47369 ) Synonym queries (when two tokens/paths start at the same position) use the alias field instead of the concrete field to build Lucene queries. This commit fixes this bug by resolving the alias field upfront in order to provide the concrete field to the actual query parser.	2019-10-02 11:45:43 +02:00
István Zoltán Szabó	033aa9cf9b	[DOCS] Adds examples to the PUT dfa and the evaluate dfa APIs (#46966 ) * [DOCS] Adds examples to the PUT dfa and the evaluate dfa APIs. * [DOCS] Removes extra lines from examples. * Update docs/reference/ml/df-analytics/apis/evaluate-dfanalytics.asciidoc Co-Authored-By: Lisa Cawley <lcawley@elastic.co> * Update docs/reference/ml/df-analytics/apis/put-dfanalytics.asciidoc Co-Authored-By: Lisa Cawley <lcawley@elastic.co> * [DOCS] Explains examples.	2019-10-02 10:33:45 +02:00
David Turner	7739938930	Clarify that you cannot abort an upgrade (#47342 ) We do mention that rolling back an upgrade requires a restore from a snapshot, but it's hidden at the bottom of the "preparing to upgrade" instructions on a different page from the actual upgrade instructions. This commit duplicates the preparatory instructions onto the pages containing the actual upgrade instructions and rewords the point about rollbacks a bit.	2019-10-02 09:29:10 +01:00
Robin Clarke	98c1c2f650	Clearer language around upgrade sequence (#47422 )	2019-10-02 09:25:50 +01:00
István Zoltán Szabó	1cecbd0cd3	[DOCS] Fine tunes update anomaly detection job API documentation (#47280 ) * [DOCS] Fine tunes update anomaly detection job API documentation. * [DOCS] Removes delimiter to fix the table.	2019-10-02 10:06:49 +02:00
István Zoltán Szabó	6a9f04ee76	[DOCS] Fixes typos in the PUT dfa and the evaluate dfa documentation. (#47348 )	2019-10-02 09:52:29 +02:00
Dimitris Athanasiou	36884a3c32	[7.x][ML] Restore analytics state if available (#47128 ) (#47393 ) This commit restores the model state if available in data frame analytics jobs. In addition, this changes the start API so that a stopped job can be restarted. As we now store the progress in the state index when the task is stopped, we can use it to determine what state the job was in when it got stopped. Note that in order to be able to distinguish between a job that runs for the first time and another that is restarting, we ensure reindexing progress is reported to be at least 1 for a running task.	2019-10-02 10:24:05 +03:00
Ryan Ernst	bd5f64848e	Clarify missing java error message (#46160 ) Since the bundled jdk was added to Elasticsearch, there are now 2 ways java can be missing. Either JAVA_HOME is set but does not exist, or the bundled jdk does not exist. This commit improves the error messages in those two cases, and also ensures our tests cover both cases.	2019-10-01 22:10:19 -07:00
Nhat Nguyen	5cfcd7c458	Re-fetch shard info of primary when new node joins (#47035 ) Today, we don't clear the shard info of the primary shard when a new node joins, so we risk making replica allocation decisions based on stale information from the primary. The serious problem is that we can cancel the current recovery, which is more advanced than the copy on the new node, due to the old info we have from the primary. With this change, we ensure the shard info from the primary is not older than any node when allocating replicas. Relates #46959 This work was done by Henning in #42518. Co-authored-by: Henning Andersen <henning.andersen@elastic.co>	2019-10-01 22:16:26 -04:00
Mark Vieira	ff15495b98	Remove empty buildSrc subproject (#47415 )	2019-10-01 16:34:31 -07:00
James Rodewig	079bf887c0	[DOCS] Reorder index APIs alphabetically (#46981 ) (#47402 )	2019-10-01 17:07:28 -04:00
Gordon Brown	ba6ee2d40d	[7.x] Adjust randomization in cluster shard limit tests (#47254 ) This commit adjusts randomization for the cluster shard limit tests so that there is often more of a gap left between the limit and the size of the first index. This allows the same randomization to be used for all tests, and alleviates flakiness in `testIndexCreationOverLimitFromTemplate`.	2019-10-01 14:53:10 -06:00
David Turner	99b25d3740	Keep nodes above watermark in testAutomaticReleaseOfIndexBlock (#47387 ) Today the comment boldly claims that this line of code keeps nodes above the 10-byte low watermark when in fact this is not true at all. This change fixes this so that it really does keep nodes above the low watermark. Fixes #45338. Again.	2019-10-01 19:58:23 +01:00
James Rodewig	0179f93544	[DOCS] Reformat simulate pipeline API (#47301 ) (#47398 )	2019-10-01 14:49:14 -04:00
James Rodewig	aeb4edce3a	[DOCS] Reformat put pipeline API (#47171 ) (#47395 )	2019-10-01 14:48:18 -04:00
Benjamin Trent	f5fe5e7cd6	[7.x] [ML][Inference] Adding preprocessors to definition object (#47320 ) (#47370 ) * [ML][Inference] Adding preprocessors to definition object (#47320) * [ML][Inference] Adding preprocessors to definition object * Update TrainedModelConfig.java * adjusting for backport	2019-10-01 13:31:25 -04:00
lcawl	66116e39ba	[DOCS] Edits ML release notes	2019-10-01 10:15:06 -07:00
Armin Braun	3d6ef6a90e	Speed up and Reorder Snapshot Delete Operations (#47293 ) (#47350 ) This is a preliminary of #46250 making the snapshot delete work by doing all the metadata updates first and then bulk deleting all of the now unreferenced blobs. Before this change, the metadata updates for each shard and subsequent deletion of the blobs that have become unreferenced due to the delete would happen sequentially shard-by-shard, parallelising only over all the indices in the snapshot. This change makes it so that all the metadata updates happen in parallel on a shard level first. Once all of the updates of shard-level metadata have finished, all the now unreferenced blobs are deleted in bulk. This has two benefits (outside of making #46250 a smaller change): * We have a lower likelihood of failing to update shard level metadata because it happens with priority and a higher degree of parallelism * Deleting of unreferenced data in the shards should go much faster in many cases (rolling indices, large number of indices with many unchanged shards) as well because a number of small bulk deletions (just two blobs for `index-N` and `snap-` for each unchanged shard) are grouped into larger bulk deletes of `100-1000` blobs depending on Cloud provider (even though the final bulk deletes are happening sequentially this should be much faster in almost all cases as you'd need parallelism of 50 (GCS) to 500 (S3) snapshot threads to achieve the same delete rates when deleting from unchanged shards).	2019-10-01 19:05:43 +02:00
James Rodewig	e70220857d	[DOCS] Document cat tasks API (#47321 ) (#47375 )	2019-10-01 12:22:50 -04:00
Colin Goodheart-Smithe	c93b39c65b	Adds version 7.4.1	2019-10-01 16:03:11 +01:00
Mark Tozzi	5bdf25320a	Documentation notes for Range field histograms (#46890 ) (#47366 )	2019-10-01 10:58:44 -04:00
Lisa Cawley	5ba543fd6c	[DOCS] Adds machine learning PRs to release notes (#47316 )	2019-10-01 10:17:41 -04:00
James Rodewig	2ca075dee4	[DOCS] Remove coming tags for 7.4.0 release (#47318 )	2019-10-01 10:17:36 -04:00
Albert Zaharovits	78558a7b2f	Fix AD realm additional metadata (#47179 ) Due to a regression bug the metadata Active Directory realm setting is ignored (it works correctly for the LDAP realm type). This commit redresses it. Closes #45848	2019-10-01 17:05:25 +03:00
Marios Trivyzas	f792dbf239	SQL: Implement DATE_PART function (#47206 ) DATE_PART(<datetime unit>, <date/datetime>) is a function that allows the user to extract the specified unit from a date/datetime field, similar to EXTRACT (<datetime unit> FROM <date/datetime>) but with different names and aliases for the units. It also provides more options, like `DATE_PART('tzoffset', datetimeField)`. Implemented following the SQL Server spec: https://docs.microsoft.com/en-us/sql/t-sql/functions/datepart-transact-sql?view=sql-server-2017 with the difference that the <datetime unit> argument is either a literal single quoted string or gets a value from a table field, whereas in SQL Server keywords are used (unquoted identifiers) and it's not possible to use a value coming from a table column. Closes: #46372 (cherry picked from commit ead743d3579eb753fd314d4a58fae205e465d72e)	2019-10-01 16:28:27 +03:00
Benjamin Trent	4335e07716	[7.x] [ML][Inference] adding .ml-inference* index and storage (#47267 ) (#47310 ) * [ML][Inference] adding .ml-inference* index and storage (#47267) * [ML][Inference] adding .ml-inference* index and storage * Addressing PR comments * Allowing null definition, adding validation tests for model config * fixing line length * adjusting for backport	2019-10-01 08:20:33 -04:00
Tanguy Leroux	c43e932a0c	Fix CharArraysTests.testConstantTimeEquals() (#47346 ) The change #47238 fixed a first issue (#47076) but introduced another one that can be reproduced using: org.elasticsearch.common.CharArraysTests > testConstantTimeEquals FAILED java.lang.StringIndexOutOfBoundsException: String index out of range: 1 at __randomizedtesting.SeedInfo.seed([DFCA64FE2C786BE3:ED987E883715C63B]:0) at java.lang.String.substring(String.java:1963) at org.elasticsearch.common.CharArraysTests.testConstantTimeEquals(CharArraysTests.java:74) REPRODUCE WITH: ./gradlew ':libs:elasticsearch-core:test' --tests "org.elasticsearch.common.CharArraysTests.testConstantTimeEquals" -Dtests.seed=DFCA64FE2C786BE3 -Dtests.security.manager=true -Dtests.locale=fr-CA -Dtests.timezone=Pacific/Johnston -Dcompiler.java=12 -Druntime.java=8 that happens when the first randomized string has a length of 0.	2019-10-01 12:49:15 +02:00
Ioannis Kakavas	3b06916fcd	Revert "Fix Active Directory tests (#47266 )" This reverts commit 7d9c06421866843cf6b9c25065b591f0ba0a0cc9.	2019-10-01 13:32:31 +03:00
Howard	a9cd42c05d	Cancel recoveries even if all shards assigned (#46520 ) We cancel ongoing peer recoveries if a node joins the cluster with a completely up-to-date copy of a shard, because we can use such a copy to recover a replica instantly. However, today we only look for recoveries to cancel while there are unassigned shards in the cluster. This means that we do not contemplate the cancellation of the last few recoveries since recovering shards are not unassigned. It might take much longer for these recoveries to complete than would be necessary if they were cancelled. This commit fixes this by checking for cancellable recoveries even if all shards are assigned.	2019-10-01 10:55:32 +01:00
Ignacio Vera	03d717dc32	Provide better error when updating geo_shape field mapper settings (#47281 ) (#47338 )	2019-10-01 10:52:39 +02:00
Ioannis Kakavas	7d9c064218	Fix Active Directory tests (#47266 ) Fixes multiple Active Directory related tests that run against the samba fixture. Some were failing since we changed the realm settings format in 7.0 and a few were slightly broken in other ways. We can move to cleanup the tests in a follow up but this work fits better to be done with or after we move the tests from a Samba based fixture to a real(-ish) Microsoft Active Directory based fixture. Resolves: #33425, #35738	2019-10-01 10:52:07 +03:00
Tanguy Leroux	f5c5411fe8	Differentiate base paths in repository integration tests (#47284 ) (#47300 ) This commit changes the repository base paths used in Azure/S3/GCS integration tests so that they don't conflict with each other when tests run in parallel on real storage services. Closes #47202	2019-10-01 08:39:55 +02:00
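
The message for commit c340814b34 (#47227) in the log above describes selecting the largest span when a tokenizer emits several overlapping offsets for the same term, so that the term is highlighted only once. A minimal Python sketch of that idea follows. It is an illustration only, not the Java passage formatter from the commit: the names `merge_overlapping_spans` and `highlight`, the `(start, end)` tuple representation, and the `<em>` tags are all assumptions made for this example. Overlapping spans are unioned here, which for edge n-grams sharing a start offset is the same as keeping the largest span.

```python
from typing import List, Tuple

# Hypothetical representation: (start_offset, end_offset), end exclusive.
Span = Tuple[int, int]


def merge_overlapping_spans(spans: List[Span]) -> List[Span]:
    """Collapse overlapping spans so each region is covered exactly once."""
    merged: List[Span] = []
    # Sort by start offset; for equal starts, put the longest span first.
    for start, end in sorted(spans, key=lambda s: (s[0], -s[1])):
        if merged and start < merged[-1][1]:
            # Overlaps the previous span: extend it instead of emitting
            # a second, nested highlight.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def highlight(text: str, spans: List[Span],
              pre: str = "<em>", post: str = "</em>") -> str:
    """Wrap each merged span of `text` in pre/post tags."""
    out = []
    pos = 0
    for start, end in merge_overlapping_spans(spans):
        out.append(text[pos:start])               # untouched text before the span
        out.append(pre + text[start:end] + post)  # highlighted span
        pos = end
    out.append(text[pos:])                        # trailing text
    return "".join(out)
```

For example, edge n-gram offsets `(0, 1)`, `(0, 2)`, `(0, 3)` on the text `quick fox` collapse to the single span `(0, 3)`, so only one pair of tags is emitted.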


47995 Commits