OpenSearch

Author	SHA1	Message	Date
Jake Landis	c320b499a0	Prevent deadlock by using separate schedulers (#48697 ) (#48964 ) Currently the BulkProcessor class uses a single scheduler to schedule flushes and retries. Functionally these are very different concerns but can result in a deadlock. Specifically, the single shared scheduler can kick off a flush task, which only finishes its task when the bulk that is being flushed finishes. If (for whatever reason) any item in that bulk fails, it will (by default) schedule a retry. However, that retry will never run its task, since the flush task is consuming the one and only thread available from the shared scheduler. Since the BulkProcessor is mostly client based code, the client can provide their own scheduler. As-is the scheduler would require at minimum 2 worker threads to avoid the potential deadlock. Since the number of threads is a configuration option in the scheduler, the code cannot enforce this 2 worker rule until runtime. For this reason this commit splits the single task scheduler into 2 schedulers. This eliminates the potential for the flush task to block the retry task and removes this deadlock scenario. This commit also deprecates the Java APIs that presume a single scheduler, and updates any internal code to no longer use those APIs. Fixes #47599 Note - #41451 fixed the general case where a bulk fails and is retried that can result in a deadlock. This fix should address that case as well as the case when a bulk failure from the flush needs to be retried.	2019-11-11 16:31:21 -06:00
Jason Tedor	acae07113f	Fix names of UBI-based Docker build contexts This commit fixes the names of the UBI-based Docker build contexts to lift the ubi component of the name into the archive base name, instead of the classifier.	2019-11-11 15:43:53 -05:00
Benjamin Trent	46ab1db54f	[7.x] [ML] Add new geo_results.(actual_point|typical_point) fields for `lat_long` results (#47050 ) (#48958 ) * [ML] Add new geo_results.(actual_point|typical_point) fields for `lat_long` results (#47050) [ML] Add new geo_results.(actual_point|typical_point) fields for `lat_long` results (#47050) Related PR: https://github.com/elastic/ml-cpp/pull/809 * adjusting bwc version	2019-11-11 15:43:03 -05:00
Mark Vieira	8acbd0aa2a	Ensure client jar projects generate correct POM artifacts (#48961 )	2019-11-11 12:25:14 -08:00
Mark Tozzi	d9e569278f	Refactor and DRY up Kahan Sum algorithm (#48558 ) (#48959 )	2019-11-11 15:09:19 -05:00
Armin Braun	c45470f84f	Fix ShardGenerations in RepositoryData in BwC Case (#48920 ) (#48947 ) We were tripping the assertion that makes sure we only have empty `ShardGenerations` in `RepositoryData` in the BwC case because shard generations were passed to the `Repository` in the BwC case. Fixed by only generating empty shard gen for BwC snapshots in `SnapshotsService`.	2019-11-11 18:02:53 +01:00
Jake Landis	909fbd0015	[7.x] Mute FullClusterRestartTest#testWatcher and 30s timeout… (#48850 ) The timeout was increased to 60s to allow this test more time to reach a yellow state. However, the test will still on occasion fail even with the 60s timeout. Related: #48381 Related: #48434 Related: #47950 Related: #40178	2019-11-11 09:38:14 -06:00
Christoph Büscher	6119f0aaa2	Fix Eclipse compilation in DataFrameDataExtractorTests (#48942 )	2019-11-11 16:17:55 +01:00
Martijn van Groningen	a1dd830cb5	Re-enabled test with longer timeout waiting for monitoring. See #48258	2019-11-11 16:07:50 +01:00
István Zoltán Szabó	c2f52015d3	[DOCS] Removes best practice about fields that are highly correlated to the dependent variable. (#48935 )	2019-11-11 16:01:21 +01:00
István Zoltán Szabó	91888959e8	[DOCS] Extends analyzed_fields description in PUT DFA API docs. (#48307 )	2019-11-11 15:55:12 +01:00
Patrick Maynard	4b85498617	[DOCS] Fix typo in search type docs (#48868 )	2019-11-11 09:38:48 -05:00
Rory Hunter	014e1b1090	Improve resiliency to auto-formatting in server (#48940 ) Backport of #48450. Make a number of changes so that code in the `server` directory is more resilient to automatic formatting. This covers: * Reformatting multiline JSON to embed whitespace in the strings * Move some comments around so they aren't auto-formatted to a strange place. This also required moving some `&&` and `||` operators from the end-of-line to start-of-line. * Add helper method `reformatJson()`, to strip whitespace from a JSON document using XContent methods. This is sometimes necessary where a test is comparing some machine-generated JSON with an expected value. Also, `HyperLogLogPlusPlus.java` is now excluded from formatting because it contains large data tables that don't reformat well with the current settings, and changing the settings would be worse for the rest of the codebase.	2019-11-11 14:33:04 +00:00
James Rodewig	dd92830801	[DOCS] Reformat condition token filter (#48775 )	2019-11-11 08:49:44 -05:00
Rafael Acevedo	eb0d8f3383	update gradle to 5.6.4 (#48872 )	2019-11-11 15:30:31 +02:00
Rory Hunter	35e21f85f3	Reenable Docker tests again (#48936 ) Backport of #48898. We no longer configure distributions for prior versions for Docker. This is because doing so prompts Gradle to try and resolve the Docker dependencies, which doesn't work as they can't be downloaded via Ivy (configured in DistributionDownloadPlugin). Since we need these for the BATS upgrade tests, and those tests only cover .rpm and .deb, it's OK to omit creating such distributions in the first place. We may need to revisit this in the future, to allow upgrade testing using Docker containers.	2019-11-11 11:43:32 +00:00
Alpar Torok	e33a1b7942	Add links to infra-stats for scans generated in CI (#48732 ) * Add links to infra-stats for scans generated in CI It turns out we already gather system logs in infra-stats, and we have system metrics too there. This PR adds a link to the logs we gather for the host the build is running on. And a link to the host overview in the infrastructure app tuned to 5 minutes from before gradle started to 5 minutes after the scan was generated. * add buildFinished	2019-11-11 11:24:09 +02:00
Arne Welzel	f642baa9fb	[DOCS] Remove extra "when" (#48926 )	2019-11-11 10:11:02 +01:00
Yannick Welsch	87862868c6	Allow realtime get to read from translog (#48843 ) The realtime GET API currently has erratic performance in case where a document is accessed that has just been indexed but not refreshed yet, as the implementation will currently force an internal refresh in that case. Refreshing can be an expensive operation, and also will block the thread that executes the GET operation, blocking other GETs to be processed. In case of frequent access of recently indexed documents, this can lead to a refresh storm and terrible GET performance. While older versions of Elasticsearch (2.x and older) did not trigger refreshes and instead opted to read from the translog in case of realtime GET API or update API, this was removed in 5.0 (#20102) to avoid inconsistencies between values that were returned from the translog and those returned by the index. This was partially reverted in 6.3 (#29264) to allow _update and upsert to read from the translog again as it was easier to guarantee consistency for these, and also brought back more predictable performance characteristics of this API. Calls to the realtime GET API, however, would still always do a refresh if necessary to return consistent results. This means that users that were calling realtime GET APIs to coordinate updates on client side (realtime GET + CAS for conditional index of updated doc) would still see very erratic performance. This PR (together with #48707) resolves the inconsistencies between reading from translog and index. In particular it fixes the inconsistencies that happen when requesting stored fields, which were not available when reading from translog. In case where stored fields are requested, this PR will reparse the _source from the translog and derive the stored fields to be returned. With this, it changes the realtime GET API to allow reading from the translog again, avoid refresh storms and blocking the GET threadpool, and provide overall much better and predictable performance for this API.	2019-11-09 17:47:50 +01:00
Nhat Nguyen	ff6c121eb9	Closed shard should never open new engine (#47186 ) We should not open new engines if a shard is closed. We break this assumption in #45263 where we stop verifying the shard state before creating an engine but only before swapping the engine reference. We can fail to snapshot the store metadata or checkIndex a closed shard if there's some IndexWriter holding the index lock. Closes #47060	2019-11-08 23:40:34 -05:00
Nhat Nguyen	9a42e71dd9	Do not cancel recovery for copy on broken node (#48265 ) This change fixes a poisonous situation where an ongoing recovery was canceled because a better copy was found on a node that the cluster had previously tried allocating the shard to but failed. The solution is to keep track of the set of nodes that an allocation was failed on so that we can avoid canceling the current recovery for a copy on failed nodes. Closes #47974	2019-11-08 23:10:47 -05:00
Julian Simioni	5e4501eb3f	[Docs] Consolidate single example into a single line (#48904 ) The first example of splitting rules for the `word_delimiter` token filter was spread across two bullet points. This makes it look like they are two separate splitting rules.	2019-11-08 15:12:45 -05:00
Yannick Welsch	af887be3e5	Hide orphaned tasks from follower stats (#48901 ) CCR follower stats can return information for persistent tasks that are in the process of being cleaned up. This is problematic for tests where CCR follower indices have been deleted, but their persistent follower task is only cleaned up asynchronously afterwards. If one of the following tests then accesses the follower stats, it might still get the stats for that follower task. In addition, some tests were not cleaning up their auto-follow patterns, leaving orphaned patterns behind. Other tests cleaned up their auto-follow patterns. Because the same name was always used, it depended on the test execution order whether this led to a failure or not. This commit fixes the offending tests, and will also automatically remove auto-follow-patterns at the end of tests, like we do for many other features. Closes #48700	2019-11-08 13:56:53 +01:00
Henning Andersen	8835142ac9	Grok processor ignore case test (#48909 ) Added test demonstrating that grok using ignore case works, since this does a minimal test that the `joni` and `jcodings` libraries are compatible. Forward-port of test from #43334	2019-11-08 00:04:29 +01:00
bellengao	bdc7057d58	[DOCS] Correct typo in split index API docs (#48894 )	2019-11-07 15:27:27 -05:00
Tanguy Leroux	8a14ea5567	Add docker-composed based test fixture for GCS (#48902 ) Similarly to what has been done for Azure in #48636, this commit adds a new :test:fixtures:gcs-fixture project which provides two docker-compose based fixtures that emulate a Google Cloud Storage service. Some code has been extracted from existing tests and placed into this new project so that it can be easily reused in other projects.	2019-11-07 13:27:22 -05:00
bellengao	293902c6a5	[DOCS] Fix shard type in CCR overview doc (#48882 ) Closes #48875	2019-11-07 10:09:45 -05:00
Rory Hunter	df16ff777e	Disable docker packaging tests again (#48896 ) Backport of #48883. Per elastic/infra#15864, the Elasticsearch CI images are failing due to a packer_cache failure. This is because Gradle is trying to resolve a `.docker` file through the Ivy repository, which doesn't work. Disable the Docker tests again until we figure out the way forward.	2019-11-07 14:28:33 +00:00
Dan Hermann	5805560a2a	Validate index name time format setting at parse time (#47911 ) (#48881 )	2019-11-07 05:24:49 -06:00
Dimitris Athanasiou	dfc6a13b44	[7.x][ML] Handle nested arrays in source fields (#48885 ) (#48889 ) Backport of #48885	2019-11-07 07:30:50 +02:00
Adrien Grand	3b9ce0a4f3	Elasticsearch 7.5 is on Lucene 8.3. (#48831 )	2019-11-06 10:13:09 -05:00
Tanguy Leroux	552381d7f9	Add mention to Pause Auto-Follower API in Upgrade Clusters docs (#48764 ) Relates #46665	2019-11-06 09:48:44 -05:00
István Zoltán Szabó	3c9bd13dca	[DOCS] Adds classification type DFA API docs and ml-shared.asciidoc (#48241 )	2019-11-06 07:41:38 -05:00
István Zoltán Szabó	70765dfb05	[DOCS] Adds classification type evaluation docs to the DFA evaluation API (#47657 )	2019-11-06 07:38:33 -05:00
James Rodewig	f1396b6322	[DOCS] Add Java to list of HTTP client libraries for basic authentication (#48647 )	2019-11-05 17:09:10 -05:00
David Turner	bd5c6c4779	Add preflight check to dynamic mapping updates (#48867 ) Today if the primary discovers that an indexing request needs a mapping update then it will send it to the master for validation and processing. If, however, the put-mapping request is invalid then the master still processes it as a (no-op) cluster state update. When there are a large number of indexing operations that result in invalid mapping updates this can overwhelm the master. However, the primary already has a reasonably up-to-date mapping against which it can check the (approximate) validity of the put-mapping request before sending it to the master. For instance it is not possible to remove fields in a mapping update, so if the primary detects that a mapping update will exceed the fields limit then it can reject it itself and avoid bothering the master. This commit adds a pre-flight check to the mapping update path so that the primary can discard obviously-invalid put-mapping requests itself. Fixes #35564 Backport of #48817	2019-11-05 18:08:22 +01:00
Rory Hunter	24f7d4e83b	Add Docker packaging tests on 7.x (#48857 ) Backport of #46599 and #47640. Add packaging tests for Docker. * Introduce packaging tests for Docker (#46599) Closes #37617. Add packaging tests for our Docker images, similar to what we have for RPMs or Debian packages. This works by running a container and probing it e.g. via `docker exec`. Tests can also be run in Vagrant, by exporting the Docker images to disk and loading them again in VMs. Docker is installed via `Vagrantfile` in a selection of boxes. * Only define Docker pkg tests if Docker is available (#47640) Closes #47639, and unmutes tests that were muted in b958467. The Docker packaging tests were being defined irrespective of whether Docker was actually available in the current environment. Instead, implement exclude lists so that in environments where Docker is not available, no Docker packaging tests are defined. For CI hosts, the build checks `.ci/dockerOnLinuxExclusions`. The Vagrant VMs can define the extension property `shouldTestDocker` to opt in to packaging tests. As part of this, define a separate utility class for checking Docker, and call that instead of defining checks in-line in BuildPlugin.groovy	2019-11-05 15:17:59 +00:00
glerb	baabc21a04	[DOCS] Correct typo in Discovery docs (#48494 )	2019-11-05 08:48:43 -05:00
David Roberts	c03f7ba74c	[TEST] Mute TimeoutCheckerTests.testWatchdog Due to https://github.com/elastic/elasticsearch/issues/48861	2019-11-05 11:49:46 +00:00
Armin Braun	d83e374062	Bound Linearizability Check in CoordinatorTests (#48751 ) (#48853 ) Same as #44444 but for the coordinator tests. Closes #48742	2019-11-04 21:36:17 +01:00
Dan Hermann	c85cf7a6de	Validate proxy base path at parse time (#47912 ) (#48825 )	2019-11-04 09:51:13 -06:00
Nhat Nguyen	020ff0fef9	Do not intercept renew requests from other tests (#48833 ) We might have some outstanding renew retention lease requests after a shard has unfollowed. If testRetentionLeaseIsAddedIfItDisappearsWhileFollowing intercepts a renew request from other tests then we will never unlatch and the test will time out. Closes #45192	2019-11-02 21:15:05 -04:00
Nhat Nguyen	0887cbc964	Fix testForceMergeWithSoftDeletesRetentionAndRecoverySource (#48766 ) This test failure manifests the limitation of the recovery source merge policy explained in #41628. If we already merge down to a single segment then subsequent force merges will be noop although they can prune recovery source. We need to adjust this test until we have a fix for the merge policy. Relates #41628 Closes #48735	2019-11-02 21:14:12 -04:00
Armin Braun	3c20541823	Cleanup Concurrent RepositoryData Loading (#48329 ) (#48834 ) The loading of `RepositoryData` is not an atomic operation. It uses a list + get combination of calls. This led to accidentally returning an empty repository data for generations >=0 which can never not exist unless the repository is corrupted. In the test #48122 (and other SLM tests) there was a low chance of running into this concurrent modification scenario and the repository actually moving two index generations between listing out the index-N and loading the latest version of it. Since we only keep two index-N around at a time this led to unexpectedly absent snapshots in status APIs. Fixing the behavior to be more resilient is non-trivial but in the works. For now I think we should simply throw in this scenario. This will also help prevent corruption in the unlikely but possible event of running into this issue in a snapshot create or delete operation on master failover on a repository like S3 which doesn't have the "no overwrites" protection on writing a new index-N. Fixes #48122	2019-11-02 20:42:29 +01:00
Armin Braun	a22f6fbe3c	Cleanup Redundant Futures in Recovery Code (#48805 ) (#48832 ) Follow up to #48110 cleaning up the redundant future uses that were left over from that change.	2019-11-02 17:28:12 +01:00
Nhat Nguyen	4c70770877	Add debug log for CcrRetentionLeaseIT (#48820 ) testRetentionLeaseIsAddedIfItDisappearsWhileFollowing is still failing although we already have several fixes. I think other tests interfere and cause this test to fail. We can use the test scope to isolate them. However, I prefer to add debug logs so we can find the source. Relates #45192	2019-11-01 22:07:35 -04:00
Jason Tedor	c24595e2ec	Fix names of UBI-based Docker image build contexts This commit fixes the name of the UBI-based Docker image build contexts to include "7" (to set us up for the future where we are likely to have a ubi8-based image).	2019-11-01 17:29:15 -04:00
Armin Braun	e26d01e71f	Make CcrRepository#restore non-Blocking (#48814 ) (#48823 ) With the changes in #48110 there is no more need to block a generic thread when waiting for the multi file transfer in `CcrRepository`.	2019-11-01 21:02:47 +01:00
Lee Hinman	6c290ecaf7	Fix ilm/20_move_to_step basic moving to step (#48821 ) Previously this step moved to the forcemerge step, however, if the machine running the test was fast enough, it would execute the forcemerge and move to the next step (`segment-count`) so the comparison would fail. This commit changes the step to be a step that will never go anywhere else, the terminal step. Resolves #48761	2019-11-01 13:58:24 -06:00
Jason Tedor	c82ecb664c	Do not wrap ingest processor exception with IAE (#48816 ) The problem with wrapping here is that it converts any exception into an IAE, which we treat as a client error (400 status) whereas the exception being wrapped here could be a server error (e.g., NPE). This commit stops wrapping all ingest processor exceptions as IAEs.	2019-11-01 15:11:35 -04:00
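
The deadlock described in commit c320b499a0 (Prevent deadlock by using separate schedulers) can be illustrated with a minimal Java sketch. It does not use the real BulkProcessor API. The class name `SchedulerDeadlockSketch`, the method `retryRuns`, and the timeout values are hypothetical and chosen only for illustration. A "flush" task schedules a "retry" and waits for it. With one shared single-thread scheduler the retry cannot start while the flush holds the only thread. With two schedulers it can. The wait is bounded so the demo terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not the BulkProcessor implementation.
final class SchedulerDeadlockSketch {

    /**
     * Runs a flush task that schedules a retry and waits for it to finish.
     * Returns true if the retry ran while the flush was still waiting.
     */
    static boolean retryRuns(boolean separateSchedulers) {
        ScheduledExecutorService flushScheduler = Executors.newScheduledThreadPool(1);
        // Either reuse the single-thread flush scheduler or give retries their own.
        ScheduledExecutorService retryScheduler =
            separateSchedulers ? Executors.newScheduledThreadPool(1) : flushScheduler;
        // Bounded wait: short when we expect starvation, generous otherwise.
        long waitMillis = separateSchedulers ? 5_000 : 200;
        try {
            Future<Boolean> flush = flushScheduler.submit(() -> {
                CountDownLatch retryDone = new CountDownLatch(1);
                // The retry is queued, but needs a free worker thread to run.
                retryScheduler.schedule(() -> retryDone.countDown(), 0, TimeUnit.MILLISECONDS);
                // The flush occupies its thread until the retry completes or the wait expires.
                return retryDone.await(waitMillis, TimeUnit.MILLISECONDS);
            });
            return flush.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            flushScheduler.shutdownNow();
            retryScheduler.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("shared scheduler, retry ran: " + retryRuns(false));
        System.out.println("separate schedulers, retry ran: " + retryRuns(true));
    }
}
```

Without the bounded wait, the shared-scheduler case would block indefinitely, which is the situation the commit message describes. Splitting the work across two schedulers removes the dependency on how many worker threads a client-supplied scheduler happens to have.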

48802 Commits, All Branches