Commit Graph

2232 Commits

Author SHA1 Message Date
Nhat Nguyen 15aa3764a4
Reduce recovery time with compress or secure transport (#36981)
Today file-chunks are sent sequentially one by one in peer-recovery. This is a
correct choice since the implementation is straightforward and recovery is
network bound in most of the time. However, if the connection is encrypted, we
might not be able to saturate the network pipe because encrypting/decrypting
are cpu bound rather than network-bound.

With this commit, a source node can send multiple (default to 2) file-chunks
without waiting for the acknowledgments from the target.

Below are the benchmark results for PMC and NYC_taxis.

- PMC (20.2 GB)

| Transport | Baseline | chunks=1 | chunks=2 | chunks=3 | chunks=4 |
| ----------| ---------| -------- | -------- | -------- | -------- |
| Plain     | 184s     | 137s     | 106s     | 105s     | 106s     |
| TLS       | 346s     | 294s     | 176s     | 153s     | 117s     |
| Compress  | 1556s    | 1407s    | 1193s    | 1183s    | 1211s    |

- NYC_Taxis (38.6GB)

| Transport | Baseline | chunks=1 | chunks=2 | chunks=3 | chunks=4 |
| ----------| ---------| ---------| ---------| ---------| -------- |
| Plain     | 321s     | 249s     | 191s     |  *       | *        |
| TLS       | 618s     | 539s     | 323s     | 290s     | 213s     |
| Compress  | 2622s    | 2421s    | 2018s    | 2029s    | n/a      |

Relates #33844
2019-01-14 15:14:46 -05:00
Tim Brooks 5c68338a1c
Implement ccr file restore (#37130)
This is related to #35975. It implements a file based restore in the
CcrRepository. The restore transfers files from the leader cluster
to the follower cluster. It does not implement any advanced resiliency
features at the moment. Any request failure will end the restore.
2019-01-14 13:07:55 -07:00
Christoph Büscher c801b89072
Fix Eclipse specific compilation issue (#37419)
Without pulling out the supplier function to the enclosing class, Eclipse 4.8
complains with the following error "No enclosing instance of type
CoordinatorTests.Cluster is available due to some intermediate constructor
invocation"
2019-01-14 20:39:04 +01:00
markharwood 92c6c98e8d
Performance fix. Reduce deprecation calls for the same bulk request (#37415)
DeprecationLogger has warning de-duplication logic but it is expensive to run as it involves parsing existing warning headers. This PR changes the upstream bulk indexing code to do its own "event thinning" rather than relying on DeprecationLogger's trimming.
Closes #37411
2019-01-14 17:51:49 +00:00
David Kyle 1abe5df09c Mute IndexShardRetentionLeaseTests.testCommit #37420 2019-01-14 14:17:11 +00:00
Daniel Mitterdorfer abe35fb99b
Remove unused index store in directory service
With this commit we remove the unused field `indexStore` from all
implementations of `FsDirectoryService`.

Relates #37097
2019-01-14 13:44:32 +01:00
Tanguy Leroux 07dc8c7eee
Improve CloseWhileRelocatingShardsIT (#37348) 2019-01-14 13:14:36 +01:00
Tanguy Leroux 6ca076bf74
Fix ClusterBlock serialization and Close Index API logic after backport to 6.x (#37360)
This commit changes the versions in the serialization logic of ClusterBlock 
after the backport to 6.x of the Close Index API refactoring (#37359).
2019-01-14 13:13:15 +01:00
Christoph Büscher 89b45f1fc6
Remove deprecated pipeline request contructors (#37366)
The constructors in PutPipelineRequest and SimulatePipelineRequest that guess
the xContent type from the provided source are deprecated since 6.0 and each
have a counterpart that takes the xContent type as an explicit argument.
Removing these ctors together with the builders and methods in
ClusterAdminClient that don't have the xContent type as argument.
2019-01-14 11:14:38 +01:00
Nhat Nguyen d44a6f9fbc
Simplify SyncedFlushService flow with StepListener (#37383)
Today the SyncedFlushService flow is written with multiple nested 
callbacks which are hard to read. This commit replaces them with 
sequential step listeners.
2019-01-14 03:54:34 -05:00
Luca Cavanna d54f88f62c
Remove unused empty constructors from suggestions classes (#37295)
We recently migrated suggestions to `Writeable`. That allows us to also
clean up empty constructors and methods that called them as they are no
longer needed. They are replaced by constructors that accept a
`StreamInput` instance.
2019-01-14 08:32:45 +01:00
Jason Tedor 03be4dbaca
Introduce retention lease persistence (#37375)
This commit introduces the persistence of retention leases by persisting
them in index commits and recovering them when recovering a shard from
store.
2019-01-12 14:43:19 -08:00
Nhat Nguyen 44a1071018
Make recovery source partially non-blocking (#37291)
Today a peer-recovery may run into a deadlock if the value of
node_concurrent_recoveries is too high. This happens because the
peer-recovery is executed in a blocking fashion. This commit attempts
to make the recovery source partially non-blocking. I will make three
follow-ups to make it fully non-blocking: (1) send translog operations,
(2) primary relocation, (3) send commit files.

Relates #36195
2019-01-12 12:49:48 -05:00
Armin Braun 63fe3c6ed6
Fix PrimaryAllocationIT Race Condition (#37355)
* Fix PrimaryAllocationIT Race Condition

* Forcing a stale primary allocation on a green index was tripping the assertion that was removed
   * Added a test that this case still errors out correctly
* Made the ability to wipe stopped datanode's data public on the internal test cluster and used it to ensure correct behaviour on the fixed test
   * Previously it simply passed because the test finished before the index went green and would NPE when the index was green at the time of the shard store status request, that would then come up empty
* Closes #37345
2019-01-11 23:26:04 +01:00
Nhat Nguyen 70cee18e56
Introduce StepListener (#37327)
This commit introduces StepListener which provides a simple way to write
a flow consisting of multiple asynchronous steps without having nested
callbacks.

Relates #37291
2019-01-11 13:06:17 -05:00
Christoph Büscher bb6d8784e7
Switch indices.get rest after backport of `include_type_name` (#37351)
With the `include_type_name` available now for indices.get on 6.x after the
backport, the corresponsing yaml test can include anything from 6.7 on.
Also changing the RestGetIndicesActionTests base test class.
2019-01-11 17:24:12 +01:00
Armin Braun 1eba1d1df9
Fix SnapshotDisruptionIT Race Condition (#37358)
* Due to a race between retrying the snapshot creation and the failed snapshot create trying to delete the snapshot there is no guarantee that the snapshot is eventually created by retries
   * Adjusted the assertion accordingly
* Closes #36779
2019-01-11 16:09:26 +01:00
Yannick Welsch f4abf9628a
Mock connections more accurately in DisruptableMockTransport (#37296)
This commit moves DisruptableMockTransport to use a more accurate representation of connection
management, which allows to use the full connection manager and does not require mocking out
any behavior. With this, we can implement restarting nodes in CoordinatorTests.
2019-01-11 16:06:48 +01:00
Boaz Leskes d21df2a17a
Use Sequence number powered OCC for processing updates (#37308)
Updates perform realtime get, perform the requested update and then index the document again
using optimistic concurrency control. This PR changes the logic to use sequence numbers instead
of versioning. 

Note that the current versioning logic isn't suffering from the same problem as external OCC
requests because the get and indexing is always done on the same primary.

Relates #36148 
Relates #10708
2019-01-11 06:23:55 -08:00
Yannick Welsch 4d3928d444 Increase timeouts in UnicastZenPingTests
Relates to #37268
2019-01-11 11:33:55 +01:00
Yannick Welsch 3a929c7aea Increase assertBusy timeouts for RefreshListenersTests 2019-01-11 10:15:39 +01:00
Alpar Torok 82ca2d62de Mute CloseWhileRelocatingShardsIT.testCloseWhileRelocatingShards
Tracked by #37274
2019-01-11 10:54:53 +02:00
Alpar Torok 3e73911cbe Mute PrimaryAllocationIT.testForceStaleReplicaToBePromotedToPrimaryOnWrongNode
Tracking issue: #37345
2019-01-11 10:49:25 +02:00
Ignacio Vera 0a50821bb2
Geo: Do not normalize the longitude with value -180 for Lucene shapes (#37299)
Lucene based shapes should not normalize the longitude value -180 to 180.
2019-01-11 09:37:18 +01:00
Nhat Nguyen 360c430ad7
Add runAfter and notifyOnce wrapper to ActionListener (#37331)
Relates #37291
2019-01-11 03:33:06 -05:00
Alexander Reelsen 9f3da013d8
Date/Time parsing: Use java time API instead of exception handling (#37222)
* Add benchmark

* Use java time API instead of exception handling

when several formatters are used, the existing way of parsing those is
to throw an exception catch it, and try the next one. This is is
considerably slower than the approach taken in joda time, so that
indexing is reduced when a date format like `x||y` is used and y is the
date format being used.

This commit now uses the java API to parse the date by appending the
date time formatters to each other and does not rely on exception
handling.

* fix benchmark

* fix tests by changing formatter, also expose printer

* restore optional printing logic to fix tests

* fix tests

* incorporate review comments
2019-01-11 09:25:05 +01:00
Jason Tedor 822626dadf
Make consistent empty retention lease supplier
This commit makes the use of empty retention lease suppliers to always
be an empty list as opposed to in some cases an empty set. This commit
is solely for consistency reasons, there is no functional change here.
2019-01-10 18:34:55 -08:00
Jason Tedor edc95c8a8e
Add validation for retention lease construction (#37312)
This commit adds some simple validation that the values input to the
retention lease constructor our valid values. We will later rely on
these values being within the validated range.
2019-01-10 18:13:05 -08:00
Jack Conradson b5b93a2746
Rename ParameterMap to DeprecationMap (#37317)
Mechanical change to rename ParameterMap to DeprecationMap as this 
seems more appropriate for an extended Map to issue deprecation warnings.
2019-01-10 13:58:00 -08:00
markharwood 434430506b
Type removal - added deprecation warnings to _bulk apis (#36549)
Added warnings checks to existing tests
Added “defaultTypeIfNull” to DocWriteRequest interface so that Bulk requests can override a null choice of document type with any global custom choice.
Related to #35190
2019-01-10 21:35:19 +00:00
Julie Tibshirani a433c4012c
Support include_type_name in the field mapping and index template APIs. (#37210)
* Add include_type_name to the get field mappings API.
* Make sure the API specification lists include_type_name as a boolean.
* Add include_type_name to the get index templates API.
* Add include_type_name to the put index templates API.
2019-01-10 09:24:08 -08:00
Ryan Ernst fcf7df3eda
Core: Handle security manager permission for deprecation log rolling (#37281)
When the deprecation log is written to within scripting support code
like ScriptDocValues, it runs under the reduces privileges of scripts.
Sometimes this can trigger log rolling, which then causes uncaught
security errors, as was handled in #28485. While doing individual
deprecation handling within each deprecation scripting location is
possible, there are a growing number of deprecations in scripts.

This commit wraps the logging call within the deprecation logger use a
doPrivileged block, just was we would within individual logging call
sites for scripting utilities.
2019-01-10 07:44:40 -08:00
Armin Braun 46237faa97
Fail Stale Primary Alloc. Req. without Data (#37226)
* Get indices shard store status before enqueuing the reallocation state update task to prevent
tasks that would fail because a node does not hold a stale copy of the shard on a best effort basis
* Closes #37098
2019-01-10 16:28:38 +01:00
Przemyslaw Gomulka c812e6aea6
Fix line length in org.elasticsearch.routing (#37253)
Remove the line length suppression for this package and fix offending
lines

relates: #34884
2019-01-10 15:23:34 +01:00
Armin Braun 26cb7466ef
SNAPSHOT+TESTS: Stabilize SnapshotDisruptionIT (#37289)
* Ensure retry by busy assert on SnapshotMissingException
* Closes #36739
2019-01-10 14:13:20 +01:00
Luca Cavanna 61b54196c4 [TEST] Fixed compile issue in SnapshotsServiceTests
Relates to #37203
2019-01-10 13:38:39 +01:00
Alexander Reelsen 71287b0759
Remove unused EpochMillisDateFormatter (#37293)
This class has been superceded by a custom java time epoch millis date
parser.
2019-01-10 13:20:24 +01:00
Yannick Welsch d499233068
Zen2: Add join validation (#37203)
Adds join validation to Zen2, which prevents a node from joining a cluster when the node does not
have the right ES version or does not satisfy any other of the join validation constraints.
2019-01-10 12:57:50 +01:00
Christoph Büscher cd608848e7
Remove deprecated QUERY_AND_FETCH SearchType (#37257)
This SearchType was deprecated since at least 6.0 and according to the
documentation is only kept around for pre-5.3 requests. Removing and leaving a
comment as placeholder so we don't reuse the byte value associated with it
without further consideration.
2019-01-10 12:50:07 +01:00
Alexander Reelsen eb12de550a
Java Time: Fix timezone parsing (#37262)
* Java Time: Fix timezone parsing

An independent test uncovered an issue when parsing a timezone
containing a colon like `01:00` - some formats did not properly support
this.

This commit adds test for all formats in the dueling tests and fixes a
few issues with existing date formatters.

* fix tests, so they run under java8
2019-01-10 09:26:01 +01:00
Alexander Reelsen b2e8437424
Tests: Add ElasticsearchAssertions.awaitLatch method (#36777)
* Tests: Add ElasticsearchAssertions.awaitLatch method

Some tests are using assertTrue(latch.await(...)) in their code. This
leads to an assertion error without any error message. This adds a
method which has a nicer error message and can be used in tests.

* fix forbidden apis

* fix spaces
2019-01-10 09:25:36 +01:00
Michael Basnight d625b79df2
Add getZone to JodaCompatibleZonedDateTime (#37084)
The ZonedDateTime#getZone() was not accessible via the Joda shim. This
commit adds getZone() and exposes it through painless.
2019-01-09 22:09:34 -06:00
Jim Ferenczi 586093ec5e Handle TopFieldDocs copy in TopDocsCollectorContext
This commit fixes the clone of TopFieldDocs.

Relates #37179
Relates #37266
2019-01-10 00:26:55 +01:00
Simon Willnauer 234059d2c0
Enable Bulk-Merge if all source remains (#37269)
Today we still wrap recovery source readers on merge even if we
keep all documents recovery source. This basically disables bulk
merging for stored fields. This change skips wrapping if all docs
sources are kept anyway.
2019-01-09 23:46:31 +01:00
Jim Ferenczi 95479f1766 Ensure that a non static top docs is created during the search phase
This change fixes an unreleased bug that trips an assertion because a static instance
shared among threads is modified during the search. This commit copies the static
instance in order to ensure that each thread can modify the value without modifying
the other instances.

Closes #37179
Closes #37266
2019-01-09 22:57:34 +01:00
Jake Landis 195873002b
ingest: compile mustache template only if field includes '{{'' (#37207)
* ingest: compile mustache template only if field includes '{{''

Prior to this change, any field in an ingest node processor that supports
script templates would be compiled as mustache template regardless if they
contain a template or not. Compiling normal text as mustache templates is
harmless. However, each compilation counts against the script compilation
circuit breaker. A large number of processors without any templates or scripts
could un-intuitively trip the too many script compilations circuit breaker.

This change simple checks for '{{' in the text before it attempts to compile.

fixes #37120
2019-01-09 14:47:47 -06:00
Jack Conradson 95eef77ad4
[Style] Fix line length violations for threadpool, indexing, and script packages (#37205) 2019-01-09 10:55:52 -08:00
Evangelos Chatzikalymnios 85a603ee61 Use List instead of priority queue for stable sorting in bucket sort aggregator (#36748)
Update BucketSortPipelineAggregator to use a List and Collections.sort() for sorting instead of a priority queue. This preserves the order for equal values. Closes #36322.
2019-01-09 18:01:39 +02:00
Armin Braun eacc63b032
TESTS: Real Coordinator in SnapshotServiceTests (#37162)
* TESTS: Real Coordinator in SnapshotServiceTests

* Introduce real coordinator in SnapshotServiceTests to be able to test network disruptions realistically
  * Make adjustments to cluster applier service so that we can pass a mocked single threaded executor for tests
2019-01-09 16:53:49 +01:00
Alpar Torok ae086ebcc4 Muting SnapshotDisruptionIT
Tracked in #36779
2019-01-09 16:55:11 +02:00