Commit Graph

2607 Commits

Author SHA1 Message Date
Evgenia Badyanova 1ed3407930
Reduce garbage from allocations in deprecation logger (#38780) (#39370)
1. Setting length for formatWarning String to avoid AbstractStringBuilder.ensureCapacityInternal calls
2. Adding extra check for parameter array length == 0 to avoid unnecessarily creating StringBuilder in LoggerMessageFormat.format

Helps to narrow the performance gap in throughout for geonames benchmark (#37411) by 3%. For more details: https://github.com/elastic/elasticsearch/issues/37530#issuecomment-462758384 

Relates to #37530
Relates to #37411
Relates to #35754
2019-02-25 16:23:22 -05:00
Lee Hinman 5c7dd6f0ee Set mappings when creating indices in SuggestSearchIT (#39323)
* Set mappings when creating indices in SuggestSearchIT

These tests don't test dynamic mapping, so they can use preset mappings. This
removes the possibility they may fail due to the mapping not being available
since mapping updates are asynchronous.

Resolves #39315

* Wrap creates in assertAcked
2019-02-25 13:27:03 -07:00
Mayya Sharipova bf058d6e4d
Fix anaylze NullPointerException when AnalyzeTokenList tokens is null (#39332) (#39361) 2019-02-25 12:49:18 -05:00
Nhat Nguyen 48219112e3 Do not wait for advancement of checkpoint in recovery (#39006)
With this change, we won't wait for the local checkpoint to advance to
the max_seq_no before starting phase2 of peer-recovery. We also remove
the sequence number range check in peer-recovery. We can safely do these
thanks to Yannick's finding.

The replication group to be used is currently sampled after indexing
into the primary (see `ReplicationOperation` class). This means that
when initiating tracking of a new replica, we have to consider the
following two cases:

- There are operations for which the replication group has not been
sampled yet. As we initiated the new replica as tracking, we know that
those operations will be replicated to the new replica and follow the
typical replication group semantics (e.g. marked as stale when
unavailable).

- There are operations for which the replication group has already been
sampled. These operations will not be sent to the new replica.  However,
we know that those operations are already indexed into Lucene and the
translog on the primary, as the sampling is happening after that. This
means that by taking a snapshot of Lucene or the translog, we will be
getting those ops as well. What we cannot guarantee anymore is that all
ops up to `endingSeqNo` are available in the snapshot (i.e.  also see
comment in `RecoverySourceHandler` saying `We need to wait for all
operations up to the current max to complete, otherwise we can not
guarantee that all operations in the required range will be available
for replaying from the translog of the source.`). This is not needed,
though, as we can no longer guarantee that max seq no == local
checkpoint.

Relates #39000
Closes #38949

Co-authored-by: Yannick Welsch <yannick@welsch.lu>
2019-02-25 12:10:14 -05:00
David Turner 236db51d34 Fix testSnapshotFileFailureDuringSnapshot (#39362)
Today this test catches an exception and asserts that its proximate cause has
message `Random IOException` but occasionally this exception is wrapped two
layers deep, causing the test to fail. This commit adjusts the test to look at
the root cause of the exception instead.

      1> [2019-02-25T12:31:50,837][INFO ][o.e.s.SharedClusterSnapshotRestoreIT] [testSnapshotFileFailureDuringSnapshot] --> caught a top level exception, asserting what's expected
      1> org.elasticsearch.snapshots.SnapshotException: [test-repo:test-snap/e-hn_pLGRmOo97ENEXdQMQ] Snapshot could not be read
      1> 	at org.elasticsearch.snapshots.SnapshotsService.snapshots(SnapshotsService.java:212) ~[main/:?]
      1> 	at org.elasticsearch.action.admin.cluster.snapshots.get.TransportGetSnapshotsAction.masterOperation(TransportGetSnapshotsAction.java:135) ~[main/:?]
      1> 	at org.elasticsearch.action.admin.cluster.snapshots.get.TransportGetSnapshotsAction.masterOperation(TransportGetSnapshotsAction.java:54) ~[main/:?]
      1> 	at org.elasticsearch.action.support.master.TransportMasterNodeAction.masterOperation(TransportMasterNodeAction.java:127) ~[main/:?]
      1> 	at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$2.doRun(TransportMasterNodeAction.java:208) ~[main/:?]
      1> 	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:751) ~[main/:?]
      1> 	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[main/:?]
      1> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_202]
      1> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_202]
      1> 	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_202]
      1> Caused by: org.elasticsearch.snapshots.SnapshotException: [test-repo:test-snap/e-hn_pLGRmOo97ENEXdQMQ] failed to get snapshots
      1> 	at org.elasticsearch.repositories.blobstore.BlobStoreRepository.getSnapshotInfo(BlobStoreRepository.java:564) ~[main/:?]
      1> 	at org.elasticsearch.snapshots.SnapshotsService.snapshots(SnapshotsService.java:206) ~[main/:?]
      1> 	... 9 more
      1> Caused by: java.io.IOException: Random IOException
      1> 	at org.elasticsearch.snapshots.mockstore.MockRepository$MockBlobStore$MockBlobContainer.maybeIOExceptionOrBlock(MockRepository.java:275) ~[test/:?]
      1> 	at org.elasticsearch.snapshots.mockstore.MockRepository$MockBlobStore$MockBlobContainer.readBlob(MockRepository.java:317) ~[test/:?]
      1> 	at org.elasticsearch.repositories.blobstore.ChecksumBlobStoreFormat.readBlob(ChecksumBlobStoreFormat.java:101) ~[main/:?]
      1> 	at org.elasticsearch.repositories.blobstore.BlobStoreFormat.read(BlobStoreFormat.java:90) ~[main/:?]
      1> 	at org.elasticsearch.repositories.blobstore.BlobStoreRepository.getSnapshotInfo(BlobStoreRepository.java:560) ~[main/:?]
      1> 	at org.elasticsearch.snapshots.SnapshotsService.snapshots(SnapshotsService.java:206) ~[main/:?]
      1> 	... 9 more

    FAILURE 0.59s J0 | SharedClusterSnapshotRestoreIT.testSnapshotFileFailureDuringSnapshot <<< FAILURES!
       > Throwable #1: java.lang.AssertionError:
       > Expected: a string containing "Random IOException"
       >      but: was "[test-repo:test-snap/e-hn_pLGRmOo97ENEXdQMQ] failed to get snapshots"
       > 	at __randomizedtesting.SeedInfo.seed([B73CA847D4B4F52D:884E042D2D899330]:0)
       > 	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
       > 	at org.elasticsearch.snapshots.SharedClusterSnapshotRestoreIT.testSnapshotFileFailureDuringSnapshot(SharedClusterSnapshotRestoreIT.java:821)
       > 	at java.lang.Thread.run(Thread.java:748)
2019-02-25 16:43:55 +00:00
Marios Trivyzas 11fe8cd16f
[Tests] Fix flakiness by ensuring stable cluster (#39300) (#39356)
In integration tests where `setBootstrapMasterNodeIndex()` is used in
combination with `autoMinMasterNodes = false` the cluster can start
bootstrapping once the number of nodes set with the
`setBootstrapMasterNodeIndex` have been started but it's not ensured
that all nodes have successfully joined to form the cluster.

This behaviour was introduced with 5db7ed22a0
and in order to ensure that the cluster is properly formed before proceeding
with the integration test, use `ensureStableCluster()` with the
appropriate number of expected nodes.

Fixes: #39220
2019-02-25 17:26:15 +01:00
David Turner dc23be5a9d Avoid creating a green index in RetentionLeaseIT (#39347)
In #39224 we made shard history retention lease syncing ignore the
`index.write.wait_for_active_shards` setting on the index, and added a test
that showed that it was ignored. However the test as merged actually creates a
green index, so the `wait_for_active_shards` setting has no effect. This change
adjusts the test to create a yellow index to verify that
`wait_for_active_shards` really is ignored.
2019-02-25 15:33:09 +00:00
Yannick Welsch a2bc41621c Clean GatewayAllocator when stepping down as master (#38885)
This fixes an issue where a messy master election might prevent shard allocation to properly
proceed. I've encountered this in failing CI tests when we were bootstrapping multiple nodes. Tests
would sometimes time out on an `ensureGreen` after an unclean master election. The reason for
this is how the async shard information fetching works and how the clean-up logic in
GatewayAllocator is integrated with the rest of the system. When a node becomes master, it will, as
part of the first cluster state update where it becomes master, already try allocating shards (see
`JoinTaskExecutor`, in particular the call to `reroute`). This process, which runs on the
MasterService thread, will trigger async shard fetching. If the node is still processing an earlier
election failure in ClusterApplierService (e.g. due to a messy election), that will possibly trigger the
clean-up logic in GatewayAllocator after the shard fetching has been initiated by MasterService,
thereby cancelling the fetching, which means that no subsequent reroute (allocation) is triggered
after the shard fetching results return. This means that no shard allocation will happen unless the
user triggers an explicit reroute command. The bug imo is that GatewayAllocator is called from both
MasterService and ClusterApplierService threads, with no clear happens-before relation. The fix
here makes it so that the clean-up logic is also run on the MasterService thread instead of the
ClusterApplierService thread, reestablishing a clear happens-before relation. Note that testing this
is tricky. With the newly added test, I can quite often reproduce this by adding `Thread.sleep(10);`
in ClusterApplierService (to make sure it does not go too quickly) and adding `Thread.sleep(50);` in
`TransportNodesListGatewayStartedShards` to make sure that shard state fetching does not go too
quickly either.

Note that older versions of Zen discovery are affected by this as well, but did not exhibit this issue
as often because master elections are much slower there.
2019-02-25 10:37:31 +01:00
David Turner 96c09b032d Ignore waitForActiveShards when syncing leases (#39224)
Adjust the retention lease sync actions so that they do not respect the
`index.write.wait_for_active_shards` setting on an index, allowing them to sync
retention leases even if insufficiently many shards are currently active to
accept writes.

Relates #39089
2019-02-25 08:53:43 +00:00
Nhat Nguyen f17d408fbb Add cause to assert_no_failure when replay translog (#39333)
We tripped this assertion three times for the last two weeks. However,
it only says "this IndexWriter is closed" without the actual cause.

```
[2019-02-14T11:46:31,144][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [node-1] fatal error in thread [elasticsearch[node-1][generic][T#2]], exiting

java.lang.AssertionError: unexpected failure while replicating translog entry: org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed
```

This change replaces an assert with an AssertionError so that we will
have the actual cause in the next build failures.

Relates #38898
2019-02-23 13:04:43 -05:00
Zachary Tong c7516b03b6
Better HoltWinters parameter validation (#38747)
We validate HW parameters (namely, window > 2 * period) when parsing
the XContent... but that means transport clients can configure bad
params.

This change allows model to validate the window and throw an 
exception if they wish.

It also makes some test changes:

- removes testBadModelParams(), which was a junk test (didn't do
anything), and bad param checking is done elsewhere in units tests
- Fixes one of the windows in testHoltWintersNotEnoughData()
- Ensures the period in testHoltWintersNotEnoughData() is >> window
- Removes `setTypes()` since that's deprecated
2019-02-22 15:25:26 -05:00
Daniel Mitterdorfer 9fea21aca5
Remove ExceptionsHelper#detailedMessage in tests (#37921) (#39297)
With this commit we remove all usages of the deprecated method
`ExceptionsHelper#detailedMessage` in tests. We do not address
production code here but rather in dedicated follow-up PRs to keep the
individual changes manageable.

Relates #19069
2019-02-22 14:03:29 +01:00
Tim Brooks 44df76251f
Rebuild remote connections on profile changes (#39146)
Currently remote compression and ping schedule settings are dynamic.
However, we do not listen for changes. This commit adds listeners for
changes to those two settings. Additionally, when those settings change
we now close existing connections and open new ones with the settings
applied.

Fixes #37201.
2019-02-21 14:00:39 -07:00
Tanguy Leroux fc896e452c
ReadOnlyEngine should update translog recovery state information (#39238) (#39251)
`ReadOnlyEngine` never recovers operations from translog and never 
updates translog information in the index shard's recovery state, even 
though the recovery state goes through the `TRANSLOG` stage during 
the recovery. It means that recovery information for frozen shards indicates 
an unkown number of recovered translog ops in the Recovery APIs 
(translog_ops: `-1` and translog_ops_percent: `-1.0%`) and this is confusing.

This commit changes the `recoverFromTranslog()` method in `ReadOnlyEngine` 
so that it always recover from an empty translog snapshot, allowing the recovery 
state translog information to be correctly updated.

Related to #33888
2019-02-21 18:08:06 +01:00
Daniel Mitterdorfer ef921fd157
Migrate Streamable to Writeable for cluster block package (#37391) (#39236) 2019-02-21 15:21:44 +01:00
Marios Trivyzas ecfd48b6d3
[Tests] Make testEngineGCDeletesSetting deterministic (#38942) (#39231)
`InternalEngine.resolveDocVersion()` uses `relativeTimeInMillis()` from
`ThreadPool` so it needs, the cached time to be advanced. Add a check
to ensure that and decrease the `thread_pool.estimated_time_interval`
to 1msec to prevent long running times for the test.

Fixes: #38874

Co-authored-by: Boaz Leskes <b.leskes@gmail.com>
2019-02-21 14:30:59 +02:00
Marios Trivyzas 1316825f52
Replace superfluous usage of Counter with Supplier (#39048) (#39225)
`Counter` was used as a means of a functional argument to pass
the relative cached time before `Supplier` iface was introduced.
2019-02-21 12:42:54 +02:00
Ignacio Vera be8a5315d7 Extend nextDoc to delegate to the wrapped doc-value iterator for date_nanos (#39176)
The type date_nanos does not direct doc-value iterators and it needs to extend `next_doc` in order to delegate the call to the wrapped iterator.
2019-02-21 11:10:51 +01:00
Tal Levy 8150ca40f2 mute test3MasterNodes2Failed 2019-02-20 17:35:37 -08:00
Nhat Nguyen 820ba8169e Add retention leases replication tests (#38857)
This commit introduces the retention leases to ESIndexLevelReplicationTestCase,
then adds some tests verifying that the retention leases replication works
correctly in spite of the presence of the primary failover or out of order
delivery of retention leases sync requests.

Relates #37165
2019-02-20 19:21:00 -05:00
Mirko Jotic a6ae146ccc Converting Derivative Pipeline Agg integration test into AggregatorTestsCase. (#38679)
Replicates the majority of existing Derivative pipeline integration tests into
an AggregatorTestCase, with the goal of removing the integration
tests in the near future.
2019-02-20 16:35:32 -05:00
Igor Motov 3d93011e32 Fix median calculation in MedianAbsoluteDeviationAggregatorTests (#38979)
Fixes an error in median calculation in
MedianAbsoluteDeviationAggregatorTests for odd number of sample points,
which causes some rare test failures.

Fixes #38937
2019-02-20 13:24:30 -05:00
Ioannis Kakavas c783069804
Fix NPE on Stale Index in IndicesService(#39173)
This is a backport of  #38891 which closes #38845
2019-02-20 15:35:35 +02:00
David Turner efffb3d5b7 Simplify calculation in AwarenessAllocationDecider (#38091)
Today's calculation of the maximum number of shards per attribute is rather
convoluted. This commit clarifies that it returns
ceil(shardCount/numberOfAttributes).
2019-02-20 08:54:57 +00:00
Henning Andersen 00a26b9dd2 Blob store compression fix (#39073)
Blob store compression was not enabled for some of the files in
snapshots due to constructor accessing sub-class fields. Fixed to
instead accept compress field as constructor param. Also fixed chunk
size validation to work.

Deprecated repositories.fs.compress setting as well to be able to unify
in a future commit.
2019-02-20 09:24:41 +01:00
Hendrik Muhs 50b3858f7c add version 6.6.2 2019-02-19 20:28:06 +01:00
David Turner 0a9574c9d4 Add some missing toString() implementations (#39124)
Sometimes we turn objects into strings for logging or debugging using
`toString()`, but the default implementation is often unhelpful. This change
improves on this in two places I ran into recently.
2019-02-19 17:52:41 +00:00
Jason Tedor fef9bdb23f
Allow retention lease operations under blocks (#39089)
This commit allows manipulating retention leases under blocks.
2019-02-19 10:26:49 -05:00
Jason Tedor 12f6963456
Fix retention leases sync on recovery test
This test had a bug. We attempt to allow only the primary to be
allocated, to force all replicas to recovery from the primary after we
had set the state of the retention leases on the primary. However, in
building the index settings, we were overwriting the settings that
exclude the replicas from being allocated. This means that some of the
replicas would end up assigned and rather than receive retention leases
during recovery, they would be part of the replication group receiving
retention leases as they are manipulated. Since retention lease renewals
are only synced periodically, this means that the replica could be
lagging a little behind in some cases leading to an assertion tripping
in the test. This commit addresses this by ensuring that the replicas
are indeed not allocated until after the retention leases are done being
manipulated on the replica. We did this by not overwriting the exclude
settings.

Closes #39105
2019-02-19 09:07:33 -05:00
Alexander Reelsen 7f8a640363 Fix DateFormatters.parseMillis when no timezone is given (#39100)
The parseMillis method was able to work on formats without timezones by
falling back to UTC. The Date Formatter interface did not support this, as
the calling code was using the `Instant.from` java time API.

This switches over to an internal method which adds UTC as a timezone.

Closes #39067
2019-02-19 14:12:22 +01:00
Jim Ferenczi 199155f5fb
Enforce Completion Context Limit (#38675) (#39075)
This change adds a limit to the number of completion contexts that a completion field can define.

Closes #32741
2019-02-19 08:52:24 +01:00
Albert Zaharovits 6bc88b00ec Mute GatewayMetaStateTests.testAtomicityWithFailures (#39079)
Mute test GatewayMetaStateTests.testAtomicityWithFailures
2019-02-19 00:25:45 +02:00
Jason Tedor 2d8f6b6501
Introduce retention lease state file (#39004)
This commit moves retention leases from being persisted in the Lucene
commit point to being persisted in a dedicated state file.
2019-02-18 16:53:46 -05:00
Jason Tedor d43ac8fe11
Include in log retention leases that failed to sync
When retention leases fail to sync after an expiration check, we emit a
log message about this. This commit adds the retention leases that
failed to sync.
2019-02-18 15:08:08 -05:00
Jason Tedor bbb61002ba
Add some logging related to retention lease syncing (#39066)
When the background retention lease sync fires, we check an see if any
retention leases are expired. If any did expire, we execute a full
retention lease sync (write action). Since this is happening on a
background thread, we do not block that thread waiting for success (it
will simply try again when the timer elapses). However, we were
swallowing exceptions that indicate failure. This commit addresses that
by logging the failures. Additionally, we add some trace logging to the
execution of syncing retention leases.
2019-02-18 15:02:31 -05:00
Henning Andersen 99b2bc3461 Fix potential race during TcpTransport close (#39031)
Fixed two potential causes for leaked threads during tests:
1. When adding a channel to serverChannels, we add it under a monitor
that we do not use when reading from it. This is potentially unsafe if
there is no other happens-before relationship ensuring the safety of
this.
2. Long-shot but if the thread pool was shutdown before entering this
code, we would silently forget about closing server channels so added
assert.

Strengthened the locking to ensure that once we stop the transport, no
new server channels can be made.

Relates to CI failure issue: #37543
2019-02-18 19:13:23 +01:00
Alan Woodward ab4d5f404f Add overlapping, before, after filters to intervals query (#38999)
Lucene recently added `overlapping`, `before` and `after` filters to the intervals package. This
commit exposes them in elasticsearch.
2019-02-18 15:06:24 +00:00
Adrien Grand 45b17e8645
Don't close caches while there might still be in-flight requests. (#38958)
Many of our index components use ref-counting so that in the event that a shard
is closed while there are still ongoing requests, then the index reader and the
store only effectively get closed when ongoing requests have finished. However
we don't apply the same principle to the request and query caches, which might
get closed while there are still in-flight requests.

This commit adds ref-counting to `IndicesService` so that the caches and other
components it maintains only get closed when all shards are effectively closed.

Closes #37117
2019-02-18 13:59:58 +01:00
Martijn van Groningen ed08bc3537
Fix LocalIndexFollowingIT#testRemoveRemoteConnection() test (#38709)
* During fetching remote mapping if remote client is missing then
`NoSuchRemoteClusterException` was not handled.
* When adding remote connection, check that it is really connected
before continue-ing to run the tests.

Relates to #38695
2019-02-18 09:41:44 +01:00
Jason Tedor a5ce1e0bec
Integrate retention leases to recovery from remote (#38829)
This commit is the first step in integrating shard history retention
leases with CCR. In this commit we integrate shard history retention
leases with recovery from remote. Before we start transferring files, we
take out a retention lease on the primary. Then during the file copy
phase, we repeatedly renew the retention lease. Finally, when recovery
from remote is complete, we disable the background renewing of the
retention lease.
2019-02-16 15:37:52 -05:00
Tim Brooks b1c1daa63f
Add get file chunk timeouts with listener timeouts (#38758)
This commit adds a `ListenerTimeouts` class that will wrap a
`ActionListener` in a listener with a timeout scheduled on the generic
thread pool. If the timeout expires before the listener is completed,
`onFailure` will be called with an `ElasticsearchTimeoutException`.

Timeouts for the get ccr file chunk action are implemented using this
functionality. Additionally, this commit attempts to fix #38027 by also
blocking proxied get ccr file chunk actions. This test being un-muted is
useful to verify the timeout functionality.
2019-02-16 10:56:03 -07:00
Luca Cavanna a1a49f201d Tie break search shard iterator comparisons on cluster alias (#38853)
`SearchShardIterator` inherits its `compareTo` implementation from `PlainShardIterator`. That is good in most of the cases, as such comparisons are based on the shard id which is unique, even when searching against indices with same names across multiple clusters (thanks to the index uuid being different). In case though the same cluster is registered multiple times with different aliases, the shard id is exactly the same, hence remote results will be returned before local ones with same shard id objects. That is because remote iterators are added before local ones, and we use a stable sorting method in `GroupShardIterators` constructor.

This PR enhances `compareTo` for `SearchShardIterator` to tie break on cluster alias and introduces consistent `equals` and `hashcode` methods. This allows to remove a TODO in `SearchResponseMerger` which otherwise has to handle this special case specifically. Also, while at it I added missing tests around equals/hashcode and compareTo and expanded existing ones.
2019-02-16 09:41:03 +01:00
Nhat Nguyen 7e20a92888 Advance max_seq_no before add operation to Lucene (#38879)
Today when processing an operation on a replica engine (or the 
following engine), we first add it to Lucene, then add it to translog, 
then finally marks its seq_no as completed. If a flush occurs after step1,
but before step-3, the max_seq_no in the commit's user_data will be
smaller than the seq_no of some documents in the Lucene commit.
2019-02-15 21:04:28 -05:00
Nhat Nguyen 20755e666c Reduce global checkpoint sync interval in disruption tests (#38931)
We verify seq_no_stats is aligned between copies at the end of some
disruption tests. Sometimes, the assertion `assertSeqNos` is tripped due
to a lagged global checkpoint on replicas. The global checkpoint on
replicas is lagged because we sync the global checkpoint 30 seconds (by
default) after the last replication operation. This change reduces the
global checkpoint sync-internal to 1s in the disruption tests.

Closes #38318
Closes #36789
2019-02-15 21:04:20 -05:00
Nhat Nguyen a67b9f6d1f Relax testStressMaybeFlushOrRollTranslogGeneration (#38918)
The predicate shouldPeriodicallyFlush is determined by the uncommitted
translog size and the local checkpoint. The uncommitted translog size
depends on the local checkpoint. The condition shouldPeriodicallyFlush
can be true twice in in the test in the following scenario:

1. Index doc-0 and advances the local checkpoint to 0, the condition
shouldPeriodicallyFlush remains false.

2. Index doc-1 and add it to translog, but the local checkpoint is not
advanced yet (still 0). The condition shouldPeriodicallyFlush becomes
true because the uncommitted translog size is 216bytes (2ops + gen-1 +
gen-2) > 180bytes and the translog generation of the new index commit
would advance from 1 to 2.

> [2019-02-13T23:33:58,257][TRACE][o.e.i.e.Engine           ] [node_s_0]
> [test][0] committing writer with commit data [{local_checkpoint=0,
> max_unsafe_auto_id_timestamp=-1, translog_uuid=fFp1Yqd4QiqKDD4ZrC8F-g,
> min_retained_seq_no=0, history_uuid=cn31yrwVQk-Vs7qcg4bi_Q,
> retention_leases=primary_term:1;version:0;, translog_generation=2,
> max_seq_no=1}]

1. The shouldPeriodicallyFlush becomes true again after the local
checkpoint is advanced to 1 because the uncommitted translog size is
216bytes (2ops + gen-2 + gen-3) > 180bytes and the translog generation
of the new index commit would advance from 2 to 4.

> [2019-02-13T23:33:58,264][TRACE][o.e.i.e.Engine           ] [node_s_0]
> [test][0] committing writer with commit data [{local_checkpoint=1,
> max_unsafe_auto_id_timestamp=-1, translog_uuid=fFp1Yqd4QiqKDD4ZrC8F-g,
> min_retained_seq_no=0, history_uuid=cn31yrwVQk-Vs7qcg4bi_Q,
> retention_leases=primary_term:1;version:0;, translog_generation=4,
> max_seq_no=1}]

We need to relax the assertion in this test to cover this situation.

Closes #31629
2019-02-15 21:04:12 -05:00
Armin Braun 238425e5e7 Fix Issue with Concurrent Snapshot Init + Delete (#38518)
* Fix Issue with Concurrent Snapshot Init + Delete by ensuring that we're not finalizing a snapshot in the repository while it is initializing on another thread

* Closes #38489
2019-02-15 16:50:47 -08:00
Alan Woodward 176013e23c Avoid double term construction in DfsPhase (#38716)
DfsPhase captures terms used for scoring a query in order to build global term statistics across
multiple shards for more accurate scoring. It currently does this by building the query's `Weight`
and calling `extractTerms` on it to collect terms, and then calling `IndexSearcher.termStatistics()`
for each collected term. This duplicates work, however, as the various `Weight` implementations 
will already have collected these statistics at construction time.

This commit replaces this round-about way of collecting stats, instead using a delegating
IndexSearcher that collects the term contexts and statistics when `IndexSearcher.termStatistics()`
is called from the Weight.

It also fixes a bug when using rescorers, where a `QueryRescorer` would calculate distributed term
statistics, but ignore field statistics.  `Rescorer.extractTerms` has been removed, and replaced with
a new method on `RescoreContext` that returns any queries used by the rescore implementation.
The delegating IndexSearcher then collects term contexts and statistics in the same way described
above for each Query.
2019-02-15 16:00:38 +00:00
Daniel Mitterdorfer fcc7f553f5
Also mmap cfs files for hybridfs (#38940) (#38947)
With this commit we add the `.cfs` file extension to the list of file
types that are memory-mapped by hybridfs. `.cfs` files combine all files
of a Lucene segment into a single file in order to save file handles. As
this strategy is only used for "small" segments (less than 10% of the
shard size), it is benefical to memory-map them instead of accessing
them via NIO.

Relates #36668
2019-02-15 15:34:40 +01:00
David Turner 578514e892 Recover peers from translog, ignoring soft deletes (#38904)
Today if soft deletes are enabled then we read the operations needed for peer
recovery from Lucene. However we do not currently make any attempt to retain
history in Lucene specifically for peer recoveries so we may discard it and
fall back to a more expensive file-based recovery. Yet we still retain
sufficient history in the translog to perform an operations-based peer
recovery.

In the long run we would like to fix this by retaining more history in Lucene,
possibly using shard history retention leases (#37165). For now, however, this
commit reverts to performing peer recoveries using the history retained in the
translog regardless of whether soft deletes are enabled or not.
2019-02-15 10:45:15 +01:00
Henning Andersen a211e51343 ShardBulkAction ignore primary response on primary (#38901)
Previously, if a version conflict occurred and a previous primary
response was present, the original primary response would be used both
for sending to replica and back to client. This was made in the past as an
attempt to fix issues with conflicts after relocations where a bulk request
would experience a closed shard half way through and thus have to retry
on the new primary. It could then fail on its own update.

With sequence numbers, this leads to an issue, since if a primary is
demoted (network partitions), it will send along the original response
in the request. In case of a conflict on the new primary, the old
response is sent to the replica. That data could be stale, leading to
inconsistency between primary and replica.

Relocations now do an explicit hand-off from old to new primary and
ensures that no operations are active while doing this. Above is thus no
longer necessary. This change removes the special handling of conflicts
and ignores primary responses when executing shard bulk requests on the
primary.
2019-02-15 10:13:11 +01:00