`TopDocsCollectorContext` can already shortcut hit counts on `match_all` and `term` queries when there are no deletions.
This change adds this ability for `exists` queries if the index doesn't have deletions and fields are indexed.
Closes#37475
While serializing custom objects, the length of the list is computed after
filtering out the unsupported objects but while writing objects the filter
is not applied thus resulting in writing unsupported objects which will fail
to deserialize by the receiever. Adding the condition to filter out unsupported
custom objects.
Since #39006 we should be able to complete a peer-recovery without
waiting for pending indexing operations. Thus, the assertion in
testDoNotWaitForPendingSeqNo should be updated from false to true.
Closes#39510
This is now possible as Lucene's `TotalHits` implements `equals`/`hashcode`,
all the other methods can be in-lined in `SearchHits` instead, no need for
a specific wrapper class.
* Introduce Safer Chaining of Listeners
* The motivation here is to make reasoning about chains of `ActionListener` a little easier, by providing a safe method for nesting `ActionListener` that guarantees that a response is never dropped. Also, it dries up the code a little by removing the need to repeat `listener::onFailure` and `listener.onResponse` over and over.
* Refactored a number of obvious/easy spots to use the new listener constructor
Zen1IT#testFreshestMasterElectedAfterFullClusterRestart fails sometimes because
we request the cluster state before state recovery has completed, and therefore
obtain the default value for the setting we're relying on.
Confusingly, we were starting out by setting this setting to its default value,
so the test looked like it was failing because of a production bug. This commit
avoids this confusion in future by setting it to a non-default value at the
start of the test.
Fixes#39586.
This commit adds the following:
- more tests to IndicesServiceCloseTests, one of them found a bug in the order
in which `IndicesQueryCache#onClose` and
`IndicesService.indicesRefCount#decRef` are called.
- made `IndicesQueryCache.stats2` a synchronized map. All writes to it are
already protected by the lock of the Lucene cache, but the final read from
an assertion in `IndicesQueryCache#close()` was not so this change should
avoid any potential visibility issues.
- human-readable `toString`s to make debugging easier.
Relates #37117
This cleans up the Engine implementation by separating the sequence number generation from the
planning step in the engine, to avoid for the planning step to have any side effects. This makes it
easier to see that every sequence number is properly accounted for.
Add debug log when index is flushed to investigate a failure
in IndicesRequestCacheIT
"DEBUG" level is used as "TRACE" produces too much output irrelevant for this
issue
Relates to #32827
Backport support for replicating closed indices (#39499)
Before this change, closed indexes were simply not replicated. It was therefore
possible to close an index and then decommission a data node without knowing
that this data node contained shards of the closed index, potentially leading to
data loss. Shards of closed indices were not completely taken into account when
balancing the shards within the cluster, or automatically replicated through shard
copies, and they were not easily movable from node A to node B using APIs like
Cluster Reroute without being fully reopened and closed again.
This commit changes the logic executed when closing an index, so that its shards
are not just removed and forgotten but are instead reinitialized and reallocated on
data nodes using an engine implementation which does not allow searching or
indexing, which has a low memory overhead (compared with searchable/indexable
opened shards) and which allows shards to be recovered from peer or promoted
as primaries when needed.
This new closing logic is built on top of the new Close Index API introduced in
6.7.0 (#37359). Some pre-closing sanity checks are executed on the shards before
closing them, and closing an index on a 8.0 cluster will reinitialize the index shards
and therefore impact the cluster health.
Some APIs have been adapted to make them work with closed indices:
- Cluster Health API
- Cluster Reroute API
- Cluster Allocation Explain API
- Recovery API
- Cat Indices
- Cat Shards
- Cat Health
- Cat Recovery
This commit contains all the following changes (most recent first):
* c6c42a1 Adapt NoOpEngineTests after #39006
* 3f9993d Wait for shards to be active after closing indices (#38854)
* 5e7a428 Adapt the Cluster Health API to closed indices (#39364)
* 3e61939 Adapt CloseFollowerIndexIT for replicated closed indices (#38767)
* 71f5c34 Recover closed indices after a full cluster restart (#39249)
* 4db7fd9 Adapt the Recovery API for closed indices (#38421)
* 4fd1bb2 Adapt more tests suites to closed indices (#39186)
* 0519016 Add replica to primary promotion test for closed indices (#39110)
* b756f6c Test the Cluster Shard Allocation Explain API with closed indices (#38631)
* c484c66 Remove index routing table of closed indices in mixed versions clusters (#38955)
* 00f1828 Mute CloseFollowerIndexIT.testCloseAndReopenFollowerIndex()
* e845b0a Do not schedule Refresh/Translog/GlobalCheckpoint tasks for closed indices (#38329)
* cf9a015 Adapt testIndexCanChangeCustomDataPath for replicated closed indices (#38327)
* b9becdd Adapt testPendingTasks() for replicated closed indices (#38326)
* 02cc730 Allow shards of closed indices to be replicated as regular shards (#38024)
* e53a9be Fix compilation error in IndexShardIT after merge with master
* cae4155 Relax NoOpEngine constraints (#37413)
* 54d110b [RCI] Adapt NoOpEngine to latest FrozenEngine changes
* c63fd69 [RCI] Add NoOpEngine for closed indices (#33903)
Relates to #33888
With #17187, we verified IndexService creation during initial state recovery on the master and if the
recovery failed the index was imported as closed, not allocating any shards. This was mainly done to
prevent endless allocation loops and full log files on data-nodes when the indexmetadata contained
broken settings / analyzers. Zen2 loads the cluster state eagerly, and this check currently runs on all
nodes (not only the elected master), which can significantly slow down startup on data nodes.
Furthermore, with replicated closed indices (#33888) on the horizon, importing the index as closed
will no longer not allocate any shards. Fortunately, the original issue for endless allocation loops is
no longer a problem due to #18467, where we limit the retries of failed allocations. The solution here
is therefore to just undo #17187, as it's no longer necessary, and covered by #18467, which will solve
the issue for Zen2 and replicated closed indices as well.
Mutations of the timeout values were using string-representations.
This resulted in very rare cases where the original timeout value was
represented as something like "0ms" and the new random time-value generated
was "0s". Although their string representations differ, their underlying
TimeValue does not. This resulted in `-Dtests.seed=7F4C034C43C22B1B` to
fail.
This adds a `details` parameter to shard locking in `NodeEnvironment`. This is
intended to be used for diagnosing issues such as
```
1> [2019-02-11T14:34:19,262][INFO ][o.e.c.m.MetaDataDeleteIndexService] [node_s0] [.tasks/oSYOG0-9SHOx_pfAoiSExQ] deleting index
1> [2019-02-11T14:34:19,279][WARN ][o.e.i.IndicesService ] [node_s0] [.tasks/oSYOG0-9SHOx_pfAoiSExQ] failed to delete index
1> org.elasticsearch.env.ShardLockObtainFailedException: [.tasks][0]: obtaining shard lock timed out after 0ms
1> at org.elasticsearch.env.NodeEnvironment$InternalShardLock.acquire(NodeEnvironment.java:736) ~[main/:?]
1> at org.elasticsearch.env.NodeEnvironment.shardLock(NodeEnvironment.java:655) ~[main/:?]
1> at org.elasticsearch.env.NodeEnvironment.lockAllForIndex(NodeEnvironment.java:601) ~[main/:?]
1> at org.elasticsearch.env.NodeEnvironment.deleteIndexDirectorySafe(NodeEnvironment.java:554) ~[main/:?]
```
In the hope that we will be able to determine why the shard is still locked.
Relates to #30290 as well as some other CI failures
Today we block on the ReferenceManager in the case of a scheduled refresh.
Yet if there is a refresh happening concurrently we might block and create
very smallish segments. Instead we should just move on to the next shard
and free up the refresh thread instead.
* Removed obviously unused fields+methods
* Inlined public methods that only had one caller
* Simplified `Optional` chain
* Simplified some obviously redundant conditions
* Use actual master node, not just a master elligible node when trying to cancel publication. This only works on the master and for unlucky seeds we never try the master within the 10s that the busy assert runs.
* Closes#36813
Backport of #39350
Contains the following:
* LUCENE-8635: Move terms dictionary off-heap for non-primary-key fields in `MMapDirectory`
* LUCENE-8292: `TermsEnum` is fully abstract
* LUCENE-8679: Return WITHIN in `EdgeTree#relateTriangle` only when polygon and triangle share one edge
* LUCENE-8676: Nori tokenizer deals correctly with large buffers
* LUCENE-8697: `GraphTokenStreamFiniteStrings` better handles side paths with gaps
* LUCENE-8664: Add `equals` and `hashCode` to `TotalHits`
* LUCENE-8660: `TopDocsCollector` returns accurate hit counts if the total equals the threshold
* LUCENE-8654: `Polygon2D#relateTriangle` fix for when the polygon is inside the triangle
* LUCENE-8645: `Intervals#fixField` can merge intervals from different fields
* LUCENE-8585: Create jump-tables for DocValues at index time
When trying to add a setting to the keystore with an upper case name, we
reject with an unclear error message. This commit makes that error
message much clearer.
* Don't Ping on Handshake Connection
* It does not make sense to run pings on the handshake connection
* Set the ping interval to `-1` to deactivate pings on it
Today the abstract `org.elasticsearch.discovery.PeerFinder` uses the logger of
its implementation, which in production is in `o.e.cluster.coordination`. This
turns out to be confusing and unhelpful, so with this change we move to using
the logger that belongs to `PeerFinder`.
Currently there are two security tests that specifically target the
netty security transport. This PR moves the client authentication tests
into `AbstractSimpleSecurityTransportTestCase` so that the nio transport
will also be tested.
Additionally the work to build transport configurations is moved out of
the netty transport and tested independently.
Since #39006, we should either remove `testWaitForPendingSeqNo`
or adjust it not to wait for the pending operations. This change picks
the latter.
Relates #39006
`withLocalReduction` is confusing as `local` effectively means "local
to the remote clusters" rather than "local the coordinating node" where
the method is executed. I propose we rename the method to
`crossClusterSearch` which better resembles what the static method is
used for.
We have been calling `reduce` against completion suggestions twice, once
in `SearchPhaseController#reducedQueryPhase` where all suggestions get
reduced, and once more in `SearchPhaseController#sortDocs` where we
add the top completion suggestions to the `TopDocs` so their docs can
be fetched. There is no need to do reduction twice. All suggestions can
be reduced in one call, then we can filter the result and pass only the
already reduced completion suggestions over to `sortDocs`. The small
important detail is that `shardIndex`, which is currently used only
to fetch suggestions hits, needs to be set before the first reduction,
hence outside of `sortDocs` where we have been doing it until now.
Checks that the core coordination algorithm implemented as part of Zen2 (#32006) supports
linearizable semantics. This commit adds a linearizability checker based on the Wing and Gong
graph search algorithm with support for compositional checking and activates these checks for all
CoordinatorTests.
Today we do not bubble up exceptions when processing NoOps but always
treat them as document-level failures. This incorrect treatment causes
the assert_no_failure being tripped in peer-recovery if IndexWriter was
closed exceptionally before.
Closes#38898
Today when users upgrade to 7.0, existing indices will automatically
switch to soft-deletes without an opt-out option. With this change,
we only enable soft-deletes by default for new indices.
Relates #36141
Moves the check for tiny polygons earlier in the test. It turned out
that polygons can be so tiny that we cannot even figure out their
orientation.
Relates to #37356
1. Setting length for formatWarning String to avoid AbstractStringBuilder.ensureCapacityInternal calls
2. Adding extra check for parameter array length == 0 to avoid unnecessarily creating StringBuilder in LoggerMessageFormat.format
Helps to narrow the performance gap in throughout for geonames benchmark (#37411) by 3%. For more details: https://github.com/elastic/elasticsearch/issues/37530#issuecomment-462758384
Relates to #37530
Relates to #37411
Relates to #35754
* Set mappings when creating indices in SuggestSearchIT
These tests don't test dynamic mapping, so they can use preset mappings. This
removes the possibility they may fail due to the mapping not being available
since mapping updates are asynchronous.
Resolves#39315
* Wrap creates in assertAcked
With this change, we won't wait for the local checkpoint to advance to
the max_seq_no before starting phase2 of peer-recovery. We also remove
the sequence number range check in peer-recovery. We can safely do these
thanks to Yannick's finding.
The replication group to be used is currently sampled after indexing
into the primary (see `ReplicationOperation` class). This means that
when initiating tracking of a new replica, we have to consider the
following two cases:
- There are operations for which the replication group has not been
sampled yet. As we initiated the new replica as tracking, we know that
those operations will be replicated to the new replica and follow the
typical replication group semantics (e.g. marked as stale when
unavailable).
- There are operations for which the replication group has already been
sampled. These operations will not be sent to the new replica. However,
we know that those operations are already indexed into Lucene and the
translog on the primary, as the sampling is happening after that. This
means that by taking a snapshot of Lucene or the translog, we will be
getting those ops as well. What we cannot guarantee anymore is that all
ops up to `endingSeqNo` are available in the snapshot (i.e. also see
comment in `RecoverySourceHandler` saying `We need to wait for all
operations up to the current max to complete, otherwise we can not
guarantee that all operations in the required range will be available
for replaying from the translog of the source.`). This is not needed,
though, as we can no longer guarantee that max seq no == local
checkpoint.
Relates #39000Closes#38949
Co-authored-by: Yannick Welsch <yannick@welsch.lu>
Today this test catches an exception and asserts that its proximate cause has
message `Random IOException` but occasionally this exception is wrapped two
layers deep, causing the test to fail. This commit adjusts the test to look at
the root cause of the exception instead.
1> [2019-02-25T12:31:50,837][INFO ][o.e.s.SharedClusterSnapshotRestoreIT] [testSnapshotFileFailureDuringSnapshot] --> caught a top level exception, asserting what's expected
1> org.elasticsearch.snapshots.SnapshotException: [test-repo:test-snap/e-hn_pLGRmOo97ENEXdQMQ] Snapshot could not be read
1> at org.elasticsearch.snapshots.SnapshotsService.snapshots(SnapshotsService.java:212) ~[main/:?]
1> at org.elasticsearch.action.admin.cluster.snapshots.get.TransportGetSnapshotsAction.masterOperation(TransportGetSnapshotsAction.java:135) ~[main/:?]
1> at org.elasticsearch.action.admin.cluster.snapshots.get.TransportGetSnapshotsAction.masterOperation(TransportGetSnapshotsAction.java:54) ~[main/:?]
1> at org.elasticsearch.action.support.master.TransportMasterNodeAction.masterOperation(TransportMasterNodeAction.java:127) ~[main/:?]
1> at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$2.doRun(TransportMasterNodeAction.java:208) ~[main/:?]
1> at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:751) ~[main/:?]
1> at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[main/:?]
1> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_202]
1> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_202]
1> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_202]
1> Caused by: org.elasticsearch.snapshots.SnapshotException: [test-repo:test-snap/e-hn_pLGRmOo97ENEXdQMQ] failed to get snapshots
1> at org.elasticsearch.repositories.blobstore.BlobStoreRepository.getSnapshotInfo(BlobStoreRepository.java:564) ~[main/:?]
1> at org.elasticsearch.snapshots.SnapshotsService.snapshots(SnapshotsService.java:206) ~[main/:?]
1> ... 9 more
1> Caused by: java.io.IOException: Random IOException
1> at org.elasticsearch.snapshots.mockstore.MockRepository$MockBlobStore$MockBlobContainer.maybeIOExceptionOrBlock(MockRepository.java:275) ~[test/:?]
1> at org.elasticsearch.snapshots.mockstore.MockRepository$MockBlobStore$MockBlobContainer.readBlob(MockRepository.java:317) ~[test/:?]
1> at org.elasticsearch.repositories.blobstore.ChecksumBlobStoreFormat.readBlob(ChecksumBlobStoreFormat.java:101) ~[main/:?]
1> at org.elasticsearch.repositories.blobstore.BlobStoreFormat.read(BlobStoreFormat.java:90) ~[main/:?]
1> at org.elasticsearch.repositories.blobstore.BlobStoreRepository.getSnapshotInfo(BlobStoreRepository.java:560) ~[main/:?]
1> at org.elasticsearch.snapshots.SnapshotsService.snapshots(SnapshotsService.java:206) ~[main/:?]
1> ... 9 more
FAILURE 0.59s J0 | SharedClusterSnapshotRestoreIT.testSnapshotFileFailureDuringSnapshot <<< FAILURES!
> Throwable #1: java.lang.AssertionError:
> Expected: a string containing "Random IOException"
> but: was "[test-repo:test-snap/e-hn_pLGRmOo97ENEXdQMQ] failed to get snapshots"
> at __randomizedtesting.SeedInfo.seed([B73CA847D4B4F52D:884E042D2D899330]:0)
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
> at org.elasticsearch.snapshots.SharedClusterSnapshotRestoreIT.testSnapshotFileFailureDuringSnapshot(SharedClusterSnapshotRestoreIT.java:821)
> at java.lang.Thread.run(Thread.java:748)
In integration tests where `setBootstrapMasterNodeIndex()` is used in
combination with `autoMinMasterNodes = false` the cluster can start
bootstrapping once the number of nodes set with the
`setBootstrapMasterNodeIndex` have been started but it's not ensured
that all nodes have successfully joined to form the cluster.
This behaviour was introduced with 5db7ed22a0
and in order to ensure that the cluster is properly formed before proceeding
with the integration test, use `ensureStableCluster()` with the
appropriate number of expected nodes.
Fixes: #39220