This cleans up the Engine implementation by separating the sequence number generation from the
planning step in the engine, to avoid for the planning step to have any side effects. This makes it
easier to see that every sequence number is properly accounted for.
Fix bug in IndexResolver that caused conflicts in multi-field types to
be ignored up (causing the query to fail later on due to mapping
conflicts).
The issue was caused by the multi-field which forced the parent creation
before checking its validity across mappings
Fix#39547
(cherry picked from commit 4e4fe289f90b9b5eae09072d54903701a3128696)
Queries that require counting of all hits (COUNT(*) on implicit
group by), now enable accurate hit tracking.
Fix#37971
(cherry picked from commit 265b637cf6df08986a890b8b5daf012c2b0c1699)
This commit parallelizes some parts of the test
and its remove an unnecessary refresh call.
On my local machine it shaves off about 15 seconds
for a test execution time of ~64s (down from ~80s).
This test is still slow but progress over perfection.
Relates #37339
This is a backport of #38382
This change adds supports for the concurrent refresh of access
tokens as described in #36872
In short it allows subsequent client requests to refresh the same token that
come within a predefined window of 60 seconds to be handled as duplicates
of the original one and thus receive the same response with the same newly
issued access token and refresh token.
In order to support that, two new fields are added in the token document. One
contains the instant (in epoqueMillis) when a given refresh token is refreshed
and one that contains a pointer to the token document that stores the new
refresh token and access token that was created by the original refresh.
A side effect of this change, that was however also a intended enhancement
for the token service, is that we needed to stop encrypting the string
representation of the UserToken while serializing. ( It was necessary as we
correctly used a new IV for every time we encrypted a token in serialization, so
subsequent serializations of the same exact UserToken would produce
different access token strings)
This change also handles the serialization/deserialization BWC logic:
- In mixed clusters we keep creating tokens in the old format and
consume only old format tokens
- In upgraded clusters, we start creating tokens in the new format but
still remain able to consume old format tokens (that could have been
created during the rolling upgrade and are still valid)
Resolves#36872
Co-authored-by: Jay Modi jaymode@users.noreply.github.com
Backport support for replicating closed indices (#39499)
Before this change, closed indexes were simply not replicated. It was therefore
possible to close an index and then decommission a data node without knowing
that this data node contained shards of the closed index, potentially leading to
data loss. Shards of closed indices were not completely taken into account when
balancing the shards within the cluster, or automatically replicated through shard
copies, and they were not easily movable from node A to node B using APIs like
Cluster Reroute without being fully reopened and closed again.
This commit changes the logic executed when closing an index, so that its shards
are not just removed and forgotten but are instead reinitialized and reallocated on
data nodes using an engine implementation which does not allow searching or
indexing, which has a low memory overhead (compared with searchable/indexable
opened shards) and which allows shards to be recovered from peer or promoted
as primaries when needed.
This new closing logic is built on top of the new Close Index API introduced in
6.7.0 (#37359). Some pre-closing sanity checks are executed on the shards before
closing them, and closing an index on a 8.0 cluster will reinitialize the index shards
and therefore impact the cluster health.
Some APIs have been adapted to make them work with closed indices:
- Cluster Health API
- Cluster Reroute API
- Cluster Allocation Explain API
- Recovery API
- Cat Indices
- Cat Shards
- Cat Health
- Cat Recovery
This commit contains all the following changes (most recent first):
* c6c42a1 Adapt NoOpEngineTests after #39006
* 3f9993d Wait for shards to be active after closing indices (#38854)
* 5e7a428 Adapt the Cluster Health API to closed indices (#39364)
* 3e61939 Adapt CloseFollowerIndexIT for replicated closed indices (#38767)
* 71f5c34 Recover closed indices after a full cluster restart (#39249)
* 4db7fd9 Adapt the Recovery API for closed indices (#38421)
* 4fd1bb2 Adapt more tests suites to closed indices (#39186)
* 0519016 Add replica to primary promotion test for closed indices (#39110)
* b756f6c Test the Cluster Shard Allocation Explain API with closed indices (#38631)
* c484c66 Remove index routing table of closed indices in mixed versions clusters (#38955)
* 00f1828 Mute CloseFollowerIndexIT.testCloseAndReopenFollowerIndex()
* e845b0a Do not schedule Refresh/Translog/GlobalCheckpoint tasks for closed indices (#38329)
* cf9a015 Adapt testIndexCanChangeCustomDataPath for replicated closed indices (#38327)
* b9becdd Adapt testPendingTasks() for replicated closed indices (#38326)
* 02cc730 Allow shards of closed indices to be replicated as regular shards (#38024)
* e53a9be Fix compilation error in IndexShardIT after merge with master
* cae4155 Relax NoOpEngine constraints (#37413)
* 54d110b [RCI] Adapt NoOpEngine to latest FrozenEngine changes
* c63fd69 [RCI] Add NoOpEngine for closed indices (#33903)
Relates to #33888
* SYS COLUMNS will skip UNSUPPORTED field types in ODBC and JDBC, as well.
NESTED and OBJECT types were already skipped in ODBC mode, now they are
skipped in JDBC mode, as well.
(cherry picked from commit 9e0df64b2d36c9069dfa506570468f0522c86417)
For functions: move checks for `text` fields without underlying `keyword`
fields or with many of them (ambiguity) to the type resolution stage.
For Order By/Group By: move checks to the `Verifier` to catch early
before `QueryTranslator` or execution.
Closes: #38501Fixes: #35203
This commit adds a simple integ test that exercises the flow:
* snapshot .security
* delete .security
* restore .security
, checking that the Native Realm works as expected.
Relates #34454
fix a couple of odd behaviors of data frame transforms REST API's:
- check if id from body and id from URL match if both are specified
- do not allow a body for delete
- allow get and stats without specifying an id
Backport of #39325
When ILM is disabled and Watcher is setting up the templates and policies for
the watch history indices, it will now use a template that does not have the
`index.lifecycle.name` setting, so that indices are not created with the
setting.
This also adds tests for the behavior, and changes the cluster state used in
these tests to be real instead of mocked.
Resolves#38805
This change fixes the tests that expect the reload of a
SSLConfiguration to fail. The tests relied on an incorrect assumption
that the reloader only called reload on for an SSLConfiguration if the
key and trust managers were successfully reloaded, but that is not the
case. This change removes the fail call with a wrapped call to the
original method and captures the exception and counts down a latch to
make these tests consistently tested.
Closes#39260
Backport of #39350
Contains the following:
* LUCENE-8635: Move terms dictionary off-heap for non-primary-key fields in `MMapDirectory`
* LUCENE-8292: `TermsEnum` is fully abstract
* LUCENE-8679: Return WITHIN in `EdgeTree#relateTriangle` only when polygon and triangle share one edge
* LUCENE-8676: Nori tokenizer deals correctly with large buffers
* LUCENE-8697: `GraphTokenStreamFiniteStrings` better handles side paths with gaps
* LUCENE-8664: Add `equals` and `hashCode` to `TotalHits`
* LUCENE-8660: `TopDocsCollector` returns accurate hit counts if the total equals the threshold
* LUCENE-8654: `Polygon2D#relateTriangle` fix for when the polygon is inside the triangle
* LUCENE-8645: `Intervals#fixField` can merge intervals from different fields
* LUCENE-8585: Create jump-tables for DocValues at index time
Previously, if a text field had an underlying keyword field
the latter was not used instead of the text leading to wrong
results returned by queries filtering with LIKE/RLIKE.
Fixes: #39442
The ScheduledEvent class has never preserved the time
zone so it makes more sense for it to store the start and
end time using Instant rather than ZonedDateTime.
Closes#38620
* Add "columnar" option for REST requests (but be lenient for non-"plain"
modes) for json, yaml, smile and cbor formats.
* Updated documentation
(cherry picked from commit 5b7e0de237fb514d14a61a347bc669d4b4adbe56)
Currently there are two security tests that specifically target the
netty security transport. This PR moves the client authentication tests
into `AbstractSimpleSecurityTransportTestCase` so that the nio transport
will also be tested.
Additionally the work to build transport configurations is moved out of
the netty transport and tested independently.
This changes the name of the internal security index to ".security-7",
but supports indices that were upgraded from earlier versions and use
the ".security-6" name.
In all cases, both ".security-6" and ".security-7" are considered to
be restricted index names regardless of which name is actually in use
on the cluster.
Backport of: #39337
Some small fix for the `x-pack` rest api spec.
* In both `security.enable_user.json` and `security.disable_user.json`
the `username` parameter was `false` instead of `true`
(the documentation is already correct).
* In `security.get_privileges.json` there were missing all the
possible paths since the path parameters are not required.
This fix aligns the document with the rest of the spec,
where all the possible combinations are listed.
It is possible that the Unfollow API may fail to release shard history
retention leases when unfollowing, so this needs to be handled by the
ILM Unfollow action. There's nothing much that can be done automatically
about it from the follower side, so this change makes the ILM unfollow
action simply ignore those failures.
This change is a backport of #39252
- Fixes TokenBackwardsCompatibilityIT: Existing tests seemed to made
the assumption that in the oneThirdUpgraded stage the master node
will be on the old version and in the twoThirdsUpgraded stage, the
master node will be one of the upgraded ones. However, there is no
guarantee that the master node in any of the states will or will
not be one of the upgraded ones.
This class now tests:
- That we can generate and consume tokens before we start the
rolling upgrade.
- That we can consume tokens generated in the old cluster during
all the stages of the rolling upgrade.
- That while on a mixed cluster, when/if the master node is
upgraded, we can generate, consume and refresh a token
- That after the rolling upgrade, we can consume a token
generated in an old cluster and can invalidate it so that it
can't be used any more.
- Ensures that during the rolling upgrade, the upgraded nodes have
the same configuration as the old nodes. Specifically that the
file realm we use is explicitly named `file1`. This is needed
because while attempting to refresh a token in a mixed cluster
we might create a token hitting an old node and attempt to refresh
it hitting a new node. If the file realm name is not the same, the
refresh will be seen as being made by a "different" client, and
will, thus, fail.
- Renames the Authentication variable we check while refreshing a
token to be clientAuth in order to make the code more readable.
Some of the above were possibly causing the flakiness of #37379
Today when users upgrade to 7.0, existing indices will automatically
switch to soft-deletes without an opt-out option. With this change,
we only enable soft-deletes by default for new indices.
Relates #36141
This commit is the final piece of the integration of CCR with retention
leases. Namely, we periodically renew retention leases and advance the
retaining sequence number while following.
* Remove Hipchat support from Watcher (#39199)
Hipchat has been shut down and has previously been deprecated in
Watcher (#39160), therefore we should remove support for these actions.
* Add migrate note
With this change, we won't wait for the local checkpoint to advance to
the max_seq_no before starting phase2 of peer-recovery. We also remove
the sequence number range check in peer-recovery. We can safely do these
thanks to Yannick's finding.
The replication group to be used is currently sampled after indexing
into the primary (see `ReplicationOperation` class). This means that
when initiating tracking of a new replica, we have to consider the
following two cases:
- There are operations for which the replication group has not been
sampled yet. As we initiated the new replica as tracking, we know that
those operations will be replicated to the new replica and follow the
typical replication group semantics (e.g. marked as stale when
unavailable).
- There are operations for which the replication group has already been
sampled. These operations will not be sent to the new replica. However,
we know that those operations are already indexed into Lucene and the
translog on the primary, as the sampling is happening after that. This
means that by taking a snapshot of Lucene or the translog, we will be
getting those ops as well. What we cannot guarantee anymore is that all
ops up to `endingSeqNo` are available in the snapshot (i.e. also see
comment in `RecoverySourceHandler` saying `We need to wait for all
operations up to the current max to complete, otherwise we can not
guarantee that all operations in the required range will be available
for replaying from the translog of the source.`). This is not needed,
though, as we can no longer guarantee that max seq no == local
checkpoint.
Relates #39000Closes#38949
Co-authored-by: Yannick Welsch <yannick@welsch.lu>
when dealing with TimeoutException
The `IndexFollowingIT#testDeleteLeaderIndex()`` test failed,
because a NPE was captured as fatal error instead of an IndexNotFoundException.
Closes#39308