When a primary shard is recovered from its store, we trim the last
commit (when it's unsafe). If that primary crashes before the recovery
completes, we will lose the committed retention leases because they are
baked in the last commit. With this change, we copy the retention leases
from the last commit to the safe commit when trimming unsafe commits.
Relates #37165
In this case, we were incrementing the policy too much. This means on
every iteration we actually keep increasing the minimum retained
sequence number, even with leases in place. It was a bug from when the
soft deletes policy had retention leases incorporated into it. This
commit fixes this bug by ensuring we only increment in the proper
places, and adds careful tests for the various situations.
Forward port of https://github.com/elastic/elasticsearch/pull/38757
This change reverts the initial 7.0 commits and replaces them
with the 6.7 variant that still allows for the ecs flag.
This commit differs from the 6.7 variants in that ecs flag will
now default to true.
6.7: `ecs` : default `false`
7.x: `ecs` : default `true`
8.0: no option, but behaves as `true`
* Revert "Ingest node - user agent, move device to an object (#38115)"
This reverts commit 5b008a34aa.
* Revert "Add ECS schema for user-agent ingest processor (#37727) (#37984)"
This reverts commit cac6b8e06f.
* cherry-pick 5dfe1935345da3799931fd4a3ebe0b6aa9c17f57
Add ECS schema for user-agent ingest processor (#37727)
* cherry-pick ec8ddc890a34853ee8db6af66f608b0ad0cd1099
Ingest node - user agent, move device to an object (#38115) (#38121)
* cherry-pick f63cbdb9b426ba24ee4d987ca767ca05a22f2fbb (with manual merge fixes)
Dep. check for ECS changes to User Agent processor (#38362)
* make true the default for the ecs option, and update 7.0 references and tests
The hardcoded '\n' in string will not work in Windows where there is a
different line separator. A System.lineSeparator should be used to make
it work on all platforms
closes#38705
backport #38771
- Disables the request cache on the test, to prevent cached
values from potentially interfering with test results
- Changes the test to execute a single query, in hopes of making
failures more reproducible
Backport of #38583
There were two documents (seq=2 and seq=103) missing on the follower in
one of the failures of `testFailOverOnFollower`. I spent several hours
on that failure but could not figure out the reason. I adjust log and
unmute this test so we can collect more information.
Relates #38633
We need to use the current primary term instead of 1L for the initial
retention leases; otherwise, the primary term of the committed
retention leases won't match the current primary term if the
retention leases never gets updated.
This commit introduces actions for some common retention lease
operations that clients need to be able to perform remotely. These
actions include add/renew/remove.
fix tests to use clock in milliseconds precision in watcher code
make sure the date comparison in string format is using same formatters
some of the code was modified in #38514 possibly because of merge conflicts
closes#38581
Backport #38738
A recent test failure triggered an edge case scenario where failures may be coming back with the same shard id, yet from different clusters.
This commit adapts the failures comparator to take the cluster alias into account when merging failures as part of CCS requests execution.
Also the corresponding test has been split in two: with and without
search shard target set to the failure.
Closes#38672
When a retention lease already exists on an add retention lease
invocation, or a retention lease is not found on a renew retention lease
invocation today we throw an illegal argument exception. This puts a
burden on the caller to catch that specific exception and parse the
message. This commit relieves the burden from the caller by adding
dedicated exception types for these situations.
This commit introduces the ability to remove retention leases. Explicit
removal will be needed to manage retention leases used to increase the
likelihood of operation-based recoveries syncing, and for consumers such
as ILM.
geo_shape indexes created before 6.6 use geohash string encoding as default tree parameter and quadtree encoding for 6.6 and later. This commit fixes bwc to use geohash encoding in LegacyGeoshapeFieldMapper for indexes created before 6.6.
The assertion that the stats2 map is empty in
IndicesQueryCache.close has been observed to
fail very occasionally in internal cluster tests.
The likely cause is a cross-thread visibility
problem for a count variable. This change
makes that count volatile.
Relates #37117
Backport of #38714
The Close Index API has been refactored in 6.7.0 and it now performs
pre-closing sanity checks on shards before an index is closed: the maximum
sequence number must be equals to the global checkpoint. While this is a
strong requirement for regular shards, we identified the need to relax this
check in the case of CCR following shards.
The following shards are not in charge of managing the max sequence
number or global checkpoint, which are pulled from a leader shard. They
also fetch and process batches of operations from the leader in an unordered
way, potentially leaving gaps in the history of ops. If the following shard lags
a lot it's possible that the global checkpoint and max seq number never get
in sync, preventing the following shard to be closed and a new PUT Follow
action to be issued on this shard (which is our recommended way to
resume/restart a CCR following).
This commit allows each Engine implementation to define the specific
verification it must perform before closing the index. In order to allow
following/frozen/closed shards to be closed whatever the max seq number
or global checkpoint are, the FollowingEngine and ReadOnlyEngine do
not perform any check before the index is closed.
Co-authored-by: Martijn van Groningen <martijn.v.groningen@gmail.com>
Added a constructor accepting `StreamInput` as argument, which allowed to
make most of the instance members final as well as remove the default
constructor.
Removed a test only constructor in favour of invoking the existing
constructor that takes a `SearchRequest` as first argument.
Also removed profile members and related methods as they were all unused.
The existing formatter being used was not on par with the joda formatter
as it was missing the ability to parse a comma as a separator between
seconds and milliseconds.
While a real iso8601 would be much more complex, this might be
sufficient for some more use-cases.
The ingest date formatter now also uses the iso8601 formatter by
default.
Closes#38345
The benchmarks showed a sharp decrease in aggregation performance for
the UTC case.
This commit uses the same calculation as joda time, which requires no
conversion into any java time object, also, the check for an fixedoffset
has been put into the ctor to reduce the need for runtime calculations.
The same goes for the amount of the used unit in milliseconds.
Closes#37826
Whenever phase failure is raised in AbstractSearchAsyncAction, we go and
release search contexts of shards that successfully returned their
results, prior to notifying the listener of the failure. In case we are
executing a CCS request, it's important to look-up the connection to
send the release context request to.
This commit makes sure that the lookup takes the cluster alias into
account. We used to use `null` at all times instead which is not correct
and was not caught as any exception is caught without re-throwing it.
The test was relying on toString in ZonedDateTime which is different to
what is formatted by strict_date_time when milliseconds are 0
The method is just delegating to dateFormatter, so that scenario should
be covered there.
closes#38359
Backport #38610
Adds the ability to fetch chunks from different files in parallel, configurable using the new `ccr.indices.recovery.max_concurrent_file_chunks` setting, which defaults to 5 in this PR.
The implementation uses the parallel file writer functionality that is also used by peer recoveries.
This commit adds the 7.1 version constant to the 7.x branch.
Co-authored-by: Andy Bristol <andy.bristol@elastic.co>
Co-authored-by: Tim Brooks <tim@uncontended.net>
Co-authored-by: Christoph Büscher <cbuescher@posteo.de>
Co-authored-by: Luca Cavanna <javanna@users.noreply.github.com>
Co-authored-by: markharwood <markharwood@gmail.com>
Co-authored-by: Ioannis Kakavas <ioannis@elastic.co>
Co-authored-by: Nhat Nguyen <nhat.nguyen@elastic.co>
Co-authored-by: David Roberts <dave.roberts@elastic.co>
Co-authored-by: Jason Tedor <jason@tedor.me>
Co-authored-by: Alpar Torok <torokalpar@gmail.com>
Co-authored-by: David Turner <david.turner@elastic.co>
Co-authored-by: Martijn van Groningen <martijn.v.groningen@gmail.com>
Co-authored-by: Tim Vernum <tim@adjective.org>
Co-authored-by: Albert Zaharovits <albert.zaharovits@gmail.com>
This commit changes the `TransportVerifyShardBeforeCloseAction` so that it
always forces the flush of the shard. It seems that #37961 is not sufficient to
ensure that the translog and the Lucene commit share the exact same max
seq no and global checkpoint information in case of one or more noop
operations have been made.
The `BulkWithUpdatesIT.testThatMissingIndexDoesNotAbortFullBulkRequest`
and `FrozenIndexTests.testFreezeEmptyIndexWithTranslogOps` test this trivial
situation and they both fail 1 on 10 executions.
Relates to #33888
In #38333 and #38350 we moved away from the `discovery.zen` settings namespace
since these settings have an effect even though Zen Discovery itself is being
phased out. This change aligns the documentation and the names of related
classes and methods with the newly-introduced naming conventions.
This commit integrates retention leases with recovery. With this change,
we copy the current retention leases on primary to the replica during
phase two of recovery. At this point, the replica will have been added
to the replication group and so is already receiving retention lease
sync requests from the primary. This means that if any retention lease
syncs are triggered on the primary after we sample the retention leases
here during phase two, that sync request will also arrive on the replica
ensuring that the replica is from this point on up to date with the
retention leases on the primary. We have to copy these during phase two
since we will be applying indexing operations, potentially triggering
merges, and therefore must ensure the correct retention leases are in
place beforehand.
Instead of logging warnings we now rethrow exceptions thrown inside
scheduled/submitted tasks. This will still log them as warnings in
production but has the added benefit that if they are thrown during
unit/integration test runs, the test will be flagged as an error.
This is a continuation of #38014
Fixed NPE that caused CCR tests (IndexFollowingIT and likely others)
to fail.
schedule could bubble rejected exception to uncaught exception
handler when not using SAME executor if thread pool is terminated.
Now ignore rejected exception silently if executor is shutdown.
Elasticsearch has long [supported](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#index-versioning) compare and set (a.k.a optimistic concurrency control) operations using internal document versioning. Sadly that approach is flawed and can sometime do the wrong thing. Here's the relevant excerpt from the resiliency status page:
> When a primary has been partitioned away from the cluster there is a short period of time until it detects this. During that time it will continue indexing writes locally, thereby updating document versions. When it tries to replicate the operation, however, it will discover that it is partitioned away. It won’t acknowledge the write and will wait until the partition is resolved to negotiate with the master on how to proceed. The master will decide to either fail any replicas which failed to index the operations on the primary or tell the primary that it has to step down because a new primary has been chosen in the meantime. Since the old primary has already written documents, clients may already have read from the old primary before it shuts itself down. The version numbers of these reads may not be unique if the new primary has already accepted writes for the same document
We recently [introduced](https://www.elastic.co/guide/en/elasticsearch/reference/6.x/optimistic-concurrency-control.html) a new sequence number based approach that doesn't suffer from this dirty reads problem.
This commit removes support for internal versioning as a concurrency control mechanism in favor of the sequence number approach.
Relates to #1078
This commit lifts the control of when retention leases are expired to
index shard. In this case, we move expiration to an explicit action
rather than a side-effect of calling
ReplicationTracker#getRetentionLeases. This explicit action is invoked
on a timer. If any retention leases expire, then we hard sync the
retention leases to the replicas. Otherwise, we proceed with a
background sync.
Currently the snapshot/restore process manually sets the global
checkpoint to the max sequence number from the restored segements. This
does not work for Ccr as this will lead to documents that would be
recovered in the normal followering operation from being recovered.
This commit fixes this issue by setting the initial global checkpoint to
the existing local checkpoint.
Previously, date formats of `YYYY.MM.dd` would hit an issue
where the year would jump towards the end of the calendar year.
This was an issue that had since been resolved in tests by using
`yyyy` to be the more accurate representation of the year.
Closes#37037.
`CreateIndexRequest#source(Map<String, Object>, ... )`, which is used when
deserializing index creation requests, accidentally accepts mappings that are
nested twice under the type key (as described in the bug report #38266).
This in turn causes us to be too lenient in parsing typeless mappings. In
particular, we accept the following index creation request, even though it
should not contain the type key `_doc`:
```
PUT index?include_type_name=false
{
"mappings": {
"_doc": {
"properties": { ... }
}
}
}
```
There is a similar issue for both 'put templates' and 'put mappings' requests
as well.
This PR makes the minimal changes to detect and reject these typed mappings in
requests. It does not address #38266 generally, or attempt a larger refactor
around types in these server-side requests, as I think this should be done at a
later time.
With this change we no longer support pluggable discovery implementations. No
known implementations of `DiscoveryPlugin` actually override this method, so in
practice this should have no effect on the wider world. However, we were using
this rather extensively in tests to provide the `test-zen` discovery type. We
no longer need a separate discovery type for tests as we no longer need to
customise its behaviour.
Relates #38410
Today we throw a fatal `RuntimeException` if an exception occurs in
`getMasterName()`, and this includes the case where there is currently no
master. However, sometimes we call this method inside an `assertBusy()` in
order to allow for a cluster that is in the process of stabilising and electing
a master. The trouble is that `assertBusy()` only retries on an
`AssertionError` and not on a general `RuntimeException`, so the lack of a
master is immediately fatal.
This commit fixes the issue by asserting there is a master, triggering a retry
if there is not.
Fixes#38331
* The problem in #38226 is that in some corner cases multiple calls to `endSnapshot` were made concurrently, leading to non-deterministic behavior (`beginSnapshot` was triggering a repository finalization while one that was triggered by a `deleteSnapshot` was already in progress)
* Fixed by:
* Making all `endSnapshot` calls originate from the cluster state being in a "completed" state (apart from on short-circuit on initializing an empty snapshot). This forced putting the failure string into `SnapshotsInProgress.Entry`.
* Adding deduplication logic to `endSnapshot`
* Also:
* Streamlined the init behavior to work the same way (keep state on the `SnapshotsService` to decide which snapshot entries are stale)
* closes#38226
We already support unknown objects in the list of pipelines, this changes the
`PipelineConfiguration` to support fields other than just `id` and `config`.
Relates to #36938
Renames the following settings to remove the mention of `zen` in their names:
- `discovery.zen.hosts_provider` -> `discovery.seed_providers`
- `discovery.zen.ping.unicast.concurrent_connects` -> `discovery.seed_resolver.max_concurrent_resolvers`
- `discovery.zen.ping.unicast.hosts.resolve_timeout` -> `discovery.seed_resolver.timeout`
- `discovery.zen.ping.unicast.hosts` -> `discovery.seed_addresses`
X-Pack security supports built-in authentication service
`token-service` that allows access tokens to be used to
access Elasticsearch without using Basic authentication.
The tokens are generated by `token-service` based on
OAuth2 spec. The access token is a short-lived token
(defaults to 20m) and refresh token with a lifetime of 24 hours,
making them unsuitable for long-lived or recurring tasks where
the system might go offline thereby failing refresh of tokens.
This commit introduces a built-in authentication service
`api-key-service` that adds support for long-lived tokens aka API
keys to access Elasticsearch. The `api-key-service` is consulted
after `token-service` in the authentication chain. By default,
if TLS is enabled then `api-key-service` is also enabled.
The service can be disabled using the configuration setting.
The API keys:-
- by default do not have an expiration but expiration can be
configured where the API keys need to be expired after a
certain amount of time.
- when generated will keep authentication information of the user that
generated them.
- can be defined with a role describing the privileges for accessing
Elasticsearch and will be limited by the role of the user that
generated them
- can be invalidated via invalidation API
- information can be retrieved via a get API
- that have been expired or invalidated will be retained for 1 week
before being deleted. The expired API keys remover task handles this.
Following are the API key management APIs:-
1. Create API Key - `PUT/POST /_security/api_key`
2. Get API key(s) - `GET /_security/api_key`
3. Invalidate API Key(s) `DELETE /_security/api_key`
The API keys can be used to access Elasticsearch using `Authorization`
header, where the auth scheme is `ApiKey` and the credentials, is the
base64 encoding of API key Id and API key separated by a colon.
Example:-
```
curl -H "Authorization: ApiKey YXBpLWtleS1pZDphcGkta2V5" http://localhost:9200/_cluster/health
```
Closes#34383