The new ops based recovery, introduce as part of #10708, is based on the assumption that all operations below the global checkpoint known to the replica do not need to be synced with the primary. This is based on the guarantee that all ops below it are available on primary and they are equal. Under normal operations this guarantee holds. Sadly, it can be violated when a primary is restored from an old snapshot. At the point the restore primary can miss operations below the replica's global checkpoint, or even worse may have total different operations at the same spot. This PR introduces the notion of a history uuid to be able to capture the difference with the restored primary (in a follow up PR).
The History UUID is generated by a primary when it is first created and is synced to the replicas which are recovered via a file based recovery. The PR adds a requirement to ops based recovery to make sure that the history uuid of the source and the target are equal. Under normal operations, all shard copies will stay with that history uuid for the rest of the index lifetime and thus this is a noop. However, it gives us a place to guarantee we fall back to file base syncing in special events like a restore from snapshot (to be done as a follow up) and when someone calls the truncate translog command which can go wrong when combined with primary recovery (this is done in this PR).
We considered in the past to use the translog uuid for this function (i.e., sync it across copies) and thus avoid adding an extra identifier. This idea was rejected as it removes the ability to verify that a specific translog really belongs to a specific lucene index. We also feel that having a history uuid will serve us well in the future.
Removing several occurrences of this typo in the docs and javadocs, seems to be
a common mistake. Corrections turn up once in a while in PRs, better to correct
some of this in one sweep.
* Fix percolator highlight sub fetch phase to not highlight query twice
The PercolatorHighlightSubFetchPhase does not override hitExecute and since it extends HighlightPhase the search hits
are highlighted twice (by the highlight phase and then by the percolator). This does not alter the results, the second highlighting
just overrides the first one but this slow down the request because it duplicates the work.
This commit refactors the bootstrap checks into a single result object
that encapsulates whether or not the check passed, and a failure message
if the check failed. This simpifies the checks, and enables the messages
to more easily be based on the state used to discern whether or not the
check passed.
Relates #26637
This exposes the node settings and the persistent part of the cluster state to the
bootstrap checks to allow plugins to enforce certain preconditions based on the
recovered state.
After backporting the script_field soft limit to the 6.x branches, this test can
now also run in a mixed cluster.
Relates to #26598
enter the commit message for your changes. Lines starting
This commit pushes the allocation ID down through to the global
checkpoint tracker at construction rather than when activated as a
primary.
Relates #26630
Today we have all non-plugin mappers in core. I'd like to start moving those
that neither map to json datatypes nor are very frequently used like `date` or
`ip` to a module.
This commit creates a new module called `mappers-extra` and moves the
`scaled_float` and `token_count` mappers to it. I'd like to eventually move
`range` fields there but it's more complicated due to their intimate
relationship with range queries.
Relates #10368
Requesting to many script_fields in a search request can be costly
because of script execution. This change introduces a soft limit on the number
of script fields that are allowed per request. The setting can be
changed per index using the index.max_script_fields setting.
Relates to #26390
This PR removes the vInt that precedes every value in order to know how long
they are. Instead the query takes an enum that tells how to compute the length
of values: for fixed-length data (ip addresses, double, float) the length is a
constant while longs and integers use a variable-length representation that
allows the length to be computed from the encoded values.
Also the encoding of ints/longs was made a bit more efficient in order not to
waste 3 bits in the header. As a consequence, values between -8 and 7 can now
be encoded on 1 byte and values between -2048 and 2047 can now be encoded on 2
bytes or less.
Closes#26443
This commit adds a dependency to the install module task on the task
that builds the module. This is needed for standalone integration
tests that require other modules to be installed. Without this, we do
not have a guarantee that the module is bundled.
If the query coordinating node is also a data node that holds all the
shards for a search request, we can end up recursing through the can
match phase (because we send a local request and on response in the
listener move to the next shard and do this again, without ever having
returned from previous shards). This recursion can lead to stack
overflow for even a reasonable number of indices (daily indices over a
sixty days with five shards per day is enough to trigger the stack
overflow). Moreover, all this execution would be happening on a network
thread (the thread that initially received the query). With this commit,
we allow search phases to override max concurrent requests. This allows
the can match phase to avoid recursing through the shards towards a
stack overflow.
Relates #26484
Requesting to many docvalue_fields in a search request can potentially be costly
because it might incur a per-field per-document seek. This change introduces a
soft limit on the number of fields that can be retrieved. The setting can be
changed per index using the `index.max_docvalue_fields_search` setting.
Relates to #26390
You can define a proxy using the following settings:
```yml
azure.client.default.proxy.host: proxy.host
azure.client.default.proxy.port: 8888
azure.client.default.proxy.type: http
```
Supported values for `proxy.type` are `direct`, `http` or `socks`. Defaults to `direct` (no proxy).
Closes#23506
BTW I changed a test `testGetSelectedClientBackoffPolicyNbRetries` as it was using an old setting name `cloud.azure.storage.azure.max_retries` instead of `azure.client.azure1.max_retries`.
Follow up for #23405.
We remove azure deprecated settings in 7.0:
* The legacy azure settings which where starting with `cloud.azure.storage.` prefix have been removed.
This includes `account`, `key`, `default` and `timeout`.
You need to use settings which are starting with `azure.client.` prefix instead.
* Global timeout setting `cloud.azure.storage.timeout` has been removed.
You must set it per azure client instead. Like `azure.client.default.timeout: 10s` for example.
Today we don't have a pluggable way to validate if the cluster state
is compatible with the node that joins. We already apply some checks for index
compatibility that prevents nodes to join a cluster with indices it doesn't support
but for plugins this isn't possible. This change adds a cluster state validator that
allows plugins to prevent a join if the cluster-state is incompatible.
Remove "index.mapper.dynamic" setting for 6.0 (and after) indices, but
still keep working for 5.x (and before) indices. Remove two index
dynamic disable test cases as the disability of index.mapper.dynamic is
already removed for current version. Add a new test class for version
test.
This commit contains:
* update AWS SDK for ECS Task IAM support
* ignore dependencies not essential to `discovery-ec2`:
* jmespath seems to be used for `waiters`
* amazon ion is a protocol not used by EC2 or IAM
Javadoc linking between projects currently relies on
projectSubstitutions. However, that is an extension variable that is not
part of BuildPlugin. This commit moves the javadoc linking into the root
build.gradle, alongside where projectSubstitutions are defined.
This test case was leftover from the static bwc tests. There was still
one use for checking we do not load old indices, but this PR moves the
legacy code needed for that directly into the test. I also opened a
follow up issue to completely remove the unsupported test: #26583.
When determining if a build is a snapshot build, we look for a field in
the JAR manifest. However, when running tests, we are not running with a
compiled core Elasticsearch JAR, we are running with the compiled core
classes on the classpath. We have a fallback for this, we always assume
such a situation is a snapshot build. However, when running builds with
-Dbuild.snapshot=false, this is not the case. As such, we need to
fallback to the value of build.snapshot. However, there are cases where
we are not running with a compiled core Elasticsearch JAR (e.g., when
the transport client is embedded in a web container) so we should only
do this fallback if we are in tests. To verify we are in tests, we check
if randomized runner is on the classpath.
Relates #26554
RangeQueryBuilder needs to perform too many `instanceof` checks in order to
check for `date` or `range` fields in order to know what it should do with the
shape relation, time zone and date format.
This commit adds those 3 parameters to the `rangeQuery` factory method so that
those instanceof checks are not necessary anymore.
* Limit the number of expanded fields it query_string and simple_query_string
This limits the number of automatically expanded fields for the "all fields"
mode (`"default_field": "*"`) for the `query_string` and `simple_query_string`
queries to 1024 fields.
Resolves#25105
* Add blurb about limit to the docs
* Throw a better error message for empty field names
When a document is parsed with a `""` for a field name, we currently throw a
confusing error about `.` being present in the field. This changes the error
message to be clearer about what's causing the problem.
Resolves#23348
* Fix exception message in test
The percolator will add a `_percolator_document_slot` field to all percolator
hits to indicate with what document it has matched. This number matches with
the order in which the documents have been specified in the percolate query.
Also improved the support for multiple percolate queries in a search request.
To protect against poisonous situations, ES will only try to allocate a shard 5 times (by default). After 5 consecutive failures, ES will stop assigning the shard and wait for an operator to fix the problem. Once the problem is fixed, the operator is expected to call `_reroute` with a `retry_failed` flag to force retrying of those shards. Currently that retry flag is only used for a single allocation run. However, if not all shards can be allocated at once (due to throttling) the operator has to keep on calling the API until all shards are assigned which is cumbersome. This PR changes the behavior of the flag to reset the failed allocations counter and this allowing shards to be assigned again.