Reverts #48947 and fixes the issue orginally addressed by removing the assertion.
It turns out we can't simply pass empty shard generations to the snapshot finalization in the
BwC case as that results in no indices being added to the meta for the given snapshot since
we take the indices from the shard generations (even in the BwC case the `null` generations work
fine for this).
Closes#48983
Today in 6.x it is possible to add an index tombstone to the graveyard without
deleting the corresponding index metadata, because the deletion is slightly
deferred. If you shut down the node and upgrade to 7.x when in this state then
the node will fail to apply any cluster states, reporting
java.lang.IllegalStateException: Cannot delete index [...], it is still part of the cluster state.
This commit addresses this situation by skipping over any index metadata with a
corresponding tombstone, allowing this metadata to be cleaned up by the 7.x
node.
This commits sends the cluster name and discovery naode in the transport
level handshake response. This will allow us to stop sending the
transport service level handshake request in the 8.0-8.x release cycle.
It is necessary to start sending this in 7.x so that 8.0 is guaranteed
to be communicating with a version that sends the required information.
Currently the BulkProcessor class uses a single scheduler to schedule
flushes and retries. Functionally these are very different concerns but
can result in a dead lock. Specifically, the single shared scheduler
can kick off a flush task, which only finishes it's task when the bulk
that is being flushed finishes. If (for what ever reason), any items in
that bulk fails it will (by default) schedule a retry. However, that retry
will never run it's task, since the flush task is consuming the 1 and
only thread available from the shared scheduler.
Since the BulkProcessor is mostly client based code, the client can
provide their own scheduler. As-is the scheduler would require
at minimum 2 worker threads to avoid the potential deadlock. Since the
number of threads is a configuration option in the scheduler, the code
can not enforce this 2 worker rule until runtime. For this reason this
commit splits the single task scheduler into 2 schedulers. This eliminates
the potential for the flush task to block the retry task and removes this
deadlock scenario.
This commit also deprecates the Java APIs that presume a single scheduler,
and updates any internal code to no longer use those APIs.
Fixes#47599
Note - #41451 fixed the general case where a bulk fails and is retried
that can result in a deadlock. This fix should address that case as well as
the case when a bulk failure *from the flush* needs to be retried.
We were tripping the assertion that the makes sure we only have empty `ShardGenerations` in `RepositoryData` in the BwC case because shard generations were passed to the `Repository` in the BwC case. Fixed by only generating empty shard gen for BwC snapshots in `SnapshotsService`.
Backport of #48450.
Make a number of changes so that code in the `server` directory is more
resilient to automatic formatting. This covers:
* Reformatting multiline JSON to embed whitespace in the strings
* Move some comments around to they aren't auto-formatted to a strange
place. This also required moving some `&&` and `||` operators from the
end-of-line to start-of-line`.
* Add helper method `reformatJson()`, to strip whitespace from a JSON
document using XContent methods. This is sometimes necessary where
a test is comparing some machine-generated JSON with an expected
value.
Also, `HyperLogLogPlusPlus.java` is now excluded from formatting because it
contains large data tables that don't reformat well with the current settings,
and changing the settings would be worse for the rest of the codebase.
The realtime GET API currently has erratic performance in case where a document is accessed
that has just been indexed but not refreshed yet, as the implementation will currently force an
internal refresh in that case. Refreshing can be an expensive operation, and also will block the
thread that executes the GET operation, blocking other GETs to be processed. In case of
frequent access of recently indexed documents, this can lead to a refresh storm and terrible
GET performance.
While older versions of Elasticsearch (2.x and older) did not trigger refreshes and instead opted
to read from the translog in case of realtime GET API or update API, this was removed in 5.0
(#20102) to avoid inconsistencies between values that were returned from the translog and
those returned by the index. This was partially reverted in 6.3 (#29264) to allow _update and
upsert to read from the translog again as it was easier to guarantee consistency for these, and
also brought back more predictable performance characteristics of this API. Calls to the realtime
GET API, however, would still always do a refresh if necessary to return consistent results. This
means that users that were calling realtime GET APIs to coordinate updates on client side
(realtime GET + CAS for conditional index of updated doc) would still see very erratic
performance.
This PR (together with #48707) resolves the inconsistencies between reading from translog and
index. In particular it fixes the inconsistencies that happen when requesting stored fields, which
were not available when reading from translog. In case where stored fields are requested, this
PR will reparse the _source from the translog and derive the stored fields to be returned. With
this, it changes the realtime GET API to allow reading from the translog again, avoid refresh
storms and blocking the GET threadpool, and provide overall much better and predictable
performance for this API.
We should not open new engines if a shard is closed. We break this
assumption in #45263 where we stop verifying the shard state before
creating an engine but only before swapping the engine reference.
We can fail to snapshot the store metadata or checkIndex a closed shard
if there's some IndexWriter holding the index lock.
Closes#47060
This change fixes a poisonous situation where an ongoing recovery was
canceled because a better copy was found on a node that the cluster had
previously tried allocating the shard to but failed. The solution is to
keep track of the set of nodes that an allocation was failed on so that
we can avoid canceling the current recovery for a copy on failed nodes.
Closes#47974
Today if the primary discovers that an indexing request needs a mapping update
then it will send it to the master for validation and processing. If, however,
the put-mapping request is invalid then the master still processes it as a
(no-op) cluster state update. When there are a large number of indexing
operations that result in invalid mapping updates this can overwhelm the
master.
However, the primary already has a reasonably up-to-date mapping against which
it can check the (approximate) validity of the put-mapping request before
sending it to the master. For instance it is not possible to remove fields in a
mapping update, so if the primary detects that a mapping update will exceed the
fields limit then it can reject it itself and avoid bothering the master.
This commit adds a pre-flight check to the mapping update path so that the
primary can discard obviously-invalid put-mapping requests itself.
Fixes#35564
Backport of #48817
This test failure manifests the limitation of the recovery source merge
policy explained in #41628. If we already merge down to a single segment
then subsequent force merges will be noop although they can prune
recovery source. We need to adjust this test until we have a fix for the
merge policy.
Relates #41628Closes#48735
The loading of `RepositoryData` is not an atomic operation.
It uses a list + get combination of calls.
This lead to accidentally returning an empty repository data
for generations >=0 which can never not exist unless the repository
is corrupted.
In the test #48122 (and other SLM tests) there was a low chance of
running into this concurrent modification scenario and the repository
actually moving two index generations between listing out the
index-N and loading the latest version of it. Since we only keep
two index-N around at a time this lead to unexpectedly absent
snapshots in status APIs.
Fixing the behavior to be more resilient is non-trivial but in the works.
For now I think we should simply throw in this scenario. This will also
help prevent corruption in the unlikely event but possible of running into this
issue in a snapshot create or delete operation on master failover on a
repository like S3 which doesn't have the "no overwrites" protection on
writing a new index-N.
Fixes#48122
The problem with wrapping here is that it converts any exception into an
IAE, which we treat as a client error (400 status) whereas the exception
being wrapped here could be a server error (e.g., NPE). This commit
stops wrapping all ingest processor exceptions as IAEs.
This commit introduces a consistent, and type-safe manner for handling
global build parameters through out our build logic. Primarily this
replaces the existing usages of extra properties with static accessors.
It also introduces and explicit API for initialization and mutation of
any such parameters, as well as better error handling for uninitialized
or eager access of parameter values.
Closes#42042
* Add ingest info to Cluster Stats (#48485)
This commit enhances the ClusterStatsNodes response to include global
processor usage stats on a per-processor basis.
example output:
```
...
"processor_stats": {
"gsub": {
"count": 0,
"failed": 0
"current": 0
"time_in_millis": 0
},
"script": {
"count": 0,
"failed": 0
"current": 0,
"time_in_millis": 0
}
}
...
```
The purpose for this enhancement is to make it easier to collect stats on how specific processors are being used across the cluster beyond the current per-node usage statistics that currently exist in node stats.
Closes#46146.
* fix BWC of ingest stats
The introduction of processor types into IngestStats had a bug.
It was set to `null` and set as the key to the map. This would
throw a NPE. This commit resolves this by setting all the processor
types from previous versions that are not serializing it out to
`_NOT_AVAILABLE`.
Previous behavior while copying HTTP headers to the ThreadContext,
would allow multiple HTTP headers with the same name, handling only
the first occurrence and disregarding the rest of the values. This
can be confusing when dealing with multiple Headers as it is not
obvious which value is read and which ones are silently dropped.
According to RFC-7230, a client must not send multiple header fields
with the same field name in a HTTP message, unless the entire field
value for this header is defined as a comma separated list or this
specific header is a well-known exception.
This commits changes the behavior in order to be more compliant to
the aforementioned RFC by requiring the classes that implement
ActionPlugin to declare if a header can be multi-valued or not when
registering this header to be copied over to the ThreadContext in
ActionPlugin#getRestHeaders.
If the header is allowed to be multivalued, then all such headers
are read from the HTTP request and their values get concatenated in
a comma-separated string.
If the header is not allowed to be multivalued, and the HTTP
request contains multiple such Headers with different values, the
request is rejected with a 400 status.
Today a couple of allocation deciders iterate through all the shards on a node
to find the `INITIALIZING` or `RELOCATING` ones, and this can slow down cluster
state updates in clusters with very high-density nodes holding many thousands
of shards even if those shards belong to closed or frozen indices. This commit
pre-computes the sets of `INITIALIZING` and `RELOCATING` shards to speed up
this search.
Closes#46941
Relates #48579
Co-authored-by: "hongju.xhj" <hongju.xhj@alibaba-inc.com>
Backport of #48553. Make a number of changes so that JSON in the server
directory is more resilient to automatic formatting. This covers:
* Reformatting multiline JSON to embed whitespace in the strings
* Add helper method `stripWhitespace()`, to strip whitespace from a JSON
document using XContent methods. This is sometimes necessary where
a test is comparing some machine-generated JSON with an expected
value.
Today it is possible that we import a dangling index that was created in a
newer version than one or more of the nodes in the cluster. Such an index would
prevent the older node(s) from rejoining the cluster if they were to briefly
leave it for some reason. This commit prevents the import of such dangling
indices.
Fixes#34264
With this change, we won't warm up searchers until we externally refresh
an engine. We explicitly refresh before allowing reading from a shard
(i.e., move to post_recovery state) and during resetting. These
guarantees that we have warmed up the engine before exposing the
external searcher.
Another prerequisite for #47186.
Fixes the shard snapshot status reporting for failed shards
in the corner case of failing the shard because of an exception
thrown in `SnapshotShardsService` and not the repository.
We were missing the update on the `snapshotStatus` instance in
this case which made the transport APIs using this field report
back an incorrect status.
Fixed by moving the failure handling to the `SnapshotShardsService`
for all cases (which also simplifies the code, the ex. wrapping in
the repository was pointless as we only used the ex. trace upstream
anyway).
Also, added an assertion to another test that explicitly checks this
failure situation (ex. in the `SnapshotShardsService`) already.
Closes#48526
We can run into an already closed store here and hence
throw on trying to increment the ref count => moving to
the guarded ref count increment
closes#48625
Previously the functions accepted a doc values reference, whereas they now
accept the name of the vector field. Here's an example of how a vector function
was called before and after the change.
```
Before: cosineSimilarity(params.query_vector, doc['field'])
After: cosineSimilarity(params.query_vector, 'field')
```
This seems more intuitive, since we don't allow direct access to vector doc
values and the the meaning of `doc['field']` is unclear.
The PR makes the following changes (broken into distinct commits):
* Add new function signatures of the form `function(params.query_vector,
'field')` and deprecates the old ones. Because Painless doesn't allow two
methods with the same name and number of arguments, we allow a generic `Object`
to be passed in to the function and decide on the behavior through an
`instanceof` check.
* Refactor the class bindings so that the document field is passed to the
constructor instead of the instance method. This allows us to avoid retrieving
the vector doc values on every function invocation, which gives a tiny speed-up
in benchmarks.
Note that this PR adds new signatures for the sparse vector functions too, even
though sparse vectors are deprecated. It seemed simplest to understand (for both
us and users) to keep everything symmetric between dense and sparse vectors.
Today we won't advance the safe commit on a new global checkpoint unless
the last commit can become safe. This is not great if we have more than
two commits as we can have a new safe commit earlier.
Closes#4853
This commit fixes intermittent failures in ShuffleForcedMergePolicyTests#testDiagnostics
by setting a more restricted merge policy that ensures that extra merging will not happen
before the forced merge.
This commit fixes the expectations of SearchAfterIT#shouldFail regarding the inner exceptions that should be thrown
when testing failures. The exception is sometimes wrapped in a QueryShardException so this change only checks that
the toString representation contains the expected message.
Closes#43143
This change adds a new merge policy that interleaves eldest and newest segments picked by MergePolicy#findForcedMerges
and MergePolicy#findForcedDeletesMerges. This allows time-based indices, that usually have the eldest documents
first, to be efficient at finding the most recent documents too. Although we wrap this merge policy for all indices
even though it is mostly useful for time-based but there should be no overhead for other type of indices so it's simpler
than adding a setting to enable it. This change is needed in order to ensure that the optimizations that we are working
on in # remain efficient even after running a force merge.
Relates #37043
Today, we hold the engine readLock while refreshing. Although this
choice simplifies the correctness reasoning, it can block IndexShard
from closing if warming an external reader takes time. The current
implementation of refresh does not need to hold readLock as
ReferenceManager can handle errors correctly if the engine is closed in
midway.
This PR is a prerequisite that we need to solve #47186.
* Extract remote "sniffing" to connection strategy (#47253)
Currently the connection strategy used by the remote cluster service is
implemented as a multi-step sniffing process in the
RemoteClusterConnection. We intend to introduce a new connection strategy
that will operate in a different manner. This commit extracts the
sniffing logic to a dedicated strategy class. Additionally, it implements
dedicated tests for this class.
Additionally, in previous commits we moved away from a world where the
remote cluster connection was mutable. Instead, when setting updates are
made, the connection is torn down and rebuilt. We still had methods and
tests hanging around for the mutable behavior. This commit removes those.
* Introduce simple remote connection strategy (#47480)
This commit introduces a simple remote connection strategy which will
open remote connections to a configurable list of user supplied
addresses. These addresses can be remote Elasticsearch nodes or
intermediate proxies. We will perform normal clustername and version
validation, but otherwise rely on the remote cluster to route requests
to the appropriate remote node.
* Make remote setting updates support diff strategies (#47891)
Currently the entire remote cluster settings infrastructure is designed
around the sniff strategy. As we introduce an additional conneciton
strategy this infrastructure needs to be modified to support it. This
commit modifies the code so that the strategy implementations will tell
the service if the connection needs to be torn down and rebuilt.
As part of this commit, we will wait 10 seconds for new clusters to
connect when they are added through the "update" settings
infrastructure.
* Make remote setting updates support diff strategies (#47891)
Currently the entire remote cluster settings infrastructure is designed
around the sniff strategy. As we introduce an additional conneciton
strategy this infrastructure needs to be modified to support it. This
commit modifies the code so that the strategy implementations will tell
the service if the connection needs to be torn down and rebuilt.
As part of this commit, we will wait 10 seconds for new clusters to
connect when they are added through the "update" settings
infrastructure.
* Fix .tasks index strict mapping: parent_id should be parent_task_id
The .tasks index has mappings that's strictly defined. `parent_task_id`
was defined as `parent_id` though which would cause an exception in case
a task is persisted that has a parent task id set.
While at it, a couple of compiler warnings were addressed and a test
request builder was removed in favour of using its corresponding request.
* increment version
The expand phase is always created providing a function that builds
the next phase to be run, which has a single purpose: sending the
response back. Such small search phase is not necessary and causes some
issues when reporting search progress and counting the search phases
that need to be executed and that are already executed. We can simply
rather send back the response, without creating a specific phase for that.
This relates to the effort towards #46250. We added
tracking of the shard generation for successful
snapshots to `8.0`.
This assertion isn't correct though. While an `8.0`
master won't create an entry with sucess state and
a null shard generation it may still (on e.g. master
failover) send a success entry created by a 7.x master
with a `null` generation over the wire.
Closes#47406
This changes the queries equals() method so that the boost factors for each term
are considered for the equality calculation. This means queries are only equal
if both their terms and associated boosts match. The ordering of the terms
doesn't matter as before, which is why we internally need to sort the terms and
boost for comparison on the first equals() call like before. Boosts that are
`null` are considered equal to boosts of 1.0f because topLevelQuery() will only
wrap into BoostQuery if boost is not null and different from 1f.
Closes#48184
BytesReference is currently an abstract class which is extended by
various implementations. This makes it very difficult to use the
delegation pattern. The implication of this is that our releasable
BytesReference is a PagedBytesReference type and cannot be used as a
generic releasable bytes reference that delegates to any reference type.
This commit makes BytesReference an interface and introduces an
AbstractBytesReference for common functionality.
On data-only nodes we were not using the last persisted cluster state as base point to compute
what needed storage, but the last applied cluster state (but not necessarily properly persisted)
instead.
In #48392 we added a second computation of the sizes of the relocating shards
in `canRemain()` but passed the wrong value for `subtractLeavingShards`. This
fixes that. It also removes some unnecessary logging in a test case added in
the same commit.
This is a follow up of https://github.com/elastic/elasticsearch/issues/43453 where we added
a system property to disallow allocation awareness in search requests. Since search requests
will no longer check the allocation awareness attributes for routing in the next major version,
this change adds a deprecation warning on any setup that uses these attributes.
Relates #43453
Previously there was a bug when an query inside script_score query
was rewritten. If min_score was not set and was equal to null,
we were converting it to float value which resulted to NPE.
This commit corrects this.
Closes#48081
Brings handling of out of bounds points in linestrings in line with
points. Now points with latitude above 90 and below -90 are handled
the same way as for points by adjusting the longitude by moving it by
180 degrees.
Relates to #43916
* Do not throw errors on unknown types in SearchAfterBuilder
The support for BigInteger and BigDecimal was added for XContent in
https://github.com/elastic/elasticsearch/pull/32888. However the SearchAfterBuilder
xcontent parser doesn't expect them to be present so it throws an AssertionError.
This change fixes this discrepancy by changing the AssertionError into an
IllegalArgumentException that will not cause the node to die when thrown.
Closes#48074
Today it is possible that the total size of all relocating shards exceeds the
total amount of free disk space. For instance, this may be caused by another
user of the same disk increasing their disk usage, or may be due to how
Elasticsearch double-counts relocations that are nearly complete particularly
if there are many concurrent relocations in progress.
The `DiskThresholdDecider` treats negative free space similarly to zero free
space, but it then fails when rendering the messages that explain its decision.
This commit fixes its handling of negative free space.
Fixes#48380
The comment says it needs random-access, but it passes `Long#MAX_VALUE` as the
lead cost, which forces sequential access, it should pass `0` instead. I took
advantage of this fix to improve the logic to leverage an estimation of the
number of times that `Bits#get` gets called to make better decisions.
Reverting the change introducing IsoLocal.ROOT and introducing IsoCalendarDataProvider that defaults start of the week to Monday and requires minimum 4 days in first week of a year. This extension is using java SPI mechanism and defaults for Locale.ROOT only.
It require jvm property java.locale.providers to be set with SPI,COMPAT
closes#41670
backport #48209
This change adds a new field `"shards"` to `RepositoryData` that contains a mapping of `IndexId` to a `String[]`. This string array can be accessed by shard id to get the generation of a shard's shard folder (i.e. the `N` in the name of the currently valid `/indices/${indexId}/${shardId}/index-${N}` for the shard in question).
This allows for creating a new snapshot in the shard without doing any LIST operations on the shard's folder. In the case of AWS S3, this saves about 1/3 of the cost for updating an empty shard (see #45736) and removes one out of two remaining potential issues with eventually consistent blob stores (see #38941 ... now only the root `index-${N}` is determined by listing).
Also and equally if not more important, a number of possible failure modes on eventually consistent blob stores like AWS S3 are eliminated by moving all delete operations to the `master` node and moving from incremental naming of shard level index-N to uuid suffixes for these blobs.
This change moves the deleting of the previous shard level `index-${uuid}` blob to the master node instead of the data node allowing for a safe and consistent update of the shard's generation in the `RepositoryData` by first updating `RepositoryData` and then deleting the now unreferenced `index-${newUUID}` blob.
__No deletes are executed on the data nodes at all for any operation with this change.__
Note also: Previous issues with hanging data nodes interfering with master nodes are completely impossible, even on S3 (see next section for details).
This change changes the naming of the shard level `index-${N}` blobs to a uuid suffix `index-${UUID}`. The reason for this is the fact that writing a new shard-level `index-` generation blob is not atomic anymore in its effect. Not only does the blob have to be written to have an effect, it must also be referenced by the root level `index-N` (`RepositoryData`) to become an effective part of the snapshot repository.
This leads to a problem if we were to use incrementing names like we did before. If a blob `index-${N+1}` is written but due to the node/network/cluster/... crashes the root level `RepositoryData` has not been updated then a future operation will determine the shard's generation to be `N` and try to write a new `index-${N+1}` to the already existing path. Updates like that are problematic on S3 for consistency reasons, but also create numerous issues when thinking about stuck data nodes.
Previously stuck data nodes that were tasked to write `index-${N+1}` but got stuck and tried to do so after some other node had already written `index-${N+1}` were prevented form doing so (except for on S3) by us not allowing overwrites for that blob and thus no corruption could occur.
Were we to continue using incrementing names, we could not do this. The stuck node scenario would either allow for overwriting the `N+1` generation or force us to continue using a `LIST` operation to figure out the next `N` (which would make this change pointless).
With uuid naming and moving all deletes to `master` this becomes a non-issue. Data nodes write updated shard generation `index-${uuid}` and `master` makes those `index-${uuid}` part of the `RepositoryData` that it deems correct and cleans up all those `index-` that are unused.
Co-authored-by: Yannick Welsch <yannick@welsch.lu>
Co-authored-by: Tanguy Leroux <tlrx.dev@gmail.com>
This commit fixes the usage of JsonStringEncoder#quoteAsUTF8 in the SearchSlowLog.
JsonStringEncoder#getInstance should always be called to get a thread local object
but this assumption was broken by #44642. This means that any slow log can throw
an AIOOBE since it uses the same byte array concurrently.
Closes#48358
The code here was needlessly complicated when it
enqueued all file uploads up-front. Instead, we can
go with a cleaner worker + queue pattern here by taking
the max-parallelism from the threadpool info.
Also, I slightly simplified the rethrow and
listener (step listener is pointless when you add the callback in the next line)
handling it since I noticed that we were needlessly rethrowing in the same
code and that wasn't worth a separate PR.
This class is only used by the blob store repository
and CCR and the abstractions didn't really make sense
with CCR ignoring the concrete `restoreFiles` method
completely and having a method used only by the blobstore
overriden as unsupported.
=> Moved to a more fitting set of abstractions
=> Dried up the stream wrapping in `BlobStoreRepository` a little
now that the `restoreFile` method could be simplified
Relates #48110 as it makes changing the API of `FileRestoreContext`
to what is needed for async restores simpler
Today it is possible that we create the `QueryCache` and then fail to create
the owning `IndexService` and this means we do not close the `QueryCache`
again. This commit addresses that leak.
Fixes#48186
Today if an Elasticsearch node reaches a disk watermark then it will repeatedly
emit logging about it, which implies that some action needs to be taken by the
administrator. This is misleading. Elasticsearch strives to keep nodes under
the high watermark, but it is normal to have a few nodes occasionally exceed
this level. Nodes may be over the low watermark for an extended period without
any ill effects.
This commit enhances the logging emitted by the `DiskThresholdMonitor` to be
less misleading. The expected case of hitting the high watermark and
immediately relocating one or more shards that to bring the node back under the
watermark again is reduced in severity to `INFO`. Additionally, `INFO` messages
are not emitted repeatedly.
Fixes#48038
The logic for handling empty segment files has been
unnecessary ever since #24021 which removes the support
for these files in 6.x -> we can safely remove the
support for restoring these from 7.x+ to simplify the code.
* Slow log must use separate underlying logger for each index (#47234)
SlowLog instances should not share the same underlying logger, as it would cause different indexes override each other levels. When creating underlying logger, unique per index identifier should be used. Name + IndexSettings.UUID
Closes#42432
There is no reason to still resolve the
fallback `IndexId` here. It only applies to
`2.x` repos and those we can't read anymore
anyway because they use an `/index` instead of
an `/index-N` blob at the repo root for which
at least 7.x+ does not contain the logic to find
it.
Adds `GET /_script_context`, returning a `contexts` object with each
available context as a key whose value is an empty object. eg.
```
{
"contexts": {
"aggregation_selector": {},
"aggs": {},
"aggs_combine": {},
...
}
}
```
refs: #47411
We were not closing repositories on Node shutdown.
In production, this has little effect but in tests
shutting down a node using `MockRepository` and is
currently stuck in a simulated blocked-IO situation
will only unblock when the node's threadpool is
interrupted. This might in some edge cases (many
snapshot threads and some CI slowness) result
in the execution taking longer than 5s to release
all the shard stores and thus we fail the assertion
about unreleased shard stores in the internal test cluster.
Regardless of tests, I think we should close repositories
and release resources associated with them when closing
a node and not just when removing a repository from the CS
with running nodes as this behavior is really unexpected.
Fixes#47689
This PR fixes (#47593). Stored scripts with the old-style id of lang#id are
saved through the upgrade process but are no longer accessible in recent
versions. This fix will drop those scripts altogether since there is no way for
a user to access them.
which is backport merge and adds a new ingest processor, named enrich processor,
that allows document being ingested to be enriched with data from other indices.
Besides a new enrich processor, this PR adds several APIs to manage an enrich policy.
An enrich policy is in charge of making the data from other indices available to the enrich processor in an efficient manner.
Related to #32789
This change modifies the local execution of the `can_match` phase to **not** apply
the plugin's reader wrapper (if it is configured) when acquiring the searcher.
We must ensure that the phase runs quickly and since we don't know the cost
of applying the wrapper it is preferable to avoid it entirely. The can_match
phase can aford false positives so it is also safe for the builtin plugins
that use this functionality.
Closes#46817
With this change, shard allocation prefers allocating replicas on a node
that already has a copy of the shard that is as close as possible to the
primary, so that it is as cheap as possible to bring the new replica in
sync with the primary. Furthermore, if we find a copy that is identical
to the primary then we cancel an ongoing recovery because the new copy
which is identical to the primary needs no work to recover as a replica.
We no longer need to perform a synced flush before performing a rolling
upgrade or full cluster start with this improvement.
Closes#46318
This change should reduce refreshes for a use-case where we perform
multiple realtime gets at the same time on an active index. Currently,
we only call refresh if the index operation is still on the versionMap.
However, at the time we call refresh, that operation might be already or
will be included in the latest reader. Hence, we do not need to refresh.
Adding another lock here is not an issue as the refresh is already
sequential.
If we roll translog but do not index, then a flush without force is a
noop. In this case, the number of retained translog files will be higher
than the value specified by the retention policy.
Closes#4741
Joda was using ResolverStyle.STRICT when parsing. This means that date will be validated to be a correct year, year-of-month, day-of-month
However, we also want to make it works with Year-Of-Era as Joda used to, hence custom temporalquery.localdate in DateFormatters.from
Within DateFormatters we use the correct uuuu year instead of yyyy year of era
worth noting: if yyyy(without an era) is used in code, the parsing result will be a TemporalAccessor which will fail to be converted into LocalDate. We mostly use DateFormatters.from so this takes care of this. If possible the uuuu format should be used.
Today the `elasticsearch-shard remove-corrupted-data` tool will only truncate a
translog it determines to be corrupt. However there may be other cases in which
it is desirable to truncate the translog, for instance if an operation in the
translog cannot be replayed for some reason other than corruption. This commit
adds a `--truncate-clean-translog` option to skip the corruption check on the
translog and blindly truncate it.
Shrink would set `max_retries=1` in order to avoid retrying. This
however sticks to the shrunk index afterwards, causing issues when a
shard copy later fails to allocate just once.
Avoiding a retry of a shrink makes sense since there is no new node
to allocate to and a retry will likely fail again. However, the downside of
having max_retries=1 afterwards outweigh the benefit of not retrying
the failed shrink a few times. This change ensures shrink no longer
sets max_retries and also makes all resize operations (shrink, clone,
split) leave the setting at default value rather than copy it from source.
Enable partial parsing of date part. This is making the behaviour in java.time implementation the same as with joda.
2018, 2018-01 and 2018-01-01 are all valid dates for date_optional_time or strict_date_optional_time
closes#45284closes#47473
SiblingPipelineAggregator is a public interfaces,
but the ctor was package-private. These should be protected so that
plugin authors can extend and implement their own sibling pipeline agg.