The monitoring bulk API accepts the same format as the bulk API, yet its concept
of types is different from "mapping types" and the deprecation warning is only
emitted as a side-effect of this API reusing the parsing logic of bulk requests.
This commit extracts the parsing logic from `_bulk` into its own class with a
new flag that allows to configure whether usage of `_type` should emit a warning
or not. Support for payloads has been removed for simplicity since they were
unused.
@jakelandis has a separate change that removes this notion of type from the
monitoring bulk API that we are considering bringing to 8.0.
This commit propagates some exceptions that were previously swallowed and also
makes sure that exceptions closing streams are either propagated if the try
block succeeded or added as suppressed exceptions otherwise.
* [ML] refactoring lazy query and agg parsing
* Clean up and addressing PR comments
* removing unnecessary try/catch block
* removing bad call to logger
* removing unused import
* fixing bwc test failure due to serialization and config migrator test
* fixing style issues
* Adjusting DafafeedUpdate class serialization
* Adding todo for refactor in v8
* Making query non-optional so it does not write a boolean byte
This commit consolidates more mapping validation logic into the same class.
`FieldTypeLookup` is now a bit simpler, and has the sole responsibility of quickly
resolving field names to their types.
I have a broader refactor planned around mapping merge validation, but this
change should at least be a step in the right direction.
These simplifications to `MapperMergeValidator` are possible now that there is
always a single mapping definition.
* Remove the type argument in `validateMapperStructure`.
* Remove unnecessary checks against existing mappers.
If TransportService is stopped before a shard-failure request is sent
but after the request is registered, TransportService will notify
ReplicationOperation a TransportException with an error message:
"transport stop, action: internal:cluster/shard/failure".
Relates #39584
* Bundle java in distributions
Setting up a jdk is currently a required external step when installing
elasticsearch. This is particularly problematic for the rpm/deb packages
as installing a jdk in the same package installation command does not
guarantee any order, so must be done in separate steps. Additionally,
JAVA_HOME must be set and often causes problems in selecting a correct
jdk when, for example, the system java is an older unsupported version.
This commit bundles platform specific openjdks into each distribution.
In addition to eliminating the issues above, it also presents future
possible improvements like using jlink to build jdk images only
containing modules that elasticsearch uses.
closes#31845
Prior to this commit (and after 6.5.0), if an ingest node changes
the _index in a pipeline, the original target index would be created.
For daily indexes this could create an extra, empty index per day.
This commit changes the TransportBulkAction to execute the ingest node
pipeline before attempting to create the index. This ensures that the
only index created is the original or one set by the ingest node pipeline.
This was the execution order prior to 6.5.0 (#32786).
The execution order was changed in 6.5 to better support default pipelines.
Specifically the execution order was changed to be able to read the settings
from the index meta data. This commit also includes a change in logic such
that if the target index does not exist when ingest node pipeline runs, it
will now pull the default pipeline (if one exists) from the settings of the
best matched of the index template.
Relates #32786
Relates #32758Closes#36545
This commit introduces the forget follower API. This API is needed in cases that
unfollowing a following index fails to remove the shard history retention leases
on the leader index. This can happen explicitly through user action, or
implicitly through an index managed by ILM. When this occurs, history will be
retained longer than necessary. While the retention lease will eventually
expire, it can be expensive to allow history to persist for that long, and also
prevent ILM from performing actions like shrink on the leader index. As such, we
introduce an API to allow for manual removal of the shard history retention
leases in this case.
We need to unwrap and use the actual cause when determining if the node
with primary shard is shutting down because TransportService will throw
a TransportException wrapped in a SendRequestTransportException.
Relates #39584
Today when a replicated write operation fails to execute on a replica,
the primary will reach out to the master to fail that replica (and mark
it stale). We then won't ack that request until the master removes the
failing replica; otherwise, we will lose the acked operation if the
failed replica is still in the in-sync set. However, if a node with the
primary is shutting down, we might ack such request even though we are
unable to send a shard-failure request to the master. This happens
because we ignore NodeClosedException which is triggered when the
ClusterService is being closed.
Closes#39467
This adds the capability to snapshot replicated closed indices.
It also changes snapshot requests in v8.0.0 to automatically expand wildcards to closed indices and hence start snapshotting closed indices by default. For v7.1.0 and above, wildcards are by default only expanded to open indices, which can be changed by explicitly setting the expand_wildcards option either to all or closed.
Note that indices are always restored as open indices, even if they have been snapshotted as closed replicated indices.
Relates to #33888
Lucene added an optimization to leave the term dictionary on disk
for non-id like fields. This change happened very late in the release
processes such that it's better to have an escape hatch if certain
use-cases are hurt by this optimization. This setting might be
removed in the future if it turns out to be unnecessary.
Currently SearchServiceTests.testCloseSearchContextOnRewriteException can fail
if a refresh happens while we test for the SearchPhaseExecutionException that is
thrown later in the test. The test takes the current Store#refCount and expects
it to be the same after the exception is thrown. If a refresh happens in that
interval however, the refCound will be different, causing the test to fail. This
can be provoked e.g. by running this section in a tight loop.
Switching of refresh for this tests solves the issue.
When preparing the state to send to other nodes, we're serializing it
for each node, despite using putIfAbsent.
This commit checks if the state was already serialized for this node
version before performing the potentially expensive computation.
The map is not used by multiple threads, so computeIfAbsent is not
needed (and could not be used here easily, because IOException could
be thrown).
(cherry picked from commit c99be63b43f5250f3cd220130df73c5e9e097459)
When a node is joining the cluster we ensure that it can send requests to the
master _at that time_. If it joins the cluster and _then_ loses the ability to
send requests to the master then it should be removed from the cluster. Today
this is not the case: the master can still receive responses to its follower
checks, and receives acknowledgements to cluster state publications, so has no
reason to remove the node.
This commit changes the handling of follower checks so that they fail if they
come from a master that the other node was following but which it now believes
to have failed.
Today the `GroupedActionListener` accepts a `defaults` parameter but all
callers pass an empty list. Also it is permitted to pass an empty group but
this is trappy because the delegated listener is never be called in that case.
This commit removes the `defaults` parameter and forbids an empty group.
* Optimize Bulk Message Parsing and Message Length Parsing
* findNextMarker took almost 1ms per invocation during the PMC rally track
* Fixed to be about an order of magnitude faster by using Netty's bulk `ByteBuf` search
* It is unnecessary to instantiate an object (the input stream wrapper) and throw it away, just to read the `int` length from the message bytes
* Fixed by adding bulk `int` read to BytesReference
This commit renames the retention lease setting
index.soft_deletes.retention.lease so that it is under the namespace
index.soft_deletes.retention_lease. As such, we rename the setting to
index.soft_deletes.retention_lease.period.
This commit adds a new build type (together with deb/rpm/tar/zip) to
represent the official Docker images. This build type will be displayed
in APIs such as the main and nodes info APIs.
In case multiple completion suggestion entries have the same score and
surface form, the order in which such options will be returned is
currently not deterministic.
With this commmit we introduce tie-breaking for such situations, based
on shard id, index name, index uuid and doc id like we already do for
ordinary search hits. With this change we also make shardIndex
mandatory when sorting and comparing completion suggestion options,
which was previously only needed later when fetching hits).
Also, we need to make sure shardIndex is properly set when merging
completion suggestions coming from multiple clusters in
`SearchResponseMerger`
* Soften redundant cast to allow use of `DeterministicTaskQueue` in this class for #39504
* Remove two redundant variables and lower visibility in two possible spots
* Make field `final`
Currently Fuzziness#asDistance(String) doesn't work for custom AUTO values. If
the fuzziness is AUTO, the method returns the correct edit distance to use,
depending on the input string, but for custom AUTO values it currently always
returns an edit distance of 1. Correcting this and adding unit and integration
tests to catch these cases.
Closes#39614
Today we have no chance to fetch actual segment stats for segments that
are currently unloaded. This is relevant in the case of frozen indices.
This allows to monitor how much memory a frozen index would use if it was
unfrozen.
The match interval builder analyses input text and converts it to an IntervalSource, and as such
may generate token streams with stopwords. This commit deals with these by using the extend
factory to cover the gaps produced by these stopwords so that phrase and ordered queries work
correctly.
`TopDocsCollectorContext` can already shortcut hit counts on `match_all` and `term` queries when there are no deletions.
This change adds this ability for `exists` queries if the index doesn't have deletions and fields are indexed.
Closes#37475
While serializing custom objects, the length of the list is computed after
filtering out the unsupported objects but while writing objects the filter
is not applied thus resulting in writing unsupported objects which will fail
to deserialize by the receiever. Adding the condition to filter out unsupported
custom objects.
Since #39006 we should be able to complete a peer-recovery without
waiting for pending indexing operations. Thus, the assertion in
testDoNotWaitForPendingSeqNo should be updated from false to true.
Closes#39510
This is now possible as Lucene's `TotalHits` implements `equals`/`hashcode`,
all the other methods can be in-lined in `SearchHits` instead, no need for
a specific wrapper class.
* Introduce Safer Chaining of Listeners
* The motivation here is to make reasoning about chains of `ActionListener` a little easier, by providing a safe method for nesting `ActionListener` that guarantees that a response is never dropped. Also, it dries up the code a little by removing the need to repeat `listener::onFailure` and `listener.onResponse` over and over.
* Refactored a number of obvious/easy spots to use the new listener constructor
Zen1IT#testFreshestMasterElectedAfterFullClusterRestart fails sometimes because
we request the cluster state before state recovery has completed, and therefore
obtain the default value for the setting we're relying on.
Confusingly, we were starting out by setting this setting to its default value,
so the test looked like it was failing because of a production bug. This commit
avoids this confusion in future by setting it to a non-default value at the
start of the test.
Fixes#39586.
This commit adds the following:
- more tests to IndicesServiceCloseTests, one of them found a bug in the order
in which `IndicesQueryCache#onClose` and
`IndicesService.indicesRefCount#decRef` are called.
- made `IndicesQueryCache.stats2` a synchronized map. All writes to it are
already protected by the lock of the Lucene cache, but the final read from
an assertion in `IndicesQueryCache#close()` was not so this change should
avoid any potential visibility issues.
- human-readable `toString`s to make debugging easier.
Relates #37117
This cleans up the Engine implementation by separating the sequence number generation from the
planning step in the engine, to avoid for the planning step to have any side effects. This makes it
easier to see that every sequence number is properly accounted for.