We generate slightly different NoOps in InternalEngine and
TransportShardBulkAction for the same failure.
1. InternalEngine uses Exception#getMessage to generate a message
without the class name: newOp [NoOp{seqNo=1, primaryTerm=1,
reason='Contexts are mandatory in context enabled completion field
[suggest_context]'}].
2. TransportShardBulkAction uses Exception#toString to generate a
message with the class name: NoOp{seqNo=1, primaryTerm=1,
reason='java.lang.IllegalArgumentException: Contexts are mandatory in
context enabled completion field [suggest_context]'}.
If a write operation fails while a replica is recovering, that replica
will possibly receive two different NoOps: one from recovery and one
from replication. These two different NoOps will trip the
TranslogWriter#assertNoSeqNumberConflict assertion.
This commit ensures that we generate the same NoOp for the same failure.
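A minimal sketch of the difference, using only the JDK and assuming the
class-name-free reason comes from Throwable#getMessage. The NoOpReasons
class and its method names are hypothetical and only serve as illustration:

```java
// Hypothetical helper, not Elasticsearch code: shows why the two reasons differ.
final class NoOpReasons {
    // Reason without the exception class name.
    static String fromMessage(Exception e) {
        return e.getMessage();
    }

    // Reason prefixed with the fully qualified exception class name.
    static String fromToString(Exception e) {
        return e.toString();
    }
}
```

For the same exception the two strings are never equal, so any equality
check on the NoOp reason fails unless both code paths use the same method.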
Closes #32986
As part of recent changes made to `ShardOperationFailedException` we introduced `index` and `shardId` members to the base class, but the subclasses are entirely responsible for the serialization of such fields. In the case of `ShardSearchFailure`, we have an additional `SearchShardTarget` instance member which also holds the index and the shardId, hence they get serialized as part of `SearchShardTarget` itself. When de-serializing a `ShardSearchFailure`, though, we need to remember to also set the parent class `index` and `shardId` fields, otherwise they get lost.
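The pattern can be sketched roughly as follows. All classes here (`FailureBase`, `Target`, `SearchFailure`) are hypothetical stand-ins built on plain JDK streams, not the Elasticsearch types or their wire format:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical base class holding the fields that subclasses must populate.
abstract class FailureBase {
    protected String index;
    protected int shardId = -1;

    String index() { return index; }
    int shardId() { return shardId; }
}

// Hypothetical target that carries the index and shard id on the wire.
final class Target {
    final String index;
    final int shardId;

    Target(String index, int shardId) {
        this.index = index;
        this.shardId = shardId;
    }
}

final class SearchFailure extends FailureBase {
    private Target target;
    private String reason;

    SearchFailure(Target target, String reason) {
        this.target = target;
        this.reason = reason;
        this.index = target.index;
        this.shardId = target.shardId;
    }

    private SearchFailure() {}

    // Only the target and the reason are written; the parent fields are not.
    byte[] toBytes() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(target.index);
            out.writeInt(target.shardId);
            out.writeUTF(reason);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static SearchFailure fromBytes(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            SearchFailure failure = new SearchFailure();
            failure.target = new Target(in.readUTF(), in.readInt());
            failure.reason = in.readUTF();
            // The step that is easy to forget: copy the values into the parent fields.
            failure.index = failure.target.index;
            failure.shardId = failure.target.shardId;
            return failure;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Without the two assignments at the end of `fromBytes`, a round-tripped instance would report a null index and a shard id of -1 through the base class accessors.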
Relates to #32640
The introduction of mapping version on index metadata has been
backported to 6.x. This commit adjusts the BWC version around mapping
version to account for this backport.
This commit introduces mapping version to index metadata. This value is
monotonically increasing and is updated on mapping updates. This will be
useful in cross-cluster replication so that we can request mapping
updates from the leader only when there is a mapping update as opposed
to the strategy we employ today which is to request a mapping update any
time there is an index metadata update. As index metadata updates can
occur for many reasons other than mapping updates, this leads to some
unnecessary requests and work in cross-cluster replication.
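As a rough illustration of how a follower could use such a version, here
is a sketch with a hypothetical MappingVersionTracker class; the name and
the bookkeeping are assumptions, not the cross-cluster replication code:

```java
// Hypothetical follower-side bookkeeping; names are illustrative only.
final class MappingVersionTracker {
    private long lastSeenMappingVersion = -1;

    // Returns true only when the leader's mapping version moved forward,
    // i.e. when a mapping update actually needs to be requested.
    boolean shouldRequestMappingUpdate(long leaderMappingVersion) {
        if (leaderMappingVersion > lastSeenMappingVersion) {
            lastSeenMappingVersion = leaderMappingVersion;
            return true;
        }
        return false;
    }
}
```

An index metadata update that leaves the mapping version unchanged (for
example a settings update) then triggers no mapping request at all.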
With this commit we implement a workaround for
https://bugs.openjdk.java.net/browse/JDK-8207200 which is a race
condition in the JVM that results in `IllegalArgumentException` being
thrown in rare cases when we determine memory usage via `MemoryMXBean`.
As we do not want to fail requests in those cases we always return zero
memory usage.
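A minimal sketch of such a guard, using the standard
`java.lang.management` API. The `SafeMemoryUsage` class and the `orZero`
helper are hypothetical names, and the actual change may be structured
differently:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.util.function.LongSupplier;

// Hypothetical wrapper; not the Elasticsearch implementation.
final class SafeMemoryUsage {
    private static final MemoryMXBean MEMORY_MX_BEAN = ManagementFactory.getMemoryMXBean();

    // Heap bytes in use, or zero if the JVM race makes the reading fail.
    static long heapUsed() {
        return orZero(() -> MEMORY_MX_BEAN.getHeapMemoryUsage().getUsed());
    }

    // Runs the reading and maps IllegalArgumentException to zero.
    static long orZero(LongSupplier reading) {
        try {
            return reading.getAsLong();
        } catch (IllegalArgumentException e) {
            return 0L;
        }
    }
}
```

Separating `orZero` from the bean access keeps the fallback testable
without having to provoke the race itself.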
Relates #31767
Relates #33125
When applying index metadata updates we run through the mappings
updating them if needed. Today if there is not an update to the default
mapper, we can lose the default mapping. This means that, for example,
if we apply a settings update to an index we will lose the default
mapper. This happens because we were not guarding updating the default
mapping with a check that the default mapping was updated in the
metadata update. When there is no update in the metadata update, we need
to continue to preserve the previous default mapping. This commit
achieves this by moving the updating of the default mapping under the
same guard that we use for updating the default mapping source. We add a
test that fails before putting the update under a guard and now passes
after moving the update under the guard.
This commit fixes a mappings update test. The test is broken in the
sense that it passes, but for the wrong reason. The test here is testing
that if we make a mapping update but do not commit that mapping update
then the mapper service still maintains the previous document
mapper. This was not the case long, long ago when a mapping update would
update the in-memory state before the cluster state update was
committed. This test was passing, but it was passing because the mapping
update was never even applied. It was never even applied because it was
encountering a null pointer exception. Of course the in-memory state is
not going to be updated in that case, we are simply going to end up with
a failed cluster state update. Fixing that leads to another issue which
is that the mapping source does not even parse so again we would, of
course, end up with the in-memory state not being modified. We fix these
issues, assert that the resulting cluster state task completed
successfully, and finally that the in-memory state was not updated since
we never committed the resulting cluster state.
This adds support for connecting to a remote cluster through
a TCP proxy. A remote cluster can be configured with an additional
`search.remote.$clustername.proxy` setting. This proxy will be used
to connect to remote nodes for every node connection established.
We still try to sniff the remote cluster and connect to nodes directly
through the proxy, which has to support some kind of routing to these nodes.
Yet, this routing mechanism requires the handshake request to include some
kind of information about where to route to, which is not yet implemented. The effort
to use the hostname and an optional node attribute for routing is tracked
in #32517.
Closes #31840
If a shard was closed, we return null for SeqNoStats. Therefore the
assertion assertSeqNos will hit an NPE when it verifies a closed shard.
This commit skips closed shards in assertSeqNos and enables this
assertion in AbstractDisruptionTestCase.
* Search: Support of wildcard on docvalue_fields
For consistency with stored_fields, docvalue_fields should support the use of wildcards.
Documentation of doc values fields is updated accordingly.
See also: #26390
Closes #26299
This commit changes the query field expansion for query parsers
to not rely on a hardcoded list of field types. Instead we rely on
the type of exception that is thrown by MappedFieldType#termQuery to
include/exclude an expanded field.
Supersedes #31655
Closes #31798
We used to set `maxScore` to `0` within `TopDocs` in situations where there is really no score, as the size was set to `0` and scores were not even tracked. In such scenarios, `Float.NaN` is more appropriate, which gets converted to `max_score: null` on the REST layer. That's also more consistent with Lucene, which sets `maxScore` to `Float.NaN` when merging empty `TopDocs` (see `TopDocs#merge`).
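The conversion at the rendering boundary can be sketched as follows; the `MaxScoreRendering` class is a hypothetical illustration, not the actual REST layer code:

```java
// Hypothetical helper: maps the "no score" marker to a nullable value.
final class MaxScoreRendering {
    // NaN means scores were not tracked, so there is nothing to report.
    static Float toNullable(float maxScore) {
        return Float.isNaN(maxScore) ? null : Float.valueOf(maxScore);
    }
}
```

Note that `Float.isNaN` is needed here because `maxScore == Float.NaN` is always false in Java.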
In our Netty layer we have had to take extra precautions against Netty
catching throwables which prevents them from reaching the uncaught
exception handler. This code has taken on additional uses in the NIO layer
and now in the scheduler engine because there are other components in
stack traces that could catch throwables and suppress them from reaching
the uncaught exception handler. This commit is a simple cleanup of the
iterative evolution of this code to refactor all uses into a single
method in ExceptionsHelper.
Today we can only have non-affix settings updated and consumed _together_.
Yet, there are use-cases where two affix settings depend on each other, which
makes using them hard without consuming updates together. Unfortunately, there is
no straightforward way to have N settings updated together in a type-safe way, but
having 2 still serves a large portion of use-cases.
This change allows an engine to recover from its local translog up to
the given seqno. The extended API can be used in these use cases:
When a replica starts following a new primary, it resets its index to
the safe commit, then replays its local translog up to the current
global checkpoint (see #32867).
When a replica starts a peer-recovery, it can initialize the
start_sequence_number to the persisted global checkpoint instead of the
local checkpoint of the safe commit. A replica will then replay its
local translog up to that global checkpoint before accepting remote
translog from the primary. This change will increase the chance of
operation-based recovery. I will make this in a follow-up.
Relates #32867
Today `_msearch` doesn't allow modifying the `max_concurrent_shard_requests`
per sub search request. This change adds support for setting this parameter on
all sub-search requests in an `_msearch`.
Relates to #31877
The maximum map count bootstrap check can be a hindrance to users that do
not own the underlying platform on which they are executing
Elasticsearch. This is because addressing it requires tuning the kernel
and a platform provider might not allow this, especially on shared
infrastructure. However, this bootstrap check is not needed if mmapfs is
not in use. Today we do not have a way for the user to communicate that
they are not going to use mmapfs. This commit therefore adds a setting
that enables the user to disallow mmapfs. When mmapfs is disallowed, the
maximum map count bootstrap check is not enforced. Additionally, we
fall back to a different default index store and prevent the explicit use
of mmapfs for an index.
This change introduces a dedicated ConnectionManager for every RemoteClusterConnection
such that there is no state shared with the TransportService internal ConnectionManager.
All connections to a remote cluster are isolated from the TransportService but still use
the TransportService and its internal properties like the Transport, tracing and internal
listener actions on disconnects etc.
This allows a remote cluster connection to have a different lifecycle than a local cluster connection.
Also, local discovery code doesn't get notified if there is a disconnect from a remote cluster, and
each connection can use its own dedicated connection profile, which allows a reduced set of
connections per cluster without conflicting with the local cluster.
Closes #31835
* INGEST: Move all Pipeline State into IngestService
* Moves all pipeline state into the ingest service
* Retains the existing pipeline store and pipeline execution service as inner classes to make the review easier, they should be flattened out in the next step
* All tests for these classes were copied (and adapted) to the ingest service tests
* This is a refactoring step to enable a clean implementation of a pipeline processor (See #32473)
There are two problems with the scheduler engine today. Both relate to
listeners that throw.
The first problem is that any triggered listener that throws a plain old
exception will cause no additional listeners to be triggered for the
event, and will also cause the scheduler to never be invoked again. This
leads to lost events and is bad.
The second problem is that any triggered listener that throws an error
of the fatal kind will not lead to that error being caught by the
uncaught exception handler. This is because the triggered listener is
executed as a future task under a scheduled thread pool executor. A
throwable there gets caught by the JDK framework and set as the outcome
on the future task. Since we never inspect these tasks for their
outcomes, nor is there a good place to do this, we have to handle these
errors ourselves. To do this, we catch them and dispatch them to the
uncaught exception handler via a forked thread. This is similar to our
handling in Netty.
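A rough sketch of both fixes, under the assumption that listeners can be
modelled as plain Runnables. ListenerRunner and its methods are
hypothetical names and not the scheduler engine code; in particular,
continuing with the remaining listeners after a fatal error is a choice
made for this sketch only:

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch, not the SchedulerEngine implementation.
final class ListenerRunner {
    private final Consumer<Error> fatalErrorHandler;

    ListenerRunner(Consumer<Error> fatalErrorHandler) {
        this.fatalErrorHandler = fatalErrorHandler;
    }

    // Rethrows the error on a fresh thread so that the thread's uncaught
    // exception handler sees it instead of it being swallowed by a future.
    static ListenerRunner forkingToUncaughtExceptionHandler() {
        return new ListenerRunner(error -> new Thread(() -> { throw error; }).start());
    }

    // Runs every listener and returns how many failed with a plain exception.
    int trigger(List<Runnable> listeners) {
        int failures = 0;
        for (Runnable listener : listeners) {
            try {
                listener.run();
            } catch (Exception e) {
                failures++; // a real implementation would log this
            } catch (Error e) {
                fatalErrorHandler.accept(e);
            }
        }
        return failures;
    }
}
```

Because no throwable escapes trigger, a scheduled task wrapping it keeps
being rescheduled, and later listeners still see the event.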
Since #28140 when the global checkpoint is advanced, we try to move the
safe commit forward, and clean up old index commits if possible. However,
we forgot to trim the unreferenced translog.
This change makes sure that we prune both old translog and index commits
when the safe commit advances.
Relates #28140
Closes #32089
Subclasses of `ESIntegTestCase` run multiple Elasticsearch nodes in the
same JVM and when we log we look at the name of the thread to figure out
the node name. This makes sure that all calls to `daemonThreadFactory`
include the node name.
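To illustrate the idea, here is a sketch of a thread factory that embeds
the node name and of a lookup that recovers it. The class, the naming
format and the `nodeNameOf` helper are assumptions made for this example
and are not the actual `daemonThreadFactory` implementation:

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical factory; the thread name format is assumed for illustration.
final class NodeNamedThreadFactory implements ThreadFactory {
    private static final String MARKER = "elasticsearch[";

    private final String prefix;
    private final AtomicInteger counter = new AtomicInteger();

    NodeNamedThreadFactory(String nodeName, String poolName) {
        this.prefix = MARKER + nodeName + "][" + poolName + "]";
    }

    @Override
    public Thread newThread(Runnable task) {
        Thread thread = new Thread(task, prefix + "[T#" + counter.incrementAndGet() + "]");
        thread.setDaemon(true);
        return thread;
    }

    // Recovers the node name from a thread name produced above, or null if absent.
    static String nodeNameOf(String threadName) {
        if (!threadName.startsWith(MARKER)) {
            return null;
        }
        int end = threadName.indexOf(']', MARKER.length());
        return end < 0 ? null : threadName.substring(MARKER.length(), end);
    }
}
```

A thread created without the node name (for example one named `main`)
yields null here, which is the kind of gap this commit closes.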
Closes #32574
I'd like to follow this up with more drastic changes that make it
impossible to do this incorrectly but that change is much larger than
this and I'd like to get these log lines fixed up sooner rather than
later.
All Translog inner closes should happen after tragedy exception is set (#32674)
We faced a nasty race condition. See #32526
The InternalEngine.failOnTragic method has thrown an AssertionError.
If you look carefully at the if branches in this method, you will spot that it's only possible if either the Lucene IndexWriter has closed from inside or the Translog has closed from inside, but the tragedy exception is not set.
For now, let us concentrate on the Translog class.
We found out that there are two methods in Translog - namely rollGeneration and trimOperations that are closing Translog in case of Exception without tragedy exception being set.
This commit fixes these 2 methods. To fix it, we pull tragedyException from TranslogWriter up to the Translog class, because in these 2 methods IndexWriter could be innocent, but still Translog needs to be closed. Also, tragedyException is wrapped with TragicExceptionHolder to reuse CAS/addSuppressed functionality in Translog and TranslogWriter.
Also, to protect us in the future and make sure the close method is never called from inside Translog, a special assertion examining the stack trace is added. Since we're still targeting Java 8 for runtime, no StackWalker API is used in the implementation.
In the stack-trace checking method, we consider as inner callers not only Translog methods but methods of Translog child classes as well. It does not mean that Translog is meant to be extended, but it's needed to be able to test this method.
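A Java 8 compatible stack inspection of this kind can be sketched as follows. `CallerCheck`, `Resource` and `SubResource` are hypothetical names; the real assertion in Translog may differ in how it filters frames:

```java
// Hypothetical sketch of a stack-trace based "called from inside" check.
final class CallerCheck {
    // Returns true if any frame on the current stack belongs to owner or one of its subclasses.
    static boolean calledFromInside(Class<?> owner) {
        for (StackTraceElement frame : Thread.currentThread().getStackTrace()) {
            try {
                Class<?> frameClass = Class.forName(frame.getClassName(), false, owner.getClassLoader());
                if (owner.isAssignableFrom(frameClass)) {
                    return true;
                }
            } catch (ClassNotFoundException | LinkageError e) {
                // Frames of synthetic or otherwise unloadable classes are skipped.
            }
        }
        return false;
    }
}

// Hypothetical class standing in for the one whose close() must not be called from inside.
class Resource {
    boolean seesItselfOnStack() {
        return CallerCheck.calledFromInside(Resource.class);
    }
}

// Subclass frames count as "inside" too, which is what makes the check testable.
class SubResource extends Resource {
    boolean subclassSeesParentOnStack() {
        return CallerCheck.calledFromInside(Resource.class);
    }
}
```

Walking `Thread#getStackTrace` is slower than StackWalker, so a check like this is best kept behind an `assert`.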
Closes #32526
This is related to #32517. This commit passes the DiscoveryNode to the
initiateChannel method for different Transport implementations. This
will allow additional attributes (besides just the socket address) to be
used when opening channels.
Randomized test conditions that cause some shards to have no docs on them
failed due to test asserts that relied on a lazy initialization side effect
from the map script. After this fix:
- Test cases with the relevant init script are protected
- Test cases with the relevant combine or reduce scripts were already
protected, because the combine and reduce scripts safely handle this case.
This is a followup to #31886. After that commit the
TransportConnectionListener had to be propagated to both the
Transport and the ConnectionManager. This commit moves that listener
to completely live in the ConnectionManager. The request and response
related methods are moved to a TransportMessageListener. That listener
continues to live in the Transport class.
* Lazily resolve DNS (i.e. `String` to `DiscoveryNode`) to not run into indefinitely caching lookup issues (provided the JVM DNS cache is configured correctly as explained in https://www.elastic.co/guide/en/elasticsearch/reference/6.3/networkaddress-cache-ttl.html)
* Changed `InetAddress` type to `String` for that higher up the stack
* Passed down `Supplier<DiscoveryNode>` instead of outright `DiscoveryNode` from `RemoteClusterAware#buildRemoteClustersSeeds` to lazily resolve DNS when the `DiscoveryNode` is actually used (could've also passed down the value of `clusterName = REMOTE_CLUSTERS_SEEDS.getNamespace(concreteSetting)` together with the `List<String>` of hosts, but this route seemed to introduce less duplication and resulted in a significantly smaller changeset).
* Closes #28858
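
The supplier approach described in the list above can be sketched with plain JDK types. `LazySeeds` is a hypothetical name and `InetSocketAddress` stands in for `DiscoveryNode`; IPv6 literals are not handled in this sketch:

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch; InetSocketAddress stands in for DiscoveryNode.
final class LazySeeds {
    // Parses "host:port" eagerly but defers the DNS lookup until get() is called.
    static List<Supplier<InetSocketAddress>> build(List<String> seeds) {
        List<Supplier<InetSocketAddress>> suppliers = new ArrayList<>();
        for (String seed : seeds) {
            int colon = seed.lastIndexOf(':');
            if (colon < 0) {
                throw new IllegalArgumentException("expected host:port but got [" + seed + "]");
            }
            String host = seed.substring(0, colon);
            int port = Integer.parseInt(seed.substring(colon + 1));
            // new InetSocketAddress(host, port) resolves the host name,
            // so the call is kept inside the supplier.
            suppliers.add(() -> new InetSocketAddress(host, port));
        }
        return suppliers;
    }
}
```

Each call to `get()` performs a fresh lookup, so how long a result is reused is governed by the JVM DNS cache settings rather than by a value captured at startup.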
Currently, if geo context is represented by something other than
geo_point or an object with lat and lon fields, the parsing of it
as a geo context can result in ignoring the context altogether,
returning confusing errors such as number_format_exception or trying
to parse the specified number as a long-encoded hash code. It would also
fail if the geo_point was stored.
This commit makes the mapping parsing more strict and will fail during
mapping update or index creation if the geo context doesn't point to
a geo_point field.
Supersedes #32412
Closes #32202
* Scripted metric aggregations: add deprecation warning and system property to control legacy params
Scripted metric aggregation params._agg/_aggs are replaced by state/states context variables. By default the old params are still present, and a deprecation warning is emitted when Scripted Metric Aggregations are used. A new system property can be used to disable the legacy params. This functionality will be removed in a future revision.
* Fix minor style issue and docs test failure
* Disable deprecated params._agg/_aggs in tests and revise tests to use state/states instead
* Add integration test covering deprecated scripted metrics aggs params._agg/_aggs access
* Disable deprecated params._agg/_aggs in docs integration tests and revise stored scripts to use state/states instead
* Revert unnecessary migrations doc change
A relevant note should be added in the changes destined for 7.0; this PR is going to be backported to 6.x.
* Replace deprecated _agg param bwc integration test with a couple of unit tests
* Fix compatibility test after merge
* Rename backwards compatibility system property per code review feedback
* Tweak deprecation warning text per review feedback
This fix prevents trying to parse unknown timezone ids by converting
the joda time zone via java.util.TimeZone to a java time based ZoneId.
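The JDK half of that conversion can be sketched as below. ZoneConversion
is a hypothetical name; with Joda-Time on the classpath the input would
come from DateTimeZone#toTimeZone(), which this JDK-only sketch leaves out:

```java
import java.time.ZoneId;
import java.util.TimeZone;

// Hypothetical helper covering only the java.util.TimeZone to ZoneId step.
final class ZoneConversion {
    // Lets the JDK do the id mapping instead of parsing the id string ourselves.
    static ZoneId toZoneId(TimeZone timeZone) {
        return timeZone.toZoneId();
    }
}
```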
Closes #32927
The testDocStats test is flaky: it sometimes fails on Jenkins and the
failure is not reproducible locally. The reason for this failure is
timing. If the number of deleted documents is greater than 33% of inserted
documents, Lucene will schedule segments to merge if TieredMergePolicy is
used (it's not the case for LogMergePolicy, but ES is only using
TieredMergePolicy). If this merge is performed before stats are
retrieved - we will get 0 for "deleted" counter.
So basically this counter could be either 0 or numOfDeletedDocs at this point,
but that makes for too loose an assertion, so we decided to remove it altogether.
Closes #32766