We used to shrink the version map under an external lock. This is
quite ambiguous and instead we can simply issue an empty refresh to
shrink it.
Closes #27852
Today we prevent the same thread from acquiring the same lock more than once.
This restriction is a relic from the early days of this concurrency construct
and can be removed.
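
As a rough illustration of what lifting the restriction means, here is a minimal sketch of a per-key lock that the owning thread may acquire more than once. It assumes a hypothetical `ReentrantKeyedLockSketch` built on the JDK's `ReentrantLock`; the class and method names are illustrative only, it is not the actual implementation, and it never evicts unused entries.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only; names are not taken from the code base.
final class ReentrantKeyedLockSketch<K> {
    private final ConcurrentHashMap<K, ReentrantLock> locks = new ConcurrentHashMap<>();

    // Acquires the lock for the key. The owning thread may call this repeatedly
    // and must call unlock() once per successful acquire.
    ReentrantLock acquire(K key) {
        ReentrantLock lock = locks.computeIfAbsent(key, k -> new ReentrantLock());
        lock.lock();
        return lock;
    }

    // Number of holds the current thread has on the lock for the key (0 if none).
    int holdCount(K key) {
        ReentrantLock lock = locks.get(key);
        return lock == null ? 0 : lock.getHoldCount();
    }
}
```

With such a construct a second `acquire` on the same key from the same thread returns immediately instead of failing, and the lock is released only after a matching number of `unlock` calls.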
The last operation executed in IndicesClientDocumentationIT.testCreate()
is an asynchronous index creation. Because nothing waits for its
completion, on slow machines the index can sometimes be created after
the testCreate() test is finished, and it can fail the following test.
Closes #27754
While the LiveVersionMap is an internal class that belongs to the engine, we do
rely on some external locking to enforce the desired semantics. Yet, in tests
we mimic the outer locking but we don't have any way to enforce or assert
that the lock is actually held. This change moves the KeyedLock inside the
LiveVersionMap, which allows the engine to access it as before but enables
assertions in the LiveVersionMap to ensure the lock for the modifying or
reading key is actually held.
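
The idea can be sketched as follows. This is a simplified, hypothetical stand-in (`LockCheckedVersionMapSketch`, with `String` keys and `ReentrantLock` instead of the real KeyedLock), meant only to show how owning the lock lets the map assert on it; it is not the actual LiveVersionMap code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a version map that owns its per-key locks.
final class LockCheckedVersionMapSketch {
    private final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();
    private final Map<String, Long> versions = new ConcurrentHashMap<>();

    // Callers acquire the per-key lock through the map and unlock it when done.
    ReentrantLock acquireLock(String key) {
        ReentrantLock lock = locks.computeIfAbsent(key, k -> new ReentrantLock());
        lock.lock();
        return lock;
    }

    boolean isKeyLockedByCurrentThread(String key) {
        ReentrantLock lock = locks.get(key);
        return lock != null && lock.isHeldByCurrentThread();
    }

    void putVersion(String key, long version) {
        // Only checked when the JVM runs with assertions enabled (-ea), as in tests.
        assert isKeyLockedByCurrentThread(key) : "lock not held for key [" + key + "]";
        versions.put(key, version);
    }

    Long getVersion(String key) {
        assert isKeyLockedByCurrentThread(key) : "lock not held for key [" + key + "]";
        return versions.get(key);
    }
}
```

Because the map hands out the locks itself, a test that forgets to lock a key trips the assertion instead of silently passing.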
Currently FiltersAggregationBuilder#doRewrite creates a new FiltersAggregationBuilder which doesn't correctly copy the original "keyed" field if a non-keyed filter gets rewritten.
This can cause rendering bugs of the output aggregations like the one reported in #27841.
Closes #27841
Elasticsearch offers a number of http requests that can take a while to
execute. In #27713 we introduced an http read timeout that defaulted to
30 seconds. This means that if no reads happened for 30 seconds (even
after a request is received), the connection would be closed due to
timeout.
This commit disables the read timeout by default to allow us to evaluate
the impact of read timeouts and to avoid introducing disruptive
behavior.
Currently, method corruptTranslogFiles corrupts some translog files
whose translog_gen is at least the min_required_translog_gen from the
translog checkpoint. However, this condition is not enough to guarantee
that recoverFromTranslog always fails. If we corrupt translog
operations only in translog files whose translog_gen is smaller than
the min_translog_gen of a recovering index commit, recoverFromTranslog
will succeed because we won't read translog operations from those files.
This commit makes sure corruptTranslogFiles corrupts some translog
files that will be used in recoverFromTranslog.
Closes #27538
When the first parameter of `ESTestCase#randomValueOtherThan` is `null`
then run the supplier until it returns non-`null`. Previously,
`randomValueOtherThan` just ran the supplier one time which was
confusing.
Unexpectedly, it looks like no tests rely on the original `null`
handling.
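
A minimal sketch of the new behavior, assuming a simplified signature and a hypothetical holder class `RandomValueSketch` (this is not the actual `ESTestCase` code):

```java
import java.util.Objects;
import java.util.function.Supplier;

// Hypothetical holder class; only the retry loop is the point here.
final class RandomValueSketch {
    // Keeps asking the supplier until it yields a value different from input.
    // With input == null this means: retry until the supplier returns non-null.
    static <T> T randomValueOtherThan(T input, Supplier<T> supplier) {
        T candidate;
        do {
            candidate = supplier.get();
        } while (Objects.equals(input, candidate));
        return candidate;
    }
}
```

Using `Objects.equals` treats `null` like any other excluded value, so no special case is needed for it.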
Closes #27821
A FieldCapabilities request can cover multiple indices (or aliases pointing to multiple indices).
When rewriting the request for each index, store the original requested indices.
We currently have a complicated port assignment scheme to make sure that the nodes spun up by the internal test cluster will be assigned fixed port ranges that will also not collide between clusters. The port ranges need to be fixed in advance so that the nodes will be able to find each other via `UnicastZenPing`.
This approach worked well for the last few years, but we are now at a point where our testing has grown beyond it and we exceed the 5 reusable ranges per JVM. This means that nodes are not always assigned the first 5 ports in their range, which causes cluster formation issues. On top of that, most of the clusters that are spun up don't even rely on `UnicastZenPing` but rather on `MockZenPings`, which uses in-memory maps for discovery (with the downside that they are not influenced by network disruption simulations).
This PR changes `InternalTestCluster` to use port 0 as a fixed assignment. This will allow the OS to manage ports and will ensure we don't have collisions. For tests that need to simulate network disruptions (and thus can't use `MockZenPings`), a new `UnicastHostProvider` is introduced that is based on the current state of the test cluster. Since that is only resolved at run time, it is aware of the port assignments of the OS.
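
For readers unfamiliar with port 0, the following standalone sketch shows the underlying OS behavior using plain JDK sockets. The class `EphemeralPortSketch` is hypothetical and unrelated to the test cluster code; it only demonstrates that binding to port 0 yields a free port that can be read back at run time.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;

// Hypothetical demo class; not part of the test framework.
final class EphemeralPortSketch {
    // Binds two sockets to port 0 on the loopback address at the same time
    // and returns the ports the OS actually assigned to them.
    static int[] bindTwoAndReportPorts() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket first = new ServerSocket(0, 50, loopback);
             ServerSocket second = new ServerSocket(0, 50, loopback)) {
            return new int[] { first.getLocalPort(), second.getLocalPort() };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The two ports are only known after binding, which is why a host provider that wants to use them has to resolve addresses from the running nodes rather than from a precomputed range.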
Closes #27818
Closes #27762
This commit moves GlobalCheckpointTracker from the engine to IndexShard, where it better fits logically: Tracking the global checkpoint based on the local checkpoints of all shards in the replication group is not a property of the engine, but rather a property fulfilled by the current primary shard. The LocalCheckpointTracker on the other hand is driven by the contents of the local translog. By moving GlobalCheckpointTracker to IndexShard, it makes little sense to keep the SequenceNumbersService class around - it would only wrap the LocalCheckpointTracker. This commit therefore removes the class and replaces occurrences of SequenceNumbersService in the engine directly by LocalCheckpointTracker.
AnalysisFactoryTestCase checks that the ES custom token filter multi-term
awareness matches the underlying Lucene factory. For the trim filter this
won't be the case until LUCENE-8093 is released in 7.3, so we add a
temporary exclusion.
Closes #27310
This commit fixes the version tests for release tests. The problem here
is that during release tests all versions should be treated as released,
so the assertions must be modified accordingly.
Relates #27815
When snapshotting the primary we capture a lucene commit at an arbitrary moment from a sequence number perspective. This means that it is possible that the commit misses operations and that there is a gap between the local checkpoint in the commit and the maximum sequence number.
When we restore, this will create a primary that "misses" operations and currently will mean that the sequence number system is stuck (i.e., the local checkpoint will be stuck). To fix this we should fill in gaps when we restore, in a similar fashion to normal store recovery.
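
To make the gap concrete, here is a small sketch that computes which sequence numbers would have to be filled. The class `SeqNoGapSketch`, its method, and the representation of a commit as a set of sequence numbers are assumptions for illustration, not the engine's API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper; a commit is modelled as the set of sequence numbers it contains.
final class SeqNoGapSketch {
    // Returns the sequence numbers in (localCheckpoint, maxSeqNo] missing from the commit.
    static List<Long> missingSeqNos(long localCheckpoint, long maxSeqNo, Set<Long> present) {
        List<Long> missing = new ArrayList<>();
        for (long seqNo = localCheckpoint + 1; seqNo <= maxSeqNo; seqNo++) {
            if (!present.contains(seqNo)) {
                missing.add(seqNo);
            }
        }
        return missing;
    }
}
```

For a commit containing 0, 1, 2, 4 and 6, the local checkpoint is 2 and the maximum sequence number is 6, so 3 and 5 are the gaps. Until they are filled the local checkpoint cannot advance past 2.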
Currently randomNonNegativeLong() returns 0 half as often as any positive long,
but random number generators are typically expected to return
uniformly-distributed values unless otherwise specified. This fixes this issue
by mapping Long.MIN_VALUE directly onto 0 rather than resampling.
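
A sketch of the mapping, assuming the implementation is based on `Math.abs` over a uniformly distributed `long` (the class `NonNegativeLongSketch` and the use of `java.util.Random` are illustrative assumptions):

```java
import java.util.Random;

// Hypothetical holder class for the mapping described above.
final class NonNegativeLongSketch {
    // Every k > 0 has two preimages (k and -k). Math.abs(Long.MIN_VALUE) overflows
    // and stays negative, so mapping it to 0 gives 0 a second preimage as well.
    static long toNonNegative(long raw) {
        return raw == Long.MIN_VALUE ? 0L : Math.abs(raw);
    }

    static long randomNonNegativeLong(Random random) {
        return toNonNegative(random.nextLong());
    }
}
```

Resampling on `Long.MIN_VALUE` instead would leave 0 with a single preimage, which is where the factor of one half comes from.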
Normally the hole is assigned to the component of the first edge to the south
of one of its vertices, but if the chosen hole vertex is south of everything
then the binary search returns -1 yielding an ArrayIndexOutOfBoundsException.
Instead, assign the vertex to the component of the first edge to its north.
Subsequent validation catches the fact that the hole is outside its component.
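
The fallback can be sketched in isolation. The example below assumes, purely for illustration, that edges are reduced to a non-empty array of latitudes sorted from south to north and that the lookup uses `Arrays.binarySearch`; the class and method names are hypothetical and the real shape-building code differs.

```java
import java.util.Arrays;

// Hypothetical, simplified model: one latitude per edge, sorted ascending.
final class HoleAssignmentSketch {
    // Returns the index of the first edge to the south of the vertex, or,
    // if the vertex is south of every edge, the first edge to its north.
    static int edgeIndexFor(double[] sortedEdgeLatitudes, double vertexLatitude) {
        int found = Arrays.binarySearch(sortedEdgeLatitudes, vertexLatitude);
        if (found >= 0) {
            return found; // an edge at exactly this latitude
        }
        int insertionPoint = -found - 1;       // index of the first edge to the north
        int firstToSouth = insertionPoint - 1; // -1 when the vertex is south of everything
        return firstToSouth >= 0 ? firstToSouth : insertionPoint;
    }
}
```

Without the final fallback the method would return -1 for a vertex south of all edges, and using that value as an array index is what produced the exception.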
Fixes #25933
This commit moves the range field mapper back to core so that we can
remove the compile-time dependency of percolator on mapper-extras, which
complicates dependency management for the percolator client JAR. Modules
should not be intertwined like this anyway.
Relates #27854
This commit addresses the publication of the elasticsearch-cli to
Maven. For now for simplicity we publish this to Maven so that it is
available as a transitive dependency for any artifacts that depend on
the core elasticsearch artifact. It is possible that in the future we
can simply exclude this dependency but for now this is the safest and
simplest approach that can happen in a patch release.
Relates #27853
Today we use the in-memory global checkpoint from SequenceNumbersService
to clean up unneeded commit points. However, the latest global checkpoint
may not have been fsynced to disk yet. If the translog checkpoint fsync
fails after we have already used a higher global checkpoint to clean up commit
points, then we may have removed a safe commit which we try to keep for
recovery.
This commit updates the deletion policy using lastSyncedGlobalCheckpoint
from Translog rather than the in-memory global checkpoint.
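
The difference can be illustrated with a small sketch. It assumes, as a simplification, that the safe commit is the newest commit whose maximum sequence number does not exceed the global checkpoint and that all older commits are deleted; `SafeCommitSketch` and its method are hypothetical names, not the deletion policy's API.

```java
// Hypothetical model: commits are given oldest to newest by their max sequence number.
final class SafeCommitSketch {
    // Returns the index of the newest commit with maxSeqNo <= globalCheckpoint,
    // falling back to the oldest commit if none qualifies.
    static int indexOfSafeCommit(long[] commitMaxSeqNos, long globalCheckpoint) {
        for (int i = commitMaxSeqNos.length - 1; i >= 0; i--) {
            if (commitMaxSeqNos[i] <= globalCheckpoint) {
                return i;
            }
        }
        return 0;
    }
}
```

With commits whose maximum sequence numbers are 3, 8 and 12, an in-memory global checkpoint of 10 selects the second commit and would allow the first to be deleted. If only a global checkpoint of 5 has been persisted, the first commit is the one that is actually safe, which is why the persisted value is the one to use.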
Relates #27606
This commit removes the usage of system properties for the HttpAsyncClient as this overrides some
defaults that we intentionally change. In order to set the default SSLContext to the system context
we set the SSLContext on the builder explicitly.
Closes #27827
This reverts commit e04e5ab037 as we no longer need the increased
logging for the mixed cluster tests. This will reduce the size of logs for some build failures.
This commit is related to #27260. It adds a base NioGroup for use in
different transports. This class creates and starts the underlying
selectors. Different protocols or transports are established by passing
the ChannelFactory to the bindServerChannel or openChannel
methods. This allows a TcpChannelFactory to be passed which will
create and register channels that support the elasticsearch tcp binary
protocol or a channel factory that will create http channels (or other).
Today we still maintain a version map even if we only index append-only
documents, in other words documents with auto-generated IDs. We can instead maintain
an unsafe version map that will be swapped to a safe version map only if necessary,
once we see the first document that requires access to the version map. For instance:
* an auto-generated ID retry
* any kind of delete
* a document with a foreign ID (non-autogenerated)
In these cases we forcefully refresh the internal reader and start maintaining
a version map until such a safe map has not been necessary for two refresh cycles.
Indices / shards that never see an autogenerated ID document will always maintain a version
map, and in the case of a delete / retry in a pure append-only index the version map will be
de-optimized for a short amount of time until we know it's safe again to swap back. This
will also minimize the required refreshes.
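
The switching rule can be sketched as a tiny state machine. This is a hypothetical simplification (`SafeAccessTrackerSketch` and its methods are made up for illustration, and the forced refresh is omitted); it only models when the map is safe, not the map itself.

```java
// Hypothetical model of the unsafe/safe switching rule described above.
final class SafeAccessTrackerSketch {
    private boolean safe = false;
    private boolean neededSinceLastRefresh = false;
    private int quietRefreshCycles = 0;

    // needsVersionLookup is true for retries, deletes and documents with foreign IDs.
    void onOperation(boolean needsVersionLookup) {
        if (needsVersionLookup) {
            safe = true; // the real engine would also force a refresh at this point
            neededSinceLastRefresh = true;
            quietRefreshCycles = 0;
        }
    }

    // Called once per refresh cycle; swaps back to unsafe after two quiet cycles.
    void onRefresh() {
        if (neededSinceLastRefresh) {
            quietRefreshCycles = 0;
        } else if (safe && ++quietRefreshCycles >= 2) {
            safe = false;
            quietRefreshCycles = 0;
        }
        neededSinceLastRefresh = false;
    }

    boolean isSafe() {
        return safe;
    }
}
```

A pure append-only workload never leaves the unsafe state in this model, while a shard that keeps receiving documents with foreign IDs never leaves the safe one.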
Closes #19813