Today we prune transport handlers in TransportService when a node is disconnected.
This can cause connections to starve in the TransportService if the connection is
opened as a short living connection ie. without sharing the connection to a node
via registering in the transport itself. This change now moves to pruning based
on the connections cache key to ensure we notify handlers as soon as the connection
is closed for all connections not just for registered connections.
Relates to #24632
Relates to #24575
Relates to #24557
- Removes clusterState, getInitialClusterState and getMinimumMasterNodes methods from Discovery interface.
- Sets PingContextProvider in ZenPing constructor
- Renames state in ZenDiscovery to committedState
This commit documents how to write a `ScriptEngine` in order to use
expert internal apis, such as using Lucene directly to find index term
statistics. These documents prepare the way to remove both native
scripts and IndexLookup.
The example java code is actually compiled and tested under a new gradle
subproject for example plugins. This change does not yet breakup
jvm-example into the new examples dir, which should be done separately.
relates #19359
relates #19966
Specifying s3 access and secret keys inside repository settings are not
secure. However, until there is a way to dynamically update secure
settings, this is the only way to dynamically add repositories with
credentials that are not known at node startup time. This commit adds
back `access_key` and `secret_key` s3 repository settings, but protects
it with a required system property `allow_insecure_settings`.
This allows other plugins to use a client to call the functionality
that is in the core modules without duplicating the logic.
Plugins can now safely send the request and response classes via the
client even if the requests are executed locally. All relevant classes
are loaded by the core classloader such that plugins can share them.
This is re-adds this commit that was revered in 952feb58e4
If we fail to acquire the shard lock, need to retry and wait for the new
cluster state, we were sending the wrong kind of request for the replica
action. This commit fixes this issue.
This commit adds support for histogram and date_histogram agg compound order by refactoring and reusing terms agg order code. The major change is that the Terms.Order and Histogram.Order classes have been replaced/refactored into a new class BucketOrder. This is a breaking change for the Java Transport API. For backward compatibility with previous ES versions the (date)histogram compound order will use the first order. Also the _term and _time aggregation order keys have been deprecated; replaced by _key.
Relates to #20003: now that all these aggregations use the same order code, it should be easier to move validation to parse time (as a follow up PR).
Relates to #14771: histogram and date_histogram aggregation order will now be validated at reduce time.
Closes#23613: if a single BucketOrder that is not a tie-breaker is added with the Java Transport API, it will be converted into a CompoundOrder with a tie-breaker.
This commit addresses an issue in the missing active IDs prevent advance
test from the global checkpoint tracker. The assumptions this test was
making about reality were violated when global checkpoints were inlined
(specifically, the component of that change where the tracker's
knowledge of the global checkpoint was updated inline with updates to
the tracker's knowledge of local checkpoints for an allocatio ID). The
point of the test was to ensure that a lagging shard prevents the global
checkpoint from advancing, so this commit rewrites the test with that in
mind.
This allows other plugins to use a client to call the functionality
that is in the core modules without duplicating the logic.
Plugins can now safely send the request and response classes via the
client even if the requests are executed locally. All relevant classes
are loaded by the core classloader such that plugins can share them.
With global checkpoints we also take into account if a global checkpoint must be fsynced.
Yet, with recent addition of inlining global checkpoints into indexing operations from a
test perspective unnecessary fsyncs might be reported if `Translog#syncNeeded` is checked.
Now the test only check if the last write location triggers an fsync instead.
Closes#24600
This commit fixes an issue in the checkpoints advance test. Namely, when
there zero documents indexed, after the global checkpoint is synced, the
global checkpoint will have advanced to the no ops performed. There is a
larger conceptual problem here, namely that the primary does not update
its knowledge of its own local checkpoint upon recovery which causes the
global checkpoint to initially be unassigned and then advance to no ops
performed, but this will be addressed in a follow-up.
Previously query weight was created for each search hit that needed to compute inner hits,
with this change the weight of the inner hit query is computed once for all search hits.
Closes#23917
We allow non-dynamic settings to be updated on closed indices but we don't
check if the updated settings can be used to open/create the index.
This can lead to unrecoverable state where the settings are updated but the index
cannot be reopened since the settings are not valid. Trying to update the invalid settings
is also not possible since the update will fail to validate the current settings.
This change adds the validation of the updated settings for closed indices and make sure that the new settings
do not prevent the reopen of the index.
Fixes#23787
Previously, if a master node updated the cluster state to reflect that a
snapshot is completed, but subsequently failed before processing a
cluster state to remove the snapshot from the cluster state, then the
newly elected master would not know that it needed to clean up the
leftover cluster state.
This commit ensures that the newly elected master sees if there is a
snapshot in the cluster state that is in the completed state but has not
yet been removed from the cluster state.
Closes#24452
There are now three public static method to build instances of
PreConfiguredTokenFilter and the ctor is private. I chose static
methods instead of constructors because those allow us to change
out the implementation returned if we so desire.
Relates to #23658
We have to do something to force the global checkpoint to be
synchronized to the replicas or the assertions at the end of the test
that they are in sync will trip. Since the last write operation to hit a
replica shard will only carry the penultimate global checkpoint (it will
advance when the replicas respond with their local checkpoint), and a
background sync will not happen until the primary shard falls idle, we
force a sync through a refresh action.
Currently, the get snapshots API (e.g. /_snapshot/{repositoryName}/_all)
provides information about snapshots in the repository, including the
snapshot state, number of shards snapshotted, failures, etc. In order
to provide information about each snapshot in the repository, the call
must read the snapshot metadata blob (`snap-{snapshot_uuid}.dat`) for
every snapshot. In cloud-based repositories, this can be expensive,
both from a cost and performance perspective. Sometimes, all the user
wants is to retrieve all the names/uuids of each snapshot, and the
indices that went into each snapshot, without any of the other status
information about the snapshot. This minimal information can be
retrieved from the repository index blob (`index-N`) without needing to
read each snapshot metadata blob.
This commit enhances the get snapshots API with an optional `verbose`
parameter. If `verbose` is set to false on the request, then the get
snapshots API will only retrieve the minimal information about each
snapshot (the name, uuid, and indices in the snapshot), and only read
this information from the repository index blob, thereby giving users
the option to retrieve the snapshots in a repository in a more
cost-effective and efficient manner.
Closes#24288
It was using the wrong version, which can cause errors like
```
1> java.security.AccessControlException: access denied ("java.net.SocketPermission" "[0:0:0:0:0:0:0:1]:34221" "connect,resolve")
1> at java.security.AccessControlContext.checkPermission(AccessControlContext.java:472) ~[?:1.8.0_111]
1> at java.security.AccessController.checkPermission(AccessController.java:884) ~[?:1.8.0_111]
1> at java.lang.SecurityManager.checkPermission(SecurityManager.java:549) ~[?:1.8.0_111]
1> at java.lang.SecurityManager.checkConnect(SecurityManager.java:1051) ~[?:1.8.0_111]
1> at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:625) ~[?:?]
1> at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processSessionRequests(DefaultConnectingIOReactor.java:273) ~[httpcore-nio-4.4.5.jar:4.4.5]
1> at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvents(DefaultConnectingIOReactor.java:139) ~[httpcore-nio-4.4.5.jar:4.4.5]
1> at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:348) ~[httpcore-nio-4.4.5.jar:4.4.5]
1> at org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:192) ~[httpasyncclient-4.1.2.jar:4.1.2]
1> at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:64) ~[httpasyncclient-4.1.2.jar:4.1.2]
```
When running tests
This commit terminates any controller processes plugins might have after
the node has been closed. This gives the plugins a chance to shut down their
controllers gracefully.
Previously there was a race condition where controller processes could be shut
down gracefully and terminated by two threads running in parallel, leading to
non-deterministic outcomes.
Additionally, controller processes that failed to shut down gracefully were
not forcibly terminated when running as a Windows service; there was a reliance
on the plugin to shut down its controller gracefully in this situation.
This commit also fixes this problem.
Adds a new "icu_collation" field type that exposes lucene's
ICUCollationDocValuesField. ICUCollationDocValuesField is the replacement
for ICUCollationKeyFilter which has been deprecated since Lucene 5.
This commit renames ScriptEngineService to ScriptEngine. It is often
confusing because we have the ScriptService, and then
ScriptEngineService implementations, but the latter are not services as
we see in other places in elasticsearch.
File scripts have 2 related settings: the path of file scripts, and
whether they can be dynamically reloaded. This commit deprecates those
settings.
relates #21798
A constraint on the global checkpoint was inadvertently committed from
the inlining global checkpoint work. Namely, the constraint prevents the
global checkpoint from advancing to no ops performed, a situation that
can occur when shards are started but empty.
Today we rely on background syncs to relay the global checkpoint under
the mandate of the primary to its replicas. This means that the global
checkpoint on a replica can lag far behind the primary. The commit moves
to inlining global checkpoints with replication requests. When a
replication operation is performed, the primary will send the latest
global checkpoint inline with the replica requests. This keeps the
replicas closer in-sync with the primary.
However, consider a replication request that is not followed by another
replication request for an indefinite period of time. When the replicas
respond to the primary with their local checkpoint, the primary will
advance its global checkpoint. During this indefinite period of time,
the replicas will not be notified of the advanced global
checkpoint. This necessitates a need for another sync. To achieve this,
we perform a global checkpoint sync when a shard falls idle.
Relates #24513
This changes the way we register pre-configured token filters so that
plugins can declare them and starts to move all of the pre-configured
token filters out of core. It doesn't finish the job because doing
so would make the change unreviewably large. So this PR includes
a shim that keeps the "old" way of registering pre-configured token
filters around.
The Lowercase token filter is special because there is a "special"
interaction between it and the lowercase tokenizer. I'm not sure
exactly what to do about it so for now I'm leaving it alone with
the intent of figuring out what to do with it in a followup.
This also renames these pre-configured token filters from
"pre-built" to "pre-configured" because that seemed like a more
descriptive name.
This is a part of #23658
This test waited 10 seconds for a refresh listener to appear in
the stats. It turns out that in our NFS testing infrastructure this can
take a lot longer than 10 seconds. The error reported here:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+nfs/257/consoleFull
has it taking something like 15 seconds. This bumps the timeout
to a solid minute.
Closes#24417
Now that indices have a single type by default, we can move to the next step
and identify documents using their `_id` rather than the `_uid`.
One notable change in this commit is that I made deletions implicitly create
types. This helps with the live version map in the case that documents are
deleted before the first type is introduced. Otherwise there would be no way
to differenciate `DELETE index/foo/1` followed by `PUT index/foo/1` from
`DELETE index/bar/1` followed by `PUT index/foo/1`, even though those are
different if versioning is involved.
If a node in version >= 5.3 acts as a coordinating node during a scroll request that targets a single shard, the scroll may return the same documents over and over iff the targeted shard is hosted by a node with a version <= 5.3.
The nodes in this version will advance the scroll only if the search_type has been set to `query_and_fetch` though this search type has been removed in 5.3.
This change handles this situation by adding the removed search_type in the request that targets a node in version <= 5.3.
Previously, Mustache would call `toString` on the `_ingest.timestamp`
field and return a date format that did not match Elasticsearch's
defaults for date-mapping parsing. The new ZonedDateTime class in Java 8
happens to do format itself in the same way ES is expecting.
This commit adds support for a feature flag that enables the usage of this new date format
that has more native behavior.
Fixes#23168.
This new fix can be found in the form of a cluster setting called
`ingest.new_date_format`. By default, in 5.x, the existing behavior
will remain the same. One will set this property to `true` in order to
take advantage of this update for ingest-pipeline convenience.
This commit fixes inefficient (worst case exponential) loading of
snapshot repository data when checking for incompatible snapshots,
that was introduced in #22267. When getting snapshot information,
getRepositoryData() was called on every snapshot, so if there are
a large number of snapshots in the repository and _all snapshots
were requested, the performance degraded exponentially. This
commit fixes the issue by only calling getRepositoryData once and
using the data from it in all subsequent calls to get snapshot
information.
Closes#24509