Adds the logic for handling joins by a prospective leader. Introduces the Coordinator class with the
basic lifecycle modes (candidate, leader, follower) as well as a JoinHelper class that contains most
of the plumbing for handling joins.
Today the PeerFinder silently updates the set of found peers as new peers are
discovered and old ones are disconnected, and elections are scheduled
independently of these changes. In fact, it would be better if the election
scheduler were only activated on discovery of a quorum of peers. This commit
introduces the `onFoundPeersUpdated` method that allows this flow.
An election requires a node to select a term that is higher than all
previously-seen terms. If nodes are too enthusiastic about starting elections
then they can effectively excludes itself from the cluster until the leader can
bump to a still-higher term, and if this process repeats then a single faulty
node can prevent the cluster from making useful progress.
The solution is to start the election with a pre-voting round to ensure that
there is at least a quorum of nodes who believe there to be no leader.
This also fixes up some merge issues.
This is a followup to #31886. After that commit the
TransportConnectionListener had to be propogated to both the
Transport and the ConnectionManager. This commit moves that listener
to completely live in the ConnectionManager. The request and response
related methods are moved to a TransportMessageListener. That listener
continues to live in the Transport class.
* Lazy resolve DNS (i.e. `String` to `DiscoveryNode`) to not run into indefinitely caching lookup issues (provided the JVM dns cache is configured correctly as explained in https://www.elastic.co/guide/en/elasticsearch/reference/6.3/networkaddress-cache-ttl.html)
* Changed `InetAddress` type to `String` for that higher up the stack
* Passed down `Supplier<DiscoveryNode>` instead of outright `DiscoveryNode` from `RemoteClusterAware#buildRemoteClustersSeeds` on to lazy resolve DNS when the `DiscoveryNode` is actually used (could've also passed down the value of `clusterName = REMOTE_CLUSTERS_SEEDS.getNamespace(concreteSetting)` together with the `List<String>` of hosts, but this route seemed to introduce less duplication and resulted in a significantly smaller changeset).
* Closes#28858
Currently, if geo context is represented by something other than
geo_point or an object with lat and lon fields, the parsing of it
as a geo context can result in ignoring the context altogether,
returning confusing errors such as number_format_exception or trying
to parse the number specifying as long-encoded hash code. It would also
fail if the geo_point was stored.
This commit makes the mapping parsing more strict and will fail during
mapping update or index creation if the geo context doesn't point to
a geo_point field.
Supersedes #32412Closes#32202
* Scripted metric aggregations: add deprecation warning and system property to control legacy params
Scripted metric aggregation params._agg/_aggs are replaced by state/states context variables. By default the old params are still present, and a deprecation warning is emitted when Scripted Metric Aggregations are used. A new system property can be used to disable the legacy params. This functionality will be removed in a future revision.
* Fix minor style issue and docs test failure
* Disable deprecated params._agg/_aggs in tests and revise tests to use state/states instead
* Add integration test covering deprecated scripted metrics aggs params._agg/_aggs access
* Disable deprecated params._agg/_aggs in docs integration tests and revise stored scripts to use state/states instead
* Revert unnecessary migrations doc change
A relevant note should be added in the changes destined for 7.0; this PR is going to be backported to 6.x.
* Replace deprecated _agg param bwc integration test with a couple of unit tests
* Fix compatibility test after merge
* Rename backwards compatibility system property per code review feedback
* Tweak deprecation warning text per review feedback
This fix prevernts trying to parse unknown timezone ids by converting
the joda time zone via java.util.TimeZone to a java time based ZoneId.
Closes#32927
testDocStats test is flaky and sometimes it's failing on jenkins and
failure is not reproducible locally. The reason for this failure is in
timing. If the number of deleted documents is greater than 33% of inserted
documents, Lucene will schedule segments to merge if TieredMergePolicy is
used (it's not the case for LogMergePolicy, but ES is only using
TieredMergePolicy). If this merge is performed before stats are
retrieved - we will get 0 for "deleted" counter.
So basically this counter could be either 0 or numOfDeletedDocs at this point,
but this is the too loose assertion and we decided to remove it at all.
Closes#32766
Moves JoinTaskExecutor out of ZenDiscovery so that it can be reused for Zen2. Also ensures that tasks to JoinTaskExecutor have a proper identity, so that multiple tasks for the same node can coexist.
This commit disables the automatic `refresh_interval` in order to ensure
that index readers cannot differ between the normal and scroll search.
This issue is related to the 7.5 Lucene upgrade which contains a change that
makes single segment merge more likely to occur (max deletes percentage).
Closes#32682
We do not support passphrases on the secure settings storage (the
keystore). Yet, we added support for this in the API layer. This commit
removes this support so that we are not limited in our future options,
or have to make a breaking change.
This change cleans up some methods in the CharArrays class from x-pack, which
includes the unification of char[] to utf8 and utf8 to char[] conversions that
intentionally do not use strings. There was previously an implementation in
x-pack and in the reloading of secure settings. The method from the reloading
of secure settings was adopted as it handled more scenarios related to the
backing byte and char buffers that were used to perform the conversions. The
cleaned up class is moved into libs/core to allow it to be used by requests
that will be migrated to the high level rest client.
Relates #32332
This commit fixes a global checkpoint listeners test wherein we were
expecting an executor to have been used even if there were no
listeners. This is silliness, so this commit adjusts the assertion to
verify that the executor never fires if there are no listeners, and
fires exactly once if there is one or more listeners.
The ElectionScheduler runs while there is no known elected master and is
responsible for scheduling elections randomly, backing off on failure, to
balance the desire to elect a master quickly with the desire to avoid more than
one node starting an election at once.
This commit introduces the ability for global checkpoint listeners to be
registered at the shard level. These listeners are notified when the
global checkpoint is updated, and also when the shard closes. To
encapsulate these listeners, we introduce a shard-level component that
handles synchronization of notification and modifications to the
collection of listeners.
This is related to #31835. It moves the default connection profile into
the ConnectionManager class. The will allow us to have different
connection managers with different profiles.
This removes custom Response classes that extend `AcknowledgedResponse` and do nothing, these classes are not needed and we can directly use the non-abstract super-class instead.
While this appears to be a large PR, no code has actually changed, only class names have been changed and entire classes removed.
This commit adds a java time version of the existing rounding classes, which features the same test suite and a small test class to check if serialization works as expected.
Significance score doubles were being parsed as long. Existing tests did not catch this because SignificantLongTermsTests and SignificantStringTermsTests did not set the score. Fixed these and also added integration test.
Thanks for the report/fix, Blakko
Closes#32770
* INGEST: Create Index Before Pipeline Execute
* Ensures that indices are created before the default pipeline setting is read to correcly handle the case of an index template containing a default pipeline (without the fix the first document does not get the pipeline applied as explained in #32758)
* closes#32758
#31821 introduced an unreleased bug where NOOP updates were incorrectly mutating the bulk
shard request, inserting null item to be replicated, which would result in NullPointerExceptions when
serializing the request to be shipped to the replicas.
Closes#32808
This is related to #31835. This commit adds a connection manager that
manages client connections to other nodes. This means that the
TcpTransport no longer maintains a map of nodes that it is connected
to.
Increases testability of MasterService and the discovery layer. Changes:
- Async publish method
- Moved a few interfaces/classes top-level to simplify imports
- Deterministic MasterService implementation for tests
With the move to java time, the default formatter used by toString on
ZonedDateTime uses optional components for least significant portions of
the date. This commit changes the cat indices api to use a strict date
time format, which will always output milliseconds, even if they are
zero.
closes#32466
Currently AbstractBuilderTestCase generates certain random values in its
`beforeTest()` method annotated with @Before only the first time that a test
method in the suite is run while initializing the serviceHolder that we use for
the rest of the test. This changes the values of subsequent random values
and has the effect that when running single methods from a test suite with
"-Dtests.method=*", the random values it sees are different from when the same
test method is run as part of the whole test suite. This makes it hard to use
the reproduction lines logged on failure.
This change runs the inialization of the serviceHolder and the randomization
connected to it using the test runners master seed, so reproduction by running
just one method is possible again.
Closes#32400
Processing bulk request goes item by item. Sometimes during processing, we need to stop execution and wait for a new mapping update to be processed by the node. This is currently achieved by throwing a `RetryOnPrimaryException`, which is caught higher up. When the exception is caught, we wait for the next cluster state to arrive and process the request again. Sadly this is a problem because all operations that were already done until the mapping change was required are applied again and get new sequence numbers. This in turn means that the previously issued sequence numbers are never replicated to the replicas. That causes the local checkpoint of those shards to be stuck and with it all the seq# based infrastructure.
This commit refactors how we deal with retries with the goal of removing `RetryOnPrimaryException` and `RetryOnReplicaException` (not done yet). It achieves so by introducing a class `BulkPrimaryExecutionContext` that is used the capture the execution state and allows continuing from where the execution stopped. The class also formalizes the steps each item has to go through:
1) A translation phase for updates
2) Execution phase (always index/delete)
3) Waiting for a mapping update to come in, if needed
4) Requires a retry (for updates and cases where the mapping are still not available after the put mapping call returns)
5) A finalization phase which allows updates to the index/delete result to an update result.
This adds a java time based date math parser class in order, which will replace the joda date based one in the future. For now the class also returns the date in milliseconds since the epoch.
Currently if a document cannot be indexed because it violates the defined
mapping for the index, a MapperException is thrown. In some cases it is
useful to expose the expected field type in the exception itself,
so that the user can react based on the error message. This change adds
the expected data type to the MapperException.
Closes#31502
Remove a few of the logger constructors that aren't widely used or
aren't used at all and deprecate a few more logger constructors in favor
of log4j2's `LogManager`.
A bug in the test suite prevented to properly check that all date
formatters printed the date the same way like joda time does.
This fixes the test and thus also a fair share of formats, that
now use the strict parser for printing.
We previously discussed moving the classes extending `AcknowledgedResponse` to
simply use `AcknowledgedResponse`, making the class non-abstract.
This moves the first class to do this, removing `WritePipelineResponse` in the
process.
If we like the way this looks, I will switch the remaining classes over to using
`AcknowledgedResponse`.