No changes were really needed in our test infra, as it didn't use `node.client`. It didn't take ingest nodes into account, though: what we used to call client nodes in InternalTestCluster were actually ingest only nodes, which now become coordinating only nodes.
Also renamed some methods to get rid of the node client terminology as much as possible, in favour of coordinating only node.
The cluster stats api now returns counts for each node role. The `master_data`, `master_only`, `data_only` and `client` fields have been removed from the response in favour of `master`, `data`, `ingest` and `coordinating_only`. The same node can have multiple roles, and hence contribute to multiple role counts. Every node is implicitly a coordinating node, so whenever a node has no explicit roles, it will be counted as coordinating only.
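The counting rule above can be sketched as follows. This is a minimal illustration, not the actual cluster stats code: the class `RoleCountsSketch`, the method `countRoles` and the representation of a node as a set of role names are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of per-role counting; not the real cluster stats implementation.
final class RoleCountsSketch {

    // Each node is represented by the set of its explicit role names
    // ("master", "data", "ingest"). An empty set means coordinating only.
    static Map<String, Integer> countRoles(List<Set<String>> nodes) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String field : new String[] {"master", "data", "ingest", "coordinating_only"}) {
            counts.put(field, 0);
        }
        for (Set<String> roles : nodes) {
            if (roles.isEmpty()) {
                // No explicit roles: counted as coordinating only.
                counts.merge("coordinating_only", 1, Integer::sum);
            } else {
                // A node with several roles contributes to several counts.
                for (String role : roles) {
                    counts.merge(role, 1, Integer::sum);
                }
            }
        }
        return counts;
    }
}
```

With this sketch, a node that is both master eligible and a data node adds one to `master` and one to `data`, so the counts can sum to more than the number of nodes.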
_cat/nodes used to return `c` for client node or `d` for data node as part of the node.role column. This commit changes it to return `m` for master eligible, `d` for data and/or `i` for ingest. A node with no explicit roles will be a coordinating only node and marked with `-`. A node can obviously have multiple roles. The master column has been adapted to return only whether a node is the current master (`*`) or not (`-`).
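As a rough illustration of the new column values, the mapping could look like the sketch below. The class and method names are hypothetical, and the order in which the letters are concatenated (`m`, then `d`, then `i`) is an assumption rather than something stated here.

```java
// Hypothetical sketch of the _cat/nodes column values; not the real REST handler.
final class CatNodesRoleColumnSketch {

    // node.role column: one letter per explicit role, or "-" for coordinating only.
    static String roleColumn(boolean masterEligible, boolean data, boolean ingest) {
        StringBuilder sb = new StringBuilder();
        if (masterEligible) {
            sb.append('m');
        }
        if (data) {
            sb.append('d');
        }
        if (ingest) {
            sb.append('i');
        }
        return sb.length() == 0 ? "-" : sb.toString();
    }

    // master column: only says whether the node is the current master.
    static String masterColumn(boolean isCurrentMaster) {
        return isCurrentMaster ? "*" : "-";
    }
}
```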
A node can now have roles. Role is an enum made of master, data and ingest. A node with no roles is implicitly a coordinating only node. Roles are resolved once at construction time based on node attributes and never serialized. Moving DiscoveryNode to Writeable helps clean up the code; making fields final makes it easy to see where roles need to be initialized and to do it in one single place.
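A minimal sketch of resolving roles once at construction time is shown below. It is not the DiscoveryNode code: `NodeRolesSketch` and its members are hypothetical, the settings are modelled as a plain string map, and the sketch assumes that `node.master`, `node.data` and `node.ingest` each default to true when absent.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a node that resolves its roles once; not DiscoveryNode itself.
final class NodeRolesSketch {

    enum Role { MASTER, DATA, INGEST }

    // Final field: roles are computed in exactly one place, the constructor.
    private final Set<Role> roles;

    NodeRolesSketch(Map<String, String> settings) {
        EnumSet<Role> resolved = EnumSet.noneOf(Role.class);
        // Assumption: every role setting defaults to true when it is not set.
        if (Boolean.parseBoolean(settings.getOrDefault("node.master", "true"))) {
            resolved.add(Role.MASTER);
        }
        if (Boolean.parseBoolean(settings.getOrDefault("node.data", "true"))) {
            resolved.add(Role.DATA);
        }
        if (Boolean.parseBoolean(settings.getOrDefault("node.ingest", "true"))) {
            resolved.add(Role.INGEST);
        }
        this.roles = Collections.unmodifiableSet(resolved);
    }

    Set<Role> roles() {
        return roles;
    }

    // A node with no explicit roles is implicitly coordinating only.
    boolean isCoordinatingOnly() {
        return roles.isEmpty();
    }
}
```

Under these assumptions, a node configured with all three settings set to false ends up with an empty role set and is treated as coordinating only.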
As discussed in #16565, the node.client setting is an unnecessary shortcut to node.data: false and node.master: false. We have places where we treat nodes with node.client set to true differently compared to master false and data false, which is not correct. Also, with the addition of node.ingest or potentially new roles, it becomes confusing to figure out if a node client should support ingestion or not.
This commit removes the node.client setting in favour of being explicit, using node.master, node.data and node.ingest instead.
Decommissioning a node or applying a filter inclusion / exclusion can potentially lead to many shards that need to be moved to other nodes. This commit reuses the model across all
shard movements in an allocation round: it calculates the shard model once and simulates the application of all shards that can be moved on this model.
Closes #16926
Today we might run into a rejected execution exception when
we shut down the node while handling a transport exception. The
exception handler is run in a separate thread, but that thread might
not be able to execute due to the shutdown. Today we barf and fill
the logs with large exceptions. This commit catches this exception
and logs it at debug level instead.
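The pattern described above can be sketched with plain JDK classes. This is only an illustration: it uses `java.util.concurrent.RejectedExecutionException` and `java.util.logging` rather than the Elasticsearch thread pool and logger, and the class and method names are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.RejectedExecutionException;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch: tolerate a rejected handler dispatch during shutdown.
final class RejectedOnShutdownSketch {

    private static final Logger LOGGER = Logger.getLogger(RejectedOnShutdownSketch.class.getName());

    // Tries to run the exception handler on the executor.
    // Returns true if it was accepted, false if the executor rejected it.
    static boolean dispatchExceptionHandler(ExecutorService executor, Runnable handler) {
        try {
            executor.execute(handler);
            return true;
        } catch (RejectedExecutionException e) {
            // Expected while shutting down: log quietly instead of flooding the logs.
            LOGGER.log(Level.FINE, "handler rejected, executor is shutting down", e);
            return false;
        }
    }
}
```

Calling `dispatchExceptionHandler` on an executor that has already been shut down returns false and emits a single low-level log entry instead of propagating the exception.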
Remove support for legacy checksums
Elasticsearch 5.0 doesn't support indices with legacy checksums anymore.
The last time we wrote legacy checksums was in 1.3.0, which was already based
on lucene 4.9, which means that all files have CRC32 checksums.
All indices that Elasticsearch can read today must be written with
lucene version >= 4.8 anyway so we can drop this layer of backwards
compatibility entirely.
Since we are close to upgrading to Lucene 6.0 we should get rid of this
in a more contained change than the lucene upgrade.
On shared FS / shadow replicas we rely on a lock retry if the lock has
not yet been released on a relocated primary. This commit adds this `hack`
for shared filesystems only.
Closes #16936
This commit modifies TransportBulkAction to use relative time instead of
absolute time when measuring how long a bulk request took to be
processed, and adds tests for this functionality.
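A minimal sketch of measuring elapsed time with a relative clock follows. It is not the TransportBulkAction code: `RelativeTookSketch` is a hypothetical class, and injecting the clock as a `LongSupplier` is an assumption made here so that the measurement can be tested deterministically.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

// Hypothetical sketch: measure "took" time from a monotonic, relative clock.
final class RelativeTookSketch {

    private final LongSupplier relativeNanos;
    private final long startNanos;

    // The clock is injected so that tests can supply a fake time source.
    RelativeTookSketch(LongSupplier relativeNanos) {
        this.relativeNanos = relativeNanos;
        this.startNanos = relativeNanos.getAsLong();
    }

    // Elapsed milliseconds since construction; unaffected by wall-clock adjustments
    // when the supplier is monotonic (for example System::nanoTime).
    long tookMillis() {
        return TimeUnit.NANOSECONDS.toMillis(relativeNanos.getAsLong() - startNanos);
    }
}
```

In production-style code the supplier would typically be `System::nanoTime`; an absolute clock such as `System.currentTimeMillis()` can jump backwards or forwards when the system time is changed, which is the kind of problem relative time avoids.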
Closes #16916
`writeLockTimeout` has been removed completely in Lucene 6, and since we have had
the shard locking mechanism for quite a while now, we don't need it anymore.
Shards should only be allocated once all resources are released such that there
can't be any other shard holding the lock to that index in any sane situation.
This commit removes the ability to use string fields on indices created on or
after 5.0. Dynamic mappings now generate text fields by default for strings
but there are plans to also add a sub keyword field (in a future PR).
Most of the changes in this commit are just about replacing string with
keyword or text. Some tests have been removed because they only existed to cover
corner cases of string mappings, like setting ignore_above on a text field or
enabling term vectors on a keyword field, which are now impossible.
The plan is to remove strings entirely in 6.0.