Today, shard failure requests are blindly handled on the master without
any validation that the request is a legal request. A legal request is a
shard failure request for which the shard requesting the failure is
either the local allocation or the primary allocation. This is because
shard failure requests are classified into only two sets: requests that
correspond to shards that exist, and requests that correspond to shards
that do not exist. Requests that correspond to shards that do not exist
are immediately marked as successful (there is nothing to do), and
requests that correspond to shards that do exist are sent to the
allocation service for handling the failure.
This pull request adds a third classification for shard failure requests
to separate out illegal shard failure requests and enables the master to
validate shard failure requests. The master communicates the illegality
of a shard failure request via a new exception:
NoLongerPrimaryShardException. This exception can be used by shard
failure listeners to discover when they've sent a shard failure request
that they were not allowed to send (e.g., if they are no longer the
primary allocation for the shard).
Closes#16275
Identifying when a plugin id is maven coordinates is currently done by
checking if the plugin id contains 2 colons. However, a valid url could
have 2 colons, for example when a port is specified. This change adds
another check, ensuring the plugin id with maven coordinates does not
contain a slash, which only a url would have.
closes#16376
This commit marks OldIndexBackwardsCompatibilityIT#testOldIndexes as
awaiting a Lucene snapshot upgrade to reflect the fact that
Elasticsearch 2.2.0 is built against Lucene 5.4.1 but the current Lucene
snapshot in master/2.x does not contain the Lucene version 5.4.1 field.
Relates #16373
This commit fixes a test bug in
JvmGcMonitorServiceSettingsTests#testMissingSetting. The purpose of the
test is to test that if settings are provided for a collector for at
least one of warn, info, and debug then it is provided for all of warn,
info, and debug. However, for a collector setting to be valid it must be
a positive time value but the randomization in the test construction
could produce zero time values.
Closes#16369
This removes the deprecation for the geohash based setter to quickly fix the failure here: http://build-us-00.elastic.co/job/es_core_master_suse/3312
Reintroducing postponed until related test in groovy module is fixed. Need to figure out what went wrong when I ran the build locally w/o failure before.
This commit enableds strict settings validation on node startup. All settings
passed to elasticsearch either through system properties, yaml files or any other
way to pass settings must be registered and valid. Settings that are unknown ie. due to
typos or due to deprecation or removal will cause the node to NOT start up. Plugins
have to declare all their settings on the `SettingsModule#registerSetting` and settings for
plugins that are not installed must be removed.
This commit also removes the ability to specify the nodes name via `-Des.name` or just `name` in the
configuration files. The node name must be prefixed with the node prexif like `node.name: Boom`. Left over
usage of `name` will also cause startup to fail.
Adds to GeoDistanceSortBuilder:
* equals
* hashcode
* writeto/readfrom
* moves xcontent parsing logic over
* adds roundtrip tests
* fixes roundtrip test for xcontent by keeping points just as geopoints not geohashes internally
* fixes xcontent parsing of ignore_malformed if coerce is set/unset
* adds exception to sortMode setter to avoid setting invalid sort modes
Relates to #15178
When primary relocation completes, a cluster state is propagated that deactivates the old primary and marks the new primary as active.
As cluster state changes are not applied synchronously on all nodes, there can be a time interval where the relocation target has processed
the cluster state and believes to be the active primary and the relocation source has not yet processed the cluster state update and
still believes itself to be the active primary. This commit ensures that, before completing the relocation, the reloction source deactivates
writing to its store and delegates requests to the relocation target.
Closes#15900
Adds a container that represents a resource with reference counting capabilities. Provides operations to suspend acquisition of new references. Useful for resource management when resources are intermittently unavailable.
Closes#15956
Its what we say our maximum line length is in CONTRIBUTING.md and it'd be
nice to have a check on that. Unfortunately, we don't actually wrap all lines
at 140.
Cli tools currently catch all exceptions, and only print the exception
message, except when a special system property is set. Even with this
flag set, certain exceptions, like IOException, are captured and their
stack trace is always lost.
This change adds a UserError class, which can be used a cli tools to
specify a message to the user, as well as an exit status. All other
exceptions are propagated out of main, so java will exit with non-zero
and print the stack trace.
This commit fixes an inconsistency between ShardId#equals(Object),
ShardId#hashCode, and ShardId#compareTo(ShardId). In particular,
ShardId#equals(Object) compared only the numerical shard ID and the
index name, but did not compare the index UUID; a similar situation
applies to ShardId#compareTo(ShardId). However, ShardId#hashCode
incorporated the indexUUID into its calculation. This can lead to
situations where two ShardIds compare as equal yet have different hash
codes.
Closes#16319
The following settings were moved to Setting contents in HttpTransportSettings:
* http.cors.allow-origin
* http.port
* http.publish_port
* http.detailed_errors.enabled
* http.max_content_length
* http.max_chunk_size
* http.max_header_size
* http.max_initial_line_length
The following settings were removed:
* http.port
* http.netty.port
* http.netty.publish_port
* http.netty.max_content_length
* http.netty.max_chunk_size
* http.netty.max_header_size
* http.netty.max_initial_line_length
As a prerequisite for refactoring the whole PhraseSuggestionBuilder
to be able to be parsed and streamed from the coordinating node, the
DirectCandidateGenerator must implement Writeable, be able to parse
a new instance (fromXContent()) and later when transported to the
shard to generate a PhraseSuggestionContext.DirectCandidateGenerator.
Also adding equals/hashCode and tests and moving DirectCandidateGenerator
to its own DirectCandidateGeneratorBuilder class.
Long ago (#7834) the owner ship of the local disco node was centralized to the cluster service. LocalDiscovery is still created it's own disco node, which is not used by the cluster service and thus creating confusion (two nodes same name but different ids).
This commit also removes and optimization where when joining a new master we would first copy the master's metadata and only then pull in the rest of the cluster state (and it's nodes).
Closes#16317
The plugin cli currently is extremely lenient, allowing most errors to
simply be logged. This can lead to either corrupt installations (eg
partially installed plugins), or confused users.
This change rewrites the plugin cli to have almost no leniency.
Unfortunately it was not possible to remove all leniency, due in
particular to how config files are handled.
The following functionality was simplified:
* The format of the name argument to install a plugin is now an official
plugin name, maven coordinates, or a URL.
* Checksum files are required, and only checked, for official plugins
and maven plugins. Checksums are also only SHA1.
* Downloading no longer uses a separate thread, and no longer has a timeout.
* Installation, and removal, attempts to be atomic. This only truly works
when no config or bin files exist.
* config and bin directories are verified before copying is attempted.
* Permissions and user/group are no longer set on config and bin files.
We rely on the users umask.
* config and bin directories must only contain files, no subdirectories.
* The code is reorganized so each command is a separate class. These
classes already existed, but were embedded in the plugin cli class, as
an extra layer between the cli code and the code running for each command.
With the recent change to allow ESSingleNodeTestCase to specify plugins
and Version for the node, node creation became lazy, where it now
happens before the first test to run. However, there were many static
methods on ESSingleNodeTestCase that still try to access the node. This
change makes those methods non-static.
This commit removes and forbids the use of lowercase ells ('l') in long
literals because they are often hard to distinguish from the digit
representing one ('1').
Closes#16329
Previously it's possible to get errors like:
```
Caused by: NotSerializableExceptionWrapper[d:\shared_data\afs-issue-1-1\23]
```
Which doesn't tell us what the underlying exception type was that could
not be serialized.
This changes the message to prepend the
`ElasticsearchException.getExceptionName()` of the exception (which is a
underscore case of the class with the leading 'Elasticsearch' removed)
This commit amends a logging statement in TransportReplicationAction to
log the full shard ID instead of just the numerical shard ID (without
the index name).
When there is an exception thrown during pipeline creation within
Rest calls (in put pipeline, and simulate) We now return a structured
error response to the user with details around which processor's
configuration is the cause of the issue, or which configuration property
is misconfigured, etc.
We are leaking all kinds of resources if something during Node#close() barfs.
This commit cuts over to a list of closeables to release resources that
also closed remaining services if one or more services fail to close.
Closes#13685
This is an issue due to some refactoring we did along the lines of Index.java etc.
We pass the index object to Set#contains which should actually be only it's name.
Closes#16299
This can cause premature closing of the underlying file descriptor immediately
if at the same time the thread is blocked on IO. The file descriptor will remain closed
and subsequent access to NIOFSDirectory will throw a ClosedChannelException.
Adds a method that emits a WordScorerFactory to all of the
three SmoothingModel implementatins that will be needed when
we switch to parsing the PhraseSuggestion on the coordinating
node and need to delay creating the WordScorer on the shards.
Now MasterNodeOperations, ReplicationAllShards, ReplicationSingleShard, BroadcastReplication and BroadcastByNode actions keep track of their parent tasks.
With this commit we revert the default value for the setting
'index.requests.cache.enable' again to false, which has been
set to true by accident in PR #16054.
Checked with @s1monw
Note for many of the settings we had a netty specific override to a generic transport setting. I have removed those as they are not needed imo (ex. "transport.connections_per_node.recovery" was overriden by "transport.netty.connections_per_node.recovery", which I removed). The netty prefix was kept for settings that are tied to the netty universe - ex. "transport.netty.max_cumulation_buffer_capacity"
Closes#16200