Remove the settings around dangling indices, such as no import and timeout for deletion, we always want to import dangling indices for safety, and we should not allow to change the behavior. This also cleans up the code quite a bit.
closes#10016
we want to make sure the recovery finished all the way to post recovery. Current check, validating the shard is either in POST_RECOVERY or STARTED is not good because the shard could be also closed if things go fast enough (like in our tests). The assertion is changed to check the shard is not left in CREATED or RECOVERING.
Closes#10028
- Added NAME constants for each script language, avoiding to repeat the same strings all over the place.
- Simplified `compile` method signatures by removing a couple of variants. Note that all of these signatures are going to change again with #6418 as in order to compile/execute a script the caller will need to specify which operation is attempting to execute the script, info that will be provided as an additional mandatory argument.
- Removed double call to ScriptService#verifyDynamicScripting for every indexed or dynamic script.
- Decreased ScriptService inner classes visibility to private (CacheKey, IndexedScript, ApplySettings)
- Moved ScriptService inner classes to the bottom of the class, I think it makes it more readable.
- Resolved some compiler warnings
Closes#9992
The tribe node, at startup, sets up the tribe clients that will join their corresponding tribes. All of the tribe.* settings are properly forwarded to the corresponding tribe client. System properties and global configuration settings must not be forwarded to the tribe client though or they will end up overriding per tribe settings with same name causing issues.
For instance if you set the transport.tcp.port to some defined value for the tribe node, via system property or configuration file, that same value must not be forwarded to the tribe clients, otherwise they will try and use the same port, which will be already occupied by the tribe node itself, resulting in startup failed. Same for cluster.name, which will cause the tribe clients not to join their tribes.
Closes#9576Closes#9721
This commit adds scripting capability to significant_terms.
Custom heuristics can be implemented with a script that provides
parameters subset_freq, superset_freq,subset_size, superset_size.
closes#7850
The local DocumentMapper is updated while parsing and dynamic fields are added before
parsing has finished. If parsing fails after a dynamic field has been added already
then the field was not added to the cluster state but was present in the local mapper of this
node. New documents with the same field would not necessarily cause an update either and
after restarting the node the mapping for these fields were lost. Instead the new fields
should always be updated.
closes#9851closes#9874
Since the method can be called in an #execute event of the cluster service, we need the ability to use the cluster state that will be provided in the ClusterChangedEvent, have the ClusterState be provided as a parameter
Previously it was ignored and the publish cluster state timeout would kick in. In that case a stale master node would just wait for the inevitable and waste valuable time.
This issue was discovered by the DiscoveryWithServiceDisruptionsTests#testStaleMasterNotHijackingMajority test.
Also only perform cluster state versions and wrong master node check inside cluster state update task.
If a tragic even happens while we are reading the segments info from the
store the store might have been closed concurrently. We had this
behavior before and was lost in a refactoring.
If a tragic even happens while we are reading the segments info
from the store the store might have been closed concurrently. We had this behavior
before and was lost in a refactoring.
If a folder for an index was created that folder is never deleted from that node unless the index is deleted.
Data only nodes therefore can have empty folders for indices that they do not even have shards for.
This commit makes sure empty folders are cleaned up after all shards have moved away from a data only
node. The behavior is unchanged for master eligible nodes.
closes#9985
The Histogram and Range APIs for the aggregations changed so that there was a common interface between he types of Range/Histogram. This PR reflects that change in the Java API docs
Contributes to #9976
Delete-by-query is incredibly costly because it forces a refresh each
time, so if you are also indexing this can cause massive segment
explosion.
This change throttles delete-by-query when merges can't keep up. It's
likely not enough (#7052 is the long-term solution) but can only
help.
Closes#9986
In TransportSearchTypeAction one of the logger calls was passing the throwable in as a parameter for the message rather than a throwable to be printed as a stack trace. This change fixes it so the throwable is printed properly
We used to handle truncated translogs in a better manner (assuming that
the node was killed halfway through writing an operation and discarding
the last operation). This brings back that behavior by catching an
`EOFException` during the stream reading and throwing a
`TruncatedTranslogException` which can be safely ignored in
`IndexShardGateway`.
Fixes#9699
We've been relying on URI for url encoding, but it turns out it has some problems. For instance '+' stays as is while it should be encoded to `%2B`. If we go and manually encode query params we have to be careful though not to run into double encoding ('+'=>'%2B'=>'%252B'). The applied solution relies on URI encoding for the url path, but manual url encoding for the query parameters. We prevent URI from double encoding query params by using its single argument constructor that leaves everything as is.
We can also revert back the expression script REST test that revealed this to its original content (which contains an addition).
Closes#9769Closes#9946
It may take some time for the old master node to step down anf for it to rejoin and that all nodes have it in the nodes list.
By waiting for the old master node to have stepped down, we can again rely on assertDiscoveryCompleted() to make sure that it has joined.
If the isolated unicast host is also a master node then its local cluster state gets unusable a source for pinging when the disruption stops.
All the nodes in the cluster state node list can be removed and at that time it will only ping itself and never find out about the other nodes.
(these nodes will not ping, because they are already following a new master)
CurrentTestFailedMarker is a RunListener that gets notified whenever a test fails, and we were using it to be able to restart the suite cluster after each failure. We were checking whether a test had failed in the @After method though, which runs before the listener gets notified, so the failed flag would always be false.
This commit makes sure that the suite cluster gets restarted not only when there are problems in the afterInternal method, but also after each test failure. In order to achieve this, we need to reset the cluster afterwards, when we get to know about both of the events (problem in afterInternal and test failure), and before resetting the currentCluster. Introduced a TestRule that keeps track of test failures and allows to execute arbitrary tasks when a test fails and when a test is completed (regardless of its result). Allows also to force the execution of the failure task (used in case of afterInternal issues rather than actual test failure).
Also updated ElasticsearchRestTests to make sure that the RestClient gets re-initialized in case we restart the suite cluster, otherwise all the subsequent tests fail. Improved this mechanism also to relate it directly to the restart of the cluster instead of checking whether the addresses have changed, which doesn't work anyway as the new cluster will use the same addresses but the client needs to be recreated anyway.
Closes#9015