This PR completes the refactoring of the cluster allocation explain API and improves it in the following two high-level ways:
1. The explain API now uses the same allocators that the AllocationService uses to make shard allocation decisions. Prior to this PR, the explain API would run the deciders against each node for the shard in question, but this was not executed on the same code path as the allocators, and many of the scenarios in shard allocation were not captured due to not executing through the same code paths as the allocators.
2. The APIs have changed, both on the Java and JSON level, to accurately capture the decisions made by the system. The APIs also now report on shard moving and rebalancing decisions, whereas the previous API did not report decisions for moving shards which cannot remain on their current node or rebalancing shards to form a more balanced cluster.
Note: this change affects plugin developers who may have a custom implementation of the ShardsAllocator interface. The method weighShards has been removed and no longer has any utility. In order to support the new explain API, however, a custom implementation of ShardsAllocator must now implement ShardAllocationDecision decideShardAllocation(ShardRouting shard, RoutingAllocation allocation) which provides a decision and explanation for allocating a single shard. For implementations that do not support explaining a single shard allocation via the cluster allocation explain API, this method can simply return an UnsupportedOperationException.
`scaled_float` should be used as DOUBLE in aggregations but currently they are used as LONG.
This change fixes this issue and adds a simple it test for it.
Fixes#22350
Before, snapshot/restore would synchronize all operations on the cluster
state except for deleting snapshots. This meant that only one
snapshot/restore operation would be allowed in the cluster at any given
time, except for deletions - there could be two or more snapshot
deletions running at the same time, or a deletion could be running,
unbeknowest to the rest of the cluster, and thus a snapshot or restore
would be allowed at the same time as the snapshot deletion was still in
progress. This could cause any number of synchronization issues,
including the situation where a snapshot that was deleted could reappear
in the index-N file, even though its data was no longer present in the
repository.
This commit introduces a new custom type to the cluster state to
represent deletions in progress. Now, another deletion cannot start if
a deletion is currently in progress. Similarily, a snapshot or restore
cannot be started if a deletion is currently in progress. In each case,
if attempting to run another snapshot/restore operation while a deletion
is in progress, a ConcurrentSnapshotExecutionException will be thrown.
This is the same exception thrown if trying to snapshot while another
snapshot is in progress, or restore while a snapshot is in progress.
Closes#19957
Today we only expose `value_type` in scriptable aggregations, however it is
also useful with unmapped fields. I suspect we never noticed because
`value_type` was not documented (fixed) and most aggregations are scriptable.
Closes#20163
Sequence BWC logic consists of two elements:
1) Wire level BWC using stream versions.
2) A changed to the global checkpoint maintenance semantics.
For the sequence number infra to work with a mixed version clusters, we have to consider situation where the primary is on an old node and replicas are on new ones (i.e., the replicas will receive operations without seq#) and also the reverse (i.e., the primary sends operations to a replica but the replica can't process the seq# and respond with local checkpoint). An new primary with an old replica is a rare because we do not allow a replica to recover from a new primary. However, it can occur if the old primary failed and a new replica was promoted or during primary relocation where the source primary is treated as a replica until the master starts the target.
1) Old Primary & New Replica - this case is easy as is taken care of by the wire level BWC. All incoming requests will have their seq# set to `UNASSIGNED_SEQ_NO`, which doesn't confuse the local checkpoint logic (keeping it at `NO_OPS_PERFORMED`)
2) New Primary & Old replica - this one is trickier as the global checkpoint service currently takes all in sync replicas into consideration for the global checkpoint calculation. In order to deal with old replicas, we change the semantics to say all *new node* in sync replicas. That means the replicas on old nodes don't count for the global checkpointing. In this state the seq# infra is not fully operational (you can't search on it, because copies may miss it) but it is maintained on shards that can support it. The old replicas will have to go through a file based recovery at some point and will get the seq# information at that point. There is still an edge case where a new primary fails and an old replica takes over. I'lll discuss this one with @ywelsch as I prefer to avoid it completely.
This PR also re-enables the BWC tests which were disabled. As such it had to fix any BWC issue that had crept in. Most notably an issue with the removal of the `timestamp` field in #21670.
The commit also includes a fix for the default value of the seq number field in replicated write requests (it was 0 but should be -2), that surface some other minor bugs which are fixed as well.
Last - I added some debugging tools like more sane node names and forcing replication request to implement a `toString`
In #20305, _suggest endpoint was deprecated
in favour of using _search endpoint. This
commit removes the dedicated _suggest endpoint
entirely from master.
This commit adds a skip for the include segment file sizes REST tests on
nodes less than or equal to version 5.1.1 as the stats APIs did not
correctly account for this parameter prior to version 5.1.2.
Relates #21879
This commit addresses an issue in the stats APIs where
include_segment_file_sizes was not being consumed leading to requests
containing this parameter being rejected.
Relates #21879
* Replace _suggest endpoint to _search in docs
In 5.0, the _suggest endpoint is just sugar for _search
with suggestions specified. Users should move away from
using the _suggest endpoint, as it is marked as deprecated in 5.x and
will be removed in 6.0
* update docs to use _search endpoint instead of _suggest
* Add deprecation logging to RestSuggestAction
* Use search endpoint instead of suggest endpoint in rest tests
Queries must be rewritten before the query phase executes otherwise non-executable queries like `wrapper` query or `terms` will fail or queries that require resources like script service can't access these service unless rewritten.
Relates to #21303
This change allows specifying alias/wildcard expression in indices_boost.
And added another format for specifying indices_boost. It accepts array of index name and boost pair.
If an index is included in multiple aliases/wildcard expressions, the first match will be used.
With new format, old format is marked as deprecated.
Closes#4756
Improves the error message returned when looking up a task that
belongs to a node that is no longer part of the cluster. The new
error message tells the user that the node isn't part of the cluster.
This is useful because if you start a task and the node goes down
there isn't a record of the task at all. This hints to the user that
the task might have died with the node.
Relates to #22027
Fixes an issue where indexing requests with operation type "create" auto-convert external versioning to internal versioning and silently ignore the version number instead of failing with an error message.
This commit adds a skip for the missing types REST test. There was a bug
for an unclosed XContent object on version prior to version 5.0.3 which
is going to lead to different responses on vresions prior to 5.0.3 and
versions on or after version 5.0.3.
When there are no indexes, get mapping has a series of special cases.
Two of those expect the response object already started, and the other
two respond with an exception. Those two cases (types passed in but no
indexes and vice versa) would fail in their error response generation
because it did not expect an object to already be started in the json
generator. This change moves the object start to where it is needed for
the empty responses.
closes#21916
REST tests use the default OOTB low/high disk watermarks of 85%/90%, which can make some tests fail if run on a machine with a fuller disk. This commit changes the watermarks in the same way as in IntegTestCase so that they're essentially ignored.
With #21738 we added an indices section to the search shards api, that will return the concrete indices hit by the request, and eventually the corresponding alias filter.
The java API returns the AliasFilter object, which holds the filter itself and an array of aliases that pointed to the index in the original request. The REST layer doesn't print out the aliases array though. This commit adds the aliases array as well and tests for this.
Group, List and Affix settings generate a bogus diff that turns the actual
diff into a string containing a json structure for instance:
```
"action" : {
"search" : {
"remote" : {
"" : "{\"my_remote_cluster\":\"[::1]:60378\"}"
}
}
}
```
which make reading the setting impossible. This happens for instance
if a group or affix setting is rendered via `_cluster/settings?include_defaults=true`
This change fixes the issue as well as several minor issues with affix settings that
where not accepted as valid setting today.
* Scripting: Remove groovy scripting language
Groovy was deprecated in 5.0. This change removes it, along with the
legacy default language infrastructure in scripting.
Add indices and filter information to search shards api output
The search shards api returns info about which shards are going to be hit by executing a search with provided parameters: indices, routing, preference. Indices can also be aliases, which can also hold filters. The output includes an array of shards and a summary of all the nodes the shards are allocated on. This commit adds a new indices section to the search shards output that includes one entry per index, where each index can be associated with an optional filter in case the index was hit through a filtered alias.
This is relevant since we have moved parsing of alias filters to the coordinating node.
Relates to #20916
The `type` parameter has always been accepted by the search_shards api, probably to make the api and its urls the same as search. Truth is that the type never had any effect, it's been ignored from day one while accepting it may make users think that we actually do something with it.
This commit removes support for the type parameter from the REST layer and the Java API. Backwards compatibility is maintained on the transport layer though.
The new added serialization test also uncovered a bug in the java API where the `ClusterSearchShardsRequest` could be created with no arguments, but the indices were required to be not null otherwise the request couldn't be serialized as `writeTo` would throw NPE. Fixed by setting a default value (empty array) for indices.
This is a followup to #21689 where we removed a misplaced try catch for IndexMissingException and IndexClosedException which was related to #9047 (at least for the index closed case). The code block within the change was moved as part of #20890, which made the catch redundant. It was somehow used before (e.g. in 5.0) but it doesn't seem that this catch had any effect. Added tests to verify that. In fact a specific catch added to the search api only would defeat the purpose of having common indices options that work throughout all our APIs.
Relates to #21689
The tests rely on the allocation decider allowing allocation and
the disk threshold decider wasn't allowing it. This came up from
time to in CI but *all* the time on my local machine because it is
a small ssd.
* master: (22 commits)
Add proper toString() method to UpdateTask (#21582)
Fix `InternalEngine#isThrottled` to not always return `false`. (#21592)
add `ignore_missing` option to SplitProcessor (#20982)
fix trace_match behavior for when there is only one grok pattern (#21413)
Remove dead code from GetResponse.java
Fixes date range query using epoch with timezone (#21542)
Do not cache term queries. (#21566)
Updated dynamic mapper section
Docs: Clarify date_histogram bucket sizes for DST time zones
Handle release of 5.0.1
Fix skip reason for stats API parameters test
Reduce skip version for stats API parameter tests
Strict level parsing for indices stats
Remove cluster update task when task times out (#21578)
[DOCS] Mention "all-fields" mode doesn't search across nested documents
InternalTestCluster: when restarting a node we should validate the cluster is formed via the node we just restarted
Fixed bad asciidoc in boolean mapping docs
Fixed bad asciidoc ID in node stats
Be strict when parsing values searching for booleans (#21555)
Fix time zone rounding edge case for DST overlaps
...
Adds a version constant for it, bwc indices, and a vagrant upgrade-from
version. Also bumps the "upgrade from" version for the backwards-5.0
test and adds `skip`s for tests that don't fail against 5.0 so we skip
them during the backwards testing.
Finally, this skips the "Shrink index via API" test because it fails
consistently for me. Inconsistently for CI, but consistently for me.
I'll work on making it consistent tomorrow.
This commit reduces the skip version for the stats API unrecognized
parameter tests now that the logic for unrecognized parameters has been
backported to 5.x.
Today when parsing a stats request, Elasticsearch silently ignores
incorrect metrics. This commit removes lenient parsing of stats requests
for the nodes stats and indices stats APIs.
Relates #21417
This commit enables real BWC testing against a 5.1 snapshot. All
REST tests plus rolling upgrade test now run against a mixed version
cross major version cluster.
* master:
ShardActiveResponseHandler shouldn't hold to an entire cluster state
Ensures cleanup of temporary index-* generational blobs during snapshotting (#21469)
Remove (again) test uses of onModule (#21414)
[TEST] Add assertBusy when checking for pending operation counter after tests
Revert "Add trace logging when aquiring and releasing operation locks for replication requests"
Allows multiple patterns to be specified for index templates (#21009)
[TEST] fixes rebalance single shard check as it isn't guaranteed that a rebalance makes sense and the method only tests if rebalance is allowed
Document _reindex with random_score
* master: (516 commits)
Avoid angering Log4j in TransportNodesActionTests
Add trace logging when aquiring and releasing operation locks for replication requests
Fix handler name on message not fully read
Remove accidental import.
Improve log message in TransportNodesAction
Clean up of Script.
Update Joda Time to version 2.9.5 (#21468)
Remove unused ClusterService dependency from SearchPhaseController (#21421)
Remove max_local_storage_nodes from elasticsearch.yml (#21467)
Wait for all reindex subtasks before rethrottling
Correcting a typo-Maan to Man-in README.textile (#21466)
Fix InternalSearchHit#hasSource to return the proper boolean value (#21441)
Replace all index date-math examples with the URI encoded form
Fix typos (#21456)
Adapt ES_JVM_OPTIONS packaging test to ubuntu-1204
Add null check in InternalSearchHit#sourceRef to prevent NPE (#21431)
Add VirtualBox version check (#21370)
Export ES_JVM_OPTIONS for SysV init
Skip reindex rethrottle tests with workers
Make forbidden APIs be quieter about classpath warnings (#21443)
...
* Allows for an array of index template patterns to be provided to an
index template, and rename the field from 'template' to 'index_pattern'.
Closes#20690
Applied (almost) the same rules we use to validate index names
to new alias names. The only rule not applies it "must be lowercase".
We have tests that don't follow that rule and I except there are lots
of examples of camelCase alias names in the wild. We can add that
validation but I'm not sure it is worth it.
Closes#20748
Adds an alias that starts with `#` to the BWC index and validates
that you can read from it and remove it. Starting with `#` isn't
allowed after 5.1.0/6.0.0 so we don't create the alias or check it
after those versions.