Similar to the one in `TransportMessage`. Added the `ContextHolder` base class, from which both `TransportMessage` and `RestRequest` derive.
In addition to the known headers, the context is now always copied over from the rest request to the transport request (when the injected client is used).
This adds the ability to the Term Vector API to generate term vectors for
artificial documents, that is, for documents not present in the index. Following
a similar syntax to the Percolator API, a new 'doc' parameter is used, instead
of '_id', that specifies the document of interest. The parameters '_index' and
'_type' determine the mapping and therefore analyzers to apply to each value
field.
Closes #7530
The get, put and delete indexed script APIs map to the get, index and delete APIs and internally create the corresponding requests. We need to make sure that the original headers are handed over to the new request by passing the original request in the constructor when creating the new one.
Also streamlined the support for version and version_type in the REST layer, since the parameters were not consistently parsed and set on the internal Java API requests.
Modified the REST delete template and delete script actions to make use of a client instead of using the `ScriptService` directly.
Closes #7569
Removed CHM in favour of an OpenHashMap and synchronized accessor/mutator methods. Also, the context is now lazily initialized (just like we do with the headers).
The useful headers are now stored into a `Set` instead of an array so we can easily deduplicate them. A set is also returned instead of an array by the `usefulHeaders` static getter.
Relates to #6513
Closes #7590
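A minimal sketch of the lazily initialized context with synchronized accessor/mutator methods described above. The class and method names (`ContextHolderSketch`, `putInContext`, `getFromContext`, `hasContext`, `copyContextFrom`) are illustrative assumptions, and `java.util.HashMap` stands in for the OpenHashMap the commit mentions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the actual Elasticsearch class.
class ContextHolderSketch {
    // Stays null until the first write, so requests without context allocate nothing.
    private Map<Object, Object> context;

    public final synchronized Object putInContext(Object key, Object value) {
        if (context == null) {
            context = new HashMap<>(2);
        }
        return context.put(key, value);
    }

    public final synchronized Object getFromContext(Object key) {
        return context == null ? null : context.get(key);
    }

    public final synchronized boolean hasContext() {
        return context != null && !context.isEmpty();
    }

    // Copies every entry from another holder, e.g. rest request -> transport request.
    public final void copyContextFrom(ContextHolderSketch other) {
        Map<Object, Object> snapshot;
        // Lock the two holders one after the other, never nested, to avoid deadlocks.
        synchronized (other) {
            if (other.context == null) {
                return;
            }
            snapshot = new HashMap<>(other.context);
        }
        synchronized (this) {
            if (context == null) {
                context = new HashMap<>(snapshot.size());
            }
            context.putAll(snapshot);
        }
    }
}
```

With this shape, a transport request created from a rest request would call `copyContextFrom(restRequest)` once, right after the known headers are copied.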
Serialization of the "index" setting for boost did not work since
the serialization was just true/false instead of valid options
"no"/"not_analyzed"/"analyzed".
Closes #7557
`GetIndexedScriptRequest` now extends `ActionRequest` instead of `SingleShardOperationRequest`, as the index field that was provided with the previous base class is not needed (hardcoded).
Closes #7553
Aggregations are collection-wide statistics, which is incompatible with the
collection mode of search_type=SCAN since it doesn't collect all matches on
calls to the search API.
Closes #7429
Aggregations are collection-wide statistics so they would always be the same.
In order to save CPU/bandwidth, we can just return them on the first page.
Same as #1642 but for aggregations.
After a node fails to respond to a ping correctly (master or node fault detection), it is removed from the cluster state through an UpdateTask. When a node is removed, a background task is scheduled using the generic threadpool to actually disconnect the node. However, in the case of temporary node failures (for example), the node may have been re-added by the time the task gets executed, causing an untimely disconnect call. Disconnecting is cheap and should be done during the UpdateTask.
Closes #7543
Enable lucene verification of checksums on segments before merging them.
This prevents corruption from existing segments from silently slipping into
newer merged segments.
Closes #7360
System properties are typically set via the command line and therefore override the node settings. If one has `node.local=true` or `node.mode=local` it can result in cryptic error messages during the test run.
The global cluster is created in a static block and shared across all tests in the same JVM. The `buildTestCluster` method can't be called with `Scope.GLOBAL`, so the mention of that scope was removed from the method since it might be misleading. The only two scopes supported within the `buildTestCluster` method are `SUITE` and `TEST`.
Merging the accumulated work from the feature/improve_zen branch. Here are the highlights of the changes:
__Testing infra__
- Networking:
    - all symmetric partitioning
    - dropping packets
    - hard disconnects
    - Jepsen Tests
- Single node service disruptions:
    - Long GC / Halt
    - Slow cluster state updates
- Discovery settings
    - Easy to setup unicast with partial host list
__Zen Discovery__
- Pinging after master loss (no local elects)
- Fixes the split brain issue: #2488
- Batching join requests
- More resilient joining process (wait on a publish from master)
Closes #7493
The previous implementation used a marker interface and had no explicit failure callback for the case where an update task was run on a non-master node (i.e., the master stepped down after the task was submitted). That led to a couple of instanceof checks.
This approach moves ClusterStateUpdateTask from an interface to an abstract class, which allows adding a flag to indicate whether it should only run on master nodes (defaults to true). It also adds an explicit onNoLongerMaster callback to allow different error handling for that case. This also removed the need for the NoLongerMaster.
Closes #7511
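The shape of the change can be sketched as follows. This is only an illustration under stated assumptions: apart from `onNoLongerMaster`, every name here (`UpdateTaskSketch`, `runOnlyOnMaster`, `onFailure`, `UpdateTaskRunnerSketch`) is hypothetical, and the cluster state is modeled as a plain `String`.

```java
// Hypothetical sketch; only the onNoLongerMaster name comes from the commit message.
abstract class UpdateTaskSketch {
    // Applies the change and returns the new state.
    public abstract String execute(String currentState);

    // Generic failure callback (assumed to exist for this sketch).
    public abstract void onFailure(String source, Throwable t);

    // The flag: tasks run only on the master unless a subclass overrides this.
    public boolean runOnlyOnMaster() {
        return true;
    }

    // Called instead of execute() when the local node is no longer master.
    // By default it is treated like any other failure.
    public void onNoLongerMaster(String source) {
        onFailure(source, new IllegalStateException("no longer master, source: [" + source + "]"));
    }
}

final class UpdateTaskRunnerSketch {
    // Runs the task, or notifies it if the local node stepped down in the meantime.
    static String run(String source, UpdateTaskSketch task, String state, boolean localNodeIsMaster) {
        if (!localNodeIsMaster && task.runOnlyOnMaster()) {
            task.onNoLongerMaster(source);
            return state; // state is left untouched
        }
        return task.execute(state);
    }
}
```

A task that wants dedicated handling for a lost mastership overrides `onNoLongerMaster`; the caller no longer needs an instanceof check on a marker interface.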
We currently have two ways to randomize the number of shards and replicas: random index template, that stays the same for all indices created under the same scope, and the overridable `indexSettings` method, called by `createIndex` and `prepareCreate` which returns different values each time.
Now that the `randomIndexTemplate` method is not static anymore, we can easily apply the same logic to both. Especially for the number of replicas, we used to have slightly different behaviours: more than one replica was only rarely used through the random index template, and the same now applies to the `indexSettings` method too (which might speed up the tests a bit).
Side note: `randomIndexTemplate` had its own logic which didn't depend on `numberOfReplicas` or `maximumNumberOfReplicas`. This was causing backwards compatibility test failures, since in some cases too many copies of the data were requested which could not be allocated to older nodes, so the write consistency quorum could not be met and indexing timed out.
Closes #7522
Settings that are not default for _size, _index and _timestamp were only built in
toXContent if these fields were actually enabled.
_timestamp, _index and _size can be dynamically enabled or disabled.
Therefore the settings must be kept, even if the field is disabled.
(Dynamic enabling/disabling was intended, see TimestampFieldMapper.merge(..)
and SizeMappingTests#testThatDisablingWorksWhenMerging
but actually never worked, see below).
To avoid that _timestamp is overwritten by a default mapping
this commit also adds a check to mapping merging if the type is already
in the mapping. In this case the default is not applied anymore.
(see
SimpleTimestampTests#testThatUpdatingMappingShouldNotRemoveTimestampConfiguration)
As a side effect, this fixes
- overwriting of parameters from the _source field by default mappings
(see DefaultSourceMappingTests).
- dynamic enabling and disabling of _timestamp and _size
(see SimpleTimestampTests#testThatTimestampCanBeSwitchedOnAndOff and
SizeMappingIntegrationTests#testThatTimestampCanBeSwitchedOnAndOff )
Tests:
Enable UpdateMappingOnClusterTests#test_doc_valuesInvalidMappingOnUpdate again
The missing settings in the mapping for _timestamp, _index and _size caused the
failure: When creating a mapping which has settings other than default and the
field disabled, still empty field mappings were built from the type mappers.
When creating such a mapping, the mapping source on master and the rest of the cluster
can be out of sync for some time:
1. Master creates the index with source _timestamp:{_store:true}
mapper classes are in a correct state but source is _timestamp:{}
2. Nodes update mapping and refresh source which then completely misses _timestamp
3. After a while source is refreshed again also on master and the _timestamp:{}
vanishes there also.
The test UpdateMappingOnClusterTests#test_doc_valuesInvalidMappingOnUpdate failed
because the cluster state was sampled from master between 1. and 3. because the
randomized testing injected a default mapping with disabled _size and _timestamp
fields that have settings which are not default.
The test
TimestampMappingTests#testThatDisablingFieldMapperDoesNotReturnAnyUselessInfo
must be removed because it actually expected the timestamp to remove
parameters when it was disabled.
Closes #7137
The root endpoint returns basic information about this node, like its name, ES version, etc. The cluster name is an important piece of information that belongs in that list.
Closes #7524
The reverse_nested aggregator requires that the emitted doc ids are always in ascending order, which is already enforced on the scorer level,
but this also needs to be enforced on the nested aggregator level, otherwise incorrect counts are the result.
Closes #7505
Closes #7514
During a test run we have a global shared cluster and potentially a suite level or even a test level cluster running. All of those share the same node name pattern (node_#). This can be confusing if you're debugging discovery related tests where those nodes from the different clusters potentially interact (and reject each other). This commit gives each cluster type a unique prefix to make tracing and log filtering simpler.
Closes #7518
Comparisons for the BigArrays breaker use "greater than" instead of
"greater than or equal", which was never an issue before because the
test size was not right on a page boundary. A test with an exactly
divisible page boundary (4mb exactly in this case) caused the sizes to
be equal to, but not exceed, the limit, and never break.
The limit should be smaller than the amount by which the test increments the breaker anyway.
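The boundary condition can be reproduced with a small sketch. The class below is hypothetical and is not the real breaker; it only models a limit check that uses "greater than", with an assumed page size of 16kb so that 256 pages add up to exactly 4mb.

```java
// Hypothetical sketch of a "greater than" limit check; not the real breaker class.
final class BreakerSketch {
    private final long limitBytes;
    private long usedBytes;

    BreakerSketch(long limitBytes) {
        this.limitBytes = limitBytes;
    }

    // Reserves bytes, failing only when the new total strictly exceeds the limit.
    void reserve(long bytes) {
        long newUsed = usedBytes + bytes;
        if (newUsed > limitBytes) {
            throw new IllegalStateException("would use " + newUsed + " bytes, limit is " + limitBytes);
        }
        usedBytes = newUsed;
    }

    // Returns true if reserving `pages` pages of `pageSize` bytes trips the breaker.
    static boolean trips(long limitBytes, long pageSize, int pages) {
        BreakerSketch breaker = new BreakerSketch(limitBytes);
        try {
            for (int i = 0; i < pages; i++) {
                breaker.reserve(pageSize);
            }
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }
}
```

With a 4mb limit, `BreakerSketch.trips(4L * 1024 * 1024, 16 * 1024, 256)` returns `false` because usage ends exactly at the limit, while lowering the limit by a single byte makes the same call return `true`.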
Today we have logic that removes a shard from the indexservice if
the shard has changed, i.e. from replica to primary, or if its recovery
source vanished etc. This can cause shards not to be allocated at
all on a node, causing delete requests to time out since we were waiting
for shards on nodes that got dropped due to an IndexShardMissingException.
Closes #7509
1) One issue reported by a user is due to the truncation of the geohash string. Added a JUnit test for this scenario.
2) Another suspect piece of code was the “toAutomaton” method that only merged the first of possibly many precisions into the result.
Closes #7368
When we corrupt a file in the snapshot/restore case we have to corrupt
a per-segment file. The .del file might change with the commit / flush
that is triggered by the snapshot operation.
This commit sets the default number of shards for the .scripts index to `1`; it also
forces the auto_expand replicas to `1-all`. This change means that script index GET requests to load
scripts from the index should always use the local copy of the scripts index, preventing any network traffic or calls
on script GET.
Today the iteration order of the interfaces might change across JVMs.
This commit cleans up the NetworkUtils class and attempts to ensure
consistent iteration order across JVMs.
For reliability and debug purposes each test JVM should use its own
TCP port range if executed in parallel. This also moves away from the
default port range to prevent conflicts with a running ES instance on the local
machine.
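One way to derive a non-overlapping range per test JVM is sketched below. All names and numbers are assumptions made for illustration: the JVM ordinal is taken to be a zero-based index handed out by the test runner, and the base port and range width are made-up values.

```java
// Hypothetical sketch; the base port and range width are made-up values.
final class TestPortRangeSketch {
    // Assumed value, chosen to sit outside the default 9200/9300 ports.
    static final int BASE_PORT = 19300;
    // Assumed width of the range reserved for each test JVM.
    static final int PORTS_PER_JVM = 100;

    // Returns a "low-high" range that does not overlap with any other JVM ordinal.
    static String portRangeFor(int jvmOrdinal) {
        if (jvmOrdinal < 0) {
            throw new IllegalArgumentException("jvmOrdinal must be >= 0");
        }
        int low = BASE_PORT + jvmOrdinal * PORTS_PER_JVM;
        int high = low + PORTS_PER_JVM - 1;
        return low + "-" + high;
    }
}
```

Because the ranges are disjoint, a port number seen in a log line also identifies the JVM it came from, which helps with debugging.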
Entries in this cache are cleaned up only when segments are merged away.
Nested and parent/child rely on the fact that type filters produce a FixedBitSet, the FixedBitSetFilterCache does this.
Also if nested and parent/child is configured the type filters are eagerly loaded by default via the FixedBitSetFilterCache.
Closes #7037
Closes #7031
ClusterState has had a reference to the cluster name since version 1.1.0 (df7474b9fc). However, if the state was sent from a master of an older version, this name can be set to null. This is unexpected and can cause bugs. The bad part is that it will never correct itself until a full cluster restart, where the cluster state is rebuilt using the code of the latest version.
This commit changes the default to the node's cluster name.
Relates to #7386
Closes #7414
RandomScoreFunction previously relied on the order the documents were
iterated in from Lucene. This caused changes in ordering, with the same
seed, if documents moved to different segments. With this change, a
murmur32 hash of the _uid for each document is used as the "random"
value. Also, the hash is adjusted so as to only return values between
0.0 and 1.0 to enable easier manipulation to fit into users' scoring
models.
Closes #6907, #7446
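The idea of deriving the "random" value from the document itself rather than from iteration order can be sketched like this. It is an illustration only: the mixing function is a Murmur3-style 32-bit finalizer used as a stand-in, not necessarily the hash used by the commit, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; the mixing function is a stand-in for the hash used by the commit.
final class RandomScoreSketch {
    // Murmur3-style 32-bit finalizer, used to scramble the bits.
    static int mix32(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    // Score in [0.0, 1.0) that depends only on the seed and the document's _uid,
    // so it stays the same when the document moves to a different segment.
    static float score(int seed, String uid) {
        int h = seed;
        for (byte b : uid.getBytes(StandardCharsets.UTF_8)) {
            h = mix32(h ^ (b & 0xff));
        }
        // Keep 24 bits so the value is exactly representable as a float.
        return (h & 0x00FFFFFF) / (float) (1 << 24);
    }
}
```

Calling `RandomScoreSketch.score(seed, uid)` twice with the same arguments always yields the same value, which is the property the change relies on.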
At the moment, when a node loses its connection to the master (due to a partition or because the master was stopped), we ping the unicast hosts in order to discover other nodes and elect a new master or learn of another master that has been elected in the meantime. This can go wrong if all unicast targets are on the same side of a minority partition, in which case the nodes there will never rejoin once the partition is healed.
Closes #7336
Requests are handled by the worker thread pool of the target node instead of the generic thread pool of the source node.
Also, this change is required in order to make GC disruption work with local transport. Previously the handling of a request was performed on a node that was being GC disrupted, resulting in some actions being performed while GC was being simulated.
The cluster state version allows resolving the case where an old master node becomes unresponsive and later wakes up and pings all the nodes in the cluster, allowing the newly elected master to decide whether it should step down or ask the old master to rejoin.
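A sketch of how such a decision could look. The rule used here (the master holding the higher cluster state version stays, the other one rejoins) is an assumed reading of the description above, and all names are hypothetical.

```java
// Hypothetical sketch; the comparison rule is an assumed reading of the commit message.
final class MasterConflictSketch {
    enum Action { STEP_DOWN_AND_REJOIN, ASK_OTHER_TO_REJOIN }

    // Decides what a node that believes it is master does when pinged by another master.
    static Action resolve(long localStateVersion, long otherMasterStateVersion) {
        if (otherMasterStateVersion > localStateVersion) {
            // The other master has seen newer cluster states, so the local node yields.
            return Action.STEP_DOWN_AND_REJOIN;
        }
        // The local state is at least as recent, so the other master should rejoin.
        return Action.ASK_OTHER_TO_REJOIN;
    }
}
```

In the scenario from the description, the newly elected master has kept publishing states while the old master was unresponsive, so it holds the higher version and asks the old master to rejoin.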