By default the reroute API should return the new cluster state, excluding the metadata. It was however it was wrongly using an old parameter (filter_metadata) and thus failed to do so. This commits restores but wiring it to the correct `metric` parameter. We also add an enum representing the possible metrics, to avoid similar future mistakes.
Closes#7520Closes#7523
The functionality of copying headers in the REST layer (from REST requests to transport requests) remains the same. Made it a bit nicer by introducing a RestClientFactory component that is a singleton and allows to register useful headers without requiring static methods.
Plugins just have to inject the RestClientFactory now, and call its `addRelevantHeaders` method that is not static anymore.
Relates to #6513Closes#7594
Also renamed HeadersCopyClientTests since it tests a class that was renamed and removed randomization around client wrapper, as it now needs to be weapped all the time to copy the context, doesn't depend on useful headers that have been registered anymore.
Relates to #7610
Today we throw an error if local transport is configured with BWC tests.
Yet, the BWC test need network to be enabled so test can just set the
required defaults.
Closes#7660
Previously we incorrectly sent them in the wrong order, which can cause
validators not to be run for dynamic settings that have been added
matching a particular wildcard.
Fixes#7651
get rid of readSource() entirely, it sucked, Operations should be able
to provide the source themselves.
No more TranslogStream headers, you are now required to pass an
StreamInput or StreamOutput for all operations, which means no extra
state is needed and no need to construct new versions when detecting the
version.
Read and write translog op sizes in TranslogStreams
Previously we handled these integers outside of the translog stream
itself, which was very unclean because other code had to know about
reading the size, or about writing the correct header sometimes.
There is some additional code in LocalIndexShardGateway to handle the
legacy case for older translogs, because we need to read and discard the
size in order to maintain the compatibility for the streaming
operations (they did not read or write the size for 1.3.x and earlier).
Additionally, we need to handle a case where the header is truncated
when recovering from disk
Use a NoopStreamOutput instead of byte arrays
Instead of writing translog operations to a temporary byte array and
then writing that byte array to the stream, we now write the operation
twice, once to a No-op stream to get the size, then again to the real
size.
This trades a little more CPU usage for less memory usage.
Today we serialize the snashot metadata to a byte array and then copy
the byte array to a stream. Instead this commit moves the serialization
directly to the target stream without the intermediate representation.
Closes#7637
We keep around a noop stats indicating how many update operations ended up not updating the document (typically because it didn't change). However, the TransportUpdateAction update that counter only after returning the result. This can throw off stats check which are done immediately after, potentially causing test failures.
Closes#7639
Today there are two different ways to cleanup search contexts which can
potentially lead to double releasing of a context. This commit unifies
the methods and prevents double closing.
Closes#7625
Updates on the _timestamp field were silently ignored.
Now _timestamp undergoes the same merge as regular
fields. This includes exceptions if a property cannot
be changed.
"path" and "default" cannot be changed.
closes#5772closes#6958closes#7614
partially fixes#777
The terms aggregation can now support sorting on multiple criteria by replacing the sort object with an array or sort object whose order signifies the priority of the sort. The existing syntax for sorting on a single criteria also still works.
Contributes to #6917
Reduce Phases can be expensive and some of them like the aggregations
reduce phase might even execute a one-off call via an internal client
that might cause a deadlock due to execution on the network thread
that is needed to handle the one-off call. This commit dispatches
the reduce phase to the search threadpool to ensure we don't wait
for the current thread to be available.
Closes#7623
Lucene's experimental codecs (from the codecs module) do not provide
backwards compatibility and are free to change from release to
release. When they do change, they typically cannot in general read
older indices and the resulting exceptions look like index corruption.
So, we are removing built-in support for them to prevent applications
from choosing one and then seeing strange exceptions on upgrade.
Closes#7566Closes#7604
Enables filtering the actions on both sides - request and response. Also added a base class for filter implementations (cleans up filters that only need to filter one side)
Also refactored the filter & filter chain methods to more intuitive names
BlobContainer used to provide async APIs which are not used
internally. The implementation of these APIs are also not async
by nature and neither is any of the pluggable BlobContainers. This
commit simplifies the API to a simple input / output stream and
reduces the hierarchy of BlobContainer dramatically.
NOTE: This is a breaking change!
Closes#7551
Similar to the one in `TransportMessage`. Added the `ContextHolder` base class where both `TransportMessage` and `RestRequest` derive from
Now next to the known headers, the context is always copied over from the rest request to the transport request (when the injected client is used)
This adds the ability to the Term Vector API to generate term vectors for
artifical documents, that is for documents not present in the index. Following
a similar syntax to the Percolator API, a new 'doc' parameter is used, instead
of '_id', that specifies the document of interest. The parameters '_index' and
'_type' determine the mapping and therefore analyzers to apply to each value
field.
Closes#7530
The get, put and delete indexed script apis map to get, index and delete api and internally create those corresponding requests. We need to make sure that the original headers are handed over to the new request by passing the original request in the constructor when creating the new one.
Also streamlined the support for version and version_type in the REST layer since the parameters were not consistently parsed and set to the internal java API requests.
Modified the REST delete template and delete script actions to make use of a client instead of using the `ScriptService` directly.
Closes#7569
Removed CHM in favour of an OpenHashMap and synchronized accessor/mutator methods. Also, the context is now lazily inititialied (just like we do with the headers)
The useful headers are now stored into a `Set` instead of an array so we can easily deduplicate them. A set is also returned instead of an array by the `usefulHeaders` static getter.
Relates to #6513Closes#7590
Serialization if "index" setting for boost did not work since
the serialization was just true/false instead of valid options
"no"/"not_analyzed"/"analyzed".
closes#7557
`GetIndexedScriptRequest` now extends `ActionRequest` instead of `SingleShardOperationRequest`, as the index field that was provided with the previous base class is not needed (hardcoded).
Closes#7553
Aggregations are collection-wide statistics, which is incompatible with the
collection mode of search_type=SCAN since it doesn't collect all matches on
calls to the search API.
Close#7429
Aggregations are collection-wide statistics so they would always be the same.
In order to save CPU/bandwidth, we can just return them on the first page.
Same as #1642 but for aggregations.
After a node fails to respond to a ping correctly (master or node fault detection), they are removed from the cluster state through an UpdateTask. When a node is removed, a background task is scheduled using the generic threadpool to actually disconnect the node. However, in the case of temporary node failures (for example) it may be that the node was re-added by the time the task get executed, causing an untimely disconnect call. Disconnect is cheep and should be done during the UpdateTask.
Closes#7543
Enable lucene verification of checksums on segments before merging them.
This prevents corruption from existing segments from silently slipping into
newer merged segments.
Closes#7360
System properties are typically set via the command line and therefore override the node settings. If one has `node.local=true` or `node.mode=local` it can result in cryptic error messages during the test run.
System properties are typically set via the command line and therefore override the node settings. If one has `node.local=true` or `node.mode=local` it can result in cryptic error messages during the test run.
The global cluster gets created from a static block and shared through all tests in the same jvm. The `buildTestCluster` method can't get called passing in `Scope.GLOBAL`, hence removed its mention from it as it might be misleading. The only two scopes supported within the `buildTestCluster` method are `SUITE` and `TEST`.
Merging the accumulated work from the feautre/improve_zen branch. Here are the highlights of the changes:
__Testing infra__
- Networking:
- all symmetric partitioning
- dropping packets
- hard disconnects
- Jepsen Tests
- Single node service disruptions:
- Long GC / Halt
- Slow cluster state updates
- Discovery settings
- Easy to setup unicast with partial host list
__Zen Discovery__
- Pinging after master loss (no local elects)
- Fixes the split brain issue: #2488
- Batching join requests
- More resilient joining process (wait on a publish from master)
Closes#7493
Previous implementation used a marker interface and had no explicit failure call back for the case update task was run on a non master (i.e., the master stepped down after it was submitted). That lead to a couple of instance of checks.
This approach moves ClusterStateUpdateTask from an interface to an abstract class, which allows adding a flag to indicate whether it should only run on master nodes (defaults to true). It also adds an explicit onNoLongerMaster call back to allow different error handling for that case. This also removed the need for the NoLongerMaster.
Closes#7511
We currently have two ways to randomize the number of shards and replicas: random index template, that stays the same for all indices created under the same scope, and the overridable `indexSettings` method, called by `createIndex` and `prepareCreate` which returns different values each time.
Now that the `randomIndexTemplate` method is not static anymore, we can easily apply the same logic to both. Especially for number of replicas, we used to have slightly different behaviours, where more than one replicas were only rarely used through random index template, which gets now applied to the `indexSettings` method too (might speed up the tests a bit)
Side note: `randomIndexTemplate` had its own logic which didn't depend on `numberOfReplicas` or `maximumNumberOfReplicas`, which was causing bw comp tests failures since in some cases too many copies of the data are requested, which cannot be allocated to older nodes, and the write consistency quorum cannot be met, thus indexing times out.
Closes#7522
Settings that are not default for _size, _index and _timestamp were only build in
toXContent if these fields were actually enabled.
_timestamp, _index and _size can be dynamically enabled or disabled.
Therfore the settings must be kept, even if the field is disabled.
(Dynamic enabling/disabling was intended, see TimestampFieldMapper.merge(..)
and SizeMappingTests#testThatDisablingWorksWhenMerging
but actually never worked, see below).
To avoid that _timestamp is overwritten by a default mapping
this commit also adds a check to mapping merging if the type is already
in the mapping. In this case the default is not applied anymore.
(see
SimpleTimestampTests#testThatUpdatingMappingShouldNotRemoveTimestampConfiguration)
As a side effect, this fixes
- overwriting of paramters from the _source field by default mappings
(see DefaultSourceMappingTests).
- dynamic enabling and disabling of _timestamp and _size ()
(see SimpleTimestampTests#testThatTimestampCanBeSwitchedOnAndOff and
SizeMappingIntegrationTests#testThatTimestampCanBeSwitchedOnAndOff )
Tests:
Enable UpdateMappingOnClusterTests#test_doc_valuesInvalidMappingOnUpdate again
The missing settings in the mapping for _timestamp, _index and _size caused a the
failure: When creating a mapping which has settings other than default and the
field disabled, still empty field mappings were built from the type mappers.
When creating such a mapping, the mapping source on master and the rest of the cluster
can be out of sync for some time:
1. Master creates the index with source _timestamp:{_store:true}
mapper classes are in a correct state but source is _timestamp:{}
2. Nodes update mapping and refresh source which then completely misses _timestamp
3. After a while source is refreshed again also on master and the _timestamp:{}
vanishes there also.
The test UpdateMappingOnCusterTests#test_doc_valuesInvalidMappingOnUpdate failed
because the cluster state was sampled from master between 1. and 3. because the
randomized testing injected a default mapping with disabled _size and _timestamp
fields that have settings which are not default.
The test
TimestampMappingTests#testThatDisablingFieldMapperDoesNotReturnAnyUselessInfo
must be removed because it actualy expected the timestamp to remove
parameters when it was disabled.
closes#7137
The root endpoint returns basic information about this node, like it's name and ES version etc. The cluster name is an important information that belongs in that list.
Closes#7524
The reverse_nested aggregator requires that the emitted doc ids are always in ascending order, which is already enforced on the scorer level,
but this also needs to be enforced on the nested aggrgetor level otherwise incorrect counts are a result.
Closes#7505Closes#7514
During a test run we have a global shared cluster and potentially a suite level or even a test level cluster running. All of those share the same node name pattern (node_#). This can be confusing if you're debugging discovery related tests where those nodes from the different clusters potentially interact (and reject each other). This commit gives each cluster type a unique prefix to make tracing and log filtering simpler.
Closes#7518
Comparisons for the BigArrays breaker use "greater than" instead of
"greater than or equal", which was never an issue before because the
test size was not right on a page boundary. A test with an exactly
divisible page boundary (4mb exactly in this case) caused the sizes to
be equal to, but not exceed, the limit, and never break.
The limit should be smaller than the test increments the breaker anyway.