Shard requests like broadcast delete and delete by query, that needs to be executed on primary and all replicas, get read and written out to the transport on the same node. That means that if we add some field version checks are not enough to maintain bw comp since a newer node that holds the primary might receive the request from an older node, that didn't provide the field. Yet, when writing the request out again to a newer node that holds the replica, we do try and serialize the field although it's missing. The newer fields just needs to be set to optional in these cases, in addition to the version checks.
Re-enabled testDeleteByQuery and testDeleteRoutingRequired bw comp tests since this was the cause of their failures.
Closes#7406
The verifyThreadNames starts a node and checks that all new threads on the JVM are properly named. The current test uses the name of the new node which sometimes fails because our shared cluster spawns a new thread which is properly named but for not for the new name.
The commits relaxes the requirement of the test and on verify the threads are properly named (but not necessarily of the new node)
Items with no specified field now defaults to all the possible fields from the
document source. Previously, we had required 'fields' to be specified either
as a top level parameter or for each item. The default behavior is now similar
to the MLT API.
Closes#7382
This commit changes the way how files are selected for retransmission
on recovery / restore. Today this happens on a per-file basis where the
rather weak checksum and the file length in bytes is compared to check if
a file is identical. This is prone to fail in the case of a checksum collision
which can happen under certain circumstances.
The changes in this commit move the identity comparsion to a per-commit / per-segment
level where files are only treated as identical iff all the other files in the
commit / segment are the same. This "all or nothing" strategy is reducing the chance for
a collision dramatically since we also use a strong hash to identify commits / segments
based on the content of the ".si" / "segments.N" file.
Closes#7351
resetting the clients on each test (in after test) makes the tests running, especially in network mode, much slower, since transport client needs to be created each time when randmized to be used. Also, on OSX, the excessive connections causes bind exceptions eventually which makes running the network tests much harder on it.
closes#7329
Today we always use the latest commit point to return the metadata from
the store. This might cause problems for snapshot and restore since in
contrast to recovery it won't prevent concurrent flushes (lucene commits).
This can lead to all kinds of interesting effects if we are snapshotting
while flushing. This change uses the IndexCommit to open the metadata snapshot
from the store which is consistent with what we snapshot.
Closes#7376
Changes the serialisation of pre and post offset to use Long instead of VLong so that negative values are supported. This actually only showed up in the case where minDocCount=0 as the rounding is only serialised in this case.
Closes#7312
The term vector API can now generate term vectors on the fly, if the terms are
not already stored in the index. This commit exploits this new functionality
for the MLT query. Now the terms are directly retrieved using multi-
termvectors API, instead of generating them from the texts retrieved using the
multi-get API.
Closes#7014
Today we have very confusing naming since some methods names claim to
read binary but in fact read utf-16 and convert to utf-8 under the hood.
This commit clarifies the interfaces and adds additional documentation.
Closes#7367
These options are not used anymore. Instead numeric values can now contain dups
and it is the responsibility of the aggregation to deal with it (eg. terms).
And otherwise all values sources are now sorted, which is the contract of the
interfaces that they implement.
Close#7276
A request that relates to indices (`IndicesRequest` or `CompositeIndicesRequest`) might be converted to some other internal request(s) (e.g. shard level request) that get distributed over the cluster. Those requests contain the concrete index they refer to, but it is not known which indices (or aliases or expressions) the original request related to.
This commit makes sure that the original indices are available as part of the shard level requests and makes them implement `IndicesRequest` as well.
Also every internal request should be created passing in the original request, so that the original headers, together with the eventual original indices and options, get copied to it. Corrected some places where this information was lost.
NOTE: As for the bulk api and other multi items api (e.g. multi_get), their shard level requests won't keep around the whole set of original indices, but only the ones that related to the bulk items sent to each shard, the important bit is that we keep the original names though, not only the concrete ones.
Closes#7319
This change fixes the creation circle shapes o it calculates it correctly instead of essentially using the diameter as the radius. The radius has to be converted into degrees but calculating the ratio of the desired radius to the circumference of the earth and then multiplying it by 360 (number of degrees around the earths circumference). This issue here was that it was only multiplied by 180 making the result out by a factor of 2. Also made the test for circles actually check to make sure it has the correct centre and radius.
Closes#7301
Recent test failures triggered by #7289 were caused by this, simply because internal node settings (transport type key) that are not supported by the external older nodes were copied to them by mistake.
* Removed & refactored unused module code
* Allowed to set transports programmatically
* Allow to set the source of the changed transport
Note: The current implementation breaks BWC as you need to specify a concrete
transport now instead of a module if you want to use a different
Transport or HttpServerTransport
Closes#7289
TransportShardSingleOperationAction is currently subclassed by different transport actions. Some of them are internal only, meaning that their execution will take place only in the same node where their parent execution took place. That means that their main transport handler doesn't need to be registered, the only transport handler that's needed is the shard level one.
Added `isSubAction` method (defaults to false) to the parent class that tells whether the action is a main one or a subaction, used to decide whether we need to register the main transport handler.
Closes#7285
These two aggregators basically do exactly the same thing, they just interpret
bytes differently. This refactoring found an (unreleased) bug in the long terms
aggregator which didn't work correctly with duplicate values.
Close#7279
This was causing too much work e.g. when pulling node stats or when
opening a new reader, because the least_used distributor would
unnecessarily check free disk space on all path.data entires every
time we try to open a file for reading or check its length.
Closes#7306Closes#7323
This issue has been fixed in commons-cli:1.3 project which sadly has not been released yet.
See https://issues.apache.org/jira/browse/CLI-183
This patch builds another list of options with no selected groups by default.
When commons-cli:1.3 will be released, we need to remove this patch.
Closes#7282.
Switch management threads to a fixed thread pool with up to 5 threads, and queue size of 100 by default, after which excess incoming requests are rejected.
Closes#7318Closes#7320
The score is explained already, it should not be again explained per function.
Also, remove explanation from parameter list of ScoreFunction#explainScore()
and leave only the score.
This also removes ExplainableSearchScript which is not used anywhere and
was the only reason to have the Explanation in the parameter anyway.
closes#7245
The terms and histogram aggregations always have an order. So it would make the
response easier to consume to return the buckets as a list instead of a
collection in order to make it easier to do things like getting the first/last
buckets.
Close#7275
This change means that the default settings for expand_wildcards are only applied if the expand_wildcards parameter is not specified rather than being set upfront. It also adds the none and all options to the parameter to allow the user to specify no expansion and expansion to all indexes (equivalent to 'open,closed')
Closes#7258
Added @Nullable to:
- IndicesService.indexService
- IndexService.shard
- IndexService.shardInjector
This change doesn't try to do anything smart but just makes sure that a
*MissingException is thrown instead of a NullPointerException when the requested
object doesn't exist.
Close#7251
Also replaced int,String pair with ShardId that holds the same info and serializes it the same way.
Replaced shardId and index getters in BroadcastOperationRequest with a single ShardId getter.
Closes#7255
The geohash grid it 8 cells wide and 4 cells tall. GeoHashUtils.neighbor(String,int,int.int) set the limit of the number of cells in y to < 3 rather than <= 3 resulting in it either not finding all neighbours or incorrectly searching for a neighbour in a different parent cell.
Closes#7226
This change stores the index creation time in the index metadata when an index is created. The creation time cannot be changed but can be set as part of the create index request to allow for correct creation times for historical data.
Closes#7119
In order to have the possibility of debugging on the command line, the user
now can either set the es.cli.debug system property
which results in stack traces being written to to the terminal.
Closes#7222
A range filter on a date field with a numeric `from`/`to` value is **not** cached by default:
DELETE /test
PUT /test/t/1
{
"date": "2014-01-01"
}
GET /_validate/query?explain
{
"query": {
"filtered": {
"filter": {
"range": {
"date": {
"from": 0
}
}
}
}
}
}
Returns:
"explanation": "ConstantScore(no_cache(date:[0 TO *]))"
This patch fixes as well not caching `from`/`to` when using `now` value not rounded.
Previously, a query like:
GET /_validate/query?explain
{
"query": {
"filtered": {
"filter": {
"range": {
"date": {
"from": "now"
"to": "now/d+1"
}
}
}
}
}
}
was cached.
Also, this patch does not cache anymore `now` even if the user asked for caching it.
As it won't be cached at all by definition.
Added as well tests for all possible combinations.
Closes#7114.
When installing a bin only plugin, it is identified as a site plugin.
A current workaround would be to create in the zip file another empty dir. So if you have:
* `bin/myfile.sh`
* `empty/empty.txt`
the `bin` content will be extracted as expected.
Closes#7152.
indexRandom will try to delete bogus documents multiple times since they get tracked by indexOrAlias/id, and after the actual deletion any other attempt throws error and fails the test
An anti-pattern that we have in our code, noticeable for java API users, is that we modify incoming requests by replacing the index or alias with the concrete index. This way not only the request has changed, but all following communications that use that request will lose the information on whether the original request was performed against an alias or an index.
Refactored the following base classes: `TransportShardReplicationOperationAction`, `TransportShardSingleOperationAction`, `TransportSingleCustomOperationAction`, `TransportInstanceSingleOperationAction` and all subclasses by introduced an InternalRequest object that contains the original request plus additional info (e.g. the concrete index). This internal request doesn't get sent over the transport but rebuilt on each node on demand (not different to what currently happens anyway, as concrete index gets set on each node). When the request becomes a shard level request, instead of using the only int shardId we serialize the ShardId that contains both concrete index name (which might then differ ffrom the original one within the request) and shard id.
Using this pattern we can move get, multi_get, explain, analyze, term_vector, multi_term_vector, index, delete, update, bulk to not replace the index name with the concrete one within the request. The index name within the original request will stay the same.
Made it also clearer within the different transport actions when the index needs to be resolved and when that's not needed (e.g. shard level request), by exposing `resolveIndex` method. Moved check block methods to parent classes as their content was always the same on every subclass.
Improved existing tests by randomly introducing the use of an alias, and verifying that the responses always contain the concrete index name and not the original one, as that's the expected behaviour.
Added backwards compatibility tests to make sure that the change is applied in a backwards compatible manner.
Closes#7223