Percentiles are supposed to be monotonically increasing, but floating-point
rounding can slightly violate this and make the test fail if the checks are
too strict.
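For instance, a test can tolerate a tiny relative error instead of demanding strict ordering. A minimal sketch using JUnit/Hamcrest, with illustrative names:

```java
import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.Matchers.greaterThanOrEqualTo;

// Sketch: accept values that are non-decreasing up to a small relative
// tolerance, to absorb floating-point rounding in the percentile math.
static void assertAlmostMonotonic(double[] values) { // hypothetical helper
    for (int i = 1; i < values.length; i++) {
        double tolerance = Math.abs(values[i - 1]) * 1e-10;
        assertThat(values[i], greaterThanOrEqualTo(values[i - 1] - tolerance));
    }
}
```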
The validate API was failing to reject JSON input that had unsupported
fields placed after a supported field. This was causing invalid requests
to be reported as valid.
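A minimal sketch of the intended parsing loop, assuming an XContentParser-style pull parser and a hypothetical set of supported field names; the point is to keep consuming field names until END_OBJECT instead of stopping at the first supported one:

```java
import java.io.IOException;
import java.util.Set;

import org.elasticsearch.ElasticsearchParseException;
import org.elasticsearch.common.xcontent.XContentParser;

// Sketch: reject unsupported fields wherever they appear in the object,
// not only when they come before a supported field.
static void checkFields(XContentParser parser, Set<String> supportedFields) throws IOException {
    XContentParser.Token token;
    while ((token = parser.nextToken()) != XContentParser.Token.END_OBJECT) {
        if (token == XContentParser.Token.FIELD_NAME
                && !supportedFields.contains(parser.currentName())) {
            throw new ElasticsearchParseException("unsupported field [" + parser.currentName() + "]");
        }
    }
}
```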
Fixes#5685
Made sure that a match_all query is used when no query is specified, and ensured that no NPE is thrown.
Also used the same code path as the search API so that alias filters are taken into account, and the same for type filters.
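The default boils down to something like this sketch (the request accessor name is assumed):

```java
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;

// Sketch: fall back to match_all when no query was specified, so the
// downstream code never dereferences a null query.
QueryBuilder query = request.query() != null   // hypothetical accessor
        ? request.query()
        : QueryBuilders.matchAllQuery();
```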
Closes#6111, Closes#6112, Closes#6116
The More Like This API did not take the 'size' and 'from' request body parameters into account.
Instead, these values were always overridden by the default values of the REST parameters
'search_size' and 'search_from'.
Closes#5981
- Fix bug where repeatedly calling computeSummaryStatistics() could
accumulate some values incorrectly
- Fix check that the number of responsive nodes is <= the number of
  candidate benchmark nodes
- Add public getters for summary statistics
- Add javadoc for new getters
- Add javadoc comments about API use
- Improve abort and status tests by calling awaitBusy() to wait for jobs
to be completely submitted before testing them
In our REST tests we already have support for features and skip sections that allow skipping tests if a feature is not supported.
We can then add a skip section based on the benchmark feature to the benchmark tests and execute them only when they are supported, since they require at least one node with node.bench settings within the cluster. We can check that this requirement is met by calling the nodes info API.
This way we can dynamically decide whether to execute those tests, and we don't need to have a node.bench node around all the time. In fact, given that the REST tests use the GLOBAL cluster, we want to be able to randomize settings as much as possible and run tests against default settings as well. Also, this mechanism can easily be supported by the external cluster implementation that is used during the release process.
Introduced the ability to disable benchmark nodes, which is needed by BenchmarkNegativeTest.
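A sketch of that nodes info check via the Java client (builder method names assumed to match the 1.x API):

```java
import org.elasticsearch.action.admin.cluster.node.info.NodeInfo;
import org.elasticsearch.action.admin.cluster.node.info.NodesInfoResponse;

// Sketch: only run benchmark tests if at least one node in the cluster
// was started with node.bench set to true.
NodesInfoResponse response = client().admin().cluster()
        .prepareNodesInfo().setSettings(true).get();
boolean benchNodeAvailable = false;
for (NodeInfo nodeInfo : response.getNodes()) {
    if (nodeInfo.getSettings().getAsBoolean("node.bench", false)) {
        benchNodeAvailable = true;
        break;
    }
}
```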
The test failed because 'percent_terms_to_match' defaults to 0.3, which ends up
requiring that some terms found only in the queried document match, when
all the documents are on the same shard.
By default More Like This API excludes the queried document from the response.
However, when debugging or when comparing scores across different queries, it
could be useful to have the best possible matched hit. So this option lets users
explicitly specify the desired behavior.
Closes#6067
* The shardTopDocs array should be created with a size equal to the total number of shard-level requests, not the total number of requests that have a shard-level result.
* Make sure no null TopDocs entries are passed down to TopDocs#merge.
* Added dedicated scroll tests that test scrolling on an index that has missing shards due to node failure.
* Made sure that the sort fields in SimpleNestedTests exist by adding the fields to the mapping during index creation.
Closes#6022
- Fix bug where repeatedly calling computeSummaryStatistics() could
accumulate some values incorrectly.
- Fix check that the number of responsive nodes is <= the number of
  candidate benchmark nodes.
- Add public getters for summary statistics
- Add javadoc for new getters
- Add javadoc comments about API use
- Randomized integration tests for the benchmark API.
- Negative tests for cases where the cluster cannot run benchmarks.
- Return 404 on missing benchmark name.
- Allow specifying 'types' as an array in the JSON syntax when describing a benchmark competition.
- Don't record slowest for single-request competitions.
Closes#6003, #5906, #5903, #5904
Significant terms internally maintain a priority queue per shard with a size potentially
lower than the number of terms. This queue uses the score as criterion to determine if
a bucket is kept or not. If many terms with low subsetDF score very high
but `min_doc_count` is set high, this might result in no terms being
returned, because the pq is filled with low-frequency terms which are all
sorted out in the end.
This can be avoided by increasing the `shard_size` parameter to a higher value.
However, it is not immediately clear to which value this parameter must be set,
because we cannot know how many low-frequency terms are scored higher than
the high-frequency terms that we are actually interested in.
On the other hand, if there is no routing of docs to shards involved, we can maybe
assume that the documents of classes and also the terms therein are distributed evenly
across shards. In that case it might be easier to not add documents to the pq that have
subsetDF <= `shard_min_doc_count` which can be set to something like
`min_doc_count`/number of shards because we would assume that even when summing up
the subsetDF across shards `min_doc_count` will not be reached.
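For example, with `min_doc_count: 10` and 5 shards, `shard_min_doc_count` could be set to 10 / 5 = 2. A sketch on the Java API (the `shardMinDocCount` setter name is assumed, being what this change introduces):

```java
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.bucket.significant.SignificantTermsBuilder;

// Sketch: terms whose per-shard subsetDF is <= 2 are skipped early, since
// even summed across the 5 shards they would be unlikely to reach
// min_doc_count (assuming roughly even distribution of docs and terms).
SignificantTermsBuilder agg = AggregationBuilders.significantTerms("keywords")
        .field("body")
        .minDocCount(10)
        .shardMinDocCount(2); // setter name assumed
```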
closes#5998, closes#6041
Relates to #6059, where two new constants were introduced in IndicesOptions. There were already two constants there though, one of which we could have reused. This commit tries to unify them.
We currently compute initial sizings based on the cardinality of our fields.
This can be highly exaggerated for sub aggregations, for example if there is a
parent terms aggregation that is executed over a field that has a very long
tail: most buckets will only collect a couple of documents.
Close#5994
We switched to Lucene's SloppyMath way of computing an approximate value of
the earth diameter given a latitude in order to compute distances, yet the
bounding box optimization of the geo distance filter still assumed a constant
earth diameter, equal to the average.
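A minimal sketch of the idea, assuming Lucene 4.x's `SloppyMath.earthDiameter(latitude)` (which returns kilometers):

```java
import org.apache.lucene.util.SloppyMath;

// Sketch: size the bounding box with the same latitude-dependent earth
// diameter used for the distance computation, not a constant average.
static double[] latitudeBounds(double sourceLat, double distanceKm) {
    double diameterKm = SloppyMath.earthDiameter(sourceLat); // varies with latitude
    double radiusDeg = Math.toDegrees(distanceKm / (diameterKm / 2)); // angular radius
    return new double[] { sourceLat - radiusDeg, sourceLat + radiusDeg };
}
```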
Close#6008
This relates to #6040. The fix is twofold: first, do not handle a missing context specially in the search code, but behave the same as in non-scroll search, where an exception is raised if all the shards failed; second, apply this logic in both scroll cases.
Way back when ES started, there was an idea for a dump infrastructure, but ES ended up supporting its serviceability aspects through APIs. This removes the unused code.
Removed MetaData#concreteIndices variations that didn't require an IndicesOptions argument. Every caller should specify how indices should be resolved to concrete indices based on the indices options argument.
Closes#6059
RootMapper.validate was only used by the routing field mapper, which makes
buggy assumptions about how fields are indexed. For example, it assumes that
the index representation of a field is the same as its external representation.
Close#5844
A bad/non-existing scroll ID used to return a 200; a 404 is more useful.
Also, this PR returns the right exception (SearchContextMissingException) in the Java API.
Additionally, added the StatusToXContent interface and the RestStatusToXContentListener listener, so
the appropriate RestStatus can be returned.
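The new interface is essentially content that also knows its HTTP status (a sketch; the actual declaration may differ slightly):

```java
import org.elasticsearch.common.xcontent.ToXContent;
import org.elasticsearch.rest.RestStatus;

// Sketch: responses implementing this let RestStatusToXContentListener
// pick the HTTP status code from the response itself.
public interface StatusToXContent extends ToXContent {
    RestStatus status();
}
```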
Closes#5729
Similar to the search removal, the operation threading options are not really used, and the default should always be used. This also considerably simplifies the code.
A side effect is that we can now remove the ShardIterator#firstOrNull method, which could cause sneaky bugs.
closes#6044
The analyze API used the standard analyzer from lucene and therefore removed
stopwords instead of using the elasticsearch default analyzer.
Closes#5974
The possibility of filtering for index templates in the cluster state API
had been introduced before there was a dedicated index templates API. This
commit removes this support from the cluster state API, as it was not really
clean, requiring you to specify the metadata and the index templates.
Closes#4954
Search operation threading is an option that is not really used, and the current non-default implementations are flawed. Handling it also adds quite a bit of complexity to the search handling codebase...
This is a breaking change, but actually a good one, since I haven't seen/heard of anybody using it, and if it's used, it's problematic...
closes#6042
Change #5561 introduced a potential bug in that iterations performed
on one thread might not be visible to other threads, due to the removal of the
`volatile` keyword.
Close#6039
When a thread pool rejects the execution on the local node, the search might not return.
This happens because we move to the next shard only *within* the execution on the thread pool, in the start method. If submitting the task to the thread pool fails, we go through the fail-shard logic, but without "counting" the current shard itself. When this happens, the relevant shard will then execute more times than intended, skewing the total ops counter; for example, if the search is successful on another shard, total ops will be incremented *beyond* expectedTotalOps, so the == check used as the exit condition never triggers.
The fix here makes sure that the shard iterator properly progresses even in the case of rejections, and also improves when a context cleanup is sent in case of failures (which was exposed by the test).
Though the change fixes the problem, we should work on simplifying the code path considerably. The first suggestion as a follow-up is to remove the support for operation threading (also in broadcast) and move the local optimization execution to SearchService; this will simplify the code in the different search actions considerably and will allow removing the problematic #firstOrNull method on the shard iterator.
The second suggestion is to move the optimization of local execution to the TransportService, so all actions will not have to explicitly do the mentioned optimization.
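A hedged sketch of the fix's shape in the per-shard dispatch (names approximate the ones in TransportSearchTypeAction):

```java
import org.elasticsearch.common.util.concurrent.EsRejectedExecutionException;
import org.elasticsearch.threadpool.ThreadPool;

// Sketch: if the thread pool rejects the task, route the failure through
// the same path as any other shard failure, so the shard iterator
// advances and the total ops accounting stays consistent.
try {
    threadPool.executor(ThreadPool.Names.SEARCH).execute(new Runnable() {
        @Override
        public void run() {
            performFirstPhase(shardIndex, shardIt, shard); // approximate signature
        }
    });
} catch (EsRejectedExecutionException e) {
    onFirstPhaseResult(shardIndex, shard, shardIt, e); // counts the failure
}
```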
fixes#4887
Changed getFiniteStrings to use an iterative implementation instead of
recursive, so we don't use a Java stack-frame per character for each
suggestion at build & query time.
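The transform is the standard recursion-to-iteration rewrite, sketched generically below (not against Lucene's actual Automaton API): the call stack becomes an explicit stack on the heap, so depth no longer costs Java stack frames.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch: generic depth-first traversal with an explicit stack instead of
// recursion; Node is a stand-in for an automaton state.
interface Node {
    Iterable<Node> children();
}

static void visitAll(Node root) {
    Deque<Node> stack = new ArrayDeque<>();
    stack.push(root);
    while (!stack.isEmpty()) {
        Node node = stack.pop();
        // ... process node here ...
        for (Node child : node.children()) {
            stack.push(child);
        }
    }
}
```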
The defaults we have today in our data-intensive memory structures don't add up to properly protect from potential OOMs.
The circuit breaker, today at 80%, aims at protecting from extensive field data loading. The default threshold is too permissive and can still cause OOMs.
The filter cache, today at 20%, is too high when added to the other limits we have; reduce it to 10%, which is still a big enough portion of the heap yet provides an improved safety margin.
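For reference, these limits correspond to the following 1.x node settings (a sketch; the filter cache value is the one reduced here, and the breaker value shown is the 80% default the text calls too permissive):

```java
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.settings.Settings;

// Sketch: the two heap limits discussed above, as explicit settings.
Settings settings = ImmutableSettings.settingsBuilder()
        .put("indices.fielddata.breaker.limit", "80%") // current default, too permissive
        .put("indices.cache.filter.size", "10%")       // reduced from the old 20% default
        .build();
```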
closes#5990
If a source node disconnects during recovery, the target node responds by canceling the recovery. Typically the master will respond by removing the disconnected node from the cluster state, promoting another shard to primary. This is sent to all nodes and the target node will start recovering from the new primary. However, if the drop of a node causes the node count to go below minimum_master_nodes, the master will step down and will not promote shards immediately. When a new master is elected, we may publish a new cluster state (whose point is to notify of the new master) which is not yet updated. This caused the node to start a recovery to a non-existent node. Previously we aborted the recovery without cleaning up the shard, causing subsequent correct cluster states to be ignored. Instead, we should not start the recovery process but wait for another cluster state to come in.
Closes#6024
A boost terms factor of 1.0 is not the same as no boosting of terms.
The desired behavior is to deactivate boosting by default. If the user
specifies any value other than 0, then boosting is activated.
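On Lucene's MoreLikeThis this translates to enabling boosting only for a non-zero factor; a minimal sketch:

```java
import org.apache.lucene.queries.mlt.MoreLikeThis;

// Sketch: 0 (the default) deactivates boosting entirely; any other value
// activates it and sets the factor.
static void applyBoostTerms(MoreLikeThis mlt, float boostTermsFactor) {
    if (boostTermsFactor != 0) {
        mlt.setBoost(true);
        mlt.setBoostFactor(boostTermsFactor);
    } else {
        mlt.setBoost(false);
    }
}
```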
Closes#6021
Updating to this version allows configuring a special JNA directory,
in case the /tmp directory is mounted with the noexec option, as JNA
extracts some data and tries to execute parts of it.
Also updated documentation to clarify mlockall and memory settings as well
as pointing to the new jna.tmpdir system property.
Closes#5493
In addition, add a special warning if the misplaced function is a "boost_factor"
function to avoid confusion of "boost" and "boost_function".
closes#5995
In the string type, we have an optimization to reuse the StringTokenStream via a thread local when a non-analyzed field is used (instead of creating it each time). We should use this across the board in all places where we create a field from a String.
Also, move to a specific XStringField so that we can reuse StringTokenStream instead of copying it.
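The reuse pattern, sketched generically (ReusableStringTokenStream stands in for the real class, and setValue for its reset hook):

```java
// Sketch: keep one token stream per thread and reset it with the new
// value, instead of allocating a fresh stream for every field instance.
private static final ThreadLocal<ReusableStringTokenStream> TOKEN_STREAM =
        new ThreadLocal<ReusableStringTokenStream>() {
            @Override
            protected ReusableStringTokenStream initialValue() {
                return new ReusableStringTokenStream();
            }
        };

static ReusableStringTokenStream tokenStream(String value) {
    ReusableStringTokenStream ts = TOKEN_STREAM.get();
    ts.setValue(value); // hypothetical reset hook
    return ts;
}
```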
closes#6001
Some tests run before the thread is started and thus use 0 as the current time, which later on leads to big time jumps and thus failures.
E.g. InternalEngineTests.testVersioningReplicaConflict2
Sub-aggregations are currently collected directly, by just forwarding the
doc ID and bucket ordinal to them. This change adds the new BucketCollector
abstract class that Aggregator extends, so that we have more flexibility to
add implicit filters or buffering between an aggregator and its sub
aggregators.
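A hedged sketch of the new abstraction (the real class has more methods):

```java
import java.io.IOException;

// Sketch: the indirection point between an aggregator and its sub
// aggregators; wrappers can add implicit filters or buffering here.
public abstract class BucketCollector {

    /** Collect the given doc in the bucket with the given ordinal. */
    public abstract void collect(int docId, long bucketOrdinal) throws IOException;

    /** Called once all documents have been collected. */
    public abstract void postCollection();
}
```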
Close#5975
Sorting was broken on nested documents because the `missing(slot)` method
didn't correctly set the segment ordinal (readerGen), causing term ordinals to
be compared across segments.
Close#5986
When providing a number (milliseconds since epoch, UTC), the range and term queries/filters didn't handle it correctly: the number was converted to a string, which was then first parsed as a date.
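For example, a numeric timestamp should be usable directly; a sketch on the 1.x Java API (field name illustrative):

```java
import org.elasticsearch.index.query.FilterBuilders;
import org.elasticsearch.index.query.RangeFilterBuilder;

// Sketch: the bounds are milliseconds since epoch (UTC) and should be
// handled as numbers, not converted to strings and re-parsed as dates.
RangeFilterBuilder filter = FilterBuilders.rangeFilter("timestamp")
        .gte(1398340000000L)
        .lt(1398426400000L);
```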
closes#5969
Instead of resolving the global ordinal for each hit on the fly, resolve the global ordinals during the post-collect phase.
On fields with not so many unique values, this can significantly reduce the number of global ordinal resolutions.
Closes#5895, Closes#5854
The default number of clients nodes is randomized between 0 and 1, applied to all cluster scopes (global, suite and test). Can be changed through the newly added `@ClusterScope#numClientNodes`.
In our tests we currently refer to nodes in a generic way. All the tests that either stop or start nodes rely on the fact that those nodes hold data though. Made that clearer as that becomes more important when introducing other types of nodes within the test cluster. Reflected this by adapting and renaming the following methods in `TestCluster`:
- ensureAtLeastNumNodes to ensureAtLeastNumDataNodes
- ensureAtMostNumNodes to ensureAtMostNumDataNodes
- stopRandomNode to stopRandomDataNode
and the following ones in `ElasticsearchIntegrationTest`:
- allowNodes to allowDataNodes
- dataNodes to numDataNodes
- @ClusterScope#numNodes to numDataNodes
- @ClusterScope#minNumNodes to minNumDataNodes
- @ClusterScope#maxNumNodes to maxNumDataNodes
Added facilities to be able to deal with data nodes specifically, like for instance retrieve a client to a data node, or retrieve an instance of a class through guice only from data nodes.
Adapted existing tests to successfully run although there's a node client around.
Fixed _cat/allocation REST tests to make disk.total, disk.avail and disk.percent optional as client nodes won't return that info.
Closes#5949
Decay functions currently only use the first value of a multi-valued field
to compute the distance to the origin. Instead, they should consider
all distances if the field has more values, and then use
one of min/max/sum/avg as defined by the user.
Relates to #3960, closes#5940
During the stop process we raise network disconnects, so it is valid to raise them while we are in stop mode, and in fact we should not miss any events in such a case.
Typically this is not a problem, since it happens during the normal JVM shutdown process, but when running a reused cluster within the JVM (like in our test infra with the shared cluster), we should properly raise those node disconnects.
closes#5918
Our ordinals currently start at 1, like FieldCache did in older Lucene versions.
However, Lucene 4.2 changed it in order to make ordinals start at 0, using -1
as the ordinal for the missing value. We should switch to the same numbering as
Lucene for consistency. This also allows removing some abstraction on top of
Lucene doc values.
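Under the new convention (a sketch with a hypothetical ordinals accessor):

```java
// Sketch: ordinals are now 0-based with -1 for missing, matching Lucene,
// instead of 1-based with 0 for missing.
int ord = ordinals.getOrd(docId); // hypothetical accessor
if (ord == -1) {
    // document has no value for this field
} else {
    // valid ordinals range from 0 to maxOrd - 1
    BytesRef value = ordinals.lookupOrd(ord);
}
```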
Close#5871
Separate version check logic for reads and writes for all version types, which allows different behavior in these cases.
Change `VersionType.EXTERNAL` & `VersionType.EXTERNAL_GTE` to behave the same as `VersionType.INTERNAL` for read operations.
The previous behavior was fit for writes but is useless in reads.
This commit also makes the usage of `EXTERNAL` & `EXTERNAL_GTE` in the update API raise a validation error, as it may cause data to be lost.
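The split roughly takes this shape on `VersionType.EXTERNAL` (method names as introduced here, bodies approximate):

```java
// Sketch: writes keep the external semantics, while reads now behave like
// INTERNAL and require an exact match when a version is specified.
@Override
public boolean isVersionConflictForWrites(long currentVersion, long expectedVersion) {
    return currentVersion >= expectedVersion; // new version must be strictly higher
}

@Override
public boolean isVersionConflictForReads(long currentVersion, long expectedVersion) {
    return expectedVersion != Versions.MATCH_ANY && currentVersion != expectedVersion;
}
```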
Closes#5663, Closes#5661, Closes#5929