With cross cluster search we can potentially proxy `can_match` requests
to nodes that don't have the endpoint. This might not cause any problem
from a functional perspecitve but will cause ugly error messages on
the target node. This commit will cause an IAE if we try to talk to an
incompatible node via a proxy.
Relates to #25704
It was brought up that our current client artifacts have generic names like 'rest' that may cause conflicts with other artifacts.
This commit renames:
- rest -> elasticsearch-rest-client
- sniffer -> elasticsearch-rest-client-sniffer
- rest-high-level -> elasticsearch-rest-high-level-client
A couple of small changes are also preparing the high level client for its first release.
Closes#20248
The slop parameter defaults to 0 in the Lucene SpanNearQuery, so we can set it
to this default value also and don't have to require it being specified in the
query when using the Rest API. Leaving `slop` a ctro arg in the Java API as it
should normally be specified and we can keep it `final` that way.
Closes#25642
A shrunk index should ignore anything from templates and instead take
its mappings, aliases, and settings from the original index, plus any
new settings and aliases passed in with the shrink request. This commit
causes this to be the case.
Relates #25380
Today if we search across a large amount of shards we hit every shard. Yet, it's quite
common to search across an index pattern for time based indices but filtering will exclude
all results outside a certain time range ie. `now-3d`. While the search can potentially hit
hundreds of shards the majority of the shards might yield 0 results since there is not document
that is within this date range. Kibana for instance does this regularly but used `_field_stats`
to optimize the indexes they need to query. Now with the deprecation of `_field_stats` and it's upcoming removal a single dashboard in kibana can potentially turn into searches hitting hundreds or thousands of shards and that can easily cause search rejections even though the most of the requests are very likely super cheap and only need a query rewriting to early terminate with 0 results.
This change adds a pre-filter phase for searches that can, if the number of shards are higher than a the `pre_filter_shard_size` threshold (defaults to 128 shards), fan out to the shards
and check if the query can potentially match any documents at all. While false positives are possible, a negative response means that no matches are possible. These requests are not subject to rejection and can greatly reduce the number of shards a request needs to hit. The approach here is preferable to the kibana approach with field stats since it correctly handles aliases and uses the correct threadpools to execute these requests. Further it's completely transparent to the user and improves scalability of elasticsearch in general on large clusters.
Requests that execute a stored script will no longer be allowed to specify the lang of the script. This information is stored in the cluster state making only an id necessary to execute against. Putting a stored script will still require a lang.
* Changes DocValueFieldsFetchSubPhase to reuse doc values iterators for multiple hits
Closes#24986
* iter
* Update ScriptDocValues to not reuse GeoPoint and Date objects
* added Javadoc about script value re-use
* Enable doc values for range fields by default.
* Store ranges in a binary format that support multi field fields.
* Added BinaryDocValuesRangeQuery that can query ranges that have been encoded into a binary doc values field.
* Wrap range queries on a range field in IndexOrDocValuesQuery query.
Closes#24314
This method does exactly what getHits() does and is used in only a few places,
so it can safely be removed. It seems to be a left-over from when
InternalSearchHits was folded into the SearchHits interface, which didn't
contain this method.
In certain situations we can early terminate and just skip the entire
query phase or make the lucene level rewrite very cheap if we can already
tell that a query won't match any documents. For instance if there is a single
`match_none` ie. due to some range rewrite in a filter or must clause of a boolean
query it can just drop all it's other queries since it will never match.
Flake ids organize bytes in such a way that ids are ordered. However, we do not
need that property and could reorganize bytes in an order that would better suit
Lucene's terms dict instead.
Some synthetic tests suggest that this change decreases the disk footprint of
the `_id` field by about 50% in many cases (see `UUIDTests.testCompression`).
For instance, when simulating the indexing of 10M docs at a rate of 10k docs
per second, the current uid generator used 20.2 bytes per document on average,
while this new generator which only puts bytes in a different order uses 9.6
bytes per document on average.
We had already explored this idea in #18209 but the attempt to share long common
prefixes had had a bad impact on indexing speed. This time I have been more
careful about putting discriminant bytes early in the `_id` in a way that
preserves indexing speed on par with today, while still allowing for better
compression.
There is a bug when a call to `BytesReferenceStreamInput` skip is made
on a `BytesReference` that has an initial offset. The offset for the
current slice is added to the current index and then subtracted from the
length. This introduces the possibility of a negative number of bytes to
skip. This happens inside a loop, which leads to an infinte loop.
This commit correctly subtracts the current slice index from the
slice.length. Additionally, the `BytesArrayTests` are modified to test
instances that include an offset.
This is a protection mechanism to prevent a single search request from
hitting a large number of shards in the cluster concurrently. If a search is
executed against all indices in the cluster this can easily overload the cluster
causing rejections etc. which is not necessarily desirable. Instead this PR adds
a per request limit of `max_concurrent_shard_requests` that throttles the number of
concurrent initial phase requests to `256` by default. This limit can be increased per request
and protects single search requests from overloading the cluster. Subsequent PRs can introduces
addiontional improvemetns ie. limiting this on a `_msearch` level, making defaults a factor of
the number of nodes or sort shards iters such that we gain the best concurrency across nodes.
We lost the cluster alias due to some special caseing in inner hits
and due to the fact that we didn't pass on the alias to the shard request.
This change ensures that we have the cluster alias present on the shard to
ensure all SearchShardTarget reads preserve the alias.
Relates to #25606
Currently when we close a channel in Netty4Utils.closeChannels we
block until the closing is complete. This introduces the possibility
that a network selector thread will block while waiting until a
separate network selector thread closes a channel.
For instance: T1 closes channel 1 (which is assigned to a T1 selector).
Channel 1's close listener executes the closing of the node. That
means that T1 now tries to close channel 2. However, channel 2 is
assigned to a selector that is running on T2. T1 now must wait until T2
closes that channel at some point in the future.
This commit addresses this by adding a boolean to closeChannels
indicating if we should block on close. We only set this boolean to true
if we are closing down the server channels at shutdown. This call is
never made from a network thread. When we call the closeChannels method
with that boolean set to false, we do not block on close.
With #24236, tribe nodes submit cluster state changes to their MasterService, making it unnecessary to explicitly update the cluster state version. This PR fixes the double-incrementing of cluster state versions on tribe nodes, which are not harmful, but unnecessary.
This change collapses some of the packages for the bucket aggregations into their parent packages. This was done for the following aggregations:
* The variants of the range aggregation (geo_distance, date and ip) were moved into the `o.e.s.a.bucket.range` package
* The `o.e.s.a.bucket.terms.support` package was removed and the classes were moved to `o.e.s.a.bucket.terms`
* The filter aggregation was moved to `o.e.s.a.bucket.filter`
Since this PR is already relatively large with only the above changes subsequent PRs will do similar operations on relevant metric and pipeline aggregations
Relates to #22868
The test is currently serializing the cluster state using an older ES version format, but then deserializes those same bytes by
assuming they are of the current ES version.
When resolving wildcards, aliases should be treated as unavailable indices when the `ignoreAliases` option is set to `true` (currently enabled with delete index api and update aliases api). This way the `allow_no_indices` and `ignore_unavailable` options can be honoured, otherwise WildcardExpressionResolver ends up treating aliases differently and there is no way to control when an error is thrown.
The default behaviour for the delete index api, which has `ignore_unavailable` set to `false` and `allow_no_indices` set to `true` by default, is to throw an error when executed against an alias, same as when it's executed against an index that does not exist.
We currently check whether translog files can be trimmed whenever we create a new translog generation or close a view. However #25294 added a long translog retention period (12h, max 512MB by default), which means translog files should potentially be cleaned up long after there isn't any indexing activity to trigger flushes/the creation of new translog files. We therefore need a scheduled background check to clean up those files once they are no longer needed.
Relates to #10708
This commit does two things:
- bumps the version from 6.0.0-alpha3 to 6.0.0-beta1
- renames the 6.0.0-alpha3 version constant to 6.0.0-beta1
Relates #25621
This commit adjusts the expectation for the max number of threads in the
scaling thread pool configuration test. The reason that this expectation
is incorrect is because we removed the limitation that the number of
processors maxes out at 32, instead letting it be the true number of
logical processors on the machine. However, when we removed this
limitation, this test was never adjusted to reflect the new reality yet
it never arose since our tests were not running on machines with
incredibly high core counts.
Relates #20874
Updating the global checkpoint on a replica can occur for a few
different reasons:
- from inlined global checkpoint updates
- from a primary term transition
- from finalizing recovery
Yet, the trace logging for a global checkpoint update does not present
this information that can be useful when tracing test failures. This
commit adds a reason for the global checkpoint update on a replica so
that we can trace these updates.
Relates #25612
The current BWC code in `BulkItemRequest` mutates the underlying `DocWriteRequests` which causes test failures and unexpected state (our test infra checks bwc serialization on the fly). This PR removes this logic from master. Another PR will add a BWC layer to 5.x only.
This PR contains the logic in https://github.com/elastic/elasticsearch/pull/25510 , which is needed to run the tests.
Previously the primary didn't update it's own local checkpoint (and thus the global checkpoint) before some indexing occurred. With recent changes the primary now properly initializes it self and thus ops recovery is possible even if no indexing has occurred.
This commit adds cross-settings validation for the low/high/flood stage
disk watermark settings. This validation was enabled by the introduction
of multiple settings validation.
Relates #25600
When a shard is promoted to replica, it's possible that it was
previously a replica that started following a new primary. When it
started following this new primary, the state of its local checkpoint
tracker was reset. Upon promotion, it's possible that the state of the
local checkpoint tracker has not yet restored from a successful
primary-replica re-sync. To account for this, we must restore the state
of the local checkpoint tracker when a replica shard is promoted to
primary. To do this, we stream the operations in the translog, marking
the operations that are in the translog as completed. We do this before
we fill the gaps on the newly promoted primary, ensuring that we have a
primary shard with a complete history up to the largest maximum sequence
number it has ever seen.
Relates #25553
This commit refactors the global checkpont tracker to make it more
resilient. The main idea is to make it more explicit what state is
actually captured and how that state is updated through
replication/cluster state updates etc. It also fixes the issue where the
local checkpoint information is not being updated when a shard becomes
primary. The primary relocation handoff becomes very simple too, we can
just verbatim copy over the internal state.
Relates #25468
The created and found fields in index and delete responses became obsolete after the introduction of the result field in index, update and delete responses (#19566).
After deprecating the created and found fields in 5.x (#19633), now they are removed.
Fixes#19630
* Improved REST endpoint exception handling, see #15335
Also improved OPTIONS http method handling to better conform with the
http spec.
* Tidied up formatting and comments
See #15335
* Tests for #15335
* Cleaned up comments, added section number
* Swapped out tab indents for space indents
* Test class now extends ESSingleNodeTestCase
* Capture RestResponse so it can be examined in test cases
Simple addition to surface the RestResponse object so we can run tests
against it (see issue #15335).
* Refactored class name, included feedback
See #15335.
* Unit test for REST error handling enhancements
Randomizing unit test for enhanced REST response error handling. See
issue #15335 for more details.
* Cleaned up formatting
* New constructor to set HTTP method
Constructor added to support RestController test cases.
* Refactored FakeRestRequest, streamlined test case.
* Cleaned up conflicts
* Tests for #15335
* Added functionality to ignore or include path wildcards
See #15335
* Further enhancements to request handling
Refactored executeHandler to prioritize explicit path matches. See
#15335 for more information.
* Cosmetic fixes
* Refactored method handlers
* Removed redundant import
* Updated integration tests
* Refactoring to address issue #17853
* Cleaned up test assertions
* Fixed edge case if OPTIONS method randomly selected as invalid method
In this test, an OPTIONS method request is valid, and should not return
a 405 error.
* Remove redundant static modifier
* Hook the multiple PathTrie attempts into RestHandler.dispatchRequest
* Add missing space
* Correctly retrieve new handler for each Trie strategy
* Only copy headers to threadcontext once
* Fix test after REST header copying moved higher up
* Restore original params when trying the next trie candidate
* Remove OPTIONS for invalidHttpMethodArray so a 405 is guaranteed in tests
* Re-add the fix I already added and got removed during merge :-/
* Add missing GET method to test
* Add documentation to migration guide about breaking 404 -> 405 changes
* Explain boolean response, pull into local var
* fixup! Explain boolean response, pull into local var
* Encapsulate multiple HTTP methods into PathTrie<MethodHandlers>
* Add PathTrie.retrieveAll where all matching modes can be retrieved
Then TrieMatchingMode can be package private and not leak into RestController
* Include body of error with 405 responses to give hint about valid methods
* Fix missing usageService handler addition
I accidentally removed this :X
* Initialize PathTrieIterator modes with Arrays.asList
* Use "== false" instead of !
* Missing paren :-/
Indexing ids in binary form should help with indexing speed since we would
have to compare fewer bytes upon sorting, should help with memory usage of
the live version map since keys will be shorter, and might help with disk
usage depending on how efficient the terms dictionary is at compressing
terms.
Since we can only expect base64 ids in the auto-generated case, this PR tries
to use an encoding that makes the binary id equal to the base64-decoded id in
the majority of cases (253 out of 256). It also specializes numeric ids, since
this seems to be common when content that is stored in Elasticsearch comes
from another database that uses eg. auto-increment ids.
Another option could be to require base64 ids all the time. It would make things
simpler but I'm not sure users would welcome this requirement.
This PR should bring some benefits, but I expect it to be mostly useful when
coupled with something like #24615.
Closes#18154