This is a simple naming change PR, to fix the fact that "metadata" is a
single English word, and for too long we have not followed general
naming conventions for it. We are also not consistent about it, for
example, METADATA instead of META_DATA if we were trying to be
consistent with MetaData (although METADATA is correct when considered
in the context of "metadata"). This was a simple find and replace across
the code base, only taking a few minutes to fix this naming issue
forever.
* Comprehensively test supported/unsupported field type:agg combinations (#52493)
This adds a test to AggregatorTestCase that allows us to programmatically
verify that an aggregator supports or does not support a particular
field type. It fetches the list of registered field type parsers,
creates a MappedFieldType from the parser and then attempts to run
a basic agg against the field.
A supplied list of supported VSTypes are then compared against the
output (success or exception) and suceeds or fails the test accordingly.
Co-Authored-By: Mark Tozzi <mark.tozzi@gmail.com>
* Skip fields that are not aggregatable
* Use newIndexSearcher() to avoid incompatible readers (#52723)
Lucene's `newSearcher()` can generate readers like ParallelCompositeReader
which we can't use. We need to instead use our helper `newIndexSearcher`
The docs here add nothing compared to those in the package. If anything
they are somewhat confusing since they don't give all necessary details to understand the snapshot process.
=> remove them and link to the complete docs at the package level
Bulk requests currently keep a reference to all bulk item requests until every one of them has
completed. There is no need to do so, however, and, in case of large bulks, can mean
unnecessary holding onto memory that might be better used elsewhere. More so as different
shard-level bulks can complete at different speeds, and one slow shard-level request should
not require holding onto every other shard-level request.
Currently there is an issue with the InboundPipeline releasing bytes
earlier than appropriate. This can lead to the bytes being reused before
the message is handled. This commit fixes that issue and adds a test to
detect when it is occurring.
* Add warnings/errors when V2 templates would match same indices… (#54367)
* Add warnings/errors when V2 templates would match same indices as V1
With the introduction of V2 index templates, we want to warn users that templates they put in place
might not take precedence (because v2 templates are going to "win"). This adds this validation at
`PUT` time for both V1 and V2 templates with the following rules:
** When creating or updating a V2 template
- If the v2 template would match indices for an existing v1 template or templates, provide a
warning (through the deprecation logging so it shows up to the client) as well as logging the
warning
The v2 warning looks like:
```
index template [my-v2-template] has index patterns [foo-*] matching patterns from existing older
templates [old-v1-template,match-all-template] with patterns (old-v1-template =>
[foo*],match-all-template => [*]); this template [my-v2-template] will take
precedence during new index creation
```
** When creating a V1 template
- If the v1 template is for index patterns of `"*"` and a v2 template exists, warn that the v2
template may take precedence
- If the v1 template is for index patterns other than all indices, and a v2 template exists that
would match, throw an error preventing creation of the v1 template
** When updating a V1 template (without changing its existing `index_patterns`!)
- If the v1 template is for index patterns that would match an existing v2 template, warn that the
v2 template may take precedence.
The v1 warning looks like:
```
template [my-v1-template] has index patterns [*] matching patterns from existing index templates
[existing-v2-template] with patterns (existing-v2-template => [foo*]); this template [my-v1-template] may be ignored in favor of an index template at index creation time
```
And the v1 error looks like:
```
template [my-v1-template] has index patterns [foo*] matching patterns from existing index templates
[existing-v2-template] with patterns (existing-v2-template => [f*]), use index templates (/_index_template) instead
```
Relates to #53101
* Remove v2 index and component templates when cleaning up tests
* Finish half-finished comment sentence
* Guard template removal and ignore for earlier versions of ES
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
* Also ignore 500 errors when clearing index template v2 templates
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
This fixes a serialization bug in `auto_date_histogram` that comes up in
a cluster mixed between pre-7.3.0 and post-7.3.0.
Includes #54429 to keep 7.x looking like master for simpler backports.
Closes#54382
Pipeline aggregations like `stats_bucket`, `sum_bucket`, and
`percentiles_bucket` only operate on buckets that have multiple buckets.
This adds support for those aggregations to `geo_distance`, `ip_range`,
`auto_date_histogram`, and `rare_terms`.
This all happened because we used a marker interface to mark compatible
aggs, `MultiBucketAggregationBuilder` and it was fairly easy to forget
to implement the interface.
This replaces the marker interface with an abstract method in
`AggregationBuilder`, `bucketCardinality` which makes you return `NONE`,
`ONE`, or `MANY`. The `bucket` aggregations can check for `MANY`. At
this point `ONE` and `NONE` amount to about the same thing, but I
suspect that'll be a useful distinction when validating bucket sorts.
Closes#53215
This commit takes into account the index.number_of_replicas (defaults to
0 - no replicas- ) value when setting an index template. This change
enables the index.wait_for_active_shards value to be interpreted
correctly
(cherry picked from commit 07026ac3d56dc9fae69467adfda7eaed7ea3ca00)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
Co-authored-by: tninokehoe <62655306+tninokehoe@users.noreply.github.com>
This commit improves the behavior of aborting snapshots and by that fixes
some extremely rare test failures.
Improvements:
1. When aborting a snapshot while it is in the `INIT` stage we do not need
to ever delete anything from the repository because nothing is written to the
repo during INIT any more (in the past running deletes for these snapshots made
sense because we were writing `snap-` and `meta-` blobs during the `INIT` step).
2. Do not try to finalize snapshots that never moved past `INIT`. Same reason as
with the first step. If we never moved past `INIT` no data was written to the repo
so no need to now write a useless entry for the aborted snapshot to `index-N`.
This is especially true, since the reason the snapshot was aborted during `INIT` was
a delete call so the useless empty snapshot just added to `index-N` would be removed
by the subsequent delete that is still waiting anyway.
3. if after aborting a snapshot we wait for it to finish we should not try deleting it
if it failed. If the snapshot failed it means it did not become part of the most recent
`RepositoryData` so a delete for it will needlessly fail with a confusing message about
that snapshot being missing or concurrent repository modification. I moved to throw the snapshot missing exception here because that seems the most user friendly. This allows the user to simply ignore `404` returns from the delete API when using it to make sure a snapshot is aborted+deleted.
Marking this as a non-issue since it doesn't have any negative repercussions other than confusing exceptions on some snapshot aborts.
Closes#52843
We try to rewrite time zones to fixed offsets in the date histogram aggregation
if the data in the shard is within a single transition.
However this optimization is not applied on time zones that don't apply daylight saving changes
but had some random transitions in the past (e.g. Australia/Brisbane or Asia/Katmandu).
This changes fixes the rewrite of such time zones to fixed offsets.
Backport of #53982
In order to prepare the `AliasOrIndex` abstraction for the introduction of data streams,
the abstraction needs to be made more flexible, because currently it really can be only
an alias or an index.
* Renamed `AliasOrIndex` to `IndexAbstraction`.
* Introduced a `IndexAbstraction.Type` enum to indicate what a `IndexAbstraction` instance is.
* Replaced the `isAlias()` method that returns a boolean with the `getType()` method that returns the new Type enum.
* Moved `getWriteIndex()` up from the `IndexAbstraction.Alias` to the `IndexAbstraction` interface.
* Moved `getAliasName()` up from the `IndexAbstraction.Alias` to the `IndexAbstraction` interface and renamed it to `getName()`.
* Removed unnecessary casting to `IndexAbstraction.Alias` by just checking the `getType()` method.
Relates to #53100
This setting is not documented and has dubious value since it means
there can be nodes in the cluster (non-data and non-master nodes) that
do not have persistent node IDs. This does not have any use cases so
this commit removes the setting.
This commit ensures that node roles are sorted by node role name, which
makes the output easier to consume, and also makes it easier to rely on
the behavior of the output in assertions.
Currently all of our transport protocol decoding and aggregation occurs
in the individual transport modules. This means that each implementation
(test, netty, nio) must implement this logic. Additionally, it means
that the entire message has been read from the network before the server
package receives it.
This commit creates a pipeline in server which can be passed arbitrary
bytes to handle. Internally, the pipeline will decode, decompress, and
aggregate the messages. Additionally, this allows us to run many
megabytes of bytes through the pipeline in tests to ensure that the
logic works.
This work will enable future work:
Circuit breaking or backoff logic based on message type and byte
in the content aggregator.
Sharing bytes with the application layer using the ref counted
releasable network bytes.
Improved network monitoring based specifically on channels.
Finally, this fixes the bug where we do not circuit break on the correct
message size when compression is enabled.
Elasticsearch has a number of different BytesReference implementations.
These implementations can all implement the interface in different ways
with subtly different behavior and performance characteristics. On the
other-hand, the JVM only represents bytes as an array or a direct byte
buffer. This commit deletes the specialized Netty implementations and
moves to using a generic ByteBuffer reference type. This will allow us
to focus on standardizing performance and behave around a smaller number
of implementations that can be used by all components in Elasticsearch.
* Add REST APIs for IndexTemplateV2Metadata CRUD (#54039)
* Add REST APIs for IndexTemplateV2Metadata CRUD
This commit adds the get/put/delete APIs for interacting with the now v2 versions of index
templates.
These APIs are behind the existing `es.itv2_feature_flag_registered` system property feature flag.
Relates to #53101
* Add exceptions for HLRC tests
* Add skips for 7.x versions
* Use index_template instead of template_v2 in action names
* Add test for MetaDataIndexTemplateService.addIndexTemplateV2
* Move removal to static method and add test
* Add unit tests for request classes (implement hashCode & equals)
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
* Fix compilation
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
We have a number of places where we want to read a fairly complex object from
XContent, but aren't interested in its contents; for example, mappings are often
serialized and deserialized between several objects before they are actually built
into a MappingMetaData object. This means that potentially large maps of maps
are constructed several times, only to immediately be re-serialized again.
This commit adds a new helper method to XContentHelper that reads the children
of an xcontent object directly to a BytesReference, serialized via the same xcontenttype
as the parent parser, avoiding the construction of intermediary maps or lists.
* Fix Snapshot Completion Listener Lost on Master Failover
If master fails over before (or we run into any other exception) when removing
the snapshot from the CS we must still resolve all the completion listeners for
the snapshot.
This commit causes negative TimeValues, other than -1 which is sometimes used as
a sentinel value, to be rejected during parsing.
Also introduces a hack to allow ILM to load policies which were written to the
cluster state with a negative min_age, treating those values as 0, which should
match the behavior of prior versions.
The NodesStatsRequest class uses a set of strings for its internal
serialization. This commit updates the class's interface so that we
no longer use hard-coded getters and setters, but rather
methods that add strings directly. For example, the old way of
adding "os" metrics to a request would be to call request.os(true).
The new way of doing this is to call request.addMetric("os").
For the time being, the canonical list of metrics is an enum in
NodesStatsRequest. This will eventually be replaced with something
pluggable.
Currently we set the defaults for ccsMinimizeRoundtrips, preFilterShardSize and
requestCache on the HLRC SubmitAsyncSearchRequest in the constructor. This is no
longer needed since we now only send the parameters along with the rest request
that are supported (omitting e.g. ccsMinimizeRoundtrips) and the correct
defaults are set on the client side. This change removes setting and sending
these defaults where possible, leaving only the overwrite of batchedReduceSize
with a default value of 5, since the default used in the vanilla SearchRequest
is 512. However, we don't need to send this value along as a request parameter
if its the default since the correct one will be set on the receiving end if no
value is specified.
Also adding tests for RestSubmitAsyncSearchAction that check the correct
defaults are set when parameters are missing on the server side.
Backport of #54200
These mock calls cause Eclipse to think that `Exception` can be thrown
because `CheckedFunction`'s lower bound is `Exception`. This makes
Eclipse happy.
Make it possible to reuse the cluster state update of rollover for
simulation purposes by extracting it. Also now run the full rollover in
the pre-rollover phase and the actual rollover phase, allowing a
dedicated exception in case of concurrent rollovers as well as a more
thorough pre-check.
We have some very occasional failures in SearchAfterIT, where a search throws
an exception because a shard does not have the mapping for the requested sort
field. The field should have been added in a dynamic mapping update after an
index event, but it seems that there can sometimes be a small delay in propagating
this update to the shards.
This commit changes the test to explicitly define the relevant field at index creation
time.
Fixes#51900
The remove keystore command can handle multiple settings. In a few
places, we were not consistent about mentioning this. This commit
addreses this, in the CLI help, and the docs.
This commit begins the work of removing the "hard-coded" metric getters
and setters from the NodesInfoRequest classes. We start by providing new
flexible getters and setters. We then update the test classes to remove
the old getters, and then remove those getters.
Changes ThreadPool's schedule method to run the schedule task in the context of the thread
that scheduled the task.
This is the more sensible default for this method, and eliminates a range of bugs where the
current thread context is mistakenly dropped.
Closes#17143
Today the keystore add-file command can only handle adding a single
setting/file pair in a single invocation. This incurs the startup costs
of the JVM many times, which in some environments can be expensive. This
commit teaches the add-file keystore command to accept adding multiple
settings in a single invocation.
Today the keystore add command can only handle adding a single
setting/value pair in a single invocation. This incurs the startup costs
of the JVM many times, which in some environments can be expensive. This
commit teaches the add keystore command to accept adding multiple
settings in a single invocation.
This drop the "top level" pipeline aggregators from the aggregation
result tree which should save a little memory and a few serialization
bytes. Perhaps more imporantly, this provides a mechanism by which we
can remove *all* pipelines from the aggregation result tree. This will
save quite a bit of space when pipelines are deep in the tree.
Sadly, doing this isn't simple because of backwards compatibility. Nodes
before 7.7.0 *need* those pipelines. We provide them by setting passing
a `Supplier<PipelineTree>` into the root of the aggregation tree that we
only call if we need to serialize to a version before 7.7.0.
This solution works for cross cluster search because we always reduce
the aggregations in each remote cluster and then forward them back to
the coordinating node. Its quite possible that the coordinating node
needs the pipeline (say it is version 7.1.0) and the gateway node in the
remote cluster doesn't (version 7.7.0). In that case the data nodes
won't send the pipeline aggregations back to the gateway node.
Critically, the gateway node *will* send the pipeline aggregations back
to the coordinating node. This is all managed with that
`Supplier<PipelineTree>`, but *how* it is managed is a bit tricky.
We can run into a state where there's no more events to wait for temporarily
but the cluster still isn't green. I added the wait for green flag to the request
so the assertion for green cluster health below doesn't fail.
Closes#53457
We are using this assertion for identical shard snapshots
for situations where `snapshot1` wasn't the first snapshot
for the tested shard. Hence, we can't assume that it will
not share any files with previous snapshots.
This showed up in failing tests when `snapshot1` was equivalent
to a previous snapshot because no documents were deleted from the
repo randomly in the failing test but even if documents are deleted
there is no guarantee that no files will be shared.
=> I removed this assertion since its immaterial for what is tested
here anyway.
Closes#54034
This commit ensures that we rewrite the shard request with a short-lived can_match searcher.
This is required for frozen indices since the high level rewrite is now performed on a network thread where we don't want to perform I/O.
Closes#53985
Reindex would use timeValueNanos(System.nanoTime()). The intended use
for TimeValue is as a duration, not as absolute time. In particular,
this could result in negative TimeValue's, being unsupported in #53913.
Modified to use the bare long nano-second value.
* Add validation for component templates (#54023)
* Add validation for component templates
This change adds validation to make sure that settings and mappings are correct in
component template. It's done the same way as in index templates - code is reused.
Reletes to #53101
* Fix checkstyle violation
* Update server/src/main/java/org/elasticsearch/cluster/metadata/MetaDataIndexTemplateService.java
Co-Authored-By: Lee Hinman <dakrone@users.noreply.github.com>
* Update server/src/test/java/org/elasticsearch/cluster/metadata/MetaDataIndexTemplateServiceTests.java
Co-Authored-By: Lee Hinman <dakrone@users.noreply.github.com>
* Update server/src/test/java/org/elasticsearch/cluster/metadata/MetaDataIndexTemplateServiceTests.java
Co-Authored-By: Lee Hinman <dakrone@users.noreply.github.com>
* Update server/src/test/java/org/elasticsearch/cluster/metadata/MetaDataIndexTemplateServiceTests.java
Co-Authored-By: Lee Hinman <dakrone@users.noreply.github.com>
* Update server/src/main/java/org/elasticsearch/cluster/metadata/MetaDataIndexTemplateService.java
Co-Authored-By: Lee Hinman <dakrone@users.noreply.github.com>
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
* Adjusted to 7.7
* unused import fixed
* npe fixeD
* change exception type
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
This commit ensures that we don't use the non-deterministic canReturnNullResponseIfMatchNoDocs
boolean in the cache key of the ShardSearchRequest. The value of this boolean has no influence
on the cacheability of the request.
Closes#32827
Today we log `failed to execute pipeline for a bulk request` at `ERROR` level
if an attempt to run an ingest pipeline fails. A failure here is commonly due
to an `EsRejectedExecutionException`. We also feed such failures back to the
client and record the rejection in the threadpool statistics.
In line with #51459 there is no need to log failures within actions so noisily
and with such urgency. It is better to leave it up to the client to react
accordingly. Typically an `EsRejectedExecutionException` should result in the
client backing off and retrying, so a failure here is not normally fatal enough
to justify an `ERROR` log at all.
This commit reduces the log level for this message to `DEBUG`.
Fixes an issue where the elasticsearch-node command-line tools would not work correctly
because PersistentTasksCustomMetaData contains named XContent from plugins. This PR
makes it so that the parsing for all custom metadata is skipped, even if the core system would
know how to handle it.
Closes#53549
This removes some more ceremony when declaring agg parsers. You no
longer need a static `parse` method, instead you can just make the
`PARSER` public in most cases.
There are still a few aggs with the `parse` method, but those `parse`
methods are a little more complex to untangle.
Currently the remote info api has added a number of possible fields
(proxy, num_socket_connections, etc) that are available in proxy mode.
These fields are not aligned with what the settings are named. This
commit modifies this API to align with the settings.
Wildcard queries on keyword fields get normalized, however this normalization
step should exclude the two special characters * and ? in order to keep the
wildcard query itself intact.
Closes#46300
The snapshot stats response list of snapshot statuses is not ordered according to the
given list of snapshot names so randomly we could mix up snapshot1 and snapshot2
when asserting on the stats.
Fixed by getting each snapshot's stats individually.
Closes#54034
This commit changes the pre_filter_shard_size default from 128 to unspecified.
This allows to apply heuristics based on the request and the target indices when deciding
whether the can match phase should run or not. When unspecified, this pr runs the can match phase
automatically if one of these conditions is met:
* The request targets more than 128 shards.
* The request contains read-only indices.
* The primary sort of the query targets an indexed field.
Users can opt-out from this behavior by setting the `pre_filter_shard_size` to a static value.
Closes#39835
I created this bug today in #53793. When a `DelayableWriteable` that
references an existing object serializes itself it wasn't taking the
version of the node on the other side of the wire into account. This
fixes that.
Today when cluster.remote.connect is set to false, and some aspect of
the codebase tries to get a remote client, today we return a no such
remote cluster exception. This can be quite perplexing to users,
especially if the remote cluster is actually defined in their cluster
state, it is only that the local node is not a remote cluter
client. This commit addresses this by providing a dedicated error
message when a remote cluster is not available because the local node is
not a remote cluster client.
This commit removes the configuration time vs execution time distinction
with regards to certain BuildParms properties. Because of the cost of
determining Java versions for configuration JDK locations we deferred
this until execution time. This had two main downsides. First, we had
to implement all this build logic in tasks, which required a bunch of
additional plumbing and complexity. Second, because some information
wasn't known during configuration time, we had to nest any build logic
that depended on this in awkward callbacks.
We now defer to the JavaInstallationRegistry recently added in Gradle.
This utility uses a much more efficient method for probing Java
installations vs our jrunscript implementation. This, combined with some
optimizations to avoid probing the current JVM as well as deferring
some evaluation via Providers when probing installations for BWC builds
we can maintain effectively the same configuration time performance
while removing a bunch of complexity and runtime cost (snapshotting
inputs for the GenerateGlobalBuildInfoTask was very expensive). The end
result should be a much more responsive build execution in almost all
scenarios.
(cherry picked from commit ecdbd37f2e0f0447ed574b306adb64c19adc3ce1)
This moves the pipeline aggregation validation from the data node to the
coordinating node so that we, eventually, can stop sending pipeline
aggregations to the data nodes entirely. In fact, it moves it into the
"request validation" stage so multiple errors can be accumulated and
sent back to the requester for the entire request. We can't always take
advantage of that, but it'll be nice for folks not to have to play
whack-a-mole with validation.
This is implemented by replacing `PipelineAggretionBuilder#validate`
with:
```
protected abstract void validate(ValidationContext context);
```
The `ValidationContext` handles the accumulation of validation failures,
provides access to the aggregation's siblings, and implements a few
validation utility methods.
If a setting is touched during bootstrap before logging is configured,
and that setting uses a byte size value, the deprecation logger for
ByteSizeValue will be initialized. However, this means a logger will be
configured before log4j is initialized, which we reject at startup. This
commit puts this deprecation logger in a holder pattern so that it is
not initialized until first use, which will happen after logging is
configured.
Benchmarking showed that the effect of the ExitableDirectoryReader
is reduced considerably when checking every 8191 docs. Moreover,
set the cancellable task before calling QueryPhase#preProcess()
and make sure we don't wrap with an ExitableDirectoryReader at all
when lowLevelCancellation is set to false to avoid completely any
performance impact.
Follows: #52822
Follows: #53166
Follows: #53496
(cherry picked from commit cdc377e8e74d3ca6c231c36dc5e80621aab47c69)
* Get Async Search: omit _clusters section when empty (#53907)
The _clusters section is omitted by the search API whenever no remote clusters are searched. Async search should do the same, but Get Async Search returns a deserialized response, hence a weird `_clusters` section with all values set to `0` gets returned instead. In fact the recreated Clusters object is not the same object as the EMPTY constant, yet it has the same content.
This commit addresses this by changing the comparison in the `toXContent` method to not print out the section if the number of total clusters is `0`.
* Async search: remove version from response (#53960)
The goal of the version field was to quickly show when you can expect to find something new in the search response, compared to when nothing has changed. This can also be done by looking at the `_shards` section and `num_reduce_phases` returned with the search response. In fact when there has been one or more additional reduction of the results, you can expect new results in the search response. Otherwise, the `_shards` section could notify of additional failures of shards that have completed the query, but that is not a guarantee that their results will be exposed (only when the following partial reduction is performed their results will be available).
That said this commit clarifies this in the docs and removes the version field from the async search response
* Async Search: replicas to auto expand from 0 to 1 (#53964)
This way single node clusters that are green don't go yellow once async search is used, while
all the others still have one replica.
* [DOCS] address timing issue in async search docs tests (#53910)
The docs snippets for submit async search have proven difficult to test as it is not possible to guarantee that you get a response that is not final, even when providing `wait_for_completion=0`. In the docs we want to show though a proper long-running query, and its first response should be partial rather than final.
With this commit we adapt the docs snippets to show a partial response, and replace under the hood all that's needed to make the snippets tests succeed when we get a final response. Also, increased the timeout so we always get a final response.
Closes#53887Closes#53891
Use sequence numbers and force merge UUID to determine whether a shard has changed or not instead before falling back to comparing files to get incremental snapshots on primary fail-over.
The test in CloseWhileRelocatingShardsIT failed recently
multiple times (3) when waiting for initial indices to be
become green. Looking at the execution logs from #53544
it appears at the very beginning of the test and when
the WindowsFS file system is picked up (which is known
to slow down tests).
This commit simply increases the timeout for the first
ensureGreen() to 60 seconds. If the test continues to fail,
we might want to test a larger timeout or disable
WindowsFS for this test.
Closes#53544
This commits adds a data stream feature flag, initial definition of a data stream and
the stubs for the data stream create, delete and get APIs. Also simple serialization
tests are added and a rest test to thest the data stream API stubs.
This is a large amount of code and mainly mechanical, but this commit should be
straightforward to review, because there isn't any real logic.
The data stream transport and rest action are behind the data stream feature flag and
are only intialized if the feature flag is enabled. The feature flag is enabled if
elasticsearch is build as snapshot or a release build and the
'es.datastreams_feature_flag_registered' is enabled.
The integ-test-zip sets the feature flag if building a release build, otherwise
rest tests would fail.
Relates to #53100
Today we only read `cluster.max_voting_config_exclusions` from the dynamic
settings in the cluster metadata, ignoring any value set in
`elasticsearch.yml`. This commit addresses this.
Closes#53455
When indexing a rectangle that crosses the dateline, we are currently not
handling it properly and we index a polygon that do not cross the dateline.
This changes generates two polygons wrapping the dateline.
* Adds ability for contexts to specify their own defaults.
* Context defaults are applied if no context-specific or
general setting exists.
* See 070ea7e for settings keys.
* Increases the per-context default for the `ingest` context.
* Cache size is doubled, 200 compared to default of 100
* Cache expiration is unchanged at no expiration
* Cache max compilation is quintupled, 375/5m instead of 75/5m
Backport of: 1b37d4b
Refs: #50152
This commit changes the Transforms notifications index to be hidden
index, with a hidden alias.
This commit also removes the temporary hack in
MetaDataCreateIndexService that prevents deprecation warnings for known
dot-prefixed index names which are not hidden/system indices, as this
was the last index pattern to need that hack.
We mark cluster states persisted on master-ineligible nodes as
potentially-stale using the voting configuration `{STALE_STATE_CONFIG}` which
prevents these nodes from being elected as master if they are restarted as
master-eligible. Today we do not handle this special voting configuration
differently in the `ClusterFormationFailureHandler`, leading to a mysterious
message `an election requires a node with id [STALE_STATE_CONFIG]` if the
election does not succeed.
This commit adds a special case description for this situation to explain
better why this node cannot win an election.
Closes#53734
The test was randomly and very rarely failing due to generating the same sort
key for multiple records, which was making order of these records in the results
nondeterministic. While investigating the test I also found that the data wasn't
generated in the way that matches the actual data. Normally, the order of
documents in hits and scoreDocs in InternalTopHits should be the same. However,
in the test only scoreDocs were sorted which was cause very confusing failure
messages. This commit fixes this issue as well.
Fixes#53676
Today in the `CoordinatorTests` each node uses multiple threadpools. This is
mostly fine as they are almost completely stateless, except for the
`ThreadContext`: by using multiple threadpools we cannot make assertions that
the thread context is/isn't preserved as we expect. This commit consolidates
the threadpool instances in use so that each node uses just one.
TermsLookup in master no longer accepts a type parameter. We should emit
a deprecate warning in 7.x when a terms lookup requests includes type to prepare
users for its removal.
Relates to #41059
This commit adds a new AsyncSearchClient to the High Level Rest Client which
initially supporst the submitAsyncSearch in its blocking and non-blocking
flavour. Also adding client side request and response objects and parsing code
to parse the xContent output of the client side AsyncSearchResponse together
with parsing roundtrip tests and a simple roundtrip integration test.
Relates to #49091
Backport of #53592
It's simple to deprecate a field used in an ObjectParser just by adding deprecation
markers to the relevant ParseField objects. The warnings themselves don't currently
have any context - they simply say that a deprecated field has been used, but not
where in the input xcontent it appears. This commit adds the parent object parser
name and XContentLocation to these deprecation messages.
Note that the context is automatically stripped from warning messages when they
are asserted on by integration tests and REST tests, because randomization of
xcontent type during these tests means that the XContentLocation is not constant
The retention lease syncs need to occur under the system context,
because they are internal actions executed on behalf of the user. Today
we are relying on this happening for background syncs by virtue of the
fact that the context the syncs are created under is the system
context. This is due to these occurring on the cluster state applier
thread. However, there are situations where this does not hold such as
when a timed out cluster state publication occurs, and the node where
the shard is allocated is the elected master node. In that case, the
context will be empty due to the fact that we do not reschedule
publication under the system context. Currently, doing so runs us into
some troubles with losing the existing context, possibly dropping
deprecation headers. We could copy that context over when marking the
current context as the system context, but the implications of that
require some more investigation. For now, we explicitly mark the
retention lease syncs as executing under the system context, as this is
situation that we can reason about.
The JodaCompatibleZonedDateTime is a compatibility object that unions
Joda's DateTime and Java's ZonedDateTime, meant for use in scripts. When
it was added, we serialized the JCZDT as a Joda DateTime so that when
sending to older nodes they could still read the object. However, on
newer nodes, we continued also reading this as a Joda DateTime. This
commit changes the read side to form a JCZDT.
closes#53586
* Add IndexTemplateV2 to MetaData (#53753)
* Add IndexTemplateV2 to MetaData
This adds the `IndexTemplateV2` and `IndexTemplateV2Metadata` class to be used for the new
implementation of index templates. The new metadata is stored as a `MetaData.Custom` implementation.
Relates to #53101
* Add ITV2Metadata unit tests
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
* Update min supported version constant
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
testIndexHasDuplicateData tests were failing ocassionally,
due to approximate calculation of BKDReader.estimatePointCount,
where if the node is Leaf, the number of points in it
was (maxPointsInLeafNode + 1) / 2.
As DEFAULT_MAX_POINTS_IN_LEAF_NODE = 1024, for small indexes
used in tests, the estimation could be really off.
This rewrites tests, to make the max points in leaf node to
be a small value to control the tests.
Closes#49703
This commit makes a number of improvements when importing the
Elasticsearch project into IntelliJ IDEA. Specifically:
- Contributing documentation has been updated to reflect that the
'idea' task should no long be used and Gradle project import is
instead the officially supported way of setting up the project.
- Attempts to run the 'idea' task will result in a failure with a
message directing folks to our CONTRIBUTING.md document.
- The project JDK is explicit set rather that using whatever JAVA_HOME
is.
- Gradle build operation delegation is disabled, and test execution is
configured to 'choose per test'.
- Gradle is configured to inherit the project JDK.
- Some code style conventions are automatically configured.
- File encoding is explicitly set to UTF-8.
- Parallel module compilation is enabled and deprecated feature
warnings are disabled.
- A remote debug run configuration using listen mode is created.
- JUnit runner is configured with required system properties.
- License headers are configured such that Apache 2 is the default
notice added to all source files with exception of source in /x-pack
which will use the Elastic license.
Today cluster states are sometimes (rarely) applied in the default context
rather than system context, which means that any appliers which capture their
contexts cannot do things like remote transport actions when security is
enabled.
There are at least two ways that we end up applying the cluster state in the
default context:
1. locally applying a cluster state that indicates that the master has failed
2. the elected master times out while waiting for a response from another node
This commit ensures that cluster states are always applied in the system
context.
Mitigates #53751
With the upgrade to Lucene 8.5, LatLonShape field has support for distance queries. This change implements this new feature and removes the limitation.
Backport to 7x
Enable geo_shape query to work on geo_point fields for shapes: circle, polygon, multipolygon, rectangle see: #48928
Co-Authored-By: @iverase
This commit disables the sort optimization added in #51852 for scroll requests.
Scroll queries keep a state per shard so we cannot modify the request on
the first round (submit).
This bug was introduced in non-released versions which is why this pr
is marked as a non-issue.
On clusters with a large number of shards, the shards limits allocation
decider can exhibit poor performance leading to timeouts applying
cluster state updates. This occurs because for every shard, we do a loop
to count the number of shards on the node, and the number of shards for
the index of the shard. This is roughly quadratic in the number of
shards. This loop is not necessary, since we already have a O(1) method
to count the number of non-relocating shards on a node, and with this
commit we add some infrastructure to RoutingNode to make counting the
number of shards per index O(1).
* Adds per context settings:
`script.context.${CONTEXT}.cache_max_size` ~
`script.cache.max_size`
`script.context.${CONTEXT}.cache_expire` ~
`script.cache.expire`
`script.context.${CONTEXT}.max_compilations_rate` ~
`script.max_compilations_rate`
* Context cache is used if:
`script.max_compilations_rate=use-context`. This
value is dynamically updatable, so users can
switch back to the general cache if desired.
* Settings for context caches take the first value
that applies:
1) Context specific settings if set, eg
`script.context.ingest.cache_max_size`
2) Correlated general setting is set to the non-default
value, eg `script.cache.max_size`
3) Context default
The reason for 2's inclusion is to allow an easy
transition for users who've customized their general
cache settings.
Using the general cache settings for the context caches
results in higher effective settings, since they are
multiplied across the number of contexts. So a general
cache max size of 200 will become 200 * # of contexts.
However, this behavior it will avoid users snapping to a
value that is too low for them.
Backport of: #52855
Refs: #50152
This commit, built on top of #51708, allows to modify shard search requests based on informations collected on other shards. It is intended to speed up sorted queries on time-based indices. For queries that are only interested in the top documents.
This change will rewrite the shard queries to match none if the bottom sort value computed in prior shards is better than all values in the shard.
For queries that mix top documents and aggregations this change will reset the size of the top documents to 0 instead of rewriting to match none.
This means that we don't need to keep a search context open for this shard since we know in advance that it doesn't contain any competitive hit.
Sometimes we want to deprecate and remove a ParseField entirely, without replacement;
for example, the various places where we specify a _type field in 7x. Currently we can
tell users only that a particular field name should not be used, and that another name should
be used in its place. This commit adds the ability to say that a field should not be used at
all.
The introduction of the ExitableDirectoryReader showed increase of
latencies for range queries using pointvalues.
Check for cancellation every 1024 docs instead of every 15 to lower
the impact of the check in query's performance.
Follows: #52822Fixes: #53496
(cherry picked from commit 6b5fc35e4458e60a7ca5822584ec6a60562f2c01)
It is useful to be able to delay state recovery until enough data nodes have
joined the cluster, since this gives the shard allocator a decent opportunity
to re-use as much existing data as possible. However we also have the option to
delay state recovery until a certain number of master-eligible nodes have
joined, and this is unnecessary: we require a majority of master-eligible nodes
for state recovery, and there is no advantage in waiting for more.
This commit deprecates the unnecessary settings in preparation for their
removal.
Relates #51806
* Add REST API for ComponentTemplate CRUD
This adds the Put/Get/DeleteComponentTemplate APIs that allow inserting, retrieving, and removing
ComponentTemplateMetadata into the cluster state metadata.
These APIs are currently only available behind a feature flag system property -
`es.itv2_feature_flag_registered`.
Relates to #53101
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Re-applies the change from #53523 along with test fixes.
closes#53626closes#53624closes#53622closes#53625
Co-authored-by: Nik Everett <nik9000@gmail.com>
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
Co-authored-by: Jake Landis <jake.landis@elastic.co>
Today it can happen that a transport message fails to send (for example,
because a transport interceptor rejects the request). In this case, the
response handler is never invoked, which can lead to necessary cleanups
not being performed. There are two ways to handle this. One is to expect
every callsite that sends a message to try/catch these exceptions and
handle them appropriately. The other is merely to invoke the response
handler to handle the exception, which is already equipped to handle
transport exceptions.
* Submit async search to work only with POST (#53368)
Currently the submit async search API can be called using both GET and POST at REST, but given that it submits a call and creates internal state, POST should be the only allowed method.
* Refine SearchProgressListener internal API (#53373)
The following cumulative improvements have been made:
- rename `onReduce` and `notifyReduce` to `onFinalReduce` and `notifyFinalReduce`
- add unit test for `SearchShard`
- on* methods in `SearchProgressListener` shouldn't need to be public as they should never be called directly, they only need to be overridden hence they can be made protected. They are actually called directly from a test which required some adapting, like making `AsyncSearchTask.Listener` class package private instead of private
- Instead of overriding `getProgressListener` in `AsyncSearchTask`, as it feels weird to override a getter method, added a specific method that allows to retrieve the Listener directly without needing to cast it. Made the getter and setter for the listener final in the base class.
- rename `SearchProgressListener#searchShards` methods to `buildSearchShards` and make it static given that it accesses no instance members
- make `SearchShard` and `SearchShardTask` classes final
* Move async search yaml tests to x-pack yaml test folder (#53537)
The yaml tests for async search currently sit in its qa folder. There is no reason though for them to live in a separate folder as they don't require particular setup. This commit moves them to the main folder together with the other x-pack yaml tests so that they will be run by the client test runners too.
* [DOCS] Add temporary redirect for async-search (#53454)
The following API spec files contain a link to a not-yet-created
async search docs page:
* [async_search.delete.json][0]
* [async_search.get.json][1]
* [async_search.submit.json][2]
The Elaticsearch-js client uses these spec files to create their docs.
This created a broken link in the Elaticsearch-js docs, which has broken
the docs build.
This PR adds a temporary redirect for the docs page. This redirect
should be removed when the actual API docs are added.
[0]: https://github.com/elastic/elasticsearch/blob/master/x-pack/plugin/src/test/resources/rest-api-spec/api/async_search.delete.json
[1]: https://github.com/elastic/elasticsearch/blob/master/x-pack/plugin/src/test/resources/rest-api-spec/api/async_search.get.json
[2]: https://github.com/elastic/elasticsearch/blob/master/x-pack/plugin/src/test/resources/rest-api-spec/api/async_search.submit.json
Co-authored-by: James Rodewig <james.rodewig@elastic.co>
I broke sorting aggregations by `doc_count` in #51271 by mixing up true
and false. This flips that comparison and adds a few tests to double
check that we don't so this again.
This begins to clean up how `PipelineAggregator`s and executed.
Previously, we would create the `PipelineAggregator`s on the data nodes
and embed them in the aggregation tree. When it came time to execute the
pipeline aggregation we'd use the `PipelineAggregator`s that were on the
first shard's results. This is inefficient because:
1. The data node needs to make the `PipelineAggregator` only to
serialize it and then throw it away.
2. The coordinating node needs to deserialize all of the
`PipelineAggregator`s even though it only needs one of them.
3. You end up with many `PipelineAggregator` instances when you only
really *need* one per pipeline.
4. `PipelineAggregator` needs to implement serialization.
This begins to undo these by building the `PipelineAggregator`s directly
on the coordinating node and using those instead of the
`PipelineAggregator`s in the aggregtion tree. In a follow up change
we'll stop serializing the `PipelineAggregator`s to node versions that
support this behavior. And, one day, we'll be able to remove
`PipelineAggregator` from the aggregation result tree entirely.
Importantly, this doesn't change how pipeline aggregations are declared
or parsed or requested. They are still part of the `AggregationBuilder`
tree because *that* makes sense.
This commit adjusts the _cat/indices and _cat/aliases APIs to allow
specifying indices options, so that these APIs can handle hidden
indices/aliases in the same way as other APIs.
Also adds the hidden option to the expand_wildcards parameter
in the YAML spec for every API that accepts it.
* New wildcard field optimised for wildcard queries (#49993)
Indexes values using size 3 ngrams and also stores the full original as a binary doc value.
Wildcard queries operate by using a cheap approximation query on the ngram field followed up by a more expensive verification query using an automaton on the binary doc values. Also supports aggregations and sorting.
Keyword field values with length more than ignore_above are not
indexed. But highlighters still were retrieving these values
from _source and were trying to highlight them. This sometimes lead to
errors if a field length exceeded max_analyzed_offset. But also this
is an overall wrong behaviour to attempt to highlight something that was
ignored during indexing.
This PR checks if a keyword value was ignored because of its length,
and if yes, skips highlighting it.
Backport: #53408Closes#43800
This change introduces a new API in x-pack basic that allows to track the progress of a search.
Users can submit an asynchronous search through a new endpoint called `_async_search` that
works exactly the same as the `_search` endpoint but instead of blocking and returning the final response when available, it returns a response after a provided `wait_for_completion` time.
````
GET my_index_pattern*/_async_search?wait_for_completion=100ms
{
"aggs": {
"date_histogram": {
"field": "@timestamp",
"fixed_interval": "1h"
}
}
}
````
If after 100ms the final response is not available, a `partial_response` is included in the body:
````
{
"id": "9N3J1m4BgyzUDzqgC15b",
"version": 1,
"is_running": true,
"is_partial": true,
"response": {
"_shards": {
"total": 100,
"successful": 5,
"failed": 0
},
"total_hits": {
"value": 1653433,
"relation": "eq"
},
"aggs": {
...
}
}
}
````
The partial response contains the total number of requested shards, the number of shards that successfully returned and the number of shards that failed.
It also contains the total hits as well as partial aggregations computed from the successful shards.
To continue to monitor the progress of the search users can call the get `_async_search` API like the following:
````
GET _async_search/9N3J1m4BgyzUDzqgC15b/?wait_for_completion=100ms
````
That returns a new response that can contain the same partial response than the previous call if the search didn't progress, in such case the returned `version`
should be the same. If new partial results are available, the version is incremented and the `partial_response` contains the updated progress.
Finally if the response is fully available while or after waiting for completion, the `partial_response` is replaced by a `response` section that contains the usual _search response:
````
{
"id": "9N3J1m4BgyzUDzqgC15b",
"version": 10,
"is_running": false,
"response": {
"is_partial": false,
...
}
}
````
Asynchronous search are stored in a restricted index called `.async-search` if they survive (still running) after the initial submit. Each request has a keep alive that defaults to 5 days but this value can be changed/updated any time:
`````
GET my_index_pattern*/_async_search?wait_for_completion=100ms&keep_alive=10d
`````
The default can be changed when submitting the search, the example above raises the default value for the search to `10d`.
`````
GET _async_search/9N3J1m4BgyzUDzqgC15b/?wait_for_completion=100ms&keep_alive=10d
`````
The time to live for a specific search can be extended when getting the progress/result. In the example above we extend the keep alive to 10 more days.
A background service that runs only on the node that holds the first primary shard of the `async-search` index is responsible for deleting the expired results. It runs every hour but the expiration is also checked by running queries (if they take longer than the keep_alive) and when getting a result.
Like a normal `_search`, if the http channel that is used to submit a request is closed before getting a response, the search is automatically cancelled. Note that this behavior is only for the submit API, subsequent GET requests will not cancel if they are closed.
Asynchronous search are not persistent, if the coordinator node crashes or is restarted during the search, the asynchronous search will stop. To know if the search is still running or not the response contains a field called `is_running` that indicates if the task is up or not. It is the responsibility of the user to resume an asynchronous search that didn't reach a final response by re-submitting the query. However final responses and failures are persisted in a system index that allows
to retrieve a response even if the task finishes.
````
DELETE _async_search/9N3J1m4BgyzUDzqgC15b
````
The response is also not stored if the initial submit action returns a final response. This allows to not add any overhead to queries that completes within the initial `wait_for_completion`.
The `.async-search` index is a restricted index (should be migrated to a system index in +8.0) that is accessible only through the async search APIs. These APIs also ensure that only the user that submitted the initial query can retrieve or delete the running search. Note that admins/superusers would still be able to cancel the search task through the task manager like any other tasks.
Relates #49091
Co-authored-by: Luca Cavanna <javanna@users.noreply.github.com>
Today the NodeConnectionsService emits a DEBUG-level log message each time it
calls TransportService#connectToNode, which happens for every node in the
cluster every ten seconds, and also at every cluster state update. That's a lot
of log messages. Most of these calls are no-ops and can be ignored, but if the
call was not a no-op then it may be worth investigating further. Since the logs
do not distinguish the interesting and uninteresting cases, they are not
useful.
This commit distinguishes the two cases and pushes the noisy logging for the
common no-op case down to TRACE level, leaving only useful and actionable
information in the DEBUG-level logs.
Previously, Term Vectors API was returning empty results for
artificial documents with keyword fields. Checking only for `string()`
on `IndexableField` is not enough, since for `KeywordFieldType`
`binaryValue()` must be used instead.
Fixes#53494
(cherry picked from commit 1fc3fe3d32f41eab2101c0536751b7c47e63cc48)
* Use snake case for nodes stats/info metric names (#53446)
The REST API uses "thread_pool" as the name of the thread pool metric.
If we use this name internally when we serialize nodes stats and info
requests, we won't need to do any fancy logic to check for and switch
out "threadPool", which was the previous internal name.
This commit fixes a bug on sorted queries with a primary sort field
that uses different types in the requested indices. In this scenario
the returned min/max values to sort the shards are not comparable so
we should avoid the sorting rather than throwing an obscure exception.
* Add ComponentTemplate to MetaData (#53290)
* Add ComponentTemplate to MetaData
This adds a `ComponentTemplate` datastructure that will be used as part of #53101 (Index Templates
v2) to the `MetaData` class. Currently there are no APIs for interacting with this class, so it will
always be an empty map (other than in tests). This infrastructure will be built upon to add APIs in
a subsequent commit.
A `ComponentTemplate` is made up of a `Template`, a version, and a MetaData.Custom class. The
`Template` contains similar information to an `IndexTemplateMetaData` object— settings, mappings,
and alias configuration.
* Update minimal supported version constant
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
This changes the `top_metrics` aggregation to return metrics in their
original type. Since it only supports numerics, that means that dates,
longs, and doubles will come back as stored, with their appropriate
formatter applied.
The spirit of StreamInput/StreamOutput is that common I/O patterns should
be handled by these classes so that the persistence methods in
application classes can be kept short, which facilitates easy visual
comparison between read and write methods, and reduces risks of having
serialization issues due to mismatched implementations.
To this end, this change adds readOptionalVLong and writeOptionalVLong
methods to these classes as we have started to build up cases where
that conditional/null logic has been implemented directly in the read &
write methods.
Co-authored-by: Tim Vernum <tim.vernum@elastic.co>
If an index was created in version 6 and contain a date field with a joda-style pattern it should still be allowed to search and insert document into it.
Those created in 6 but date pattern starts with 8, should be considered as java style.
It looks like `date_nanos` fields weren't likely to work properly in
composite aggs because composites iterate field values using points and
we weren't converting the points into milliseconds. Because the doc
values were coming back in milliseconds we ended up geting very confused
and just never collecting sub-aggregations.
This fixes that by adding a method to `DateFieldMapper.Resolution` to
`parsePointAsMillis` which is similarly in name and function to
`NumberFieldMapper.NumberType`'s `parsePoint` except that it normalizes
to milliseconds which is what aggs need at the moment.
Closes#53168
Using a Long alone is not strong enough for the id of search contexts
because we reset the id generator whenever a data node is restarted.
This can lead to two issues:
1. Fetch phase can fetch documents from another index
2. A scroll search can return documents from another index
This commit avoids these issues by adding a UUID to SearchContexId.
In rare circumstances it is possible for an isolated node to have a greater
term than the currently-elected leader. Today such a node will attempt to join
the cluster but will not offer a vote to the leader and will reject its cluster
state publications due to their stale term. This situation persists since there
is no mechanism for the joining node to inform the leader that its term is
stale and a new election is required.
This commit adds the current term of the joining node to the join request. Once
the join has been validated, the leader will perform another election to
increase its term far enough to allow the isolated node to join properly.
Fixes#53271
* Record Force Merges in live commit data
Prerequisite of #52182. Record force merges in the live commit data
so two shard states with the same sequence number that differ only in whether
or not they have been force merged can be distinguished when creating snapshots.
We can't always have the same segment stats and doc stats between
InternalEngine and ReadOnlyEngine if there are some fully deleted
segments. ReadOnlyEngine always filters out them. InternalEngine,
however, will keep them if peer recovery retention leases exist or the
number of the retaining operations is non-zero.
This change reverts the fix in #51331 and uses the wrapped reader to
calculate the segment stats and doc stats. For the test, we need to
disable the extra retaining soft-deletes operations.
Closes#51303
Prior to this commit, rollover did not propagate the `is_hidden` alias
property when rollover over an index. This commit ensures that an alias
that's rollover over will remain hidden.