* Record Force Merges in live commit data
Prerequisite of #52182. Record force merges in the live commit data
so two shard states with the same sequence number that differ only in whether
or not they have been force merged can be distinguished when creating snapshots.
We can't always have the same segment stats and doc stats between
InternalEngine and ReadOnlyEngine if there are some fully deleted
segments. ReadOnlyEngine always filters them out. InternalEngine,
however, will keep them if peer recovery retention leases exist or the
number of the retaining operations is non-zero.
This change reverts the fix in #51331 and uses the wrapped reader to
calculate the segment stats and doc stats. For the test, we need to
disable the extra retaining soft-deletes operations.
Closes #51303
This avoids NPE when executing SLM policy when no config was provided.
Related to #44465
Closes #53171
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Prior to this commit, rollover did not propagate the `is_hidden` alias
property when rolling over an index. This commit ensures that an alias
that's rolled over will remain hidden.
This removes the `instanceof`s from `SiblingPipelineAggregator` by
adding a `rewriteBuckets` method to `InternalAggregation` that can be
called to, well, rewrite the buckets. The default implementation of
`rewriteBuckets` throws the same exception that was thrown when you
attempted to run a `SiblingPipelineAggregator` on an aggregation without
buckets. It is overridden by `InternalSingleBucketAggregation` and
`InternalMultiBucketAggregation` to correctly rewrite their buckets.
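The shape of this change can be sketched as below. This is a simplified illustration, not the actual Elasticsearch code: the `Sketch*` classes, the `List<String>` bucket representation, and the choice of `IllegalStateException` are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical, simplified stand-in for InternalAggregation.
abstract class SketchAggregation {
    final String name;

    SketchAggregation(String name) {
        this.name = name;
    }

    // Default: an aggregation without buckets cannot have them rewritten.
    // The exception type here is a placeholder, not the real one.
    SketchAggregation rewriteBuckets(UnaryOperator<List<String>> rewriter) {
        throw new IllegalStateException("Aggregation [" + name + "] has no buckets to rewrite");
    }
}

// Hypothetical metric aggregation: inherits the throwing default.
final class SketchMetric extends SketchAggregation {
    SketchMetric(String name) {
        super(name);
    }
}

// Hypothetical bucket aggregation: overrides the default and rewrites its buckets.
final class SketchMultiBucket extends SketchAggregation {
    final List<String> buckets;

    SketchMultiBucket(String name, List<String> buckets) {
        super(name);
        this.buckets = new ArrayList<>(buckets);
    }

    @Override
    SketchAggregation rewriteBuckets(UnaryOperator<List<String>> rewriter) {
        return new SketchMultiBucket(name, rewriter.apply(buckets));
    }
}
```

A caller can then invoke `rewriteBuckets` on any aggregation without checking its concrete type first; the type-specific behavior lives in the overrides.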
When a composite aggregation is run against an index with a sort that
*starts* with the "source" fields from the composite but has additional
fields it'd blow up while trying to decide if it could use the sort.
This changes it to decide that it *can* use the sort.
Closes #52480
This change optimizes the merge of terms aggregations by removing
the priority queue that was used to collect all the buckets during
a non-final reduction. We don't need to keep the result sorted since
the merge of buckets in a subsequent reduce can modify the order.
I wrote a small micro-benchmark to test the change and the speed ups
are significant for small merge buffer sizes:
````
########## Master:
Benchmark (bufferSize) (cardinality) (numShards) (topNSize) Mode Cnt Score Error Units
TermsReduceBenchmark.reduceTopHits 5 10000 1000 1000 avgt 10 2459,690 ± 198,682 ms/op
TermsReduceBenchmark.reduceTopHits 16 10000 1000 1000 avgt 10 1030,620 ± 91,544 ms/op
TermsReduceBenchmark.reduceTopHits 32 10000 1000 1000 avgt 10 558,608 ± 44,915 ms/op
TermsReduceBenchmark.reduceTopHits 128 10000 1000 1000 avgt 10 287,333 ± 8,342 ms/op
TermsReduceBenchmark.reduceTopHits 512 10000 1000 1000 avgt 10 257,325 ± 54,515 ms/op
########## Patch:
Benchmark (bufferSize) (cardinality) (numShards) (topNSize) Mode Cnt Score Error Units
TermsReduceBenchmark.reduceTopHits 5 10000 1000 1000 avgt 10 805,611 ± 14,630 ms/op
TermsReduceBenchmark.reduceTopHits 16 10000 1000 1000 avgt 10 378,851 ± 17,929 ms/op
TermsReduceBenchmark.reduceTopHits 32 10000 1000 1000 avgt 10 261,094 ± 10,176 ms/op
TermsReduceBenchmark.reduceTopHits 128 10000 1000 1000 avgt 10 241,051 ± 19,558 ms/op
TermsReduceBenchmark.reduceTopHits 512 10000 1000 1000 avgt 10 231,643 ± 6,170 ms/op
````
The code for the benchmark can be found [here](). It seems to be up to 3x faster for terms aggregations
that return 10,000 unique terms (1000 terms per shard). For a cardinality of 100,000 terms, this patch is up to 5x faster:
````
########## Patch:
Benchmark (bufferSize) (cardinality) (numShards) (topNSize) Mode Cnt Score Error Units
TermsReduceBenchmark.reduceTopHits 5 100000 1000 1000 avgt 10 12791,083 ± 397,128 ms/op
TermsReduceBenchmark.reduceTopHits 16 100000 1000 1000 avgt 10 3974,939 ± 324,617 ms/op
TermsReduceBenchmark.reduceTopHits 32 100000 1000 1000 avgt 10 2186,285 ± 267,124 ms/op
TermsReduceBenchmark.reduceTopHits 128 100000 1000 1000 avgt 10 914,657 ± 160,784 ms/op
TermsReduceBenchmark.reduceTopHits 512 100000 1000 1000 avgt 10 604,198 ± 145,457 ms/op
########## Master:
Benchmark (bufferSize) (cardinality) (numShards) (topNSize) Mode Cnt Score Error Units
TermsReduceBenchmark.reduceTopHits 5 100000 1000 1000 avgt 10 60696,107 ± 929,944 ms/op
TermsReduceBenchmark.reduceTopHits 16 100000 1000 1000 avgt 10 16292,894 ± 783,398 ms/op
TermsReduceBenchmark.reduceTopHits 32 100000 1000 1000 avgt 10 7705,444 ± 77,588 ms/op
TermsReduceBenchmark.reduceTopHits 128 100000 1000 1000 avgt 10 2156,685 ± 88,795 ms/op
TermsReduceBenchmark.reduceTopHits 512 100000 1000 1000 avgt 10 760,273 ± 53,738 ms/op
````
The merge of buckets can also be optimized. Currently we use a hash map to merge buckets coming from different shards, so this can be costly if the number of unique terms is high. Instead, we could always sort the shard terms result by key and perform a merge sort to reduce the results. This would save memory and make the merge more linear in terms
of complexity in the coordinating node at the expense of an additional sort in the shards. I plan to test this possible optimization in a follow-up.
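To make the proposed follow-up concrete, here is a rough sketch of a k-way merge over shard results that are already sorted by key. It is not part of this change and has not been benchmarked; `TermBucket` and `SortedTermsMergeSketch` are hypothetical names. The small heap below holds one cursor per shard, which is unrelated to the bucket priority queue removed by this patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical bucket: a term key plus its document count.
final class TermBucket {
    final String key;
    final long docCount;

    TermBucket(String key, long docCount) {
        this.key = key;
        this.docCount = docCount;
    }
}

final class SortedTermsMergeSketch {
    // Cursor over one shard's buckets, which must already be sorted by key.
    private static final class Cursor {
        final List<TermBucket> buckets;
        int pos;

        Cursor(List<TermBucket> buckets) {
            this.buckets = buckets;
        }

        TermBucket current() {
            return buckets.get(pos);
        }
    }

    // Merges the sorted shard lists, summing doc counts of equal keys.
    static List<TermBucket> merge(List<List<TermBucket>> shards) {
        PriorityQueue<Cursor> queue =
            new PriorityQueue<>((a, b) -> a.current().key.compareTo(b.current().key));
        for (List<TermBucket> shard : shards) {
            if (!shard.isEmpty()) {
                queue.add(new Cursor(shard));
            }
        }
        List<TermBucket> merged = new ArrayList<>();
        while (!queue.isEmpty()) {
            Cursor cursor = queue.poll();
            TermBucket next = cursor.current();
            int last = merged.size() - 1;
            if (last >= 0 && merged.get(last).key.equals(next.key)) {
                // Same key as the previous output bucket: combine the counts.
                long sum = merged.get(last).docCount + next.docCount;
                merged.set(last, new TermBucket(next.key, sum));
            } else {
                merged.add(next);
            }
            // Advance the cursor and re-queue it if the shard has more buckets.
            cursor.pos++;
            if (cursor.pos < cursor.buckets.size()) {
                queue.add(cursor);
            }
        }
        return merged;
    }
}
```

Because equal keys always arrive back to back, the merge only ever compares against the last emitted bucket and needs no hash map.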
Relates #51857
It doesn't make a whole lot of sense for `BitArray#clear` to grow the
underlying storage array just to clear the bit. We *already* treat
indices outside of the storage array as unset. This turns such
operations into a noop.
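A minimal sketch of the idea, assuming a plain `long[]` as backing storage (the real `BitArray` uses a different storage type; `BitArrayClearSketch` and `capacityInWords` are hypothetical names for illustration):

```java
import java.util.Arrays;

final class BitArrayClearSketch {
    private long[] bits = new long[1];

    void set(int index) {
        int word = index >> 6;
        if (word >= bits.length) {
            // Setting a bit is the only operation that needs to grow the storage.
            bits = Arrays.copyOf(bits, Math.max(word + 1, bits.length * 2));
        }
        bits[word] |= 1L << index;
    }

    void clear(int index) {
        int word = index >> 6;
        if (word >= bits.length) {
            // Bits outside the storage are already treated as unset: nothing to do.
            return;
        }
        bits[word] &= ~(1L << index);
    }

    boolean get(int index) {
        int word = index >> 6;
        return word < bits.length && (bits[word] & (1L << index)) != 0;
    }

    // Exposed only so the sketch can show that clear() does not grow the array.
    int capacityInWords() {
        return bits.length;
    }
}
```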
A previous change (#53029) is causing analysis jobs to wait for certain indices to be made available. While it is good for jobs to wait, they could fail early on _start.
This change will cause the persistent task to continually retry node assignment when the failure is due to shards not being available.
If the shards are not available by the time `timeout` is reached by the predicate, it is treated as a _start failure and the task is canceled.
For tasks seeking a new assignment after a node failure, that behavior is unchanged.
Closes #53188
Updates the SVG for a token graph to make the layout consistent with
other graphs. This means moving the text directly above the edge lines.
Previously, the text was above the edge line.
Lucene 8.5.0 release candidates are imminent. This commit upgrades master to use
the latest snapshot to check that there are no last-minute bugs or regressions.
Adds a tip admonition to the basic example in the EQL search docs.
This tip lets users know they can set up a Beat to automatically
index data in ES, rather than manually indexing using the bulk or index
APIs.
Documents the `nodes` response parameters returned by the
`_cluster/stats` API.
Also adds collapsible attributes for the `indices` and `nodes`
sections.
A freeze operation can partially fail in multiple places, including the
close verification step. This left the index in an unfrozen but
partially closed state. Now throw an exception to retry the freeze step
instead.
This commit changes the Get Aliases API to include hidden indices by
default - this is slightly different from other APIs, but is necessary
to make this API work intuitively.
* Add unit tests before refactoring
* Convert boolean fields to set of strings
In order to make nodes stats plugins pluggable, we need to make the
NodesStatsRequest class capable of carrying a flexible list of metrics
rather than a fixed list of boolean flags. This commit changes the
internal storage of the class without changing its serialization.
* Change serialization of NodesStatsRequest
* Set up BWC before merging
* Singularize enum name
* Avoid race condition in ILMHistoryStore (#53039)
* Avoid race condition in ILMHistoryStore
This change modifies ILMHistoryStore to always apply correct settings and mappings,
even if template is deleted and not yet recreated. This ensures that ILM history index
is correctly managed by ILM and also fixes flaky history tests that were prone to
triggering this race.
This commit also refactors and simplifies ILM history tests.
Closes #50353 and #52853
* Review comment
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
* fixed tests
* backport #53306
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
This commit adds a new request object field, "version", containing the version of the requesting client. This parameter is now accepted by the server, and required for certain clients, and the request is validated against it. Currently, the server's and client's versions still need to be equal for the request to be accepted. Relaxing this check is going to be part of future work.
On the clients' side, the only check remaining is to ensure that the peer server is supporting version backwards compatibility (i.e. is on, or newer than a certain release).
(cherry picked from commit a8f413a20fb023bec83af0de1211a2936a7f558c)
Update CommonEqlRestTestCase code to simplify making changes as requested.
Update EqlActionIT to simplify the test code as requested.
Replace Jackson parser with XContent in EqlActionIT.
Whitelist more EQL tests specs that are now supported.
This commit moves the global checkpoint listeners used in CCR to the CCR
thread pool. This removes the last use of the listener thread pool in
the codebase.
Under certain circumstances SpanMultiTermQueryWrapper uses
SpanBooleanQueryRewriteWithMaxClause as its rewrite method, which in turn tries
to get a TermsEnum from the wrapped MultiTermQuery currently using a `null`
AttributeSource. While queries such as TermsQuery or subclasses of AutomatonQuery ignore
this argument, FuzzyQuery uses it to create a FuzzyTermsEnum which triggers an
NPE when the AttributeSource is not provided. This PR fixes this by supplying an
empty AttributeSource instead of a `null` value.
Closes #52894
Makes the following changes to the `word_delimiter_graph` token filter
docs:
* Updates the Lucene experimental admonition.
* Updates description
* Adds analyze snippet
* Adds custom analyzer and custom filter snippets
* Reorganizes and updates parameter list
* Expands and updates section re: differences between `word_delimiter`
and `word_delimiter_graph`
This check was added as part of: 0f2d26bdca
Checking this before the test starts makes more sense, because
the watches index has then also been removed.
Relates to #53177
Today when notifying a global checkpoint listener, we use the listener
thread pool. This commit turns this inside out so that the global
checkpoint listener must provide an executor on which to notify the
listener.
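As a rough illustration of "turning this inside out", the sketch below has each listener supply its own executor, and the notifier simply dispatches to it. The interface and class names are hypothetical and much simpler than the real global checkpoint listener code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executor;

// Hypothetical listener shape: the listener itself says where it wants to run.
interface CheckpointListenerSketch {
    Executor executor();

    void accept(long globalCheckpoint);
}

// Hypothetical notifier: it owns no thread pool of its own.
final class CheckpointNotifierSketch {
    private final List<CheckpointListenerSketch> listeners = new ArrayList<>();

    synchronized void add(CheckpointListenerSketch listener) {
        listeners.add(listener);
    }

    synchronized void notifyListeners(long globalCheckpoint) {
        for (CheckpointListenerSketch listener : listeners) {
            // Dispatch on the executor chosen by the listener, not on a shared pool.
            listener.executor().execute(() -> listener.accept(globalCheckpoint));
        }
    }
}
```

With this shape, a caller such as CCR can pass its own thread pool's executor, and a cheap listener can pass a same-thread executor such as `Runnable::run`.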
This commit drops the dispatching listenable action future that forks to
the listener thread pool. This was previously used in the transport
client but is no longer used.
It can be that a failure is repeated to a grouped action listener. For
example, if the same exception such as a connect transport exception, is
the cause of repeated failures. Previously we were unconditionally
self-suppressing the exception into the first exception, but
self-suppressing is not allowed. Thus, we would throw an exception and
the grouped action listener would never complete. This commit addresses
this by guarding against self-suppression.
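The guard can be sketched as follows. `Throwable#addSuppressed` throws an `IllegalArgumentException` when an exception is asked to suppress itself, so the collector below checks identity first. `FirstFailureCollector` is a hypothetical, simplified stand-in for the grouped action listener's failure handling.

```java
// Hypothetical collector that keeps the first failure and attaches later ones to it.
final class FirstFailureCollector {
    private Exception first;

    synchronized void onFailure(Exception e) {
        if (first == null) {
            first = e;
        } else if (first != e) {
            // Only suppress distinct exception instances; first.addSuppressed(first)
            // would throw IllegalArgumentException.
            first.addSuppressed(e);
        }
    }

    synchronized Exception failure() {
        return first;
    }
}
```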
Today we notify refresh listeners by forking to the listener thread pool
and then serially notifying listeners on a thread there. Refreshes are
expensive though, so the expectation is that we are executing refreshes
on threads that can afford an expensive operation (e.g., not a network
thread) and as such, executing listeners that we expect to be cheap on
the calling thread is okay. This commit removes the fork so that refresh
listeners are notified directly on the calling thread that executed the
refresh.
This commit introduces hidden aliases. These are similar to hidden
indices, in that they are not visible by default, unless explicitly
specified by name or by indicating that hidden indices/aliases are
desired.
The new alias property, `is_hidden` is implemented similarly to
`is_write_index`, except that it must be consistent across all indices
with a given alias - that is, all indices with a given alias must
specify the alias as either hidden, or all specify it as non-hidden,
either explicitly or by omitting the `is_hidden` property.
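The consistency rule can be illustrated with a small check. This is a sketch under assumptions, not the actual validation code: `HiddenAliasCheckSketch` is a hypothetical name, and each element of the input represents the `is_hidden` value one index specifies for the alias, with `null` meaning the property was omitted.

```java
import java.util.Collection;

final class HiddenAliasCheckSketch {
    // Returns true if every index agrees on whether the alias is hidden.
    // An omitted (null) is_hidden counts as non-hidden.
    static boolean isConsistent(Collection<Boolean> isHiddenPerIndex) {
        Boolean expected = null;
        for (Boolean value : isHiddenPerIndex) {
            boolean hidden = Boolean.TRUE.equals(value);
            if (expected == null) {
                expected = hidden;
            } else if (expected != hidden) {
                return false;
            }
        }
        return true;
    }
}
```

Under this sketch, mixing an explicit `false` with an omitted property is consistent, while mixing `true` with either of them is not.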
Our lovely `BitArray` compactly stores "flags", lazily growing its
underlying storage. It is super useful when you need to store one bit of
data for a zillion buckets or documents or something. Usefully, it
defaults to `false`. But there is a wrinkle! If you ask it whether or
not a bit is set but it hasn't grown its underlying storage array
"around" that index then it'll throw an `ArrayIndexOutOfBoundsException`.
The per-document use cases tend to show up in order and don't tend to
mind this too much. But the use case in aggregations, the per-bucket use
case, does. Because buckets are collected out of order all the time.
This changes `BitArray` so it'll return `false` if the index is too big
for the underlying storage. After all, that index *can't* have been set
or else we would have grown the underlying array. Logically, I believe
this makes sense. And it makes my life easy. At the cost of three lines.
*But* this adds an extra test to every call to `get`. I think this is
likely ok because it is "very close" to an array index lookup that
already runs the same test. So I *think* it'll end up merged with the
array bounds check.
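The extra test in `get` looks roughly like the sketch below. It assumes a plain `long[]` for storage, which is a simplification of the real `BitArray`; `BitArrayGetSketch` is a hypothetical name.

```java
import java.util.Arrays;

final class BitArrayGetSketch {
    private long[] bits = new long[1];

    void set(int index) {
        int word = index >> 6;
        if (word >= bits.length) {
            // Grow lazily so that the word holding this bit exists.
            bits = Arrays.copyOf(bits, Math.max(word + 1, bits.length * 2));
        }
        bits[word] |= 1L << index;
    }

    boolean get(int index) {
        int word = index >> 6;
        if (word >= bits.length) {
            // The storage never grew this far, so the bit cannot have been set.
            return false;
        }
        return (bits[word] & (1L << index)) != 0;
    }
}
```

The added comparison sits right next to the array access, which performs its own bounds check, so the hope expressed above is that the two checks end up merged.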