Lucene 8.5.0 release candidates are imminent. This commit upgrades master to use
the latest snapshot to check that there are no last-minute bugs or regressions.
This commit changes the Get Aliases API to include hidden indices by
default - this is slightly different from other APIs, but is necessary
to make this API work intuitively.
* Add unit tests before refactoring
* Convert boolean fields to set of strings
In order to make nodes stats plugins pluggable, we need to make the
NodesStatsRequest class capable of carrying a flexible list of metrics
rather than a fixed list of boolean flags. This commit changes the
internal storage of the class without changing its serialization.
* Change serialization of NodesStatsRequest
* Set up BWC before merging
* Singularize enum name
Under certain circumstances SpanMultiTermQueryWrapper uses
SpanBooleanQueryRewriteWithMaxClause as its rewrite method, which in turn tries
to get a TermsEnum from the wrapped MultiTermQuery currently using a `null`
AttributeSource. While queries TermsQuery or subclasses of AutomatonQuery ignore
this argument, FuzzyQuery uses it to create a FuzzyTermsEnum which triggers an
NPE when the AttributeSource is not provided. This PR fixes this by supplying an
empty AttributeSource instead of a `null` value.
Closes#52894
Today when notifying a global checkpoint listener, we use the listener
thread pool. This commit turns this inside out so that the global
checkpoint listener must provide an executor on which to notify the
listener.
This commit drops the dispatching listenable action future that forks to
the listener thread pool. This was previously used in the transport
client but is no longer used.
It can be that a failure is repeated to a grouped action listener. For
example, if the same exception such as a connect transport exception, is
the cause of repeated failures. Previously we were unconditionally
self-suppressing the exception into the first exception, but
self-supressing is not allowed. Thus, we would throw an exception and
the grouped action listener would never complete. This commit addresses
this by guarding against self-suppression.
Today we notify refresh listeners by forking to the listener thread pool
and then serially notifying listeners on a thread there. Refreshes are
expensive though, so the expectation is that we are executing refreshes
on threads that can afford an expensive operation (e.g., not a network
thread) and as such, executing listeners that we expect to be cheap aon
the calling thread is okay. This commit removes the forking of notifying
refresh listeners to run directly on the calling thread that executed a
refresh.
This commit introduces hidden aliases. These are similar to hidden
indices, in that they are not visible by default, unless explicitly
specified by name or by indicating that hidden indices/aliases are
desired.
The new alias property, `is_hidden` is implemented similarly to
`is_write_index`, except that it must be consistent across all indices
with a given alias - that is, all indices with a given alias must
specify the alias as either hidden, or all specify it as non-hidden,
either explicitly or by omitting the `is_hidden` property.
Our lovely `BitArray` compactly stores "flags", lazilly growing its
underlying storage. It is super useful when you need to store one bit of
data for a zillion buckets or a documents or something. Usefully, it
defaults to `false`. But there is a wrinkle! If you ask it whether or
not a bit is set but it hasn't grown its underlying storage array
"around" that index then it'll throw an `ArrayIndexOutOfBoundsException`.
The per-document use cases tend to show up in order and don't tend to
mind this too much. But the use case in aggregations, the per-bucket use
case, does. Because buckets are collected out of order all the time.
This changes `BitArray` so it'll return `false` if the index is too big
for the underlying storage. After all, that index *can't* have been set
or else we would have grown the underlying array. Logically, I believe
this makes sense. And it makes my life easy. At the cost of three lines.
*but* this adds an extra test to every call to `get`. I think this is
likely ok because it is "very close" to an array index lookup that
already runs the same test. So I *think* it'll end up merged with the
array bounds check.
This commit updates the template used for watch history indices with
the hidden index setting so that new indices will be created as hidden.
Relates #50251
Backport of #52962
Currently the AbstractBulkByScrollRequest accepts slice values of 0 via its
`setSlices` method, denoting the "auto" slicing behaviour that is usable by
settting the "slices=auto" parameter on rest requests. When using the High Level
Rest Client, however, we send the 0 value as an integer, which is then rejected
as invalid by `AbstractBulkByScrollRequest#parseSlices`. Instead of making
parsing of the rest request more lenient, this PR opts for changing the
RequestConverter logic in the client to translate 0 values to "auto" on the rest
requests.
Closes#53044
For Node Info to be pluggable, NodesInfoRequest must be able to carry
arbitrary strings. This commit reworks the internals of that class to
use a set rather than hard-coded boolean fields.
NodesInfoRequest defaults to specifying all values. We test for
this behavior as we refactor and use random testing for the
various combinations of metrics.
Add backwards compatibility for transport requests.
With ExitableDirectoryReader in place, check for query cancellation
during QueryPhase#preProcess where the query rewriting takes place.
Follows: #52822
(cherry picked from commit 0d38626d8e6e9e2620a7a446b617a2ac42852461)
The bool query builder in elasticsearch accepts both must_not and mustNot
fields. Given that leniency is abhorrent and must be eschewed, we should deprecate
the latter as it doesn't fit with the style of parameters elsewhere in the DSL.
Use assertBusy when doing reroute after bridged disruption,
since it can return non-acked if a node is marked faulty
by follower check after disruption ended.
Closes#53064
7.5 and 7.6 had a regression that allowed for
script_score queries to have negative scores.
We have corrected this regression in #52478.
This is an addition to #52478 that adds
a test and release notes.
Implement an Exitable DirectoryReader that wraps the original
DirectoryReader so that when a search task is cancelled the
DirectoryReaders also stop their work fast. This is usuful for
expensive operations like wilcard/prefix queries where the
DirectoryReaders can spend lots of time and consume resources,
as previously their work wouldn't stop even though the original
search task was cancelled (e.g. because of timeout or dropped client
connection).
(cherry picked from commit 67acaf61f33bc5f54e26541514d07e375c202e03)
With #50871 aggrgations should now be parsed directly by an
`ObjectParser` or `ConstructingObjectParser` without the need for the
ceremonial `parse` method. This removes 9 of those `parse` methods and
parses the aggregation directly from their `ObjectParser`.
Currently the remote connection manager will delegate the size() call to
the underlying cluster connection manager. This introduces the
possibility that call will return 1 before the nodeConnection method has
been triggered to add the connection to the remote connection list. This
can cause issues, as the ensureConnected method checks the connection
managers size and executes synchronously if the size is > 0. This leads
to a potential cluster not connected exception while we are still
waiting for the connection opened callback to be triggered.
This commit fixes this issue by using the remote connection manager's
size to report the connection manager's size.
Fixes#52029.
This commit removes the hand-rolled x-content parsing logic from BoolQueryBuilder
and instead uses an ObjectParser to handle parsing. It also removes the long-deprecated
(since version 6) disable_coord parameter.
Converts the deprecations to `deprecatedAndMaybeLog` to reduce the
number of times we log deprecations, since some of these could be called
at a high frequency (due to unconverted queries, aggs, etc)
This commit introduces a module for Kibana that exposes REST APIs that
will be used by Kibana for access to its system indices. These APIs are wrapped
versions of the existing REST endpoints. A new setting is also introduced since
the Kibana system indices' names are allowed to be changed by a user in case
multiple instances of Kibana use the same instance of Elasticsearch.
Additionally, the ThreadContext has been extended to indicate that the use of
system indices may be allowed in a request. This will be built upon in the future
for the protection of system indices.
Backport of #52385
With #50871 aggrgations should now be parsed directly by an
`ObjectParser` or `ConstructingObjectParser` without the need for the
ceremonial `parse` method. This removes 10 of those `parse` methods and
parses the aggregation directly from their `ObjectParser`.
`MinAndMax` encapsulates min and max values for a shard. It uses generics to make sure that the values are of the same type and are also comparable. Though there are warnings whenever this class is currently used, which are addressed with this commit.
Relates to #49092
This field is a specialization of the `keyword` field for the case when all
documents have the same value. It typically performs more efficiently than
keywords at query time by figuring out whether all or none of the documents
match at rewrite time, like `term` queries on `_index`.
The name is up for discussion. I liked including `keyword` in it, so that we
still have room for a `singleton_numeric` in the future. However I'm unsure
whether to call it `singleton`, `constant` or something else, any opinions?
For this field there is a choice between
1. accepting values in `_source` when they are equal to the value configured
in mappings, but rejecting mapping updates
2. rejecting values in `_source` but then allowing updates to the value that
is configured in the mapping
This commit implements option 1, so that it is possible to reindex from/to an
index that has the field mapped as a keyword with no changes to the source.
Backport of #49713
When notifying global checkpoint listeners, we have an opportunity to
early return if there are not any registered listeners. This is
important since it saves some allocations, and also saves forking some
empty work to another thread. This commit adds an early return from
notifying listeners if there are not any registered.
Allow AffixSetting as validator dependencies. If a validator
specifies AffixSettings as a dependency, then `validate(T, Map)`
will have the concrete setting in a map.
Backport of: #52973, 1e0ba70
Fixes: #52933
This commit removes a TODO in the IndexNameExpressionResolver that
indicated the API should use a Set instead of a List. However, this
TODO was not completely correct since the ordering of arguments matters
due to negations when evaluating wildcards and since we also allow
a list of patterns like `*,-foo,*`, which would have a different
meaning even when using a Set with insertion ordering.
Relates #52788
Backport of #52963
Since version 8.4, `MMapDirectory` has an optimization to read long[]
arrays directly in little endian order, which postings leverage. So it'd
be more efficient to open postings with `MMapDirectory`.
I refactored a bit the existing logic to better explain why every listed
file extension is open with `mmap`.
Backport of #51233 to the seven dot x branch.
Tries to load a `Mapper` instance for the mapping snippet of a dynamic template.
This should catch things like using an analyzer that is undefined or mapping attributes that are unused.
This is best effort:
* If `{{name}}` placeholder is used in the mapping snippet then validation is skipped.
* If `match_mapping_type` is not specified then validation is performed for all mapping types.
If parsing succeeds with a single mapping type then this the dynamic mapping is considered valid.
If is detected that a dynamic template mapping snippet is invalid at mapping update time then the mapping update is failed for indices created on 8.0.0-alpha1 and later. For indices created on prior version a deprecation warning is omitted instead. In 7.x clusters the mapping update will never fail in case of an invalid dynamic template mapping snippet and a deprecation warning will always be omitted.
Closes#17411Closes#24419
Co-authored-by: Adrien Grand <jpountz@gmail.com>
The `terms` aggregation can be sortd by the results of its
sub-aggregations. Because it uses that sorting for filtering to the
top-n it tries not to construct all of the buckets for the child
aggregations. This has its own interesting problem around reduction, but
they aren't super relevant to this change. This change moves that
optimization from the `TermsAggregator` and into the aggregators being
sorted on. This should make it more clear what is going on and it
unifies this optimization with validating the sort.
Finally, this should enable some minor optimizations to save a few
comparisons when sorting multi-valued buckets. I'll get those in a
follow up because they are now *fairly* obvious. They probably won't be
a huge performance improvement, but it'll be nice anyway.
Add index name(s) into the source for the cluster state update done when putting mapping.
This ensures that the pending tasks API includes information on source indices.
Computing the stats for completion fields may involve a significant amount of
work since it walks every field of every segment looking for completion fields.
Innocuous-looking APIs like `GET _stats` or `GET _cluster/stats` do this for
every shard in the cluster. This repeated work is unnecessary since these stats
do not change between refreshes; in many indices they remain constant for a
long time.
This commit introduces a cache for these stats which is invalidated on a
refresh, allowing most stats calls to bypass the work needed to compute them on
most shards.
Closes#51915
Backport of #51991
We aren't able to reproduce or figure out the reason that failed this test.
This commit adds more assertions so we can narrow the scope.
Relates #52223
Since #51905, we use the local checkpoint of the safe commit to
calculate the number of uncommitted operations of a translog stats. If a
periodic flush triggered by afterWriteOperation completes before we sync
translog, then the last commit is not safe. We also need to sync
translog from Engine instead of the translog so that we can advance the
safe commit.
Relates #51905Closes#52223
Since #51905, we skip translog recovery if the local checkpoint of the
safe commit equals to the global checkpoint. This change adjusts the
test not to create a new snapshot in that case.
Closes#52221
Relates #51905
Separates the translog from the index deletion conditions (allowing the translog to be cleaned
up more eagerly), and avoids taking the write lock on the translog if no clean-up is actually
necessary.
Today we use the translog_generation of the safe commit as the minimum
required translog generation for recovery. This approach has a
limitation, where we won't be able to clean up translog unless we flush.
Reopening an already recovered engine will create a new empty translog,
and we leave it there until we force flush.
This commit removes the translog_generation commit tag and uses the
local checkpoint of the safe commit to calculate the minimum required
translog generation for recovery instead.
Closes#49970
This commit changes the `index.hidden` setting from being final to a
dynamic setting. While the setting being final allows for easier
reasoning about an index, making this setting update-able has more
benefits in that we can upgrade existing indices to be hidden and it
will enable future features that would dynamically make indices hidden.
Backport of #52772
We've pretty well settled on `ContextParser` for a generic interface to
`ObjectParser`-like-things. This switches the interface used for
building parsing pipeline aggregations to `ContextParser` which saves a
couple of little wrappers around `ObjectParser`.
Currently 3 remote cluster settings (ping interval, skip unavailable,
and compression) have a dependency on the seeds setting being
comfigured. With proxy mode, it is now possible that these settings the
seeds setting has not been configured. This commit removes this
dependency and adds new validation for these settings.
Generalize how queries on `_index` are handled at rewrite time (#52486)
Since this change refactors rewrites, I also took it as an opportunity to adrress #49254: instead of returning the same queries you would get on a keyword field when a field is unmapped, queries get rewritten to a MatchNoDocsQueryBuilder.
This change exposed a couple bugs, like the fact that the percolator doesn't rewrite queries at query time, or that the significant_terms aggregation doesn't rewrite its inner filter, which I fixed.
Closes#49254
Currently we have two ways to create a GroupShardsIterator: one that will resort the iterators based on their natural ordering, and another one that will leave them in their original order. This is currently done through two constructors, one that accepts a single argument which does the sorting, and another which accepts a second boolean argument to control whether sorting should happen or not. This second constructor is only called externally to disable the sorting.
By introducing a specific method to create a sorted shard iterator we clarify and make it easier to track when we do sort and when we do not as the iterators are externally sorted.
This change fixes the incomplete backport of #46731 in 7.x (as of 7.5).
We now check if `max_children` is set on the top level nested sort and fails with an
exception if it's not the case.
Relates #46731Closes#52202
* Remove TODO in MaxAgeCondition serialization
This removes the TODO with a message for any future readers regarding the code in question.
Resolves#52505
Currently the shard bulk request can be rejected by the write threadpool
after a mapping update. This introduces a scenario where the mapping
listener thread will attempt to finish the request and fsync. This
thread can potentially be a transport thread. This commit fixes this
issue by forcing the finish action to happen on the write threadpool.
Fixes#51904.
Lucene's RAMDirectory has been deprecated. This commit replaces all uses of
RAMDirectory in elasticsearch with the newer ByteBuffersDirectory. Most uses
are in tests, but the percolator and painless executor may get some small speedups.
Currently, date ranges queries using NOW-based date math are rewritten to
MatchAllDocs queries when being preprocessed for the percolator. However,
since we added the verification step, this can result in incorrect matches when
percolator queries are run without scores. This commit changes things to instead
wrap date queries that use NOW with a new DateRangeIncludingNowQuery.
This is a simple wrapper query that returns its delegate at rewrite time, but it can
be detected by the percolator QueryAnalyzer and be dealt with accordingly.
This also allows us to remove a method on QueryRewriteContext, and push all
logic relating to NOW-based ranges into the DateFieldMapper.
Fixes#52617
When the Node class is being constructed, an initial environment is
passed in with the initial settings for the node. Once the plugin
servicie is initialized, the final Environment+Settings are created, at
which point the initial environment should no longer be used. This
commit renames the constructor arg to avoid naming clashes with the
final environment variable.
Before boost in script_score query was wrongly applied only to the subquery.
This commit makes sure that the boost is applied to the whole score
that comes out of script.
Closes#48465
In #42838 we moved the terms index of all fields off-heap except the
`_id` field because we were worried it might make indexing slower. In
general, the indexing rate is only affected if explicit IDs are used, as
otherwise Elasticsearch almost never performs lookups in the terms
dictionary for the purpose of indexing. So it's quite wasteful to
require the terms index of `_id` to be loaded on-heap for users who have
append-only workloads. Furthermore I've been conducting benchmarks when
indexing with explicit ids on the http_logs dataset that suggest that
the slowdown is low enough that it's probably not worth forcing the terms
index to be kept on-heap. Here are some numbers for the median indexing
rate in docs/s:
| Run | Master | Patch |
| --- | ------- | ------- |
| 1 | 45851.2 | 46401.4 |
| 2 | 45192.6 | 44561.0 |
| 3 | 45635.2 | 44137.0 |
| 4 | 46435.0 | 44692.8 |
| 5 | 45829.0 | 44949.0 |
And now heap usage in MB for segments:
| Run | Master | Patch |
| --- | ------- | -------- |
| 1 | 41.1720 | 0.352083 |
| 2 | 45.1545 | 0.382534 |
| 3 | 41.7746 | 0.381285 |
| 4 | 45.3673 | 0.412737 |
| 5 | 45.4616 | 0.375063 |
Indexing rate decreased by 1.8% on average, while memory usage decreased
by more than 100x.
The `http_logs` dataset contains small documents and has a simple
indexing chain. More complex indexing chains, e.g. with more fields,
ingest pipelines, etc. would see an even lower decrease of indexing rate.
This drops more of the `instanceof`s from `AggregationPath`. There are
still a couple in `AggregationPath`. And I ended up moving two into
`BucketsAggregator`, but I think this is still an improvement!
We consider index level read_only_allow_delete blocks temporary since
the DiskThresholdMonitor can automatically release those when an index
is no longer allocated on nodes above high threshold.
The rest status has therefore been changed to 429 when encountering this
index block to signal retryability to clients.
Related to #49393
This commit renames ElasticsearchAssertions#assertThrows to
assertRequestBuilderThrows and assertFutureThrows to avoid a
naming clash with JUnit 4.13+ and static imports of these methods.
Additionally, these methods have been updated to make use of
expectThrows internally to avoid duplicating the logic there.
Relates #51787
Backport of #52582
Phase 1 of adding compilation limits per context.
* Refactor rate limiting and caching into separate class,
`ScriptCache`, which will be used per context.
* Disable compilation limit for certain tests.
Backport of 0866031
Refs: #50152
This commit modifies the codebase so that our production code uses a
single instance of the IndexNameExpressionResolver class. This change
is being made in preparation for allowing name expression resolution
to be augmented by a plugin.
In order to remove some instances of IndexNameExpressionResolver, the
single instance is added as a parameter of Plugin#createComponents and
PersistentTaskPlugin#getPersistentTasksExecutor.
Backport of #52596
Cache latest `RepositoryData` on heap when it's absolutely safe to do so (i.e. when the repository is in strictly consistent mode).
`RepositoryData` can safely be assumed to not grow to a size that would cause trouble because we often have at least two copies of it loaded at the same time when doing repository operations. Also, concurrent snapshot API status requests currently load it independently of each other and so on, making it safe to cache on heap and assume as "small" IMO.
The benefits of this move are:
* Much faster repository status API calls
* listing all snapshot names becomes instant
* Other operations are sped up massively too because they mostly operate in two steps: load repository data then load multiple other blobs to get the additional data
* Additional cloud cost savings
* Better resiliency, saving another spot where an IO issue could break the snapshot
* We can simplify a number of spots in the current code that currently pass around the repository data in tricky ways to avoid loading it multiple times in follow ups.
* Refactor Inflexible Snapshot Repository BwC (#52365)
Transport the version to use for a snapshot instead of whether to use shard generations in the snapshots in progress entry. This allows making upcoming repository metadata changes in a flexible manner in an analogous way to how we handle serialization BwC elsewhere.
Also, exposing the version at the repository API level will make it easier to do BwC relevant changes in derived repositories like source only or encrypted.
AllocationDeciders would collect Yes decisions when not asking for debug
info. Changed to only include Yes decisions when debug is requested
(explain).
Added ability to specify comma separated list of source indices without
array. Also fixed so that empty string results in validation error
rather than index does not exist.
Closes#51949
Issue #52000 looks like a case of cluster state updates being slower than
expected, but it seems that these slowdowns are relatively rare: most
invocations of `testDelayWithALargeAmountOfShards` take well under a minute in
CI, but there are occasional failures that take 6+ minutes instead. When it
fails like this, cluster state persistence seems generally slow: most are
slower than expected, with some small updates even taking over 2 seconds to
complete.
The failures all have in common that they use `WindowsFS` to emulate Windows'
behaviour of refusing to delete files that are still open, by tracking all
files (really, inodes) and validating that deleted files are really closed
first. There is a suggestion that this is a little slow in the Lucene test
framework [1]. To see if we can attribute the slowdown to that common factor,
this commit suppresses the use of `WindowsFS` for this test suite.
[1] 4a513fa99f/lucene/test-framework/src/java/org/apache/lucene/util/TestRuleTemporaryFilesCleanup.java (L166)
Currently we lock when generating time based uuids. The lock is
implemented to prevent concurrent writes to the last timestamp. The uuid
generation is an area of contention when indexing. This commit modifies
the code to use atomic compare and set operations to update the last
timestamp.
Every time a setting#exist call is made we lock on the keyset to ensure
that it has been initialized. This a heavyweight operation that only
should be done once. This commit moves to a volatile read instead to
prevent unnecessary locking.