Hotfix to not run into stuck snapshots because of master circuit breaking these requests.
Given that these requests are very small and much of the memory associated with them is already allocated
when the circuit breaker kicks in, the risk of this change introducing a higher chance of master running out
of memory should be very small.
Closes#54714
We rewrite more query builders to MatchNoneQueryBuilders now, which are always
cacheable. We should make sure the tests expects this when the rewritten query
is a MatchNoneQueryBuilder.
Closes#55331
Backport from: #54726
The INCLUDE_DATA_STREAMS indices option controls whether data streams can be resolved in an api for both concrete names and wildcard expressions. If data streams cannot be resolved then a 400 error is returned indicating that data streams cannot be used.
In this pr, the INCLUDE_DATA_STREAMS indices option is enabled in the following APIs: search, msearch, refresh, index (op_type create only) and bulk (index requests with op type create only). In a subsequent later change, we will determine which other APIs need to be able to resolve data streams and enable the INCLUDE_DATA_STREAMS indices option for these APIs.
Whether an api resolve all backing indices of a data stream or the latest index of a data stream (write index) depends on the IndexNameExpressionResolver.Context.isResolveToWriteIndex().
If isResolveToWriteIndex() returns true then data streams resolve to the latest index (for example: index api) and otherwise a data stream resolves to all backing indices of a data stream (for example: search api).
Relates to #53100
* Add ValuesSource Registry and associated logic (#54281)
* Remove ValuesSourceType argument to ValuesSourceAggregationBuilder (#48638)
* ValuesSourceRegistry Prototype (#48758)
* Remove generics from ValuesSource related classes (#49606)
* fix percentile aggregation tests (#50712)
* Basic thread safety for ValuesSourceRegistry (#50340)
* Remove target value type from ValuesSourceAggregationBuilder (#49943)
* Cleanup default values source type (#50992)
* CoreValuesSourceType no longer implements Writable (#51276)
* Remove genereics & hard coded ValuesSource references from Matrix Stats (#51131)
* Put values source types on fields (#51503)
* Remove VST Any (#51539)
* Rewire terms agg to use new VS registry (#51182)
Also adds some basic AggTestCases for untested code
paths (and boilerplate for future tests once the IT are
converted over)
* Wire Cardinality aggregation to work with the ValuesSourceRegistry (#51337)
* Wire Percentiles aggregator into new VS framework (#51639)
This required a bit of a refactor to percentiles itself. Before,
the Builder would switch on the chosen algo to generate an
algo-specific factory. This doesn't work (or at least, would be
difficult) in the new VS framework.
This refactor consolidates both factories together and introduces
a PercentilesConfig object to act as a standardized way to pass
algo-specific parameters through the factory. This object
is then used when deciding which kind of aggregator to create
Note: CoreValuesSourceType.HISTOGRAM still lives in core, and will
be moved in a subsequent PR.
* Remove generics and target value type from MultiVSAB (#51647)
* fix checkstyle after merge (#52008)
* Plumb ValuesSourceRegistry through to QuerySearchContext (#51710)
* Convert RareTerms to new VS registry (#52166)
* Wire up Value Count (#52225)
* Wire up Max & Min aggregations (#52219)
* ValuesSource refactoring: Wire up Sum aggregation (#52571)
* ValuesSource refactoring: Wire up SigTerms aggregation (#52590)
* Soft immutability for VSConfig (#52729)
* Unmute testSupportedFieldTypes, fix Percentiles/Ranks/Terms tests (#52734)
Also fixes Percentiles which was incorrectly specified to only accept
numeric, but in fact also accepts Boolean and Date (because those are
numeric on master - thanks `testSupportedFieldTypes` for catching it!)
* VS refactoring: Wire up stats aggregation (#52891)
* ValuesSource refactoring: Wire up string_stats aggregation (#52875)
* VS refactoring: Wire up median (MAD) aggregation (#52945)
* fix valuesourcetype issue with constant_keyword field (#53041)x-pack/plugin/rollup/src/main/java/org/elasticsearch/xpack/rollup/job/RollupIndexer.java
this commit implements `getValuesSourceType` for
the ConstantKeyword field type.
master was merged into feature/extensible-values-source
introducing a new field type that was not implementing
`getValuesSourceType`.
* ValuesSource refactoring: Wire up Avg aggregation (#52752)
* Wire PercentileRanks aggregator into new VS framework (#51693)
* Add a VSConfig resolver for aggregations not using the registry (#53038)
* Vs refactor wire up ranges and date ranges (#52918)
* Wire up geo_bounds aggregation to ValuesSourceRegistry (#53034)
This commit updates the geo_bounds aggregation to depend
on registering itself in the ValuesSourceRegistry
relates #42949.
* VS refactoring: convert Boxplot to new registry (#53132)
* Wire-up geotile_grid and geohash_grid to ValuesSourceRegistry (#53037)
This commit updates the geo*_grid aggregations to depend
on registering itself in the ValuesSourceRegistry
relates to the values-source refactoring meta issue #42949.
* Wire-up geo_centroid agg to ValuesSourceRegistry (#53040)
This commit updates the geo_centroid aggregation to depend
on registering itself in the ValuesSourceRegistry.
relates to the values-source refactoring meta issue #42949.
* Fix type tests for Missing aggregation (#53501)
* ValuesSource Refactor: move histo VSType into XPack module (#53298)
- Introduces a new API (`getBareAggregatorRegistrar()`) which allows plugins to register aggregations against existing agg definitions defined in Core.
- This moves the histogram VSType over to XPack where it belongs. `getHistogramValues()` still remains as a Core concept
- Moves the histo-specific bits over to xpack (e.g. the actual aggregator logic). This requires extra boilerplate since we need to create a new "Analytics" Percentile/Rank aggregators to deal with the histo field. Doubly-so since percentiles/ranks are extra boiler-plate'y... should be much lighter for other aggs
* Wire up DateHistogram to the ValuesSourceRegistry (#53484)
* Vs refactor parser cleanup (#53198)
Co-authored-by: Zachary Tong <polyfractal@elastic.co>
Co-authored-by: Zachary Tong <zach@elastic.co>
Co-authored-by: Christos Soulios <1561376+csoulios@users.noreply.github.com>
Co-authored-by: Tal Levy <JubBoy333@gmail.com>
* First batch of easy fixes
* Remove List.of from ValuesSourceRegistry
Note that we intend to have a follow up PR dealing with the mutability
of the registry, so I didn't even try to address that here.
* More compiler fixes
* More compiler fixes
* More compiler fixes
* Precommit is happy and so am I
* Add new Core VSTs to tests
* Disabled supported type test on SigTerms until we can backport it's fix
* fix checkstyle
* Fix test failure from semantic merge issue
* Fix some metaData->metadata replacements that got lost
* Fix list of supported types for MinAggregator
* Fix list of supported types for Avg
* remove unused import
Co-authored-by: Zachary Tong <polyfractal@elastic.co>
Co-authored-by: Zachary Tong <zach@elastic.co>
Co-authored-by: Christos Soulios <1561376+csoulios@users.noreply.github.com>
Co-authored-by: Tal Levy <JubBoy333@gmail.com>
The main changes are:
1. Throw an error when updating `include_in_parent` or `include_in_root` attribute of nested field dynamically by the PUT mapping API.
2. Add a test for the change.
Closes#53792
Co-authored-by: bellengao <gbl_long@163.com>
Modify the value of nowInMillis in queryShardContext to current timestamp, because the
value will be used lately when validating the filtered alias which uses now in a date_nanos
range query.
When retrieving the snapshots for a set of repos or deleting a single snapshot, it's possible for
the body of the `ActionListener`'s `onResponse` method to throw an Exception. In this case, the
`errHandler` passed in may not be executed, resulting in the `running` boolean not being reset back
to false.
This commit uses `ActionListener.wrap(...)` instead of creating a new ActionListener, which ensures
that if the `onResponse` fails in any way, the `onFailure` handler is still called.
Resolves#55217
Queries like script_score wrap a query and modify its score. If the inner query
rewrites to match_none, then the entire query can rewrite to match_none. This
lets us detect that certain shards can be skipped during the 'can match' phase.
This was a simple change that seemed like it would help in some cases. But it
will likely not have a huge impact, since in many use cases where the 'can
match' phase is helpful, the search is not sorted by score.
Today we pass the `RepositoriesService` to the searchable snapshots plugin
during the initialization of the `RepositoryModule`, forcing the plugin to be a
`RepositoryPlugin` even though it does not implement any repositories.
After discussion we decided it best for now to pass this in via
`Plugin#createComponents` instead, pending some future work in which plugins
can depend on services more dynamically.
Today the voting config exclusions API accepts node filters and resolves them
to a collection of node IDs against the current cluster membership.
This is problematic since we may want to exclude nodes that are not currently
members of the cluster. For instance:
- if attempting to remove a flaky node from the cluster you cannot reliably
exclude it from the voting configuration since it may not reliably be a
member of the cluster
- if `cluster.auto_shrink_voting_configuration: false` then naively shrinking
the cluster will remove some nodes but will leaving their node IDs in the
voting configuration. The only way to clean up the voting configuration is to
grow the cluster back to its original size (potentially replacing some of the
voting configuration) and then use the exclusions API.
This commit adds an alternative API that accepts node names and node IDs but
not node filters in general, and deprecates the current node-filters-based API.
Relates #47990.
Backport of #50836 to 7.x.
Co-authored-by: zacharymorn <zacharymorn@gmail.com>
The ResourceWatcherService enables watching of files for modifications
and deletions. During startup various consumers register the files that
should be watched by this service. There is behavior that might be
unexpected in that the service may not start polling until later in the
startup process due to the use of lifecycle states to control when the
service actually starts the jobs to monitor resources. This change
removes this unexpected behavior so that upon construction the service
has already registered its tasks to poll resources for changes. In
making this modification, the service no longer extends
AbstractLifecycleComponent and instead implements the Closeable
interface so that the polling jobs can be terminated when the service
is no longer required.
Relates #54867
Backport of #54993
`auto_date_histogram`'s reduction behavior is fairly complex and we have
some fairly complex testing logic for it but it is super difficult to
look at that testing logic and say "ah, that is what it does in this
case". This adds some tests explicit (non-randomized) tests of the
reduction logic that *should* be easier to read.
I've noticed that a lot of our tests are using deprecated static methods
from the Hamcrest matchers. While this is not a big deal in any
objective sense, it seems like a small good thing to reduce compilation
warnings and be ready for a new release of the matcher library if we
need to upgrade. I've also switched a few other methods in tests that
have drop-in replacements.
Currently forbidden apis accounts for 800+ tasks in the build. These
tasks are aggressively created by the plugin. In forbidden apis 3.0, we
will get task avoidance
(https://github.com/policeman-tools/forbidden-apis/pull/162), but we
need to ourselves use the same task avoidance mechanisms to not trigger
these task creations. This commit does that for our foribdden apis
usages, in preparation for upgrading to 3.0 when it is released.
Upgrade to lucene 8.5.1 release that contains a bug fix for a bug that might introduce index corruption when deleting data from an index that was previously shrunk.
We can be a little more efficient when aborting a snapshot. Since we know the new repository
data after finalizing the aborted snapshot when can pass it down to the snapshot completion listeners.
This way, we don't have to fork off to the snapshot threadpool to get the repository data when the listener completes and can directly submit the delete task with high priority straight from the cluster state thread.
Snapshot deletes should first check the cluster state for an in-progress snapshot
and try to abort it before checking the repository contents. This allows for atomically
checking and aborting a snapshot in the same cluster state update, removing all possible
races where a snapshot that is in-progress could not be found if it finishes between
checking the repository contents and the cluster state.
Also removes confusing races, where checking the cluster state off of the cluster state thread
finds an in-progress snapshot that is then not found in the cluster state update to abort it.
Finally, the logic to use the repository generation of the in-progress snapshot + 1 was error
prone because it would always fail the delete when the repository had a pending generation different from its safe generation when a snapshot started (leading to the snapshot finalizing at a
higher generation).
These issues (particularly that last point) can easily be reproduced by running `SLMSnapshotBlockingIntegTests` in a loop with current `master` (see #54766).
The snapshot resiliency test for concurrent snapshot creation and deletion was made to more
aggressively start the delete operation so that the above races would become visible.
Previously, the fact that deletes would never coincide with initializing snapshots resulted
in a number of the above races not reproducing.
This PR is the most consistent I could get snapshot deletes without changes to the state machine. The fact that aborted deletes will not put the delete operation in the cluster state before waiting for the snapshot to abort still allows for some possible (though practically very unlikely) races. These will be fixed by a state-machine change in upcoming work in #54705 (which will have a much simpler and clearer diff after this change).
Closes#54766
* Remove Redundant Cluster State during Snapshot INIT + Master Failover (#54420)
Similar to #54395 we know that a snapshot in INIT state has not
written anything to the repository yet. If we see one from a master
failover, there is no point in moving it to ABORTED before removing it
from the cluster state in a subsequent CS update.
Instead, we can simply remove its job from the CS the first time
we see it on master failover and be done with it.
* Move Snapshot Status Related Method to Appropriate Places
Lots of things living in `SnapshotsService` for no reason other than
that `SnapshotsService` provides the `RepositoriesService`.
Cleaning this up to directly use `RepositoriesService` in the relevant
transport actions and by that shortening the already very complex `SnapshotsService`.
Just like in `AbstractCoordinatorTestCase` we can't just assume the cluster
is stable once all the cluster states align since stray follower/leader check
tasks could still hit us after a disconnect, causing future test operations to fail.
=> fixed by running all tasks in the possible time span of running into these
checks before validating that cluster states align on all nodes to prevent this
like we do in the coordinator tests.
Closes#55103
Provides basic repository-level stats that will allow us to get some insight into how many
requests are actually being made by the underlying SDK. Currently only tracks GET and LIST
calls for S3 repositories. Most of the code is unfortunately boiler plate to add a new endpoint
that will help us better understand some of the low-level dynamics of searchable snapshots.
This is a first cut at giving NodeInfo the ability to carry a flexible
list of heterogeneous info responses. The trick is to be able to
serialize and deserialize an arbitrary list of blocks of information. It
is convenient to be able to deserialize into usable Java objects so that
we can aggregate nodes stats for the cluster stats endpoint.
In order to provide a little bit of clarity about which objects can and
can't be used as info blocks, I've introduced a new interface called
"ReportingService."
I have removed the hard-coded getters (e.g., getOs()) in favor of a
flexible method that can return heterogeneous kinds of info blocks
(e.g., getInfo(OsInfo.class)). Taking a class as an argument removes the
need to cast in the client code.
With this change, when a task is canceled, the task manager will cancel
not only its direct child tasks but all also its descendant tasks.
Closes#50990
Adds support for filters to T-Test aggregation. The filters can be used to
select populations based on some criteria and use values from the same or
different fields.
Closes#53692
The secure_settings_password was never taken into consideration in
the ReloadSecureSettings API. This commit fixes that and adds
necessary REST layer testing. Doing so, it also:
- Allows TestClusters to have a password protected keystore
so that it can be set for tests.
- Adds a parameter to the run task so that elastisearch can
be run with a password protected keystore from source.
The usage of local parameter for GetFieldMappingRequest has been removed from the underlying transport action since v2.0.
This PR deprecates the parameter from rest layer. It will be removed in next major version.
We added a fancy method to provide random realistic test data to the
reduction tests in #54910. This uses that to remove some of the more
esoteric machinations in the agg tests. This will marginally increase
the coverage of the serialiation tests and, more importantly, remove
some mysterious value generation code that only really made sense for
random reduction tests but was used all over the place. It doesn't, on
the other hand, make the tests shorter. Just *hopefully* more clear.
I only cleaned up a few tests this way. If we like this it'd probably be
worth grabbing others.
We found some problems during the test.
Data: 200Million docs, 1 shard, 0 replica
hits | avg | sum | value_count |
----------- | ------- | ------- | ----------- |
20,000 | .038s | .033s | .063s |
200,000 | .127s | .125s | .334s |
2,000,000 | .789s | .729s | 3.176s |
20,000,000 | 4.200s | 3.239s | 22.787s |
200,000,000 | 21.000s | 22.000s | 154.917s |
The performance of `avg`, `sum` and other is very close when performing
statistics, but the performance of `value_count` has always been poor,
even not on an order of magnitude. Based on some common-sense knowledge,
we think that `value_count` and sum are similar operations, and the time
consumed should be the same. Therefore, we have discussed the agg
of `value_count`.
The principle of counting in es is to traverse the field of each
document. If the field is an ordinary value, the count value is
increased by 1. If it is an array type, the count value is increased
by n. However, the problem lies in traversing each document and taking
out the field, which changes from disk to an object in the Java
language. We summarize its current problems with Elasticsearch as:
- Number cast to string overhead, and GC problems caused by a large
number of strings
- After the number type is converted to string, sorting and other
unnecessary operations are performed
Here is the proof of type conversion overhead.
```
// Java long to string source code, getChars is very time-consuming.
public static String toString(long i) {
int size = stringSize(i);
if (COMPACT_STRINGS) {
byte[] buf = new byte[size];
getChars(i, size, buf);
return new String(buf, LATIN1);
} else {
byte[] buf = new byte[size * 2];
StringUTF16.getChars(i, size, buf);
return new String(buf, UTF16);
}
}
```
test type | average | min | max | sum
------------ | ------- | ---- | ----------- | -------
double->long | 32.2ns | 28ns | 0.024ms | 3.22s
long->double | 31.9ns | 28ns | 0.036ms | 3.19s
long->String | 163.8ns | 93ns | 1921 ms | 16.3s
particularly serious.
Our optimization code is actually very simple. It is to manage different
types separately, instead of uniformly converting to string unified
processing. We added type identification in ValueCountAggregator, and
made special treatment for number and geopoint types to cancel their
type conversion. Because the string type is reduced and the string
constant is reduced, the improvement effect is very obvious.
hits | avg | sum | value_count | value_count | value_count | value_count | value_count | value_count |
| | | double | double | keyword | keyword | geo_point | geo_point |
| | | before | after | before | after | before | after |
----------- | ------- | ------- | ----------- | ----------- | ----------- | ----------- | ----------- | ----------- |
20,000 | 38s | .033s | .063s | .026s | .030s | .030s | .038s | .015s |
200,000 | 127s | .125s | .334s | .078s | .116s | .099s | .278s | .031s |
2,000,000 | 789s | .729s | 3.176s | .439s | .348s | .386s | 3.365s | .178s |
20,000,000 | 4.200s | 3.239s | 22.787s | 2.700s | 2.500s | 2.600s | 25.192s | 1.278s |
200,000,000 | 21.000s | 22.000s | 154.917s | 18.990s | 19.000s | 20.000s | 168.971s | 9.093s |
- The results are more in line with common sense. `value_count` is about
the same as `avg`, `sum`, etc., or even lower than these. Previously,
`value_count` was much larger than avg and sum, and it was not even an
order of magnitude when the amount of data was large.
- When calculating numeric types such as `double` and `long`, the
performance is improved by about 8 to 9 times; when calculating the
`geo_point` type, the performance is improved by 18 to 20 times.
Currently the remote cluster sniff connection process can succeed even
if no connections are opened. This commit fixes this by failing the
connection process if no connections are successfully opened.
This commit adds an explicit test of time zone rewrite on date nanos
field. Today this is working but we need tests to ensure that we don't
break it unintentionally.
Makes query result serialization more robust by propagating possible
IOExceptions that can occur during shard level result serialization to the
caller instead of throwing AssertionError that is not intercepted.
Fixes#54665
The use of available processors, the terminology, and the settings
around it have evolved over time. This commit cleans up some places in
the codes and in the docs to adjust to the current terminology.
* Prevent putting V2 index template when overlapping with existing template
This change prevents putting V2 index template when it would overlap with existing V2 template
of the same priority
Relates to #53101
This changes the behavior of aggregations when search is performed
against enough shards to enable "batch reduce" mode. In this case we
force always store aggregations in serialized form rather than a
traditional java reference. This should shrink the memory usage of large
aggregations at the cost of slightly slowing down aggregations where the
coordinating node is also a data node. Because we're only doing this
when there are many shards this is likely to be fairly rare.
As a side effect this lets us add logs for the memory usage of the aggs
buffer:
```
[2020-04-03T17:03:57,052][TRACE][o.e.a.s.SearchPhaseController] [runTask-0] aggs partial reduction [1320->448] max [1320]
[2020-04-03T17:03:57,089][TRACE][o.e.a.s.SearchPhaseController] [runTask-0] aggs partial reduction [1328->448] max [1328]
[2020-04-03T17:03:57,102][TRACE][o.e.a.s.SearchPhaseController] [runTask-0] aggs partial reduction [1328->448] max [1328]
[2020-04-03T17:03:57,103][TRACE][o.e.a.s.SearchPhaseController] [runTask-0] aggs partial reduction [1328->448] max [1328]
[2020-04-03T17:03:57,105][TRACE][o.e.a.s.SearchPhaseController] [runTask-0] aggs final reduction [888] max [1328]
```
These are useful, but you need to keep some things in mind before
trusting them:
1. The buffers are oversized ala Lucene's ArrayUtils. This means that we
are using more space than we need, but probably not much more.
2. Before they are merged the aggregations are inflated into their
traditional Java objects which *probably* take up a lot more space
than the serialized form. That is, after all, the reason why we store
them in serialized form in the first place.
And, just because I can, here is another example of the log:
```
[2020-04-03T17:06:18,731][TRACE][o.e.a.s.SearchPhaseController] [runTask-0] aggs partial reduction [147528->49176] max [147528]
[2020-04-03T17:06:18,750][TRACE][o.e.a.s.SearchPhaseController] [runTask-0] aggs partial reduction [147528->49176] max [147528]
[2020-04-03T17:06:18,809][TRACE][o.e.a.s.SearchPhaseController] [runTask-0] aggs partial reduction [147528->49176] max [147528]
[2020-04-03T17:06:18,827][TRACE][o.e.a.s.SearchPhaseController] [runTask-0] aggs partial reduction [147528->49176] max [147528]
[2020-04-03T17:06:18,829][TRACE][o.e.a.s.SearchPhaseController] [runTask-0] aggs final reduction [98352] max [147528]
```
I got that last one by building a ten shard index with a million docs in
it and running a `sum` in three layers of `terms` aggregations, all on
`long` fields, and with a `batched_reduce_size` of `3`.
`PipelineAggregator`s are only sent across the wire for backwards
compatibility with 7.7.0. `PipelineAggregator` needs to continue to
implement `NamedWriteable` for backwards compatibility but pipeline
aggregations created after 7.7.0 need not implement any of the methods
in that interface because we'll never attempt to call them. So this
creates implementations in `PipelineAggregator` (the base class) that
just throw exceptions.
* HLRC support for Index Templates V2 (#54838)
* HLRC support for Index Templates V2
This change adds High Level Rest Client support for Index Templates V2.
Relates to #53101
* fixed compilation error
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>