Our lovely `BitArray` compactly stores "flags", lazilly growing its
underlying storage. It is super useful when you need to store one bit of
data for a zillion buckets or a documents or something. Usefully, it
defaults to `false`. But there is a wrinkle! If you ask it whether or
not a bit is set but it hasn't grown its underlying storage array
"around" that index then it'll throw an `ArrayIndexOutOfBoundsException`.
The per-document use cases tend to show up in order and don't tend to
mind this too much. But the use case in aggregations, the per-bucket use
case, does. Because buckets are collected out of order all the time.
This changes `BitArray` so it'll return `false` if the index is too big
for the underlying storage. After all, that index *can't* have been set
or else we would have grown the underlying array. Logically, I believe
this makes sense. And it makes my life easy. At the cost of three lines.
*but* this adds an extra test to every call to `get`. I think this is
likely ok because it is "very close" to an array index lookup that
already runs the same test. So I *think* it'll end up merged with the
array bounds check.
This commit updates the template used for watch history indices with
the hidden index setting so that new indices will be created as hidden.
Relates #50251
Backport of #52962
Datafeed bwc tests have been muted for some time in the 7.x. This is because of date_histogram interval deprecation warnings.
This commit fixes the tests as must as possible while still handling deprecation warnings.
Currently the AbstractBulkByScrollRequest accepts slice values of 0 via its
`setSlices` method, denoting the "auto" slicing behaviour that is usable by
settting the "slices=auto" parameter on rest requests. When using the High Level
Rest Client, however, we send the 0 value as an integer, which is then rejected
as invalid by `AbstractBulkByScrollRequest#parseSlices`. Instead of making
parsing of the rest request more lenient, this PR opts for changing the
RequestConverter logic in the client to translate 0 values to "auto" on the rest
requests.
Closes#53044
For Node Info to be pluggable, NodesInfoRequest must be able to carry
arbitrary strings. This commit reworks the internals of that class to
use a set rather than hard-coded boolean fields.
NodesInfoRequest defaults to specifying all values. We test for
this behavior as we refactor and use random testing for the
various combinations of metrics.
Add backwards compatibility for transport requests.
With ExitableDirectoryReader in place, check for query cancellation
during QueryPhase#preProcess where the query rewriting takes place.
Follows: #52822
(cherry picked from commit 0d38626d8e6e9e2620a7a446b617a2ac42852461)
The bool query builder in elasticsearch accepts both must_not and mustNot
fields. Given that leniency is abhorrent and must be eschewed, we should deprecate
the latter as it doesn't fit with the style of parameters elsewhere in the DSL.
Use assertBusy when doing reroute after bridged disruption,
since it can return non-acked if a node is marked faulty
by follower check after disruption ended.
Closes#53064
This commit fixes ensures that for external builds
(e.g. plugin development) that the REST tests that are
copied are properly filtered to only include the API
by default.
The code prior to this change resulted in including both
the API and tests since the copy.include resulted as an
empty list by default since the stream is empty unless
explicitly configured.
related #52114fixes#53183
When we test backwards compatibility we often end up in a situation
where we *sometimes* get a warning, and sometimes don't. Like, we won't
get the warning if we're testing against an older version, but we will
in a newer one. Or we won't get the warning if the request randomly
lands on a node with an old version of the code. But we wouldn't if it
randomed into a node with newer code.
This adds `allowed_warnings` to our yaml test runner for those cases:
warnings declared this way are "allowed" but not "required".
Blocks #52959
Co-authored-by: Benjamin Trent <ben.w.trent@gmail.com>
7.5 and 7.6 had a regression that allowed for
script_score queries to have negative scores.
We have corrected this regression in #52478.
This is an addition to #52478 that adds
a test and release notes.
Tests have been periodically failing due to a race condition on checking a recently `STOPPED` task's state. The `.ml-state` index is not created until the task has already been transitioned to `STARTED`. This allows the `_start` API call to return. But, if a user (or test) immediately attempts to `_stop` that job, the job could stop and the task removed BEFORE the `.ml-state|stats` indices are created/updated.
This change moves towards the task cleaning up itself in its main execution thread. `stop` flips the flag of the task to `isStopping` and now we check `isStopping` at every necessary method. Allowing the task to gracefully stop.
closes#53007
For analytics, we need a consistent way of indicating when a value is missing. Inheriting from anomaly detection, analysis sent `""` when a field is missing. This works fine with numbers, but the underlying analytics process actually treats `""` as a category in categorical values.
Consequently, you end up with this situation in the resulting model
```
{
"frequency_encoding" : {
"field" : "RainToday",
"feature_name" : "RainToday_frequency",
"frequency_map" : {
"" : 0.009844409027270245,
"No" : 0.6472019970785184,
"Yes" : 0.6472019970785184
}
}
}
```
For inference this is a problem, because inference will treat missing values as `null`. And thus not include them on the infer call against the model.
This PR takes advantage of our new `missing_field_value` option and supplies `\0` as the value.
Implement an Exitable DirectoryReader that wraps the original
DirectoryReader so that when a search task is cancelled the
DirectoryReaders also stop their work fast. This is usuful for
expensive operations like wilcard/prefix queries where the
DirectoryReaders can spend lots of time and consume resources,
as previously their work wouldn't stop even though the original
search task was cancelled (e.g. because of timeout or dropped client
connection).
(cherry picked from commit 67acaf61f33bc5f54e26541514d07e375c202e03)
The assumption added in #52631 skips a problematic test
if it fails to create the required conditions for the
scenario it is supposed to be testing. (This happens
very rarely.)
However, before skipping the test it needs to remove the
failed job it has created because the standard test
cleanup code treats failed jobs as fatal errors.
Closes#52608
Upgrading the GCS SDK to the most recent version.
Adjusting (i.e. improving) the REST mock accordingly.
This should significantly boost performance by pulling in
https://github.com/googleapis/java-core/issues/86 in some cases.
Adds documentation for the `any` keyword to the EQL syntax docs.
Includes:
* Definition of an event category and its relationship to the event
category field.
* Example matching all event categories using `any` keyword
* Example using `any` with `where true`
Tests in GoogleCloudStorageBlobStoreRepositoryTests are known
to be flaky on JDK 8 (#51446, #52430 ) and we suspect a JDK
bug (https://bugs.openjdk.java.net/browse/JDK-8180754) that triggers
some assertion on the server side logic that emulates the Google
Cloud Storage service.
Sadly we were not able to reproduce the failures, even when using
the same OS (Debian 9, Ubuntu 16.04) and JDK (Oracle Corporation
1.8.0_241 [Java HotSpot(TM) 64-Bit Server VM 25.241-b07]) of
almost all the test failures on CI. While we spent some time fixing
code (#51933, #52431) to circumvent the JDK bug they are still flaky
on JDK-8. This commit mute these tests for JDK-8 only.
Close ##52906
With #50871 aggrgations should now be parsed directly by an
`ObjectParser` or `ConstructingObjectParser` without the need for the
ceremonial `parse` method. This removes 9 of those `parse` methods and
parses the aggregation directly from their `ObjectParser`.
Currently the remote connection manager will delegate the size() call to
the underlying cluster connection manager. This introduces the
possibility that call will return 1 before the nodeConnection method has
been triggered to add the connection to the remote connection list. This
can cause issues, as the ensureConnected method checks the connection
managers size and executes synchronously if the size is > 0. This leads
to a potential cluster not connected exception while we are still
waiting for the connection opened callback to be triggered.
This commit fixes this issue by using the remote connection manager's
size to report the connection manager's size.
Fixes#52029.