It was relying on the compensated sum working but the test framework was
dodging it. This forces the accuracy tests to come from a single shard
where we get the proper compensated sum.
Closes#56757
We get the number of shards and replicas with our bare hands in index
metadata, rather than letting the settings infrastructure do the work
for us. This commit switches to using the settings infrastructure.
Today a 7.x node logs `cluster UUID set to [...]` on every cluster state update
received from a 6.8 master, because 6.8 nodes are not able to commit the
cluster UUID properly. We could try and deduplicate these logs somehow, but
that would introduce a good deal of complexity. Instead, this commit suppresses
these logs entirely when receiving cluster state updates from a 6.8 master.
In DF analytics classification, it is possible to use no samples
of a class if its cardinality is too low.
This commit fixes this by ensuring the target sample count can never be zero.
Backport of #56783
* JDBC: fix access to the Manifest for non-entry JAR
The JDBC driver will attempt to read its version from the Manifest file
embedded into its JAR. The URL pointing to the JAR can be provided in a
few ways.
So far, accessing the Manfiest was attempted by getting a URLConnection
out of the URL and then getting an input stream out of this connection.
For file JAR URLs, this only works however if the URL points to the
driver as a JAR file entry (i.e. <sub-url>!/jdbc-driver.jar!/). If
that's not the case, the JarURLConnection will throw an IOException.
This commit fixes that: in case the URL points to a JAR entry
(jar:file:<path>/jdbc-driver.jar!/), the manifest is read directly with
JarURLConnection#getManifest().
(cherry picked from commit 2175b7b01cf5fcf3ab2bb21404a9bd454a8df3f0)
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
In some cases the Enrich processor factory may be called before it is
ready to create processors. While these calls are usually made in error,
the response from the Enrich processor is an NPE which is almost always
an unhelpful error when debugging an issue.
Added support for decompression at LLRC and added integration test
(cherry picked from commit 2621452473e0c236aa28db749f782a24eca6c974)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
Co-authored-by: Hakky54 <hakangoudberg@hotmail.com>
This change aims to fix our setup in CI so that we can run 7.x in
FIPS 140 mode. The major issue that we have in 7.x and did not
have in master is that we can't use the diagnostic trust manager
in FIPS mode in Java 8 with SunJSSE in FIPS approved mode as it
explicitly disallows the wrapping of X509TrustManager.
Previous attempts like #56427 and #52211 focused on disabling the
setting in all of our tests when creating a Settings object or
on setting fips_mode.enabled accordingly (which implicitly disables
the diagnostic trust manager). The attempts weren't future proof
though as nothing would forbid someone to add new tests without
setting the necessary setting and forcing this would be very
inconvenient for any other case ( see
#56427 (comment) for the full argumentation).
This change introduces a runtime check in SSLService that overrides
the configuration value of xpack.security.ssl.diagnose.trust and
disables the diagnostic trust manager when we are running in Java 8
and the SunJSSE provider is set in FIPS mode.
* [Transform] add support for terms agg in transforms (#56696)
This adds support for `terms` and `rare_terms` aggs in transforms.
The default behavior is that the results are collapsed in the following manner:
`<AGG_NAME>.<BUCKET_NAME>.<SUBAGGS...>...`
Or if no sub aggs exist
`<AGG_NAME>.<BUCKET_NAME>.<_doc_count>`
The mapping is also defined as `flattened` by default. This is to avoid field explosion while still providing (limited) search and aggregation capabilities.
This is a followup to #56632. Tests that had to be changed
to mock the C++ log handler more accurately need to be more
careful about when that stream ends, as ending of that
stream is used to detect crashes in the production system.
Fixes#56796
Mapper.Builder currently has some complex generics on it to allow fluent builder
construction. However, the second parameter, a return type from the build() method,
is unnecessary, as we can use covariant return types. This commit removes this second
generic parameter.
This is another part of the breakup of the massive BuildPlugin. This PR
moves the code for configuring publications to a separate plugin. Most
of the time these publications are jar files, but this also supports the
zip publication we have for integ tests.
This aggregation will perform normalizations of metrics
for a given series of data in the form of bucket values.
The aggregations supports the following normalizations
- rescale 0-1
- rescale 0-100
- percentage of sum
- mean normalization
- z-score normalization
- softmax normalization
To specify which normalization is to be used, it can be specified
in the normalize agg's `normalizer` field.
For example:
```
{
"normalize": {
"buckets_path": <>,
"normalizer": "percent"
}
}
```
If CI is running tests at exactly 0 or 5 minutes past the hour
the ack-watch docs tests may fail with a 409 error if the ack
test happens to run at the exact time that the schedule watch
is running.
This commit changes the public documentation (and the test) for
the ack to a feb 29th at noon schedule. Test doc or tests do
not really care about the schedule date and this is chosen
since it is a valid date, but one that is extremely unlikely
to cause issues.
Optimize away events queries and joins/sequence that cannot match any
results without having to query the backend.
(cherry picked from commit 69c8ef8cfefd8fc6dcb6d1a566bfcd537068e3e4)
We never do any file IO or other blocking work on the transport threads
so no tangible benefit can be derived from using more threads than CPUs
for IO.
There are however significant downsides to using more threads than necessary
with Netty in particular. Since we use the default setting for
`io.netty.allocator.useCacheForAllThreads` which is `true` we end up
using up to `16MB` of thread local buffer cache for each transport thread.
Meaning we potentially waste CPUs * 16MB of heap for unnecessary IO threads in addition to obvious inefficiencies of artificially adding extra context switches.
Adds the conflicting types and an example of an index which specifies
them in order to make it easier for the user to understand the conflict.
Backport of #56700
This change ensures that the maintenance service that is responsible for deleting the expired response is stopped between each test. This is needed since we check that no search context are in-flight after each test method.
Fixes#55988
Backporting #56585 to 7.x branch.
Adds tracking for the API calls performed by the GoogleCloudStorage
underlying SDK. It hooks an HttpResponseInterceptor to the SDK
transport layer and does http request filtering based on the URI
paths that we are interested to track. Unfortunately we cannot hook
a wrapper into the ServiceRPC interface since we're using different
levels of abstraction to implement retries during reads
(GoogleCloudStorageRetryingInputStream).
If an email action is used in a foreach loop, message ids could have
been duplicated, which then get rejected by the mail server.
This commit introduces an additional static counter in the email action
in order to ensure that every message id is unique.
Prior to this change the named pipes that connect the ML C++
processes to the Elasticsearch JVM were all opened before any
of them were read from or written to.
This created a problem, where if the C++ process logged more
messages between opening the log pipe and opening the last
pipe to be connected than there was space for in the named
pipe's buffer then the C++ process would block. This would
mean it never got as far as opening the last named pipe, so
the JVM would never get as far as reading from the log pipe,
hence a deadlock.
This change alters the connection order so that the JVM
starts reading from the logging pipe immediately after opening
it so that if the C++ process logs messages while opening the
other named pipes they are captured in a timely manner and
there is no danger of a deadlock.
Backport of #56632
If a channel gets disconnected, then we should cancel the tasks
associated with that channel as their results won't be retrieved.
Closes#56327
Relates #56619
Backport of #56620
We previously rejected removing the number of replicas setting, which
prevents users from reverting this setting to its default the natural
way. To fix this, we put back the setting with the default value in the
cases that the user is trying to remove it. Yet, we also need to do the
work of updating the routing table and so on appropriately. This case
was missed because when the setting is being removed, we were defaulting
to -1 in this code path, which is treated as not being updated. Instead,
we must treat the case when we are removing this setting as if the
setting is being updated, too. This commit does that.
* [DOCS] Add info about ILM and unallocated shards.
* Incorporated review feedback.
* Update docs/reference/ilm/actions/ilm-allocate.asciidoc
Co-authored-by: James Rodewig <james.rodewig@elastic.co>
* Apply suggestions from code review
Co-authored-by: James Rodewig <james.rodewig@elastic.co>
* Fix xref
Co-authored-by: James Rodewig <james.rodewig@elastic.co>
Co-authored-by: James Rodewig <james.rodewig@elastic.co>
In normal operation native controllers are not expected to write
anything to stdout or stderr. However, if due to an error or
something unexpected with the environment a native controller
does write something to stdout or stderr then it will block if
nothing is reading that output.
This change makes the stdout and stderr of native controllers
reuse the same stdout and stderr as the Elasticsearch JVM (which
are by default redirected to es.stdout.log and es.stderr.log) so
that if something unexpected is written to native controller
output then:
1. The native controller process does not block, waiting for
something to read the output
2. We can see what the output was, making it easier to debug
obscure environmental problems
Backport of #56491
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>