Splitting DeprecationLogger into two. HeaderWarningLogger - responsible for adding a response warning headers and ThrottlingLogger - responsible for limiting the duplicated log entries for the same key (previously deprecateAndMaybeLog).
Introducing A ThrottlingAndHeaderWarningLogger which is a base for other common logging usages where both response warning header and logging throttling was needed.
relates #55699
relates #52369
backports #55941
The building block of the eql response is currently the SearchHit. This
is a problem since it is tied to an actual search, and thus has scoring,
highlighting, shard information and a lot of other things that are not
relevant for EQL.
This becomes a problem when doing sequence queries since the response is
not generated from one search query and thus there are no SearchHits to
speak of.
Emulating one is not just conceptually incorrect but also problematic
since most of the data is missed or made-up.
As such this PR introduces a simple class, Event, that maps nicely to
the terminology while hiding the ES internals (the use of SearchHit or
GetResult/GetResponse depending on the API used).
Fix#59764Fix#59779
Co-authored-by: Igor Motov <igor@motovs.org>
(cherry picked from commit 997376fbe6ef2894038968842f5e0635731ede65)
* Faster `equals` for `BytesArray` which is nice since with this change we use it for the search cache
* Lighter `StreamInput` for `BytesArray` that should save memory and some indirection relative to the one on the abstract bytes reference
* Lighter `writeTo` implementation
* Build a `BytesArray` instead of a PagedBytesReference whenever possible to save indirection and memory
This is mostly motivated by the performance issues we are seeing around the GET mappings
REST API which (in case of a large number of indices) will create decompressing streams in a hot loop
which takes a significant amount of time for the system calls involved in instantiating deflaters
and inflaters.
Also, this fixes a leaked deflater when deserializing cached repository data.
This method might have materialize all the bytes in a reference into a fresh `byte[]`.
Using the stream is much safer and only trivially more expensive + in most cases we now run the fast path via `BytesArray` anyway.
This optimization is more relevant in the context of CCR. When a node in
the follower cluster leaves, we reallocate the shard-follow tasks on
that node to other nodes. The new tasks will overwhelm the follower
cluster with many put-mapping, update-settings requests, although most
of them are noop. This change detects and optimizes the noop
update-settings requests.
This continues #61301, migrating all of the mappers in `server` to the
new `MapperTestCase` which is nicer than `FieldMapperTestCase` because
it doesn't depend on all of Elasticsearch.
No-op changes to:
* Move `Search your data` source files into the same directory
* Rename `Search your data` source files based on page ID
* Remove unneeded includes
* Remove the `Request` dir
* [ML] adding docs + hlrc for data frame analysis feature_processors (#61149)
Adds HLRC and some docs for the new feature_processors field in Data frame analytics.
Co-authored-by: Przemysław Witek <przemyslaw.witek@elastic.co>
Co-authored-by: Lisa Cawley <lcawley@elastic.co>
We only ever support `JSON` for the query source format in practice.
The reason this test worked before is a bug in xcontent parsing that parses
empty maps out of streams of the wrong format.
Closes#61483
It's unnecessary (and adds one string comparison to every request) to special
case the favicon so I added it as a normal REST handler to simplify the code.
This commit removes the log info message "Created ML annotations index and aliases".
The message comes in addition to elasticsearch's index creation logging and it does
not add to it. In addition, since #61107 that message may be logged multiple times.
Backport of #61461
Wrapping a `BytesArray` in a `StreamInput` for deserialization is inefficient.
This forces Jackson to internally buffer (i.e. copy) all bytes from the `BytesArray`
before deserializing, adding overhead for copying the bytes and managing the buffers.
This commit fixes a number of spots where `BytesArray` is the most common type of
`BytesReference` to special case this type and parse it more efficiently.
Also improves parsing `String`s to use the more efficient direct `String` parsing APIs.
Changes:
* Removes narrative around URI searches. These aren't commonly used in production. The `q` param is already covered in the search API docs: https://www.elastic.co/guide/en/elasticsearch/reference/master/search-search.html#search-api-query-params-q
* Adds a common options section that highlights narrative docs for query DSL, aggregations, multi-index search, search fields, pagination, sorting, and async search.
* Adds a `Search shard routing` page. Moves narrative docs for adaptive replica selection, preference, routing , and shard limits to that section.
* Moves search timeout and cancellation content to the `Search your data` page.
* Creates a `Search multiple data streams and indices` page. Moves related narrative docs for multi-target syntax searches and `indices_boost` to that page.
* Removes narrative examples for the `search_type` parameters. Moves documentation for this parameter to the search API docs.
Closes#60864. Tweak the JDK directories' permissions in the ES
Docker image so that ES can run under a different user and group.
These changes assume that the image is being run with bind-mounted
config, data and logs directories, and reads and writes to these
locations will still fail when both the UID and GID are not the
default. Everything should be OK when running with the default GID
of zero, however.
Report anonymous roles in response to "GET _security/_authenticate" API call when:
* Anonymous role is enabled
* User is not the anonymous user
* Credentials is not an API Key
Access the common versions map is done in a lot of places. While it can
be access through an import of VersionProperties, the vast majority of
places use it through the provided convenience property added by
BuildPlugin. This commit moves that convenience property to the base
java plugin, so further reduce dependence on the BuildPlugin.
In the past, the only way to run a local Elasticsearch build with a remote debugger was by extracting elasticsearch and passing ES_JAVA_OPTS. However, since switching to gradle, a convenience flag was added, `--debug-jvm` (which is documented elsewhere in the testings docs), when running a local elasticsearch build through gradle. This commit removes the old documentation.
There are warnings about unlicense realms when user lookup fails. This PR adds
similar warnings for when no authentication token can be extracted from the request.
Today a remote cluster connection comprises a `PING` and a `REG`
channel. The `PING` channel is only used for health checks between the
elected master and the members of its own cluster, so is unused in a
remote cluster connection. This commit removes this unused connection.
The API key document currently doesn't include the user's full_name or email attributes,
and as a result, when those attributes return `null` when hitting `GET`ing `/_security/_authenticate`,
and in the SAML response from the [IdP Plugin](https://github.com/elastic/elasticsearch/pull/54046).
This changeset adds those fields to the document and extracts them to fill in the User when
authenticating. They're effectively going to be a snapshot of the User from when the key was
created, but this is in line with roles and metadata as well.
Signed-off-by: lloydmeta <lloydmeta@gmail.com>
For large responses to the get mappings request, the serialization
to XContent can be extremely slow (serializing mappings is expensive since
we have to decompress and deserialize the mapping source).
To not introduce instability on the IO thread handling the get mappings response
we should move the serialization to the management pool.
The trade-off of introducing one or two new context switches for responses that are
small enough to not cause trouble on the transport thread to prevent instability
in case of a large number of mappings in the cluster seems worth it.
It is not realistic to drop messages without eventually failing.
To retain the coverage of long pauses this PR adjusts the blackholed
behavior to fail a send after 24h (which is assumed to be longer than any
timeout in the system) instead of never.
Closes#61034
Before when a value was copied to a field through a parent field or `copy_to`,
we parsed it using the `FieldMapper` from the source field. Instead we should
parse it using the target `FieldMapper`. This ensures that we apply the
appropriate mapping type and options to the copied value.
To implement the fix cleanly, this PR refactors the value parsing strategy. Now
instead of looking up values directly, field mappers produce a helper object
`ValueFetcher`. The value fetchers are responsible for almost all aspects of
fetching, including looking up the right paths in the _source.
The PR is fairly big but each commit can be reviewed individually.
Fixes#61033.
Previously we didn't retain the requested fields when performing a shallow copy
of the search source. This meant that when a search was rewritten, we could drop
the requested fields and fail to return them in the response.
Saving some cycles here and there on the IO loop:
* Don't instantiate new `Runnable` to execute on `SAME` in a few spots
* Don't instantiate complicated wrapped stream for empty messages
* Stop instantiating almost never used `ClusterStateObserver` in two spots
* Some minor cleanup and preventing pointless `Predicate<>` instantiation in transport master node action