`foreach` processors store information within the `_ingest` metadata object.
This commit adds the contents of the `_ingest` metadata (if it is not empty).
And will append new inference results if the result field already exists.
This allows a `foreach` to execute and multiple inference results being written to the same result field.
closes https://github.com/elastic/elasticsearch/issues/60867
This makes KeywordFieldMapper extend ParametrizedFieldMapper, with explicitly
defined parameters.
In addition, we add a new option to Parameter, restrictedStringParam, which
accepts a restricted set of string options.
Found this while checking if I can speed up SnapshotResiliencyTests
to get more coverage/time. Turns out throwing a new instance here on
every task was taking 9% of the CPU wall-time in those tests. With
this change it's 4% of the overall.
The Query string parser was not delegating the construction of wildcard/regex queries to the underlying field type.
The wildcard field has special data structures and queries that operate on them so cannot rely on the basic regex/wildcard queries that were being used for other fields.
Closes#60957
Use thread-local buffers and deflater and inflater instances to speed up
compressing and decompressing from in-memory bytes.
Not manually invoking `end()` on these should be safe since their off-heap memory
will eventually be reclaimed by the finalizer thread which should not be an issue for thread-locals
that are not instantiated at a high frequency.
This significantly reduces the amount of byte copying and object creation relative to the previous approach
which had to create a fresh temporary buffer (that was then resized multiple times during operations), copied
bytes out of that buffer to a freshly allocated `byte[]`, used 4k stream buffers needlessly when working with
bytes that are already in arrays (`writeTo` handles efficient writing to the compression logic now) etc.
Relates #57284 which should be helped by this change to some degree.
Also, I expect this change to speed up mapping/template updates a little as those make heavy use of these
code paths.
The test failed because the leader was taking a lot of CPUs to process
many mapping updates. This commit reduces the mapping updates, increases
timeout, and adds more debug info.
Closes#59832
This commit introduces a new thread pool, `system_read`, which is
intended for use by system indices for all read operations (get and
search). The `system_read` pool is a fixed thread pool with a maximum
number of threads equal to lesser of half of the available processors
or 5. Given the combination of both get and read operations in this
thread pool, the queue size has been set to 2000. The motivation for
this change is to allow system read operations to be serviced in spite
of the number of user searches.
In order to avoid a significant performance hit due to pattern matching
on all search requests, a new metadata flag is added to mark indices
as system or non-system. Previously created system indices will have
flag added to their metadata upon upgrade to a version with this
capability.
Additionally, this change also introduces a new class, `SystemIndices`,
which encapsulates logic around system indices. Currently, the class
provides a method to check if an index is a system index and a method
to find a matching index descriptor given the name of an index.
Relates #50251
Relates #37867
Backport of #57936
We accept _source values with multiple levels of arrays, such as
`"field": [[[1, 2]]]`. This PR ensures that field retrieval can handle nested
arrays by unwrapping the arrays before parsing the values.
Examines the reindex response in order to report potential
problems that occurred during the reindexing phase of
data frame analytics jobs.
Backport of #60911
If the search for get stats with multiple job Ids fails the listener is called for each failure.
This change waits for all responses then returns the first error if there was one.
Same as https://github.com/elastic/elasticsearch/pull/43288 for GCS.
We don't need to do the bucket exists check before using the repo, that just needlessly
increases the necessary permissions for using the GCS repository.
Because the 'fields' option loads from _source (which is a stored field), it is
not possible to retrieve 'fields' when stored_fields are disabled.
This also fixes#60912, where setting stored_fields: _none_ prevented the
_ignored fields from being loaded and caused a parsing exception.
This moves the `distance_feature` query building out of
`DistanceFeatureQueryBuilder` and into subclasses of `MappedFieldType`.
Without this we don't have a chance of supporting this for runtime
fields. In general I'm not sad to see the `instanceof`s go.
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
This way is faster, saving about 8% on the microbenchmark that rounds to
the nearest month. That is in the hot path for `date_histogram` which is
a very popular aggregation so it seems worth it to at least try and
speed it up a little.
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
This commit removes the ability to test the top level result of an aggregator
before it runs the final reduce. All aggregator tests that use AggregatorTestCase#search
are rewritten with AggregatorTestCase#searchAndReduce in order to ensure that we test
the final output (the one sent to the end user) rather than an intermediary result
that could be different.
This change also removes spurious commits triggered on top of a random index writer.
These commits slow down the tests and are redundant with the commits that the
random index writer performs.
Split the autoscaling decider into a service and configuration
in order to enable having additional context information available
in the service. Added AutoscalingDeciderContext holding generic
information all deciders are expected to need. Implemented GET
_autoscaling/decision
There is no point in timing out a join attempt any more once a cluster
is entirely in 7.x. Timing out and retrying with the same master is
pointless, and an in-flight join attempt to one master no longer blocks
attempts to join other masters. This commit deprecates this unnecessary
setting and removes its effect from the joining process.
Relates #60873 which removes this setting in master.
This adds a force-merge step to the searchable snapshot action, enabled by default,
but parameterizable using the `force_merge-index" optional boolean.
eg.
```
PUT _ilm/policy/my_policy
{
"policy": {
"phases": {
"cold": {
"actions": {
"searchable_snapshot" : {
"snapshot_repository" : "backing_repo",
"force_merge_index": true
}
}
}
}
}
}
```
(cherry picked from commit d0a17b2d35f1b083b574246bdbf3e1929471a4a9)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
"Transitive" is technically ok here but it's an overloaded word and it's
not immediately clear which meaning is intended so this log message
always makes me do a double-take. I think both "transient" and
"transitory" are clearer, with "transient" being the usual choice.