`foreach` processors store information within the `_ingest` metadata object.
This commit adds the contents of the `_ingest` metadata (if it is not empty).
And will append new inference results if the result field already exists.
This allows a `foreach` to execute and multiple inference results being written to the same result field.
closes https://github.com/elastic/elasticsearch/issues/60867
The Query string parser was not delegating the construction of wildcard/regex queries to the underlying field type.
The wildcard field has special data structures and queries that operate on them so cannot rely on the basic regex/wildcard queries that were being used for other fields.
Closes#60957
Use thread-local buffers and deflater and inflater instances to speed up
compressing and decompressing from in-memory bytes.
Not manually invoking `end()` on these should be safe since their off-heap memory
will eventually be reclaimed by the finalizer thread which should not be an issue for thread-locals
that are not instantiated at a high frequency.
This significantly reduces the amount of byte copying and object creation relative to the previous approach
which had to create a fresh temporary buffer (that was then resized multiple times during operations), copied
bytes out of that buffer to a freshly allocated `byte[]`, used 4k stream buffers needlessly when working with
bytes that are already in arrays (`writeTo` handles efficient writing to the compression logic now) etc.
Relates #57284 which should be helped by this change to some degree.
Also, I expect this change to speed up mapping/template updates a little as those make heavy use of these
code paths.
The test failed because the leader was taking a lot of CPUs to process
many mapping updates. This commit reduces the mapping updates, increases
timeout, and adds more debug info.
Closes#59832
Examines the reindex response in order to report potential
problems that occurred during the reindexing phase of
data frame analytics jobs.
Backport of #60911
If the search for get stats with multiple job Ids fails the listener is called for each failure.
This change waits for all responses then returns the first error if there was one.
This commit removes the ability to test the top level result of an aggregator
before it runs the final reduce. All aggregator tests that use AggregatorTestCase#search
are rewritten with AggregatorTestCase#searchAndReduce in order to ensure that we test
the final output (the one sent to the end user) rather than an intermediary result
that could be different.
This change also removes spurious commits triggered on top of a random index writer.
These commits slow down the tests and are redundant with the commits that the
random index writer performs.
Split the autoscaling decider into a service and configuration
in order to enable having additional context information available
in the service. Added AutoscalingDeciderContext holding generic
information all deciders are expected to need. Implemented GET
_autoscaling/decision
This adds a force-merge step to the searchable snapshot action, enabled by default,
but parameterizable using the `force_merge-index" optional boolean.
eg.
```
PUT _ilm/policy/my_policy
{
"policy": {
"phases": {
"cold": {
"actions": {
"searchable_snapshot" : {
"snapshot_repository" : "backing_repo",
"force_merge_index": true
}
}
}
}
}
}
```
(cherry picked from commit d0a17b2d35f1b083b574246bdbf3e1929471a4a9)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
"Transitive" is technically ok here but it's an overloaded word and it's
not immediately clear which meaning is intended so this log message
always makes me do a double-take. I think both "transient" and
"transitory" are clearer, with "transient" being the usual choice.
This suite is still occasionally failing with a timeout on macOS.
Suggest further increasing this timeout until this suite is broken up.
Relates #58071
* [ML] have DELETE analytics ignore stats failures and clean up unused stats (#60776)
When deleting an analytics configuration, the request MIGHT fail if
the .ml-stats index does not exist or is in strange state (shards unallocated).
Instead of making the request fail, we should log that we were unable to delete the stats docs and then
have them cleaned up in the 'delete_expire_data' janitorial process
remove test, scripts are excluded in the change collector, the test is a leftover from a previous
solution of #57332, which has been discarded
relates #60724fixes#60794
When an exception is thrown during test inference we are
not including the cause message in our logging. This commit
addresses this issue.
Backport of #60749
This pull request adds recovery state tracking for Searchable Snapshots.
In order to track recoveries for searchable snapshot backed indices, this pull
request adds a new type of RecoveryState.
This newRecoveryState instance is able to deal with the
small differences that arise during Searchable snapshots recoveries.
Those differences can be summarized as follows:
- The Directory implementation that's provided by SearchableSnapshots mark the
snapshot files as reused during recovery. In order to keep track of the
recovery process as the cache is pre-warmed, those files shouldn't be marked
as reused.
- Once the shard is created, the cache starts its pre-warming phase, meaning that
we should keep track of those downloads during that process and tie the recovery
to this pre-warming phase. The shard is considered recovered once this pre-warming
phase has finished.
Backport of #60505
disable optimizations when using scripts in group_by, when scripts using scripts we can not predict
the outcome and we have no query counterpart. Other optimizations for other group_by's are not
affected.
fixes#57332
implements a test suite for testing continuous transform with randomization in terms of mappings,
index settings, transform configuration. Add a test case for terms and date histogram. The test
covers:
- continuous mode with several checkpoints created
- correctness of results
- optimizations (minimal necessary writes)
- permutations of features (index settings, aggs, data types, index or data stream)
When the RBACEngine authorizes scroll searches it sets the index access control
to the very limiting IndicesAccessControl.ALLOW_NO_INDICES value.
This change will set it to the value for the index access control that was produced
during the authorization of the initial search that created the scroll,
which is now stored in the scroll context.
Implements license degradation behavior for searchable snapshots. Snapshot-backed shards are failed when the license becomes invalid, and shards won't be reallocated. After valid license is put in place again, shards are allocated again.
This commit removes the body property from the
indices.create_data_stream.json REST API spec
as the API does not support sending a body.
Update the description of the API to remove
that a data stream can be updated with the
API - data streams can only be created with
this API and attempting to update yields a
`resource_already_exists_exception`.
Closes#60704
(cherry picked from commit 2cab2e0ee094769852df31566dbe22b5df59d900)
Currently, validation of mappers (checking that cross-references are correct, limits on
field name lengths and object depths, multiple definitions, etc) is performed by the
MapperService. This means that any mapper-specific validation, for example that done
on the CompletionFieldMapper, needs to be called specifically from core server code,
and so we can't add validation to mappers that live in plugins.
This commit reworks the validation framework so that mapper-specific validation is
done on the Mapper itself. Mapper gets a new `validate(MappingLookup)`
method (already present on `MetadataFieldMapper` and now pulled up to the parent
interface), which is called from a new `DocumentMapper.validate()` method. All
the validation code currently living on `MapperService` moves either to individual
mapper implementations (FieldAliasMapper, CompletionFieldMapper) or into
`MappingLookup`, an altered `DocumentFieldMappers` which now knows about
object fields and can check for duplicate definitions, or into DocumentMapper
which handles soft limit checks.
* Merge test runner task into RestIntegTest (#60261)
* Merge test runner task into RestIntegTest
* Reorganizing Standalone runner and RestIntegTest task
* Rework general test task configuration and extension
* Fix merge issues
* use former 7.x common test configuration
We have various ways of copying between two streams and handling thread-local
buffers throughout the codebase. This commit unifies a number of them and
removes buffer allocations in many spots.
This commit changes TokenAuthIntegTests so all occurrences of
assertThat(x.size(), equalTo(0));
become
assertThat(x, empty());
This means that the assertion failure message will include the
contents of the list (`x`) instead of just its size, which
facilitates easier failure diagnosis.
Relates: #56903
Backport of: #60496
This commit does three things:
* Removes all Copyright/license headers for the build.gradle files under x-pack. (implicit Apache license)
* Removes evaluationDependsOn(xpackModule('core')) from build.gradle files under x-pack
* Removes a place holder test in favor of disabling the test task (in the async plugin)
Closing a regular index and mounting a snapshot-backed index into that existing index does not clean the existing index
folders of those preexisting shards.
This PR removes the existing Lucene / translog files once the searchable snapshot shard is starting up. Future PRs will
make reuse of the existing index files to populate the cache.
When a new cluster starts, the HTTP layer becomes ready to accept incoming
requests while the basic license is still being populated in the background.
When a get license request comes in before the license is ready, it can get
404 error. This PR fixes it by either wrap the license check in assertBusy or
ensure the license is ready before perform the check.
This is a backport for both #60498 and #60573
For consistency reasons (and reducing the overload of IllegalArgumentException)
this changes the exception thrown when trying to create a data stream
that already exists.
(cherry picked from commit ac2184c4614bba0f3ee377da49aea0daed98bab4)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
- Replace immediate task creations by using task avoidance api
- One step closer to #56610
- Still many tasks are created during configuration phase. Tackled in separate steps
* SQL: Add option to provide the delimiter for the CSV format (#59907)
* Add option to provide the delimiter to the CSV fmt
This adds the option to provide the desired character as the separator
for the CSV format (the default remains comma).
A set of characters are excluded though - like CR, LF, `"` - to avoid
slipping onto the CSV-dialects slope. The tab is also forbidden, the
user needs to choose the "tsv" format explicitely.
Update the doc to make it clear that the textual CSV, TSV and TXT
formats pass the cursor back to the user through the Cursor HTTP header.
(cherry picked from commit 3a8b00cc7480f7ada57fcea3cbac957facac08fc)
* Java8 fixes
- replace Set#of();
- URLDecoder#decode() requires a string (vs a charset) as 2nd arg.
* Fix SYS COLUMNS schema in ODBC mode (#59513)
* Fix SYS COLUMNS schema in ODBC mode
This fixes a regression when certain ODBC-specific columns that need to
be of the short type were returned as the integer type.
This also fixes the stubbing for the *-indices SYS COLUMN commands.
(cherry picked from commit 96d89dc9b1fd731e736ef804a16bd05496c1dea6)
* Java8 fix: avoid diamond notation in test.
Qualify anonymous class in test.
* fix npe on ambiguous group by
* add tests for aggregates and group by, add quotes to error message
* add more cases for Group By ambiguity test
* change error messages for field ambiguity
* change collection aliases approach
* add locations of attributes for ambiguous grouping error
* Adress review comments
- remove Comparable implementations from Attribute and Location;
- add ad-hoc comparator for sorting locations in ambiguity message;
- remove added AttributeAlias class with Touple;
- add code comment to explain issue with Location overwriting.
* Fix c&p error in location ref generation comparator
Fix copy&paste error in dedicated comparator used for sorting ambiguity
location references.
Slightly increase its readability.
Co-authored-by: Nikita Verkhovin <verkhovin13@gmail.com>
(cherry picked from commit 9ba70a3483f0f4987229bec231cdc004f51b88a5)
This commit adds compatibility testing of our JDBC driver against
different Elasticsearch versions. Although we are really testing the
forwards compatibility nature of the JDBC driver we model the testing
the same as we do existing BWC tests, that is, with the current branch
fetching the earlier versions of the artifact that is to be tested. In
this case, that's the JDBC driver itself.
Because the tests include the JDBC driver jar on it's classpath we had
to change the packaging of the driver jar in order to avoid jarhell and
other conflicting dependency issues when using an old JDBC driver with
later branches. For this we simply relocate all driver dependencies in
the shadow jar under a "shadowed" package. This allows the JDBC driver
to use the correct version of Elasticsearch libs classes, while the
tests themselves use their versions. Since this required a change to the
driver jar compatibility testing can only go back as far as that version
which at the time of this commit is 7.8.1.
Prior to this change ML memory estimation processes for a
given job would always use the same named pipe names. This
would often cause one of the processes to fail.
This change avoids this risk by adding an incrementing counter
value into the named pipe names used for memory estimation
processes.
Backport of #60395
The retention run goes through a number of steps and can randomly take more than 10s.
=> increased timeout to 30s like we did in other spots in this test
Also, noticed that we had a hard wait of 10s in this test, removed it and adjusted following
busy assert in a way that can deal with a missing snapshot (from when the assert runs before
the snapshot was put into the CS).
Closes#60336
In order to unify model inference and analytics results we
need to write the same fields.
prediction_probability and prediction_score are now written
for inference calls against classification models.
CCR will stop functioning if the master node is on 7.8, but data nodes
are before that version because the master node considers that all data
nodes do not have the remote cluster client role. This commit allows CCR
work on data nodes with legacy roles only.
Relates #54146
Relates #59375
This sets up all indexing to one of our write aliases to require it actually be an alias.
This allows failures scenarios to be captured quickly, loudly, and then potentially recovered.
If a feature is created via a custom pre-processor,
we should return the importance for that feature.
This means we will not return the importance for the
original document field for custom processed features.
closes https://github.com/elastic/elasticsearch/issues/59330
This feature adds a new `fields` parameter to the search request, which
consults both the document `_source` and the mappings to fetch fields in a
consistent way. The PR merges the `field-retrieval` feature branch.
Addresses #49028 and #55363.
If a primary shard of a follower index is being relocated, then we
will fail to create a follow-task. This validation is too restricted.
We should ensure that all primaries of the follower index are active
instead.
Closes#59625
Today, a follow task will fail if the master node of the follower
cluster is temporarily overloaded and unable to process master node
requests (such as update mapping, setting, or alias) from a follow-task
within the default timeout. This error is transient, and follow-tasks
should not abort. We can avoid this problem by setting the timeout of
master node requests on the follower cluster to unbounded.
Closes#56891
Data frame analytics jobs that work with very large datasets
may produce bulk requests that are over the memory limit
for indexing. This commit adds a helper class that bundles
index requests in bulk requests that steer away from the
memory limit. We then use this class both from the results
joiner and the inference runner ensuring data frame analytics
jobs do not generate bulk requests that are too large.
Note the limit was implemented in #58885.
Backport of #60219
Previously the test was asserting the prediction on each document
was close 10.0 from the expected. It turned out that was not enough
as we occasionally saw the test failing by little.
Instead of relaxing that assertion, this commit changes it to
assert the mean prediction error is less than 10.0. This should
reduce the chances of the test failing significantly.
Fixes#60212
Backport of #60221
When the job is force-closed or shutting down due to a fatal error we clean
up all cancellable job operations. This includes cancelling the results processor.
However, this means that we might not persist objects that are written from the
process like stats, memory usage, etc.
In hindsight, we do not gain from cancelling the results processor in its
entirety. It makes more sense to skip row results and model chunks but keep
stats and instrumentation about the job as the latter may contain useful information
to understand what happened to the job.
Backport of #60113
Putting an ingest pipeline used to require that the user calling
it had permission to get nodes info as well as permission to
manage ingest. This was due to an internal implementaton detail
that was not visible to the end user.
This change alters the behaviour so that a user with the
manage_pipeline cluster privilege can put an ingest pipeline
regardless of whether they have the separate privilege to get
nodes info. The internal implementation detail now runs as
the internal _xpack user when security is enabled.
Backport of #60106
It can take more than 10 seconds to auto-follow and create a follow-task
on a slow CI. This commit increases timeout in AutoFollowIT by replacing
assertBusy with assertLongBusy.
Closes#59952
Due to complicated access checks (reads and writes execute in their own access context) on some repositories (GCS, Azure, HDFS), using a hard coded buffer size of 4k for restores was needlessly inefficient.
By the same token, the use of stream copying with the default 8k buffer size for blob writes was inefficient as well.
We also had dedicated, undocumented buffer size settings for HDFS and FS repositories. For these two we would use a 100k buffer by default. We did not have such a setting for e.g. GCS though, which would only use an 8k read buffer which is needlessly small for reading from a raw `URLConnection`.
This commit adds an undocumented setting that sets the default buffer size to `128k` for all repositories. It removes wasteful allocation of such a large buffer for small writes and reads in case of HDFS and FS repositories (i.e. still using the smaller buffer to write metadata) but uses a large buffer for doing restores and uploading segment blobs.
This should speed up Azure and GCS restores and snapshots in a non-trivial way as well as save some memory when reading small blobs on FS and HFDS repositories.
This commit continues on the work in #59801 and makes other
implementors of the LocalNodeMasterListener interface thread safe in
that they will no longer allow the callbacks to run on different
threads and possibly race each other. This also helps address other
issues where these events could be queued to wait for execution while
the service keeps moving forward thinking it is the master even when
that is not the case.
In order to accomplish this, the LocalNodeMasterListener no longer has
the executorName() method to prevent future uses that could encounter
this surprising behavior.
Each use was inspected and if the class was also a
ClusterStateListener, the implementation of LocalNodeMasterListener
was removed in favor of a single listener that combined the logic. A
single listener is used and there is currently no guarantee on execution
order between ClusterStateListeners and LocalNodeMasterListeners,
so a future change there could cause undesired consequences. For other
classes, the implementations of the callbacks were inspected and if the
operations were lightweight, the overriden executorName method was
removed to use the default, which runs on the same thread.
Backport of #59932
In #58877, when we switched test inference on java, we just
use the doc's `_source` as features. However, this could be
missing out on features that were used during training,
e.g. alias fields, etc.
This commit addresses this by extracting fields to use as
features during inference the same way they are extracted
in `DataFrameDataExtractor` when they are used for training.
Backport of #59963
If shard level results are incomplete in the data streams stats call, it is possible to get inaccurate
counts of the number of backing indices, despite this data being accurate and available in the
cluster state.