When target indices are remote only, CCS does not require user to have privileges on the local cluster. This PR ensure Point-In-Time reader follows the same pattern.
Relates: #61827
Currently we log the NettyAllocator description when the netty plugin is
created. Unfortunately, this hits certain static fields in Netty which
triggers the settings of the number of CPU processors. This conflicts
with out Elasticsearch behavior to override this based on a setting.
This commit resolves the issue by logging after the processors have been
set.
CCS with remote indices only does not require any privileges on the local cluster.
This PR ensures that search with scroll follow the permission model.
* [DOCS] EQL: Improve regsvr32 misuse explanation (#62722)
Expands the introduction to better explain what regsvr32 misuse is and
how it works at a high level.
* [DOCS] EQL: Style fixes
Currently the netty pool chunk size defaults to 16MB. The number does
not play well with the G1GC which causes this to consume entire regions.
Additionally, we normally allocated arrays of size 64KB or less. This
means that Elasticsearch could handle a smaller pool chunk size to play
nicer with the G1GC.
This commit ensures that the final order of the terms aggregations
is registered correctly after the final reduce.
This bug was introduced in #62028 which is not released yet so this PR is marked
as a non-issue.
This issue was discovered when running a terms aggregation under an auto-date
histogram. In such a case, the auto-date histogram may run multiple final reduce
to merge buckets together. This change makes sure that running multiple final reduces
doesn't create duplicates but it doesn't fix the fact that the final reduce may prune
the list of terms prematurely. This other bug is tracked separately in #62731.
This assertion does not always hold because there can be a race between
`putReaderContext` and `afterIndexRemoved` when an index is deleted.
Closes#62624
This allows the `check-migration` step to move past the allocation check
if the tier routing settings are manually unset.
This helps a user unblock ILM in case a tier is removed (ie. if the warm tier
is decommissioned this will allow users to resume the ILM policies stuck in
`check-migration` waiting for the warm nodes to become available and the managed
index to allocate. this allows the index to allocate on the other available tiers)
(cherry picked from commit d7a1eaa7f51d0972d10c0df1d3cd77d6b755dd41)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
The distro tests rely on two jdks, pulled in by the jdk download plugin.
The move the artifact transforms result in the path to the extracted
jdks existing under the gradle cache dir, which is outside the vagrant
mount of the elasticsearch project. This commit creates a local copy
within the `qa:os` project that the packaging tests use.
closes#61138
Implement FORMAT according to the SQL Server spec: https://docs.microsoft.com/en-us/sql/t-sql/functions/format-transact-sql?view=sql-server-ver15#ExampleD by translating to the java.time patterns used in DATETIME_FORMAT.
Closes: #54965
Co-authored-by: Marios Trivyzas <matriv@users.noreply.github.com>
Co-authored-by: Bogdan Pintea <bogdan.pintea@elastic.co>
Co-authored-by: Andrei Stefan <astefan@users.noreply.github.com>
(cherry picked from commit da511f4e033db6e8a6aa2a54b23e906b5e026845)
This is a follow up of #62480 where we are oversizing one array when initialising. In addition it prevents a possible CircuitBreaker leak during initialisation.
Make serializing `RepositoryData` a little faster and split up/document the code for it a little
as well given how massive this method has gotten at this point.
As part of the conversion, adds the ability to customize merge validation - in this case, we
allow an update to the constant value if it is currently set to null, but refuse further
updates once it has been set once.
This commit also converts ParametrizedMapperTests to use MapperServiceTestCase.
If HyperLogLogPlusPlus failed during construction, it would
not release already allocated resources, causing the request
circuit breaker to not be adjusted down.
Closes#62439
This PR adds a new 'version' field type that allows indexing string values
representing software versions similar to the ones defined in the Semantic
Versioning definition (semver.org). The field behaves very similar to a
'keyword' field but allows efficient sorting and range queries that take into
accound the special ordering needed for version strings. For example, the main
version parts are sorted numerically (ie 2.0.0 < 11.0.0) whereas this wouldn't
be possible with 'keyword' fields today.
Valid version values are similar to the Semantic Versioning definition, with the
notable exception that in addition to the "main" version consiting of
major.minor.patch, we allow less or more than three numeric identifiers, i.e.
"1.2" or "1.4.6.123.12" are treated as valid too.
Relates to #48878
* Add Maven Central as a JDBC repository
Document Maven Central as a JDBC repository.
(cherry picked from commit 2bc4d7eb19a26bf21b11214c4351470b677e1598)
This commit adds a test that verifies that snapshots incrementality
is respected when a snapshot-backed index is snapshotted. This
test mounts a snapshot as a snapshot-backed index, creates a
new snapshot from it and then verifies that no new data blobs
were added to the repository.
The `standard` tokenfilter was removed by #33310, and should have been
unuseable in any indexes created since 7.0. However, a cacheing bug fixed
by #51092 meant that it was still possible in certain circumstances to create
indexes referencing the standard filter in versions up to 7.5.2. Our checks
in AnalysisModule still refer to 7.0.0, however, meaning that a cluster that
contains one of these rogue indexes cannot be upgraded.
This commit adjusts the AnalysisModule checks so that we only refuse to
build a mapping referring to standard filter if the index created version is
7.6 or later.
Fixes#62644
The autoscaling decision API now returns an absolute capacity,
and leaves the actual decision of whether a scale up or down
is needed to the orchestration system.
The decision API now returns both a tier and node level required
and current capacity as wells as a decider level breakdown of the
same though with in particular current memory still not populated.
In the context of of a recurring test failure tracked by #32827, we added trace logging and an extra cache key renderer argument to IndicesRequestCache#getOrCompute (see #39475 and #34180).
We addressed the issue with #54071, but the extra argument was left behind, with a NORELEASE comment saying it should be removed.
With this commit, we remove the extra cache key rendered argument and the corresponding log lines which are not so useful without it.
Closes#55837
This commit adds the `index.routing.allocation.prefer._tier` setting to the
`DataTierAllocationDecider`. This special-purpose allocation setting lets a user specify a
preference-based list of tiers for an index to be assigned to. For example, if the setting were set
to:
```
"index.routing.allocation.prefer._tier": "data_hot,data_warm,data_content"
```
If the cluster contains any nodes with the `data_hot` role, the decider will only allow them to be
allocated on the `data_hot` node(s). If there are no `data_hot` nodes, but there are `data_warm` and
`data_content` nodes, then the index will be allowed to be allocated on `data_warm` nodes.
This allows us to specify an index's preference for tier(s) without causing the index to be
unassigned if no nodes of a preferred tier are available.
Subsequent work will change the ILM migration to make additional use of this setting.
Relates to #60848
This commit changes the yamlRestTest and javaRestTest tasks to be lazily created.
This change requires pro-actively creating the testClusters container so that the
configuration can be applied without any changes to the build.gradle files.
related: #60261
related: #47804
Backports #61590 to 7.x
So far we don't allow metadata fields in the document _source. However, in the case of the _doc_count field mapper (#58339) we want to be able to set
This PR adds a method to the metadata field parsers that exposes if the field can be included in the document source or not.
This way each metadata field can configure if it can be included in the document _source
This commit adjusts the following APIs so now they not only support an `_all` case, but wildcard patterned Ids as well.
- `GET _ml/calendars/<calendar_id>/events`
- `GET _ml/calendars/<calendar_id>`
- `GET _ml/anomaly_detectors/<job_id>/model_snapshots/<snapshot_id>`
- `DELETE _ml/anomaly_detectors/<job_id>/_forecast/<forecast_id>`
We removed index-time boosting back in 5x, and we no longer document the 'boost'
parameter on any of our mapping types. However, it is still possible to define an
index-time boost on a field mapper for a surprisingly large number of field types, and
they even have an effect (sometimes, on some queries).
As a first step in finally removing all traces of index time boosting, this comment emits
a deprecation warning whenever a boost parameter is found on a mapping definition.
* [ML] Add new include flag to GET inference/<model_id> API for model training metadata (#61922)
Adds new flag include to the get trained models API
The flag initially has two valid values: definition, total_feature_importance.
Consequently, the old include_model_definition flag is now deprecated.
When total_feature_importance is included, the total_feature_importance field is included in the model metadata object.
Including definition is the same as previously setting include_model_definition=true.
* fixing test
* Update x-pack/plugin/core/src/test/java/org/elasticsearch/xpack/core/ml/action/GetTrainedModelsRequestTests.java
In #62357 we introduced an additional optimization that allows us to skip the
most of the fetch phase early if no results are found. This change caused
some cancellation test failures that were relying on definitive cancellation
during the fetch phase. This commit adds an additional quick cancellation
check at the very beginning of the fetch phase to make cancellation process
more deterministic.
Fixes#62530
Changes the way we collecting ordinals in the Cardinality aggregation from Lucene FixedBitSet to BitArray. The benefit is that BitArray is tracked by our Circuit breakers so it is safer.
Today when a snapshot restore is aborted (for example when the index is
explicitly deleted) while the restoration of the files from the repository has
already started the file restores are not interrupted. It means that Elasticsearch
will continue to read the files from the repository and will continue to write
them to disk until all files are restored; the store will then be closed and
files will be deleted from disk at some point but this can take a while. This
will also take some slots in the SNAPSHOT thread pool too. The Recovery
API won't show any files actively being recovered, the only notable
indicator would be the active threads in the SNAPSHOT thread pool.
This commit adds a check before reading a file to restore and before
writing bytes on disk so that a closing store can be detected more
quickly and the file recovery process aborted. This way the file
restores just stops and for most of the repository implementations
it means that no more bytes are read (see #62370 for S3), finishing
threads in the SNAPSHOT thread pool more quickly too.
Async search tests can take more than one minute due to the excessive trace logs.
And the point in time in the tests can be expired the midway.
Closes#62451