This change adds a dynamic cluster setting named `indices.id_field_data.enabled`.
When set to `false`, any attempt to load the fielddata for the `_id` field will fail
with an exception. The default value in this change is set to `false` in order to prevent
fielddata usage on this field in future versions, but it will be set to `true` when backporting
to 7.x. When the setting is set to `true` (manually or by default in 7.x), the loading will also issue
a deprecation warning, since we want to disallow fielddata entirely when https://github.com/elastic/elasticsearch/issues/26472
is implemented.
Closes #43599
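A minimal sketch of how such a dynamic boolean cluster setting is typically declared with Elasticsearch's `Setting` API; the holder class name is hypothetical, and the default of `true` matches the 7.x backport described above:
```
import org.elasticsearch.common.settings.Setting;
import org.elasticsearch.common.settings.Setting.Property;

public class IdFieldDataSettings {
    // Dynamic (runtime-updatable) boolean cluster setting; the 7.x default
    // of true is shown here, per the description above.
    public static final Setting<Boolean> INDICES_ID_FIELD_DATA_ENABLED_SETTING =
        Setting.boolSetting("indices.id_field_data.enabled", true, Property.Dynamic, Property.NodeScope);
}
```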
This commit moves the packaging tests for elasticsearch-setup-passwords
from bats to Java. The change also enables future tests to enable
security in Elasticsearch and automatically have waitForElasticsearch
work correctly, at least to the same extent it worked in bats, by
waiting on the ES port instead of the health check.
relates #46005
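A sketch of the port-based wait described above, assuming plain JDK sockets; the class and method names are illustrative, not the actual test utility:
```
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class WaitForPort {
    // Poll until the port accepts TCP connections or the deadline passes.
    static void waitForPort(String host, int port, long timeoutMillis) throws IOException, InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (true) {
            try (Socket socket = new Socket()) {
                socket.connect(new InetSocketAddress(host, port), 1000);
                return; // the port is open, so the node is listening
            } catch (IOException e) {
                if (System.currentTimeMillis() > deadline) {
                    throw new IOException("timed out waiting for " + host + ":" + port, e);
                }
                Thread.sleep(500);
            }
        }
    }
}
```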
This commit sets an output marker file for the docker build tasks so
that they can be tracked as up to date. It also fixes the docker build
context task to omit the build date as an input property, which always
left the task out of date.
relates #49359
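A hedged sketch of the pattern in a Gradle task class written in Java; the task name and marker path are assumptions, not the build's actual ones:
```
import java.io.File;
import org.gradle.api.DefaultTask;
import org.gradle.api.tasks.OutputFile;
import org.gradle.api.tasks.TaskAction;

public class DockerBuildMarkerTask extends DefaultTask {
    // Declaring an output lets Gradle track the task as up to date. Volatile
    // values such as a build date must stay out of the task's inputs, or the
    // task is always considered out of date.
    private final File markerFile = new File(getProject().getBuildDir(), "markers/dockerBuild.marker");

    @OutputFile
    public File getMarkerFile() {
        return markerFile;
    }

    @TaskAction
    public void build() throws Exception {
        // ... run `docker build` here, then touch the marker ...
        markerFile.getParentFile().mkdirs();
        markerFile.createNewFile();
    }
}
```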
This commit clarifies how to override JAVA_HOME from the bundled jdk for
deb and rpm installs, which each have their own file that is sourced
upon service startup.
closes #49068
Fix the reference to the uid:gid that Elasticsearch runs as inside
the Docker container and add a packaging test to ensure that bind
mounting a data dir with a random uid and gid:0 works as
expected.
Backport of #49529
Closes #47929
JavaDateFormatter should keep the pattern with the prefixed 8 as it will be used for serialisation. The stripped pattern should be used for the enclosed formatters.
closes #48698
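Illustrative-only Java showing the split described above: the original pattern (with its leading `8`) is what should be serialised, while the stripped pattern feeds the enclosed `java.time` formatter:
```
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

class PrefixedPatternDemo {
    public static void main(String[] args) {
        String pattern = "8yyyy-MM-dd"; // as configured; keep this for serialisation
        String stripped = pattern.startsWith("8") ? pattern.substring(1) : pattern;
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern(stripped); // enclosed formatter
        System.out.println(formatter.format(LocalDate.of(2019, 11, 27)));   // prints 2019-11-27
    }
}
```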
Fixes a bug where a scripted upsert that causes a dynamic mapping update is retried (because
the mapping update is still in flight), and the request is mutated multiple times.
Closes #48670
Backport of #49076
If an exception occurs inside a pipeline processor,
the pipeline stack is kept around as a header in the exception.
Then, in the on_failure processor, the id of the pipeline in which the
exception occurred is made accessible via the `on_failure_pipeline`
ingest metadata.
Closes #44920
Authentication has grown more complex with the addition of new realm
types and authentication methods. When user authentication does not
behave as expected it can be difficult to determine where and why it
failed.
This commit adds DEBUG and TRACE logging at key points in the
authentication flow so that it is possible to gain additional insight
into the operation of the system.
Backport of: #49575
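A sketch of the kind of logging added, assuming the standard Log4j API used in Elasticsearch; the class, method, and messages here are illustrative rather than the exact statements from the change:
```
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

class RealmAuthenticationLogging {
    private static final Logger logger = LogManager.getLogger(RealmAuthenticationLogging.class);

    // Illustrative call site: log each realm's verdict as the chain is consulted.
    void onRealmResult(String realmName, String principal, boolean success) {
        if (success) {
            logger.debug("realm [{}] authenticated user [{}]", realmName, principal);
        } else {
            logger.trace("realm [{}] could not authenticate user [{}]", realmName, principal);
        }
    }
}
```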
The AuthenticationService has a feature to "smart order" the realm
chain so that whichever realm was the last one to successfully
authenticate a given user will be tried first when that user tries to
authenticate again.
There was a bug where the building of this realm order would
incorrectly drop the first realm from the default chain unless that
realm was the "last successful" realm.
In most cases this didn't cause problems because the first realm is
the reserved realm and so it is unusual for a user that authenticated
against a different realm to later need to authenticate against the
reserved realm.
This commit fixes that bug and adds relevant asserts and tests.
Backport of: #49473
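A hedged sketch of the corrected ordering logic; names and types are simplified and do not mirror the actual `AuthenticationService` internals:
```
import java.util.ArrayList;
import java.util.List;

class RealmOrdering {
    // Correct ordering: the last successful realm first, then every other
    // realm from the default chain in order. The buggy variant dropped the
    // first realm of the default chain instead of skipping only the duplicate.
    static <R> List<R> smartOrder(List<R> defaultChain, R lastSuccessful) {
        List<R> ordered = new ArrayList<>(defaultChain.size());
        ordered.add(lastSuccessful);
        for (R realm : defaultChain) {
            if (realm.equals(lastSuccessful) == false) { // skip only the duplicate
                ordered.add(realm);
            }
        }
        return ordered;
    }
}
```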
This commit fixes #49587. Due to a settings change, the broken test was
asserting on the incorrect setting. This commit fixes that issue and
adds additional assertions to ensure that all settings are working
properly.
Preliminary change to shorten the diff of #49060. In #49060
we execute cluster state updates during the writing of a new
index generation, and thus the write must be an async API.
All the implementations of `EsBlobStoreTestCase` use the exact same
bootstrap code that is also used by their implementation of
`EsBlobStoreContainerTestCase`.
This means all tests might as well live under `EsBlobStoreContainerTestCase`,
saving a lot of code duplication. Also, there was no HDFS implementation for
`EsBlobStoreTestCase`; moving the tests over resolves this automatically,
since there is an HDFS implementation for the container tests.
The annotated text mapper has a field type that currently extends StringFieldType,
which means that all the position-related query factory methods need to be copied
over from TextFieldType. In addition, MappedFieldType.intervals() hasn't been
overridden, so you can't use intervals queries with annotated text - a major drawback,
since one of the purposes of annotated text is to be able to run positional queries against
annotations.
This commit changes the annotated text field type to extend TextFieldType instead,
adding tests to ensure that position queries work correctly.
Closes #49289
Much the same as #49518, but for GCS.
Fixing a few more spots where an input stream can get closed
without being fully drained, and adding assertions to make sure
it's always drained.
Moved the no-close stream wrapper to production code utilities since
there are a number of spots in production code where it's also useful
(will reuse it there in a follow-up).
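A minimal sketch of the two utilities described, assuming plain `java.io`; the class names are illustrative, not the actual production helpers:
```
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Shields the underlying stream from close() so a caller cannot close it early.
final class NoCloseInputStream extends FilterInputStream {
    NoCloseInputStream(InputStream in) {
        super(in);
    }

    @Override
    public void close() {
        // intentionally do not close the delegate
    }
}

final class StreamUtil {
    // Read and discard the remaining bytes so the connection can be reused.
    static void drain(InputStream in) throws IOException {
        byte[] buffer = new byte[8192];
        while (in.read(buffer) != -1) {
            // keep reading until EOF
        }
    }
}
```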
Grouping by YEAR() is translated to a histogram aggregation, but
previously, if there was a scalar function involved (e.g.:
`YEAR(date + INTERVAL 2 YEARS)`), no proper script was created
and the histogram was applied to a field named `date + INTERVAL 2 YEARS`,
which doesn't make sense and resulted in a null result.
Now the underlying field of YEAR() is checked, and if it's a function,
`asScript()` is called to properly build the painless script on which the
histogram is applied.
Fixes: #49386
(cherry picked from commit 93c37abc943d00d3a14ba08435d118a6d48874c7)
Backport of #49552.
Closes #49428. The code that works out an HTTP code for an exception didn't
consider the JsonParseException case, which meant that an invalid JSON request could
result in a 500 Internal Server Error. Now it returns 400 Bad Request.
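A hedged sketch of the mapping change, assuming Jackson's `JsonParseException` and Elasticsearch's `RestStatus`; the surrounding method is illustrative:
```
import com.fasterxml.jackson.core.JsonParseException;
import org.elasticsearch.rest.RestStatus;

class ExceptionStatus {
    // Malformed JSON in the request body is the client's fault, so map it to 400.
    static RestStatus statusFor(Throwable t) {
        if (t instanceof JsonParseException) {
            return RestStatus.BAD_REQUEST; // previously fell through to 500
        }
        return RestStatus.INTERNAL_SERVER_ERROR;
    }
}
```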
Add TRUNC as an alias to the already implemented TRUNCATE
numeric function, which is the flavour supported by
Oracle and PostgreSQL.
Relates to: #41195
(cherry picked from commit f2aa7f0779bc5cce40cc0c1f5e5cf1a5bb7d84f0)
We depend on the number of data frame rows in order to report progress
for the writing of results, the last phase of a job run. However, results
include objects other than just the data frame rows (e.g. progress, inference model, etc.).
The problem this commit fixes is that when we receive the last data frame row results
we would report that progress is complete, even though we potentially still have
more results to process. If the job gets stopped for any reason at this point,
we will not be able to restart the job properly as we'll think that the job was completed.
This commit addresses this by limiting the max progress we can report for the
writing_results phase to 98 until the results processor completes.
At the end, when the process is done, we set the progress to 100.
The commit also improves failure capturing and reporting in the results processor.
Backport of #49551
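A toy version of the capping logic, under the assumption that progress is derived from a rows-written ratio; names are hypothetical:
```
class WritingResultsProgress {
    // Cap at 98 while results are still flowing; only the final completion
    // step is allowed to report 100.
    static int progress(long rowsWritten, long totalRows, boolean processorDone) {
        if (processorDone) {
            return 100;
        }
        int raw = (int) (100.0 * rowsWritten / Math.max(1, totalRows));
        return Math.min(98, raw);
    }
}
```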
Previously, CaseProcessor pre-calculated (called `process()` on)
all the building elements of a CASE/IIF expression: not only the
conditions involved but also the results, as well as the final else result.
If one of those results involved an erroneous calculation
(e.g.: division by zero), it was executed and resulted in
an exception being thrown, even if the result was not used because of
the condition guarding it, e.g.:
```
SELECT CASE WHEN myField1 = 0 THEN NULL ELSE myField2 / myField1 END
FROM test;
```
Fixes: #49388
(cherry picked from commit dbd169afc98686cae1bc72024fad0ca32b272efd)
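A simplified sketch of the lazy evaluation the fix introduces, with `Supplier`-based branches standing in for the real expression tree:
```
import java.util.List;
import java.util.function.Supplier;

class LazyCase {
    static final class Branch {
        final Supplier<Boolean> condition;
        final Supplier<Object> result;

        Branch(Supplier<Boolean> condition, Supplier<Object> result) {
            this.condition = condition;
            this.result = result;
        }
    }

    // Only the branch whose guard matches is computed, so a division by zero
    // in another (unreached) branch is never executed.
    static Object evaluate(List<Branch> branches, Supplier<Object> elseResult) {
        for (Branch branch : branches) {
            if (Boolean.TRUE.equals(branch.condition.get())) {
                return branch.result.get();
            }
        }
        return elseResult.get();
    }
}
```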
The default is set to Integer.MAX_VALUE but is reported to be `0` in the docs.
With the current implementation a value of 0 would mean all terms are filtered
out, which is the opposite of "unbounded".
Closes #49520
The breaking changes cover the removal of TLSv1 from the default
protocols, but assume that users who need to retain TLSv1 support will
understand all the places where they may use it.
This has proven not to be true, as it is easy to be unaware that (for
example) an LDAP server is using TLSv1.
This change explicitly lists all the places where TLS protocols may
need to be configured.
Co-Authored-By: Lisa Cawley <lcawley@elastic.co>
Co-Authored-By: Pius <pius@elastic.co>
We do not guarantee that EngineTestCase#getDocIds is called after the
engine has been externally refreshed. Hence, we can trip the assertion
in assertSearcherIsWarmedUp.
CI: https://gradle-enterprise.elastic.co/s/pm2at5qmfm2iu
Relates #48605
This commit backports three commits related to enabling the simple
connection strategy.
Allow simple connection strategy to be configured (#49066)
Currently the simple connection strategy only exists in the code. It
cannot be configured. This commit moves in the direction of allowing it
to be configured. It introduces settings for the addresses and socket
count. Additionally it introduces new settings for the sniff strategy
so that the more generic number of connections and seed node settings
can be deprecated.
The simple settings are not yet registered as the registration is
dependent on follow-up work to validate the settings.
Ensure at least 1 seed configured in remote test (#49389)
This fixes #49384. Currently when we select a random subset of seed
nodes from a list, it is possible for 0 seeds to be selected. This test
depends on at least 1 seed being selected.
Add the simple strategy to cluster settings (#49414)
This is related to #49067. This commit adds the simple connection
strategy settings and strategy mode setting to the cluster settings
registry. With these changes, the simple connection mode can be used.
Additionally, it adds validation to ensure that settings cannot be
misconfigured.
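A hedged sketch of what per-cluster settings for the simple strategy could look like using Elasticsearch's affix-setting API; the exact keys and defaults in #49066/#49414 may differ from these illustrative ones:
```
import java.util.Collections;
import java.util.List;
import java.util.function.Function;
import org.elasticsearch.common.settings.Setting;
import org.elasticsearch.common.settings.Setting.Property;

class SimpleStrategySettings {
    // Addresses to open simple connections to, per remote cluster alias.
    public static final Setting.AffixSetting<List<String>> SIMPLE_ADDRESSES = Setting.affixKeySetting(
        "cluster.remote.",
        "simple.addresses",
        key -> Setting.listSetting(key, Collections.emptyList(), Function.identity(), Property.Dynamic, Property.NodeScope));

    // Number of sockets to open per remote cluster alias (default shown is illustrative).
    public static final Setting.AffixSetting<Integer> SIMPLE_SOCKET_CONNECTIONS = Setting.affixKeySetting(
        "cluster.remote.",
        "simple.socket_connections",
        key -> Setting.intSetting(key, 18, 1, Property.Dynamic, Property.NodeScope));
}
```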
The new CompensatedSum is a nice DRY refactor, but had the unanticipated
side effect of creating a lot of object allocation in the aggregation hot collection
loop: one object per visited document, per aggregator. In some places it
created two per doc per agg (weighted avg, geo centroid, etc.) since there
were multiple compensations being maintained.
This PR moves the object creation out of the hot loop so that it is now
created once per segment, and resets the internal state each time through
the loop.
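A condensed sketch of the pattern, using the real `CompensatedSum` class; the surrounding collector shape is simplified:
```
import org.elasticsearch.search.aggregations.metrics.CompensatedSum;

class ReusedKahanState {
    // Allocated once (per segment in the real code), not once per document.
    private final CompensatedSum kahanSummation = new CompensatedSum(0, 0);

    void collect(long bucket, double value, double[] sums, double[] compensations) {
        // Reload this bucket's running state into the reused object ...
        kahanSummation.reset(sums[(int) bucket], compensations[(int) bucket]);
        kahanSummation.add(value);
        // ... and write it back: no per-document allocation.
        sums[(int) bucket] = kahanSummation.value();
        compensations[(int) bucket] = kahanSummation.delta();
    }
}
```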
* Adds a title abbreviation
* Relocates the older name deprecation warning
* Updates the description and adds a Lucene link
* Adds a note to explain payloads and how to store them
* Adds analyze and custom analyzer snippets
* Adds a 'Return stored payloads' example
The categorization job wizard in the ML UI will use this
information when showing the effect of the chosen categorization
analyzer on a sample of input.
Before this change, excluding an unsupported field resulted in
an error message explaining that the excluded field could not be
detected, as if it did not exist. This error message is confusing.
This commit changes this so that there is no error in this
scenario. When excluding a field that does exist but has been
automatically excluded from the analysis, there is no harm
(unlike excluding a missing field, which could be a typo).
Backport of #49535
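A toy version of the relaxed validation, under the assumption that the detector distinguishes "field missing entirely" from "field auto-excluded"; names are hypothetical:
```
import java.util.Set;

class ExcludeValidation {
    // Excluding a field that does not exist at all is still an error (it could
    // be a typo); excluding one that was already auto-excluded is a no-op.
    static void validateExclude(String field, Set<String> allFields) {
        if (allFields.contains(field) == false) {
            throw new IllegalArgumentException("No field [" + field + "] could be detected");
        }
        // The field exists; if it was automatically excluded from the analysis,
        // excluding it explicitly is harmless.
    }
}
```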