That PR changed the execution path of index settings default to be on the master
until the PR is back-ported the old master will not return default settings.
When deleting or creating a snapshot for a given shard, elasticsearch
usually starts by listing all the existing snapshotted files in the repository.
Then it computes a diff and deletes the snapshotted files that are not
needed anymore. During this deletion, an exception is thrown if the file
to be deleted does not exist anymore.
This behavior is challenging with cloud based repository implementations
like S3 where a file that has been deleted can still appear in the bucket for
few seconds/minutes (because the deletion can take some time to be fully
replicated on S3). If the deleted file appears in the listing of files, then the
following deletion will fail with a NoSuchFileException and the snapshot
will be partially created/deleted.
This pull request makes the deletion of these files a bit less strict, ie not
failing if the file we want to delete does not exist anymore. It introduces a
new BlobContainer.deleteIgnoringIfNotExists() method that can be used
at some specific places where not failing when deleting a file is
considered harmless.
Closes#28322
The test indexes new documents and is thus correct in testing that the response result
is `CREATED`. Sadly we can't guarantee exactly once delivery just yet.
Relates #9967Closes#21658
Today when processing a request for a URL path for which we can not find
a handler we send back a plain-text response. Yet, we have the accept
header in our hand and can respect the accepted media type of the
request. This commit addresses this.
This change adds a new plugin called `analysis-nori` that exposes
Korean text analysis in es using the new Lucene Korean analyzer module named (`nori`).
The plugin adds:
* a Korean analyzer: `nori`
* a Korean tokenizer: `nori_tokenizer`
* a part of speech stop filter: `nori_part_of_speech`
* a filter that can replace Hanja characters with their Hangul transcription: `nori_readingform`
This commit removes the unnecessary transport_client cluster permission
from the role that is used as an example in our documentation. This
permission is not needed to use cross cluster search.
When validating the search request, we make sure any date_histogram
aggregations have timezones that match the jobs. But we didn't
do any such validation on range queries.
While it wouldn't produce incorrect results, it would be confusing
to the user as no documents would match the aggregation (because we
add a filter clause on the timezone for the agg).
Now the user gets an exception up front, and some helpful text about
why the range query didnt match, and which timezones are acceptable
Changes how data is read from CipherInputStream
Instead of using `read()` and checking that the bytes read are what we
expect, use `readFully()` which will read exactly the number of bytes
while keep reading until the end of the stream or throw an
`EOFException` if not all bytes can be read.
This approach keeps the simplicity of using CipherInputStream while
working as expected with both JCE and BCFIPS Security Providers
This commit updates the multi cluster search test with security so that
the user that is simply performing a multi cluster search does not have
any cluster permissions. This is done as none are needed by this user
and excess privileges could mask a behavior change.
This PR adds support for the Get Settings API to the java high-level rest client.
Furthermore, logic related to the retrieval of default settings has been moved from the rest layer into the transport layer and now default settings may be retrieved consistency via both the rest API and the transport API.
Upgrade to lucene-7.4.0-snapshot-1ed95c097b
This version contains:
* An Analyzer for Korean
* An IntervalQuery and IntervalsSource that retrieve minimum intervals of positional queries.
* A new API to retrieve matches (offsets and positions) of a query for a single document.
* Support for soft deletes in the index writer.
* A fixed shingle filter that handles index time synonyms.
* Support for emoji sequence in ICUTokenizer (with an upgrade to icu 61.1)
When the watcher service pauses execution due to a cluster state update,
the trigger service and its engines also need to pause properly instead
of keeping going. This is also important when the .watches index is
deleted, so that watches don't stay in a triggered mode.
The IndexAndAliasesResolver resolves the indices and aliases for each
request and also handles local and remote indices. The current
implementation uses the ResolvedIndices class to hold the resolved
indices and aliases. While evaluating the indices and aliases against
the user's permissions, the final value for ResolvedIndices is
constructed. Prior to this change, this was done by creating a
ResolvedIndices for the first set of indices and for each additional
addition, a new ResolvedIndices object is created and merged with
the existing one. With a small number of indices and aliases this does
not pose a large problem; however as the number of indices/aliases
grows more list allocations and array copies are needed resulting in a
large amount of garbage and severely impacted performance.
This change introduces a builder for ResolvedIndices that appends to
mutable lists until the final value has been constructed, which will
ultimately reduce the amount of garbage generated by this code.
QA tests that use security need to use a trial license instead of a
basic license. Basic licenses do not enable security so these tests are
not running in the expected configuration. This can also lead to issues
that seem completely unrelated such as authentication failures right
after cluster formation.
The authentication failure right after cluster formation happens since
a request makes it past the authentication license checks and the
request starts the authentication process. During authentication the
cluster forms and the license is updated to a basic license, which
disables all realms. The request that is being authenticated then tries
to iterate over the realms, but the realms are empty and the request
cannot be authenticated. This results in a authentication failure even
though the credentials provided are correct.
Closes#30306
When dealing with filtering, a composite aggregation might return empty
buckets (which have been filtered) which gets sent as is to the client.
Unfortunately this interprets the response as no more data instead of
retrying.
This now has changed and the listener keeps retrying until either the
query has ended or data passes the filter.
Fix#30292
This commit fixes an issue with the data diagnostics were
empty buckets are not reported even though they should. Once
a job is reopened, the diagnostics do not get initialized from
the current data counts (especially the latest record timestamp).
The result is that if the data that is sent have a time gap compared
to the previous ones, that gap is not accounted for in the empty bucket
count.
This commit fixes that by initializing the diagnostics with the current
data counts.
Closes#30080
We were recently looking at bugs that can only occur if two different documents were indexed concurrently. For example, what happens if the local checkpoint advances above the sequence number of a document that's being indexed. That can only happen if another concurrent operation caused the checkpoint to advance. It has to be another document to allow concurrency as we acquire a per uid lock.While our investigation proved that the suspected bug doesn't exists, we still discovered our unit testing coverage is not good enough to cover this case.
This PR extend the test concurrent out of order replica processing to use two documents in its history.
The Get Repositories response object held a list of RepositoryMetaData
entries. This object does not have the from/toXContent methods that are
needed to expose this to the high level REST client. The
RepositoriesMetaData, however, does, and it also contains a list of
RepositoryMetaData objects within it. So rather than duplicate this
logic or move it (RepositoriesMetaData is a fragment object used by
cluster state), the object holding state in the Response was changed to
use the RepositoriesMetaData instead. This also cleans up the read/write
methods in the response, as they can now use the same read/write in
RepositoriesMetaData, which also were not present in the singular class.
Each test now has its own watch id that is being used.
This ensures there are no old history entries, which can potentially
lead to broken test assertions.
Uses a filter on the copy task for the eclipse settings files to
replace the token @@LICENSE_HEADER_TEXT@@ with the correct licence
header from the new buildSrc/src/main/resources/license-headers
directory
The current implementation starts/stops watcher using an executor. This
can result in our of order operations.
This commit reduces those executor calls to an absolute minimum in order
to be able to do state changes within the cluster state listener method,
which runs in sequence.
When a state change occurs that forces the watcher service to pause
(like no watcher index, no master node, no local shards), the service is
now in a paused state.
Pausing is a super lightweight operation, which marks the
ExecutionService as paused and waits for the currently executing watches
to finish in the background via an executor. The same applies for
stopping, the potentially long running operation is outsourced in to an
executor, as waiting for executed watches is decoupled from the current
state.
The only other long running operation is starting, where watches need to
be loaded. This is also done via an executor, but has an additional
protection by checking the cluster state version it was started with. If
another cluster state version was trying to load the watches, then this
loading will not take effect.
This PR also cleans up some unused states, like the a simple boolean in
the HistoryStore/TriggeredWatchStore marking it as started or stopped,
as this can now be caught in the execution service.
Another advantage of this approach is the fact, that now only triggered
watches are not getting executed, while watches that are run via the
Execute Watch API will still be executed regardless if watcher is
stopped or not.
Lastly the TickerScheduleTriggerEngine thread now only starts on data nodes.
This commit is a follow up to #30135. It updates the stream
compatibility versions in the start_trial requests and responses to
reflect that fact that this work has been backported to 6.3.
This commit adds setting the homedir for the elasticsearch user to the
adduser command in the packaging preinstall script. While the
elasticsearch user is a system user, it is sometimes conventient to have
an existing homedir (even if it is not writeable). For example, running
cron as the elasticsearch user will try to change dir to the homedir.
closes#14453
Fix NPE when CumulativeSum agg encounters null/empty bucket
If the cusum agg encounters a null value, it's because the value is
missing (like the first value from a derivative agg), the path is
not valid, or the bucket in the path was empty.
Previously cusum would just explode on the null, but this changes it
so we only increment the sum if the value is non-null and finite.
This is safe because even if the cusum encounters all null or empty
buckets, the cumulative sum is still zero (like how the sum agg returns
zero even if all the docs were missing values)
I went ahead and tweaked AggregatorTestCase to allow testing pipelines,
so that I could delete the IT test and reimplement it as AggTests.
Closes#27544
Necessary changes so that the licensing functionality can be
used in a JVM in FIPS 140 approved mode.
* Uses adequate salt length in encryption
* Changes key derivation to PBKDF2WithHmacSHA512 from a custom
approach with SHA512 and manual key stretching
* Removes redundant manual padding
Other relevant changes:
* Uses the SAH512 hash instead of the encrypted key bytes as the
key fingerprint to be included in the license specification
* Removes the explicit verification check of the encryption key
as this is implicitly checked in signature verification.