This commit adds an API to read translog snapshot from Lucene,
then cut-over from the existing translog to the new API in CCR.
Relates #30086
Relates #29530
* master:
Mute ML upgrade test (#30458)
Stop forking javac (#30462)
Client: Deprecate many argument performRequest (#30315)
Docs: Use task_id in examples of tasks (#30436)
Security: Rename IndexLifecycleManager to SecurityIndexManager (#30442)
[Docs] Fix typo in cardinality-aggregation.asciidoc (#30434)
Avoid NPE in `more_like_this` when field has zero tokens (#30365)
Build: Switch to building javadoc with html5 (#30440)
Add a quick tour of the project to CONTRIBUTING (#30187)
Reindex: Use request flavored methods (#30317)
Silence SplitIndexIT.testSplitIndexPrimaryTerm test failure. (#30432)
Auto-expand replicas when adding or removing nodes (#30423)
Docs: fix changelog merge
Fix line length violation in cache tests
Add stricter geohash parsing (#30376)
Add failing test for core cache deadlock
[DOCS] convert forcemerge snippet
Update forcemerge.asciidoc (#30113)
Added zentity to the list of API extension plugins (#29143)
Fix the search request default operation behavior doc (#29302) (#29405)
We started forking javac to avoid GC overhead when running builds. Yet,
we do not seem to have this problem anymore and not forking leads to a
substantial speed improvement. This commit stops forking javac.
Deprecate the many arguments versions of `performRequest` and
`performRequestAsync` in favor of the `Request` object flavored variants
introduced in #29623. We'll be dropping the many arguments variants in
7.0 because they make it difficult to add new features in a backwards
compatible way and they create a *ton* of intellisense noise.
We had been using `task_id:1` or `taskId:1` because it is parses as a
valid task identifier but the `:1` part is confusing. This replaces
those examples with `task_id` which matches the response from the list
tasks API.
Closes#28314
Previously only index and delete operations are indexed into Lucene,
therefore every segment should have both _id and _version terms as these
operations contain both terms. However, this is no longer guaranteed
after noop is also indexed into Lucene. A segment which contains only
no-ops does not have neither _id or _version because a no-op does not
contain these terms.
This change adds a dummy version to no-ops and makes _id terms optional
in PerThreadIDVersionAndSeqNoLookup.
Relates #30226
This commit renames IndexLifecycleManager to SecurityIndexManager as it
is not actually a general purpose class, but specific to security. It
also removes indirection in code calling the lifecycle service, instead
calling the security index manager directly.
GetSettingsResponseTests fails on the ccr branch because it excludes the
root-level settings while we introduced "index.soft_deletes" - a
root-level setting in this branch.
This commit moves "index.soft_deletes" to the grouped-level settings.
Moreover, as we are going to introduce more settings for soft_deletes, I
think it's better to organize all these settings into a single group.
Fixes and edge case when using `more_like_this` where TermVectorsWriter
could throw an NPE when a field produced zero tokens after analysis. This
changes the implementation to use an empty list of tokens in this case.
Closes#30148
* elastic-master:
Watcher: Mark watcher as started only after loading watches (#30403)
Pass the task to broadcast actions (#29672)
Disable REST default settings testing until #29229 is back-ported
Correct wording in log message (#30336)
Do not fail snapshot when deleting a missing snapshotted file (#30332)
AwaitsFix testCreateShrinkIndexToN
DOCS: Correct mapping tags in put-template api
DOCS: Fix broken link in the put index template api
Add put index template api to high level rest client (#30400)
Relax testAckedIndexing to allow document updating
[Docs] Add snippets for POS stop tags default value
Move respect accept header on no handler to 6.3.1
Respect accept header on no handler (#30383)
[Test] Add analysis-nori plugin to the vagrant tests
[Docs] Fix bad link
[Docs] Fix end of section in the korean plugin docs
Expose the Lucene Korean analyzer module in a plugin (#30397)
Docs: remove transport_client from CCS role example (#30263)
[Rollup] Validate timezone in range queries (#30338)
Use readFully() to read bytes from CipherInputStream (#28515)
Fix docs Recently merged #29229 had a doc bug that broke the doc build. This commit fixes.
Test: remove cluster permission from CCS user (#30262)
Add Get Settings API support to java high-level rest client (#29229)
Watcher: Remove unneeded index deletion in tests
Adds a description of the most important subdirectories to
`CONTRIBUTING.md` to help folks that are less familiar with the project
get their bearings. It reflects that state of the project right now so
it will inevitably go out of date. But I'll try and keep it up to date.
Auto-expands replicas in the same cluster state update (instead of a follow-up reroute) where nodes are added or removed.
Closes#1873, fixing an issue where nodes drop their copy of auto-expanded data when coming up, only to sync it again later.
Adds verification that geohashes are not empty and contain only
valid characters. It fixes the issue when en empty geohash is
treated as [-180, -90] and geohashes with non-geohash character
are getting resolved into invalid coordinates.
Closes#23579
Starting watcher should wait for the watcher to be started before
marking the status as started, which is now done via a callback.
Also, reloading watcher could set the execution service to paused. This could
lead to watches not being executed, when run in tests. This fix does not
change the paused flag in the execution service, just clears out the
current queue and executions.
Closes#30381
[CCR] added rest specs and simple rest test for follow and unfollow apis, also
Added an acknowledge field in follow and unfollow api responses. Currently these api return an empty response and fixed bug in unfollow api that didn't cleanup node tasks properly.
That PR changed the execution path of index settings default to be on the master
until the PR is back-ported the old master will not return default settings.
When deleting or creating a snapshot for a given shard, elasticsearch
usually starts by listing all the existing snapshotted files in the repository.
Then it computes a diff and deletes the snapshotted files that are not
needed anymore. During this deletion, an exception is thrown if the file
to be deleted does not exist anymore.
This behavior is challenging with cloud based repository implementations
like S3 where a file that has been deleted can still appear in the bucket for
few seconds/minutes (because the deletion can take some time to be fully
replicated on S3). If the deleted file appears in the listing of files, then the
following deletion will fail with a NoSuchFileException and the snapshot
will be partially created/deleted.
This pull request makes the deletion of these files a bit less strict, ie not
failing if the file we want to delete does not exist anymore. It introduces a
new BlobContainer.deleteIgnoringIfNotExists() method that can be used
at some specific places where not failing when deleting a file is
considered harmless.
Closes#28322
The test indexes new documents and is thus correct in testing that the response result
is `CREATED`. Sadly we can't guarantee exactly once delivery just yet.
Relates #9967Closes#21658
This commit introduces a soft-deletes retention merge policy based on
the global checkpoint. Some notes on this simple retention policy:
- This policy keeps all operations whose seq# is greater than the
persisted global checkpoint and configurable extra operations prior to
the global checkpoint. This is good enough for querying history changes.
- This policy is not watertight for peer-recovery. We send the
safe-commit in peer-recovery, thus we need to also send all operations
after the local checkpoint of that commit. This is analog to the min
translog generation for recovery.
- This policy is too simple to support rollback.
Relates #29530
Today when processing a request for a URL path for which we can not find
a handler we send back a plain-text response. Yet, we have the accept
header in our hand and can respect the accepted media type of the
request. This commit addresses this.
This change adds a new plugin called `analysis-nori` that exposes
Korean text analysis in es using the new Lucene Korean analyzer module named (`nori`).
The plugin adds:
* a Korean analyzer: `nori`
* a Korean tokenizer: `nori_tokenizer`
* a part of speech stop filter: `nori_part_of_speech`
* a filter that can replace Hanja characters with their Hangul transcription: `nori_readingform`
This commit removes the unnecessary transport_client cluster permission
from the role that is used as an example in our documentation. This
permission is not needed to use cross cluster search.
When validating the search request, we make sure any date_histogram
aggregations have timezones that match the jobs. But we didn't
do any such validation on range queries.
While it wouldn't produce incorrect results, it would be confusing
to the user as no documents would match the aggregation (because we
add a filter clause on the timezone for the agg).
Now the user gets an exception up front, and some helpful text about
why the range query didnt match, and which timezones are acceptable
Changes how data is read from CipherInputStream
Instead of using `read()` and checking that the bytes read are what we
expect, use `readFully()` which will read exactly the number of bytes
while keep reading until the end of the stream or throw an
`EOFException` if not all bytes can be read.
This approach keeps the simplicity of using CipherInputStream while
working as expected with both JCE and BCFIPS Security Providers
This commit updates the multi cluster search test with security so that
the user that is simply performing a multi cluster search does not have
any cluster permissions. This is done as none are needed by this user
and excess privileges could mask a behavior change.