* es/master:
Watcher: Fold two smoke test projects into smoke-test-watcher (#30137)
In the field capabilities API, deprecate support for providing fields in the request body. (#30157)
Set JAVA_HOME before forking setup commands (#29647)
Remove animal sniffer from low-level REST client (#29646)
Cleanup .gitignore (#30145)
Do not add noop from local translog to translog again (#29637)
Build: Assert jar LICENSE and NOTICE files match
Correct transport compression algorithm in docs (#29645)
[Test] Fix docs check for DEB package in packaging tests (#30126)
Painless: Docs Clean Up (#29592)
Fixes Eclipse build for sql jdbc project (#30114)
Remove reference to `not_analyzed`.
[Docs] Add community analysis plugin (#29612)
In order to have less qa/ projects, this commit removes the
watcher-smoke-test-mustache and the watcher-smoke-test-painless projects
and moves the tests of both into the watcher-smoke-test projects.
This results in less projects to parse when calling gradle and a shorter test
time due to being able to run everything within one elasticsearch
instance.
Adds tasks that check that the all jars that we build have LICENSE.txt
and NOTICE.txt files and that the files are correct. Sets check to
depend on these task.
This is mostly there for extra parnoia because we automatically
configure all Jar tasks to include the LICENSE.txt and NOTICE.txt
files anyway. But it is quite possible to add configuration to those
tasks that would override either file.
This causes check to depend on several more things than it used to.
Take, for example, javadoc:
check depends on the new verifyJavadocJarNotice which depends on
extractJavadocJar which depends on javadocJar which depends on
javadoc, this check now depends on javadoc.
The bundled configuration isn't recognised by eclipse so these
dependencies are missed when it imports the `x-pack:plugin:sql:jdbc`
project. This change makes these dependencies compile dependencies if
the build is running for Eclipse.
Tests need to wait for changes to the job's established memory usage to
propagate and an over enthusiastic optimisation meant jobs were updated
from stale state causing recent change to be lost.
This commit fixes the classpath for the SQL CLI tool on Windows. As the
x-pack bin folder was collapsed into the distribution bin folder, the
location of the classpath here needed to no longer contain the old
plugins directory.
This commit adds the distribution type to the startup scripts so that we
can discern from log output and the main response the type of the
distribution (deb/rpm/tar/zip).
With the move of X-Pack to a module, the classpath for the scripts needs
to be adjusted. This was done on Unix, but not for Windows. This commit
addresses Windows.
This commit moves the apache and elastic license files into a new
root level `licenses` directory and rewrites the top level LICENSE.txt
to clarify the repository has a mix of apache and elastic licensed code.
This commit adapts the license headers check to no longer look for the
ealsticsearch confidential license, but instead to look for the new
Elastic License header.
This commit adds the distribution flavor (default versus oss) to the
build process which is passed through the startup scripts to
Elasticsearch. This change will be used to customize the message on
attempting to install/remove x-pack based on the distribution flavor.
This commit makes x-pack a module and adds it to the default
distrubtion. It also creates distributions for zip, tar, deb and rpm
which contain only oss code.
The follow index api completely reuses CCS infrastructure that was exposed via:
https://github.com/elastic/elasticsearch/pull/29495
This means that the leader index parameter support the same ccs index
to indicate that an index resides in a different cluster.
I also added a qa module that smoke tests the cross cluster nature of ccr.
The idea is that this test just verifies that ccr can read data from a
remote leader index and that is it, no crazy randomization or indirectly
testing other features.
keep track of shard follow stats inside shard follow stats' node task instead of persistent task status.
By maintaining the shard follow stats inside its node task the stats update is quicker as
no cluster state update is required. The stats are now transient; meaning if the task
is going to run a different node then the stats are gone too. Currently only the processed
global checkpoint is being tracked and this is being restored when a shard follow node task
starts via the indices stats api (the reason of the first change of this change). Other stats
that we may add in the future (like fetch_time, see: https://gist.github.com/s1monw/dba13daf8493bf48431b72365e110717)
it is ok if we start from zero in case a shard follow task moves to another node.
This limit is based on the number of estimate bytes in each translog
operation that fall between the minimum and maximum request sequence number.
If this limit is met then the shard follow task executor will make sure
that a subsequent shard changes request will be performed to fetch the
remaining translog operations.
This limit is needed in order to protect against returning too many
translog operations in a single shard changes response.
Relates to #2436
We check for the existence of both leader and follower index, then properly
report to the caller. However, we do not return after reporting failure. This
causes the caller receive exception twice: IllegalArgumentException then
NullPointerException. This commit makes sure to stop the action after reporting
failure.
This commit enables the run task for ccr by specifying that the ccr
project not be evaluated until after core is evaluated. This is
important since ccr is alphabetically before core and thus Gradle
evaluates it first.
Relates #3665
The checkpoints in the assertion message that the follower checkpoint is
less than the leader checkpoint are backwards. This commit fixes this
message.
The shard follow task executor determines the range of translog operations
between the leader shard's global checkpoint and the last know processed
seqno by the current shard follow task that are missing.
Then the chunks coordinator can then chunk this range up in smaller ranges
if the requested range is above the configured max chunk size. If it is
smaller than the entire range then the chunk coordinator has just one
chuck to coordinate.
Each chunk is added to a queue and is processed by the ChunkProcessor,
that reads the translog ops from the leader shard and then indexes
these translog ops into the follow shard. After that a new chuck is polled
from the queue and the ChunkProcessor performs the same actions until
there are no more chunks in the queue to process. After that the shard
follow task executor will determine a new range of translog operations
to process.
This change changes the chunk coordinator to start polling from the chunk
queue with multiple threads at the same time to handle dealing with a higher
indexing load on the leader side better.
* Fixed a small issue where each batch would fetch / index the previous batch last operation
* Made batch size a request param on the follow existing index api request.
This makes is easy to tune this param when running tests from scripts.
* Changed default batch size from 256 to 1024.
I forgot to configure a mapping in the follow shard shard, which caused
a dynamic update (due to type auto creation), but this was ignored.
Subsequent searches in follow index then failed due to a mapping missing.
(The _id couldn't be fetched during fetch phase, because the mapping was missing)
We should at a later stage investigate how to best solve this, but for
know to avoid confusion just fail if a dynamic update happens in a
follow shard.
This commit adds a bulk action for apply translog operations in bulk to
an index. This action is then used in the persistent task for CCR to
apply shard changes from a leader shard.
Relates #3147
This test was broken by an upstream change that no longer guarantees we
see the operations from the upstream translog in the order they appear
in that translog. As such, the assertions in this test were too strong
so this commit relaxes them.
Relates #3153
Operations from a leader shard will be indexed into the engine with the
origin set to primary. The problem is here is that then we have primary
semantics in the engine such as assertions about sequence numbers being
unassigned, and we do not have correct semantics for out-of-order
delivery of operations (as we should on a following engine, whether or
not it is primary since the ordering is determined from the
leader). This commit handles this by always using the replica plan for
indexing into a following engine, whether or not the engine is for a
primary shard.
Relates #3000
A following engine even for a primary shard needs to maintain order of
operations semantics as if it were behaving like a replica. That is,
rather than assuming that the order of operations presented to the
engine is the de facto order of operations as is the case for a leader
engine for a primary shard, a following engine must behave like all
replicas behave which is that they resolve order of operations based on
sequence numbers. This commit causes this to be the case for following
engines.
Relates #2931
This commit is a first step towards a following engine
implementation. Future work will build on this by using this engine to
execute operations on a following engine from another engine (typically
a remote leader engine) that has already assigned sequence numbers to
such operations.
Relates #2776
This commit sets an index setting for the size of a translog generation
and increases the number of documents indexed to increase the chance of
multiple generations being present when testing getting operations
between two sequence numbers.
This commit fixes an off-by-one error in the shard changes action test
for getting operations between two sequence numbers. The off-by-one
error arises because sequence numbers are indexed from zero, so if N
documents are indexed then the maximum sequence number starting from
zero would be N - 1.
* xdcr: Add an internal api to read translog operations between a sequence number range.
This api will be used later by the persistent task for the following index to pull data from the leader index.
The persistent task can fetch the global checkpoint from the shard stats for each primary shard of the leader index.
Based on the global checkpoint of the primary shards of the following index, the persistent task can send several
calls to the internal api added in this commit to replicate changes from follow index to leader index in a batched manner.
This commit utilizes the pluggable engine factory feature in core to
introduce a pluggable engine factory for XDCR. For now this is only a
skeleton implementation to proof out the pluggable engine factory
concept. Future work will implement a genuine following engine for XDCR.
Relates #2655
This commit introduces the container class for CCR functionality. Future
work will expose more specific CCR functionality to the X-Pack plugin
through this class.
Relates #2704