Raw requests are supported only by the Java YAML test runner and were introduced to test docs snippets. Some YAML tests ended up using them (see #23497), which causes failures for other language clients. This commit migrates those YAML tests to Java tests that send requests through the Java low-level REST client, and also moves the ability to send raw requests to a special client that's only available when testing docs snippets.
Closes #25694
This commit updates the s3 repository docs to clearly mark settings as
part of the s3 client settings, as well as those that are secure and
must be stored in the elasticsearch keystore.
relates #25619
When `refresh=wait_for` is set on an indexing request, we register a listener on the shard that is called during the next refresh. During the translog recovery phase, when the engine is open, we have a window of time when indexing operations succeed and can add their listeners. Those listeners will only be called when the recovery finishes, as we do not refresh during recoveries (unless the indexing buffer is full). Besides being a bad user experience, this can also cause deadlocks with an ongoing peer recovery that may wait for those operations to mark the replica in sync (details below).
To fix this, this PR changes refresh listeners to be a no-op when the shard is not yet serving reads (implicitly covering the recovery period). A shard in that state cannot be searched, so the caller gains nothing by waiting for a refresh.
Deadlock with recovery:
When finalizing a peer recovery we mark the peer as "in sync". To do so we wait until the peer's local checkpoint is at least as high as the global checkpoint. If an operation with `refresh=wait_for` is added as a listener on that peer during recovery, it is not completed from the perspective of the primary. The primary then may wait for it to complete before advancing the local checkpoint for that peer. Since that peer is not considered in sync, the global checkpoint on the primary can be higher, causing a deadlock: the operation waits for recovery to finish and a refresh to happen, while recovery waits on the operation.
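The behavior change described above can be sketched roughly as follows. This is a simplified illustration and not the actual shard code: the class `RefreshListenersSketch`, its method names, and the boolean passed to the listener are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical, simplified stand-in for a shard's refresh listener registry.
final class RefreshListenersSketch {
    private final List<Consumer<Boolean>> pending = new ArrayList<>();
    private boolean servingReads = false;

    // Called once recovery is done and the shard can be searched.
    synchronized void markServingReads() {
        servingReads = true;
    }

    // Returns true if the listener was queued for the next refresh,
    // false if it was completed immediately because the shard is not serving reads.
    boolean addOrNotify(Consumer<Boolean> listener) {
        synchronized (this) {
            if (servingReads) {
                pending.add(listener);
                return true;
            }
        }
        // Not serving reads yet (e.g. still recovering): do not wait for a refresh.
        // The argument means "a refresh was forced", which is not the case here.
        listener.accept(false);
        return false;
    }

    // Completes every listener that was queued before this refresh.
    void refresh() {
        List<Consumer<Boolean>> toNotify;
        synchronized (this) {
            toNotify = new ArrayList<>(pending);
            pending.clear();
        }
        for (Consumer<Boolean> listener : toNotify) {
            listener.accept(false);
        }
    }

    synchronized int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        RefreshListenersSketch shard = new RefreshListenersSketch();
        shard.addOrNotify(forced -> System.out.println("completed during recovery"));
        shard.markServingReads();
        shard.addOrNotify(forced -> System.out.println("completed by refresh"));
        shard.refresh();
    }
}
```

Because an operation indexed during recovery completes right away in this sketch, nothing on the recovering peer is left waiting for a refresh that recovery itself is blocking.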
Today our shell scripts march on if they encounter an error during
execution. One place that this actually causes a problem is with the
Java version checker. What can happen is this: if the user botches their
installation so that the JavaVersionChecker cannot be found on the
classpath, when we attempt to run the Java version checker, first an
error message that the class cannot be found is displayed, and then we
print a message that their version of Java is not compatible; this
happens even if they are using a Java 8 installation. The problem is
that we should have immediately aborted when the class could not be
loaded. Since we do not exit when the shell script encounters an error,
we end up conflating failure to run the version check with a failed
version check. Instead, we really should abort the moment that one of
our scripts encounters an error. To do this, we make the following
changes:
- enable set -e and set -o pipefail
- make the Java version checker responsible for printing the error
message to the console
- remove the exit status check from the scripts
- actually on Windows, we still have to check the exit status because
there is no equivalent of set -e
- when we check for daemonization, we can no longer check the exit
status from grep because a failed grep will abort the script;
instead, we move the grep execution to be the condition for the if as
this does not trip the set -e failure conditions
- we should source elasticsearch-env before doing anything, so we move
the definition of parse_jvm_options below sourcing elasticsearch-env
- we make consistent all places where we use a subshell to use
backticks
Relates #26057
Compiling all of the Elasticsearch classes in one JVM, which is shared with
all of the loaded classes of Gradle, can trip GC overhead limits. This
commit re-enables forking javac.
* Allow ingest simulate to parse _id, _index, _type, _routing and _parent as either string or int (#23823)
* Generate data that includes Integer and String type fields for testing document parsing.
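A minimal sketch of the lenient parsing described in the first bullet, assuming the document metadata arrives as a parsed `Map<String, Object>`. The helper name `readStringOrInt` and the error message are hypothetical and are not taken from the ingest code.

```java
import java.util.HashMap;
import java.util.Map;

final class MetadataFieldSketch {
    // Hypothetical helper: accepts a metadata value given either as a JSON string
    // or as a JSON integer and normalizes it to a String.
    static String readStringOrInt(Map<String, Object> source, String field) {
        Object value = source.get(field);
        if (value == null) {
            return null; // field not provided
        }
        if (value instanceof String || value instanceof Integer) {
            return value.toString();
        }
        throw new IllegalArgumentException("field [" + field + "] must be a string or an integer but was ["
            + value.getClass().getName() + "]");
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("_index", "my-index");
        doc.put("_id", 1); // an integer id is accepted as well
        System.out.println(readStringOrInt(doc, "_index"));
        System.out.println(readStringOrInt(doc, "_id"));
    }
}
```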
https://github.com/elastic/elasticsearch/pull/17379 fixed many metric aggs so that if the parent aggregation does not collect any documents an empty bucket value is returned instead of an `ArrayIndexOutOfBoundsException` being thrown. Unfortunately the value count aggregation was missed from this fix.
This change applies this fix from #17379 for the value count aggregation.
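The kind of guard involved can be illustrated with a small sketch. It assumes per-bucket counts are kept in a growable array indexed by bucket ordinal; `ValueCountSketch` and its methods are illustrative names, not the aggregator's real API.

```java
import java.util.Arrays;

final class ValueCountSketch {
    // Per-bucket counts, grown lazily as buckets collect documents.
    private long[] counts = new long[0];

    void collect(int bucket) {
        if (bucket >= counts.length) {
            counts = Arrays.copyOf(counts, Math.max(bucket + 1, counts.length * 2));
        }
        counts[bucket]++;
    }

    long metric(int bucket) {
        // A bucket that never collected a document may lie beyond the array;
        // report the empty value instead of indexing out of bounds.
        if (bucket >= counts.length) {
            return 0L;
        }
        return counts[bucket];
    }

    public static void main(String[] args) {
        ValueCountSketch agg = new ValueCountSketch();
        System.out.println(agg.metric(3)); // nothing collected yet
        agg.collect(0);
        System.out.println(agg.metric(0));
    }
}
```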
`ClusterSearchShardsResponseTests.testSerialization` randomly uses `IdsQueryBuilderTests` to generate an alias filter. `IdsQueryBuilderTests` checks if the array of current types has length zero, but it can also be null, which causes a `NullPointerException`. This change adds a null check to avoid the exception.
Closes #26021
* Adds mutate function to various tests
Relates to #25929
* fix test
* implements mutate function for all single bucket aggs
* review comments
* convert getMutateFunction to mutateInstance
Currently there is an issue where the send listener is not called in the
nio transport when an exception is thrown during channel flush. This
leads to memory leaks. This commit ensures that the listener is called.
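A rough sketch of the pattern, under the assumption that a flush is a single call that may throw. The `Channel` and `SendListener` interfaces below are made up for the example and do not mirror the nio transport's real types.

```java
import java.io.IOException;

final class FlushWithListenerSketch {
    // Hypothetical channel abstraction; flushing may fail with an exception.
    interface Channel {
        void flush(byte[] data) throws IOException;
    }

    // Hypothetical send listener; exactly one of the two methods should be called.
    interface SendListener {
        void onResponse();

        void onFailure(Exception e);
    }

    static void flushAndNotify(Channel channel, byte[] data, SendListener listener) {
        try {
            channel.flush(data);
        } catch (Exception e) {
            // Without this branch the listener would never hear about the failure,
            // and whatever it is responsible for releasing would leak.
            listener.onFailure(e);
            return;
        }
        // Notified outside the try block so that an exception thrown by onResponse
        // does not also trigger onFailure.
        listener.onResponse();
    }

    public static void main(String[] args) {
        SendListener listener = new SendListener() {
            @Override
            public void onResponse() {
                System.out.println("sent");
            }

            @Override
            public void onFailure(Exception e) {
                System.out.println("failed: " + e.getMessage());
            }
        };
        flushAndNotify(data -> { throw new IOException("connection reset"); }, new byte[0], listener);
        flushAndNotify(data -> { }, new byte[0], listener);
    }
}
```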
The s3 repository plugin has "third party" integ tests which rely
on external service and configuration setup. These tests are really
internal verification of the plugin (and should be moved to real integ
tests). Running them is not something a user should do, and the
documentation has been out of date for all of 5.x. This commit removes
the docs, removing potential confusion for users.
This commit adds the nio transport as an option in place of the mock tcp
transport for tests. Each test will only use one transport type. The
transport type is decided by a random boolean generated inside of the
`ESTestCase` class.
This commit updates the version for master to 7.0.0-alpha1. It also adds
the 6.1 version constant, and fixes many tests, as well as marking some
as awaits fix.
Closes #25893
Closes #25870
We are currently quite lenient about the targets of `copy_to`. However in a
number of cases we can detect illegal use of `copy_to` at mapping update time.
For instance, it does not make sense to use object fields as targets of
`copy_to`, or fields that would end up in a different nested document.
When ES starts up we verify that we can write to all data folders and that they support atomic moves. We do so by creating and deleting temp files. If for some reason a file was successfully created but not successfully deleted, we still shut down correctly, but subsequent start attempts will fail with a file already exists exception.
This commit makes sure to first clean any existing temporary files.
Supersedes #21007
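A minimal sketch of such a check using `java.nio.file`, with leftovers from a previous run cleaned up first. The class name, method name, and temp file names are assumptions for illustration and may differ from the actual startup check.

```java
import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class AtomicMoveCheckSketch {
    // Hypothetical temp file name used only by this sketch.
    static final String TEMP_FILE_NAME = ".es_temp_file";

    static void ensureAtomicMoveSupported(Path dataDir) throws IOException {
        Path src = dataDir.resolve(TEMP_FILE_NAME + ".tmp");
        Path target = dataDir.resolve(TEMP_FILE_NAME + ".final");
        // Clean up anything a previous, interrupted check left behind, so that
        // createFile below does not fail with FileAlreadyExistsException.
        Files.deleteIfExists(src);
        Files.deleteIfExists(target);
        try {
            Files.createFile(src);
            Files.move(src, target, StandardCopyOption.ATOMIC_MOVE);
        } catch (AtomicMoveNotSupportedException e) {
            throw new IllegalStateException("atomic move is not supported on path [" + dataDir + "]", e);
        } finally {
            Files.deleteIfExists(src);
            Files.deleteIfExists(target);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("atomic-move-sketch");
        ensureAtomicMoveSupported(dir);
        System.out.println("atomic move supported in " + dir);
    }
}
```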
This commit fixes an issue with the Netty 4 multi-port test that checks
that a transport client can connect. The problem here is that in case
the bottom of the random port range was already bound to (for example,
by another JVM) then the transport client could not connect to the data
node. This is because the transport client was in fact using the bottom
of the port range only. Instead, we simply try all the ports that the
data node might be bound to.
Closes #24441
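The idea of trying every port in the range, instead of only the lowest one, can be sketched as below. The connection attempt is abstracted behind an `IntPredicate` so the example stays self-contained; `PortRangeSketch` and `firstConnectablePort` are hypothetical names, not the test's real helpers.

```java
import java.util.OptionalInt;
import java.util.function.IntPredicate;

final class PortRangeSketch {
    // Tries each port in [from, to] in order and returns the first one for which
    // the connection attempt succeeds, or empty if none does.
    static OptionalInt firstConnectablePort(int from, int to, IntPredicate canConnect) {
        for (int port = from; port <= to; port++) {
            if (canConnect.test(port)) {
                return OptionalInt.of(port);
            }
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        // Pretend the bottom of the range is taken by another JVM and the
        // data node ended up bound to the third port.
        OptionalInt port = firstConnectablePort(9300, 9305, candidate -> candidate == 9302);
        System.out.println(port);
    }
}
```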
In the refresh REST tests we set up some persistent settings for debug
logging. In the teardown, we try to restore the logging level back to
info via another persistent setting but this is a mistake because other
tests check if there are no persistent settings. To fix this, we remove
the persistent setting that we added.
We are chasing a test failure in the "refresh=wait_for waits until
changes are visible in search" test yet the logs currently give us no
indication what is happening. This commit adds debug logging for this
test, and cleans up this logging in a teardown section. We can remove
this additional logging after we chase the test failure down.
This commit adds a small note to the discovery docs recommending that
the unicast hosts list be maintained as the list of master-eligible
nodes in the cluster.
Relates #25991
Some REST tests can rapid-fire script compilations that exceed the
default script compilations per minute. Rather than subjecting ourselves
to spurious failures because of the limit being too low, we opt for a
larger limit here.
This commit updates the docs for the config files to explain the new
mechanism for customizing the configuration directory via the
environment variable CONF_DIR.
Relates #25990
ToXContentToBytes is used as a base class that adds toString and buildAsBytes method implementations to classes that implement ToXContent. With the ongoing cleanups, this class is limited and doesn't add a lot of value, given that buildAsBytes can be replaced with XContentHelper.toXContent and toString can be replaced with Strings.toString(this).
The plan would be to remove ToXContentToBytes entirely, and AbstractQueryBuilder is the first place where we can remove its usage.
During peer recoveries, we need to copy over Lucene files and replay the operations they miss from the source translog. Guaranteeing that translog files are not cleaned up has seen many iterations over time. Back in the old 1.0 days, recoveries went through the Engine and actively prevented both translog cleaning and Lucene commits. We then moved to a notion called Translog Views, which allowed the recovery code to "acquire" a view into the translog which is then guaranteed to be kept around until the view is closed. The Engine code was free to commit Lucene and do whatever it wanted without coordinating with recoveries. Translog file deletion logic was based on reference counting on the file level. Those counters were incremented when a view was acquired but also when the view was used to create a `Snapshot` that allowed you to read operations from the files. At some point we removed the file-based counting complexity in favor of constructs on the Translog level that just keep track of "open" views and the minimum translog generation they refer to. To do so, Views had to be kept around until the last snapshot that was made from them was consumed. This was fine in recovery code but led to [a subtle bug](https://github.com/elastic/elasticsearch/pull/25862) in the [Primary Replica Resyncer](https://github.com/elastic/elasticsearch/pull/25862).
Concurrently, we have developed the notion of a `TranslogDeletionPolicy`, which is responsible for the liveness aspect of translog files. This class makes it very simple to take translog snapshots into account when keeping translog files around, allowing people who just need a snapshot to take one and not worry about views and such. Recovery code which actually does need a view can now prevent trimming by acquiring a simple retention lock (a `Closeable`). This removes the need for the notion of a View.
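The retention lock idea can be illustrated with a small sketch. It assumes the policy only needs to track the lowest translog generation that any open lock still requires; `DeletionPolicySketch`, `RetentionLock`, and the method names are hypothetical and do not reproduce the real `TranslogDeletionPolicy`.

```java
import java.io.Closeable;
import java.util.TreeMap;

final class DeletionPolicySketch {
    // A Closeable whose close() does not throw, so callers need no catch block.
    interface RetentionLock extends Closeable {
        @Override
        void close();
    }

    // generation -> number of open locks that still need it
    private final TreeMap<Long, Integer> locks = new TreeMap<>();

    synchronized RetentionLock acquireRetentionLock(long generation) {
        locks.merge(generation, 1, Integer::sum);
        return new RetentionLock() {
            private boolean closed = false;

            @Override
            public void close() {
                synchronized (DeletionPolicySketch.this) {
                    if (closed) {
                        return; // closing twice must not release twice
                    }
                    closed = true;
                    // Drop the entry once the last lock on this generation is released.
                    locks.computeIfPresent(generation, (gen, count) -> count == 1 ? null : count - 1);
                }
            }
        };
    }

    // Files of generations below the returned value are safe to trim.
    synchronized long minGenerationRequired(long currentGeneration) {
        return locks.isEmpty() ? currentGeneration : Math.min(currentGeneration, locks.firstKey());
    }

    public static void main(String[] args) {
        DeletionPolicySketch policy = new DeletionPolicySketch();
        try (RetentionLock lock = policy.acquireRetentionLock(2)) {
            System.out.println(policy.minGenerationRequired(5));
        }
        System.out.println(policy.minGenerationRequired(5));
    }
}
```

Using try-with-resources ties the lifetime of the retained files to the scope of the recovery code, which is the simplification over Views that the text describes.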