Some features have been deprecated since `6.0` like the `_parent` field or the
ability to have multiple types per index. This allows to remove quite some
code, which in-turn will hopefully make it easier to proceed with the removal
of types.
I found the following bugs:
- The 6.0 logic for conjunctions didn't work when there were only `match_all`
queries in MUST/FILTER clauses as they didn't propagate the `matchAllDocs`
flag.
- Some queries still had the same issue as `BooleanQuery` used to have with
duplicate terms (see #28353), eg. `MultiPhraseQuery`.
Closes#29376
Today we have a silent batch mode in the install plugin command when
standard input is closed or there is no tty. It appears that
historically this was useful when running tests where we want to accept
plugin permissions without having to acknowledge them. Now that we have
an explicit batch mode flag, this use-case is removed. The motivation
for removing this now is that there is another place where silent batch
mode arises and that is when a user attempts to install a plugin inside
a Docker container without keeping standard input open and attaching a
tty. In this case, the install plugin command will treat the situation
as a silent batch mode and therefore the user will never have the chance
to acknowledge the additional permissions required by a plugin. This
commit removes this silent batch mode in favor of using the --batch flag
when running tests and requiring the user to take explicit action to
acknowledge the additional permissions (either by leaving standard input
open and attaching a tty, or by passing the --batch flags themselves).
Note that with this change the user will now see a null pointer
exception when they try to install a plugin in a Docker container
without keeping standard input open and attaching a tty. This will be
addressed in an immediate follow-up, but because the implications of
that change are larger, they should be handled separately from this one.
The vagrant test plugin adds tasks for the groovy packaging tests,
which run after the bats packaging test tasks.Rename the 'bats'
configuration to 'packaging' and remove the option to inherit
archives from this configuration.
I did a little digging. It looks like IOException is thrown when the other
side closes its connection while we're waiting on our buffer to fill up. We
totally expect that in this test. It feels to me like we should throw a
`ConnectionClosedException` but upstream does not agree:
https://issues.apache.org/jira/browse/HTTPASYNC-134
While we *could* catch the exception and transform it ourselves that
seems like a bigger change than is merited at this point.
Closes#29136
Currently we store the indices specified in the request URL together with all
the other ranking evaluation specification in RankEvalSpec. This is not ideal
since e.g. the indices are not rendered to xContent and so cannot be parsed
back. Instead we should keep them in RankEvalRequest.
This commit (which will be reverted soon) adds logging on the output of
starting Wildfly. This is needed to debug an issue with Wildfly not
starting in CI.
* Decouple XContentBuilder from BytesReference
This commit removes all mentions of `BytesReference` from `XContentBuilder`.
This is needed so that we can completely decouple the XContent code and move it
into its own dependency.
While this change appears large, it is due to two main changes, moving
`.bytes()` and `.string()` out of XContentBuilder itself into static methods
`BytesReference.bytes` and `Strings.toString` respectively. The rest of the
change is code reacting to these changes (the majority of it in tests).
Relates to #28504
I have long wanted an actual test that dying with dignity works. It is
tricky because if dying with dignity works, it means the test JVM dies
which is usually an abnormal condition. And anyway, how does one force a
fatal error to be thrown. I was motivated to investigate this again by
the fact that I missed a backport to one branch leading to an issue
where Elasticsearch would not successfully die with dignity. And now we
have a solution: we install a plugin that throws an out of memory error
when it receives a request. We hack the standalone test infrastructure
to prevent this from failing the test. To do this, we bypass the
security manager and remove the PID file for the node; this tricks the
test infrastructure into thinking that it does not need to stop the
node. We also bypass seccomp so that we can fork jps to make sure that
Elasticsearch really died. And to be extra paranoid, we parse the logs
of the dead Elasticsearch process to make sure it died with
dignity. Never forget.
This commit removes the ability to specify that a plugin requires the
keystore and instead creates the keystore on package installation or
when Elasticsearch is started for the first time. The reason that we opt
to create the keystore on package installation is to ensure that the
keystore has the correct permissions (the package installation scripts
run as root as opposed to Elasticsearch running as the elasticsearch
user) and to enable removing the keystore on package removal if the
keystore is not modified.
This commit makes the controller spawner also look under modules. It
also fixes a bug in module security policy loading where the module is a
meta plugin.
Applying the rest test gradle plugin already uses the zip distribution
by default, so specifying it explicitly is not necessary. These are
leftovers from before zip was the default for rest tests.
Add support version and version_type in ingest pipelines
Add support for setting document version and version type in set
processor of an ingest pipeline.
Nodes are reusing task ids after restart. So in some rare circumstances
the same task id might be assigned to the reindexing task stored by the
old cluster and the new task that is trying to retrieve the task
results. As a result, the get task request can timeout waiting on
itself. Since we already waited for the task to finish before restarting
the cluster, waiting for the task here doesn't make any sense to start
with.
Fixes#28732
* Remove log4j dependency from elasticsearch-core
This removes the log4j dependency from our elasticsearch-core project. It was
originally necessary only for our jar classpath checking. It is now replaced by
a `Consumer<String>` so that the es-core dependency doesn't have external
dependencies.
The parts of #28191 which were moved in conjunction (like `ESLoggerFactory` and
`Loggers`) have been moved back where appropriate, since they are not required
in the core jar.
This is tangentially related to #28504
* Add javadocs for `output` parameter
* Change @code to @link
Previously a user could set a custom config path to a relative directory
using ES_PATH_CONF. In a previous change related to enabling GC logging
by default, we forced the working directory for Elasticsearch to be
ES_HOME. This had the impact of causing all relative paths to be
relative to ES_HOME, against the intent of the user. This commit
addresses this by making ES_PATH_CONF absolute before we switch the
working directory to ES_HOME.
Relates #28700
When we submit a task to a thread pool for asynchronous execution, we
are returned a future. Since we submitted to go asynchronous, these
futures are not inspected for failure (we would have to block a thread
to do that). While we have on failure handlers for exceptions that are
thrown during execution, we do not handle throwables that are not
exceptions and these end up silently lost. This commit adds a check
after the runnable returns that inspects the status of the future. If an
unhandled throwable occurred during execution, this throwable is
propogated out where it will land in the uncaught exception handler.
Relates #28667
* Move more XContent.createParser calls to non-deprecated version
This moves more of the callers to pass in the DeprecationHandler.
Relates to #28504
* Use parser's deprecation handler where available
[TEST] packaging: function to collect debug info
Sometimes when packaging tests fail in CI the test logs aren't enough to
tell what went wrong. This routine helps collect more info about the
state of the es installation at failure time
Generalizing BWC building so that there is less code to modify for a release. This ensures we do not
need to think about what major or minor version is in the gradle code. It follows the general rules of the
elastic release structure. For more information on the rules, see the VersionCollection's javadoc.
This also removes the additional bwc snapshots that will never be released, such as 6.0.2, which were
being built and tested against every time we ran bwc tests.
Additionally, it creates 4 new projects that correspond to the different types of snapshots that may exist
for a given version. Its possible to now run those individual tasks to work out bwc logic whereas
previously it was impossible and the entire suite of bwc tests had to be run to work out any logic
changes in the build tools' bwc project. Please note that if the project does not make sense for the
version that is current, that an error will be thrown from that individual project if an attempt is made to
run it.
This should allow for automating the version bumps as well, since it removes all the hardcoded version
logic from the configs.
Currently meta plugins will ask for confirmation of security policy
exceptions for each bundled plugin. This commit collects the necessary
permissions of each bundled plugin, and asks for confirmation of all of
them at the same time.
This pull request replaces the jvm-example plugin (from the jvm/site plugins era) by two new plugins: a custom-settings that shows how to register and use custom settings (including secured settings) in a plugin, and rest-handler plugin that shows how to register a rest handler.
The two plugins now reside in the plugins/examples project. They can serve as sample plugins for users, a special attention has been put on documentation. The packaging tests have been adapted to use the custom-settings plugin.
Our rest client throws exceptions in funny ways that might cause
`suppressed` exceptions to be eaten. This works around that in the
reindex-from-old tests so we don't stomp on a real failure. It is fairly
diryt so we should work on fixing the high level rest client.
Relates to #25453
The current install_plugin() does not play well with meta plugins because
it always checks for the plugin's descriptor file.
This commit changes the install_plugin() so that it only runs the install plugin
command and lets the caller verify that the required files are correctly installed.
It also adds a install_meta_plugin() function to install meta plugins.
Currenty the rest response of the ranking evaluation API wraps all inside an
enclosing `rank_eval` object. This is redundant since it is clear from the API
call and it doesn't provide any other useful information. This change removes
this.
This commit lessens the burden on configuring settings.gradle when new
projects are added. In particular, this makes it trivial to move a
plugin to a module (or vice versa).
This commit modifies the build to require JDK 9 for
compilation. Henceforth, we will compile with a JDK 9 compiler targeting
JDK 8 as the class file format. Optionally, RUNTIME_JAVA_HOME can be set
as the runtime JDK used for running tests. To enable this change, we
separate the meaning of the compiler Java home versus the runtime Java
home. If the runtime Java home is not set (via RUNTIME_JAVA_HOME) then
we fallback to using JAVA_HOME as the runtime Java home. This enables:
- developers only have to set one Java home (JAVA_HOME)
- developers can set an optional Java home (RUNTIME_JAVA_HOME) to test
on the minimum supported runtime
- we can test compiling with JDK 9 running on JDK 8 and compiling with
JDK 9 running on JDK 9 in CI
Currently if the Lucene version is `X.Y.Z-snapshot-{gitrev}`, then we will
expect the docs to have `X.Y.Z-snapshot` as a Lucene version. I would like
to change it to `X.Y.Z` so that this doesn't need changing when we move from a
snapshot to a final release.
This is related to #27933. It introduces a jar named elasticsearch-core
in the lib directory. This commit moves the JarHell class from server to
elasticsearch-core. Additionally, PathUtils and some of Loggers are
moved as JarHell depends on them.
The configuration of the upgraded cluster task was missing a dependency
on the stopping of the second old node in the cluster. In some cases
(e.g., --parallel) Gradle would then try to run the configuration of a
node in the upgraded cluster before it had even configured the old nodes
in the cluster.
Relates #28036
This commit adds the ability to package multiple plugins in a single zip.
The zip file for a meta plugin must contains the following structure:
|____elasticsearch/
| |____ <plugin1> <-- The plugin files for plugin1 (the content of the elastisearch directory)
| |____ <plugin2> <-- The plugin files for plugin2
| |____ meta-plugin-descriptor.properties <-- example contents below
The meta plugin properties descriptor is mandatory and must contain the following properties:
description: simple summary of the meta plugin.
name: the meta plugin name
The installation process installs each plugin in a sub-folder inside the meta plugin directory.
The example above would create the following structure in the plugins directory:
|_____ plugins
| |____ <name_of_the_meta_plugin>
| | |____ meta-plugin-descriptor.properties
| | |____ <plugin1>
| | |____ <plugin2>
If the sub plugins contain a config or a bin directory, they are copied in a sub folder inside the meta plugin config/bin directory.
|_____ config
| |____ <name_of_the_meta_plugin>
| | |____ <plugin1>
| | |____ <plugin2>
|_____ bin
| |____ <name_of_the_meta_plugin>
| | |____ <plugin1>
| | |____ <plugin2>
The sub-plugins are loaded at startup like normal plugins with the same restrictions; they have a separate class loader and a sub-plugin
cannot have the same name than another plugin (or a sub-plugin inside another meta plugin).
It is also not possible to remove a sub-plugin inside a meta plugin, only full removal of the meta plugin is allowed.
Closes#27316
We have a packaging test that tries to install all plugins, and then
asserts that all expected plugins are installed. The expected plugins
are dervied from the list of plugins in the plugins sub-project. The
plugin transport-nio was recently added here, but explicit commands to
install and remove this plugin were never added. This commit addresses
this.
The full-cluster-restart tests are run with two nodes. This can lead to situations where the shrink tests fail because
the replicas are not allocated yet and the shrink operation needs the source shards to be available on the same node.
This is related to #27260. This commit moves the NioTransport from
:test:framework to a new nio-transport plugin. Additionally, supporting
tcp decoding classes are moved to this plugin. Generic byte reading and
writing contexts are moved to the nio library.
Additionally, this commit adds a basic MockNioTransport to
:test:framework that is a TcpTransport implementation for testing that
is driven by nio.
Lucene does not allow adding Lucene 6 files to a Lucene 7 index. This commit ensures that we carry over the Lucene version to the newly created Lucene index.
Closes#28061
This commit adds the infrastructure to plugin building and loading to
allow one plugin to extend another. That is, one plugin may extend
another by the "parent" plugin allowing itself to be extended through
java SPI. When all plugins extending a plugin are finished loading, the
"parent" plugin has a callback (through the ExtensiblePlugin interface)
allowing it to reload SPI.
This commit also adds an example plugin which uses as-yet implemented
extensibility (adding to the painless whitelist).
Previously we disabled these tests on JDK 9 and JDK 10 because Wildfly
10 did not support JDK 9 and JDK 10. With the release of Wildfly 11
supporting JDK 9 and JDK 10 and now that we depend on Wildfly 11 in
these tests, we can enable these tests on JDK 9 and JDK 10.
Relates #28042
This commit fixes a flubbed assertion in the Wildfly build file; the
intention was to assert that we have only set the value of the
management port once except the assertion that the management port was
equal to zero when finding a matching log line was left off.
The Wildfly tests previously needed hardcoded ports which would
sometimes already be in use leading to Wildfly test failures because the
server could not bind during startup. We had to wait until some upstream
changes (wildfly/wildfly#9943) and (wildfly/wildfly-core#2390) were
released before we could use ephemeral ports in these tests. These
changes are released now so this commit changes the Wildfly tests to use
ephemeral ports when starting Wildfly.
Relates #28040
* Adds task dependenciesInfo to BuildPlugin to generate a CSV file with dependencies information (name,version,url,license)
* Adds `ConcatFilesTask.groovy` to concatenates multiple files into one
* Adds task `:distribution:generateDependenciesReport` to concatenate `dependencies.csv` files into a single file (`es-dependencies.csv` by default)
# Examples:
$ gradle dependenciesInfo :distribution:generateDependenciesReport
## Use `csv` system property to customize the output file path
$ gradle dependenciesInfo :distribution:generateDependenciesReport -Dcsv=/tmp/elasticsearch-dependencies.csv
## When branch is not master, use `build.branch` system property to generate correct licenses URLs
$ gradle dependenciesInfo :distribution:generateDependenciesReport -Dbuild.branch=6.x -Dcsv=/tmp/elasticsearch-dependencies.csv
We disable these tests because the JVM flags we use will not start with
JDK 10. Since we do not support running these old versions on newer JDKs
anyway, this commit disables testing of these when running on JDK 10.
This commit moves GlobalCheckpointTracker from the engine to IndexShard, where it better fits logically: Tracking the global checkpoint based on the local checkpoints of all shards in the replication group is not a property of the engine, but rather a property fulfilled by the current primary shard. The LocalCheckpointTracker on the other hand is driven by the contents of the local translog. By moving GlobalCheckpointTracker to IndexShard, it makes little sense to keep the SequenceNumbersService class around - it would only wrap the LocalCheckpointTracker. This commit therefore removes the class and replaces occurrences of SequenceNumbersService in the engine directly by LocalCheckpointTracker.
This reverts commit e04e5ab037 as we no longer need the increased
logging for the mixed cluster tests. This will reduce the size of logs for some build failures.
If you assert that a pattern of files exists but it matches more then
one file the "assert this file exists" code failed with a misleading
error message. This tests if the patter resolved to multiple files and
prints a better error message if it did.
For too long we have been groping around in the dark when faced with GC
issues because we rarely have GC logs at our disposal. This commit
enables GC logging by default out of the box.
Relates #27610
Any CLI commands that depend on core Elasticsearch might touch classes
(directly or indirectly) that depends on logging. If they do this and
logging is not configured, Log4j will dump status error messages to the
console. As such, we need to ensure that any such CLI command configures
logging (with a trivial configuration that dumps log messages to the
console). Previously we did this in the base CLI command but with the
refactoring of this class out of core Elasticsearch, we no longer
configure logging there (since we did not want this class to depend on
settings and logging). However, this meant for some CLI commands (like
the plugin CLI) we were no longer configuring logging. This commit adds
base classes between the low-level command and multi-command classes
that ensure that logging is configured. Any CLI command that depends on
core Elasticsearch should use this infrastructure to ensure logging is
configured. There is one exception to this: Elasticsearch itself because
it takes reponsibility into its own hands for configuring logging from
Elasticsearch settings and log4j2.properties. We preserve this special
status.
Relates #27523
Today we require users to prepare their indices for split operations.
Yet, we can do this automatically when an index is created which would
make the split feature a much more appealing option since it doesn't have
any 3rd party prerequisites anymore.
This change automatically sets the number of routinng shards such that
an index is guaranteed to be able to split once into twice as many shards.
The number of routing shards is scaled towards the default shard limit per index
such that indices with a smaller amount of shards can be split more often than
larger ones. For instance an index with 1 or 2 shards can be split 10x
(until it approaches 1024 shards) while an index created with 128 shards can only
be split 3x by a factor of 2. Please note this is just a default value and users
can still prepare their indices with `index.number_of_routing_shards` for custom
splitting.
NOTE: this change has an impact on the document distribution since we are changing
the hash space. Documents are still uniformly distributed across all shards but since
we are artificually changing the number of buckets in the consistent hashign space
document might be hashed into different shards compared to previous versions.
This is a 7.0 only change.
Today Cross Cluster Search requires at least one node in each remote cluster to be up once the cross cluster search is run. Otherwise the whole search request fails despite some of the data (either local and/or remote) is available. This happens when performing the _search/shards calls to find out which remote shards the query has to be executed on. This scenario is different from shard failures that may happen later on when the query is actually executed, in case e.g. remote shards are missing, which is not going to fail the whole request but rather yield partial results, and the _shards section in the response will indicate that.
This commit introduces a boolean setting per cluster called search.remote.$cluster_alias.skip_if_disconnected, set to false by default, which allows to skip certain clusters if they are down when trying to reach them through a cross cluster search requests. By default all clusters are mandatory.
Scroll requests support such setting too when they are first initiated (first search request with scroll parameter), but subsequent scroll rounds (_search/scroll endpoint) will fail if some of the remote clusters went down meanwhile.
The search API response contains now a new _clusters section, similar to the _shards section, that gets returned whenever one or more clusters were disconnected and got skipped:
"_clusters" : {
"total" : 3,
"successful" : 2,
"skipped" : 1
}
Such section won't be part of the response if no clusters have been skipped.
The per cluster skip_unavailable setting value has also been added to the output of the remote/info API.
Projects the depend on the CLI currently depend on core. This should not
always be the case. The EnvironmentAwareCommand will remain in :core,
but the rest of the CLI components have been moved into their own
subproject of :core, :core:cli.
Currently, we are using a plain TransportRequestHandler to post snapshot
status messages to the master. However, it doesn't have a robust retry
mechanism as TransportMasterNodeAction. This change migrates from
TransportRequestHandler to TransportMasterNodeAction for the new
versions and keeps the current implementation for the old versions.
Closes#27151
When the vagrant box is very very slow, the elasticsearch service can
take more than 60 sec to start. This commit changes the timeout to 120.
closes#27372
* REST: Rename ingest.processor.grok to ingest.processor_grok
* REST: Rename remote.info to cluster.remote_info
* REST: Fixed bad YAML comments
* REST: Force dummy scripts to be strings, not numbers
* REST: Fix bad YAML in search/110_field_collapsing.yml
* REST: Adjust percentile tests to work with Perl number handling
extract all clauses from a conjunction query.
When clauses from a conjunction are extracted the number of clauses is
also stored in an internal doc values field (minimum_should_match field).
This field is used by the CoveringQuery and allows the percolator to
reduce the number of false positives when selecting candidate matches and
in certain cases be absolutely sure that a conjunction candidate match
will match and then skip MemoryIndex validation. This can greatly improve
performance.
Before this change only a single clause was extracted from a conjunction
query. The percolator tried to extract the clauses that was rarest in order
(based on term length) to attempt less candidate queries to be selected
in the first place. However this still method there is still a very high
chance that candidate query matches are false positives.
This change also removes the influencing query extraction added via #26081
as this is no longer needed because now all conjunction clauses are extracted.
https://www.elastic.co/guide/en/elasticsearch/reference/6.x/percolator.html#_influencing_query_extractionCloses#26307
If an out of memory error is thrown while merging, today we quietly
rewrap it into a merge exception and the out of memory error is
lost. Instead, we need to rethrow out of memory errors, and in fact any
fatal error here, and let those go uncaught so that the node is torn
down. This commit causes this to be the case.
Relates #27265
The warnings headers have a fairly limited set of valid characters
(cf. quoted-text in RFC 7230). While we have assertions that we adhere
to this set of valid characters ensuring that our warning messages do
not violate the specificaion, we were neglecting the possibility that
arbitrary user input would trickle into these warning headers. Thus,
missing here was tests for these situations and encoding of characters
that appear outside the set of valid characters. This commit addresses
this by encoding any characters in a deprecation message that are not
from the set of valid characters.
Relates #27269
Only tests should use the single argument Environment constructor. To
enforce this the single arg Environment constructor has been replaced with
a test framework factory method.
Production code (beyond initial Bootstrap) should always use the same
Environment object that Node.getEnvironment() returns. This Environment
is also available via dependency injection.
Prior to this change if the `bwcTest` task is run then it would create
task for each version, but each task in reality would use wireCompatVersions - 1
ES version. So we were not actually testing against 5.6.x versions in the
6.x and 6.0 branches.
This commit adjusts the assertions for the sequence number BWC tests to
account for the fact that sometimes these tests are run in
mixed-clusters with 5.6 nodes (that do not understand sequence numbers),
and sometimes these tests are run in mixed-cluster with 6.0+ nodes (that
all understood sequence numbers).
Relates #27251
Windows handles trying to read a file that does not exist because a
component of the path is not a directory differently than other OS
handle this situation. This commit adjusts these assertions for Windows.
Finder creates these files if you browse a directory there. These files
are really annoying, but it's an incredible pain for users that these
files are created unbeknownst to them, and then they get in the way of
Elasticsearch starting. This commit adds leniency on macOS only to skip
these files.
Relates #27108
Today all these API calls have a sideeffect of making documents visible
to search requests. While this is sometimes desired it's an unnecessary sideeffect
and now that we have an internal (engine-private) index reader (#26972) we artificially
add a refresh call for bwc. This change removes this sideeffect in 7.0.
The rolling-upgrade test was only writing the "minimum_master_nodes" setting to the configuration file of the old nodes, but not the upgraded ones.
Also changes the value of "minimum_master_nodes" from "number_of_nodes" to "(number_of_nodes / 2) + 1".
Today we return a `String[]` that requires copying values for every
access. Yet, we already store the setting as a list so we can also directly
return the unmodifiable list directly. This makes list / array access in settings
a much cheaper operation especially if lists are large.