This commit introduces a new execution mode for the query_string query, which
is intended down the road to be a replacement for the current _all field.
It now does auto-field-expansion and auto-leniency when the following criteria
are ALL met:
The _all field is disabled
No default_field has been set in the index settings
No default_field has been set in the request
No fields are specified in the request
Additionally, a user can force the "all-like" execution by setting the
all_fields parameter to true.
When executing in all field mode, the query_string query will look at all the
fields in the mapping that are not metafields and can be searched, and
automatically expand the list of fields that are going to be queried.
Relates to #19784
According to ISO 8601, a time zone offset of zero, can be stated numerically as
"+00:00", "+0000", or "00". The Joda library also seems to allow for "-00:00",
"-00" and "-0000". This adds some test to the DateMathParserTests that check
that we also conform to this.
Closes#21320
The usage information for `elasticsearch-plugin` is quiet verbose and makes the
actual error message that is shown when trying to remove an non-existing plugin
hard to spot. This changes the error code to not trigger printing the usage
information.
Closes#21250
Dependencies are currently marked as non-transitive in generated POM files by adding a wildcard (*) exclusion. This breaks compatibility with the dependency manager Apache Ivy as it incorrectly translates POMs with * excludes to Ivy XML with * excludes which results in the main artifact being excluded as well (see https://issues.apache.org/jira/browse/IVY-1531). To stay compatible with the current release of Ivy this commit uses explicit excludes for each transitive artifact instead to ensure that the main artifact is not excluded. This should be revisited when we upgrade Gradle to a higher version as the current one (2.13) as Gradle automatically translates non-transitive dependencies to * excludes in 2.14+.
This is a bit funky to do with junit because we need per test state
but we only want to log it per suite. So we use a static flag that
we test per test and reset before every suite.
ShardInfo#toString resorts to calling ShardInfo#toXContent via
Strings#toString. However, the resulting XContent object will not start
with an object, and this is a violation of the generator state
machine. This commit fixes this issue by replacing the override of
toString to provide simple non-toXContent output of the ShardInfo
instance. Without this fix, ShardInfo#toString will instead produce
"Error building toString out of XContent:
com.fasterxml.jackson.core.JsonGenerationException: Can not write a
field name, expecting a value"
Relates #21319
Our test infrastructure checks after running each test that there are no more in-flight requests on the shard level. Whenever the check fails, we only know that there were in-flight requests but don't know what requests were causing this issue. This commit adds the replication tasks that are still active at that moment to the assertion error.
With #21099 we removed support for the ignored allow_no_indices parameter in indices upgrade API. Truth is that ignore_unavailable and expand_wildcards were also ignored, in indices upgrade as well as upgrade status API. Those parameters are though supported internally and settable through java API, hence they should be all supported on the REST layer too.
`FilterAggregationBuilder` today misses to rewrite queries which causes failures
if a query that uses a client for instance to lookup terms since it must be rewritten first.
This change also ensures that if a client is used from the rewrite context we mark the query as
non-cacheable.
Closes#21301
Plugins: Remove pluggability of ZenPing
ZenPing is the part of zen discovery which knows how to ping nodes.
There is only one alternative implementation, which is just for testing.
This change removes the ability to add custom zen pings, and instead
hooks in the MockZenPing for tests through an overridden method in
MockNode. This also folds in the ZenPingService (which was really just a
single method) into ZenDiscovery, and removes the idea of having
multiple ZenPing instances. Finally, this was the last usage of the
ExtensionPoint classes, so that is also removed here.
Currently the default S3 buffer size is 100MB, which can be a lot for small
heaps. This pull request updates the default to be 100MB for heaps that are
greater than 2GB and 5% of the heap size otherwise.
The important settings docs previously referred to a section regarding
the node.max_local_storage_nodes setting. This section was removed, but
the link was not. This commit removes that link.
Previously node.max_local_storage_nodes defaulted to fifty, and this
permitted users to start multiple instances of Elasticsearch sharing the
same data folder. This can be dangerous, and usually it does not make
sense to run more than one instance of Elasticsearch on a single
server. Because of this, we had a note in the important settings docs
advising users to set this setting to one. However, we have since
changed the default value of this setting to one so this advise is no
longer needed.
Relates #21305
Since we now validate all consumed request parameter, users can't specify
`_cat/nodes?full_id=true|false` anymore since this parameter is consumed late.
This commit adds a test for this parameter and consumes it before request is processed.
Closes#21266
This would allow MapperService to still be usable in contexts when a
QueryShardContext cannot be obtained, for instance in the case that a
MapperService needs to be created only to merge mappings.
Currently the request cache adds a `CacheEntity` object. It looks quite innocent
but in practice it has a reference to a lambda that knows how to create a value.
The issue is that this lambda has implicit references to the SearchContext
object, which can be quite heavy since it holds a structured representation of
aggregations for instance.
This pull request splits the `CacheEntity` object from the object that generates
cache values.
Stored scripts and ingest node configuration are important parts of the overall cluster state and should be included into a snapshot together with index templates and persistent settings if the includeGlobalState is set to true.
Closes#21184
These tests had a single method due to the fact that es didn't support resetting settings when they were first written. They can now be rewritten to have separate methods and an after method that resets the setting that is left behind.
When installing a plugin when the plugins directory does not exist, the
install plugin command outputs a line saying that it is creating this
directory. The packaging tests for the archive distributions accounted
for this including an assertion that this line was output. The packages
have since been updated to include an empty plugins folder, so this line
will no longer be output. This commit removes this stale assertion from
the packaging tests.
Relates #21275
Today when running gradle clean
:distribution:(integ-test-zip|tar|zip):assemble, the created archive
distribution will be missing the empty plugins directory. This is
because the empty plugins folder created in the build folder for the
copy spec task is created during configuration and then is later wiped
away by the clean task. This commit addresses this issue, by pushing
creation of the directory out of the configuration phase.
Relates #21271
today the `_shrink` tests do relocate all shards to a single node in
the cluster. Yet, that is not always possible since the only node we can
safely identify in the cluster is the master and if the master is a BWC
node in such a cluster we won't be able to relocate shards that have a
primary on the newer version nodes since allocation deciders forbid this.
This change restricts allocation for that index when the index is created
to restrict allocation to the master that guarantees that all primaries
are on the same node which is sufficient for the `_shrink` API to run.
This adds checks for expected warning headers to the query builder test
infrastructure. Tests that are adding deprecation warnings to the response
headers need to check those, otherwise the abstract base class for the test
class will complain at teardown.
This query is deprecated from 5.0 on. Similar to IndicesQueryBuilder we should
log a deprecation warning whenever this query is used.
Relates to #15760
Checks on static test state are run by an @After method in ESTestCase. Suite-scoped tests in ESIntegTestCase only shut down in an @AfterClass method, which executes after the @After method in ESTestCase. The suite-scoped cluster can thus still execute actions that will violate the checks in @After without those being caught. A subsequent test executing within the same JVM will fail these checks however when @After gets called for that test.
This commit adds an explicit call to check the static test state after the suite-scoped cluster has been shut down.
The pending cluster state queue is used to hold incoming cluster states from the master. Currently the elected master doesn't publish to itself as thus the queue is not used. Sometimes, depending on the timing of disruptions, a pending cluster state can be put on the queue (but not committed) but another master before being isolated. If this happens concurrently with a master election the elected master can have a pending cluster state in its queue. This is not really a problem but it does confuse our assertions during tests as we check to see everything was processed correctly.
This commit takes a temporary step to clear (and fail) any pending cluster state on the master after it has successfully published a CS. Most notably this will happen when the master publishes the cluster state indicating it has just become the master.
Long term we are working to change the publishing mechanism to make the master use the pending queue just like other nodes, which will make this a non issue.
See https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+5.x+java9-periodic/509 for example.
Lucene 6.2 introduces the new `Analyzer.normalize` API, which allows to apply
only character-level normalization such as lowercasing or accent folding, which
is exactly what is needed to process queries that operate on partial terms such
as `prefix`, `wildcard` or `fuzzy` queries. As a consequence, the
`lowercase_expanded_terms` option is not necessary anymore. Furthermore, the
`locale` option was only needed in order to know how to perform the lowercasing,
so this one can be removed as well.
Closes#9978
Today these two are considered mutual exclusive but they are not in
practice. For instance a mixed version cluster might not return a
given warning depending on which node we talk to but on the other hand
some runners might not even support warnings at all so the test might be
skipped either by version or by feature.
Since we have the ingest node type, there is either IngestActionFilter or IngestProxyActionFilter registered, depending on whether the node is an ingest node or not. The special case that shortcuts the execution in case there are no filters is never exercised.
This change adds an option called `split_on_whitespace` which prevents the query parser to split free text part on whitespace prior to analysis. Instead the queryparser would parse around only real 'operators'. Default to true.
For instance the query `"foo bar"` would let the analyzer of the targeted field decide how the tokens should be splitted.
Some options are missing in this change but I'd like to add them in a follow up PR in order to be able to simplify the backport in 5.x. The missing options (changes) are:
* A `type` option which similarly to the `multi_match` query defines how the free text should be parsed when multi fields are defined.
* Simple range query with additional tokens like ">100 50" are broken when `split_on_whitespace` is set to false. It should be possible to preserve this syntax and make the parser aware of this special syntax even when `split_on_whitespace` is set to false.
* Since all this options would make the `query_string_query` very similar to a match (multi_match) query we should be able to share the code that produce the final Lucene query.
The `IndexService#newQueryShardContext()` method creates a QueryShardContext on
shard `0`, with a `null` reader and that uses `System.currentTimeMillis()` to
resolve `now`. This may hide bugs, since the shard id is sometimes used for
query parsing (it is used to salt random score generation in `function_score`),
passing a `null` reader disables query rewriting and for some use-cases, it is
simply not ok to rely on the current timestamp (eg. percolation). So this pull
request removes this method and instead requires that all call sites provide
these parameters explicitly.