This would allow MapperService to still be usable in contexts when a
QueryShardContext cannot be obtained, for instance in the case that a
MapperService needs to be created only to merge mappings.
Currently the request cache adds a `CacheEntity` object. It looks quite innocent
but in practice it has a reference to a lambda that knows how to create a value.
The issue is that this lambda has implicit references to the SearchContext
object, which can be quite heavy since it holds a structured representation of
aggregations for instance.
This pull request splits the `CacheEntity` object from the object that generates
cache values.
Stored scripts and ingest node configuration are important parts of the overall cluster state and should be included into a snapshot together with index templates and persistent settings if the includeGlobalState is set to true.
Closes#21184
These tests had a single method due to the fact that es didn't support resetting settings when they were first written. They can now be rewritten to have separate methods and an after method that resets the setting that is left behind.
When installing a plugin when the plugins directory does not exist, the
install plugin command outputs a line saying that it is creating this
directory. The packaging tests for the archive distributions accounted
for this including an assertion that this line was output. The packages
have since been updated to include an empty plugins folder, so this line
will no longer be output. This commit removes this stale assertion from
the packaging tests.
Relates #21275
Today when running gradle clean
:distribution:(integ-test-zip|tar|zip):assemble, the created archive
distribution will be missing the empty plugins directory. This is
because the empty plugins folder created in the build folder for the
copy spec task is created during configuration and then is later wiped
away by the clean task. This commit addresses this issue, by pushing
creation of the directory out of the configuration phase.
Relates #21271
today the `_shrink` tests do relocate all shards to a single node in
the cluster. Yet, that is not always possible since the only node we can
safely identify in the cluster is the master and if the master is a BWC
node in such a cluster we won't be able to relocate shards that have a
primary on the newer version nodes since allocation deciders forbid this.
This change restricts allocation for that index when the index is created
to restrict allocation to the master that guarantees that all primaries
are on the same node which is sufficient for the `_shrink` API to run.
This adds checks for expected warning headers to the query builder test
infrastructure. Tests that are adding deprecation warnings to the response
headers need to check those, otherwise the abstract base class for the test
class will complain at teardown.
This query is deprecated from 5.0 on. Similar to IndicesQueryBuilder we should
log a deprecation warning whenever this query is used.
Relates to #15760
Checks on static test state are run by an @After method in ESTestCase. Suite-scoped tests in ESIntegTestCase only shut down in an @AfterClass method, which executes after the @After method in ESTestCase. The suite-scoped cluster can thus still execute actions that will violate the checks in @After without those being caught. A subsequent test executing within the same JVM will fail these checks however when @After gets called for that test.
This commit adds an explicit call to check the static test state after the suite-scoped cluster has been shut down.
The pending cluster state queue is used to hold incoming cluster states from the master. Currently the elected master doesn't publish to itself as thus the queue is not used. Sometimes, depending on the timing of disruptions, a pending cluster state can be put on the queue (but not committed) but another master before being isolated. If this happens concurrently with a master election the elected master can have a pending cluster state in its queue. This is not really a problem but it does confuse our assertions during tests as we check to see everything was processed correctly.
This commit takes a temporary step to clear (and fail) any pending cluster state on the master after it has successfully published a CS. Most notably this will happen when the master publishes the cluster state indicating it has just become the master.
Long term we are working to change the publishing mechanism to make the master use the pending queue just like other nodes, which will make this a non issue.
See https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+5.x+java9-periodic/509 for example.
Lucene 6.2 introduces the new `Analyzer.normalize` API, which allows to apply
only character-level normalization such as lowercasing or accent folding, which
is exactly what is needed to process queries that operate on partial terms such
as `prefix`, `wildcard` or `fuzzy` queries. As a consequence, the
`lowercase_expanded_terms` option is not necessary anymore. Furthermore, the
`locale` option was only needed in order to know how to perform the lowercasing,
so this one can be removed as well.
Closes#9978
Today these two are considered mutual exclusive but they are not in
practice. For instance a mixed version cluster might not return a
given warning depending on which node we talk to but on the other hand
some runners might not even support warnings at all so the test might be
skipped either by version or by feature.
Since we have the ingest node type, there is either IngestActionFilter or IngestProxyActionFilter registered, depending on whether the node is an ingest node or not. The special case that shortcuts the execution in case there are no filters is never exercised.
This change adds an option called `split_on_whitespace` which prevents the query parser to split free text part on whitespace prior to analysis. Instead the queryparser would parse around only real 'operators'. Default to true.
For instance the query `"foo bar"` would let the analyzer of the targeted field decide how the tokens should be splitted.
Some options are missing in this change but I'd like to add them in a follow up PR in order to be able to simplify the backport in 5.x. The missing options (changes) are:
* A `type` option which similarly to the `multi_match` query defines how the free text should be parsed when multi fields are defined.
* Simple range query with additional tokens like ">100 50" are broken when `split_on_whitespace` is set to false. It should be possible to preserve this syntax and make the parser aware of this special syntax even when `split_on_whitespace` is set to false.
* Since all this options would make the `query_string_query` very similar to a match (multi_match) query we should be able to share the code that produce the final Lucene query.
The `IndexService#newQueryShardContext()` method creates a QueryShardContext on
shard `0`, with a `null` reader and that uses `System.currentTimeMillis()` to
resolve `now`. This may hide bugs, since the shard id is sometimes used for
query parsing (it is used to salt random score generation in `function_score`),
passing a `null` reader disables query rewriting and for some use-cases, it is
simply not ok to rely on the current timestamp (eg. percolation). So this pull
request removes this method and instead requires that all call sites provide
these parameters explicitly.
This commit introduces a single-shard balance step for deciding on
rebalancing a single shard (without taking any other shards in the
cluster into account). This method will be used by the cluster
allocation explain API to explain in detail the decision process for
finding a more optimal location for a started shard, if one exists.
* Add docs with up to date instructions on updating default similarity
The default similarity can no longer be set in the configuration file
(you will get an error on startup). Update the docs with the method
that works.
* Add instructions for changing similarity on index creation
Today when installing Elasticsearch from an archive distribution (tar.gz
or zip), an empty plugins folder is not included. This means that if you
install Elasticsearch and immediately run elasticsearch-plugin list, you
will receive an error message about the plugins directory missing. While
the plugins directory would be created when starting Elasticsearch for
the first time, it would be better to just include an empty plugins
directory in the archive distributions. This commit makes this the
case. Note that the package distributions already include an empty
plugins folder.
Relates #21204
This doesn't make much sense to have at all, since a user can do a `GET`
request without a version of they want to get it unconditionally.
Relates to #20995
It was 10mb and that was causing trouble when folks reindex-from-remoted
with large documents.
We also improve the error reporting so it tells folks to use a smaller
batch size if they hit a buffer size exception. Finally, adds some docs
to reindex-from-remote mentioning the buffer and giving an example of
lowering the size.
Closes#21185
There may be cases where the XContentBuilder is not used and therefore it never gets
closed, which can cause a leak of bytes. This change moves the creation of the builder
into a try with resources block and adds an assertion to verify that we always consume
the bytes in our code; the try-with resources provides protections against memory leaks
caused by plugins, which do not test this.