Today when a message is not fully read on a response, we log (among
other details) the handler name. Unfortunately, if the handler is a
wrapper, all that we see is
o.e.t.TransportService$ContextRestoreResponseHandler@7446ba18
completely losing the offending handler. This commit adds an override
for TransportService$ContextRestoreResponseHandler#toString so that the
underlying offender can be discovered.
Relates #21478
Today when handling responses from nodes in TransportNodesAction, if a
node timeouts or some other failure occurs and the action is not
accumulating exceptions, we log a confusing message:
org.elasticsearch.action.admin.cluster.stats.TransportClusterStatsAction]
ignoring unexpected response [null] of type [null], expected
[ClusterStatsNodeResponse] or [FailedNodeException]
Moreover, the original exception is completely lost. Since this log
message is confusing and unhelpful, we can drop it. Instead, we hold
onto the exception and log it at the warn level before dropping it from
the response.
Relates #21476
This commit updates JodaTime to version 2.9.5 that contain a fix for a bug when parsing time zones (see https://github.com/JodaOrg/joda-time/pull/332, https://github.com/JodaOrg/joda-time/issues/386 and https://github.com/JodaOrg/joda-time/issues/373).
It also remove the joda-convert dependency that seems to be unused.
closes#20911
Here is the changelog for 2.9.5:
```
Changes in 2.9.5
----------------
- Add Norwegian period translations [#378]
- Add Duration.dividedBy(long,RoundingMode) [#69, #379]
- DateTimeZone data updated to version 2016i
- Fixed bug where clock read twice when comparing two nulls in DateTimeComparator [#404]
- Fixed minor issues with historic time-zone data [#373]
- Fix bug in time-zone binary search [#332, #386]
The fix in v2.9.2 caused problems when the time-zone being parsed
was not the last element in the input string. New approach uses a
different approach to the problem.
- Update tests for JDK 9 [#394]
- Close buffered reader correctly in zone info compiler [#396]
- Handle locale correctly zone info compiler [#397]
```
The method used to be called `isSourceEmpty`, and was renamed to `hasSource`, but the return value never changed. Updated tests and users accordingly.
Closes#21419
Currently, `executeIndexRequestOnPrimary` and `executeDeleteRequestOnPrimary`
methods used to prepare and execute write operations, modifies the provided
request, updating the version and versionType. This commit makes it the
callers responsibility to update request version and versionType and avoids
mutating the provided request in the execute methods.
This commit introduces a new execution mode for the
`simple_query_string` query, which is intended down the road to be a
replacement for the current _all field.
It now does auto-field-expansion and auto-leniency when the following criteria
are ALL met:
The _all field is disabled
No default_field has been set in the index settings
No fields are specified in the request
Additionally, a user can force the "all-like" execution by setting the
all_fields parameter to true.
When executing in all field mode, the `simple_query_string` query will
look at all the fields in the mapping that are not metafields and can be
searched, and automatically expand the list of fields that are going to
be queried.
Relates to #20925, which is the `query_string` version of this work.
This is basically the same behavior, but for the `simple_query_string`
query.
Relates to #19784
This commit clarifies some log messages for the bootstrap checks. The
message is that the user has limits set that are below the minimums that
Elasticsearch requires.
Relates #21423
This commit ensures that we always restore the thread's original context after execution of
a context preserving runnable. We always wrap runnables in a wrapper that restores the context
at the time it was submitted to the execute method. The ContextPreservingAbstractRunnable
would restore the calling context in the doRun method and then in a try with resources
block would restore the thread's original context. However, the onFailure and onAfter methods
of a AbstractRunnable could modify the thread context and this modified thread context would
continue on as the thread's context after it was returned to the pool and potentially used
for a different purpose.
* Plugins: Convert custom discovery to pull based plugin
This change primarily moves registering custom Discovery implementations
to the pull based DiscoveryPlugin interface. It also keeps the cloud
based discovery plugins re-registering ZenDiscovery under their own name
in order to maintain backwards compatibility. However,
discovery.zen.hosts_provider is changed here to no longer fallback to
discovery.type. Instead, each plugin which previously relied on the
value of discovery.type now sets the hosts_provider to itself if
discovery.type is set to itself, along with a deprecation warning.
This commit cleans up the formatting of some Javadocs in
BootstrapCheck.java, and corrects some docs that had become stale as the
bootstrap checks evolved.
The ClusterState class currently has a mutable volatile field "status" that is only used by the ClusterStateObserver to differentiate between a cluster state that is being applied or one that has already been applied. This commit removes the field from cluster state, making it a truly immutable class. This information is stored instead by ClusterService, which is the only place that should update this field (PublishClusterStateAction was also updating it, but that information was never used anywhere). A new class is introduced (ClusterServiceState) which emcompasses the current cluster state as well as the current status, which is only used by the ClusterStateObserver mechanism.
Applied (almost) the same rules we use to validate index names
to new alias names. The only rule not applies it "must be lowercase".
We have tests that don't follow that rule and I except there are lots
of examples of camelCase alias names in the wild. We can add that
validation but I'm not sure it is worth it.
Closes#20748
Adds an alias that starts with `#` to the BWC index and validates
that you can read from it and remove it. Starting with `#` isn't
allowed after 5.1.0/6.0.0 so we don't create the alias or check it
after those versions.
Before this change Eclipse (4.6.1) would show compile errors in `ClusterRerouteRequestTests` for the elements in `RANDOM_COMMAND_GENERATORS`. This seems to be because the eclipse compiler does not recognised that the elements in the list are all `Supplier`s of bjects that are subclasses of `AllocationCommand`.
This change fixes the problem by adding a generics hint to the `Arrays.toList()` call.
All plugins currently have their own licenses dir for the
dependencyLicenses task, but core disables this and has the check inside
distribution. This may have been better for maven, but for
gradle it makes more sense to just use the dependencyLicenses task that
automatically exists inside :core, and remove the hacked up version that
is inside distribution.
This commit changes the Javadocs in OsProbe.java to take advantage of
the fact that the line-length limit is 140 characters. This change makes
these Javadocs easier to read and easier to maintain.
This commit removes JVMCheck. Previously there were three checks in this
class:
- check for super word bug in JDK 7
- check for G1GC bugs in JDK 8
- check for broken IBM JDKs
The first check is obsolete since we require JDK 8 now. The second check
is refactored into a bootstrap check. The third check is removed since
we do not even support the IBM JDK.
Relates #21389
This commit adds an assertion to ensure that we do not introduce blocking calls in code
that is called in a ClusterStateListener or another part of the cluster state update process.
Before, when requesting to get the snapshots in a repository, if `_all` was
specified for the set of snapshots to get, any in-progress snapshots would
be returned twice. This commit fixes the issue ensuring that each snapshot,
whether in-progress or completed, is only returned once when making a call to
get snapshots (GET /_snapshot/{repository_name}/_all).
This commit also enables `_current` to appear anywhere in
the get snapshots list, and will be used as a pseudo regex to
mean "match any currently running snapshots".
Closes#21335
Today when acquiring load average stats, this method is rather lenient
when reading /proc/loadavg. This commit removes this leniency so that
any such issues are more likely to be exposed to the end-user.
Relates #21380
Adds support for `?slices=N` to reindex which automatically
parallelizes the process using parallel scrolls on `_uid`. Performance
testing sees a 3x performance improvement for simple docs
on decent hardware, maybe 30% performance improvement
for more complex docs. Still compelling, especially because
clusters should be able to get closer to the 3x than the 30%
number.
Closes#20624
Today when writing an unbounded queue_size in the cat thread pool API,
we write null. This commit modifies this so that the output is -1 so
that the output is always present, and always a numeric value.
Relates #21342
Exist requests are supposed to never throw an exception, but rather return true or false depending on whether some resource exists or not. Indices exists does that for indices and accepts wildcard expressions too. The way the api works internally is by resolving indices and catching IndexNotFoundException: if an exception is thrown the index does not exist hence it returns false, otherwise it returns true. That works ok only if ignore_unavailable and allow_no_indices indices options are both set to false, meaning that they are strict and any missing index or wildcard expressions that resolves to no indices will lead to an exception that can be thrown and cause false to be returned.
Unfortunately the indices options have been configurable up until now for this request, meaning that one can set ignore_unavailable or allow_no_indices to true and have the indices exist request return true for indices that really don't exist, which makes very little sense in the context of this api.
This commit removes the indicesOptions setter from the IndicesExistsRequest and makes settable only expandWildcardsOpen and expandWildcardsClosed, hence a subset of the available indices options. This way we can guarantee more consistent behaviour of the indices exists api. We can then remove the ignore_unavailable and allow_no_indices option from indices exists api spec
Today if you start Elasticsearch with the status logger configured to
the warn level, or use a transport client with the default status logger
level, you will see warn messages about deprecation loggers being
created with different message factories and that formatting might be
broken. This happens because the deprecation logger is constructed using
the message factory from its parent, an artifact leftover from the first
Log4j 2 implementation that used a custom message factory. When that
custom message factory was removed, this constructor invocation should
have been changed to not explicitly use the message factory from the
parent. This commit fixes this invocation. However, we also had some
status checking to all tests to ensure that there are no warn status log
messages that might indicate a configuration problem with Log4j 2. These
assertions blow up badly without the fix for the deprecation logger
construction, and also caught a misconfiguration in one of the logging
tests.
Relates #21339
Today we validate the target index name late and therefore don't fail for instance
if the target index already exists and `dry_run=true` was specified. This change
validates the index name before we early terminate if dry_run is set.
Closes#21149
Today we still have a leftover from older percolators where lucene
query instances where created ahead of time and rewritten later.
This `LateParsingQuery` was resolving `now()` when it's really used which we
don't need anymore. As a side-effect this failed to execute some highlighting
queries when they get rewritten since at that point `now` access it not permitted
anymore to prevent bugs when queries get cached.
Closes#21295
This commit introduces a new execution mode for the query_string query, which
is intended down the road to be a replacement for the current _all field.
It now does auto-field-expansion and auto-leniency when the following criteria
are ALL met:
The _all field is disabled
No default_field has been set in the index settings
No default_field has been set in the request
No fields are specified in the request
Additionally, a user can force the "all-like" execution by setting the
all_fields parameter to true.
When executing in all field mode, the query_string query will look at all the
fields in the mapping that are not metafields and can be searched, and
automatically expand the list of fields that are going to be queried.
Relates to #19784
According to ISO 8601, a time zone offset of zero, can be stated numerically as
"+00:00", "+0000", or "00". The Joda library also seems to allow for "-00:00",
"-00" and "-0000". This adds some test to the DateMathParserTests that check
that we also conform to this.
Closes#21320
The usage information for `elasticsearch-plugin` is quiet verbose and makes the
actual error message that is shown when trying to remove an non-existing plugin
hard to spot. This changes the error code to not trigger printing the usage
information.
Closes#21250
ShardInfo#toString resorts to calling ShardInfo#toXContent via
Strings#toString. However, the resulting XContent object will not start
with an object, and this is a violation of the generator state
machine. This commit fixes this issue by replacing the override of
toString to provide simple non-toXContent output of the ShardInfo
instance. Without this fix, ShardInfo#toString will instead produce
"Error building toString out of XContent:
com.fasterxml.jackson.core.JsonGenerationException: Can not write a
field name, expecting a value"
Relates #21319
Our test infrastructure checks after running each test that there are no more in-flight requests on the shard level. Whenever the check fails, we only know that there were in-flight requests but don't know what requests were causing this issue. This commit adds the replication tasks that are still active at that moment to the assertion error.
With #21099 we removed support for the ignored allow_no_indices parameter in indices upgrade API. Truth is that ignore_unavailable and expand_wildcards were also ignored, in indices upgrade as well as upgrade status API. Those parameters are though supported internally and settable through java API, hence they should be all supported on the REST layer too.
`FilterAggregationBuilder` today misses to rewrite queries which causes failures
if a query that uses a client for instance to lookup terms since it must be rewritten first.
This change also ensures that if a client is used from the rewrite context we mark the query as
non-cacheable.
Closes#21301
Plugins: Remove pluggability of ZenPing
ZenPing is the part of zen discovery which knows how to ping nodes.
There is only one alternative implementation, which is just for testing.
This change removes the ability to add custom zen pings, and instead
hooks in the MockZenPing for tests through an overridden method in
MockNode. This also folds in the ZenPingService (which was really just a
single method) into ZenDiscovery, and removes the idea of having
multiple ZenPing instances. Finally, this was the last usage of the
ExtensionPoint classes, so that is also removed here.
Since we now validate all consumed request parameter, users can't specify
`_cat/nodes?full_id=true|false` anymore since this parameter is consumed late.
This commit adds a test for this parameter and consumes it before request is processed.
Closes#21266
This would allow MapperService to still be usable in contexts when a
QueryShardContext cannot be obtained, for instance in the case that a
MapperService needs to be created only to merge mappings.
Currently the request cache adds a `CacheEntity` object. It looks quite innocent
but in practice it has a reference to a lambda that knows how to create a value.
The issue is that this lambda has implicit references to the SearchContext
object, which can be quite heavy since it holds a structured representation of
aggregations for instance.
This pull request splits the `CacheEntity` object from the object that generates
cache values.
Stored scripts and ingest node configuration are important parts of the overall cluster state and should be included into a snapshot together with index templates and persistent settings if the includeGlobalState is set to true.
Closes#21184
These tests had a single method due to the fact that es didn't support resetting settings when they were first written. They can now be rewritten to have separate methods and an after method that resets the setting that is left behind.
This adds checks for expected warning headers to the query builder test
infrastructure. Tests that are adding deprecation warnings to the response
headers need to check those, otherwise the abstract base class for the test
class will complain at teardown.
This query is deprecated from 5.0 on. Similar to IndicesQueryBuilder we should
log a deprecation warning whenever this query is used.
Relates to #15760
The pending cluster state queue is used to hold incoming cluster states from the master. Currently the elected master doesn't publish to itself as thus the queue is not used. Sometimes, depending on the timing of disruptions, a pending cluster state can be put on the queue (but not committed) but another master before being isolated. If this happens concurrently with a master election the elected master can have a pending cluster state in its queue. This is not really a problem but it does confuse our assertions during tests as we check to see everything was processed correctly.
This commit takes a temporary step to clear (and fail) any pending cluster state on the master after it has successfully published a CS. Most notably this will happen when the master publishes the cluster state indicating it has just become the master.
Long term we are working to change the publishing mechanism to make the master use the pending queue just like other nodes, which will make this a non issue.
See https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+5.x+java9-periodic/509 for example.
Lucene 6.2 introduces the new `Analyzer.normalize` API, which allows to apply
only character-level normalization such as lowercasing or accent folding, which
is exactly what is needed to process queries that operate on partial terms such
as `prefix`, `wildcard` or `fuzzy` queries. As a consequence, the
`lowercase_expanded_terms` option is not necessary anymore. Furthermore, the
`locale` option was only needed in order to know how to perform the lowercasing,
so this one can be removed as well.
Closes#9978
Since we have the ingest node type, there is either IngestActionFilter or IngestProxyActionFilter registered, depending on whether the node is an ingest node or not. The special case that shortcuts the execution in case there are no filters is never exercised.
This change adds an option called `split_on_whitespace` which prevents the query parser to split free text part on whitespace prior to analysis. Instead the queryparser would parse around only real 'operators'. Default to true.
For instance the query `"foo bar"` would let the analyzer of the targeted field decide how the tokens should be splitted.
Some options are missing in this change but I'd like to add them in a follow up PR in order to be able to simplify the backport in 5.x. The missing options (changes) are:
* A `type` option which similarly to the `multi_match` query defines how the free text should be parsed when multi fields are defined.
* Simple range query with additional tokens like ">100 50" are broken when `split_on_whitespace` is set to false. It should be possible to preserve this syntax and make the parser aware of this special syntax even when `split_on_whitespace` is set to false.
* Since all this options would make the `query_string_query` very similar to a match (multi_match) query we should be able to share the code that produce the final Lucene query.
The `IndexService#newQueryShardContext()` method creates a QueryShardContext on
shard `0`, with a `null` reader and that uses `System.currentTimeMillis()` to
resolve `now`. This may hide bugs, since the shard id is sometimes used for
query parsing (it is used to salt random score generation in `function_score`),
passing a `null` reader disables query rewriting and for some use-cases, it is
simply not ok to rely on the current timestamp (eg. percolation). So this pull
request removes this method and instead requires that all call sites provide
these parameters explicitly.
This commit introduces a single-shard balance step for deciding on
rebalancing a single shard (without taking any other shards in the
cluster into account). This method will be used by the cluster
allocation explain API to explain in detail the decision process for
finding a more optimal location for a started shard, if one exists.
This doesn't make much sense to have at all, since a user can do a `GET`
request without a version of they want to get it unconditionally.
Relates to #20995
There may be cases where the XContentBuilder is not used and therefore it never gets
closed, which can cause a leak of bytes. This change moves the creation of the builder
into a try with resources block and adds an assertion to verify that we always consume
the bytes in our code; the try-with resources provides protections against memory leaks
caused by plugins, which do not test this.
Refactored ScriptType to clean up some of the variable and method names. Added more documentation. Deprecated the 'in' ParseField in favor of 'stored' to match the indexed scripts being replaced by stored scripts.
We throw this exception in some cases that the shard is closed, so we have to be consistent here. Otherwise we get logs like:
```
1> [2016-10-30T21:06:22,529][WARN ][o.e.i.IndexService ] [node_s_0] [test] failed to run task refresh - suppressing re-occurring exceptions unless the exception changes
1> org.elasticsearch.index.shard.IndexShardClosedException: CurrentState[CLOSED] operation only allowed when not closed
1> at org.elasticsearch.index.shard.IndexShard.verifyNotClosed(IndexShard.java:1147) ~[main/:?]
1> at org.elasticsearch.index.shard.IndexShard.verifyNotClosed(IndexShard.java:1141) ~[main/:?]
```
Before publishing a cluster state the master connects to the nodes that are added in the cluster state. When publishing fails, however, it does not disconnect from these nodes, leaving NodeConnectionsService out of sync with the currently applied cluster state.
The assertion assertMaster checks if all nodes have each other in the cluster state and the correct master set.
It is usually called after a disruption has been healed and ensureStableCluster been called. In presence of a low
publish timeout of 1s in this test class, publishing might not be fully done even after ensureStableCluster returns.
This commit adds an assertBusy to assertMaster so that the node has a bit more time to apply the cluster state from
the master, even if it's a bit slow.
Replication request may arrive at a replica before the replica's node has processed a required mapping update. In these cases the TransportReplicationAction will retry the request once a new cluster state arrives. Sadly that retry logic failed to call `ReplicationRequest#onRetry`, causing duplicates in the append only use case.
This commit fixes this and also the test which missed the check. I also added an assertion which would have helped finding the source of the duplicates.
This was discovered by https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+multijob-unix-compatibility/os=opensuse/174/
Relates #20211
Lucene 6.3 is expected to be released in the next weeks so it'd be good to give
it some integration testing. I had to upgrade randomized-testing too so that
both Lucene and Elasticsearch are on the same version.
This fixes our cluster formation task to run REST tests against a mixed version cluster.
Yet, due to some limitations in our test framework `indices.rollover` tests are currently
disabled for the BWC case since they select the current master as the merge node which
happens to be a BWC node and we can't relocate all shards to it since the primaries are on
a higher version node. This will be fixed in a followup.
Closes#21142
Note: This has been cherry-picked from 5.0 and fixes several rest tests
as well as a BWC break in `OsStats.java`
The network disruption type "network delay" continues delaying existing requests even after the disruption has been cleared. This commit ensures that the requests get to execute right after the delay rule is cleared.
This test failed when the node that was shutting down was not yet removed from the cluster state on the master.
The cluster allocation explain API will not see any unassigned shards until the node shutting down is removed from the
cluster state.