Each node checks on every cluster state update if there are shards that it can possibly delete from its disk. It decides this by doing a file-system lookup for each shard id that is fully allocated in the cluster. With lots of shards, this amounts to lots of Files.exists() checks, considerably slowing down cluster state updates. This commit adds a caching layer so that the Files.exists() checks can be skipped if not needed.
* master: (516 commits)
Avoid angering Log4j in TransportNodesActionTests
Add trace logging when aquiring and releasing operation locks for replication requests
Fix handler name on message not fully read
Remove accidental import.
Improve log message in TransportNodesAction
Clean up of Script.
Update Joda Time to version 2.9.5 (#21468)
Remove unused ClusterService dependency from SearchPhaseController (#21421)
Remove max_local_storage_nodes from elasticsearch.yml (#21467)
Wait for all reindex subtasks before rethrottling
Correcting a typo-Maan to Man-in README.textile (#21466)
Fix InternalSearchHit#hasSource to return the proper boolean value (#21441)
Replace all index date-math examples with the URI encoded form
Fix typos (#21456)
Adapt ES_JVM_OPTIONS packaging test to ubuntu-1204
Add null check in InternalSearchHit#sourceRef to prevent NPE (#21431)
Add VirtualBox version check (#21370)
Export ES_JVM_OPTIONS for SysV init
Skip reindex rethrottle tests with workers
Make forbidden APIs be quieter about classpath warnings (#21443)
...
Currently the task cancellation command returns as soon as the top-level parent child is marked as cancelled. This create race conditions in tests where child tasks on other nodes may continue to run for some time after the main task is cancelled. This commit fixes this situation making task cancellation command to wait until it got propagated to all nodes that have child tasks.
Closes#21126
ShardActiveResponseHandler doesn't need to hold to an entire cluster state since it only needs to know the cluster state version. It seems that on overloaded systems where nodes are unresponsive holding onto a lot of different cluster states can make the situation worse.
Closes#21394
Ensures pending index-* blobs are deleted when snapshotting. The
index-* blobs are generational files that maintain the snapshots
in the repository. To write these atomically, we first write a
`pending-index-*` blob, then move it to `index-*`, which also deletes
`pending-index-*` in case its not a file-system level move (e.g.
S3 repositories) . For example, to write the 5th generation of the
index blob for the repository, we would first write the bytes to
`pending-index-5` and then move `pending-index-5` to `index-5`. It is
possible that we fail after writing `pending-index-5`, but before
moving it to `index-5` or deleting `pending-index-5`. In this case,
we will have a dangling `pending-index-5` blob laying around. Since
snapshot #5 would have failed, the next snapshot assumes a generation
number of 5, so it tries to write to `index-5`, which first tries to
write to `pending-index-5` before moving the blob to `index-5`. Since
`pending-index-5` is leftover from the previous failure, the snapshot
fails as it cannot overwrite this blob.
This commit solves the problem by first, adding a UUID to the
`pending-index-*` blobs, and secondly, strengthen the logic around
failure to write the `index-*` generational blob to ensure pending
files are deleted on cleanup.
Closes#21462
This change was reverted after it caused random test failures. This was
due to a copy/paste error in the original PR which caused the mock
version of ClusterInfoService to be used whenever the mock *ZenPing* was
used, and the real ClusterInfoService to be used when MockZenPing was
not used.
Currently, pending operations can complete after tests with disruption scheme
completes. This commit waits for the pending operation counter to complete
after the tests are run
* Allows for an array of index template patterns to be provided to an
index template, and rename the field from 'template' to 'index_pattern'.
Closes#20690
When logging a mock exception, Log4j attempts to render the stack
trace. On a mock exception, this will be null and Log4j will hit a
NullPointerException. This NullPointerException will get recorded in the
status logger buffer that we use to ensure that we do not having any
misuses of Log4j in production code. This commit replaces the use of a
mock exception with an actual exception to avoid angering the Log4j
assertions in ESTestCase.
Today when a message is not fully read on a response, we log (among
other details) the handler name. Unfortunately, if the handler is a
wrapper, all that we see is
o.e.t.TransportService$ContextRestoreResponseHandler@7446ba18
completely losing the offending handler. This commit adds an override
for TransportService$ContextRestoreResponseHandler#toString so that the
underlying offender can be discovered.
Relates #21478
Today when handling responses from nodes in TransportNodesAction, if a
node timeouts or some other failure occurs and the action is not
accumulating exceptions, we log a confusing message:
org.elasticsearch.action.admin.cluster.stats.TransportClusterStatsAction]
ignoring unexpected response [null] of type [null], expected
[ClusterStatsNodeResponse] or [FailedNodeException]
Moreover, the original exception is completely lost. Since this log
message is confusing and unhelpful, we can drop it. Instead, we hold
onto the exception and log it at the warn level before dropping it from
the response.
Relates #21476
This commit updates JodaTime to version 2.9.5 that contain a fix for a bug when parsing time zones (see https://github.com/JodaOrg/joda-time/pull/332, https://github.com/JodaOrg/joda-time/issues/386 and https://github.com/JodaOrg/joda-time/issues/373).
It also remove the joda-convert dependency that seems to be unused.
closes#20911
Here is the changelog for 2.9.5:
```
Changes in 2.9.5
----------------
- Add Norwegian period translations [#378]
- Add Duration.dividedBy(long,RoundingMode) [#69, #379]
- DateTimeZone data updated to version 2016i
- Fixed bug where clock read twice when comparing two nulls in DateTimeComparator [#404]
- Fixed minor issues with historic time-zone data [#373]
- Fix bug in time-zone binary search [#332, #386]
The fix in v2.9.2 caused problems when the time-zone being parsed
was not the last element in the input string. New approach uses a
different approach to the problem.
- Update tests for JDK 9 [#394]
- Close buffered reader correctly in zone info compiler [#396]
- Handle locale correctly zone info compiler [#397]
```
The method used to be called `isSourceEmpty`, and was renamed to `hasSource`, but the return value never changed. Updated tests and users accordingly.
Closes#21419
Currently, `executeIndexRequestOnPrimary` and `executeDeleteRequestOnPrimary`
methods used to prepare and execute write operations, modifies the provided
request, updating the version and versionType. This commit makes it the
callers responsibility to update request version and versionType and avoids
mutating the provided request in the execute methods.
This commit introduces a new execution mode for the
`simple_query_string` query, which is intended down the road to be a
replacement for the current _all field.
It now does auto-field-expansion and auto-leniency when the following criteria
are ALL met:
The _all field is disabled
No default_field has been set in the index settings
No fields are specified in the request
Additionally, a user can force the "all-like" execution by setting the
all_fields parameter to true.
When executing in all field mode, the `simple_query_string` query will
look at all the fields in the mapping that are not metafields and can be
searched, and automatically expand the list of fields that are going to
be queried.
Relates to #20925, which is the `query_string` version of this work.
This is basically the same behavior, but for the `simple_query_string`
query.
Relates to #19784
This commit clarifies some log messages for the bootstrap checks. The
message is that the user has limits set that are below the minimums that
Elasticsearch requires.
Relates #21423
This commit ensures that we always restore the thread's original context after execution of
a context preserving runnable. We always wrap runnables in a wrapper that restores the context
at the time it was submitted to the execute method. The ContextPreservingAbstractRunnable
would restore the calling context in the doRun method and then in a try with resources
block would restore the thread's original context. However, the onFailure and onAfter methods
of a AbstractRunnable could modify the thread context and this modified thread context would
continue on as the thread's context after it was returned to the pool and potentially used
for a different purpose.
* Plugins: Convert custom discovery to pull based plugin
This change primarily moves registering custom Discovery implementations
to the pull based DiscoveryPlugin interface. It also keeps the cloud
based discovery plugins re-registering ZenDiscovery under their own name
in order to maintain backwards compatibility. However,
discovery.zen.hosts_provider is changed here to no longer fallback to
discovery.type. Instead, each plugin which previously relied on the
value of discovery.type now sets the hosts_provider to itself if
discovery.type is set to itself, along with a deprecation warning.
This commit cleans up the formatting of some Javadocs in
BootstrapCheck.java, and corrects some docs that had become stale as the
bootstrap checks evolved.
The ClusterState class currently has a mutable volatile field "status" that is only used by the ClusterStateObserver to differentiate between a cluster state that is being applied or one that has already been applied. This commit removes the field from cluster state, making it a truly immutable class. This information is stored instead by ClusterService, which is the only place that should update this field (PublishClusterStateAction was also updating it, but that information was never used anywhere). A new class is introduced (ClusterServiceState) which emcompasses the current cluster state as well as the current status, which is only used by the ClusterStateObserver mechanism.
Applied (almost) the same rules we use to validate index names
to new alias names. The only rule not applies it "must be lowercase".
We have tests that don't follow that rule and I except there are lots
of examples of camelCase alias names in the wild. We can add that
validation but I'm not sure it is worth it.
Closes#20748
Adds an alias that starts with `#` to the BWC index and validates
that you can read from it and remove it. Starting with `#` isn't
allowed after 5.1.0/6.0.0 so we don't create the alias or check it
after those versions.
Before this change Eclipse (4.6.1) would show compile errors in `ClusterRerouteRequestTests` for the elements in `RANDOM_COMMAND_GENERATORS`. This seems to be because the eclipse compiler does not recognised that the elements in the list are all `Supplier`s of bjects that are subclasses of `AllocationCommand`.
This change fixes the problem by adding a generics hint to the `Arrays.toList()` call.
All plugins currently have their own licenses dir for the
dependencyLicenses task, but core disables this and has the check inside
distribution. This may have been better for maven, but for
gradle it makes more sense to just use the dependencyLicenses task that
automatically exists inside :core, and remove the hacked up version that
is inside distribution.
This commit changes the Javadocs in OsProbe.java to take advantage of
the fact that the line-length limit is 140 characters. This change makes
these Javadocs easier to read and easier to maintain.
This commit removes JVMCheck. Previously there were three checks in this
class:
- check for super word bug in JDK 7
- check for G1GC bugs in JDK 8
- check for broken IBM JDKs
The first check is obsolete since we require JDK 8 now. The second check
is refactored into a bootstrap check. The third check is removed since
we do not even support the IBM JDK.
Relates #21389
This commit adds an assertion to ensure that we do not introduce blocking calls in code
that is called in a ClusterStateListener or another part of the cluster state update process.
Before, when requesting to get the snapshots in a repository, if `_all` was
specified for the set of snapshots to get, any in-progress snapshots would
be returned twice. This commit fixes the issue ensuring that each snapshot,
whether in-progress or completed, is only returned once when making a call to
get snapshots (GET /_snapshot/{repository_name}/_all).
This commit also enables `_current` to appear anywhere in
the get snapshots list, and will be used as a pseudo regex to
mean "match any currently running snapshots".
Closes#21335
Today when acquiring load average stats, this method is rather lenient
when reading /proc/loadavg. This commit removes this leniency so that
any such issues are more likely to be exposed to the end-user.
Relates #21380
Adds support for `?slices=N` to reindex which automatically
parallelizes the process using parallel scrolls on `_uid`. Performance
testing sees a 3x performance improvement for simple docs
on decent hardware, maybe 30% performance improvement
for more complex docs. Still compelling, especially because
clusters should be able to get closer to the 3x than the 30%
number.
Closes#20624
Today when writing an unbounded queue_size in the cat thread pool API,
we write null. This commit modifies this so that the output is -1 so
that the output is always present, and always a numeric value.
Relates #21342
Exist requests are supposed to never throw an exception, but rather return true or false depending on whether some resource exists or not. Indices exists does that for indices and accepts wildcard expressions too. The way the api works internally is by resolving indices and catching IndexNotFoundException: if an exception is thrown the index does not exist hence it returns false, otherwise it returns true. That works ok only if ignore_unavailable and allow_no_indices indices options are both set to false, meaning that they are strict and any missing index or wildcard expressions that resolves to no indices will lead to an exception that can be thrown and cause false to be returned.
Unfortunately the indices options have been configurable up until now for this request, meaning that one can set ignore_unavailable or allow_no_indices to true and have the indices exist request return true for indices that really don't exist, which makes very little sense in the context of this api.
This commit removes the indicesOptions setter from the IndicesExistsRequest and makes settable only expandWildcardsOpen and expandWildcardsClosed, hence a subset of the available indices options. This way we can guarantee more consistent behaviour of the indices exists api. We can then remove the ignore_unavailable and allow_no_indices option from indices exists api spec
Today if you start Elasticsearch with the status logger configured to
the warn level, or use a transport client with the default status logger
level, you will see warn messages about deprecation loggers being
created with different message factories and that formatting might be
broken. This happens because the deprecation logger is constructed using
the message factory from its parent, an artifact leftover from the first
Log4j 2 implementation that used a custom message factory. When that
custom message factory was removed, this constructor invocation should
have been changed to not explicitly use the message factory from the
parent. This commit fixes this invocation. However, we also had some
status checking to all tests to ensure that there are no warn status log
messages that might indicate a configuration problem with Log4j 2. These
assertions blow up badly without the fix for the deprecation logger
construction, and also caught a misconfiguration in one of the logging
tests.
Relates #21339
Today we validate the target index name late and therefore don't fail for instance
if the target index already exists and `dry_run=true` was specified. This change
validates the index name before we early terminate if dry_run is set.
Closes#21149
Today we still have a leftover from older percolators where lucene
query instances where created ahead of time and rewritten later.
This `LateParsingQuery` was resolving `now()` when it's really used which we
don't need anymore. As a side-effect this failed to execute some highlighting
queries when they get rewritten since at that point `now` access it not permitted
anymore to prevent bugs when queries get cached.
Closes#21295
This commit introduces a new execution mode for the query_string query, which
is intended down the road to be a replacement for the current _all field.
It now does auto-field-expansion and auto-leniency when the following criteria
are ALL met:
The _all field is disabled
No default_field has been set in the index settings
No default_field has been set in the request
No fields are specified in the request
Additionally, a user can force the "all-like" execution by setting the
all_fields parameter to true.
When executing in all field mode, the query_string query will look at all the
fields in the mapping that are not metafields and can be searched, and
automatically expand the list of fields that are going to be queried.
Relates to #19784
According to ISO 8601, a time zone offset of zero, can be stated numerically as
"+00:00", "+0000", or "00". The Joda library also seems to allow for "-00:00",
"-00" and "-0000". This adds some test to the DateMathParserTests that check
that we also conform to this.
Closes#21320
The usage information for `elasticsearch-plugin` is quiet verbose and makes the
actual error message that is shown when trying to remove an non-existing plugin
hard to spot. This changes the error code to not trigger printing the usage
information.
Closes#21250
ShardInfo#toString resorts to calling ShardInfo#toXContent via
Strings#toString. However, the resulting XContent object will not start
with an object, and this is a violation of the generator state
machine. This commit fixes this issue by replacing the override of
toString to provide simple non-toXContent output of the ShardInfo
instance. Without this fix, ShardInfo#toString will instead produce
"Error building toString out of XContent:
com.fasterxml.jackson.core.JsonGenerationException: Can not write a
field name, expecting a value"
Relates #21319
Our test infrastructure checks after running each test that there are no more in-flight requests on the shard level. Whenever the check fails, we only know that there were in-flight requests but don't know what requests were causing this issue. This commit adds the replication tasks that are still active at that moment to the assertion error.
With #21099 we removed support for the ignored allow_no_indices parameter in indices upgrade API. Truth is that ignore_unavailable and expand_wildcards were also ignored, in indices upgrade as well as upgrade status API. Those parameters are though supported internally and settable through java API, hence they should be all supported on the REST layer too.
`FilterAggregationBuilder` today misses to rewrite queries which causes failures
if a query that uses a client for instance to lookup terms since it must be rewritten first.
This change also ensures that if a client is used from the rewrite context we mark the query as
non-cacheable.
Closes#21301
Plugins: Remove pluggability of ZenPing
ZenPing is the part of zen discovery which knows how to ping nodes.
There is only one alternative implementation, which is just for testing.
This change removes the ability to add custom zen pings, and instead
hooks in the MockZenPing for tests through an overridden method in
MockNode. This also folds in the ZenPingService (which was really just a
single method) into ZenDiscovery, and removes the idea of having
multiple ZenPing instances. Finally, this was the last usage of the
ExtensionPoint classes, so that is also removed here.
Since we now validate all consumed request parameter, users can't specify
`_cat/nodes?full_id=true|false` anymore since this parameter is consumed late.
This commit adds a test for this parameter and consumes it before request is processed.
Closes#21266
This would allow MapperService to still be usable in contexts when a
QueryShardContext cannot be obtained, for instance in the case that a
MapperService needs to be created only to merge mappings.
Currently the request cache adds a `CacheEntity` object. It looks quite innocent
but in practice it has a reference to a lambda that knows how to create a value.
The issue is that this lambda has implicit references to the SearchContext
object, which can be quite heavy since it holds a structured representation of
aggregations for instance.
This pull request splits the `CacheEntity` object from the object that generates
cache values.
Stored scripts and ingest node configuration are important parts of the overall cluster state and should be included into a snapshot together with index templates and persistent settings if the includeGlobalState is set to true.
Closes#21184
These tests had a single method due to the fact that es didn't support resetting settings when they were first written. They can now be rewritten to have separate methods and an after method that resets the setting that is left behind.
This adds checks for expected warning headers to the query builder test
infrastructure. Tests that are adding deprecation warnings to the response
headers need to check those, otherwise the abstract base class for the test
class will complain at teardown.
This query is deprecated from 5.0 on. Similar to IndicesQueryBuilder we should
log a deprecation warning whenever this query is used.
Relates to #15760
The pending cluster state queue is used to hold incoming cluster states from the master. Currently the elected master doesn't publish to itself as thus the queue is not used. Sometimes, depending on the timing of disruptions, a pending cluster state can be put on the queue (but not committed) but another master before being isolated. If this happens concurrently with a master election the elected master can have a pending cluster state in its queue. This is not really a problem but it does confuse our assertions during tests as we check to see everything was processed correctly.
This commit takes a temporary step to clear (and fail) any pending cluster state on the master after it has successfully published a CS. Most notably this will happen when the master publishes the cluster state indicating it has just become the master.
Long term we are working to change the publishing mechanism to make the master use the pending queue just like other nodes, which will make this a non issue.
See https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+5.x+java9-periodic/509 for example.
Lucene 6.2 introduces the new `Analyzer.normalize` API, which allows to apply
only character-level normalization such as lowercasing or accent folding, which
is exactly what is needed to process queries that operate on partial terms such
as `prefix`, `wildcard` or `fuzzy` queries. As a consequence, the
`lowercase_expanded_terms` option is not necessary anymore. Furthermore, the
`locale` option was only needed in order to know how to perform the lowercasing,
so this one can be removed as well.
Closes#9978
Since we have the ingest node type, there is either IngestActionFilter or IngestProxyActionFilter registered, depending on whether the node is an ingest node or not. The special case that shortcuts the execution in case there are no filters is never exercised.
This change adds an option called `split_on_whitespace` which prevents the query parser to split free text part on whitespace prior to analysis. Instead the queryparser would parse around only real 'operators'. Default to true.
For instance the query `"foo bar"` would let the analyzer of the targeted field decide how the tokens should be splitted.
Some options are missing in this change but I'd like to add them in a follow up PR in order to be able to simplify the backport in 5.x. The missing options (changes) are:
* A `type` option which similarly to the `multi_match` query defines how the free text should be parsed when multi fields are defined.
* Simple range query with additional tokens like ">100 50" are broken when `split_on_whitespace` is set to false. It should be possible to preserve this syntax and make the parser aware of this special syntax even when `split_on_whitespace` is set to false.
* Since all this options would make the `query_string_query` very similar to a match (multi_match) query we should be able to share the code that produce the final Lucene query.
The `IndexService#newQueryShardContext()` method creates a QueryShardContext on
shard `0`, with a `null` reader and that uses `System.currentTimeMillis()` to
resolve `now`. This may hide bugs, since the shard id is sometimes used for
query parsing (it is used to salt random score generation in `function_score`),
passing a `null` reader disables query rewriting and for some use-cases, it is
simply not ok to rely on the current timestamp (eg. percolation). So this pull
request removes this method and instead requires that all call sites provide
these parameters explicitly.
This commit introduces a single-shard balance step for deciding on
rebalancing a single shard (without taking any other shards in the
cluster into account). This method will be used by the cluster
allocation explain API to explain in detail the decision process for
finding a more optimal location for a started shard, if one exists.
This doesn't make much sense to have at all, since a user can do a `GET`
request without a version of they want to get it unconditionally.
Relates to #20995
There may be cases where the XContentBuilder is not used and therefore it never gets
closed, which can cause a leak of bytes. This change moves the creation of the builder
into a try with resources block and adds an assertion to verify that we always consume
the bytes in our code; the try-with resources provides protections against memory leaks
caused by plugins, which do not test this.
This adds support for templating in rank eval requests.
Relates to #20231
Problem: In it's current state the rank-eval request API forces the user to repeat complete queries for each test request. In most use cases the structure of the query to test will be stable with only parameters changing across requests, so this looks like lots of boilerplate json for something that could be expressed in a more concise way.
Uses templating/ ScriptServices to enable users to submit only one test request template and let them only specify template parameters on a per test request basis.
Refactored ScriptType to clean up some of the variable and method names. Added more documentation. Deprecated the 'in' ParseField in favor of 'stored' to match the indexed scripts being replaced by stored scripts.
We throw this exception in some cases that the shard is closed, so we have to be consistent here. Otherwise we get logs like:
```
1> [2016-10-30T21:06:22,529][WARN ][o.e.i.IndexService ] [node_s_0] [test] failed to run task refresh - suppressing re-occurring exceptions unless the exception changes
1> org.elasticsearch.index.shard.IndexShardClosedException: CurrentState[CLOSED] operation only allowed when not closed
1> at org.elasticsearch.index.shard.IndexShard.verifyNotClosed(IndexShard.java:1147) ~[main/:?]
1> at org.elasticsearch.index.shard.IndexShard.verifyNotClosed(IndexShard.java:1141) ~[main/:?]
```
Before publishing a cluster state the master connects to the nodes that are added in the cluster state. When publishing fails, however, it does not disconnect from these nodes, leaving NodeConnectionsService out of sync with the currently applied cluster state.
The assertion assertMaster checks if all nodes have each other in the cluster state and the correct master set.
It is usually called after a disruption has been healed and ensureStableCluster been called. In presence of a low
publish timeout of 1s in this test class, publishing might not be fully done even after ensureStableCluster returns.
This commit adds an assertBusy to assertMaster so that the node has a bit more time to apply the cluster state from
the master, even if it's a bit slow.
Replication request may arrive at a replica before the replica's node has processed a required mapping update. In these cases the TransportReplicationAction will retry the request once a new cluster state arrives. Sadly that retry logic failed to call `ReplicationRequest#onRetry`, causing duplicates in the append only use case.
This commit fixes this and also the test which missed the check. I also added an assertion which would have helped finding the source of the duplicates.
This was discovered by https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+multijob-unix-compatibility/os=opensuse/174/
Relates #20211
Lucene 6.3 is expected to be released in the next weeks so it'd be good to give
it some integration testing. I had to upgrade randomized-testing too so that
both Lucene and Elasticsearch are on the same version.
This fixes our cluster formation task to run REST tests against a mixed version cluster.
Yet, due to some limitations in our test framework `indices.rollover` tests are currently
disabled for the BWC case since they select the current master as the merge node which
happens to be a BWC node and we can't relocate all shards to it since the primaries are on
a higher version node. This will be fixed in a followup.
Closes#21142
Note: This has been cherry-picked from 5.0 and fixes several rest tests
as well as a BWC break in `OsStats.java`
The network disruption type "network delay" continues delaying existing requests even after the disruption has been cleared. This commit ensures that the requests get to execute right after the delay rule is cleared.
This test failed when the node that was shutting down was not yet removed from the cluster state on the master.
The cluster allocation explain API will not see any unassigned shards until the node shutting down is removed from the
cluster state.
Previously, if a node left the cluster (for example, due to a long GC),
during a snapshot, the master node would mark the snapshot as failed, but
the node itself could continue snapshotting the data on its shards to the
repository. If the node rejoins the cluster, the master may assign it to
hold the replica shard (where it held the primary before getting kicked off
the cluster). The initialization of the replica shard would repeatedly fail
with a ShardLockObtainFailedException until the snapshot thread finally
finishes and relinquishes the lock on the Store.
This commit resolves the situation by ensuring that when a shard is removed
from a node (such as when a node rejoins the cluster and realizes it no longer
holds the active shard copy), any snapshotting of the removed shards is aborted.
In the scenario above, when the node rejoins the cluster, it will see in the cluster
state that the node no longer holds the primary shard, so IndicesClusterStateService
will remove the shard, thereby causing any snapshots of that shard to be aborted.
Closes#20876
The cluster state on a node is updated either
- by incoming cluster states that are received from the active master or
- by the node itself when it notices that the master has gone.
In the second case, the node adds the NO_MASTER_BLOCK and removes the current master as active master from its cluster state. In one particular case, it would also update the list of nodes, removing the master node that just failed. In the future, we want a clear separation between actions that can be executed by a master publishing a cluster state and a node locally updating its cluster state when no active master is around.
This commit fixes responses to HEAD requests so that the value of the
Content-Length is correct per the HTTP spec. Namely, the value of this
header should be equal to the Content-Length if the request were not a
HEAD request.
This commit also fixes a memory leak on HEAD requests to the main action
that arose from the bytes on a builder not being released due to them
being dropped on the floor to ensure that the response to the main
action did not have a body.
Relates #21123
This commit adds preformatted tags to the Javadoc for
OsProbe#readSysFsCgroupCpuAcctCpuStat to render the form of the cpu.stat
file in a fixed-width font.
When acquiring cgroup stats, we check if such stats are available by
invoking a method areCgroupStatsAvailable. This method checks
availability by looking for existence of some virtual files in
/proc/self/cgroup and /sys/fs/cgroups. If these stats are not available,
the getCgroup method returns null. The OsProbeTests#testCgroupProbe did
not account for this. On some systems where tests run, the cgroup stats
might not be available yet this test method was expecting them to be (we
mock the relevant virtual file reads). This commit handles the execution
of this test on such systems by overriding the behavior of
OsProbe#areCgroupStatsAvailable. We test both the possibility of this
method returning true as well as false.
On some systems, cgroups will be available but not configured. And in
some cases, cgroups will be configured, but not for the subsystems that
we are expecting (e.g., cpu and cpuacct). This commit strengthens the
handling of cgroup stats on such systems.
Relates #21094
When system starts, it creates a temporary file named .es_temp_file to ensure the data directories are writable.
If system crashes after creating the .es_temp_file but before deleting this file, next time the system will not be able to start because the Files.createFile(resolve) will throw an exception if the file already exists.
Currently test that check that equals() and hashCode() are working as expected
for classes implementing them are quiet similar. This change moves common
assertions in this method to a common utility class. In addition, another common
utility function in most of these test classes that creates copies of input
object by running them through a StreamOutput and reading them back in, is moved
to ESTestCase so it can be shared across all these classes.
Closes#20629
Today the request interceptor can't support async calls since the response
of the async call would execute on a different thread ie. a client or listener
thread. This means in-turn that the intercepted handler is not executed with the
thread it was supposed to run and therefor can, if it's executing blocking
operations, potentially deadlock an entire server.
Before this commit `curl -XHEAD localhost:9200?pretty` would return
`Content-Length: 1` and a body which is fairly upsetting to standards
compliant tools. Now it'll return `Content-Length: 0` with an empty
body like every other `HEAD` request.
Relates to #21075
Refactors the BalancedShardsAllocator to create a method that
provides an allocation decision for allocating a single
unassigned shard or a single started shard that can no longer
remain on its current node. Having a separate method that
provides a detailed decision on the allocation of a single shard
will enable the cluster allocation explain API to directly
invoke these methods to provide allocation explanations.
This commit reduces classes to handle write
operation results in TransportWriteAction, this
comes at the cost of handling write operations in
TransportShardBulkAction.
Now parsing, mapping failures (which happen before
executing engine write operation) are communicated
via a failure operation type while transient operation
failures are set on the index/delete operations.
This experimental setting enables relocation of shards that are being snapshotted, which can cause the shard allocation failures. This setting is undocumented and there is no good reason to set it in production.
This commit cleans up the code handling load averages in OsProbe:
- remove support for BSD; we do not support this OS
- add Javadocs
- strengthen assertions and testing
- add debug logging for exceptional situation
Relates #21037
This change moves providing UnicastHostsProvider for zen discovery to be
pull based, adding a getter in DiscoveryPlugin. A new setting is added,
discovery.zen.hosts_provider, to separate the discovery type from the
hosts provider for zen when it is selected. Unfortunately existing
plugins added ZenDiscovery with their own name in order to just provide
a hosts provider, so there are already many users setting the hosts
provider through discovery.type. This change also includes backcompat,
falling back to discovery.type when discovery.zen.hosts_provider is not
set.
The max score returned in the response of a query does not take rescorer into account.
This change updates the max_score when a rescorer is used in a query.
Fixes#20651
This change adds a TypesQuery that checks if the disjunction of types should be rewritten to a MatchAllDocs query. The check is done only if the number of terms is below a threshold (16 by default and configurable via max_boolean_clause).
* Move all zen discovery classes into o.e.discovery.zen
This collapses sub packages of zen into zen. These all had just a couple
classes each, and there is really no reason to have the subpackages.
* fix checkstyle
Date math index/alias expressions in mget will now be resolved to a concrete single index instead of failing the mget item with an `IndexNotFoundException`.
Added also an integration test to verify multi index aliases do not fail the entire mget request.
Closes#17957
This commit removes an undocumented output parameter node_info_format
from the cluster stats and node stats APIs. Currently the parameter does
not even work as it is not whitelisted as an output parameter. Since
this parameter is not documented, we opt to just remove it.
Relates #21021
When indices stats are requested via the node stats API, there is a
level parameter to request stats at the index, node, or shards
level. This parameter was not whitelisted when URL parsing was made
strict. This commit whitelists this parameter.
Additionally, there was some leniency in the parsing of this parameter
that has been removed.
Relates #21024
This change makes the ElectMasterService local to ZenDiscovery, no
longer created by guice, and thus also removes the ability for plugins
to customize. This extension point is no longer used by anything.
This was an error-prone version type that allowed overriding previous
version semantics. It could cause primaries and replicas to be out of
sync however, so it has been removed.
This is related to #20377, which removed the feature entirely. This
allows operations to continue to use the `force` version type if the
index was created before 6.0, in the event a document using it exists in
a translog being replayed.
Previous to this change any request using a script sort in a top_hits
aggregation would fail because the compilation of the script happened
after the QueryShardContext was frozen (after we had worked out if the
request is cachable).
This change moves the calling of build() on the SortBuilder to the
TopHitsAggregationBuilder which means that the script in the script_sort
will be compiled before we decide whether to cache the request and freeze
the context.
Closes#21022
This commit removes an undocumented output parameter output_uuid from
the cluster stats API. Currently the parameter does not even work as it
is not whitelisted as an output parameter. Since the cluster UUID is
available from the main action, and this parameter is not documented, we
opt to just remove it.
Relates #21020
`LocalDiscovery` is a discovery implementation that uses static in memory maps to keep track of current live nodes. This is used extensively in our tests in order to speed up cluster formation (i.e., shortcut the 3 second ping period used by `ZenDiscovery` by default). This is sad as that mean that most of the test run using a different discovery semantics than what is used in production. Instead of replacing the entire discovery logic, we can use a similar approach to only shortcut the pinging components.
This tests that the templates shipped with 5.0 versions of Logstash and
Beats still work on an Elasticsearch 6.0+ node, so that we ensure that
ES can be upgraded prior to upgrading tools dependent on it.
Related to #20491Resolves#17275
* Only negate index expression on all indices with preceding wildcard
There is currently a very confusing behavior in Elasticsearch for the
following:
Given the indices: `[test1, test2, -foo1, -foo2]`
```
DELETE /-foo*
```
Will cause the `test1` and `test2` indices to be deleted, when what is
usually intended is to delete the `-foo1` and `-foo2` indices.
Previously we added a change in #20033 to disallow creating indices
starting with `-` or `+`, which will help with this situation. However,
users may have existing indices starting with these characters.
This changes the negation to only take effect in a wildcard (`*`) has
been seen somewhere in the expression, so in order to delete `-foo1` and
`-foo2` the following now works:
```
DELETE /-foo*
```
As well as:
```
DELETE /-foo1,-foo2
```
so in order to actually delete everything except for the "foo" indices
(ie, `test1` and `test2`) a user would now issue:
```
DELETE /*,--foo*
```
Relates to #19800
Updating the circuit breaker settings (and other settings) should always be possible, even if the cluster is under stress. With #20827 we updated the cluster settings request to not trigger circuit breakers. However that change is not complete since the resulting cluster state can potentially not be published. This change makes sure cluster state publishing to not trigger circuit breakers as well.
Relates to #20960 where this was discovered.
Cleaning up a few remaining occurences of using junits ExpectedException rule in
favor of using LuceneTestCase#expectThrows() which is more concise and versatile.
* Scripting: Add support for booleans in scripts
Since 2.0, booleans have been represented as numeric fields (longs).
However, in scripts, this is odd, since you expect doing a comparison
against a boolean to work. While languages like groovy will auto convert
between booleans and longs, painless does not.
This changes the doc values accessor for boolean fields in scripts to
return Boolean objects instead of Long objects.
closes#20949
* Make Booleans final and remove wrapping of `this` for getValues()
During a recent merge from master, we lost the bridge from IndicesClusterStateService to the GlobalCheckpointService of primary shards, notifying them of changes to the current set of active/initializing shards. This commits add the bridge back (with unit tests). It also simplifies the GlobalCheckpoint tracking to use a simpler model (which makes use the fact that the global check point sync is done periodically).
The old integration CheckpointIT test is moved to IndexLevelReplicationTests. I also added similar assertions to RelocationsIT, which surfaced a bug in the primary relocation logic and how it plays with global checkpoint updates. The test is currently await-fixed and will be fixed in a follow up issue.
Today when logging an unknown or invalid setting, the log message does
not contain the source. This means that if we are archiving such a
setting, we do not specify where the setting is from (an index, and
which index, or a persistent or transient cluster setting). This commit
provides such logging for the end user can better understand the
consequences of the unknown or invalid setting.
Relates #20951
Now index and delete methods in index shard share code for
indexing stats. This commit collapses seperate methods for
index and delete operations into a generic execute method
for performing engine write operations. As an added benefit,
this commit cleans up the interface for indexing operation
listener making it more simple and concise to use.
Currently, we treat all write operation exceptions as equals, but in reality
every write operation can cause either an environment failure (i.e. a failure
that should fail the engine e.g. data corruption, lucene tragic events) or
operation failure (i.e. a failure that is transient w.r.t the operation e.g.
parsing exception).
This change bubbles up enironment failures from the engine, after failing the
engine but captures transient operation failures as part of the operation
to be processed appopriately at the transport level.
Today we don't parse alias filters on the coordinating node, we only forward
the alias patters to executing node and resolve it late. This has several problems
like requests that go through filtered aliases are never cached if they use date math,
since the parsing happens very late in the process even without rewriting. It also used
to be processed on every shard while we can only do it once per index on the coordinating node.
Another nice side-effect is that we are never prone to cluster-state updates that change an alias,
all nodes will execute the exact same alias filter since they are process based on the same
cluster state.
Today Elasticsearch limits the number of processors used in computing
thread counts to 32. This was from a time when Elasticsearch created
more threads than it does now and users would run into out of memory
errors. It appears the real cause of these out of memory errors was not
well understood (it's often due to ulimit settings) and so users were
left hitting these out of memory errors on boxes with high core
counts. Today Elasticsearch creates less threads (but still a lot) and
we have a bootstrap check in place to ensure that the relevant ulimit is
not too low.
There are some caveats still to having too many concurrent indexing
threads as it can lead to too many little segments, and it's not a
magical go faster knob if indexing is already bottlenecked by disk, but
this limitation is artificial and surprising to users and so it should
be removed.
This commit also increases the lower bound of the max processes ulimit,
to prepare for a world where Elasticsearch instances might be running
with more the previous cap of 32 processors. With the current settings,
Elasticsearch wants to create roughly 576 + 25 * p / 2 threads, where p
is the number of processors. Add in roughly 7 * p / 8 threads for the GC
threads and a fudge factor, and 4096 should cover us pretty well up to
256 cores.
Relates #20874
Some people apparently never run tests when they change this file.
Neither do they read comments right below the line they change that
they should do the change after all.
`AbstractSearchAsyncAction` has only been tested in integration tests.
The infrastructure is rather critical and should be tested on a unit-test
level. This change takes the first step.
This changes the CacheBuilder methods that are used to set expiration times to accept a
TimeValue instead of long. Accepting a long can lead to issues where the incorrect value is
passed in as the time unit is not clearly identified. By using TimeValue the caller no longer
needs to worry about the time unit used by the cache or builder.
Before this change the `MultiMatchQuery` called the field types
`termQuery()` with a null context. This is not correct so this change
fixes this so the `MultiMatchQuery` now uses the `ShardQueryContext` it
stores as a field.
Relates to https://github.com/elastic/elasticsearch/pull/20796#pullrequestreview-3606305
Both netty3 and netty4 http implementation printed the default
toString representation of PortRange if ports couldn't be bound.
This commit adds a better default toString method to PortRange and
uses the string representation for the error message in the http
implementations.
The test testDataFileCorruptionDuringRestore expects failures to happen when accessing snapshot data. It would sometimes
fail however as MockRepository (by default) only simulates 100 failures.
Sequence number related data (maximum sequence number, local checkpoint,
and global checkpoint) gets stored in Lucene on each commit. The logical
place to store this data is on each Lucene commit's user commit data
structure (see IndexWriter#setCommitData and the new version
IndexWriter#setLiveCommitData). However, previously we did not store the
maximum sequence number in the commit data because the commit data got
copied over before the Lucene IndexWriter flushed the documents to segments
in the commit. This means that between the time that the commit data was
set on the IndexWriter and the time that the IndexWriter completes the commit,
documents with higher sequence numbers could have entered the commit.
Hence, we would use FieldStats on the _seq_no field in the documents to get
the maximum sequence number value, but this suffers the drawback that if the
last sequence number in the commit corresponded to a delete document action,
that sequence number would not show up in FieldStats as there would be no
corresponding document in Lucene.
In Lucene 6.2, the commit data was changed to take an Iterable interface, so
that the commit data can be calculated and retrieved *after* all documents
have been flushed, while the commit data itself is being set on the Lucene commit.
This commit changes max_seq_no so it is stored in the commit data instead of
being calculated from FieldStats, taking advantage of the deferred calculation
of the max_seq_no through passing an Iterable that dynamically sets the iterator
data.
* improvements to iterating over commit data (and better safety guarantees)
* Adds sequence number and checkpoint testing for document deletion
intertwined with document indexing.
* improve test code slightly
* Remove caching of max_seq_no in commit data iterator and inline logging
* Adds a test for concurrently indexing and committing segments
to Lucene, ensuring the sequence number related commit data
in each Lucene commit point matches the invariants of
localCheckpoint <= highest sequence number in commit <= maxSeqNo
* fix comments
* addresses code review
* adds clarification on checking commit data on recovery from translog
* remove unneeded method
Sometimes it's useful / needed to use unreleased Version constants but we should not add those to the Version.java class for several reasons ie. BWC tests and assertions along those lines. Yet, it's not really obvious how to do that so I added some comments and a simple test for this.
There was an issue with using fuzziness parameter in multi_match query that has
been reported in #18710 and was fixed in Lucene 6.2 that is now used on master.
In order to verify that fix and close the original issue this PR adds the test
from that issue as an integration test.
today we might release a bytes array more than once if the send listener
throws an exception but already has released the array. Yet, this is already fixed
in the BytesArray class we use in production to ensure 3rd party users don't release
twice but our mocks still enforce it.
The snapshot restore state tracks information about shards being restored from a snapshot in the cluster state. For example it records if a shard has been successfully restored or if restoring it was not possible due to a corruption of the snapshot. Recording these events is usually based on changes to the shard routing table, i.e., when a shard is started after a successful restore or failed after an unsuccessful one. As of now, there were two communication channels to transmit recovery failure / success to update the routing table and the restore state. This lead to issues where a shard was failed but the restore state was not updated due to connection issues between data and master node. In some rare situations, this lead to an issue where the restore state could not be properly cleaned up anymore by the master, making it impossible to start new restore operations. The following change updates routing table and restore state in the same cluster state update so that both always stay in sync. It also eliminates the extra communication channel for restore operations and uses standard cluster state listener mechanism to update restore listener upon successful
completion of a snapshot.
* Fixed writeable name from range to geo_distance
* Added testGeoDistanceAggregation
* Added asserts for correct result in testGeoDistanceAggregation
* Setup mapping on test index.
When refactoring DirectCandidateGeneratorBuilder recently, the
ConstructingObjectParser that we have today was not available. Instead we used
some workaround, but it is better to remove this now and use
ConstructingObjectParser instead.
* Adding built-in sorting capability to _cat apis.
Closes#16975
* addressing pr comments
* changing value types back to original implementation and fixing cosmetic issues
* Changing compareTo, hashCode of value types to a better implementation
* Changed value compareTos to use Double.compare instead of if statements + fixed some failed unit tests
Shadow replicas can not be simply promoted to primary by updating boolean like normal shards. Instead the are reinitialized and shut down and rebuilt as primaries. Currently we also given them new allocation ids but that throws off the in-sync allocation ids management. This commit changes this behavior to keep the allocation id of the shard.
Closes#20650
This change adds a overloaded `XContentMapValues#filter` method that returns
a function enclosing the compiled automatons that can be reused across filter
calls. This for instance prevents compiling automatons over and over again when
hits are filtered or in the SourceFieldMapper for each document.
Closes#20839
Some objects like maps, iterables or arrays of objects can self-reference themselves. This is mostly due to a bug in code but the XContentBuilder should be able to detect such situations and throws an IllegalArgumentException instead of building objects over and over until a stackoverflow occurs.
closes#20540closes#19475
Settings updates are important to be able to help and administer a cluster in distress. We shouldn't block it due to circuit breakers. An extreme example is where we are actually trying to increase and unreasonable low setting for the circuit breaker itself.
See https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+g1gc/242/
The cache relies on the equals() method so we just need to make sure script
queries can never be equals, even to themselves in the case that a weight
is used to produce a Scorer on the same segment multiple times.
Closes#20763
`TcpTransport.ScheduledPing` doesn't handle rejected exceutions gracefully
if the executor is shutting down. This change adds correct exception handling
if we try to schedule another ping while the node is shutting down.
Update scripts might want to update the documents `_timestamp` but need a notion of `now()`.
Painless doesn't support any notion of now() since it would make scripts non-pure functions. Yet,
in the update case this is a valid value and we can pass it with the context together to allow the
script to record the timestamp the document was updated.
Relates to #17895
SynonymQuery was ignored by the FastVectorHighlighter.
This change adds the support for SynonymQuery in the FVH.
Although this change should be implemented in Lucene directly which is why https://issues.apache.org/jira/browse/LUCENE-7484 has been opened.
In the meantime this PR handles the issue on ES side and could be removed when LUCENE-7484 gets merged.
Fixes#20781
* Replace org.elasticsearch.common.lucene.search.MatchNoDocsQuery with its Lucene version (org.apache.lucene.search.MatchNoDocsQuery)
This change removes the ES version of the match no docs query and replaces it with the Lucene version.
relates #18030
* Add missing change
Today we throw an assertion error if we release an AbstractArray more than once.
Yet, it's recommended to implement close methods such that they can be invoked
more than once. Guaranteed single release calls are hard to implement and some
situations might not be tested causing for instance `CircuitBreaker` to operate on
corrupted memory stats.
* Fix match_phrase_prefix query with single term on _all field
This change fixes the match_phrase_prefix query when a single term is queried on the _all field.
It builds a prefix query instead of an AllTermQuery which would not match any prefix.
Fixes#20470
* Add missing change
Mappings treat dots in field names as sub objects, for instance
```
{
"a.b": "c"
}
```
generates the same dynamic mappings as
```
{
"a": {
"b": "c"
}
}
```
Source filtering should be consistent with this behaviour so that an include
list containing `a` should include fields whose name is `a.b`.
To make this change easier, source filtering was refactored to use automata.
The ability to treat dots in field names as sub objects is provided by the
`makeMatchDotsInFieldNames` method of `XContentMapValues`.
Closes#20719
Instead provide services where they are needed. The class worked
well as a temporary measure to easy removal of guice from the index
level but now we can remove it entirely.
-1 @Inject annotation
This commit upgrades the Log4j 2 dependency to version 2.7 and removes
some hacks that we had in place to work around bugs in Log4j 2 version
2.6.2.
Relates #20805
UpdateHelper, MetaDataIndexUpgradeService, and some recovery
stuff.
Move ClusterSettings to nullable ctor parameter of TransportService
so it isn't forgotten.
The `QueryShardContext.failIfFrozen()` and `QueryShardContext.freezeContext()`
methods should be final so that overriding/bypassing the freezing of
`QueryShardContext` is not possible. This is important so that we can
trust when the `QueryShardContext` says a request is cacheable.
This change also makes the methods that call `QueryShardContext.failIfFrozen()`
`final` so they cannot be overridden to bypass setting the request as not
cacheable.
Elasticsearch 1.x used to implicitly round up upper bounds of queries when they
were inclusive so that eg. `[2016-09-18 TO 2016-09-20]` would actually run
`[2016-09-18T00:00:00.000Z TO 2016-09-20T23:59:59.999Z]` and include dates like
`2016-09-20T15:32:44`. This behaviour was lost in the cleanups of #8889.
Closes#20579
The shards preference on a search request enables specifying a list of
shards to hit, and then a secondary preference (e.g., "_primary") can be
added. Today, the separator between the shards list and the secondary
preference is ';'. Unfortunately, this is also a valid separtor for URL
query parameters. This means that a preference like "_shards:0;_primary"
will be parsed into two URL parameters: "_shards:0" and "_primary". With
the recent change to strict URL parsing, the second parameter will be
rejected, "_primary" is not a valid URL parameter on a search
request. This means that this feature has never worked (unless the ';'
is escaped, but no one does that because our docs do not that, and there
was no indication from Elasticsearch that this did not work). This
commit changes the separator to '|'.
Relates #20786
This change proposes the removal of all non-tcp transport implementations. The
mock transport can be used by default to run tests instead of local transport that has
roughly the same performance compared to TCP or at least not noticeably slower.
This is a master only change, deprecation notice in 5.x will be committed as a
separate change.
Previous to this change the DateMathParser accepted a Callable<Long> to use for accessing the now value. The implementations of this callable would fall back on System.currentTimeMillis() if there was no context object provided. This is no longer necessary for two reasons:
We should not fall back to System.currentTimeMillis() as a context should always be provided. This ensures consistency between shards for the now value in all cases
We should use a LongSupplier rather than requiring an implementation of Callable. This means that we can just pass in context::noInMillis for this parameter and not have not implement anything.
This commit improves the shard decision container class in the following
ways:
1. Renames UnassignedShardDecision to ShardAllocationDecision, so that
the class can be used for general shard decisions, not just unassigned
shard decisions.
2. Changes ShardAllocationDecision to have the final decision as a Type
instead of a Decision, because all the information needed from the final
decision is contained in `Type`.
3. Uses cached instances of ShardAllocationDecision for NO and THROTTLE
decisions when no explanation is needed (which is the common case when
executing reroute's as opposed to using the explain API).
Today SearchContext expose the current context as a thread local which makes any kind of sane interface design very very hard. This PR removes the thread local entirely and instead passes the relevant context anywhere needed. This simplifies state management dramatically and will allow for a much leaner SearchContext interface down the road.
As the wise man @ywelsch said: currently when we batch cluster state update tasks by the same executor, we the first task un-queued from the pending task queue. That means that other tasks for the same executor are left in the queue. When those are dequeued, they will trigger another run for the same executor. This can give unfair precedence to future tasks of the same executor, even if they weren't batched in the first run. Take this queue for example (all with equal priority)
```
T1 (executor 1)
T2 (executor 1)
T3 (executor 2)
T4 (executor 2)
T5 (executor 1)
T6 (executor 1)
```
If T1 & T2 are picked up first (when T5 & T6 are not yet queued), one would expect T3 & T4 to run second. However, since T2 is still in the queue, it will trigger execution of T5 & T6.
The fix is easy - ignore processed tasks when extracting them from the queue.
Closes#20768
Currently, update action delegates to index and delete actions
for replication using a dedicated transport action. This change
makes update a replication operation, removing the dedicated
transport action. This simplifies bulk execution and removes
duplicate logic for update retries and translation. This
consolidates the interface for single document write requests.
Now on the primary, the update request is translated to
an index or delete request before execution and the translated
request is sent to copies for replication.
Today it compiles when creating the aggregator, meaning that scripts will be
compiled as many times as there are buckets. Instead it should compile when
creating the factory so that scripts are compiled only once regardless of the
number of buckets.
TRA currently resolves incoming requests to IndexShards in order to acquire operations locks on them. There is no need for all subclasses to have to go through the same IndicesService/IndexService song and dance. Also, doing it once means we don't need to worry about edge cases where the shard is removed while a TRA is in flight.
This commit improves the logic flow of BalancedShardsAllocator in
preparation for separating out components of this class to be used
in the cluster allocation explain APIs. In particular, this commit:
1. Adds a minimum value for the index/shard balance factor settings (0.0)
2. Makes the Balancer data structures immutable and pre-calculated at
construction time.
3. Removes difficult to follow labeled blocks / GOTOs
4. Better logic for skipping over the same replica set when one of
the replicas received a NO decision
5. Separates the decision making logic for a single shard from the logic
to iterate over all unassigned shards.
Before this change the processing of the ranges in the date range (and
other range type) aggregations was done when the Aggregator was created.
This meant that the SearchContext did not know that now had been used in
a range until after the decision to cache was made.
This change moves the processing of the ranges to the aggregation builders
so that the search context is made aware that now has been used before
it decides if the request should be cached
This commit adds a did you mean feature to the strict REST params error
message. This works by comparing any unconsumed parameters to all of the
consumer parameters, comparing the Levenstein distance between those
parameters, and taking any consumed parameters that are close to an
unconsumed parameter as candiates for the did you mean.
* Fix pluralization in strict REST params message
This commit fixes the pluralization in the strict REST parameters error
message so that the word "parameter" is not unconditionally written as
"parameters" even when there is only one unrecognized parameter.
* Strength strict REST params did you mean test
This commit adds an unconsumed parameter that is too far from every
consumed parameter to have any candidate suggestions.
Relates #20747
This commit changes the strict REST parameters message to say that
unconsumed parameters are unrecognized rather than unused. Additionally,
the test is beefed up to include two unused parameters.
Relates #20745
Before this change the processing of the ranges in the date range (and
other range type) aggregations was done when the Aggregator was created.
This meant that the SearchContext did not know that now had been used in
a range until after the decision to cache was made.
This change moves the processing of the ranges to the aggregation builders
so that the search context is made aware that now has been used before
it decides if the request should be cached
This commit adds a did you mean feature to the strict REST params error
message. This works by comparing any unconsumed parameters to all of the
consumer parameters, comparing the Levenstein distance between those
parameters, and taking any consumed parameters that are close to an
unconsumed parameter as candiates for the did you mean.
* Fix pluralization in strict REST params message
This commit fixes the pluralization in the strict REST parameters error
message so that the word "parameter" is not unconditionally written as
"parameters" even when there is only one unrecognized parameter.
* Strength strict REST params did you mean test
This commit adds an unconsumed parameter that is too far from every
consumed parameter to have any candidate suggestions.
Relates #20747
This commit changes the strict REST parameters message to say that
unconsumed parameters are unrecognized rather than unused. Additionally,
the test is beefed up to include two unused parameters.
Relates #20745
Today when parsing a request, Elasticsearch silently ignores incorrect
(including parameters with typos) or unused parameters. This is bad as
it leads to requests having unintended behavior (e.g., if a user hits
the _analyze API and misspell the "tokenizer" then Elasticsearch will
just use the standard analyzer, completely against intentions).
This commit removes lenient URL parameter parsing. The strategy is
simple: when a request is handled and a parameter is touched, we mark it
as such. Before the request is actually executed, we check to ensure
that all parameters have been consumed. If there are remaining
parameters yet to be consumed, we fail the request with a list of the
unconsumed parameters. An exception has to be made for parameters that
format the response (as opposed to controlling the request); for this
case, handlers are able to provide a list of parameters that should be
excluded from tripping the unconsumed parameters check because those
parameters will be used in formatting the response.
Additionally, some inconsistencies between the parameters in the code
and in the docs are corrected.
Relates #20722
Today, the individual allocation deciders appear in random
order when initialized in AllocationDeciders, which means
potentially more performance intensive allocation deciders
could run before less expensive deciders. This adds to the
execution time when a less expensive decider could terminate
the decision making process early with a NO decision. This
commit orders the initialization of allocation deciders,
based on a general assessment of the big O runtime of each
decider, moving the likely more expensive deciders last.
Closes#12815
IndicesClusterStateService and IndicesStore are responsible for synchronizing local shard state based on incoming cluster state updates. On client/tribe nodes, which don't store any such shard/index data/metadata, all of the logic that computes which data is to be deleted, which shards to be initialized etc. can be completely skipped, saving precious CPU cycles.
CompositeIndicesRequest should be implemented by all requests that are composed of multiple subrequests which relate to one or more indices. A composite request is
executed by its own transport action class (e.g. TransportMultiSearchAction for _msearch), which goes through all the subrequests and delegates their execution to the appropriate transport action (e.g. TransportSearchAction for _msearch) for each single item. IndicesAliasesRequest is a particular request as it holds multiple items that implement AliasesRequest, but it shouldn't be considered a composite request, as it has no specific transport action for each of its items. Also, either all of its subitems fail or succeed.
Also clarified javadocs for CompositeIndicesRequest.
We have new icons for elastic products with 5.0. This change updates the
favicon embedded in elasticsearch that users see when using the rest api
through a browser.
When a node get disconnected from the cluster and rejoins during a master election, it may be that the new master already has that node in it's cluster and will try to assign it shards. If the node hosts started primaries, the new shards will be initializing and will have the same allocation id as the allocation ids of the current started size. We currently do not recognize this currently. We should clean the current IndexShard instances and initialize new ones.
This also hardens test assertions in the same area.
This makes geo-distance sorting use `LatLonDocValuesField.newDistanceSort`
whenever applicable, which should be faster that the current approach since it
tracks a bounding box that documents need to be in in order to be competitive
instead of doing a costly distance computation all the time.
Closes#20450
today it's not possible to use date-math efficiently with the `_rollover`
API. This change adds support for date-math in the target index as well as
support for preserving the math logic when an existing index that was created with
a date math expression all subsequent indices are created with the same expression.
Right now our unit tests in that area only simulate indexing single documents. As we go forward it should be easy
to add other actions, like delete & bulk indexing. This commit extracts the common parts of the current indexing
logic to a based class make it easier to extend.
The Setting.timeValue() method uses TimeValue.toString() which can produce fractional time values. These fractional time values cannot be parsed again by the settings framework.
This commit fix a method that still use the .toString() method and replaces it with .getStringRep(). It also changes a second method so that it's not up to the caller to decide which stringify method to call.
closes#20662
This commit fixes a failing cluster settings tests, namely the logger
level update test. The test was incorrectly assuming the default log
level was info, but it could be non-info, for example, if
tests.es.logger.level is set to some non-info level.
Closes#20318