* Fix NPE on FieldStats with mixed cluster on version pre/post 5.2
In 5.2 the FieldStats API can return null min/max values.
These values cannot be deserialized by a node with version pre 5.2, so if such a node
is picked to coordinate a FieldStats request in a mixed cluster, an NPE can be thrown.
This change prevents the NPE by removing the non-serializable FieldStats object directly in the field stats shard request.
The filtered fields will not be present in the response when a node pre 5.2 acts as a coordinating node.
This change is a simple adaptation of https://github.com/elastic/elasticsearch/pull/19587 for the current state of master.
It allows search response listeners to be defined in the form of `BiConsumer<SearchRequest, SearchResponse>`s in a search plugin.
This PR removes all leniency in the conversion of Strings to booleans: "true"
is converted to the boolean value `true`, "false" is converted to the boolean
value `false`. Everything else raises an error.
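As a rough illustration of the strict behaviour described above, here is a minimal sketch. The class name `StrictBooleans`, the method `parse`, and the error message text are all hypothetical and are not taken from the actual change.

```java
// Hypothetical sketch of strict String-to-boolean conversion.
final class StrictBooleans {
    private StrictBooleans() {}

    /** Accepts only the exact strings "true" and "false"; everything else is an error. */
    static boolean parse(String value) {
        if ("true".equals(value)) {
            return true;
        }
        if ("false".equals(value)) {
            return false;
        }
        // No leniency: "yes", "on", "1", "TRUE", null, etc. are all rejected.
        throw new IllegalArgumentException(
            "Failed to parse value [" + value + "]: only [true] or [false] are allowed");
    }

    public static void main(String[] args) {
        System.out.println(parse("true"));
        System.out.println(parse("false"));
        try {
            parse("yes");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```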
Changes the error message when `action.auto_create_index` or
`index.mapper.dynamic` forbids automatic creation of an index
from `no such index` to one of:
* `no such index and [action.auto_create_index] is [false]`
* `no such index and [index.mapper.dynamic] is [false]`
* `no such index and [action.auto_create_index] contains [-<pattern>] which forbids automatic creation of the index`
* `no such index and [action.auto_create_index] ([all patterns]) doesn't match`
This should make it more clear *why* there is `no such index`.
Closes #22435
testAckedIndexing now waits for all nodes to stabilize in the cluster
state through an assertBusy before final validation that all documents
are found in their respective shards in the cluster. Previously, the
ensureGreen check could pass and only afterwards would the master
process a ping failure from the network disruption, thereby rendering
the cluster RED again. This assertBusy waits up to 30
seconds for all nodes to have stabilized and all get document actions to
succeed.
Today we do not preserve response headers if they are present on a transport protocol
response. While preserving these headers is not always desired, in most cases we
should pass on these headers to have consistent results for deprecation headers etc.
Yet, this hasn't been much of a problem since most of the deprecations are detected early,
i.e. on the coordinating node, such that this bug wasn't uncovered until #22647.
This commit allows headers to be optionally preserved when a context is restored and also streamlines
the context restore, since it frequently leaked into the caller's thread context when the caller's
context wasn't restored again.
Relates to #22024
On top of documentation, the PR adds deprecation loggers and deals with the resulting warning headers.
The yaml test is set to exclude versions up to 6.0. This is needed to make sure bwc tests pass until this is backported to 5.2.0. Once that's done, I will change the yaml test version limits.
This change makes it possible for custom routing values to go to a subset of shards rather than
just a single shard. This enables the ability to utilize the spatial locality that custom routing can
provide while mitigating the likelihood of ending up with an imbalanced cluster or suffering
from a hot shard.
This is ideal for large multi-tenant indices with custom routing that suffer from one or both of
the following:
- The big tenants cannot fit into a single shard or there are so many of them that they will likely
end up on the same shard
- Tenants often have a surge in write traffic and a single shard cannot process it fast enough
Beyond that, this should also be useful for use cases where most queries are done under the context
of a specific field (e.g. a category) since it gives a hint at how the data can be stored to minimize
the number of shards to check per query. While a similar solution can be achieved with multiple
concrete indices or aliases per value today, those approaches break down for high cardinality fields.
A partitioned index enforces that mappings have routing required, that the partition size does not
change when shrinking an index (the partitions will shrink proportionally), and rejects mappings
that have parent/child relationships.
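One way to picture routing to a subset of shards: hash the routing value to pick a base shard, then use the document id to pick an offset within the partition. The sketch below assumes that scheme, which is not spelled out in this description. The names `hash`, `shardFor`, and `shardsFor` are hypothetical, and `String#hashCode` is only a stand-in for whatever hash function the real code uses.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of partitioned routing; not the actual implementation.
final class PartitionedRoutingSketch {
    private PartitionedRoutingSketch() {}

    /** Stand-in hash (assumption); the real code may use a different function. */
    static int hash(String s) {
        return s.hashCode();
    }

    /** Shard for one document: base shard from the routing value, offset from the id. */
    static int shardFor(String routing, String id, int partitionSize, int numShards) {
        long offset = Math.floorMod(hash(id), partitionSize);
        // long arithmetic avoids int overflow when adding the offset to the hash
        return (int) Math.floorMod((long) hash(routing) + offset, (long) numShards);
    }

    /** All shards a query with this routing value has to visit. */
    static Set<Integer> shardsFor(String routing, int partitionSize, int numShards) {
        Set<Integer> shards = new TreeSet<>();
        for (int offset = 0; offset < partitionSize; offset++) {
            shards.add((int) Math.floorMod((long) hash(routing) + offset, (long) numShards));
        }
        return shards;
    }

    public static void main(String[] args) {
        // A tenant spread over 3 of 10 shards instead of a single shard.
        System.out.println(shardsFor("tenant-42", 3, 10));
        System.out.println(shardFor("tenant-42", "doc-1", 3, 10));
    }
}
```

With a partition size of 1 this degenerates to classic custom routing (one shard per routing value); larger partition sizes trade some locality for better balance.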
Closes #21585
Instead of forcing each task to register all nodes where its children are running, this commit runs cancellation on all nodes. The task cancellation operation doesn't run too frequently, so this optimization doesn't seem to be worth the additional complexity of the interface.
Previously, certain settings that could take multiple comma delimited
values would pick up incorrect values for all entries but the first if
each comma separated value was followed by a whitespace character. For
example, the multi-value "A,B,C" would be correctly parsed as
["A", "B", "C"] but the multi-value "A, B, C" would be incorrectly parsed
as ["A", " B", " C"].
This commit allows a comma separated list to have whitespace characters
after each entry. The specific settings that were affected by this are:
cluster.routing.allocation.awareness.attributes
index.routing.allocation.require.*
index.routing.allocation.include.*
index.routing.allocation.exclude.*
cluster.routing.allocation.require.*
cluster.routing.allocation.include.*
cluster.routing.allocation.exclude.*
http.cors.allow-methods
http.cors.allow-headers
For the allocation filtering related settings, this commit also provides
validation of each specified entry if the filtering is done by _ip,
_host_ip, or _publish_ip, to ensure that each entry is a valid IP
address.
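The whitespace handling can be sketched as follows. This is a simplified illustration and not the code from the commit: `CommaListSketch` and `parse` are hypothetical names, and dropping empty entries is a choice made for this sketch only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: split a comma separated setting value and trim each entry.
final class CommaListSketch {
    private CommaListSketch() {}

    static List<String> parse(String raw) {
        List<String> values = new ArrayList<>();
        for (String part : raw.split(",")) {
            String trimmed = part.trim(); // "A, B, C" yields "B", not " B"
            if (!trimmed.isEmpty()) {     // sketch-only choice: skip empty entries
                values.add(trimmed);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(parse("A,B,C"));
        System.out.println(parse("A, B, C"));
    }
}
```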
Closes #22297
This commit tries to simplify the way ElasticsearchExceptions are rendered to xcontent. It adds some documentation and renames and merges some methods. Current behavior is preserved; the goal is to be more readable and to centralize everything in the ElasticsearchException class.
`EngineClosedException` is an ES-level exception that is used to indicate that the engine is closed when an operation starts. It doesn't really add much value and we can use `AlreadyClosedException` from Lucene (which may already bubble up if things go wrong during operations). Having two exceptions can just add confusion and lead to bugs, like wrong handling of `EngineClosedException` when dealing with document level failures. The latter was exposed by `IndexWithShadowReplicasIT`.
This PR also removes the AwaitsFix from the `IndexWithShadowReplicasIT` tests (which was what caused this to be discovered). While debugging the source of the issue I found some mismatches in document uid management in the tests. The term that was passed to the engine didn't correspond to the uid in the parsed doc; those are fixed as well.
Today we have quite some abstractions that are essentially providing a simple
dispatch method to the plugins defining a `HttpServerTransport`. This commit
removes `HttpServer` and `HttpServerAdaptor` and introduces a simple `Dispatcher` functional
interface that delegates to `RestController` by default.
Relates to #18482
#22025 deprecated this setting (pending its removal) but its frequent usage will spam the deprecation logs and also fail tests. As a temporary workaround we should not use the setting object directly.
Currently both ProfileResult and CollectorResult print the time field in a human readable string format
(e.g. "time": "55.20315000ms"). When trying to parse this back to a long value, for example to use in
the planned high level java rest client, we can lose precision because of conversion and rounding issues.
This change adds a new additional field (`time_in_nanos`) to the profile response to be able to get the
original time value in nanoseconds back.
The old `time` field is only printed when the `?human=true` flag in the URL is set. This follows the behaviour of
all other stats-related APIs. Also the format of the `time` field is slightly changed. Instead of always formatting
the output as a 10-digit ms value, by using the `XContentBuilder#timeValueField()` method we now print
the value using the largest time unit present (e.g. "s", "ms", "micros").
An operation that completed successfully on a primary can result in a
version conflict on a replica due to the asynchronous nature of
operations. When a replica operation results in a version conflict, the
operation is not added to the translog. This leads to gaps in the
translog which is problematic as it can lead to situations where a
replica shard can never advance its local checkpoint. As such operations
are just the normal course of business for a replica shard, these operations
should be treated as if they completed successfully. This commit adds
these operations to the translog.
Relates #22626
For certain situations, end-users need the base path for Elasticsearch
logs. Exposing this as a property is better than hard-coding the path
into the logging configuration file as otherwise the logging
configuration file could easily diverge from the Elasticsearch
configuration file. Additionally, Elasticsearch will only have
permissions to write to the log directory configured in the
Elasticsearch configuration file. This commit adds a property that
exposes this base path.
One use-case for this is configuring a rollover strategy to retain logs
for a certain period of time. As such, we add an example of this to the
documentation.
Additionally, we expose the property es.logs.cluster_name as this is
used as the name of the log files in the default configuration.
Finally, we expose es.logs.node_name when node.name is
explicitly set, in case users want to include the node name as part of
the name of the log files.
Relates #22625
When logger.level is set, we end up configuring a logger named "level"
because we look for all settings of the form "logger\..+" as configuring
a logger. Yet, logger.level is special and is meant to only configure
the default logging level. This commit causes us to avoid
configuring a logger named level.
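The special case can be pictured roughly like this. The sketch is a simplified stand-in for the real settings handling: `LoggerSettingsSketch` and `namedLoggerLevels` are hypothetical names, and plain maps replace the actual settings classes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: collect per-logger levels while treating logger.level as special.
final class LoggerSettingsSketch {
    static final String PREFIX = "logger.";
    static final String DEFAULT_LEVEL_KEY = "logger.level";

    private LoggerSettingsSketch() {}

    /** Returns logger name -> level for every "logger.<name>" setting except logger.level. */
    static Map<String, String> namedLoggerLevels(Map<String, String> settings) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : settings.entrySet()) {
            String key = entry.getKey();
            if (!key.startsWith(PREFIX) || key.length() == PREFIX.length()) {
                continue; // not a logger setting
            }
            if (key.equals(DEFAULT_LEVEL_KEY)) {
                continue; // default logging level, not a logger named "level"
            }
            result.put(key.substring(PREFIX.length()), entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("logger.level", "debug");
        settings.put("logger.org.elasticsearch.transport", "trace");
        System.out.println(namedLoggerLevels(settings));
    }
}
```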
Relates #22624
The IndexingOperationListener interface did not provide any
information about the shard id when a document was indexed.
This commit adds the shard id as the first parameter to all methods
in the IndexingOperationListener.
This commit is a simple cleanup of the code related to cgroup stats:
- reduce visibility of a method
- remove an unneeded logger guard
- cleanup the formatting of comments
TransportInterceptors are commonly used to enrich requests with headers etc.
which requires access to the thread context. This is not always easily possible
since threadpools are hard to access, for instance if the interceptor is used on a transport client.
This commit passes on the thread context to all the interceptors for further consumption.
Closes #22585
Deleting indices is an important event in a cluster and as such should
be logged at the info level. This commit changes the logging level on
index deletion to the info level.
Relates #22627
We have made the security manager non-optional, but the Javadocs for
Security.java imply that it still is. This commit fixes this issue.
Relates #16176
ClusterService and TransportService expect the local discovery node to be set
before they are started but this requires manual interaction and is error prone since
to work correctly they should share the same instance (same ephemeral ID).
TransportService also has 2 modes of operation, mainly related to transport client vs. internal
to a node. This change removes the mode where we don't maintain a local node and uses a dummy local
node in the transport client since we don't bind to any port in such a case.
Local discovery node instances are now managed by the node itself and only suppliers and factories that allow
creation only once are passed to TransportService and ClusterService.
There was still a small race in MockTcpTransport where channels that are concurrently
closing are not yet removed from the reference tracking, causing tests to fail. Compared to
the other races before, this is a rather small window and requires very short test durations.
```h
$ bin/elasticsearch-keystore create
Created elasticsearch keystore in /Users/dpilato/Documents/Elasticsearch/apps/elasticsearch/elasticsearch-6.0.0-alpha1/config
$ bin/elasticsearch-keystore add
Enter value for null: xyz
Exception in thread "main" java.lang.NullPointerException: invalid null input
at java.security.KeyStore.setEntry(KeyStore.java:1552)
at org.elasticsearch.common.settings.KeyStoreWrapper.setString(KeyStoreWrapper.java:264)
at org.elasticsearch.common.settings.AddStringKeyStoreCommand.execute(AddStringKeyStoreCommand.java:83)
at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:58)
at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:122)
at org.elasticsearch.cli.MultiCommand.execute(MultiCommand.java:69)
at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:122)
at org.elasticsearch.cli.Command.main(Command.java:88)
at org.elasticsearch.common.settings.KeyStoreCli.main(KeyStoreCli.java:39)
```
Today there are several races / holes in TcpTransport and MockTcpTransport
that can allow connections to be opened and remain unclosed while the actual
transport implementation is closed. Recently added assertions in #22554 expose
these problems. This commit fixes several issues related to missed locks or channel
creations outside of a lock not checking if the resource is still open.
This change disables the _all meta field by default.
Now that we have the "all-fields" method of query execution, we can save both
indexing time and disk space by disabling it.
_all can no longer be configured for indices created after 6.0.
Relates to #20925 and #21341
Resolves #19784
TcpTransport has an actual mechanism to stop resources in subclasses.
Instead of overriding `doStop` subclasses should override `stopInternal`
that is executed under the connection lock guaranteeing that there is no
concurrency etc.
Relates to #22554
* Settings: Make s3 repository sensitive settings use secure settings
This change converts repository-s3 to use the new secure settings. In
order to support the multiple ways we allow aws creds to be configured,
it also moves the main methods for the keystore wrapper into a
SecureSettings interface, in order to allow settings prefixing to work.
* Fix Translog.Delete serialization for sequence numbers
Translog.Delete used `.writeVLong` instead of `.writeLong` for the sequence
number and primary term (and their respective "read" variants). This could lead
to issues where a 5.x node sent a translog operation with a negative sequence
number (-2 for unassigned seq no) that tripped an assertion serializing a
negative number and causing ES to exit.
Adds a unit test for serialization and a mixed-cluster REST test, since that was
how this was originally caught.
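To illustrate why a variable-length encoding is a poor fit for values that can be negative, here is a small self-contained sketch. It is not the Elasticsearch stream code: `VLongSketch`, `writeVLong`, and `writeLong` are hypothetical helpers, and an exception stands in for the assertion mentioned above.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

// Hypothetical sketch contrasting a variable-length and a fixed-width long encoding.
final class VLongSketch {
    private VLongSketch() {}

    /** Variable-length encoding, 7 bits per byte; only defined here for non-negative values. */
    static byte[] writeVLong(long value) {
        if (value < 0) {
            throw new IllegalArgumentException("cannot write negative value [" + value + "] as a vlong");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80)); // low 7 bits plus continuation bit
            value >>>= 7;
        }
        out.write((int) value);
        return out.toByteArray();
    }

    /** Fixed-width encoding: always 8 bytes, works for any long including negatives. */
    static byte[] writeLong(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    public static void main(String[] args) {
        long unassigned = -2L; // the unassigned sequence number mentioned above
        System.out.println("fixed-width bytes: " + writeLong(unassigned).length);
        try {
            writeVLong(unassigned);
        } catch (IllegalArgumentException e) {
            System.out.println("vlong rejected: " + e.getMessage());
        }
    }
}
```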
* Use more realistic values for random seqNum and primary term
* Add comment with TODO for removal in 7.0
* Change comment into an assert
The low level TCP handshake can cause channel / connection leaks if it's interrupted
since the caller doesn't close the channel / connection if the handshake was not successful.
This commit fixes the channel leak and adds general test infrastructure to detect channel leaks
in the future.
Instead of `search.remote.seeds.${clustername}` we now specify the seeds as:
`search.remote.${clustername}.seeds` which is a real list setting compared to an unvalidated
group setting before.
Today affix settings are not dynamic since it's required to know
their namespace in order to pull a concrete setting from them. This is not possible
in practice since the namespaces are dynamic by design. This change allows registering
a specialized settings consumer that consumes the namespace and the actual value if
a setting gets updated.
Moves fetching the local node id into `NodeClient` which is a
fairly useful place to put it so you can generate task ids from
`NodeClient#executeLocally`.
This commit adds the parsing fromXContent() methods to the IndexResponse class. The method is based on an ObjectParser because it is easier to use when parsing parent abstract classes like DocWriteResponse.
It also changes the ReplicationResponse.ShardInfo so that it now implements ToXContentObject. This way, the ShardInfo.fromXContent() method can be used by the IndexResponse's ObjectParser.
Previously, we removed all unneeded backward compatibility logic
from the BlobStoreRepository because 6.0 does not need to support
2.x snapshot formats. During the process of removing this backward
compatibility logic, some code was leftover that is no longer
necessary. This commit removes all the remaining unnecessary
backwards compatibility code in BlobStoreRepository.
It is no longer needed. It used to contain a lot of strings
used by serialization but those have since been removed. Now
it is just another thing to pass around that we don't really
need.
Affix settings are useful to namespace a certain setting. Yet, affix settings
must be specialized for their concrete type, which causes a lot of code duplication.
This commit allows reusing an existing setting with an affix setting as soon as
a concrete key is available.
One needs to close the higher level objects (like UnicastZenPing) before closing the transport service. The latter can trip assertions w.r.t. open connections.
This adds methods to parse InternalSearchHit and InternalSearchHits from their
xContent representation. Most of the information in the original object is
preserved when rendering the object to xContent and then parsing it back.
However, some pieces of information are lost which we currently cannot parse
back from the rest response, most notably:
* the "match" property of the lucene explanation is not rendered in the
"_explain" section and cannot be reconstructed on the client side
* the original "shard" information (SearchShardTarget) is only rendered if the
"explanation" is also set, also we lose the indexUUID of the contained
ShardId because we don't write it out. As a replacement we can use
ClusterState.UNKNOWN_UUID on the receiving side
The NodeConnectionsService currently determines which nodes to connect to / disconnect from by inspecting cluster state changes and connecting to added nodes / disconnecting from removed nodes. However, when a master steps down (for example due to another master-eligible node shutting down, which brings the number of master-eligible nodes below minimum_master_nodes) and the connection to other existing nodes was dropped while pinging, the connection to these nodes is not re-established while publishing the first cluster state that establishes the node as master.
This commit changes the NodeConnectionsService connect / disconnect logic to always rely on the state that is to be / was published, looking not only at the added / removed nodes, but validating that exactly all nodes that are currently registered in NodeConnectionsService are connected (corresponds to a NOOP if the node is already connected).
The document in the randomized GetResult can exist with no source (like if the _source was disabled in mappings), that's why the test should not always expect a non null source when the doc exists.
* Promote longs to doubles when a terms agg mixes decimal and non-decimal number
This change makes the terms aggregation work when the buckets coming from different indices are a mix of decimal numbers and non-decimal numbers. In this case non-decimal numbers (longs) are promoted to decimal (double), which can result in a loss of precision for big numbers.
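The promotion and its precision caveat can be illustrated with a short sketch. `PromotionSketch` and `unify` are hypothetical names and do not reflect how the aggregation code is actually structured.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: promote long keys to double when a key list mixes both kinds.
final class PromotionSketch {
    private PromotionSketch() {}

    static List<Number> unify(List<Number> keys) {
        boolean anyDecimal = false;
        for (Number key : keys) {
            if (key instanceof Double || key instanceof Float) {
                anyDecimal = true;
                break;
            }
        }
        if (!anyDecimal) {
            return keys; // all non-decimal: nothing to promote
        }
        List<Number> promoted = new ArrayList<>();
        for (Number key : keys) {
            promoted.add(key.doubleValue()); // longs above 2^53 may lose precision here
        }
        return promoted;
    }

    public static void main(String[] args) {
        long big = (1L << 53) + 1; // not exactly representable as a double
        List<Number> mixed = unify(Arrays.<Number>asList(big, 0.5));
        System.out.println(big + " -> " + mixed.get(0).longValue());
    }
}
```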
Fixes #22232
There is a bug in the error message that is thrown if the number of docs differs between the source and target shards when recovering a shard with a syncId. The source and target doc counts are swapped around.
Closes #21893
Removes `AggregatorParsers`, replacing all of its functionality with
`XContentParser#namedObject`.
This is the third bit of payoff from #22003, one less thing to pass
around the entire application.
The test pinged and waited for the ping results to be returned, but since we first return the result and then close temporary connections, assertions that expect all connections to be closed by the end of the test are tripped.
Closes #22497
This commit checks for a null BytesReference as the value for `source`
in GetResult#sourceRef and simply returns null. Previously this would
have resulted in an NPE. While this does seem internal at first glance, it can affect
user code as a GetResponse could trigger this when the document is missing.
Additionally, the CompressorFactory#uncompressIfNeeded now requires a
non-null argument.
The recovery process started during primary relocation of shadow replicas accesses the engine on the source shard after it's been closed, which results in the source shard failing itself.
Right now closing a shard looks like it strands refresh listeners,
causing tests like
`delete/50_refresh/refresh=wait_for waits until changes are visible in search`
to fail. Here is a build that fails:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+multi_cluster_search+multijob-darwin-compatibility/4/console
This attempts to fix the problem by implementing `Closeable` on
`RefreshListeners` and rejecting listeners when closed. More importantly
the act of closing the instance flushes all pending listeners
so we shouldn't have any stranded listeners on close.
Because it was needed for testing, this also adds the number of
pending listeners to the `CommonStats` object and all APIs to which
that flows: `_cat/nodes`, `_cat/indices`, `_cat/shards`, and
`_nodes/stats`.
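The close behaviour described above (reject new listeners once closed, flush pending ones on close) can be sketched as follows. This is not the actual `RefreshListeners` class: `ClosableListenersSketch` is a hypothetical stand-in, and the `Boolean` passed to each listener is an invented flag meaning "fired by a real refresh".

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of a listener registry that never strands listeners on close.
final class ClosableListenersSketch implements Closeable {
    private List<Consumer<Boolean>> pending = new ArrayList<>();
    private boolean closed = false;

    /** Registers a listener, or rejects it if the registry is already closed. */
    synchronized void add(Consumer<Boolean> listener) {
        if (closed) {
            throw new IllegalStateException("cannot add a listener to a closed registry");
        }
        pending.add(listener);
    }

    /** Number of listeners still waiting; the kind of value a stats API could expose. */
    synchronized int pendingCount() {
        return pending.size();
    }

    @Override
    public void close() {
        List<Consumer<Boolean>> toFire;
        synchronized (this) {
            closed = true;
            toFire = pending;
            pending = new ArrayList<>();
        }
        // Fire outside the lock so listeners cannot deadlock against add().
        for (Consumer<Boolean> listener : toFire) {
            listener.accept(false); // false: flushed by close, not by a refresh
        }
    }
}
```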
In pre 2.x versions, if the repository was set to compress snapshots,
then snapshots would be compressed with the LZF algorithm. In 5.x,
Elasticsearch no longer supports the LZF compression algorithm. This
presents an issue when retrieving snapshots in a repository or upgrading
repository data to the 5.x version, because Elasticsearch throws an
exception when it tries to read the snapshot metadata because it was
compressed using LZF.
This commit gracefully handles the situation by introducing a new
incompatible-snapshots blob to the repository. For any pre-2.x snapshot
that cannot be read, that snapshot is removed from the list of active
snapshots, because the snapshot could not be restored anyway. Instead,
the snapshot is recorded in the incompatible-snapshots blob. When
listing snapshots, both active snapshots and incompatible snapshots will
be listed, with incompatible snapshots showing an `INCOMPATIBLE` state.
Any attempt to restore an incompatible snapshot will result in an
exception.
`ToXContentObject` extends `ToXContent` without adding new methods to it, while allowing to mark classes that output complete xcontent objects to distinguish them from classes that require starting and ending an anonymous object externally.
Ideally ToXContent would be renamed to ToXContentFragment, but that would be a huge change in our codebase, hence we simply document the fact that toXContent outputs fragments with no guarantees that the output is valid per se without an external ancestor.
Relates to #16347