Today, we keep only the last index commit and use only it to calculate
the minimum required translog generation. This may no longer be correct
as we introduced a new deletion policy which keeps multiple index
commits. This change adjusts the CombinedDeletionPolicy so that it can
work correctly with a new index deletion policy.
Relates to #10708, #27367
Today we require users to prepare their indices for split operations.
Yet, we can do this automatically when an index is created which would
make the split feature a much more appealing option since it doesn't have
any 3rd party prerequisites anymore.
This change automatically sets the number of routinng shards such that
an index is guaranteed to be able to split once into twice as many shards.
The number of routing shards is scaled towards the default shard limit per index
such that indices with a smaller amount of shards can be split more often than
larger ones. For instance an index with 1 or 2 shards can be split 10x
(until it approaches 1024 shards) while an index created with 128 shards can only
be split 3x by a factor of 2. Please note this is just a default value and users
can still prepare their indices with `index.number_of_routing_shards` for custom
splitting.
NOTE: this change has an impact on the document distribution since we are changing
the hash space. Documents are still uniformly distributed across all shards but since
we are artificually changing the number of buckets in the consistent hashign space
document might be hashed into different shards compared to previous versions.
This is a 7.0 only change.
This is related to #27260. Currently, basic nio constructs (nio
channels, the channel factories, selector event handlers, etc) implement
logic that is specific to the tcp transport. For example, NioChannel
implements the TcpChannel interface. These nio constructs at some point
will also need to support other protocols (ex: http).
This commit separates the TcpTransport logic from the nio building
blocks.
Add an index level setting `index.mapping.nested_objects.limit` to control
the number of nested json objects that can be in a single document
across all fields. Defaults to 10000.
Throw an error if the number of created nested documents exceed this
limit during the parsing of a document.
Closes#26962
Today we allow exiting solely by being in certain packages. This commit
upgrades the securesm dependency to a new version that supports being
explicit about which classes can exit. We utilize that here to only
allow exiting from the uncaught exception handler and the base CLI
command class.
Relates #27482
Exclude "key" field from random modifications in tests, the composite agg uses
an array of object for bucket key and values are checked.
Relates #26800
When a field is not mapped, Elasticsearch tries to generate a mapping update
from the parsed document. Some documents can introduce corner-cases, for
instance in the event of a multi-valued field whose values would be mapped to
different field types if they were supplied on their own, see for instance:
```
PUT index/doc/1
{
"foo": ["2017-11-10T02:00:01.247Z","bar"]
}
```
In that case, dynamic mappings want to map the first value as a `date` field
and the second one as a `text` field. This currently throws an exception,
which is expected, but the wrong one since it throws a `class_cast_exception`
(which triggers a HTTP 5xx code) when it should throw an
`illegal_argument_exception` (HTTP 4xx).
This change stops indexing the `_primary_term` field for nested documents
to allow fast retrieval of parent documents. Today we create a docvalues
field for children to ensure we have a dense datastructure on disk. Yet,
since we only use the primary term to tie-break on when we see the same
seqID on indexing having a dense datastructure is less important. We can
use this now to improve the nested docs performance and it's memory footprint.
Relates to #24362
This change removes the module named aggs-composite and adds the `composite` aggs
as a core aggregation. This allows other plugins to use this new aggregation
and simplifies the integration in the HL rest client.
While we have an assertion that checks if the number of routing shards is a multiple
of the number of shards we need a real hard exception that checks this way earlier.
This change adds a check and test that is executed before we create the index.
Relates to #26931
Today Cross Cluster Search requires at least one node in each remote cluster to be up once the cross cluster search is run. Otherwise the whole search request fails despite some of the data (either local and/or remote) is available. This happens when performing the _search/shards calls to find out which remote shards the query has to be executed on. This scenario is different from shard failures that may happen later on when the query is actually executed, in case e.g. remote shards are missing, which is not going to fail the whole request but rather yield partial results, and the _shards section in the response will indicate that.
This commit introduces a boolean setting per cluster called search.remote.$cluster_alias.skip_if_disconnected, set to false by default, which allows to skip certain clusters if they are down when trying to reach them through a cross cluster search requests. By default all clusters are mandatory.
Scroll requests support such setting too when they are first initiated (first search request with scroll parameter), but subsequent scroll rounds (_search/scroll endpoint) will fail if some of the remote clusters went down meanwhile.
The search API response contains now a new _clusters section, similar to the _shards section, that gets returned whenever one or more clusters were disconnected and got skipped:
"_clusters" : {
"total" : 3,
"successful" : 2,
"skipped" : 1
}
Such section won't be part of the response if no clusters have been skipped.
The per cluster skip_unavailable setting value has also been added to the output of the remote/info API.
This commit moves an assertion that some guard code that will eventually
be dead code in the resync replication request read serialization is
removed when the master branch is bumped to version 8.0.0.
This commit addresses a subtle bug in the serialization routine for
resync requests. The problem here is that Translog.Operation#readType is
not compatible with the implementations of
Translog.Operation#writeTo. Unfortunately, this issue prevents
primary-replica from succeeding, issues which we will address in
follow-ups.
Relates #27418
This is related to #27422. Right now when we send a write to the netty
transport, we attach a listener to the future. When you submit a write
on the netty event loop and the event loop is shutdown, the onFailure
method is called. Unfortunately, netty then tries to notify the listener
which cannot be done without dispatching to the event loop. In this
case, the dispatch fails and netty logs and error and does not tell us.
This commit checks that netty is still not shutdown after sending a
message. If netty is shutdown, we complete the listener.
This is related to #27260. Currently every nio channel has a profile
field. Profile is a concept that only relates to the tcp transport. Http
channels will not have profiles. This commit moves the profile from the
nio channel to the read context. The context is the level that protocol
specific features and logic should live.
Currently we use ActionListener<TcpChannel> for connect, close, and send
message listeners in TcpTransport. However, all of the listeners have to
capture a reference to a channel in the case of the exception api being
called. This commit changes these listeners to be type <Void> as passing
the channel to onResponse is not necessary. Additionally, this change
makes it easier to integrate with low level transports (which use
different implementations of TcpChannel).
Today we index dummy values for seq_ids and version on nested documents.
This is on the one hand trappy since users can request these values via
inner hits and on the other hand not necessarily good for compression since
the dummy value will likely not compress well when seqIDs are lowish.
This change ensures that we share the same field values for all documents in a
nested block. This won't have any overhead, in-fact it might be more efficient since
we even reduce the work needed slightly.
This commit removes the ability to use ${prompt.secret} and
${prompt.text} as valid config settings. Secure settings has obsoleted
the need for this, and it cleans up some of the code in Bootstrap.
Projects the depend on the CLI currently depend on core. This should not
always be the case. The EnvironmentAwareCommand will remain in :core,
but the rest of the CLI components have been moved into their own
subproject of :core, :core:cli.
This commit corrects a word usage error in the getting started
docs. Since pronunciation is what determines when to use either "a" or
"an" and the word "ubiquitous" is pronounced /yo͞oˈbikwədəs/, it should
be preceded by "a."
Relates #27420
When the Elasticsearch code is loaded in an unusual classloading
environment (e.g., when using the high-level REST client) in Jetty, the
code source can be null and we trip with an NPE. This commit addresses
this.
Relates #27442
We introduced a new snapshot status update handler in 6.1.0. We will
keep the old handler along with this new one in all 6.x. This commit
removes the old handler from 7.0.
Relates #27151
This is related to #27260. Currently, every ESSelector keeps track of
all channels that are registered with it. ESSelector is just an
abstraction over a raw java nio selector. The java nio selector already
tracks its own selection keys. This commit removes our tracking and
relies on the java nio selector tracking.