This commit adds the node name to the names of thread pool executors so
that the node name is visible in rejected execution exception messages.
Relates #27663
The main constructor for rejected execution exception its executor
shutdown constructor parameter to the super constructor where it would
be used as a formatting parameter. This is a mistake so this commit
fixes this issue.
In the global checkpoint sync action, we fsync the translog. However,
the last synced global checkpoint might already be equal to the current
global checkpoint in which case the fsyncing the translog is unnecessary
as either the sync needed guard in the translog will skip the translog,
or the translog needs an fsync for another reason that will be picked up
elsewhere (e.g., at the end of a bulk request).
Relates #27652
The hashCode contract states that equal objects must have equal hash
codes, however the unequal objects are not required to have unequal
hashCodes.
This commit rewrites GeoPointParsingTests#testEqualsHashCodeContract
using#checkEqualsAndHashCode helper.
Closes#27633
* Fix highlighting on a keyword field that defines a normalizer
The `plain` and sometimes the `unified` highlighters need to re-analyze the content to highlight a field
This change makes sure that we don't ignore the normalizer defined on the keyword field for this analysis.
After write operations in some situations we fire a post-operation
global checkpoint sync. The global checkpoint sync unconditionally
fsyncs the translog and this can then look like an fsync
per-request. This violates the translog durability settings on the index
if this durability is set to async. This commit changes the global
checkpoint sync to observe the translog durability.
Relates #27641
Today we exclude internal refreshes in the refresh stats. Yet, it's very much
confusing to not take these into account. This change includes internal refreshes
into the stats until we have a dedicated stats for this.
This new snapshot mostly brings a change to TopFieldCollector which can now
early terminate collection when trackTotalHits is `false`.
As a follow-up, we should replace our usage of
`EarlyTerminatingSortingCollector` with this new option.
Today, we maintain two sets in a SeqNoSet: ongoing sets and completed
sets. We can remove the completed sets and use only the ongoing sets by
releasing the internal bitset of a CountedBitSet when all its bits are
set. This behaves like two sets but simpler. This commit also makes
CountedBitSet as a drop-in replacement for BitSet.
Relates #27268
* Add accounting circuit breaker and track segment memory usage
This commit adds a new circuit breaker "accounting" that is used for tracking
the memory usage of non-request-tied memory users. It also adds tracking for the
amount of Lucene segment memory used by a shard as a user of the new circuit
breaker.
The Lucene segment memory is updated when the shard refreshes, and removed when
the shard relocates away from a node or is deleted. It should also be noted that
all tracking for segment memory uses `addWithoutBreaking` so as not to fail the
shard if a limit is reached.
The `accounting` breaker has a default limit of 100% and will contribute to the
parent breaker limit.
Resolves#27044
Today we carry on the size of the live version map to ensure that
we minimze rehashing. Yet, once we are idle or we can issue a sync-commit
we can resize it to defaults to free up memory.
Relates to #27516
Once a shard goes inactive we want the shard to be refreshed if
the refresh interval is default since we might hold on to unnecessary
segments and in the inactive case we stopped indexing and can release
old segments.
Relates to #27500
Add an index level setting `index.analyze.max_token_count` to control
the number of generated tokens in the _analyze endpoint.
Defaults to 10000.
Throw an error if the number of generated tokens exceeds this limit.
Closes#27038
The ChecksumBlobStoreFormat.writeAtomic() method writes a blob using a
temporary name and then moves the blob to its final name. The move
operation can fail and in this case the temporary blob is deleted. If
this delete operation also fails, then the initial exception is lost.
This commit ensures that when something goes wrong during the move
operation the initial exception is kept and thrown, and if the delete
operation also fails then this additional exception is added
as a suppressed exception to the initial one.
Today when configuring the data paths for the environment, we set data
paths to either the specified path.data or default to data relative to
the Elasticsearch home. Yet if node.local_storage is false, data paths
do not even make sense. In this case, we should reject if path.data is
set, and instead of defaulting data paths to data relative to home, we
should set this to empty paths. This commit does this.
Relates #27587
today a refresh listener won't preserve the entire context ie. won't carry
on response headers etc. from the caller side. This change adds support for
stored contexts.
Today we only expose the external readers segments. Yet, from a statistics
perspective both internal and external segments are relevant. This commit
exposes the additional segments of the internal and external reader respectively.
A compressible bytes output stream is a stream output which supports a
reset method. However, compressible bytes output streams are unusual in
that the current implementation sometimes supports a reset (if the
stream is not compressed) and sometimes does not support a rest (if the
stream is compressed). This inconsistent behavior is puzzling and
instead we should simply always throw an unsupported operation
exception.
Relates #27564
The GlobalOrdinalsStringTermsAggregator.LowCardinality aggregator casts global
values to `GlobalOrdinalMapping`, even though the implementation of global
values is different when a `missing` value is configured.
This commit adds a new API that gives access to the ordinal remapping in order
to fix this problem.
The main highlight of this new snapshot is that it introduces the opportunity
for queries to opt out of caching. In case a query opts out of caching, not only
will it never be cached, but also no compound query that wraps it will be
cached.
Also include _type and _id for parent/child hits inside inner hits.
In the case of top_hits aggregation the nested search hits are
directly returned and are not grouped by a root or parent document, so
it is important to include the _id and _index attributes in order to know
to what documents these nested search hits belong to.
Closes#27053
Method TruncateTranslogIT#corruptTranslogFiles corrupts some random
existing *.tlog files in a translog directory. However, this may not
actually corrupt translog at all if it corrupts only tlog files which
are not referenced by the Checkpoint (eg. their translog generations are
smaller the Checkpoint).
This commit makes sure that we corrupt some tlog files which are
referenced by the Checkpoint.
Closes#27538
Running with the all permission java.security.AllPermission granted is
equivalent to disabling the security manager. This commit adds a
bootstrap check that forbids running with this permission granted.
Relates #27548
Compressible bytes output stream swallows exceptions that occur when
closing. This commit changes this behavior so that such exceptions
bubble up.
Relates #27542
Today we refresh automatically in the background by default very second.
This default behavior has a significant impact on indexing performance
if the refreshes are not needed.
This change introduces a notion of a shard being `search idle` which a
shard transitions to after (default) `30s` without any access to an
external searcher. Once a shard is search idle all scheduled refreshes
will be skipped unless there are any refresh listeners registered.
If a search happens on a `serach idle` shard the search request _park_
on a refresh listener and will be executed once the next scheduled refresh
occurs. This will also turn the shard into the `non-idle` state immediately.
This behavior is only applied if there is no explicit refresh interval set.