OpenSearch/docs/reference/release-notes/highlights-7.7.0.asciidoc

192 lines
7.9 KiB
Plaintext
Raw Normal View History

[[release-highlights-7.7.0]]
== 7.7.0 release highlights
++++
<titleabbrev>7.7.0</titleabbrev>
++++
//NOTE: The notable-highlights tagged regions are re-used in the
//Installation and Upgrade Guide
// tag::notable-highlights[]
[float]
=== Fixed index corruption on shrunk indices
Applying deletes or updates on an index after it had been shrunk would likely
corrupt the index. We advise users of Elasticsearch 6.x who opt in for soft
deletes on some of their indices and all users of Elasticsearch 7.x to upgrade
to 7.7 as soon as possible to no longer be subject to this corruption bug. In
case upgrading in the near future is not an option, we recommend to completely
stop using `_shrink` on read-write indices and to do a force-merge right after
shrinking on read-only indices, which significantly reduces the likeliness of
being affected by this bug in case deletes or updates get applied by mistake.
This bug is fixed as of {es} 7.7.0. Low-level details can be found on the
https://issues.apache.org/jira/browse/LUCENE-9300[corresponding issue].
// end::notable-highlights[]
// tag::notable-highlights[]
[float]
=== Significant reduction of heap usage of segments
This release of Elasticsearch significantly reduces the amount of heap memory
that is needed to keep Lucene segments open. In addition to helping with cluster
stability, this helps reduce costs by storing much more data per node before
hitting memory limits.
// end::notable-highlights[]
// tag::notable-highlights[]
[discrete]
=== {transforms-cap} now in GA!
In 7.7, we move {transforms} from beta to general availability.
{ref}/transforms.html[{transforms-cap}] enable you to pivot existing {es}
indices using group-by and aggregations into a destination feature index, which
provides opportunities for new insights and analytics. For example, you can use
{transforms} to pivot your data into entity-centric indices that summarize the
behavior of users or sessions or other entities in your data.
{transforms-cap} now include support for cross-cluster search. Allowing you to
create your destination feature index on a separate cluster from the source
indices.
Aggregation support has been expanded within {transforms} to include support for
{ref}/search-aggregations-metrics-percentile-aggregation.html[multi-value (percentiles)]
and
{ref}/search-aggregations-bucket-filter-aggregation.html[filter aggregations].
We also optimized the performance of the
{ref}/search-aggregations-bucket-datehistogram-aggregation.html[date histogram aggregations].
// end::notable-highlights[]
// tag::notable-highlights[]
[discrete]
=== Introducing multiclass {classification}
{ml-docs}/dfa-classification.html[{classification-cap}] using multiple classes
is now available in {dfanalytics}. {classification-cap} is a supervised {ml}
technique which has been already available as a binary process in the previous
release. Multiclass {classification} works well with up to 30 distinct
categories.
// end::notable-highlights[]
// tag::notable-highlights[]
[discrete]
=== {feat-imp-cap} at {infer} time
{feat-imp-cap} now can be calculated at {infer} time. This value provides
further insight into the results of a {classification} or {regression} job and
therefore helps interpret these results.
// end::notable-highlights[]
// tag::notable-highlights[]
[float]
=== Finer memory control for bucket aggregations
While building buckets, aggregations will now periodically check the
real-memory circuit breaker before continuing to allocate more buckets. This
allows better responsivity to memory pressure and avoids `OutOfMemory`
situations due to allocating more buckets than the node can handle.
// end::notable-highlights[]
// tag::notable-highlights[]
[float]
=== A new way of searching: asynchronously
You can now submit {ref}/async-search-intro.html[long-running searches] using
the new {ref}/async-search.html[`_async_search` API]. The new API accepts the
same parameters and request body as the {ref}/search-search.html[Search API].
However, instead of blocking and returning the final response only when it's
entirely finished, you can retrieve results from an async search as they become
available.
The request takes a parameter, `wait_for_completion`, which controls how long
the server will wait until it sends back a response. The first response
contains among others a search unique ID, a response version, an indication if
this response is partial or not, plus the usual metadata (shards involved,
number of hits etc) and potentially results. If the response is not complete
and final, the client can continue polling for results, issuing a new request
using the provided search ID. If new results are available, the returned
version is incremented and the new batch of results are returned. This can
continue until all the results are fetched.
Unless deleted earlier by the user, the asynchronous searches are kept alive
for a given interval. This defaults to 5 days and can be controlled by another
request parameter, `keep_alive`.
// end::notable-highlights[]
// tag::notable-highlights[]
[float]
=== Password protection for the keystore
{es} uses a custom on-disk {ref}/secure-settings.html[keystore] for secure settings such as
passwords and SSL certificates. Up until now, this prevented users with
{ref}/elasticsearch-keystore.html[command-line access] from viewing secure files by listing commands, but nothing
prevented such users from changing values in the keystore, or removing values
from it. Furthermore, the values were only obfuscated by a hash; no
user-specific secret protected the secure settings.
This new feature changes all of that by adding password-protection to the
keystore. This is not be a breaking change: if a keystore has no password,
there wont be any new prompts. A user must choose to password-protect their
keystore in order to benefit from the new behavior.
// end::notable-highlights[]
// tag::notable-highlights[]
[float]
=== A new aggregation: `top_metrics`
The new {ref}//search-aggregations-metrics-top-metrics.html[`top_metrics` aggregation] "selects" a metric from a document according
to a criteria on a given, different field. That criteria is currently the
largest or smallest "sort" value. It is fairly similar to `top_hits` in spirit,
but because it is more limited, `top_metrics` uses less memory and
is often faster.
// end::notable-highlights[]
// tag::notable-highlights[]
[float]
=== Query speed-up for sorted queries on time-based indices
We've optimized sorted, top-documents-only queries run on time-based indices.
The optimization stems from the fact that the ranges of (document) timestamps
in the shards don't overlap. It is implemented by rewriting the shard search
requests based on the partial results already available from other shards, if
it can be determined that the query will not yield any result from the current
shard; i.e. we know in advance that the bottom entry of the (sorted) result set
after a partial merge is better than the values contained in this current
shard.
// end::notable-highlights[]
// tag::notable-highlights[]
[float]
=== A new aggregation: `boxplot`
The https://en.wikipedia.org/wiki/Interquartile_range[interquartile range (IQR)] is a common robust measure of statistical dispersion.
Compared to the standard deviation, the IQR is less sensitive to outliers in
the data, with a breakdown point of 0.25. Along with the median, it is often
used in creating a box plot, a simple yet common way to summarize data and
identify potential outliers.
The new {ref}/search-aggregations-metrics-boxplot-aggregation.html[`boxplot`
aggregation] calculates the min, max, and medium as well as the first and third
quartiles of a given data set.
// end::notable-highlights[]
// tag::notable-highlights[]
[float]
=== AArch64 support
{es} now provides AArch64 packaging, including bundling an AArch64 JDK
distribution. There are some restrictions in place, namely no {ml} support and
depending on underlying page sizes, class data sharing is disabled.
// end::notable-highlights[]