[[release-highlights]] == What's new in {minor-version} coming[{minor-version}] Here are the highlights of what's new and improved in {es} {minor-version}! ifeval::["{release-state}"!="unreleased"] For detailed information about this release, see the <> and <>. endif::[] // Add previous release to the list Other versions: {ref-bare}/7.9/release-highlights.html[7.9] | {ref-bare}/7.8/release-highlights.html[7.8] | {ref-bare}/7.7/release-highlights.html[7.7] | {ref-bare}/7.6/release-highlights-7.6.0.html[7.6] | {ref-bare}/7.5/release-highlights-7.5.0.html[7.5] | {ref-bare}/7.4/release-highlights-7.4.0.html[7.4] | {ref-bare}/7.3/release-highlights-7.3.0.html[7.3] | {ref-bare}/7.2/release-highlights-7.2.0.html[7.2] | {ref-bare}/7.1/release-highlights-7.1.0.html[7.1] | {ref-bare}/7.0/release-highlights-7.0.0.html[7.0] // tag::notable-highlights[] [discrete] [[indexing-speed-improvement]] === Indexing speed improvement {es} 7.10 improves indexing speed by up to 20%. We've reduced the coordination needed to add entries to the {ref}/index-modules-translog.html[transaction log]. This reduction allows for more concurrency and increases the transaction log buffer size from `8KB` to `1MB`. However, performance gains are lower for full-text search and other analysis-intensive use cases. The heavier the indexing chain, the lower the gains, so indexing chains that involve many fields, ingest pipelines or full-text indexing will see lower gains. [discrete] [[more-space-efficient-indices]] === More space-efficient indices {es} 7.10 depends on Apache Lucene 8.7, which introduces higher compression of stored fields, the part of the index that notably stores the {ref}/mapping-source-field.html[`_source`]. On the various data sets that we benchmark against, we noticed space reductions between 0% and 10%. This change especially helps on data sets that have lots of redundant data across documents, which is typically the case of the documents that are produced by our Observability solutions, which repeat metadata about the host that produced the data on every document. Elasticsearch offers the ability to configure the {ref}/index-modules.html#index-codec[`index.codec`] setting to tell {es} how aggressively to compress stored fields. Both supported values `default` and `best_compression` will get better compression with this change. [discrete] [[data-tier-formalization]] === Data tiers 7.10 introduces the concept of formalized data tiers within {es}. {ref}/data-tiers.html[Data tiers] are a simple, integrated approach that gives users control over optimizing for cost, performance, and breadth/depth of data. Prior to this formalization, many users configured their own tier topology using custom node attributes as well as using {ilm-init} to manage the lifecycle and location of data within a cluster. With this formalization, data tiers (content, hot, warm, and cold) can be explicitly configured using {ref}/modules-node.html#node-roles[node roles], and indices can be configured to be allocated within a specific tier using {ref}/data-tier-shard-filtering.html[index-level data tier allocation filtering]. {ilm-init} will make use of these tiers to {ref}/ilm-migrate.html[automatically migrate] data between nodes as an index goes through the phases of its lifecycle. Newly created indices abstracted by a {ref}/data-streams.html[data stream] will be allocated to the `data_hot` tier automatically, while standalone indices will be allocated to the `data_content` tier automatically. Nodes with the pre-existing `data` role are considered to be part of all tiers. [discrete] [[auc-roc-eval-class]] === AUC ROC evaluation metrics for classification analysis {ml-docs}/ml-dfanalytics-evaluate.html#ml-dfanalytics-class-aucroc[Area under the curve of receiver operating characteristic (AUC ROC)] is an evaluation metric that has been available for {oldetection} since 7.3 and now is available for {classification} analysis. AUC ROC represents the performance of the {classification} process at different predicted probability thresholds. The true positive rate for a specific class is compared against the rate of all the other classes combined at the different threshold levels to create the curve. [discrete] [[custom-feature-processor-dfa]] === Custom feature processors in {dfanalytics} Feature processors enable you to extract process features from document fields. You can use these features in model training and model deployment. Custom feature processors provide a mechanism to create features that can be used at search and ingest time and they don’t take up space in the index. This process more tightly couples feature generation with the resulting model. The result is simplified model management as both the features and the model can easily follow the same life cycle. [discrete] [[points-in-time-for-search]] === Points in time (PITs) for search In 7.10, we're introducing points in time (PITs), a lightweight way to preserve index state over searches. PITs improve end-user experience by making UIs more reactive. By default, a search request waits for complete results before returning a response. For example, a search that retrieves top hits and aggregations returns a response only after both top hits and aggregations are computed. However, aggregations are usually slower and more expensive to compute than top hits. Instead of sending a combined request, you can send two separate requests: one for top hits and another one for aggregations. With separate search requests, a UI can display top hits as soon as they're available and display aggregation data after the slower aggregation request completes. You can use a PIT to ensure both search requests run on the same data and index state. To use a PIT in a search, you must first explicitly create the PIT using the new {ref}/point-in-time-api.html[open PIT API]. PITs get automatically garbage-collected after `keep_alive` if no follow-up request extends their duration. [source,console] ---- POST /my-index-000001/_pit?keep_alive=1m ---- // TEST[setup:my_index] The API returns a PIT ID you can use in search requests. You can also configure by how long to extend your PIT's lifespan using the search request's `keep_alive` parameter. [source,console] ---- POST /_search { "size": 100, "query": { "match" : { "title" : "elasticsearch" } }, "pit": { "id": "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWICBXV1aWQyAAAFdXVpZDEAAQltYXRjaF9hbGw_gAAAAA==", "keep_alive": "1m" } } ---- // TEST[catch:missing] PITs automatically close when their `keep_alive` period ends. You can also manually close PITs you no longer need using the {ref}/point-in-time-api.html[close PIT API]. Closing a PIT releases the resources needed to maintain the PIT's index state. [source,console] ---- DELETE /_pit { "id" : "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWIBBXV1aWQyAAA=" } ---- // TEST[catch:missing] For more information about using PITs in search, see {ref}/paginate-search-results.html#search-after[Paginate search results with `search_after`] or the {ref}/point-in-time-api.html[PIT API documentation]. [discrete] [[support-for-request-level-circuit-breakers]] === Request-level circuit breakers on coordinating nodes You can now use a coordinating node to account for memory used to perform partial and final reduce of aggregations in the request circuit breaker. The search coordinator adds the memory that it used to save and reduce the results of shard aggregations in the request circuit breaker. Before any partial or final reduce, the memory needed to reduce the aggregations is estimated and a CircuitBreakingException is thrown if exceeds the maximum memory allowed in this breaker. This size is estimated as roughly 1.5 times the size of the serialized aggregations that need to be reduced. This estimation can be completely off for some aggregations but it is corrected with the real size after the reduce completes. If the reduce is successful, we update the circuit breaker to remove the size of the source aggregations and replace the estimation with the serialized size of the newly reduced result. [discrete] [[eql-case-sensitivity-operator]] === EQL: Case-sensitivity and the `:` operator In 7.10, we made most EQL operators and functions case-sensitive by default. We've also added `:`, a new case-insensitive equal operator. Designed for security use cases, you can use the `:` operator to search for strings in Windows event logs and other event data containing a mix of letter cases. [source,console] ---- GET /my-index-000001/_eql/search { "query": """ process where process.executable : "c:\\\\windows\\\\system32\\\\cmd.exe" """ } ---- // TEST[setup:sec_logs] For more information, see the {ref}/eql-syntax.html[EQL syntax documentation]. [discrete] [[deprecate-rest-api-access-to-system-indices]] === REST API access to system indices is deprecated We are deprecating REST API access to system indices. Most REST API requests that attempt to access system indices will return the following deprecation warning: [source,text] ---- this request accesses system indices: [.system_index_name], but in a future major version, direct access to system indices will be prevented by default ---- The following REST API endpoints access system indices as part of their implementation and will not return the deprecation warning: * `GET _cluster/health` * `GET {index}/_recovery` * `GET _cluster/allocation/explain` * `GET _cluster/state` * `POST _cluster/reroute` * `GET {index}/_stats` * `GET {index}/_segments` * `GET {index}/_shard_stores` * `GET _cat/[indices,aliases,health,recovery,shards,segments]` We are also adding a new metadata flag to track indices. {es} will automatically add this flag to any existing system indices during upgrade. [discrete] [[add-system-read-thread-pool]] === New thread pools for system indices We've added two new thread pools for system indices: `system_read` and `system_write`. These thread pools ensure system indices critical to the Elastic Stack, such as those used by security or Kibana, remain responsive when a cluster is under heavy query or indexing load. `system_read` is a `fixed` thread pool used to manage resources for read operations targeting system indices. Similarly, `system_write` is a `fixed` thread pool used to manage resources for write operations targeting system indices. Both have a maximum number of threads equal to `5` or half of the available processors, whichever is smaller. // end::notable-highlights[]