[[release-highlights-7.8.0]]
== 7.8.0 release highlights
++++
<titleabbrev>7.8.0</titleabbrev>
++++

//NOTE: The notable-highlights tagged regions are re-used in the
//Installation and Upgrade Guide

// tag::notable-highlights[]
[float]
=== Geo improvements

We have made several improvements to geo support in {es} 7.8, combined in the
example after this list.

- You can now run an aggregation that finds the bounding box (top left point and
bottom right point) that contains all shapes matching a query. A shape is
anything that is defined by multiple points. See
{ref}/search-aggregations-metrics-geobounds-aggregation.html[Geo Bounds Aggregations].
- {ref}/search-aggregations-bucket-geohashgrid-aggregation.html[GeoHash grid aggregations]
and {ref}/search-aggregations-bucket-geotilegrid-aggregation.html[map tile grid aggregations]
allow you to group `geo_point` values into buckets.
- {ref}/search-aggregations-metrics-geocentroid-aggregation.html[Geo centroid aggregations]
allow you to compute the weighted https://en.wikipedia.org/wiki/Centroid[centroid]
from all coordinate values for a `geo_point` field.
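
The following is a minimal sketch that combines all three aggregations in a
single search. The `places` index and its `geo_point` field `location` are
hypothetical names used for illustration:

[source,console]
----
GET places/_search?size=0
{
  "aggs": {
    "viewport": { "geo_bounds": { "field": "location" } },                <1>
    "grid": { "geohash_grid": { "field": "location", "precision": 5 } },  <2>
    "center": { "geo_centroid": { "field": "location" } }                 <3>
  }
}
----
<1> The bounding box that contains all matching points.
<2> Buckets the points into geohash cells of the given precision.
<3> The weighted centroid of all matching points.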
// end::notable-highlights[]

// tag::notable-highlights[]
[float]
=== Add support for t-test aggregations

{es} now supports a `t_test` metrics
aggregation, which performs a statistical hypothesis test in which the test
statistic follows a
https://en.wikipedia.org/wiki/Student%27s_t-distribution[Student’s t-distribution]
under the null hypothesis on numeric values extracted from
the aggregated documents or generated by provided scripts. In practice,
this tells you whether the difference between two population means is
statistically significant and did not occur by chance alone. See
{ref}/search-aggregations-metrics-ttest-aggregation.html[T-Test Aggregation].
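
As a minimal sketch, assuming a hypothetical `node_upgrade` index where each
document records a node's startup time before and after an upgrade:

[source,console]
----
GET node_upgrade/_search?size=0
{
  "aggs": {
    "startup_time_ttest": {
      "t_test": {
        "a": { "field": "startup_time_before" },
        "b": { "field": "startup_time_after" },
        "type": "paired"                         <1>
      }
    }
  }
}
----
<1> Both values come from the same node, so this is a paired t-test.
`homoscedastic` and `heteroscedastic` test types are also supported.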
// end::notable-highlights[]

// tag::notable-highlights[]
[float]
=== Expose aggregation usage in feature usage API

It is now possible to fetch a count of aggregations that have been executed
via the {ref}/cluster-nodes-usage.html[nodes usage API]. This is broken down per
combination of aggregation and data type, per shard on each node, from the
last restart until the time when the counts are fetched. When trying to
analyze how {es} is being used in practice, it is useful to know
the usage distribution across aggregations and field types. For example,
you might be able to conclude that a certain part of an index is not used
much and could perhaps be eliminated.
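
For example, the following request returns the usage counts for each node; in
7.8 the response includes a new `aggregations` section alongside the existing
`rest_actions` counts:

[source,console]
----
GET _nodes/usage
----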
// end::notable-highlights[]

// tag::notable-highlights[]
[float]
=== Support `value_count` and `avg` aggregations over histogram fields

{es} now implements `value_count` and `avg` aggregations over histogram
fields.

When the `value_count` aggregation is computed on {ref}/histogram.html[histogram
fields], the result of the aggregation is the sum of all numbers in the
`counts` array of the histogram.

When the average is computed on histogram fields, the result of the
aggregation is the weighted average of all elements in the `values` array,
taking into consideration the number in the same position in the `counts`
array.
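
As a minimal sketch, assume a hypothetical `metrics` index with a histogram
field named `latency_histo`. For a histogram with `values` of `[0.2, 0.5]` and
`counts` of `[3, 1]`, `value_count` returns `4` and `avg` returns
`(0.2 * 3 + 0.5 * 1) / 4 = 0.275`:

[source,console]
----
GET metrics/_search?size=0
{
  "aggs": {
    "latency_count": { "value_count": { "field": "latency_histo" } },  <1>
    "latency_avg": { "avg": { "field": "latency_histo" } }             <2>
  }
}
----
<1> Sums the `counts` arrays of all matching histograms.
<2> Averages the `values` arrays, weighted by the corresponding `counts`.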
// end::notable-highlights[]

// tag::notable-highlights[]
[float]
=== Reduce aggregation memory consumption

{es} now attempts to save memory on the coordinating node by delaying
deserialization of the shard results for an aggregation until the last
second. This is helpful because it makes the per-shard aggregation results
short-lived garbage. It should also shrink the memory usage of aggregations
while they are waiting to be merged.

Additionally, when the search is in batched reduce mode, {es} forces
the results to be serialized between batch reduces in an attempt to keep
the memory usage as low as possible between reductions.
// end::notable-highlights[]

// tag::notable-highlights[]
[float]
=== Scalar functions now supported in SQL aggregations

When querying {es} using SQL, it is now possible to use scalar functions
inside aggregations. This allows for more complex expressions, including
within `GROUP BY` or `HAVING` clauses. For example:

[source,sql]
----
SELECT
  MAX(CASE WHEN a IS NULL THEN -1 ELSE ABS(a * 10) + 1 END) AS max,
  b
FROM test
GROUP BY b
HAVING
  MAX(CASE WHEN a IS NULL THEN -1 ELSE ABS(a * 10) + 1 END) > 5
----
// end::notable-highlights[]

// tag::notable-highlights[]
[float]
[[release-highlights-7.8.0-throttling]]
=== Increase the performance and scalability of {transforms} with throttling

{transforms-cap} achieved GA status in 7.7, and now in 7.8 they are even better
with the introduction of
{ref}/transform-overview.html#transform-performance[throttling]. You can spread
out the impact of {transforms} on your cluster by defining the rate at which
they perform search and index requests. Set the `docs_per_second` limit when you
create or update your {transform}, as in the example below.
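
A minimal sketch of a throttled {transform}, assuming hypothetical `ecommerce`
source and destination indices and field names:

[source,console]
----
PUT _transform/ecommerce-customer-summary
{
  "source": { "index": "ecommerce" },
  "dest": { "index": "ecommerce-customer-summary" },
  "pivot": {
    "group_by": {
      "customer_id": { "terms": { "field": "customer_id" } }
    },
    "aggregations": {
      "total_spend": { "sum": { "field": "taxful_total_price" } }
    }
  },
  "settings": {
    "docs_per_second": 500    <1>
  }
}
----
<1> Limits the {transform} to processing at most 500 input documents per second.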
// end::notable-highlights[]

// tag::notable-highlights[]
[float]
[[release-highlights-7.8.0-mml]]
=== Better estimates for {ml} model memory usage

In 7.8, we introduce dynamic estimation of the model memory limit for jobs in
{ml-docs}/ootb-ml-jobs.html[ML solution modules]. The estimate is generated
during job creation. It uses a calculation based on the specific detectors
of the job and the cardinality of the partitioning and influencer fields. As a
result, the job setup has better default values that depend on the size of the
data being analyzed.
// end::notable-highlights[]

// tag::notable-highlights[]
[float]
[[release-highlights-7.8.0-loss-functions]]
=== Additional loss functions for {regression}

{ml-docs}/dfa-regression.html#dfa-regression-lossfunction[Loss functions]
measure how well a {ml} model fits a specific data set. In 7.8, we added two new
loss functions for {regression} analysis. In addition to the existing mean
squared error function, there are now mean squared logarithmic error and
Pseudo-Huber loss functions. These additions enable you to choose the loss
function that best fits your data set.
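
As a minimal sketch, the following creates a {regression} job that uses the new
mean squared logarithmic error loss function. The job, index, and field names
are hypothetical:

[source,console]
----
PUT _ml/data_frame/analytics/house-price-regression
{
  "source": { "index": "houses" },
  "dest": { "index": "house-price-predictions" },
  "analysis": {
    "regression": {
      "dependent_variable": "price",
      "loss_function": "msle"    <1>
    }
  }
}
----
<1> Valid values are `mse` (the default), `msle`, and `huber`.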
// end::notable-highlights[]

// tag::notable-highlights[]
[float]
[[release-highlights-7.8.0-data-visualizer]]
=== Extended upload limit and explanations for Data Visualizer

You can now upload files up to 1 GB in Data Visualizer. The file structure
finder functionality of the Data Visualizer provides more detailed explanations
after both successful and unsuccessful analysis, which makes it easier to
diagnose issues with file uploads.
// end::notable-highlights[]