[DOCS] Various aggregation doc fixes

Kurt Hurtado 2014-03-12 18:28:40 -07:00 committed by Alexander Reelsen
parent 9fcee312dc
commit ca6a2bb790
1 changed file with 21 additions and 21 deletions

@@ -8,13 +8,13 @@ replacement for the functionality we currently refer to as "faceting".
<<search-facets, Facets>> provide a great way to aggregate data within a document set context.
This context is defined by the executed query in combination with the different levels of filters that can be defined
(filtered queries, top-level filters, and facet level filters). While powerful, their implementation is not designed
-from ground up to support complex aggregations and thus limited.
+from the ground up to support complex aggregations and is thus limited.
.Are facets deprecated?
**********************************
As the functionality facets offer is a subset of the one offered by aggregations, over time, we would like to
see users move to aggregations for all realtime data analytics. That said, we are well aware that such
-transitions/migrations take time, and for this reason we are keeping the facets around for the time being.
+transitions/migrations take time, and for this reason we are keeping facets around for the time being.
Facets are not officially deprecated yet but are likely to be in the future.
**********************************
@@ -37,16 +37,16 @@ _Bucketing_::
documents that "belong" to it.
_Metric_::
-Aggregations that keep track and compute metrics over a set of documents
+Aggregations that keep track and compute metrics over a set of documents.
-The interesting part comes next, since each bucket effectively defines a document set (all documents belonging to
-the bucket), one can potentially associated aggregations on the bucket level, and those will execute within the context
+The interesting part comes next. Since each bucket effectively defines a document set (all documents belonging to
+the bucket), one can potentially associate aggregations on the bucket level, and those will execute within the context
of that bucket. This is where the real power of aggregations kicks in: *aggregations can be nested!*
-NOTE: Bucketing aggregations can have sub-aggregations (bucketing or metric). The sub aggregations will be computed for
-the buckets their parent aggregation generates. There is not hard limit on the level/depth of nested
-aggregations (one can nest an aggregation under a "parent" aggregation which is itself a sub-aggregation of
-another higher aggregations)
+NOTE: Bucketing aggregations can have sub-aggregations (bucketing or metric). The sub-aggregations will be computed for
+the buckets which their parent aggregation generates. There is no hard limit on the level/depth of nested
+aggregations (one can nest an aggregation under a "parent" aggregation, which is itself a sub-aggregation of
+another higher-level aggregation).
[float]
=== Structuring Aggregations
@@ -67,15 +67,15 @@ The following snippet captures the basic structure of aggregations:
--------------------------------------------------
The `aggregations` object (the key `aggs` can also be used) in the JSON holds the aggregations to be computed. Each aggregation
-is associated with a logical name that the user defines (e.g. if the aggregation computes the average price, then it'll
+is associated with a logical name that the user defines (e.g. if the aggregation computes the average price, then it would
make sense to name it `avg_price`). These logical names will also be used to uniquely identify the aggregations in the
response. Each aggregation has a specific type (`<aggregation_type>` in the above snippet) and is typically the first
-key within the named aggregation body. Each type of aggregation define its own body, depending on the nature of the
+key within the named aggregation body. Each type of aggregation defines its own body, depending on the nature of the
aggregation (e.g. an `avg` aggregation on a specific field will define the field on which the average will be calculated).
At the same level of the aggregation type definition, one can optionally define a set of additional aggregations,
though this only makes sense if the aggregation you defined is of a bucketing nature. In this scenario, the
sub-aggregations you define on the bucketing aggregation level will be computed for all the buckets built by the
-bucketing aggregation. For example, if the you define a set of aggregations under the `range` aggregation, the
+bucketing aggregation. For example, if you define a set of aggregations under the `range` aggregation, the
sub-aggregations will be computed for the range buckets that are defined.
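
As an illustrative sketch of the nesting described above, a `range` aggregation with an `avg` sub-aggregation might look like the following. The aggregation names `price_ranges` and `avg_price`, the `price` field, and the range boundaries are all assumed for the example and are not part of this change.

[source,js]
--------------------------------------------------
{
    "aggs" : {
        "price_ranges" : {
            "range" : {
                "field" : "price",
                "ranges" : [
                    { "to" : 50 },
                    { "from" : 50, "to" : 100 },
                    { "from" : 100 }
                ]
            },
            "aggs" : {
                "avg_price" : { "avg" : { "field" : "price" } }
            }
        }
    }
}
--------------------------------------------------

Under these assumptions, each of the three range buckets would carry its own `avg_price` value, computed only over the documents that fall into that bucket.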
[float]
@@ -83,22 +83,22 @@ sub-aggregations will be computed for the range buckets that are defined.
Some aggregations work on values extracted from the aggregated documents. Typically, the values will be extracted from
a specific document field which is set using the `field` key for the aggregations. It is also possible to define a
-<<modules-scripting,`script`>> that will generate the values (per document).
+<<modules-scripting,`script`>> which will generate the values (per document).
When both `field` and `script` settings are configured for the aggregation, the script will be treated as a
`value script`. While normal scripts are evaluated on a document level (i.e. the script has access to all the data
associated with the document), value scripts are evaluated on the *value* level. In this mode, the values are extracted
-from the configured `field` and the `script` is used to apply a "transformation" over these value/s
+from the configured `field` and the `script` is used to apply a "transformation" over these value/s.
["NOTE",id="aggs-script-note"]
===============================
When working with scripts, the `lang` and `params` settings can also be defined. The former defines the scripting
-language that is used (assuming the proper language is available in Elasticsearch either by default or as a plugin). The latter
-enables defining all the "dynamic" expressions in the script as parameters, and by that keep the script itself static
+language which is used (assuming the proper language is available in Elasticsearch, either by default or as a plugin). The latter
+enables defining all the "dynamic" expressions in the script as parameters, which enables the script to keep itself static
between calls (this will ensure the use of the cached compiled scripts in Elasticsearch).
===============================
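
A minimal sketch of a value script combined with `params`, assuming a numeric `grade` field and a hypothetical `correction` parameter. It uses the request syntax of the Elasticsearch version this page documents; later versions changed how scripts are specified.

[source,js]
--------------------------------------------------
{
    "aggs" : {
        "avg_corrected_grade" : {
            "avg" : {
                "field" : "grade",
                "script" : "_value * correction",
                "params" : {
                    "correction" : 1.2
                }
            }
        }
    }
}
--------------------------------------------------

Because both `field` and `script` are set, the script is applied to each extracted value (exposed as `_value`). Passing `correction` through `params` keeps the script text identical between calls, so the cached compiled script can be reused.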
-Scripts can generate a single value or multiple values per documents. When generating multiple values, once can use the
+Scripts can generate a single value or multiple values per document. When generating multiple values, one can use the
`script_values_sorted` settings to indicate whether these values are sorted or not. Internally, Elasticsearch can
perform optimizations when dealing with sorted values (for example, with the `min` aggregations, knowing the values are
sorted, Elasticsearch will skip the iterations over all the values and rely on the first value in the list to be the
@@ -120,12 +120,12 @@ on the metrics in each bucket).
=== Bucket Aggregations
Bucket aggregations don't calculate metrics over fields like the metrics aggregations do, but instead, they create
-buckets of documents. Each bucket is associated with a criteria (depends on the aggregation type) that determines
-whether or not a document in the current context "falls" in it. In other words, the buckets effectively define document
+buckets of documents. Each bucket is associated with a criteria (depending on the aggregation type) which determines
+whether or not a document in the current context "falls" into it. In other words, the buckets effectively define document
sets. In addition to the buckets themselves, the `bucket` aggregations also compute and return the number of documents
-that "fell in" each bucket.
+that "fell in" to each bucket.
-Bucket aggregations, as opposed to `metrics` aggregations, can hold sub-aggregations. These sub aggregations will be
+Bucket aggregations, as opposed to `metrics` aggregations, can hold sub-aggregations. These sub-aggregations will be
aggregated for the buckets created by their "parent" bucket aggregation.
There are different bucket aggregators, each with a different "bucketing" strategy. Some define a single bucket, some