From e3ae1df6f0896bf3b835042525e1531c143cba93 Mon Sep 17 00:00:00 2001 From: Zachary Tong Date: Fri, 1 May 2015 16:04:55 -0400 Subject: [PATCH 1/2] [DOCS] Restructure Aggs documentation --- .../{search => }/aggregations.asciidoc | 153 ++--------------- docs/reference/aggregations/bucket.asciidoc | 49 ++++++ .../bucket/children-aggregation.asciidoc | 0 .../bucket/datehistogram-aggregation.asciidoc | 0 .../bucket/daterange-aggregation.asciidoc | 0 .../bucket/filter-aggregation.asciidoc | 0 .../bucket/filters-aggregation.asciidoc | 0 .../bucket/geodistance-aggregation.asciidoc | 0 .../bucket/geohashgrid-aggregation.asciidoc | 0 .../bucket/global-aggregation.asciidoc | 0 .../bucket/histogram-aggregation.asciidoc | 0 .../bucket/iprange-aggregation.asciidoc | 0 .../bucket/missing-aggregation.asciidoc | 0 .../bucket/nested-aggregation.asciidoc | 0 .../bucket/range-aggregation.asciidoc | 0 .../reverse-nested-aggregation.asciidoc | 0 .../bucket/sampler-aggregation.asciidoc | 0 .../significantterms-aggregation.asciidoc | 0 .../bucket/terms-aggregation.asciidoc | 0 docs/reference/aggregations/metrics.asciidoc | 48 ++++++ .../metrics/avg-aggregation.asciidoc | 0 .../metrics/cardinality-aggregation.asciidoc | 0 .../extendedstats-aggregation.asciidoc | 0 .../metrics/geobounds-aggregation.asciidoc | 0 .../metrics/max-aggregation.asciidoc | 0 .../metrics/min-aggregation.asciidoc | 0 .../metrics/percentile-aggregation.asciidoc | 0 .../percentile-rank-aggregation.asciidoc | 0 .../scripted-metric-aggregation.asciidoc | 0 .../metrics/stats-aggregation.asciidoc | 0 .../metrics/sum-aggregation.asciidoc | 0 .../metrics/tophits-aggregation.asciidoc | 0 .../metrics/valuecount-aggregation.asciidoc | 0 docs/reference/aggregations/misc.asciidoc | 76 +++++++++ docs/reference/aggregations/reducer.asciidoc | 160 ++++++++++++++++++ .../reducer/derivative-aggregation.asciidoc | 44 ++--- .../reducer/max-bucket-aggregation.asciidoc | 21 ++- .../reducer/min-bucket-aggregation.asciidoc | 20 +++ 
.../reducer/movavg-aggregation.asciidoc | 30 +--- docs/reference/index.asciidoc | 2 + docs/reference/search.asciidoc | 2 - .../search/aggregations/bucket.asciidoc | 33 ---- .../search/aggregations/metrics.asciidoc | 27 --- .../search/aggregations/reducer.asciidoc | 6 - 44 files changed, 415 insertions(+), 256 deletions(-) rename docs/reference/{search => }/aggregations.asciidoc (51%) create mode 100644 docs/reference/aggregations/bucket.asciidoc rename docs/reference/{search => }/aggregations/bucket/children-aggregation.asciidoc (100%) rename docs/reference/{search => }/aggregations/bucket/datehistogram-aggregation.asciidoc (100%) rename docs/reference/{search => }/aggregations/bucket/daterange-aggregation.asciidoc (100%) rename docs/reference/{search => }/aggregations/bucket/filter-aggregation.asciidoc (100%) rename docs/reference/{search => }/aggregations/bucket/filters-aggregation.asciidoc (100%) rename docs/reference/{search => }/aggregations/bucket/geodistance-aggregation.asciidoc (100%) rename docs/reference/{search => }/aggregations/bucket/geohashgrid-aggregation.asciidoc (100%) rename docs/reference/{search => }/aggregations/bucket/global-aggregation.asciidoc (100%) rename docs/reference/{search => }/aggregations/bucket/histogram-aggregation.asciidoc (100%) rename docs/reference/{search => }/aggregations/bucket/iprange-aggregation.asciidoc (100%) rename docs/reference/{search => }/aggregations/bucket/missing-aggregation.asciidoc (100%) rename docs/reference/{search => }/aggregations/bucket/nested-aggregation.asciidoc (100%) rename docs/reference/{search => }/aggregations/bucket/range-aggregation.asciidoc (100%) rename docs/reference/{search => }/aggregations/bucket/reverse-nested-aggregation.asciidoc (100%) rename docs/reference/{search => }/aggregations/bucket/sampler-aggregation.asciidoc (100%) rename docs/reference/{search => }/aggregations/bucket/significantterms-aggregation.asciidoc (100%) rename docs/reference/{search => 
}/aggregations/bucket/terms-aggregation.asciidoc (100%) create mode 100644 docs/reference/aggregations/metrics.asciidoc rename docs/reference/{search => }/aggregations/metrics/avg-aggregation.asciidoc (100%) rename docs/reference/{search => }/aggregations/metrics/cardinality-aggregation.asciidoc (100%) rename docs/reference/{search => }/aggregations/metrics/extendedstats-aggregation.asciidoc (100%) rename docs/reference/{search => }/aggregations/metrics/geobounds-aggregation.asciidoc (100%) rename docs/reference/{search => }/aggregations/metrics/max-aggregation.asciidoc (100%) rename docs/reference/{search => }/aggregations/metrics/min-aggregation.asciidoc (100%) rename docs/reference/{search => }/aggregations/metrics/percentile-aggregation.asciidoc (100%) rename docs/reference/{search => }/aggregations/metrics/percentile-rank-aggregation.asciidoc (100%) rename docs/reference/{search => }/aggregations/metrics/scripted-metric-aggregation.asciidoc (100%) rename docs/reference/{search => }/aggregations/metrics/stats-aggregation.asciidoc (100%) rename docs/reference/{search => }/aggregations/metrics/sum-aggregation.asciidoc (100%) rename docs/reference/{search => }/aggregations/metrics/tophits-aggregation.asciidoc (100%) rename docs/reference/{search => }/aggregations/metrics/valuecount-aggregation.asciidoc (100%) create mode 100644 docs/reference/aggregations/misc.asciidoc create mode 100644 docs/reference/aggregations/reducer.asciidoc rename docs/reference/{search => }/aggregations/reducer/derivative-aggregation.asciidoc (83%) rename docs/reference/{search => }/aggregations/reducer/max-bucket-aggregation.asciidoc (83%) rename docs/reference/{search => }/aggregations/reducer/min-bucket-aggregation.asciidoc (83%) rename docs/reference/{search => }/aggregations/reducer/movavg-aggregation.asciidoc (93%) delete mode 100644 docs/reference/search/aggregations/bucket.asciidoc delete mode 100644 docs/reference/search/aggregations/metrics.asciidoc delete mode 100644 
docs/reference/search/aggregations/reducer.asciidoc diff --git a/docs/reference/search/aggregations.asciidoc b/docs/reference/aggregations.asciidoc similarity index 51% rename from docs/reference/search/aggregations.asciidoc rename to docs/reference/aggregations.asciidoc index cf4b4348eda..c6fb674834e 100644 --- a/docs/reference/search/aggregations.asciidoc +++ b/docs/reference/aggregations.asciidoc @@ -1,6 +1,8 @@ [[search-aggregations]] -== Aggregations += Aggregations +[partintro] +-- The aggregations framework helps provide aggregated data based on a search query. It is based on simple building blocks called aggregations, that can be composed in order to build complex summaries of the data. @@ -11,16 +13,19 @@ query/filters of the search request). There are many different types of aggregations, each with its own purpose and output. To better understand these types, it is often easier to break them into two main families: -_Bucketing_:: +<<search-aggregations-bucket,Bucketing>>:: A family of aggregations that build buckets, where each bucket is associated with a _key_ and a document criterion. When the aggregation is executed, all the buckets criteria are evaluated on every document in the context and when a criterion matches, the document is considered to "fall in" the relevant bucket. By the end of the aggregation process, we'll end up with a list of buckets - each one with a set of documents that "belong" to it. -_Metric_:: +<<search-aggregations-metrics,Metric>>:: Aggregations that keep track and compute metrics over a set of documents. +<<search-aggregations-reducer,Reducer>>:: + Aggregations that aggregate the output of other aggregations and their associated metrics + The interesting part comes next. Since each bucket effectively defines a document set (all documents belonging to the bucket), one can potentially associate aggregations on the bucket level, and those will execute within the context of that bucket. 
This is where the real power of aggregations kicks in: *aggregations can be nested!* @@ -31,7 +36,7 @@ NOTE: Bucketing aggregations can have sub-aggregations (bucketing or metric). Th another higher-level aggregation). [float] -=== Structuring Aggregations +== Structuring Aggregations The following snippet captures the basic structure of aggregations: @@ -62,7 +67,7 @@ bucketing aggregation. For example, if you define a set of aggregations under th sub-aggregations will be computed for the range buckets that are defined. [float] -==== Values Source +=== Values Source Some aggregations work on values extracted from the aggregated documents. Typically, the values will be extracted from a specific document field which is set using the `field` key for the aggregations. It is also possible to define a @@ -89,142 +94,7 @@ perform optimizations when dealing with sorted values (for example, with the `mi sorted, Elasticsearch will skip the iterations over all the values and rely on the first value in the list to be the minimum value among all other values associated with the same document). -[float] -=== Metrics Aggregations - -The aggregations in this family compute metrics based on values extracted in one way or another from the documents that -are being aggregated. The values are typically extracted from the fields of the document (using the field data), but -can also be generated using scripts. - -Numeric metrics aggregations are a special type of metrics aggregation which output numeric values. Some aggregations output -a single numeric metric (e.g. `avg`) and are called `single-value numeric metrics aggregation`, others generate multiple -metrics (e.g. `stats`) and are called `multi-value numeric metrics aggregation`. 
The distinction between single-value and -multi-value numeric metrics aggregations plays a role when these aggregations serve as direct sub-aggregations of some -bucket aggregations (some bucket aggregations enable you to sort the returned buckets based on the numeric metrics in each bucket). - - -[float] -=== Bucket Aggregations - -Bucket aggregations don't calculate metrics over fields like the metrics aggregations do, but instead, they create -buckets of documents. Each bucket is associated with a criterion (depending on the aggregation type) which determines -whether or not a document in the current context "falls" into it. In other words, the buckets effectively define document -sets. In addition to the buckets themselves, the `bucket` aggregations also compute and return the number of documents -that "fell in" to each bucket. - -Bucket aggregations, as opposed to `metrics` aggregations, can hold sub-aggregations. These sub-aggregations will be -aggregated for the buckets created by their "parent" bucket aggregation. - -There are different bucket aggregators, each with a different "bucketing" strategy. Some define a single bucket, some -define fixed number of multiple buckets, and others dynamically create the buckets during the aggregation process. - -[float] -=== Reducer Aggregations - -coming[2.0.0] - -experimental[] - -Reducer aggregations work on the outputs produced from other aggregations rather than from document sets, adding -information to the output tree. There are many different types of reducer, each computing different information from -other aggregations, but these types can broken down into two families: - -_Parent_:: - A family of reducer aggregations that is provided with the output of its parent aggregation and is able - to compute new buckets or new aggregations to add to existing buckets. 
- -_Sibling_:: - Reducer aggregations that are provided with the output of a sibling aggregation and are able to compute a - new aggregation which will be at the same level as the sibling aggregation. - -Reducer aggregations can reference the aggregations they need to perform their computation by using the `buckets_paths` -parameter to indicate the paths to the required metrics. The syntax for defining these paths can be found in the -<> section. - -?????? SHOULD THE SECTION ABOUT DEFINING AGGREGATION PATHS -BE IN THIS PAGE AND REFERENCED FROM THE TERMS AGGREGATION DOCUMENTATION ??????? - -Reducer aggregations cannot have sub-aggregations but depending on the type it can reference another reducer in the `buckets_path` -allowing reducers to be chained. - -NOTE: Because reducer aggregations only add to the output, when chaining reducer aggregations the output of each reducer will be -included in the final output. - -[float] -=== Caching heavy aggregations - -Frequently used aggregations (e.g. for display on the home page of a website) -can be cached for faster responses. These cached results are the same results -that would be returned by an uncached aggregation -- you will never get stale -results. - -See <> for more details. - -[float] -=== Returning only aggregation results - -There are many occasions when aggregations are required but search hits are not. For these cases the hits can be ignored by -setting `size=0`. For example: - -[source,js] --------------------------------------------------- -$ curl -XGET 'http://localhost:9200/twitter/tweet/_search' -d '{ - "size": 0, - "aggregations": { - "my_agg": { - "terms": { - "field": "text" - } - } - } -} -' --------------------------------------------------- - -Setting `size` to `0` avoids executing the fetch phase of the search making the request more efficient. 
- -[float] -=== Metadata - -You can associate a piece of metadata with individual aggregations at request time that will be returned in place -at response time. - -Consider this example where we want to associate the color blue with our `terms` aggregation. - -[source,js] --------------------------------------------------- -{ - ... - aggs": { - "titles": { - "terms": { - "field": "title" - }, - "meta": { - "color": "blue" - }, - } - } -} --------------------------------------------------- - -Then that piece of metadata will be returned in place for our `titles` terms aggregation - -[source,js] --------------------------------------------------- -{ - ... - "aggregations": { - "titles": { - "meta": { - "color" : "blue" - }, - "buckets": [ - ] - } - } -} --------------------------------------------------- +-- include::aggregations/metrics.asciidoc[] @@ -232,3 +102,4 @@ include::aggregations/bucket.asciidoc[] include::aggregations/reducer.asciidoc[] +include::aggregations/misc.asciidoc[] diff --git a/docs/reference/aggregations/bucket.asciidoc b/docs/reference/aggregations/bucket.asciidoc new file mode 100644 index 00000000000..2d185dd49a0 --- /dev/null +++ b/docs/reference/aggregations/bucket.asciidoc @@ -0,0 +1,49 @@ +[[search-aggregations-bucket]] +== Bucket Aggregations + +Bucket aggregations don't calculate metrics over fields like the metrics aggregations do, but instead, they create +buckets of documents. Each bucket is associated with a criterion (depending on the aggregation type) which determines +whether or not a document in the current context "falls" into it. In other words, the buckets effectively define document +sets. In addition to the buckets themselves, the `bucket` aggregations also compute and return the number of documents +that "fell in" to each bucket. + +Bucket aggregations, as opposed to `metrics` aggregations, can hold sub-aggregations. These sub-aggregations will be +aggregated for the buckets created by their "parent" bucket aggregation. 
+ +There are different bucket aggregators, each with a different "bucketing" strategy. Some define a single bucket, some +define fixed number of multiple buckets, and others dynamically create the buckets during the aggregation process. + +include::bucket/children-aggregation.asciidoc[] + +include::bucket/datehistogram-aggregation.asciidoc[] + +include::bucket/daterange-aggregation.asciidoc[] + +include::bucket/filter-aggregation.asciidoc[] + +include::bucket/filters-aggregation.asciidoc[] + +include::bucket/geodistance-aggregation.asciidoc[] + +include::bucket/geohashgrid-aggregation.asciidoc[] + +include::bucket/global-aggregation.asciidoc[] + +include::bucket/histogram-aggregation.asciidoc[] + +include::bucket/iprange-aggregation.asciidoc[] + +include::bucket/missing-aggregation.asciidoc[] + +include::bucket/nested-aggregation.asciidoc[] + +include::bucket/range-aggregation.asciidoc[] + +include::bucket/reverse-nested-aggregation.asciidoc[] + +include::bucket/sampler-aggregation.asciidoc[] + +include::bucket/significantterms-aggregation.asciidoc[] + +include::bucket/terms-aggregation.asciidoc[] + diff --git a/docs/reference/search/aggregations/bucket/children-aggregation.asciidoc b/docs/reference/aggregations/bucket/children-aggregation.asciidoc similarity index 100% rename from docs/reference/search/aggregations/bucket/children-aggregation.asciidoc rename to docs/reference/aggregations/bucket/children-aggregation.asciidoc diff --git a/docs/reference/search/aggregations/bucket/datehistogram-aggregation.asciidoc b/docs/reference/aggregations/bucket/datehistogram-aggregation.asciidoc similarity index 100% rename from docs/reference/search/aggregations/bucket/datehistogram-aggregation.asciidoc rename to docs/reference/aggregations/bucket/datehistogram-aggregation.asciidoc diff --git a/docs/reference/search/aggregations/bucket/daterange-aggregation.asciidoc b/docs/reference/aggregations/bucket/daterange-aggregation.asciidoc similarity index 100% rename from 
docs/reference/search/aggregations/bucket/daterange-aggregation.asciidoc rename to docs/reference/aggregations/bucket/daterange-aggregation.asciidoc diff --git a/docs/reference/search/aggregations/bucket/filter-aggregation.asciidoc b/docs/reference/aggregations/bucket/filter-aggregation.asciidoc similarity index 100% rename from docs/reference/search/aggregations/bucket/filter-aggregation.asciidoc rename to docs/reference/aggregations/bucket/filter-aggregation.asciidoc diff --git a/docs/reference/search/aggregations/bucket/filters-aggregation.asciidoc b/docs/reference/aggregations/bucket/filters-aggregation.asciidoc similarity index 100% rename from docs/reference/search/aggregations/bucket/filters-aggregation.asciidoc rename to docs/reference/aggregations/bucket/filters-aggregation.asciidoc diff --git a/docs/reference/search/aggregations/bucket/geodistance-aggregation.asciidoc b/docs/reference/aggregations/bucket/geodistance-aggregation.asciidoc similarity index 100% rename from docs/reference/search/aggregations/bucket/geodistance-aggregation.asciidoc rename to docs/reference/aggregations/bucket/geodistance-aggregation.asciidoc diff --git a/docs/reference/search/aggregations/bucket/geohashgrid-aggregation.asciidoc b/docs/reference/aggregations/bucket/geohashgrid-aggregation.asciidoc similarity index 100% rename from docs/reference/search/aggregations/bucket/geohashgrid-aggregation.asciidoc rename to docs/reference/aggregations/bucket/geohashgrid-aggregation.asciidoc diff --git a/docs/reference/search/aggregations/bucket/global-aggregation.asciidoc b/docs/reference/aggregations/bucket/global-aggregation.asciidoc similarity index 100% rename from docs/reference/search/aggregations/bucket/global-aggregation.asciidoc rename to docs/reference/aggregations/bucket/global-aggregation.asciidoc diff --git a/docs/reference/search/aggregations/bucket/histogram-aggregation.asciidoc b/docs/reference/aggregations/bucket/histogram-aggregation.asciidoc similarity index 100% 
rename from docs/reference/search/aggregations/bucket/histogram-aggregation.asciidoc rename to docs/reference/aggregations/bucket/histogram-aggregation.asciidoc diff --git a/docs/reference/search/aggregations/bucket/iprange-aggregation.asciidoc b/docs/reference/aggregations/bucket/iprange-aggregation.asciidoc similarity index 100% rename from docs/reference/search/aggregations/bucket/iprange-aggregation.asciidoc rename to docs/reference/aggregations/bucket/iprange-aggregation.asciidoc diff --git a/docs/reference/search/aggregations/bucket/missing-aggregation.asciidoc b/docs/reference/aggregations/bucket/missing-aggregation.asciidoc similarity index 100% rename from docs/reference/search/aggregations/bucket/missing-aggregation.asciidoc rename to docs/reference/aggregations/bucket/missing-aggregation.asciidoc diff --git a/docs/reference/search/aggregations/bucket/nested-aggregation.asciidoc b/docs/reference/aggregations/bucket/nested-aggregation.asciidoc similarity index 100% rename from docs/reference/search/aggregations/bucket/nested-aggregation.asciidoc rename to docs/reference/aggregations/bucket/nested-aggregation.asciidoc diff --git a/docs/reference/search/aggregations/bucket/range-aggregation.asciidoc b/docs/reference/aggregations/bucket/range-aggregation.asciidoc similarity index 100% rename from docs/reference/search/aggregations/bucket/range-aggregation.asciidoc rename to docs/reference/aggregations/bucket/range-aggregation.asciidoc diff --git a/docs/reference/search/aggregations/bucket/reverse-nested-aggregation.asciidoc b/docs/reference/aggregations/bucket/reverse-nested-aggregation.asciidoc similarity index 100% rename from docs/reference/search/aggregations/bucket/reverse-nested-aggregation.asciidoc rename to docs/reference/aggregations/bucket/reverse-nested-aggregation.asciidoc diff --git a/docs/reference/search/aggregations/bucket/sampler-aggregation.asciidoc b/docs/reference/aggregations/bucket/sampler-aggregation.asciidoc similarity index 100% 
rename from docs/reference/search/aggregations/bucket/sampler-aggregation.asciidoc rename to docs/reference/aggregations/bucket/sampler-aggregation.asciidoc diff --git a/docs/reference/search/aggregations/bucket/significantterms-aggregation.asciidoc b/docs/reference/aggregations/bucket/significantterms-aggregation.asciidoc similarity index 100% rename from docs/reference/search/aggregations/bucket/significantterms-aggregation.asciidoc rename to docs/reference/aggregations/bucket/significantterms-aggregation.asciidoc diff --git a/docs/reference/search/aggregations/bucket/terms-aggregation.asciidoc b/docs/reference/aggregations/bucket/terms-aggregation.asciidoc similarity index 100% rename from docs/reference/search/aggregations/bucket/terms-aggregation.asciidoc rename to docs/reference/aggregations/bucket/terms-aggregation.asciidoc diff --git a/docs/reference/aggregations/metrics.asciidoc b/docs/reference/aggregations/metrics.asciidoc new file mode 100644 index 00000000000..f80c36f2ebe --- /dev/null +++ b/docs/reference/aggregations/metrics.asciidoc @@ -0,0 +1,48 @@ +[[search-aggregations-metrics]] +== Metrics Aggregations + +The aggregations in this family compute metrics based on values extracted in one way or another from the documents that +are being aggregated. The values are typically extracted from the fields of the document (using the field data), but +can also be generated using scripts. + +Numeric metrics aggregations are a special type of metrics aggregation which output numeric values. Some aggregations output +a single numeric metric (e.g. `avg`) and are called `single-value numeric metrics aggregation`, others generate multiple +metrics (e.g. `stats`) and are called `multi-value numeric metrics aggregation`. 
The distinction between single-value and +multi-value numeric metrics aggregations plays a role when these aggregations serve as direct sub-aggregations of some +bucket aggregations (some bucket aggregations enable you to sort the returned buckets based on the numeric metrics in each bucket). + +include::metrics/avg-aggregation.asciidoc[] + +include::metrics/cardinality-aggregation.asciidoc[] + +include::metrics/extendedstats-aggregation.asciidoc[] + +include::metrics/geobounds-aggregation.asciidoc[] + +include::metrics/max-aggregation.asciidoc[] + +include::metrics/min-aggregation.asciidoc[] + +include::metrics/percentile-aggregation.asciidoc[] + +include::metrics/percentile-rank-aggregation.asciidoc[] + +include::metrics/scripted-metric-aggregation.asciidoc[] + +include::metrics/stats-aggregation.asciidoc[] + +include::metrics/sum-aggregation.asciidoc[] + +include::metrics/tophits-aggregation.asciidoc[] + +include::metrics/valuecount-aggregation.asciidoc[] + + + + + + + + + + diff --git a/docs/reference/search/aggregations/metrics/avg-aggregation.asciidoc b/docs/reference/aggregations/metrics/avg-aggregation.asciidoc similarity index 100% rename from docs/reference/search/aggregations/metrics/avg-aggregation.asciidoc rename to docs/reference/aggregations/metrics/avg-aggregation.asciidoc diff --git a/docs/reference/search/aggregations/metrics/cardinality-aggregation.asciidoc b/docs/reference/aggregations/metrics/cardinality-aggregation.asciidoc similarity index 100% rename from docs/reference/search/aggregations/metrics/cardinality-aggregation.asciidoc rename to docs/reference/aggregations/metrics/cardinality-aggregation.asciidoc diff --git a/docs/reference/search/aggregations/metrics/extendedstats-aggregation.asciidoc b/docs/reference/aggregations/metrics/extendedstats-aggregation.asciidoc similarity index 100% rename from docs/reference/search/aggregations/metrics/extendedstats-aggregation.asciidoc rename to 
docs/reference/aggregations/metrics/extendedstats-aggregation.asciidoc diff --git a/docs/reference/search/aggregations/metrics/geobounds-aggregation.asciidoc b/docs/reference/aggregations/metrics/geobounds-aggregation.asciidoc similarity index 100% rename from docs/reference/search/aggregations/metrics/geobounds-aggregation.asciidoc rename to docs/reference/aggregations/metrics/geobounds-aggregation.asciidoc diff --git a/docs/reference/search/aggregations/metrics/max-aggregation.asciidoc b/docs/reference/aggregations/metrics/max-aggregation.asciidoc similarity index 100% rename from docs/reference/search/aggregations/metrics/max-aggregation.asciidoc rename to docs/reference/aggregations/metrics/max-aggregation.asciidoc diff --git a/docs/reference/search/aggregations/metrics/min-aggregation.asciidoc b/docs/reference/aggregations/metrics/min-aggregation.asciidoc similarity index 100% rename from docs/reference/search/aggregations/metrics/min-aggregation.asciidoc rename to docs/reference/aggregations/metrics/min-aggregation.asciidoc diff --git a/docs/reference/search/aggregations/metrics/percentile-aggregation.asciidoc b/docs/reference/aggregations/metrics/percentile-aggregation.asciidoc similarity index 100% rename from docs/reference/search/aggregations/metrics/percentile-aggregation.asciidoc rename to docs/reference/aggregations/metrics/percentile-aggregation.asciidoc diff --git a/docs/reference/search/aggregations/metrics/percentile-rank-aggregation.asciidoc b/docs/reference/aggregations/metrics/percentile-rank-aggregation.asciidoc similarity index 100% rename from docs/reference/search/aggregations/metrics/percentile-rank-aggregation.asciidoc rename to docs/reference/aggregations/metrics/percentile-rank-aggregation.asciidoc diff --git a/docs/reference/search/aggregations/metrics/scripted-metric-aggregation.asciidoc b/docs/reference/aggregations/metrics/scripted-metric-aggregation.asciidoc similarity index 100% rename from 
docs/reference/search/aggregations/metrics/scripted-metric-aggregation.asciidoc rename to docs/reference/aggregations/metrics/scripted-metric-aggregation.asciidoc diff --git a/docs/reference/search/aggregations/metrics/stats-aggregation.asciidoc b/docs/reference/aggregations/metrics/stats-aggregation.asciidoc similarity index 100% rename from docs/reference/search/aggregations/metrics/stats-aggregation.asciidoc rename to docs/reference/aggregations/metrics/stats-aggregation.asciidoc diff --git a/docs/reference/search/aggregations/metrics/sum-aggregation.asciidoc b/docs/reference/aggregations/metrics/sum-aggregation.asciidoc similarity index 100% rename from docs/reference/search/aggregations/metrics/sum-aggregation.asciidoc rename to docs/reference/aggregations/metrics/sum-aggregation.asciidoc diff --git a/docs/reference/search/aggregations/metrics/tophits-aggregation.asciidoc b/docs/reference/aggregations/metrics/tophits-aggregation.asciidoc similarity index 100% rename from docs/reference/search/aggregations/metrics/tophits-aggregation.asciidoc rename to docs/reference/aggregations/metrics/tophits-aggregation.asciidoc diff --git a/docs/reference/search/aggregations/metrics/valuecount-aggregation.asciidoc b/docs/reference/aggregations/metrics/valuecount-aggregation.asciidoc similarity index 100% rename from docs/reference/search/aggregations/metrics/valuecount-aggregation.asciidoc rename to docs/reference/aggregations/metrics/valuecount-aggregation.asciidoc diff --git a/docs/reference/aggregations/misc.asciidoc b/docs/reference/aggregations/misc.asciidoc new file mode 100644 index 00000000000..f494d5291c0 --- /dev/null +++ b/docs/reference/aggregations/misc.asciidoc @@ -0,0 +1,76 @@ + +[[caching-heavy-aggregations]] +== Caching heavy aggregations + +Frequently used aggregations (e.g. for display on the home page of a website) +can be cached for faster responses. 
These cached results are the same results +that would be returned by an uncached aggregation -- you will never get stale +results. + +See <> for more details. + +[[returning-only-agg-results]] +== Returning only aggregation results + +There are many occasions when aggregations are required but search hits are not. For these cases the hits can be ignored by +setting `size=0`. For example: + +[source,js] +-------------------------------------------------- +$ curl -XGET 'http://localhost:9200/twitter/tweet/_search' -d '{ + "size": 0, + "aggregations": { + "my_agg": { + "terms": { + "field": "text" + } + } + } +} +' +-------------------------------------------------- + +Setting `size` to `0` avoids executing the fetch phase of the search, making the request more efficient. + +[[agg-metadata]] +== Aggregation Metadata + +You can associate a piece of metadata with individual aggregations at request time that will be returned in place +at response time. + +Consider this example where we want to associate the color blue with our `terms` aggregation. + +[source,js] +-------------------------------------------------- +{ + ... + "aggs": { + "titles": { + "terms": { + "field": "title" + }, + "meta": { + "color": "blue" + } + } + } +} +-------------------------------------------------- + +Then that piece of metadata will be returned in place for our `titles` terms aggregation: + +[source,js] +-------------------------------------------------- +{ + ... 
+ "aggregations": { + "titles": { + "meta": { + "color" : "blue" + }, + "buckets": [ + ] + } + } +} +-------------------------------------------------- \ No newline at end of file diff --git a/docs/reference/aggregations/reducer.asciidoc b/docs/reference/aggregations/reducer.asciidoc new file mode 100644 index 00000000000..2ce379cd583 --- /dev/null +++ b/docs/reference/aggregations/reducer.asciidoc @@ -0,0 +1,160 @@ +[[search-aggregations-reducer]] + +== Reducer Aggregations + +coming[2.0.0] + +experimental[] + +Reducer aggregations work on the outputs produced from other aggregations rather than from document sets, adding +information to the output tree. There are many different types of reducer, each computing different information from +other aggregations, but these types can be broken down into two families: + +_Parent_:: + A family of reducer aggregations that is provided with the output of its parent aggregation and is able + to compute new buckets or new aggregations to add to existing buckets. + +_Sibling_:: + Reducer aggregations that are provided with the output of a sibling aggregation and are able to compute a + new aggregation which will be at the same level as the sibling aggregation. + +Reducer aggregations can reference the aggregations they need to perform their computation by using the `buckets_path` +parameter to indicate the paths to the required metrics. The syntax for defining these paths can be found in the +<<bucket-path-syntax>> section below. + +Reducer aggregations cannot have sub-aggregations, but depending on the type they can reference another reducer in the `buckets_path`, +allowing reducers to be chained. For example, you can chain together two derivatives to calculate the second derivative +(i.e. a derivative of a derivative). + +NOTE: Because reducer aggregations only add to the output, when chaining reducer aggregations the output of each reducer will be +included in the final output. 
+ +[[bucket-path-syntax]] +[float] +=== `buckets_path` Syntax + +Most reducers require another aggregation as their input. The input aggregation is defined via the `buckets_path` +parameter, which follows a specific format: + +-------------------------------------------------- +AGG_SEPARATOR := '>' +METRIC_SEPARATOR := '.' +AGG_NAME := <the name of the aggregation> +METRIC := <the name of the metric (in case of multi-value metrics aggregation)> +PATH := <AGG_NAME>[<AGG_SEPARATOR><AGG_NAME>]*[<METRIC_SEPARATOR><METRIC>] +-------------------------------------------------- + +For example, the path `"my_bucket>my_stats.avg"` points to the `avg` value in the `"my_stats"` metric, which is +contained in the `"my_bucket"` bucket aggregation. + +Paths are relative to the position of the reducer; they are not absolute paths, and the path cannot go back "up" the +aggregation tree. For example, this moving average is embedded inside a date_histogram and refers to a "sibling" +metric `"the_sum"`: + +[source,js] +-------------------------------------------------- +{ + "my_date_histo":{ + "date_histogram":{ + "field":"timestamp", + "interval":"day" + }, + "aggs":{ + "the_sum":{ + "sum":{ "field": "lemmings" } <1> + }, + "the_movavg":{ + "moving_avg":{ "buckets_path": "the_sum" } <2> + } + } + } +} +-------------------------------------------------- +<1> The metric is called `"the_sum"` +<2> The `buckets_path` refers to the metric via a relative path `"the_sum"` + +`buckets_path` is also used for Sibling reducer aggregations, where the aggregation is "next" to a series of buckets +instead of embedded "inside" them.
For example, the `max_bucket` aggregation uses the `buckets_path` to specify +a metric embedded inside a sibling aggregation: + +[source,js] +-------------------------------------------------- +{ + "aggs" : { + "sales_per_month" : { + "date_histogram" : { + "field" : "date", + "interval" : "month" + }, + "aggs": { + "sales": { + "sum": { + "field": "price" + } + } + } + }, + "max_monthly_sales": { + "max_bucket": { + "buckets_path": "sales_per_month>sales" <1> + } + } + } +} +-------------------------------------------------- +<1> `buckets_path` instructs this `max_bucket` aggregation that we want the maximum value of the `sales` aggregation in the +`sales_per_month` date histogram. + +[float] +==== Special Paths + +Instead of pathing to a metric, `buckets_path` can use a special `"_count"` path. This instructs +the reducer to use the document count as its input. For example, a moving average can be calculated on the document +count of each bucket, instead of a specific metric: + +[source,js] +-------------------------------------------------- +{ + "my_date_histo":{ + "date_histogram":{ + "field":"timestamp", + "interval":"day" + }, + "aggs":{ + "the_movavg":{ + "moving_avg":{ "buckets_path": "_count" } <1> + } + } + } +} +-------------------------------------------------- +<1> By using `_count` instead of a metric name, we can calculate the moving average of document counts in the histogram + + +[float] +=== Dealing with gaps in the data + +There are a couple of reasons why the data output by the enclosing histogram may have gaps: + +* There are no documents matching the query for some buckets +* The data for a metric is missing in all of the documents falling into a bucket (this is most likely with either a small interval +on the enclosing histogram or with a query matching only a small number of documents) + +Where there is no data available in a bucket for a given metric it presents a problem for calculating the derivative value for both +the current bucket and
the next bucket. The derivative reducer aggregation has a `gap_policy` parameter to define what the behavior +should be when a gap in the data is found. There are currently two options for controlling the gap policy: + +_ignore_:: + This option will not produce a derivative value for any buckets where the value in the current or previous bucket is + missing. + +_insert_zeros_:: + This option will assume the missing value is `0` and calculate the derivative with the value `0`. + + + + +include::reducer/derivative-aggregation.asciidoc[] +include::reducer/max-bucket-aggregation.asciidoc[] +include::reducer/min-bucket-aggregation.asciidoc[] +include::reducer/movavg-aggregation.asciidoc[] diff --git a/docs/reference/search/aggregations/reducer/derivative-aggregation.asciidoc b/docs/reference/aggregations/reducer/derivative-aggregation.asciidoc similarity index 83% rename from docs/reference/search/aggregations/reducer/derivative-aggregation.asciidoc rename to docs/reference/aggregations/reducer/derivative-aggregation.asciidoc index be644091b51..17801055418 100644 --- a/docs/reference/search/aggregations/reducer/derivative-aggregation.asciidoc +++ b/docs/reference/aggregations/reducer/derivative-aggregation.asciidoc @@ -5,6 +5,28 @@ A parent reducer aggregation which calculates the derivative of a specified metr aggregation. The specified metric must be numeric and the enclosing histogram must have `min_doc_count` set to `0` (default for `histogram` aggregations).
+==== Syntax + +A `derivative` aggregation looks like this in isolation: + +[source,js] +-------------------------------------------------- +{ + "derivative": { + "buckets_path": "the_sum" + } +} +-------------------------------------------------- + +.`derivative` Parameters +|=== +|Parameter Name |Description |Required |Default Value +|`buckets_path` |Path to the metric of interest (see <> for more details) |Required | +|=== + + +==== First Order Derivative + The following snippet calculates the derivative of the total monthly `sales`: [source,js] @@ -82,7 +104,7 @@ And the following may be the response: <1> No derivative for the first bucket since we need at least 2 data points to calculate the derivative <2> Derivative value units are implicitly defined by the `sales` aggregation and the parent histogram so in this case the units would be $/month assuming the `price` field has units of $. -<3> The number of documents in the bucket are represented by the `doc_count` value +<3> The number of documents in the bucket is represented by the `doc_count` value ==== Second Order Derivative @@ -172,23 +194,3 @@ And the following may be the response: <1> No second derivative for the first two buckets since we need at least 2 data points from the first derivative to calculate the second derivative -==== Dealing with gaps in the data - -There are a couple of reasons why the data output by the enclosing histogram may have gaps: - -* There are no documents matching the query for some buckets -* The data for a metric is missing in all of the documents falling into a bucket (this is most likely with either a small interval -on the enclosing histogram or with a query matching only a small number of documents) - -Where there is no data available in a bucket for a given metric it presents a problem for calculating the derivative value for both -the current bucket and the next bucket.
In the derivative reducer aggregation has a `gap_policy` parameter to define what the behavior -should be when a gap in the data is found. There are currently two options for controlling the gap policy: - -_ignore_:: - This option will not produce a derivative value for any buckets where the value in the current or previous bucket is - missing - -_insert_zeros_:: - This option will assume the missing value is `0` and calculate the derivative with the value `0`. - - diff --git a/docs/reference/search/aggregations/reducer/max-bucket-aggregation.asciidoc b/docs/reference/aggregations/reducer/max-bucket-aggregation.asciidoc similarity index 83% rename from docs/reference/search/aggregations/reducer/max-bucket-aggregation.asciidoc rename to docs/reference/aggregations/reducer/max-bucket-aggregation.asciidoc index a93c7ed8036..e1a5e9aa389 100644 --- a/docs/reference/search/aggregations/reducer/max-bucket-aggregation.asciidoc +++ b/docs/reference/aggregations/reducer/max-bucket-aggregation.asciidoc @@ -5,6 +5,26 @@ A sibling reducer aggregation which identifies the bucket(s) with the maximum va and outputs both the value and the key(s) of the bucket(s). The specified metric must be numeric and the sibling aggregation must be a multi-bucket aggregation. 
+==== Syntax + +A `max_bucket` aggregation looks like this in isolation: + +[source,js] +-------------------------------------------------- +{ + "max_bucket": { + "buckets_path": "the_sum" + } +} +-------------------------------------------------- + +.`max_bucket` Parameters +|=== +|Parameter Name |Description |Required |Default Value +|`buckets_path` |The path to the buckets we wish to find the maximum for (see <> for more + details) |Required | +|=== + The following snippet calculates the maximum of the total monthly `sales`: [source,js] @@ -32,7 +52,6 @@ The following snippet calculates the maximum of the total monthly `sales`: } } -------------------------------------------------- - <1> `bucket_paths` instructs this max_bucket aggregation that we want the maximum value of the `sales` aggregation in the `sales_per_month` date histogram. diff --git a/docs/reference/search/aggregations/reducer/min-bucket-aggregation.asciidoc b/docs/reference/aggregations/reducer/min-bucket-aggregation.asciidoc similarity index 83% rename from docs/reference/search/aggregations/reducer/min-bucket-aggregation.asciidoc rename to docs/reference/aggregations/reducer/min-bucket-aggregation.asciidoc index 558d0c19983..1ea26c17a2e 100644 --- a/docs/reference/search/aggregations/reducer/min-bucket-aggregation.asciidoc +++ b/docs/reference/aggregations/reducer/min-bucket-aggregation.asciidoc @@ -5,6 +5,26 @@ A sibling reducer aggregation which identifies the bucket(s) with the minimum va and outputs both the value and the key(s) of the bucket(s). The specified metric must be numeric and the sibling aggregation must be a multi-bucket aggregation.
+==== Syntax + +A `min_bucket` aggregation looks like this in isolation: + +[source,js] +-------------------------------------------------- +{ + "min_bucket": { + "buckets_path": "the_sum" + } +} +-------------------------------------------------- + +.`min_bucket` Parameters +|=== +|Parameter Name |Description |Required |Default Value +|`buckets_path` |Path to the metric of interest (see <> for more details) |Required | +|=== + + The following snippet calculates the minimum of the total monthly `sales`: [source,js] diff --git a/docs/reference/search/aggregations/reducer/movavg-aggregation.asciidoc b/docs/reference/aggregations/reducer/movavg-aggregation.asciidoc similarity index 93% rename from docs/reference/search/aggregations/reducer/movavg-aggregation.asciidoc rename to docs/reference/aggregations/reducer/movavg-aggregation.asciidoc index 03f6b7e9fa1..18cf98d263d 100644 --- a/docs/reference/search/aggregations/reducer/movavg-aggregation.asciidoc +++ b/docs/reference/aggregations/reducer/movavg-aggregation.asciidoc @@ -35,16 +35,14 @@ A `moving_avg` aggregation looks like this in isolation: .`moving_avg` Parameters |=== -|Parameter Name |Description |Required |Default - -|`buckets_path` |The path to the metric that we wish to calculate a moving average for |Required | +|Parameter Name |Description |Required |Default Value +|`buckets_path` |Path to the metric of interest (see <> for more details) |Required | |`model` |The moving average weighting model that we wish to use |Optional |`simple` |`gap_policy` |Determines what should happen when a gap in the data is encountered. |Optional |`insert_zero` |`window` |The size of window to "slide" across the histogram. |Optional |`5` |`settings` |Model-specific settings, contents which differ depending on the model specified. |Optional | |=== - `moving_avg` aggregations must be embedded inside of a `histogram` or `date_histogram` aggregation.
They can be embedded like any other metric aggregation: @@ -73,27 +71,9 @@ embedded like any other metric aggregation: Moving averages are built by first specifying a `histogram` or `date_histogram` over a field. You can then optionally add normal metrics, such as a `sum`, inside of that histogram. Finally, the `moving_avg` is embedded inside the histogram. -The `buckets_path` parameter is then used to "point" at one of the sibling metrics inside of the histogram. +The `buckets_path` parameter is then used to "point" at one of the sibling metrics inside of the histogram (see +<> for a description of the syntax for `buckets_path`). -A moving average can also be calculated on the document count of each bucket, instead of a metric: - -[source,js] --------------------------------------------------- -{ - "my_date_histo":{ - "date_histogram":{ - "field":"timestamp", - "interval":"day" - }, - "aggs":{ - "the_movavg":{ - "moving_avg":{ "buckets_path": "_count" } <1> - } - } - } -} --------------------------------------------------- -<1> By using `_count` instead of a metric name, we can calculate the moving average of document counts in the histogram ==== Models @@ -250,7 +230,7 @@ image::images/reducers_movavg/double_0.2beta.png[] .Double Exponential moving average with window of size 100, alpha = 0.5, beta = 0.7 image::images/reducers_movavg/double_0.7beta.png[] -=== Prediction +==== Prediction All the moving average model support a "prediction" mode, which will attempt to extrapolate into the future given the current smoothed, moving average. Depending on the model and parameter, these predictions may or may not be accurate.
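+
+As a rough sketch only, and assuming prediction is switched on through a `predict` parameter giving the number of predicted
+values to append (this excerpt does not show the parameter, so the name and the values used here are assumptions), a request
+fragment might look like this:
+
+[source,js]
+--------------------------------------------------
+{
+    "the_movavg":{
+        "moving_avg":{
+            "buckets_path": "the_sum",
+            "window" : 30,
+            "model" : "simple",
+            "predict" : 10
+        }
+    }
+}
+--------------------------------------------------
+
+The `buckets_path`, `window` and `model` parameters are the ones listed in the `moving_avg` parameter table; `the_sum` is
+taken to be a sibling metric inside the same histogram.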
diff --git a/docs/reference/index.asciidoc b/docs/reference/index.asciidoc index 1e63d18a4d2..696fbaa3bca 100644 --- a/docs/reference/index.asciidoc +++ b/docs/reference/index.asciidoc @@ -18,6 +18,8 @@ include::docs.asciidoc[] include::search.asciidoc[] +include::aggregations.asciidoc[] + include::indices.asciidoc[] include::cat.asciidoc[] diff --git a/docs/reference/search.asciidoc b/docs/reference/search.asciidoc index 79d3c7a93fd..b71a0dfe466 100644 --- a/docs/reference/search.asciidoc +++ b/docs/reference/search.asciidoc @@ -85,8 +85,6 @@ include::search/search-template.asciidoc[] include::search/search-shards.asciidoc[] -include::search/aggregations.asciidoc[] - include::search/facets.asciidoc[] include::search/suggesters.asciidoc[] diff --git a/docs/reference/search/aggregations/bucket.asciidoc b/docs/reference/search/aggregations/bucket.asciidoc deleted file mode 100644 index 7d7848fa1a2..00000000000 --- a/docs/reference/search/aggregations/bucket.asciidoc +++ /dev/null @@ -1,33 +0,0 @@ -[[search-aggregations-bucket]] - -include::bucket/global-aggregation.asciidoc[] - -include::bucket/filter-aggregation.asciidoc[] - -include::bucket/filters-aggregation.asciidoc[] - -include::bucket/missing-aggregation.asciidoc[] - -include::bucket/nested-aggregation.asciidoc[] - -include::bucket/reverse-nested-aggregation.asciidoc[] - -include::bucket/children-aggregation.asciidoc[] - -include::bucket/terms-aggregation.asciidoc[] - -include::bucket/significantterms-aggregation.asciidoc[] - -include::bucket/range-aggregation.asciidoc[] - -include::bucket/daterange-aggregation.asciidoc[] - -include::bucket/iprange-aggregation.asciidoc[] - -include::bucket/histogram-aggregation.asciidoc[] - -include::bucket/datehistogram-aggregation.asciidoc[] - -include::bucket/geodistance-aggregation.asciidoc[] - -include::bucket/geohashgrid-aggregation.asciidoc[] diff --git a/docs/reference/search/aggregations/metrics.asciidoc b/docs/reference/search/aggregations/metrics.asciidoc deleted 
file mode 100644 index 7dbbd090bbd..00000000000 --- a/docs/reference/search/aggregations/metrics.asciidoc +++ /dev/null @@ -1,27 +0,0 @@ -[[search-aggregations-metrics]] - -include::metrics/min-aggregation.asciidoc[] - -include::metrics/max-aggregation.asciidoc[] - -include::metrics/sum-aggregation.asciidoc[] - -include::metrics/avg-aggregation.asciidoc[] - -include::metrics/stats-aggregation.asciidoc[] - -include::metrics/extendedstats-aggregation.asciidoc[] - -include::metrics/valuecount-aggregation.asciidoc[] - -include::metrics/percentile-aggregation.asciidoc[] - -include::metrics/percentile-rank-aggregation.asciidoc[] - -include::metrics/cardinality-aggregation.asciidoc[] - -include::metrics/geobounds-aggregation.asciidoc[] - -include::metrics/tophits-aggregation.asciidoc[] - -include::metrics/scripted-metric-aggregation.asciidoc[] diff --git a/docs/reference/search/aggregations/reducer.asciidoc b/docs/reference/search/aggregations/reducer.asciidoc deleted file mode 100644 index a725bc77e38..00000000000 --- a/docs/reference/search/aggregations/reducer.asciidoc +++ /dev/null @@ -1,6 +0,0 @@ -[[search-aggregations-reducer]] - -include::reducer/derivative-aggregation.asciidoc[] -include::reducer/max-bucket-aggregation.asciidoc[] -include::reducer/min-bucket-aggregation.asciidoc[] -include::reducer/movavg-aggregation.asciidoc[] From 967e05ea76fb2b2a134917c9d2eec9c67b60319b Mon Sep 17 00:00:00 2001 From: Zachary Tong Date: Mon, 4 May 2015 09:18:24 -0400 Subject: [PATCH 2/2] [DOCS] Fix section levels for Sampler agg --- .../aggregations/bucket/sampler-aggregation.asciidoc | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/docs/reference/aggregations/bucket/sampler-aggregation.asciidoc b/docs/reference/aggregations/bucket/sampler-aggregation.asciidoc index 5ad9dbc0194..29742709ea0 100644 --- a/docs/reference/aggregations/bucket/sampler-aggregation.asciidoc +++ b/docs/reference/aggregations/bucket/sampler-aggregation.asciidoc @@ -72,7 +72,7 
@@ Response: The `shard_size` parameter limits how many top-scoring documents are collected in the sample processed on each shard. The default value is 100. -=== Controlling diversity +==== Controlling diversity Optionally, you can use the `field` or `script` and `max_docs_per_value` settings to control the maximum number of documents collected on any one shard which share a common value. The choice of value (e.g. `author`) is loaded from a regular `field` or derived dynamically by a `script`. @@ -139,16 +139,16 @@ The default setting is to use `global_ordinals` if this information is available The `bytes_hash` setting may prove faster in some cases but introduces the possibility of false positives in de-duplication logic due to the possibility of hash collisions. Please note that Elasticsearch will ignore the choice of execution hint if it is not applicable and that there is no backward compatibility guarantee on these hints. -=== Limitations +==== Limitations -==== Cannot be nested under `breadth_first` aggregations +===== Cannot be nested under `breadth_first` aggregations Being a quality-based filter the sampler aggregation needs access to the relevance score produced for each document. It therefore cannot be nested under a `terms` aggregation which has the `collect_mode` switched from the default `depth_first` mode to `breadth_first` as this discards scores. In this situation an error will be thrown. -==== Limited de-dup logic. +===== Limited de-dup logic. The de-duplication logic in the diversify settings applies only at a shard level so will not apply across shards. -==== No specialized syntax for geo/date fields +===== No specialized syntax for geo/date fields Currently the syntax for defining the diversifying values is defined by a choice of `field` or `script` - there is no added syntactical sugar for expressing geo or date units such as "1w" (1 week). 
This support may be added in a later release and users will currently have to create these sorts of values using a script. \ No newline at end of file