[role="xpack"] [testenv="basic"] [[sql-functions-aggs]] === Aggregate Functions Functions for computing a _single_ result from a set of input values. {es-sql} supports aggregate functions only alongside <> (implicit or explicit). ==== General Purpose [[sql-functions-aggs-avg]] ===== `AVG` .Synopsis: [source, sql] -------------------------------------------------- AVG(numeric_field<1>) -------------------------------------------------- *Input*: <1> numeric field *Output*: `double` numeric value .Description: Returns the https://en.wikipedia.org/wiki/Arithmetic_mean[Average] (arithmetic mean) of input values. ["source","sql",subs="attributes,macros"] -------------------------------------------------- include-tagged::{sql-specs}/docs.csv-spec[aggAvg] -------------------------------------------------- [[sql-functions-aggs-count]] ===== `COUNT` .Synopsis: [source, sql] -------------------------------------------------- COUNT(expression<1>) -------------------------------------------------- *Input*: <1> a field name, wildcard (`*`) or any numeric value *Output*: numeric value .Description: Returns the total number (count) of input values. In case of `COUNT(*)` or `COUNT()`, _all_ values are considered (including `null` or missing ones). In case of `COUNT()` `null` values are not considered. ["source","sql",subs="attributes,macros"] -------------------------------------------------- include-tagged::{sql-specs}/docs.csv-spec[aggCountStar] -------------------------------------------------- [[sql-functions-aggs-count-all]] ===== `COUNT(ALL)` .Synopsis: [source, sql] -------------------------------------------------- COUNT(ALL field_name<1>) -------------------------------------------------- *Input*: <1> a field name *Output*: numeric value .Description: Returns the total number (count) of all _non-null_ input values. `COUNT()` and `COUNT(ALL )` are equivalent. ["source","sql",subs="attributes,macros"] -------------------------------------------------- include-tagged::{sql-specs}/docs.csv-spec[aggCountAll] -------------------------------------------------- [[sql-functions-aggs-count-distinct]] ===== `COUNT(DISTINCT)` .Synopsis: [source, sql] -------------------------------------------------- COUNT(DISTINCT field_name<1>) -------------------------------------------------- *Input*: <1> a field name *Output*: numeric value .Description: Returns the total number of _distinct non-null_ values in input values. ["source","sql",subs="attributes,macros"] -------------------------------------------------- include-tagged::{sql-specs}/docs.csv-spec[aggCountDistinct] -------------------------------------------------- [[sql-functions-aggs-first]] ===== `FIRST/FIRST_VALUE` .Synopsis: [source, sql] ---------------------------------------------- FIRST(field_name<1>[, ordering_field_name]<2>) ---------------------------------------------- *Input*: <1> target field for the aggregation <2> optional field used for ordering *Output*: same type as the input .Description: Returns the first **non-NULL** value (if such exists) of the `field_name` input column sorted by the `ordering_field_name` column. If `ordering_field_name` is not provided, only the `field_name` column is used for the sorting. E.g.: [cols="<,<"] |=== s| a | b | 100 | 1 | 200 | 1 | 1 | 2 | 2 | 2 | 10 | null | 20 | null | null | null |=== [source, sql] ---------------------- SELECT FIRST(a) FROM t ---------------------- will result in: [cols="<"] |=== s| FIRST(a) | 1 |=== and [source, sql] ------------------------- SELECT FIRST(a, b) FROM t ------------------------- will result in: [cols="<"] |=== s| FIRST(a, b) | 100 |=== ["source","sql",subs="attributes,macros"] ----------------------------------------------------------- include-tagged::{sql-specs}/docs.csv-spec[firstWithOneArg] ----------------------------------------------------------- ["source","sql",subs="attributes,macros"] -------------------------------------------------------------------- include-tagged::{sql-specs}/docs.csv-spec[firstWithOneArgAndGroupBy] -------------------------------------------------------------------- ["source","sql",subs="attributes,macros"] ----------------------------------------------------------- include-tagged::{sql-specs}/docs.csv-spec[firstWithTwoArgs] ----------------------------------------------------------- ["source","sql",subs="attributes,macros"] --------------------------------------------------------------------- include-tagged::{sql-specs}/docs.csv-spec[firstWithTwoArgsAndGroupBy] --------------------------------------------------------------------- `FIRST_VALUE` is a name alias and can be used instead of `FIRST`, e.g.: ["source","sql",subs="attributes,macros"] -------------------------------------------------------------------------- include-tagged::{sql-specs}/docs.csv-spec[firstValueWithTwoArgsAndGroupBy] -------------------------------------------------------------------------- [NOTE] `FIRST` cannot be used in a HAVING clause. [NOTE] `FIRST` cannot be used with columns of type <> unless the field is also <>. [[sql-functions-aggs-last]] ===== `LAST/LAST_VALUE` .Synopsis: [source, sql] -------------------------------------------------- LAST(field_name<1>[, ordering_field_name]<2>) -------------------------------------------------- *Input*: <1> target field for the aggregation <2> optional field used for ordering *Output*: same type as the input .Description: It's the inverse of <>. Returns the last **non-NULL** value (if such exists) of the `field_name`input column sorted descending by the `ordering_field_name` column. If `ordering_field_name` is not provided, only the `field_name` column is used for the sorting. E.g.: [cols="<,<"] |=== s| a | b | 10 | 1 | 20 | 1 | 1 | 2 | 2 | 2 | 100 | null | 200 | null | null | null |=== [source, sql] ------------------------ SELECT LAST(a) FROM t ------------------------ will result in: [cols="<"] |=== s| LAST(a) | 200 |=== and [source, sql] ------------------------ SELECT LAST(a, b) FROM t ------------------------ will result in: [cols="<"] |=== s| LAST(a, b) | 2 |=== ["source","sql",subs="attributes,macros"] ----------------------------------------------------------- include-tagged::{sql-specs}/docs.csv-spec[lastWithOneArg] ----------------------------------------------------------- ["source","sql",subs="attributes,macros"] ------------------------------------------------------------------- include-tagged::{sql-specs}/docs.csv-spec[lastWithOneArgAndGroupBy] ------------------------------------------------------------------- ["source","sql",subs="attributes,macros"] ----------------------------------------------------------- include-tagged::{sql-specs}/docs.csv-spec[lastWithTwoArgs] ----------------------------------------------------------- ["source","sql",subs="attributes,macros"] -------------------------------------------------------------------- include-tagged::{sql-specs}/docs.csv-spec[lastWithTwoArgsAndGroupBy] -------------------------------------------------------------------- `LAST_VALUE` is a name alias and can be used instead of `LAST`, e.g.: ["source","sql",subs="attributes,macros"] ------------------------------------------------------------------------- include-tagged::{sql-specs}/docs.csv-spec[lastValueWithTwoArgsAndGroupBy] ------------------------------------------------------------------------- [NOTE] `LAST` cannot be used in `HAVING` clause. [NOTE] `LAST` cannot be used with columns of type <> unless the field is also <>. [[sql-functions-aggs-max]] ===== `MAX` .Synopsis: [source, sql] -------------------------------------------------- MAX(field_name<1>) -------------------------------------------------- *Input*: <1> a numeric field *Output*: same type as the input .Description: Returns the maximum value across input values in the field `field_name`. ["source","sql",subs="attributes,macros"] -------------------------------------------------- include-tagged::{sql-specs}/docs.csv-spec[aggMax] -------------------------------------------------- [NOTE] `MAX` on a field of type <> or <> is translated into <> and therefore, it cannot be used in `HAVING` clause. [[sql-functions-aggs-min]] ===== `MIN` .Synopsis: [source, sql] -------------------------------------------------- MIN(field_name<1>) -------------------------------------------------- *Input*: <1> a numeric field *Output*: same type as the input .Description: Returns the minimum value across input values in the field `field_name`. ["source","sql",subs="attributes,macros"] -------------------------------------------------- include-tagged::{sql-specs}/docs.csv-spec[aggMin] -------------------------------------------------- [NOTE] `MIN` on a field of type <> or <> is translated into <> and therefore, it cannot be used in `HAVING` clause. [[sql-functions-aggs-sum]] ===== `SUM` .Synopsis: [source, sql] -------------------------------------------------- SUM(field_name<1>) -------------------------------------------------- *Input*: <1> a numeric field *Output*: `bigint` for integer input, `double` for floating points .Description: Returns the sum of input values in the field `field_name`. ["source","sql",subs="attributes,macros"] -------------------------------------------------- include-tagged::{sql-specs}/docs.csv-spec[aggSum] -------------------------------------------------- ==== Statistics [[sql-functions-aggs-kurtosis]] ===== `KURTOSIS` .Synopsis: [source, sql] -------------------------------------------------- KURTOSIS(field_name<1>) -------------------------------------------------- *Input*: <1> a numeric field *Output*: `double` numeric value .Description: https://en.wikipedia.org/wiki/Kurtosis[Quantify] the shape of the distribution of input values in the field `field_name`. ["source","sql",subs="attributes,macros"] -------------------------------------------------- include-tagged::{sql-specs}/docs.csv-spec[aggKurtosis] -------------------------------------------------- [[sql-functions-aggs-percentile]] ===== `PERCENTILE` .Synopsis: [source, sql] -------------------------------------------------- PERCENTILE(field_name<1>, numeric_exp<2>) -------------------------------------------------- *Input*: <1> a numeric field <2> a numeric expression (must be a constant and not based on a field) *Output*: `double` numeric value .Description: Returns the nth https://en.wikipedia.org/wiki/Percentile[percentile] (represented by `numeric_exp` parameter) of input values in the field `field_name`. ["source","sql",subs="attributes,macros"] -------------------------------------------------- include-tagged::{sql-specs}/docs.csv-spec[aggPercentile] -------------------------------------------------- [[sql-functions-aggs-percentile-rank]] ===== `PERCENTILE_RANK` .Synopsis: [source, sql] -------------------------------------------------- PERCENTILE_RANK(field_name<1>, numeric_exp<2>) -------------------------------------------------- *Input*: <1> a numeric field <2> a numeric expression (must be a constant and not based on a field) *Output*: `double` numeric value .Description: Returns the nth https://en.wikipedia.org/wiki/Percentile_rank[percentile rank] (represented by `numeric_exp` parameter) of input values in the field `field_name`. ["source","sql",subs="attributes,macros"] -------------------------------------------------- include-tagged::{sql-specs}/docs.csv-spec[aggPercentileRank] -------------------------------------------------- [[sql-functions-aggs-skewness]] ===== `SKEWNESS` .Synopsis: [source, sql] -------------------------------------------------- SKEWNESS(field_name<1>) -------------------------------------------------- *Input*: <1> a numeric field *Output*: `double` numeric value .Description: https://en.wikipedia.org/wiki/Skewness[Quantify] the asymmetric distribution of input values in the field `field_name`. ["source","sql",subs="attributes,macros"] -------------------------------------------------- include-tagged::{sql-specs}/docs.csv-spec[aggSkewness] -------------------------------------------------- [[sql-functions-aggs-stddev-pop]] ===== `STDDEV_POP` .Synopsis: [source, sql] -------------------------------------------------- STDDEV_POP(field_name<1>) -------------------------------------------------- *Input*: <1> a numeric field *Output*: `double` numeric value .Description: Returns the https://en.wikipedia.org/wiki/Standard_deviations[population standard deviation] of input values in the field `field_name`. ["source","sql",subs="attributes,macros"] -------------------------------------------------- include-tagged::{sql-specs}/docs.csv-spec[aggStddevPop] -------------------------------------------------- [[sql-functions-aggs-sum-squares]] ===== `SUM_OF_SQUARES` .Synopsis: [source, sql] -------------------------------------------------- SUM_OF_SQUARES(field_name<1>) -------------------------------------------------- *Input*: <1> a numeric field *Output*: `double` numeric value .Description: Returns the https://en.wikipedia.org/wiki/Total_sum_of_squares[sum of squares] of input values in the field `field_name`. ["source","sql",subs="attributes,macros"] -------------------------------------------------- include-tagged::{sql-specs}/docs.csv-spec[aggSumOfSquares] -------------------------------------------------- [[sql-functions-aggs-var-pop]] ===== `VAR_POP` .Synopsis: [source, sql] -------------------------------------------------- VAR_POP(field_name<1>) -------------------------------------------------- *Input*: <1> a numeric field *Output*: `double` numeric value .Description: Returns the https://en.wikipedia.org/wiki/Variance[population variance] of input values in the field `field_name`. ["source","sql",subs="attributes,macros"] -------------------------------------------------- include-tagged::{sql-specs}/docs.csv-spec[aggVarPop] --------------------------------------------------