Clarified the behaviour of SQL COUNT(DISTINCT dim) on multi-value dimensions (#13128)

* Clarified the behaviour of COUNT(DISTINCT column) on multi-value columns

* Update docs/querying/sql-aggregations.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

Co-authored-by: Vadim Ogievetsky <vadimon@gmail.com>
hosswald 2022-09-21 03:03:34 +02:00 committed by GitHub
parent edc444a4bc
commit 5ed5c83aab
1 changed file with 1 addition and 1 deletion

@@ -66,7 +66,7 @@ In the aggregation functions supported by Druid, only `COUNT`, `ARRAY_AGG`, and
 |Function|Notes|Default|
 |--------|-----|-------|
 |`COUNT(*)`|Counts the number of rows.|`0`|
-|`COUNT(DISTINCT expr)`|Counts distinct values of `expr`.<br /><br />When `useApproximateCountDistinct` is set to "true" (the default), this is an alias for `APPROX_COUNT_DISTINCT`. The specific algorithm depends on the value of [`druid.sql.approxCountDistinct.function`](../configuration/index.md#sql). In this mode, you can use strings, numbers, or prebuilt sketches. If counting prebuilt sketches, the prebuilt sketch type must match the selected algorithm.<br /><br />When `useApproximateCountDistinct` is set to "false", the computation will be exact. In this case, `expr` must be string or numeric, since exact counts are not possible using prebuilt sketches. In exact mode, only one distinct count per query is permitted unless `useGroupingSetForExactDistinct` is enabled.|`0`|
+|`COUNT(DISTINCT expr)`|Counts distinct values of `expr`.<br /><br />When `useApproximateCountDistinct` is set to "true" (the default), this is an alias for `APPROX_COUNT_DISTINCT`. The specific algorithm depends on the value of [`druid.sql.approxCountDistinct.function`](../configuration/index.md#sql). In this mode, you can use strings, numbers, or prebuilt sketches. If counting prebuilt sketches, the prebuilt sketch type must match the selected algorithm.<br /><br />When `useApproximateCountDistinct` is set to "false", the computation will be exact. In this case, `expr` must be string or numeric, since exact counts are not possible using prebuilt sketches. In exact mode, only one distinct count per query is permitted unless `useGroupingSetForExactDistinct` is enabled.<br /><br />Counts each distinct value in a [`multi-value`](../querying/multi-value-dimensions.md)-row separately.|`0`|
 |`SUM(expr)`|Sums numbers.|`null` if `druid.generic.useDefaultValueForNull=false`, otherwise `0`|
 |`MIN(expr)`|Takes the minimum of numbers.|`null` if `druid.generic.useDefaultValueForNull=false`, otherwise `9223372036854775807` (maximum LONG value)|
 |`MAX(expr)`|Takes the maximum of numbers.|`null` if `druid.generic.useDefaultValueForNull=false`, otherwise `-9223372036854775808` (minimum LONG value)|
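
The sentence added to the `COUNT(DISTINCT expr)` row can be illustrated with a small model. The following is a plain-Python sketch of the described semantics, not Druid code: the `rows` data, the `tags` dimension name, and both helper functions are hypothetical and exist only for illustration. It models exact counting; with `useApproximateCountDistinct` enabled, Druid may return an estimate instead.

```python
# Hypothetical data: each row's "tags" dimension holds several values.
rows = [
    {"tags": ["a", "b"]},
    {"tags": ["a", "b"]},
    {"tags": ["b", "c"]},
]


def count_distinct_values(rows, dim):
    """Count distinct values of `dim`, treating every value inside a
    multi-value row as a separate value (the behaviour the docs describe)."""
    seen = set()
    for row in rows:
        seen.update(row[dim])
    return len(seen)


def count_distinct_whole_rows(rows, dim):
    """For contrast only: count distinct rows, treating each row's full
    list of values as one unit. This is NOT what the docs describe."""
    return len({tuple(row[dim]) for row in rows})


print(count_distinct_values(rows, "tags"))      # distinct values: a, b, c
print(count_distinct_whole_rows(rows, "tags"))  # distinct lists: [a, b], [b, c]
```

In this model the first function returns 3 and the second returns 2, which is the difference the added documentation sentence is meant to make explicit.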