2018-12-13 14:47:20 -05:00
---
2019-08-21 00:48:59 -04:00
id: post-aggregations
2020-04-15 19:12:20 -04:00
title: "Post-aggregations"
2018-12-13 14:47:20 -05:00
---
2018-11-13 12:38:37 -05:00
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
2023-08-16 22:01:21 -04:00
:::info
Apache Druid supports two query languages: [Druid SQL ](sql.md ) and [native queries ](querying.md ).
This document describes the native
language. For information about functions available in SQL, refer to the
[SQL documentation ](sql-aggregations.md ).
:::
2018-12-12 23:42:12 -05:00
2020-01-03 12:33:19 -05:00
Post-aggregations are specifications of processing that should happen on aggregated values as they come out of Apache Druid. If you include a post aggregation as part of a query, make sure to include all aggregators the post-aggregator requires.
2015-05-05 17:07:32 -04:00
There are several post-aggregators available.
### Arithmetic post-aggregator
The arithmetic post-aggregator applies the provided function to the given
fields from left to right. The fields can be aggregators or other post aggregators.
2023-11-02 03:31:37 -04:00
| Property | Description | Required |
| --- | --- | --- |
| `type` | Must be `"arithmetic"` . | Yes |
| `name` | Output name of the post-aggregation | Yes |
| `fn` | Supported functions are `+` , `-` , `*` , `/` , `pow` and `quotient` | Yes |
| `fields` | List of post-aggregator specs which define inputs to the `fn` | Yes |
| `ordering` | If no ordering (or `null` ) is specified, the default floating point ordering is used. `numericFirst` ordering always returns finite values first, followed by `NaN` , and infinite values last. | No |
2015-05-05 17:07:32 -04:00
2023-11-02 03:31:37 -04:00
**Note**:
2015-05-05 17:07:32 -04:00
* `/` division always returns `0` if dividing by`0`, regardless of the numerator.
* `quotient` division behaves like regular floating point division
2022-04-18 22:36:19 -04:00
* Arithmetic post-aggregators always use floating point arithmetic.
2015-05-05 17:07:32 -04:00
2023-11-02 03:31:37 -04:00
Example:
2015-05-05 17:07:32 -04:00
```json
2023-11-02 03:31:37 -04:00
{
2015-05-05 17:07:32 -04:00
"type" : "arithmetic",
2023-11-02 03:31:37 -04:00
"name" : "mult",
"fn" : "*",
"fields": [
{"type": "fieldAccess", "fieldName": "someAgg"},
{"type": "fieldAccess", "fieldName": "someOtherAgg"}
]
2015-05-05 17:07:32 -04:00
}
```
2017-08-31 14:45:49 -04:00
### Field accessor post-aggregators
2015-05-05 17:07:32 -04:00
2023-11-02 03:31:37 -04:00
These post-aggregators return the value produced by the specified [dimension ](../querying/dimensionspecs.md ) or [aggregator ](../querying/aggregations.md ).
2015-05-05 17:07:32 -04:00
2023-11-02 03:31:37 -04:00
| Property | Description | Required |
| --- | --- | --- |
| `type` | Must be `"fieldAccess"` or `"finalizingFieldAccess"` . Use type `"fieldAccess"` to return the raw aggregation object, or use type `"finalizingFieldAccess"` to return a finalized value, such as an estimated cardinality. | Yes |
| `name` | Output name of the post-aggregation | Yes if defined as a standalone post-aggregation, but may be omitted if used inline to some other post-aggregator in a `fields` list |
| `fieldName` | The output name of the dimension or aggregator to reference | Yes |
Example:
2015-05-05 17:07:32 -04:00
```json
2023-11-02 03:31:37 -04:00
{ "type" : "fieldAccess", "name": "someField", "fieldName" : "someAggregator" }
2015-05-05 17:07:32 -04:00
```
2017-08-31 14:45:49 -04:00
or
```json
2023-11-02 03:31:37 -04:00
{ "type" : "finalizingFieldAccess", "name": "someFinalizedField", "fieldName" : "someAggregator" }
2017-08-31 14:45:49 -04:00
```
2015-05-05 17:07:32 -04:00
### Constant post-aggregator
The constant post-aggregator always returns the specified value.
2023-11-02 03:31:37 -04:00
| Property | Description | Required |
| --- | --- | --- |
| `type` | Must be `"constant"` | Yes |
| `name` | Output name of the post-aggregation | Yes |
| `value` | The constant value | Yes |
Example:
2015-05-05 17:07:32 -04:00
```json
2023-11-02 03:31:37 -04:00
{ "type" : "constant", "name" : "someConstant", "value" : 1234 }
2015-05-05 17:07:32 -04:00
```
2022-04-18 22:36:19 -04:00
### Expression post-aggregator
2023-05-19 12:42:27 -04:00
The expression post-aggregator is defined using a Druid [expression ](math-expr.md ).
2022-04-18 22:36:19 -04:00
2023-11-02 03:31:37 -04:00
| Property | Description | Required |
| --- | --- | --- |
| `type` | Must be `"expression"` | Yes |
| `name` | Output name of the post-aggregation | Yes |
| `expression` | Native Druid [expression ](math-expr.md ) to compute, may refer to any dimension or aggregator output names | Yes |
| `ordering` | If no ordering (or `null` ) is specified, the "natural" ordering is used. `numericFirst` ordering always returns finite values first, followed by `NaN` , and infinite values last. If the expression produces array or complex types, specify `ordering` as null and use `outputType` instead to use the correct type native ordering. | No |
| `outputType` | Output type is optional, and can be any native Druid type: `LONG` , `FLOAT` , `DOUBLE` , `STRING` , `ARRAY` types (e.g. `ARRAY<LONG>` ), or `COMPLEX` types (e.g. `COMPLEX<json>` ). If not specified, the output type will be inferred from the `expression` . If specified and `ordering` is null, the type native ordering will be used for sorting values. If the expression produces array or complex types, this value must be non-null to ensure the correct ordering is used. If `outputType` does not match the actual output type of the `expression` , the value will be attempted to coerced to the specified type, possibly failing if coercion is not possible. | No |
Example:
2022-04-18 22:36:19 -04:00
```json
{
"type": "expression",
2023-11-02 03:31:37 -04:00
"name": "someExpression",
"expression": "someAgg + someOtherAgg",
"ordering": null,
"outputType": "LONG"
2022-04-18 22:36:19 -04:00
}
```
2016-12-07 20:58:23 -05:00
### Greatest / Least post-aggregators
`doubleGreatest` and `longGreatest` computes the maximum of all fields and Double.NEGATIVE_INFINITY.
`doubleLeast` and `longLeast` computes the minimum of all fields and Double.POSITIVE_INFINITY.
2023-11-02 03:31:37 -04:00
| Property | Description | Required |
| --- | --- | --- |
| `type` | Must be `"doubleGreatest"` , `"doubleLeast"` , `"longGreatest"` , or `"longLeast"` . | Yes |
| `name` | Output name of the post-aggregation | Yes |
| `fields` | List of post-aggregator specs which define inputs to the greatest or least function | Yes |
2016-12-07 20:58:23 -05:00
The difference between the `doubleMax` aggregator and the `doubleGreatest` post-aggregator is that `doubleMax` returns the highest value of
all rows for one specific column while `doubleGreatest` returns the highest value of multiple columns in one row. These are similar to the
2021-08-18 11:37:05 -04:00
SQL `MAX` and `GREATEST` functions.
2016-12-07 20:58:23 -05:00
Example:
```json
{
"type" : "doubleGreatest",
2023-11-02 03:31:37 -04:00
"name" : "theGreatest",
"fields": [
{ "type": "fieldAccess", "fieldName": "someAgg" },
{ "type": "fieldAccess", "fieldName": "someOtherAgg" }
]
2016-12-07 20:58:23 -05:00
}
```
2015-05-05 17:07:32 -04:00
### JavaScript post-aggregator
Applies the provided JavaScript function to the given fields. Fields are passed as arguments to the JavaScript function in the given order.
2023-11-02 03:31:37 -04:00
| Property | Description | Required |
| --- | --- | --- |
| `type` | Must be `"javascript"` | Yes |
| `name` | Output name of the post-aggregation | Yes |
| `fieldNames` | List of input dimension or aggregator output names | Yes |
| `function` | String javascript function which accepts `fieldNames` as arguments | Yes |
2015-05-05 17:07:32 -04:00
2023-11-02 03:31:37 -04:00
Example:
2015-05-05 17:07:32 -04:00
```json
{
"type": "javascript",
2023-11-02 03:31:37 -04:00
"name": "someJavascript",
"fieldNames" : ["someAgg", "someOtherAgg"],
"function": "function(someAgg, someOtherAgg) { return 100 * Math.abs(someAgg) / someOtherAgg;"
2015-05-05 17:07:32 -04:00
}
```
2016-09-13 16:46:55 -04:00
2023-08-16 22:01:21 -04:00
:::info
JavaScript-based functionality is disabled by default. Please refer to the Druid [JavaScript programming guide ](../development/javascript.md ) for guidelines about using Druid's JavaScript functionality, including instructions on how to enable it.
:::
2016-09-13 16:46:55 -04:00
2015-05-05 17:07:32 -04:00
### HyperUnique Cardinality post-aggregator
The hyperUniqueCardinality post aggregator is used to wrap a hyperUnique object such that it can be used in post aggregations.
2023-11-02 03:31:37 -04:00
| Property | Description | Required |
| --- | --- | --- |
| `type` | Must be `"hyperUniqueCardinality"` | Yes |
| `name` | Output name of the post-aggregation | Yes |
| `fieldName` | The output name of a [`hyperUnique` aggregator ](aggregations.md#cardinality-hyperunique ) | Yes |
2015-05-05 17:07:32 -04:00
```json
2017-08-28 17:52:11 -04:00
{
"type" : "hyperUniqueCardinality",
2023-11-02 03:31:37 -04:00
"name": "someCardinality",
"fieldName" : "someHyperunique"
2017-08-28 17:52:11 -04:00
}
2015-05-05 17:07:32 -04:00
```
It can be used in a sample calculation as so:
```json
2023-11-02 03:31:37 -04:00
{
...
2015-05-05 17:07:32 -04:00
"aggregations" : [{
{"type" : "count", "name" : "rows"},
{"type" : "hyperUnique", "name" : "unique_users", "fieldName" : "uniques"}
}],
2015-07-02 14:37:23 -04:00
"postAggregations" : [{
2015-05-05 17:07:32 -04:00
"type" : "arithmetic",
"name" : "average_users_per_row",
"fn" : "/",
"fields" : [
{ "type" : "hyperUniqueCardinality", "fieldName" : "unique_users" },
{ "type" : "fieldAccess", "name" : "rows", "fieldName" : "rows" }
]
2015-07-02 14:37:23 -04:00
}]
2023-11-02 03:31:37 -04:00
...
2015-05-05 17:07:32 -04:00
```
2017-08-28 17:52:11 -04:00
This post-aggregator will inherit the rounding behavior of the aggregator it references. Note that this inheritance
is only effective if you directly reference an aggregator. Going through another post-aggregator, for example, will
cause the user-specified rounding behavior to get lost and default to "no rounding".
2016-02-04 14:53:09 -05:00
## Example Usage
2015-05-05 17:07:32 -04:00
In this example, let’ s calculate a simple percentage using post aggregators. Let’ s imagine our data set has a metric called "total".
The format of the query JSON is as follows:
```json
{
...
"aggregations" : [
{ "type" : "count", "name" : "rows" },
{ "type" : "doubleSum", "name" : "tot", "fieldName" : "total" }
],
2015-07-02 14:37:23 -04:00
"postAggregations" : [{
2015-05-05 17:07:32 -04:00
"type" : "arithmetic",
"name" : "average",
2018-02-06 19:26:39 -05:00
"fn" : "/",
"fields" : [
{ "type" : "fieldAccess", "name" : "tot", "fieldName" : "tot" },
{ "type" : "fieldAccess", "name" : "rows", "fieldName" : "rows" }
]
}]
...
}
```
```json
{
...
"aggregations" : [
{ "type" : "doubleSum", "name" : "tot", "fieldName" : "total" },
{ "type" : "doubleSum", "name" : "part", "fieldName" : "part" }
],
"postAggregations" : [{
"type" : "arithmetic",
"name" : "part_percentage",
2015-05-05 17:07:32 -04:00
"fn" : "*",
"fields" : [
{ "type" : "arithmetic",
2018-02-06 19:26:39 -05:00
"name" : "ratio",
2015-05-05 17:07:32 -04:00
"fn" : "/",
"fields" : [
2018-02-06 19:26:39 -05:00
{ "type" : "fieldAccess", "name" : "part", "fieldName" : "part" },
{ "type" : "fieldAccess", "name" : "tot", "fieldName" : "tot" }
2015-05-05 17:07:32 -04:00
]
},
{ "type" : "constant", "name": "const", "value" : 100 }
]
2015-07-02 14:37:23 -04:00
}]
2015-05-05 17:07:32 -04:00
...
}
```
2022-04-18 22:36:19 -04:00
The same could be computed using an expression post-aggregator:
```json
{
...
"aggregations" : [
{ "type" : "doubleSum", "name" : "tot", "fieldName" : "total" },
{ "type" : "doubleSum", "name" : "part", "fieldName" : "part" }
],
"postAggregations" : [{
"type" : "expression",
"name" : "part_percentage",
"expression" : "100 * (part / tot)"
}]
...
}
```