mirror of https://github.com/apache/druid.git
e28424ea25
Currently compaction with MSQ engine doesn't work for rollup on multi-value dimensions (MVDs), the reason being the default behaviour of grouping on MVD dimensions to unnest the dimension values; for instance grouping on `[s1,s2]` with aggregate `a` will result in two rows: `<s1,a>` and `<s2,a>`. This change enables rollup on MVDs (without unnest) by converting MVDs to Arrays before rollup using virtual columns, and then converting them back to MVDs using post aggregators. If segment schema is available to the compaction task (when it ends up downloading segments to get existing dimensions/metrics/granularity), it selectively does the MVD-Array conversion only for known multi-valued columns; else it conservatively performs this conversion for all `string` columns. |
||
---|---|---|
.. | ||
avro-extensions | ||
azure-extensions | ||
datasketches | ||
druid-aws-rds-extensions | ||
druid-basic-security | ||
druid-bloom-filter | ||
druid-catalog | ||
druid-kerberos | ||
druid-pac4j | ||
druid-ranger-security | ||
ec2-extensions | ||
google-extensions | ||
hdfs-storage | ||
histogram | ||
kafka-extraction-namespace | ||
kafka-indexing-service | ||
kinesis-indexing-service | ||
kubernetes-extensions | ||
lookups-cached-global | ||
lookups-cached-single | ||
multi-stage-query | ||
mysql-metadata-storage | ||
orc-extensions | ||
parquet-extensions | ||
postgresql-metadata-storage | ||
protobuf-extensions | ||
s3-extensions | ||
simple-client-sslcontext | ||
stats | ||
testing-tools |