mirror of https://github.com/apache/druid.git
fbd305af0f
Currently, if a query contains a window function with PARTITION BY xyz, and xyz has a million unique values with one row each, we end up creating a million single-row RACs for processing. This is unnecessary: we can batch the rows for multiple PARTITION BY keys together, and process a batch only once adding further rows would violate the maxRowsMaterialized config. An earlier iteration of this PR instead simplified WindowOperatorQueryFrameProcessor to run all operators over all rows rather than creating smaller per-partition RACs; that approach was discarded in favor of the batching approach, with the details summarized in #16823 (comment).
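The batching idea can be sketched as below. This is a minimal, hypothetical Java sketch, not the actual WindowOperatorQueryFrameProcessor code: the `PartitionBatcher` class, its method names, and the `Consumer` callback are illustrative assumptions; only the `maxRowsMaterialized` limit corresponds to the real config.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/**
 * Illustrative sketch only (not Druid's actual processor API): accumulates
 * consecutive PARTITION BY groups into one batch, flushing only when adding
 * the next group would push the batch past maxRowsMaterialized.
 */
class PartitionBatcher<Row>
{
  private final int maxRowsMaterialized;
  private final Consumer<List<List<Row>>> processBatch; // runs the window operators over one batch
  private final List<List<Row>> batch = new ArrayList<>();
  private int batchedRows = 0;

  PartitionBatcher(int maxRowsMaterialized, Consumer<List<List<Row>>> processBatch)
  {
    this.maxRowsMaterialized = maxRowsMaterialized;
    this.processBatch = processBatch;
  }

  /** Called once per PARTITION BY group, in partition order. */
  void addPartition(List<Row> partitionRows)
  {
    // Flush the current batch if this partition no longer fits under the limit.
    if (batchedRows > 0 && batchedRows + partitionRows.size() > maxRowsMaterialized) {
      flush();
    }
    batch.add(partitionRows);
    batchedRows += partitionRows.size();
  }

  /** Processes whatever is currently batched, e.g. at end of input. */
  void flush()
  {
    if (!batch.isEmpty()) {
      processBatch.accept(new ArrayList<>(batch));
      batch.clear();
      batchedRows = 0;
    }
  }
}
```

In this sketch a single partition larger than the limit is still emitted as one oversized batch, since a partition's rows cannot be split across batches for window processing; how the real processor handles that case is governed by its own maxRowsMaterialized enforcement.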
avro-extensions
azure-extensions
datasketches
druid-aws-rds-extensions
druid-basic-security
druid-bloom-filter
druid-catalog
druid-kerberos
druid-pac4j
druid-ranger-security
ec2-extensions
google-extensions
hdfs-storage
histogram
kafka-extraction-namespace
kafka-indexing-service
kinesis-indexing-service
kubernetes-extensions
lookups-cached-global
lookups-cached-single
multi-stage-query
mysql-metadata-storage
orc-extensions
parquet-extensions
postgresql-metadata-storage
protobuf-extensions
s3-extensions
simple-client-sslcontext
stats
testing-tools