mirror of https://github.com/apache/druid.git
67f45fa7bf
There is a problem with Quantiles sketches and KLL Quantiles sketches. Queries using the histogram post-aggregator fail if:

- the sketch contains at least one value, and
- the values in the sketch are all equal, and
- the splitPoints argument is not passed to the post-aggregator, and
- the numBins argument is greater than 2 (or not specified, which leads to the default of 10 being used)

In that case, the query fails and returns this error:

    {
      "error": "Unknown exception",
      "errorClass": "org.apache.datasketches.common.SketchesArgumentException",
      "host": null,
      "errorCode": "legacyQueryException",
      "persona": "OPERATOR",
      "category": "RUNTIME_FAILURE",
      "errorMessage": "Values must be unique, monotonically increasing and not NaN.",
      "context": {
        "host": null,
        "errorClass": "org.apache.datasketches.common.SketchesArgumentException",
        "legacyErrorCode": "Unknown exception"
      }
    }

This behaviour is undesirable, since the caller doesn't necessarily know in advance whether the sketch has values that are diverse enough.

With this change, the post-aggregators return [N, 0, 0...] instead of crashing, where N is the number of values in the sketch, and the length of the list is equal to numBins. That is what they already returned for numBins = 2.

Here is an example of a query that would fail:

    {
      "queryType": "timeseries",
      "dataSource": {
        "type": "inline",
        "columnNames": ["foo", "bar"],
        "rows": [
          ["abc", 42.0],
          ["def", 42.0]
        ]
      },
      "intervals": ["0000/3000"],
      "granularity": "all",
      "aggregations": [
        {"name": "the_sketch", "fieldName": "bar", "type": "quantilesDoublesSketch"}
      ],
      "postAggregations": [
        {
          "name": "the_histogram",
          "type": "quantilesDoublesSketchToHistogram",
          "field": {"type": "fieldAccess", "fieldName": "the_sketch"},
          "numBins": 3
        }
      ]
    }

I believe this also fixes issue #10585.

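
The degenerate case described in the commit message can be illustrated with a small, exact (non-sketch) model. This is only a sketch under stated assumptions, not the Druid or DataSketches implementation: it assumes that, when splitPoints is omitted, the bins are equal-width between the minimum and maximum value, so that identical values yield identical split points. The function names `split_points` and `equal_width_histogram` are hypothetical and exist only for this illustration.

```python
def split_points(lo, hi, num_bins):
    """Hypothetical helper: num_bins - 1 equally spaced points between lo and hi."""
    delta = (hi - lo) / num_bins
    return [lo + delta * (i + 1) for i in range(num_bins - 1)]


def equal_width_histogram(values, num_bins=10):
    """Count values into num_bins equal-width bins (exact model, not a sketch).

    Assumes a non-empty input; the empty case is deliberately out of scope.
    """
    if num_bins < 2:
        raise ValueError("num_bins must be at least 2")
    if not values:
        raise ValueError("this illustration assumes at least one value")

    lo, hi = min(values), max(values)
    n = len(values)

    # Degenerate case: all values are equal, so every split point would be
    # the same number. Return [N, 0, 0, ...] instead of failing.
    if lo == hi:
        return [n] + [0] * (num_bins - 1)

    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for v in values:
        # Clamp so that the maximum value falls into the last bin.
        idx = min(int((v - lo) / width), num_bins - 1)
        counts[idx] += 1
    return counts


# With all values equal and numBins = 3, the two split points coincide,
# which is the kind of input that is not strictly increasing.
print(split_points(42.0, 42.0, 3))              # -> [42.0, 42.0]
print(equal_width_histogram([42.0, 42.0], 3))   # -> [2, 0, 0]
```

With numBins = 2 there is only a single split point, so under this model no ordering constraint between split points can be violated, which is consistent with the commit message's remark that numBins = 2 already returned [N, 0].
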
||
---|---|---|
avro-extensions | ||
azure-extensions | ||
datasketches | ||
druid-aws-rds-extensions | ||
druid-basic-security | ||
druid-bloom-filter | ||
druid-catalog | ||
druid-kerberos | ||
druid-pac4j | ||
druid-ranger-security | ||
ec2-extensions | ||
google-extensions | ||
hdfs-storage | ||
histogram | ||
kafka-extraction-namespace | ||
kafka-indexing-service | ||
kinesis-indexing-service | ||
kubernetes-extensions | ||
lookups-cached-global | ||
lookups-cached-single | ||
multi-stage-query | ||
mysql-metadata-storage | ||
orc-extensions | ||
parquet-extensions | ||
postgresql-metadata-storage | ||
protobuf-extensions | ||
s3-extensions | ||
simple-client-sslcontext | ||
stats | ||
testing-tools |