A recent upgrade of ranger introduced CVE regressions due to outdated elasticsearch components.
Druid-ranger-plugin does not use elasticsearch components, so they have been explicitly removed.
Update woodstox-core to 6.4.0 to address GHSA-3f7h-mf4q-vrm4.
Upgrade Jackson to version 2.12.7.1 to address CVE-2022-42003 and CVE-2022-42004, which affect jackson-databind.
Upgrade com.google.code.gson:gson from 2.2.4 to the latest version (2.10.1) since 2.2.4 is affected by CVE-2022-25647.
Updates ARRAY_OVERLAP to use the same ArrayContainsElement filter added in #15366 when filtering ARRAY typed columns so that it can also use indexes like ARRAY_CONTAINS.
This PR revives #14978 with a few more bells and whistles. Instead of an unconditional cross-join, we now split the join condition so that some sub-conditions are evaluated post-join. To decide which sub-condition goes where, I have refactored the DruidJoinRule class to extract unsupported sub-conditions. We build a postJoinFilter out of these unsupported sub-conditions and push it to the join.
* update confluent's dependencies to common, supported version
Update io.confluent.* dependencies to common, updated version 6.2.12
currently used versions are EOL
* move version definition to the top level pom
I think this is a problem, as it discards the false return value when putToKeyBuffer can't store the value because of the limit.
Not forwarding the return value at that point may lead to normal continuation here even though something was not added to the dictionary, like here.
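The concern above can be illustrated with a minimal sketch. The class `LimitedKeyBuffer` and the methods `addIgnoringResult` and `addForwardingResult` are hypothetical stand-ins, not Druid's actual code; only the name putToKeyBuffer comes from the description above.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for a size-limited key buffer; not Druid's actual class.
final class LimitedKeyBuffer {
    private final ByteBuffer buffer;

    LimitedKeyBuffer(int capacityBytes) {
        this.buffer = ByteBuffer.allocate(capacityBytes);
    }

    // Returns false when the value does not fit within the limit.
    boolean putToKeyBuffer(long value) {
        if (buffer.remaining() < Long.BYTES) {
            return false;
        }
        buffer.putLong(value);
        return true;
    }

    // Problematic shape: the result of putToKeyBuffer is discarded,
    // so the caller always sees success and continues normally.
    boolean addIgnoringResult(long value) {
        putToKeyBuffer(value);
        return true;
    }

    // Suggested shape: the result is forwarded so the caller can react
    // when the value was not stored.
    boolean addForwardingResult(long value) {
        return putToKeyBuffer(value);
    }
}
```

With an 8-byte buffer, the second call to `addForwardingResult` reports failure, while `addIgnoringResult` would still report success for a value that was never stored.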
* Fixing failing compaction/parallel index jobs during upgrade due to new actions not available on the overlord.
* Fixing build
* Removing extra space.
* Fixing json getter.
* Review comments.
Changes:
- Fix log `Got end of partition marker for partition [%s] from task [%s] in discoverTasks`
by fixing order of args
- Simplify in-line classes by using lambda
- Update kill task message from `Task [%s] failed to respond to [set end offsets]
in a timely manner, killing task` to `Failed to set end offsets, killing task`
- Clean up tests
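As a rough illustration of the first item, swapped format arguments produce a log line that still looks plausible but is wrong. This is a sketch under the assumption that the partition and task id arguments were the ones passed in the wrong order; the class and method names are hypothetical.

```java
// Hypothetical helper showing how argument order changes the rendered message.
final class LogArgOrderSketch {
    static final String FORMAT =
        "Got end of partition marker for partition [%s] from task [%s] in discoverTasks";

    // Arguments passed in the wrong order: task id lands in the partition slot.
    static String swapped(String partition, String taskId) {
        return String.format(FORMAT, taskId, partition);
    }

    // Arguments passed in the order the format string expects.
    static String correct(String partition, String taskId) {
        return String.format(FORMAT, partition, taskId);
    }
}
```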
This PR fixes an issue where the grouping aggregator wrongly assumes that a key dimension is a virtual column and assigns a wrong name to it. This results in a mismatch between the dimensions that the grouping aggregator sees and the dimension names that rows are aggregated on. As a result, the grouping aggregator generates a wrong result.
Fixes missing task failure error message on Overlord.
The error message was missing because the TaskManagementResource#assignTask API wasn't annotated with @Produces(MediaType.APPLICATION_JSON), so the response was treated as application/octet-stream, which in turn led to a MessageBodyWriter not found error on the middle manager. The exception is not logged on the middle manager itself since it happens before entering the assignTask function, while mapping arg Task -> MSQControllerTask.
Changes
- Suppress CVE-2023-36478 as there is no newer Hadoop version available that addresses it.
- Suppress CVE-2023-31582 in jose4j. Pulled in by Kubernetes/Kafka but not addressed yet.
Currently, when we submit a task to druid and the number of currently active tasks has already reached druid.indexer.queue.maxSize, a 500 ISE is thrown, as shown in the screenshot in #15380.
With this fix, submitting a task when the queue size limit has been reached returns HTTP 429 Too Many Requests (with a proper error message) instead of a 500 ISE.
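The intended behavior can be sketched as follows. This is a simplified model, not the Overlord's actual code: the class `TaskSubmissionSketch`, the `Response` type, and the message text are all hypothetical, and the limit is passed in directly rather than read from druid.indexer.queue.maxSize.

```java
// Hypothetical model of the submit path; not the Overlord's actual resource class.
final class TaskSubmissionSketch {
    static final int HTTP_OK = 200;
    static final int HTTP_TOO_MANY_REQUESTS = 429;

    // Minimal response holder used only for this illustration.
    static final class Response {
        final int status;
        final String message;

        Response(int status, String message) {
            this.status = status;
            this.message = message;
        }
    }

    // Rejects the submission with 429 once the active task count has reached the limit.
    static Response submit(int activeTaskCount, int maxQueueSize) {
        if (activeTaskCount >= maxQueueSize) {
            return new Response(
                HTTP_TOO_MANY_REQUESTS,
                String.format(
                    "Task queue is full: [%d] active tasks, limit is [%d].",
                    activeTaskCount,
                    maxQueueSize
                )
            );
        }
        return new Response(HTTP_OK, "Task accepted.");
    }
}
```

A 429 tells the client that the request itself was fine and can be retried later, which a 500 does not convey.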
* Fix NPE caused by realtime segment closing race, fix possible missing-segment retry bug.
Fixes #12168 by returning empty from FireHydrant when the segment is
swapped to null. This causes the SinkQuerySegmentWalker to use
ReportTimelineMissingSegmentQueryRunner, which causes the Broker to look
for the segment somewhere else.
In addition, this patch changes SinkQuerySegmentWalker to acquire references
to all hydrants (subsegments of a sink) at once, and return a
ReportTimelineMissingSegmentQueryRunner if *any* of them could not be acquired.
I suspect, although have not confirmed, that the prior behavior could lead to
segments being reported as missing even though results from some hydrants were
still included.
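The all-or-nothing acquisition described above could look roughly like this. It is a sketch only: `Ref`, `acquireAll`, and the supplier-based hydrant model are hypothetical names for illustration, not the types used by SinkQuerySegmentWalker.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical sketch of acquiring references to all hydrants of a sink at once.
final class AllOrNothingAcquire {
    // Hypothetical handle to an acquired segment reference.
    interface Ref {
        void close();
    }

    // Tries to acquire every reference. If any hydrant cannot be acquired,
    // the references already held are released and empty is returned, so the
    // caller can report the whole segment as missing.
    static Optional<List<Ref>> acquireAll(List<Supplier<Optional<Ref>>> hydrants) {
        List<Ref> acquired = new ArrayList<>();
        for (Supplier<Optional<Ref>> hydrant : hydrants) {
            Optional<Ref> ref = hydrant.get();
            if (!ref.isPresent()) {
                for (Ref held : acquired) {
                    held.close();
                }
                return Optional.empty();
            }
            acquired.add(ref.get());
        }
        return Optional.of(acquired);
    }
}
```

Acquiring everything up front means a query either sees all hydrants of a sink or none of them, which avoids mixing partial results with a missing-segment report.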
* Some more test coverage.
* Set numCorePartitions to 0 in the TombstoneShardSpec.
* fix up test
* Add tombstone core partition tests
* review comment
* Need to register the test shard type to make jackson happy
This patch introduces a param snapshotTime in the iceberg inputsource spec that allows the user to ingest data files associated with the most recent snapshot as of the given time. This helps the user ingest data based on older snapshots by specifying the associated snapshot time.
This patch also upgrades the iceberg core version to 1.4.1
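The "most recent snapshot as of the given time" selection can be sketched with a sorted map. This is an illustration only: it does not use the Iceberg API, and the class name, the map layout (commit timestamp in epoch millis to snapshot id), and the method name are assumptions.

```java
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical sketch of picking the latest snapshot at or before a given time.
final class SnapshotAsOfSketch {
    // Keys are snapshot commit timestamps (epoch millis); values are snapshot ids.
    static Optional<Long> snapshotIdAsOf(TreeMap<Long, Long> snapshotsByTime, long snapshotTimeMillis) {
        // floorEntry returns the entry with the greatest key <= the given time, or null.
        Map.Entry<Long, Long> entry = snapshotsByTime.floorEntry(snapshotTimeMillis);
        return entry == null ? Optional.empty() : Optional.of(entry.getValue());
    }
}
```

If no snapshot exists at or before the requested time, the sketch returns empty rather than falling back to a later snapshot.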
Fixed the following flaky tests:
org.apache.druid.math.expr.ParserTest#testApplyFunctions
org.apache.druid.math.expr.ParserTest#testSimpleMultiplicativeOp1
org.apache.druid.math.expr.ParserTest#testFunctions
org.apache.druid.math.expr.ParserTest#testSimpleLogicalOps1
org.apache.druid.math.expr.ParserTest#testSimpleAdditivityOp1
org.apache.druid.math.expr.ParserTest#testSimpleAdditivityOp2
The above-mentioned tests have been reported as flaky (tests assuming a deterministic implementation of a non-deterministic specification) when run against the NonDex tool.
The tests contain assertions (Assertion 1 & Assertion 2) that compare an ArrayList created from a HashSet using the ArrayList() constructor with another List. However, HashSet does not guarantee the ordering of its elements, so these tests are flaky because they assume a deterministic HashSet iteration order. When the NonDex tool shuffles the HashSet elements, the tests fail.
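One common way to make such an assertion independent of HashSet iteration order is to sort a copy before comparing. The sketch below shows that general technique; it is not necessarily the exact change made to ParserTest, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical helper for order-insensitive comparison of a Set against an expected List.
final class OrderInsensitiveCompare {
    // Copies the set into a list and sorts it, so the result does not depend
    // on the iteration order of the underlying HashSet.
    static List<String> sortedCopy(Set<String> values) {
        List<String> copy = new ArrayList<>(values);
        Collections.sort(copy);
        return copy;
    }
}
```

A test can then compare `sortedCopy(actualSet)` with a sorted expected list, or compare the two collections as sets.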
Co-authored-by: ythorat2 <ythorat2@illinois.edu>
The TaskQueue maintains a map of active task ids to tasks, which can be utilized to get active task payloads, before falling back to the metadata store.
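A minimal sketch of this lookup order, assuming task payloads are modeled as plain strings and the metadata store as a function; the class `TaskPayloadLookup` and its methods are hypothetical and not the TaskQueue API.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: check the in-memory map of active tasks first,
// then fall back to the (slower) metadata store.
final class TaskPayloadLookup {
    private final Map<String, String> activeTasks = new ConcurrentHashMap<>();
    private final Function<String, Optional<String>> metadataStore;

    TaskPayloadLookup(Function<String, Optional<String>> metadataStore) {
        this.metadataStore = metadataStore;
    }

    void addActiveTask(String taskId, String payload) {
        activeTasks.put(taskId, payload);
    }

    Optional<String> getTaskPayload(String taskId) {
        String active = activeTasks.get(taskId);
        if (active != null) {
            return Optional.of(active);
        }
        // Only reached when the task is not currently active.
        return metadataStore.apply(taskId);
    }
}
```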
There is a problem with Quantiles sketches and KLL Quantiles sketches.
Queries using the histogram post-aggregator fail if:
- the sketch contains at least one value, and
- the values in the sketch are all equal, and
- the splitPoints argument is not passed to the post-aggregator, and
- the numBins argument is greater than 2 (or not specified, which
leads to the default of 10 being used)
In that case, the query fails and returns this error:
{
  "error": "Unknown exception",
  "errorClass": "org.apache.datasketches.common.SketchesArgumentException",
  "host": null,
  "errorCode": "legacyQueryException",
  "persona": "OPERATOR",
  "category": "RUNTIME_FAILURE",
  "errorMessage": "Values must be unique, monotonically increasing and not NaN.",
  "context": {
    "host": null,
    "errorClass": "org.apache.datasketches.common.SketchesArgumentException",
    "legacyErrorCode": "Unknown exception"
  }
}
This behaviour is undesirable, since the caller doesn't necessarily
know in advance whether the sketch has values that are diverse
enough. With this change, the post-aggregators return [N, 0, 0...]
instead of crashing, where N is the number of values in the sketch,
and the length of the list is equal to numBins. That is what they
already returned for numBins = 2.
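The failure condition and the fallback can be sketched as follows. This assumes the default split points are evenly spaced between the sketch's minimum and maximum, which is an assumption about the implementation rather than something stated above; the class and method names are hypothetical.

```java
// Hypothetical sketch of why equal values break default split points,
// and of the [N, 0, 0, ...] fallback described above.
final class EqualValuesHistogramSketch {
    // Assumed behavior: numBins - 1 evenly spaced split points between min and max.
    static double[] evenSplitPoints(double min, double max, int numBins) {
        if (numBins < 1) {
            throw new IllegalArgumentException("numBins must be at least 1");
        }
        double[] splits = new double[numBins - 1];
        double delta = (max - min) / numBins;
        for (int i = 0; i < splits.length; i++) {
            splits[i] = min + delta * (i + 1);
        }
        return splits;
    }

    // Split points must be unique and increasing to be accepted.
    static boolean isStrictlyIncreasing(double[] values) {
        for (int i = 1; i < values.length; i++) {
            if (!(values[i] > values[i - 1])) {
                return false;
            }
        }
        return true;
    }

    // Fallback when all values are equal: everything falls into the first bin.
    static double[] histogramForEqualValues(long n, int numBins) {
        double[] histogram = new double[numBins];
        histogram[0] = n;
        return histogram;
    }
}
```

With min equal to max, numBins = 3 yields two identical split points, which are not strictly increasing; numBins = 2 yields a single split point, so the check passes trivially. That matches the observation that only numBins greater than 2 failed.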
Here is an example of a query that would fail:
{
  "queryType": "timeseries",
  "dataSource": {
    "type": "inline",
    "columnNames": ["foo", "bar"],
    "rows": [
      ["abc", 42.0],
      ["def", 42.0]
    ]
  },
  "intervals": ["0000/3000"],
  "granularity": "all",
  "aggregations": [
    {"name": "the_sketch", "fieldName": "bar", "type": "quantilesDoublesSketch"}
  ],
  "postAggregations": [
    {
      "name": "the_histogram",
      "type": "quantilesDoublesSketchToHistogram",
      "field": {"type": "fieldAccess", "fieldName": "the_sketch"},
      "numBins": 3
    }
  ]
}
I believe this also fixes issue #10585.