This change is meant to fix a issue where passing too large of a task payload to the mm-less task runner will cause the peon to fail to startup because the payload is passed (compressed) as a environment variable (TASK_JSON). In linux systems the limit for a environment variable is commonly 128KB, for windows systems less than this. Setting a env variable longer than this results in a bunch of "Argument list too long" errors.
Most of the testcases were disabled in CalciteWindowQueryTest during the Calcite-1.35 upgrade; there were some changes arising from the fact that the removal of DRUID_SUM had some unexpected sideffects:
SqlStdOperatorTable.SUM became the SUM operator
because of that SqlToRelConverter started rewriting windowed SUM -s into SUM0 -s
my opinion is that w.r.t to Druid this rewrite provides no real advantage - as SUM0 is serviced by SUM here
I believe that's not 100% correct in cases when it aggregates just null-s but that doesnt matter in this case
I propose to introduce back a local DRUID_SUM thing as an unchanged SUM and later when CALCITE-6020 is fixed ; we can drop that.
* coalesce on unnest row mismatch fix
* new example with coalesce over unnest with nested array columns
* New example with change in order which triggers the nvl
* new test plan update for useDefault=true
This PR updates the library used for Memcached client to AWS Elasticache Client : https://github.com/awslabs/aws-elasticache-cluster-client-memcached-for-java
This enables us to use the option of encrypting data in transit:
Amazon ElastiCache for Memcached now supports encryption of data in transit
For clusters running the Memcached engine, ElastiCache supports Auto Discovery—the ability for client programs to automatically identify all of the nodes in a cache cluster, and to initiate and maintain connections to all of these nodes.
Benefits of Auto Discovery - Amazon ElastiCache
AWS has forked spymemcached 2.12.1, and has since added all the patches included in 2.12.2 and 2.12.3 as part of the 1.2.0 release. So, this can now be considered as an equivalent drop-in replacement.
GitHub - awslabs/aws-elasticache-cluster-client-memcached-for-java: Amazon ElastiCache Cluster Client for Java - enhanced library to connect to ElastiCache clusters.
https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/elasticache/AmazonElastiCacheClient.html#AmazonElastiCacheClient--
How to enable TLS with Elasticache
On server side:
https://docs.aws.amazon.com/AmazonElastiCache/latest/mem-ug/in-transit-encryption-mc.html#in-transit-encryption-enable-existing-mc
On client side:
GitHub - awslabs/aws-elasticache-cluster-client-memcached-for-java: Amazon ElastiCache Cluster Client for Java - enhanced library to connect to ElastiCache clusters.
contains Enable already passing tests in DecoupledPlanningCalciteQueryTest #14996
enables a transpose rule to support a query plan in which the plan was in the shape:
Sort
Project
Aggregate
* disable parallel builds; enable batch mode to get rid of transfer progress
* restore .m2 from setup-java if not found
* some change to sql
* add ws
* fix quote
* fix quote
* undo querytest change
* nullhandling in mvtest
* init more
* skip commitid plugin
* add-back 1.0C to build ; remove redundant skip-s from copy-resources; add comment
Upgrade maven shade plugin to try to fix build failures
Sometimes we get maven shade errors in our integ tests becasue we don't run clean in between runs to clear the cache in order to speed them up. This can lead to the below error.
Error: Failed to execute goal org.apache.maven.plugins:maven-shade-plugin:3.2.4:shade (opentelemetry-extension) on project opentelemetry-emitter: Error creating shaded jar: duplicate entry: META-INF/services/org.apache.druid.opentelemetry.shaded.io.grpc.NameResolverProvider
See: https://issues.apache.org/jira/projects/MSHADE/issues/MSHADE-425?filter=allissues
An example run that failed: https://github.com/apache/druid/actions/runs/6301662092/job/17117142375?pr=14887
According to the ticket this is fixed by updating shade to 3.4.1.
When I updated to 3.4.1 I kept running into a different issue during static checks. (Caused by: java.lang.NoClassDefFoundError: com/github/rvesse/airline/parser/errors/ParseException)
I had to add the createDependencyReducedPom: false to get the build to pass.
The dependency reduced pom feature was added in 3.3.0 which we were not using before so setting it explicitly to false should not be a issue. https://issues.apache.org/jira/browse/MSHADE-36)
The KubernetesAndWorkerTaskRunner currently doesn't implement getTaskLocation, so tasks run by it will show a unknown TaskLocation in the druid console after a task has completed.
Fix bug in KubernetesAndWorkerTaskRunner that manifests as missing information in the druid Web Console.
The aggregators had incorrect types for getResultType when shouldFinalze
is false. They had the finalized type, but they should have had the
intermediate type.
Also includes a refactor of how ExprMacroTable is handled in tests, to make
it easier to add tests for this to the MSQ module. The bug was originally
noticed because the incorrect result types caused MSQ queries with DS_HLL
to behave erratically.
* Remove stale comment since we're on avro version 1.11.1
* Update exception blocks. With 1.11.1, read() only throws IOException.
* Unit tests
* Cleanup and add more tests.
These were added in #14977, but the implementations are incorrect, because they return null when the input arg is null. They should return false when the input is null. Remove them for now, rather than fixing them, since they're so new that they might as well never have existed.
This entails:
Removing the enableUnnest flag and additional machinery
Updating the datasource plan and frame processors to support unnest
Adding support in MSQ for UnnestDataSource and FilteredDataSource
CalciteArrayTest now has a MSQ test component
Additional tests for Unnest on MSQ
Changes:
- Add task context parameter `taskLockType`. This determines the type of lock used by a batch task.
- Add new task actions for transactional replace and append of segments
- Add methods StorageCoordinator.commitAppendSegments and commitReplaceSegments
- Upgrade segments to appropriate versions when performing replace and append
- Add new metadata table `upgradeSegments` to track segments that need to be upgraded
- Add tests
- Add `KillTaskReport` that contains stats for `numSegmentsKilled`,
`numBatchesProcessed`, `numSegmentsMarkedAsUnused`
- Fix bug where exception message had no formatter but was was still being passed some args.
- Add some comments regarding deprecation of `markAsUnused` flag.
* K8s tasks restore should be from lifecycle start
* add test
* add more tests
* fix test
* wait tasks restore finish when start
* fix style
* revert previous change and add comment
With PR #14322 , MSQ insert/Replace q's will wait for segment to be loaded on the historical's before finishing.
The patch introduces a bug where in the main thread had a thread.sleep() which could not be interrupted via the cancel calls from the overlord.
This new patch addressed that problem by moving the thread.sleep inside a thread of its own. Thus the main thread is now waiting on the future object of this execution.
The cancel call can now shutdown the executor service via another method thus unblocking the main thread to proceed.
This commit pulls out some changes from #14407 to simplify that PR.
Changes:
- Rename `IndexerMetadataStorageCoordinator.announceHistoricalSegments` to `commitSegments`
- Rename the overloaded method to `commitSegmentsAndMetadata`
- Fix some typos
Currently, only the user who has submitted the async query has permission to interact with the status APIs for that async query. However, often we want an administrator to interact with these resources as well.
Druid handles these with the STATE resource traditionally, and if the requesting user has necessary permissions on it as well, alternatively, they should be allowed to interact with the status APIs, irrespective of whether they are the submitter of the query.
* Adding new function decode_base64_utf8 and expr macro
* using BaseScalarUnivariateMacroFunctionExpr
* Print stack trace in case of debug in ChainedExecutionQueryRunner
* fix static check
* update RoaringBitmap to 0.9.49
update RoaringBitmap from 0.9.0 to 0.9.49
Many optimizations and improvements have gone into recent releases of
RoaringBitmap. It seems worthwhile to incorporate those.
* implement workaround for BatchIterator interface change
* add test case for BatchIteratorAdapter.advanceIfNeeded
* Add IS [NOT] DISTINCT FROM to SQL and join matchers.
Changes:
1) Add "isdistinctfrom" and "notdistinctfrom" native expressions.
2) Add "IS [NOT] DISTINCT FROM" to SQL. It uses the new native expressions
when generating expressions, and is treated the same as equals and
not-equals when generating native filters on literals.
3) Update join matchers to have an "includeNull" parameter that determines
whether we are operating in "equals" mode or "is not distinct from"
mode.
* Main changes:
- Add ARRAY handling to "notdistinctfrom" and "isdistinctfrom".
- Include null in pushed-down filters when using "notdistinctfrom" in a join.
Other changes:
- Adjust join filter analyzer to more explicitly use InDimFilter's ValuesSets,
relying less on remembering to get it right to avoid copies.
* Remove unused "wrap" method.
* Fixes.
* Remove methods we do not need.
* Fix bug with INPUT_REF.
* Add option to copy query results to clipboard
* Refactor, allow copying in all formats
---------
Co-authored-by: Sam Wheating <sam.wheating@reddit.com>
* SQL: Plan non-equijoin conditions as cross join followed by filter.
Druid has previously refused to execute joins with non-equality-based
conditions. This was well-intentioned: the idea was to push people to
write their queries in a different, hopefully more performant way.
But as we're moving towards fuller SQL support, it makes more sense to
allow these conditions to go through with the best plan we can come up
with: a cross join followed by a filter. In some cases this will allow
the query to run, and people will be happy with that. In other cases,
it will run into resource limits during execution. But we should at
least give the query a chance.
This patch also updates the documentation to explain how people can
tell whether their queries are being planned this way.
* cartesian is a word.
* Adjust tests.
* Update docs/querying/datasource.md
Co-authored-by: Benedict Jin <asdf2014@apache.org>
---------
Co-authored-by: Benedict Jin <asdf2014@apache.org>
This is due to the recursive filter creation in unnest storage adapter not performing correctly in case of an empty children. This PR addresses the issue