* Reverse lookup fixes and enhancements.
1) Add a "mayIncludeUnknown" parameter to DimFilter#optimize. This is important
because otherwise the reverse-lookup optimization is done improperly when
the "in" filter appears under a "not", and the lookup extractionFn may return
null for some possible values of the filtered column. The "includeUnknown" test
cases in InDimFilterTest illustrate the difference in behavior.
2) Enhance InDimFilter#optimizeLookup to handle "mayIncludeUnknown", and to be able
to do a reverse lookup in a wider variety of cases.
3) Make "unapply" protected in LookupExtractor, and move callers to "unapplyAll".
The main reason is that MapLookupExtractor, a common implementation, lacks a
reverse mapping and therefore does a scan of the map for each call to "unapply".
For performance sake these calls need to be batched.
* Remove optimize call from BloomDimFilter.
* Follow the law.
* Fix tests.
* Fix imports.
* Switch function.
* Fix tests.
* More tests.
* New handling for COALESCE, SEARCH, and filter optimization.
COALESCE is converted by Calcite's parser to CASE, which is largely
counterproductive for us, because it ends up duplicating expressions.
In the current code we end up un-doing it in our CaseOperatorConversion.
This patch has a different approach:
1) Add CaseToCoalesceRule to convert CASE back to COALESCE earlier, before
the Volcano planner runs, using CaseToCoalesceRule.
2) Add FilterDecomposeCoalesceRule to decompose calls like
"f(COALESCE(x, y))" into "(x IS NOT NULL AND f(x)) OR (x IS NULL AND f(y))".
This helps use indexes when available on x and y.
3) Add CoalesceLookupRule to push COALESCE into the third arg of LOOKUP.
4) Add a native "coalesce" function so we can convert 3+ arg COALESCE.
The advantage of this approach is that by un-doing the CASE to COALESCE
conversion earlier, we have flexibility to do more stuff with
COALESCE (like decomposition and pushing into LOOKUP).
SEARCH is an operator used internally by Calcite to represent matching
an argument against some set of ranges. This patch improves our handling
of SEARCH in two ways:
1) Expand NOT points (point "holes" in the range set) from SEARCH as
`!(a || b)` rather than `!a && !b`, which makes it possible to convert
them to a "not" of "in" filter later.
2) Generate those nice conversions for NOT points even if the SEARCH
is not composed of 100% NOT points. Without this change, a SEARCH
for "x NOT IN ('a', 'b') AND x < 'm'" would get converted like
"x < 'a' OR (x > 'a' AND x < 'b') OR (x > 'b' AND x < 'm')".
One of the steps we take when generating Druid queries from Calcite
plans is to optimize native filters. This patch improves this step:
1) Extract common ANDed predicates in ConvertSelectorsToIns, so we can
convert "(a && x = 'b') || (a && x = 'c')" into "a && x IN ('b', 'c')".
2) Speed up CombineAndSimplifyBounds and ConvertSelectorsToIns on
ORs with lots of children by adjusting the logic to avoid calling
"indexOf" and "remove" on an ArrayList.
3) Refactor ConvertSelectorsToIns to reduce duplicated code between the
handling for "selector" and "equals" filters.
* Not so final.
* Fixes.
* Fix test.
* Fix test.
* Fix ColumnSelectorColumnIndexSelector#getColumnCapabilities.
It was using virtualColumns.getColumnCapabilities, which only returns
capabilities for virtual columns, not regular columns. The effect of this
is that expression filters (and in some cases, arrayContainsElement filters)
would build value matchers rather than use indexes.
I think this has been like this since #12315, which added the
getColumnCapabilities method to BitmapIndexSelector, and included the same
implementation as exists in the code today.
This error is easy to make due to the design of virtualColumns.getColumnCapabilities,
so to help avoid it in the future, this patch renames the method to
getColumnCapabilitiesWithoutFallback to emphasize that it does not return
capabilities for regular columns.
* Make getColumnCapabilitiesWithoutFallback package-private.
* Fix expression filter bitmap usage.
The PR: #13947 introduced a function evalDimension() in the interface RowFunction.
There was no default implementation added for this interface which causes all the implementations and custom transforms to fail and require to implement their own version of evalDimension method. This PR adds a default implementation in the interface which allows the evalDimension to return value as a Singleton array of eval result.
Fixes#15072
Before this modification , the third parameter (timezone) require to be a Literal, it will throw a error when this parameter is column Identifier.
Changes
- Add `log` implementation for `AuditManager` alongwith `SQLAuditManager`
- `LoggingAuditManager` simply logs the audit event. Thus, it returns empty for
all `fetchAuditHistory` calls.
- Add new config `druid.audit.manager.type` which can take values `log`, `sql` (default)
- Add new config `druid.audit.manager.logLevel` which can take values `DEBUG`, `INFO`, `WARN`.
This gets activated only if `type` is `log`.
- Remove usage of `ConfigSerde` from `AuditManager` as audit is not just limited to configs
- Add `AuditSerdeHelper` for a single implementation of serialization/deserialization of
audit payload and other utility methods.
* Allow for kafka emitter producer secrets to be masked in logs instead of being visible
This change will allow for kafka producer config values that should be secrets to not show up in the logs.
This will enhance the security of the people who use the kafka emitter to use this if they want to.
This is opt in and will not affect prior configs for this emitter
* fix checkstyle issue
* change property name
I was looking into a query which was performing a bit poorly because the case_searched was touching more than 1 columns (if there is only 1 column there is a cache based evaluator).
While I was doing that I've noticed that there are a few simple things which could help a bit:
use a static TRUE/FALSE instead of creating a new object every time
create the ExprEval early for ConstantExpr -s (except the one for BigInteger which seem to have some odd contract)
return early from type autodetection
these changes mostly reduce the amount of garbage the query creates during case_searched evaluation; although ExpressionSelectorBenchmark shows some improvements ~15% - but my manual trials on the taxi dataset with 60M rows showed more improvements - probably due to the fact that these changes mostly only reduce gc pressure.
* Add initial draft of MarkDanglingTombstonesAsUnused duty.
* Use overshadowed segments instead of all used segments.
* Add unit test for MarkDanglingSegmentsAsUnused duty.
* Add mock call
* Simplify code.
* Docs
* shorter lines formatting
* metric doc
* More tests, refactor and fix up some logic.
* update javadocs; other review comments.
* Make numCorePartitions as 0 in the TombstoneShardSpec.
* fix up test
* Add tombstone core partition tests
* Update docs/design/coordinator.md
Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>
* review comment
* Minor cleanup
* Only consider tombstones with 0 core partitions
* Need to register the test shard type to make jackson happy
* test comments
* checkstyle
* fixup misc typos in comments
* Update logic to use overshadowed segments
* minor cleanup
* Rename duty to eternity tombstone instead of dangling. Add test for full eternity tombstone.
* Address review feedback.
---------
Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>
Query with lookups in FilteredAggregator fails with this exception in router,
Cannot construct instance of `org.apache.druid.query.aggregation.FilteredAggregatorFactory`, problem: Lookup [campaigns_lookup[campaignId][is_sold][autodsp]] not found at [Source: (org.eclipse.jetty.server.HttpInputOverHTTP); line: 1, column: 913] (through reference chain: org.apache.druid.query.groupby.GroupByQuery["aggregations"]->java.util.ArrayList[1])
T
he problem is that constructor of FilteredAggregatorFactory is actually validating if the lookup exists in this statement dimFilter.toFilter().
This is failing on the router, which is to be expected, because, the router isn’t assigned any lookups.
The fix is to move to a lazy initialisation of the filter object in the constructor.
It wasn't checking the column name, so it would return a domain regardless
of the input column. This means that null filters on data sources with range
partitioning would lead to excessive pruning of segments, and therefore
missing results.
I think this is a problem as it discards the false return value when the putToKeyBuffer can't store the value because of the limit
Not forwarding the return value at that point may lead to the normal continuation here regardless something was not added to the dictionary like here
* Make numCorePartitions as 0 in the TombstoneShardSpec.
* fix up test
* Add tombstone core partition tests
* review comment
* Need to register the test shard type to make jackson happy
Fixed the following flaky tests:
org.apache.druid.math.expr.ParserTest#testApplyFunctions
org.apache.druid.math.expr.ParserTest#testSimpleMultiplicativeOp1
org.apache.druid.math.expr.ParserTest#testFunctions
org.apache.druid.math.expr.ParserTest#testSimpleLogicalOps1
org.apache.druid.math.expr.ParserTest#testSimpleAdditivityOp1
org.apache.druid.math.expr.ParserTest#testSimpleAdditivityOp2
The above mentioned tests have been reported as flaky (tests assuming deterministic implementation of a non-deterministic specification ) when ran against the NonDex tool.
The tests contain assertions (Assertion 1 & Assertion 2) that compare an ArrayList created from a HashSet using the ArrayList() constructor with another List. However, HashSet does not guarantee the ordering of elements and thus resulting in these flaky tests that assume deterministic implementation of HashSet. Thus, when the NonDex tool shuffles the HashSet elements, it results in the test failures:
Co-authored-by: ythorat2 <ythorat2@illinois.edu>
* MSQ generates tombstones honoring the query's granularity.
This change tweaks to only account for the infinite-interval tombstones.
For finite-interval tombstones, the MSQ query granualrity will be used
which is consistent with how MSQ works.
* more tests and some cleanup.
* checkstyle
* comment edits
* Throw TooManyBuckets fault based on review; add more tests.
* Add javadocs for both methods on reconciling the methods.
* review: Move testReplaceTombstonesWithTooManyBucketsThrowsException to MsqFaultsTest
* remove unused imports.
* Move TooManyBucketsException to indexing package for shared exception handling.
* lower max bucket for tests and fixup count
* Advance and count the iterator.
* checkstyle
In the current design, brokers query both data nodes and tasks to fetch the schema of the segments they serve. The table schema is then constructed by combining the schemas of all segments within a datasource. However, this approach leads to a high number of segment metadata queries during broker startup, resulting in slow startup times and various issues outlined in the design proposal.
To address these challenges, we propose centralizing the table schema management process within the coordinator. This change is the first step in that direction. In the new arrangement, the coordinator will take on the responsibility of querying both data nodes and tasks to fetch segment schema and subsequently building the table schema. Brokers will now simply query the Coordinator to fetch table schema. Importantly, brokers will still retain the capability to build table schemas if the need arises, ensuring both flexibility and resilience.
* Use filters for pruning properly for hash-joins.
Native used them too aggressively: it might use filters for the RHS
to prune the LHS. MSQ used them not at all. Now, both use them properly,
pruning based on base (LHS) columns only.
* Fix tests.
* Fix style.
* Clear filterFields too.
* Update.
* Add system fields to input sources.
Main changes:
1) The SystemField enum defines system fields "__file_uri", "__file_path",
and "__file_bucket". They are associated with each input entity.
2) The SystemFieldInputSource interface can be added to any InputSource
to make it system-field-capable. It sets up serialization of a list
of configured "systemFields" in the JSON form of the input source, and
provides a method getSystemFieldValue for computing the value of each
system field. Cloud object, HDFS, HTTP, and Local now have this.
* Fix various LocalInputSource calls.
* Fix style stuff.
* Fixups.
* Fix tests and coverage.
* better documentation for the differences between arrays and mvds
* add outputType to ExpressionPostAggregator to make docs true
* add output coercion if outputType is defined on ExpressionPostAgg
* updated post-aggregations.md to be consistent with aggregations.md and filters.md and use tables
* Frames: consider writing singly-valued column when input column hasMultipleValues is UNKNOWN.
Prior to this patch, columnar frames would always write multi-valued columns if
the input column had hasMultipleValues = UNKNOWN. This had the effect of flipping
UNKNOWN to TRUE when copying data into frames, which is problematic because TRUE
causes expressions to assume that string inputs must be treated as arrays.
We now avoid this by flipping UNKNOWN to FALSE if no multi-valuedness
is encountered, and flipping it to TRUE if multi-valuedness is encountered.
* Add regression test case.
Currently advance function in postJoinCursor calls advanceUninterruptibly which in turn keeps calling baseCursor.advanceUninterruptibly until the post join condition matches, without checking for interrupts. This causes the CPU to hit 100% without getting a chance for query to be cancelled.
With this change, the call flow of advance and advanceUninterruptibly is separated out so that they call baseCursor.advance and baseCursor.advanceUninterruptibly in them, respectively, giving a chance for interrupts in the former case between successive calls to baseCursor.advance.
* Fix error assuming a Complex Type that is a Number is a double
In the case where a complex type is a number, it may not be castable to double. It can safely be case as Number first to get to the doubleValue.
- adds a new query build path: DruidQuery#toScanAndSortQuery which:
- builds a ScanQuery without considering the current ordering
- builds an operator to execute the sort
- fixes a null string to "null" literal string conversion in the frame serializer code
- fixes some DrillWindowQueryTest cases
- fix NPE in NaiveSortOperator in case there was no input
- enables back CoreRules.AGGREGATE_REMOVE
- adds a processing level OffsetLimit class and uses that instead of just the limit in the rac parts
- earlier window expressions on top of a subquery with an offset may have ignored the offset
* provide function name when unknown exceptions are encountered
* fix keywords/etc
* fix keywrod order - regex excercise
* add test
* add check&fix keywords
* decoupledIgnore
* Revert "decoupledIgnore"
This reverts commit e922c820a7.
* unpatch Function
* move to a different location
* checkstyle
for some exotic queries like:
SELECT
'_'||dim1,
MIN(cast(0 as double)) OVER (),
MIN(cast((cnt||cnt) as bigint)) OVER ()
FROM foo
the compilation have resulted in NPE -s mostly because VirtualColumn -s were not handled properly
* add native filters for "(filter) is true" and "(filter) is false"
changes:
* add IsTrueDimFilter, IsFalseDimFilter, and abstract IsBooleanDimFilter for native json filter implementations of `(filter) IS TRUE` and `(filter) IS FALSE`
* add IsBooleanFilter for actual filtering logic for these filters, which ignore includeUnknown to always use matches with false for true and !matches with true for false
* fix test incorrectly adjusted to wrong answer in #15058
* add tests for default value mode
* sql compatible tri-state native logical filters when druid.expressions.useStrictBooleans=true and druid.generic.useDefaultValueForNull=false, and new druid.generic.useThreeValueLogicForNativeFilters=true
* log.warn if non-default configurations are used to guide operators towards SQL complaint behavior
* fixes
* check for latest rewrite place
* Revert "check for latest rewrite place"
This reverts commit 5cf1e2c1ca.
* some stuff
(cherry picked from commit ab346d4373ea888eb8ef6115e018e7fb0d27407f)
* update test output
* updates to test ouptuts
* some stuff
* move validator
* cleanup
* fix
* change test slightly
* add apidoc cleanup warnings
* cleanup/etc
* instead of telling the story; add a fail with some reason whats the issue
* lead-lag fix
* add test
* remove unnecessary throw
* druidexception-trial
* Revert "druidexception-trial"
This reverts commit 8fa06644bc.
* undo changes to no_grouping; add no_grouping2
* add missing assert on resultcount
* rename method; update
* introduce enum/etc
* make resultmatchmode accessible from TestBuilder#expectedResults
* fix dump results to use log
* fix
* handle null correctly
* disable feature type based things for MSQ
* fix varianssqlaggtest
* use eps in other test
* fix intellij error
* add final
* addrss review
* update test/string/etc
* write concat in 3 lines :D
This patch changes the thread name of the processing pool of the indexers/peons/historicals from query.getType() + "_" + query.getDataSource() + "_" + query.getIntervals() to query.getId()
* add a bunch of tests with array typed columns to CalciteArraysQueryTest
* fix a bug with unnest filter pushdown when filtering on unnested array columns
This PR aims to add the capabilities to:
1. Fetch the realtime segment metadata from the coordinator server view,
2. Adds the ability for workers to query indexers, similar to how brokers do the same for native queries.
* Fix IndexerWorkerClient#fetchChannelData when response has data and error.
When a channel data response from a worker includes some data and then
some I/O error, then when the call is retried, we will re-read the set
of data that was read by the previous connection and add it to the
local channel again. This causes the local channel to become corrupted.
The patch fixes this case by skipping data that has already been read.
* Updating plans when using joins with unnest on the left
* Correcting segment map function for hashJoin
* The changes done here are not reflected into MSQ yet so these tests might not run in MSQ
* native tests
* Self joins with unnest data source
* Making this pass
* Addressing comments by adding explanation and new test
Code relying on monomorphic processing on JDK8 doesn't work correctly, since it tries to reference getArrayLength using method handles, which might have been accidentally removed here since it seems unused. This PR adds the method back as is.
Fixes a bug caused by #14919, which was just using the column name as part of a temp file name, which.. isn't very cool, my bad. Switched to use StringUtils.urlEncode so that ugly chars don't explode stuff. The modified test fails without the changes in this PR.
Row-based frames, and by extension, MSQ now supports numeric array types. This means that all queries consuming or producing arrays would also work with MSQ. Numeric arrays can also be ingested via MSQ. Post this patch, queries like, SELECT [1, 2] would work with MSQ since they consume a numeric array, instead of failing with an unsupported column type exception.
When merging analyses, lenient merging sets unmergeable aggregators
to null. Merging such a null aggregator record into a nonnull record
would potentially lead to NPE in getMergingFactory.
The new code only calls getMergingFactory if both the old and new
aggregators are nonnull; else, if either is null, then the merged
aggregator is also set to null.
This patch introduces "processor managers" to processor factories, as a replacement for the sequence of processors. Processor managers can use the results of earlier processors to influence the creation of later processors, which provides us with the building block we need to ensure that broadcast join data is only read once.
In particular, when broadcast join is happening, the BaseFrameProcessorFactory now uses a ChainedProcessorManager to first run BroadcastJoinSegmentMapFnProcessor (in a single thread), and then run all of the regular processors (possibly multithreaded).
When moving timestamps by an offset using org.joda.time.chrono.ISOChronology library, if the new timestamp falls in Daylight Savings Time (DST) transition period, the library rounds it off to the nearest valid time. This can lead to incorrect final timestamp when calculated using intermediate offsets landing in DST transition, for e.g. +21D arrived at using +14D and +7D offset, where +14D lands in DST transition period. Since bucketStart values are calculated using this library, this behaviour can lead to incorrect bucketStart times.
This change updates dependencies as needed and fixes tests to remove code incompatible with Java 21
As a result all unit tests now pass with Java 21.
* update maven-shade-plugin to 3.5.0 and follow-up to #15042
* explain why we need to override configuration when specifying outputFile
* remove configuration from dependency management in favor of explicit overrides in each module.
* update to mockito to 5.5.0 for Java 21 support when running with Java 11+
* continue using latest mockito 4.x (4.11.0) when running with Java 8
* remove need to mock private fields
* exclude incorrectly declared mockito dependency from pac4j-oidc
* remove mocking of ByteBuffer, since sealed classes can no longer be mocked in Java 21
* add JVM options workaround for system-rules junit plugin not supporting Java 18+
* exclude older versions of byte-buddy from assertj-core
* fix for Java 19 changes in floating point string representation
* fix missing InitializedNullHandlingTest
* update easymock to 5.2.0 for Java 21 compatibility
* update animal-sniffer-plugin to 1.23
* update nl.jqno.equalsverifier to 3.15.1
* update exec-maven-plugin to 3.1.0
This change is meant to fix a issue where passing too large of a task payload to the mm-less task runner will cause the peon to fail to startup because the payload is passed (compressed) as a environment variable (TASK_JSON). In linux systems the limit for a environment variable is commonly 128KB, for windows systems less than this. Setting a env variable longer than this results in a bunch of "Argument list too long" errors.
The aggregators had incorrect types for getResultType when shouldFinalze
is false. They had the finalized type, but they should have had the
intermediate type.
Also includes a refactor of how ExprMacroTable is handled in tests, to make
it easier to add tests for this to the MSQ module. The bug was originally
noticed because the incorrect result types caused MSQ queries with DS_HLL
to behave erratically.
These were added in #14977, but the implementations are incorrect, because they return null when the input arg is null. They should return false when the input is null. Remove them for now, rather than fixing them, since they're so new that they might as well never have existed.
Changes:
- Add task context parameter `taskLockType`. This determines the type of lock used by a batch task.
- Add new task actions for transactional replace and append of segments
- Add methods StorageCoordinator.commitAppendSegments and commitReplaceSegments
- Upgrade segments to appropriate versions when performing replace and append
- Add new metadata table `upgradeSegments` to track segments that need to be upgraded
- Add tests
* Adding new function decode_base64_utf8 and expr macro
* using BaseScalarUnivariateMacroFunctionExpr
* Print stack trace in case of debug in ChainedExecutionQueryRunner
* fix static check
* update RoaringBitmap to 0.9.49
update RoaringBitmap from 0.9.0 to 0.9.49
Many optimizations and improvements have gone into recent releases of
RoaringBitmap. It seems worthwhile to incorporate those.
* implement workaround for BatchIterator interface change
* add test case for BatchIteratorAdapter.advanceIfNeeded
* Add IS [NOT] DISTINCT FROM to SQL and join matchers.
Changes:
1) Add "isdistinctfrom" and "notdistinctfrom" native expressions.
2) Add "IS [NOT] DISTINCT FROM" to SQL. It uses the new native expressions
when generating expressions, and is treated the same as equals and
not-equals when generating native filters on literals.
3) Update join matchers to have an "includeNull" parameter that determines
whether we are operating in "equals" mode or "is not distinct from"
mode.
* Main changes:
- Add ARRAY handling to "notdistinctfrom" and "isdistinctfrom".
- Include null in pushed-down filters when using "notdistinctfrom" in a join.
Other changes:
- Adjust join filter analyzer to more explicitly use InDimFilter's ValuesSets,
relying less on remembering to get it right to avoid copies.
* Remove unused "wrap" method.
* Fixes.
* Remove methods we do not need.
* Fix bug with INPUT_REF.
This is due to the recursive filter creation in unnest storage adapter not performing correctly in case of an empty children. This PR addresses the issue
* Fix for latest agg to handle nulls in time column. Also adding optimization for dictionary encoded string columns
* One minor fix
* Adding more tests for the new class
* Changing the init to a putInt
* Fix for schema mismatch to go down using the non vectorize path till we update the vectorized aggs properly
* Fixing a failed test
* Updating numericNilAgg
* Moving to use default values in case of nil agg
* Adding the same for first agg
* Fixing a test
* fixing vectorized string agg for last/first with cast if numeric
* Updating tests to remove mockito and cover the case of string first/last on non string columns
* Updating a test to vectorize
* Addressing review comments: Name change to NilVectorAggregator and using static variables now
* fixing intellij inspections
Changes:
- Move following configs from `CliCoordinator` to `DruidCoordinatorConfig`:
- `druid.coordinator.kill.on`
- `druid.coordinator.kill.pendingSegments.on`
- `druid.coordinator.kill.supervisors.on`
- `druid.coordinator.kill.rules.on`
- `druid.coordinator.kill.audit.on`
- `druid.coordinator.kill.datasource.on`
- `druid.coordinator.kill.compaction.on`
- In the Coordinator style used by historical management duties, always instantiate all
the metadata cleanup duties but execute only if enabled. In the existing code, they are
instantiated only when enabled by using optional binding with Guice.
- Add a wrapper `MetadataManager` which contains handles to all the different
metadata managers for rules, supervisors, segments, etc.
- Add a `CoordinatorConfigManager` to simplify read and update of coordinator configs
- Remove persistence related methods from `CoordinatorCompactionConfig` and
`CoordinatorDynamicConfig` as these are config classes.
- Remove annotations `@CoordinatorIndexingServiceDuty`,
`@CoordinatorMetadataStoreManagementDuty`
changes:
* add back nested column v4 serializers
* 'json' schema by default still uses the newer 'nested common format' used by 'auto', but now has an optional 'formatVersion' property which can be specified to override format versions on native ingest jobs
* add system config to specify default column format stuff, 'druid.indexing.formats', and property 'druid.indexing.formats.nestedColumnFormatVersion' to specify system level preferred nested column format for friendly rolling upgrades from versions which do not support the newer 'nested common format' used by 'auto'
* update test
* update test
* format
* test
* fix0
* Revert "fix0"
This reverts commit 44992cb393.
* ok resultset
* add plan
* update test
* before rewind
* test
* fix toString/compare/test
* move test
* add timeColumn to hashCode
Changes:
- Add new metric `kill/pendingSegments/count` with dimension `dataSource`
- Add tests for `KillStalePendingSegments`
- Reduce no-op logs that spit out for each datasource even when no pending
segments have been deleted. This can get particularly noisy at low values of `indexingPeriod`.
- Refactor the code in `KillStalePendingSegments` for readability and add javadocs
A new monitor SubqueryCountStatsMonitor which emits the metrics corresponding to the subqueries and their execution is now introduced. Moreover, the user can now also use the auto mode to automatically set the number of bytes available per query for the inlining of its subquery's results.
* Vectorizing earliest for numeric
* Vectorizing earliest string aggregator
* checkstyle fix
* Removing unnecessary exceptions
* Ignoring tests in MSQ as earliest is not supported for numeric there
* Fixing benchmarks
* Updating tests as MSQ does not support earliest for some cases
* Addressing review comments by adding the following:
1. Checking capabilities first before creating selectors
2. Removing mockito in tests for numeric first aggs
3. Removing unnecessary tests
* Addressing issues for dictionary encoded single string columns where we can use the dictionary ids instead of the entire string
* Adding a flag for multi value dimension selector
* Addressing comments
* 1 more change
* Handling review comments part 1
* Handling review comments and correctness fix for latest_by when the time expression need not be in sorted order
* Updating numeric first vector agg
* Revert "Updating numeric first vector agg"
This reverts commit 4291709901.
* Updating code for correctness issues
* fixing an issue with latest agg
* Adding more comments and removing an unnecessary check
* Addressing null checks for tie selector and only vectorize false for quantile sketches
Changes:
- Make ServiceMetricEvent.Builder extend ServiceEventBuilder<ServiceMetricEvent>
and thus convert it to a plain builder rather than a builder of builder.
- Add methods setCreatedTime , setMetricAndValue to the builder
Currently we have an error handler for https connection attempts, but
not for plaintext connection attempts. This leads to warnings like the
following for plaintext connection errors:
EXCEPTION, please implement org.jboss.netty.handler.codec.http.HttpContentDecompressor.exceptionCaught() for proper handling.
This happens because if we don't add our own error handler, the last
handler in the chain during a connection attempt is HttpContentDecompressor,
which doesn't handle errors.
The new error handler for plaintext doesn't do much: it just closes
the channel.
This patch fixes a few issues toward #14858
1. some phony classes were added to enable maven to track the compilation of those classes
2. cyclonedx 2.7.9 seem to handle incremental compilation better; it had a PR relating to that
3. needed to update root pom to 25
4. update antlr to 4.5.3 older one didn't really worked incrementally; 4.5.3 works much better
Currently, Druid is using Guava 16.0.1 version. This upgrade to 31.1-jre fixes the following issues.
CVE-2018-10237 (Unbounded memory allocation in Google Guava 11.0 through 24.x before 24.1.1 allows remote attackers to conduct denial of service attacks against servers that depend on this library and deserialize attacker-provided data because the AtomicDoubleArray class (when serialized with Java serialization) and the CompoundOrdering class (when serialized with GWT serialization) perform eager allocation without appropriate checks on what a client has sent and whether the data size is reasonable). We don't use Java or GWT serializations. Despite being false positive they're causing red security scans on Druid distribution.
Latest version of google-client-api is incompatible with the existing Guava version. This PR unblocks Update google client apis to latest version #14414
Follow up changes to #12599
Changes:
- Rename column `used_flag_last_updated` to `used_status_last_updated`
- Remove new CLI tool `UpdateTables`.
- We already have a `CreateTables` with similar functionality, which should be able to
handle update cases too.
- Any user running the cluster for the first time should either just have `connector.createTables`
enabled or run `CreateTables` which should create tables at the latest version.
- For instance, the `UpdateTables` tool would be inadequate when a new metadata table has
been added to Druid, and users would have to run `CreateTables` anyway.
- Remove `upgrade-prep.md` and include that info in `metadata-init.md`.
- Fix log messages to adhere to Druid style
- Use lambdas
* Add new configurable buffer period to create gap between mark unused and kill of segment
* Changes after testing
* fixes and improvements
* changes after initial self review
* self review changes
* update sql statement that was lacking last_used
* shore up some code in SqlMetadataConnector after self review
* fix derby compatibility and improve testing/docs
* fix checkstyle violations
* Fixes post merge with master
* add some unit tests to improve coverage
* ignore test coverage on new UpdateTools cli tool
* another attempt to ignore UpdateTables in coverage check
* change column name to used_flag_last_updated
* fix a method signature after column name switch
* update docs spelling
* Update spelling dictionary
* Fixing up docs/spelling and integrating altering tasks table with my alteration code
* Update NULL values for used_flag_last_updated in the background
* Remove logic to allow segs with null used_flag_last_updated to be killed regardless of bufferPeriod
* remove unneeded things now that the new column is automatically updated
* Test new background row updater method
* fix broken tests
* fix create table statement
* cleanup DDL formatting
* Revert adding columns to entry table by default
* fix compilation issues after merge with master
* discovered and fixed metastore inserts that were breaking integration tests
* fixup forgotten insert by using pattern of sharing now timestamp across columns
* fix issue introduced by merge
* fixup after merge with master
* add some directions to docs in the case of segment table validation issues
* Update to Calcite 1.35.0
* Update from.ftl for Calcite 1.35.0.
* Fixed tests in Calcite upgrade by doing the following:
1. Added a new rule, CoreRules.PROJECT_FILTER_TRANSPOSE_WHOLE_PROJECT_EXPRESSIONS, to Base rules
2. Refactored the CorrelateUnnestRule
3. Updated CorrelateUnnestRel accordingly
4. Fixed a case with selector filters on the left where Calcite was eliding the virtual column
5. Additional test cases for fixes in 2,3,4
6. Update to StringListAggregator to fail a query if separators are not propagated appropriately
* Refactored for testcases to pass after the upgrade, introduced 2 new data sources for handling filters and select projects
* Added a literalSqlAggregator as the upgraded Calcite involved changes to subquery remove rule. This corrected plans for 2 queries with joins and subqueries by replacing an useless literal dimension with a post agg. Additionally a test with COUNT DISTINCT and FILTER which was failing with Calcite 1.21 is added here which passes with 1.35
* Updated to latest avatica and updated code as SqlUnknownTimeStamp is now used in Calcite which needs to be resolved to a timestamp literal
* Added a wrapper segment ref to use for unnest and filter segment reference
The Azure connector is introduced and MSQ's fault tolerance and durable storage can now be used with Microsoft Azure's blob storage. Also, the results of newly introduced queries from deep storage can now store and fetch the results from Azure's blob storage.
The current version of jackson-databind is flagged for vulnerabilities CVE-2020-28491 (Although cbor format is not used in druid), CVE-2020-36518 (Seems genuine as deeply nested json in can cause resource exhaustion). Updating the dependency to the latest version 2.12.7 to fix these vulnerabilities.
* Minimize PostAggregator computations
Since a change back in 2014, the topN query has been computing
all PostAggregators on all intermediate responses from leaf nodes
to brokers. This generates significant slow downs for queries
with relatively expensive PostAggregators. This change rewrites
the query that is pushed down to only have the minimal set of
PostAggregators such that it is impossible for downstream
processing to do too much work. The final PostAggregators are
applied at the very end.
Fixes a case I missed in #14688 when the return type is STRING but its coming from a top level array typed column instead of a nested array column while making a vector object selector.
Also while here I noticed that the internal JSON_VALUE functions for array types were named inconsistently with the non-array functions, so I renamed them. These are not documented so it should not be disruptive in any way, since they are only used internally for rewrites while planning to make the correctly virtual column.
JSON_VALUE_RETURNING_ARRAY_VARCHAR -> JSON_VALUE_ARRAY_VARCHAR
JSON_VALUE_RETURNING_ARRAY_BIGINT -> JSON_VALUE_ARRAY_BIGINT
JSON_VALUE_RETURNING_ARRAY_DOUBLE -> JSON_VALUE_ARRAY_DOUBLE
The internal non-array functions are JSON_VALUE_VARCHAR, JSON_VALUE_BIGINT, and JSON_VALUE_DOUBLE.
This PR has fixes a bug in the SqlStatementAPI where if the task is not found on the overlord, the response status is 500.
This changes the response to invalid input since the queryID passed is not valid.
* fix issue with nested virtual column array element vector selectors when input is numeric array but output is non-numeric
* add vector value selector for mixed numeric type variant and nested variant fields, tests
* Save a metadata call when reading files from CloudObjectInputSource.
The call to createSplits(inputFormat, null) in formattableReader would
use the default split hint spec, MaxSizeSplitHintSpec, which makes
getObjectMetadata calls in order to compute its splits. This isn't
necessary; we're just trying to unpack the files inside the input
source.
To fix this, use FilePerSplitHintSpec to extract files without any
funny business.
* Adjust call.
* Fix constant.
* Test coverage.
* Frames support for string arrays that are null.
The row format represents null arrays as 0x0001, which older readers
would interpret as an empty array. This provides compatibility with
older readers, which is useful during updates.
The column format represents null arrays by writing -(actual length) - 1
instead of the length, and using FrameColumnWriters.TYPE_STRING_ARRAY for
the type code for string arrays generally. Older readers will report this
as an unrecognized type code. Column format is only used by the operator
query, which is currently experimental, so the impact isn't too severe.
* Remove unused import.
* Return Object[] instead of List from frame array selectors.
Update MSQSelectTest and MSQInsertTest to reflect the fact that null
arrays are possible.
Add a bunch of javadocs to object selectors describing expected behavior,
including the requirement that array selectors return Object[].
* update test case.
* Update test cases.
* fix issues with equality and range filters matching double values to long typed inputs
* adjust to ensure we never homogenize null, [], and [null] into [null] for expressions on real array columns
* allow for batched delete of segments instead of deleting segment data one by one
create new batchdelete method in datasegment killer that has default functionality
of iterating through all segments and calling delete on them. This will enable
a slow rollout of other deepstorage implementations to move to a batched delete
on their own time
* cleanup batchdelete segments
* batch delete with the omni data deleter
cleaned up code
just need to add tests and docs for this functionality
* update java doc to explain how it will try to use batch if function is overwritten
* rename killBatch to kill
add unit tests
* add omniDataSegmentKillerTest for deleting multiple segments at a time. fix checkstyle
* explain test peculiarity better
* clean up batch kill in s3.
* remove unused return value. cleanup comments and fix checkstyle
* default to batch delete. more specific java docs. list segments that couldn't be deleted
if there was a client error or server error
* simplify error handling
* add tests where an exception is thrown when killing multiple s3 segments
* add test for failing to delete two calls with the s3 client
* fix javadoc for kill(List<DataSegment> segments) clean up tests remove feature flag
* fix typo in javadocs
* fix test failure
* fix checkstyle and improve tests
* fix intellij inspections issues
* address comments, make delete multiple segments not assume same bucket
* fix test errors
* better grammar and punctuation. fix test. and better logging for exception
* remove unused code
* avoid extra arraylist instantiation
* fix broken test
* fix broken test
* fix tests to use assert.throws
* Use OverlordClient for all Overlord RPCs.
Continuing the work from #12696, this patch removes HttpIndexingServiceClient
and the IndexingService flavor of DruidLeaderClient completely. All remaining
usages are migrated to OverlordClient.
Supporting changes include:
1) Add a variety of methods to OverlordClient.
2) Update MetadataTaskStorage to skip the complete-task lookup when
the caller requests zero completed tasks. This helps performance of
the "get active tasks" APIs, which don't want to see complete ones.
* Use less forbidden APIs.
* Fixes from CI.
* Add test coverage.
* Two more tests.
* Fix test.
* Updates from CR.
* Remove unthrown exceptions.
* Refactor to improve testability and test coverage.
* Add isNil tests.
* Remove unnecessary "deserialize" methods.
* Add ingest/input/bytes metric and Kafka consumer metrics.
New metrics:
1) ingest/input/bytes. Equivalent to processedBytes in the task reports.
2) kafka/consumer/bytesConsumed: Equivalent to the Kafka consumer
metric "bytes-consumed-total". Only emitted for Kafka tasks.
3) kafka/consumer/recordsConsumed: Equivalent to the Kafka consumer
metric "records-consumed-total". Only emitted for Kafka tasks.
* Fix anchor.
* Fix KafkaConsumerMonitor.
* Interface updates.
* Doc changes.
* Update indexing-service/src/main/java/org/apache/druid/indexing/seekablestream/SeekableStreamIndexTask.java
Co-authored-by: Benedict Jin <asdf2014@apache.org>
---------
Co-authored-by: Benedict Jin <asdf2014@apache.org>