Commit Graph

13494 Commits

Author SHA1 Message Date
Vadim Ogievetsky 8d125b7c7f
Web console: segment writing progress indication (#13929)
* add segment writing progress indication

* update with more metrics

* add push metric
2023-03-22 16:34:38 -07:00
Nicholas Lippis d81d13b9ba
Pod template task adapter (#13896)
* Pod template task adapter

* Use getBaseTaskDirPaths

* Remove unused task from getEnv

* Use Optional.ifPresent() instead of Optional.map()

* Pass absolute path

* Don't pass task to getEnv

* Assert the correct adapter is created

* Javadocs and Comments

* Add exception message to assertions
2023-03-22 14:20:24 -06:00
Clint Wylie 086eb26b74
fix join and unnest planning to ensure that duplicate join prefixes are not used (#13943)
* fix join and unnest planning to ensure that duplicate join prefixes are not used

* wont somebody please think of the children
2023-03-22 12:53:55 -07:00
Adarsh Sanjeev 7bab407495
Add segment generator counters to MSQ reports (#13909)
* Add segment generator counters to reports

* Remove unneeded annotation

* Fix checkstyle and coverage

* Add persist and merged as new metrics

* Address review comments

* Fix checkstyle

* Create metrics class to handle updating counters

* Address review comments

* Add rowsPushed as a new metrics
2023-03-22 09:17:26 -07:00
Clint Wylie f4392a3155
expression transform improvements and fixes (#13947)
changes:
* fixes inconsistent handling of byte[] values between ExprEval.bestEffortOf and ExprEval.ofType, which could cause byte[] values to end up as java toString values instead of base64 encoded strings in ingest time transforms
* improved ExpressionTransform binding to re-use ExprEval.bestEffortOf when evaluating a binding instead of throwing it away
* improved ExpressionTransform array handling, added RowFunction.evalDimension that returns List<String> to back Row.getDimension and remove the automatic coercing of array types that would typically happen to expression transforms unless using Row.getDimension
* added some tests for ExpressionTransform with array inputs
* improved ExpressionPostAggregator to use partial type information from decoration
* migrate some test uses of InputBindings.forMap to use other methods
2023-03-21 23:26:53 -07:00
Kashif Faraz b7752a909c
Enable round-robin segment assignment and batch segment allocation by default (#13942)
Changes:
- Set `useRoundRobinSegmentAssignment` in coordinator dynamic config to `true` by default.
- Set `batchSegmentAllocation` in `TaskLockConfig` (used in Overlord runtime properties) to `true` by default.
2023-03-22 08:20:01 +05:30
Victoria Lim ede9903ff4
pip install for Python Druid API (#13938)
Broken test appears unrelated to this PR

* make druidapi pip installable

* include druidapi in prerequisites

* add license to setup.py

* updates from Paul's review

* note about editable install

* Apply suggestions from code review

Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>

* update install instructions

* found unrelated typos

* standardize install cmd with pip

---------

Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>
2023-03-21 11:37:39 -07:00
Gian Merlino 1c7a03a47b
Lower default maxRowsInMemory for realtime ingestion. (#13939)
* Lower default maxRowsInMemory for realtime ingestion.

The thinking here is that for best ingestion throughput, we want
intermediate persists to be as big as possible without using up all
available memory. So, we rely mainly on maxBytesInMemory. The default
maxRowsInMemory (1 million) is really just a safety: in case we have
a large number of very small rows, we don't want to get overwhelmed
by per-row overheads.

However, maximum ingestion throughput isn't necessarily the primary
goal for realtime ingestion. Query performance is also important. And
because query performance is not as good on the in-memory dataset, it's
helpful to keep it from growing too large. 150k seems like a reasonable
balance here. It means that for a typical 5 million row segment, we
won't trigger more than 33 persists due to this limit, which is a
reasonable number of persists.

* Update tests.

* Update server/src/main/java/org/apache/druid/segment/indexing/RealtimeTuningConfig.java

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>

* Fix test.

* Fix link.

---------

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
2023-03-21 10:36:36 -07:00
Jill Osborne 4f95285406
Correct nested columns JSON example (#13953) 2023-03-21 09:17:26 -07:00
Atul Mohan 617c325c70
Make zk connection retries configurable (#13913)
* This makes the zookeeper connection retry count configurable. This is presently hardcoded to 29 tries which ends up taking a long time for the druid node to shutdown in case of ZK connectivity loss.
Having a shorter retry count helps k8s deployments to fail fast. In situations where the underlying k8s node loses network connectivity or is no longer able to talk to zookeeper, failing fast can trigger pod restarts which can then reassign the pod to a healthy k8s node.
Existing behavior is preserved, but users can override this property if needed.
2023-03-21 14:45:28 +05:30
Adarsh Sanjeev 143fdcfacf
Change test name so it triggers in CI (#13844)
As the name of the class did not end or start with "Test", CalciteSelectQueryMSQTest was not triggered in CI. This PR renames the test.
2023-03-20 15:55:52 +05:30
Tejaswini Bandlamudi 1c250a0bc0
Fix error in cron job ITs workflow (#13945) 2023-03-17 17:29:45 +05:30
John Gozde 38adac4369
Dart sass (#13937)
* Run npx saas-migrator division

* Switch to dart sass

* Upgrade blueprint

* Remove deprecated import syntax

* Prettify

* Snapshots
2023-03-16 12:44:24 -07:00
Karan Kumar bf13156b55
Regression bug fix where ever LimitFrameProcessor's were used. (#13941) 2023-03-16 09:18:18 -07:00
Karan Kumar 67df1324ee
Undocumenting certain context parameter in MSQ. (#13928)
* Removing intermediateSuperSorterStorageMaxLocalBytes, maxInputBytesPerWorker, composedIntermediateSuperSorterStorageEnabled, clusterStatisticsMergeMode from docs

* Adding documentation in the context class.
2023-03-16 17:56:44 +05:30
Tejaswini Bandlamudi da197c9273
Migrate existing jdk11 ITs to cron job (#13918)
This cron job runs on the latest commit of the master branch by default daily at 3:00 AM UTC.
2023-03-16 15:30:07 +05:30
Tejaswini Bandlamudi 6837289cb0
Fixes parquet uint_32 datatype conversion (#13935)
After parquet ingestion, uint_32 parquet datatypes are stored as null values in the dataSource. This PR fixes this conversion bug.
2023-03-16 15:27:38 +05:30
abhagraw c7d864d3bc
Update container creation in AzureTestUtil.java (#13911)
*
1. Handling deletion/creation of container created during the previously run test in AzureTestUtil.java.
2. Adding/updating log messages and comments in Azure and GCS deep storage tests.
2023-03-16 11:04:43 +05:30
317brian 65a663adbb
docs: clarify Java precision (#13671)
Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
2023-03-15 11:43:41 -07:00
Benedict Jin cee2dfd768
Upgrade ZK from 3.5.9 to 3.5.10 to avoid data inconsistency risk (#13715) 2023-03-15 19:21:09 +05:30
Andreas Maechler 46766d245c
Replace deprecated substr with slice (#13822) 2023-03-15 03:57:06 -07:00
Clint Wylie ed57c5c853
better FrontCodedIndexed (#13854)
* Adds new implementation of 'frontCoded' string encoding strategy, which writes out a v1 FrontCodedIndexed which stores buckets on a prefix of the previous value instead of the first value in the bucket
2023-03-14 18:14:11 -07:00
somu-imply 7ce3371730
Fixing a github workflow to resolve conflicts and use the correct tag for jdk (#13933) 2023-03-14 16:06:27 -07:00
somu-imply a7ba361666
Refactoring and bug fixes on top of unnest. The allowList now is not passed … (#13922)
* Refactoring and bug fixes on top of unnest. The filter now is passed inside the unnest cursors. Added tests for scenarios such as
1. filter on unnested column which involves a left filter rewrite
2. filter on unnested virtual column which pushes the filter to the right only and involves no rewrite
3. not filters
4. SQL functions applied on top of unnested column
5. null present in first row of the column to be unnested
2023-03-14 16:05:56 -07:00
Paul Rogers 4493275d88
Use Maven central repo rather than Apache (#13921)
* Use Maven central repo rather than Apache

* Disable snapshots
2023-03-13 10:49:32 -07:00
Karan Kumar 29b6bf0942
Removing the forbidden check on getOrDefault due to java8 incompatibility. (#13920) 2023-03-11 09:49:32 +05:30
Elliott Freis 8a1dc2f51c
We want to tag the container based on the build jdk version, not the runtime version (#13917)
Co-authored-by: Elliott Freis <elliottfreis@Elliott-Freis.earth.dynamic.blacklight.net>
2023-03-10 11:35:33 -08:00
Suneet Saldanha 44547614ae
Report engine as a dimension for sqlQuery metrics (#13906)
* Report engine as a dimension for sqlQuery metrics

* docs
2023-03-10 11:23:57 -08:00
Karan Kumar 67be70e82e
Removing the forbidden check until we find a fix for java 8 to unblock builds. (#13910) 2023-03-10 21:37:19 +05:30
Gian Merlino 4b1ffbc452
Various changes and fixes to UNNEST. (#13892)
* Various changes and fixes to UNNEST.

Native changes:

1) UnnestDataSource: Replace "column" and "outputName" with "virtualColumn".
   This enables pushing expressions into the datasource. This in turn
   allows us to do the next thing...

2) UnnestStorageAdapter: Logically apply query-level filters and virtual
   columns after the unnest operation. (Physically, filters are pulled up,
   when possible.) This is beneficial because it allows filters and
   virtual columns to reference the unnested column, and because it is
   consistent with how the join datasource works.

3) Various documentation updates, including declaring "unnest" as an
   experimental feature for now.

SQL changes:

1) Rename DruidUnnestRel (& Rule) to DruidUnnestRel (& Rule). The rel
   is simplified: it only handles the UNNEST part of a correlated join.
   Constant UNNESTs are handled with regular inline rels.

2) Rework DruidCorrelateUnnestRule to focus on pulling Projects from
   the left side up above the Correlate. New test testUnnestTwice verifies
   that this works even when two UNNESTs are stacked on the same table.

3) Include ProjectCorrelateTransposeRule from Calcite to encourage
   pushing mappings down below the left-hand side of the Correlate.

4) Add a new CorrelateFilterLTransposeRule and CorrelateFilterRTransposeRule
   to handle pulling Filters up above the Correlate. New tests
   testUnnestWithFiltersOutside and testUnnestTwiceWithFilters verify
   this behavior.

5) Require a context feature flag for SQL UNNEST, since it's undocumented.
   As part of this, also cleaned up how we handle feature flags in SQL.
   They're now hooked into EngineFeatures, which is useful because not
   all engines support all features.
2023-03-10 16:42:08 +05:30
AdheipSingh 64b67c22c4
add latest version of druid operator to integeration tests (#13883)
* add latest version of druid operator to integeration tests

* Update integration-tests/k8s/tiny-cluster.yaml
2023-03-10 16:11:25 +05:30
EylonLevy 29b519a7a7
Add ability to add config files when using druid on Kubernetes (#13795)
You can now add additional configuration files to be copied to the final conf directory on startup when running in a containerized environment. Useful for running on Kubernetes and needing to add more files with a config map. To specify the path where the configMap is mounted, utilize the DRUID_ADDITIONAL_CONF_DIR environment variable.
2023-03-10 14:52:02 +05:30
imply-cheddar 6b90a320cf
Add back function signature for compat (#13914)
* Add back function signature for compat

* Suppress IntelliJ Error
2023-03-09 21:06:34 -08:00
Laksh Singla c16d9da35a
Improve documentation for tombstone generation and minor improvement (#13907)
* As a follow up to #13893, this PR improves the comments added along with examples for the code, as well as adds handling for an edge case where the generated tombstone boundaries were overshooting the bounds of MIN_TIME (or MAX_TIME).
2023-03-10 06:59:51 +05:30
Laksh Singla 5b0b3a9b2c
Add a readOnly() method for PartitionedOutputChannel (#13755)
With SuperSorter using the PartitionedOutputChannels for sorting, it might OOM on inputs of reasonable size because the channel consists of both the writable frame channel and the frame allocator, both of which are not required once the output channel has been written to.
This change adds a readOnly to the output channel which contains only the readable channel, due to which unnecessary memory references to the writable channel and the memory allocator are lost once the output channel has been written to, preventing the OOM.
2023-03-10 06:58:00 +05:30
Gian Merlino bf39b4d313
Window planning: use collation traits, improve subquery logic. (#13902)
* Window planning: use collation traits, improve subquery logic.

SQL changes:

1) Attach RelCollation (sorting) trait to any PartialDruidQuery
   that ends in AGGREGATE or AGGREGATE_PROJECT. This allows planning to
   take advantage of the fact that Druid sorts by dimensions when
   doing aggregations.

2) Windowing: inspect RelCollation trait from input, and insert naiveSort
   if, and only if, necessary.

3) Windowing: add support for Project after Window, when the Project
   is a simple mapping. Helps eliminate subqueries.

4) DruidRules: update logic for considering subqueries to reflect that
   subqueries are not required to be GroupBys, and that we have a bunch
   of new Stages now. With all of this evolution that has happened, the
   old logic didn't quite make sense.

Native changes:

1) Use merge sort (stable) rather than quicksort when sorting
   RowsAndColumns. Makes it easier to write test cases for plans that
   involve re-sorting the data.

* Changes from review.

* Mark the bad test as failing.

* Additional update.

* Fix failingTest.

* Fix tests.

* Mark a var final.
2023-03-09 15:48:13 -08:00
Gian Merlino fe9d0c46d5
Improve memory efficiency of WrappedRoaringBitmap. (#13889)
* Improve memory efficiency of WrappedRoaringBitmap.

Two changes:

1) Use an int[] for sizes 4 or below.
2) Remove the boolean compressRunOnSerialization. Doesn't save much
   space, but it does save a little, and it isn't adding a ton of value
   to have it be configurable. It was originally configurable in case
   anything broke when enabling it, but it's been a while and nothing
   has broken.

* Slight adjustment.

* Adjust for inspection.

* Updates.

* Update snaps.

* Update test.

* Adjust test.

* Fix snaps.
2023-03-09 15:48:02 -08:00
Clint Wylie 48ac5ce50b
use native nvl expression for SQL NVL and 2 argument COALESCE (#13897)
* use custom case operator conversion instead of direct operator conversion, to produce native nvl expression for SQL NVL and 2 argument COALESCE, and add optimization for certain case filters from coalesce and nvl statements
2023-03-09 05:46:17 -08:00
Gian Merlino 90d8f67e3d
Avoid creating new RelDataTypeFactory during SQL planning. (#13904)
* Avoid creating new RelDataTypeFactory during SQL planning.

Reduces unnecessary CPU cycles.

* Fix.
2023-03-08 21:55:49 -08:00
Laksh Singla dc67296e9d
Fix for OOM in the Tombstone generating logic in MSQ (#13893)
fix OOMs using a different logic for generating tombstones

---------

Co-authored-by: Paul Rogers <paul-rogers@users.noreply.github.com>
2023-03-08 21:38:08 -08:00
Clint Wylie c7f4bb5056
fix KafkaInputFormat when used with Sampler API (#13900)
* fix KafkaInputFormat when used with Sampler API

* handle key format sampling the same as value format sampling
2023-03-08 16:23:24 -08:00
Gian Merlino 82f7a56475
Sort-merge join and hash shuffles for MSQ. (#13506)
* Sort-merge join and hash shuffles for MSQ.

The main changes are in the processing, multi-stage-query, and sql modules.

processing module:

1) Rename SortColumn to KeyColumn, replace boolean descending with KeyOrder.
   This makes it nicer to model hash keys, which use KeyOrder.NONE.

2) Add nullability checkers to the FieldReader interface, and an
   "isPartiallyNullKey" method to FrameComparisonWidget. The join
   processor uses this to detect null keys.

3) Add WritableFrameChannel.isClosed and OutputChannel.isReadableChannelReady
   so callers can tell which OutputChannels are ready for reading and which
   aren't.

4) Specialize FrameProcessors.makeCursor to return FrameCursor, a random-access
   implementation. The join processor uses this to rewind when it needs to
   replay a set of rows with a particular key.

5) Add MemoryAllocatorFactory, which is embedded inside FrameWriterFactory
   instead of a particular MemoryAllocator. This allows FrameWriterFactory
   to be shared in more scenarios.

multi-stage-query module:

1) ShuffleSpec: Add hash-based shuffles. New enum ShuffleKind helps callers
   figure out what kind of shuffle is happening. The change from SortColumn
   to KeyColumn allows ClusterBy to be used for both hash-based and sort-based
   shuffling.

2) WorkerImpl: Add ability to handle hash-based shuffles. Refactor the logic
   to be more readable by moving the work-order-running code to the inner
   class RunWorkOrder, and the shuffle-pipeline-building code to the inner
   class ShufflePipelineBuilder.

3) Add SortMergeJoinFrameProcessor and factory.

4) WorkerMemoryParameters: Adjust logic to reserve space for output frames
   for hash partitioning. (We need one frame per partition.)

sql module:

1) Add sqlJoinAlgorithm context parameter; can be "broadcast" or
   "sortMerge". With native, it must always be "broadcast", or it's a
   validation error. MSQ supports both. Default is "broadcast" in
   both engines.

2) Validate that MSQs do not use broadcast join with RIGHT or FULL join,
   as results are not correct for broadcast join with those types. Allow
   this in native for two reasons: legacy (the docs caution against it,
   but it's always been allowed), and the fact that it actually *does*
   generate correct results in native when the join is processed on the
   Broker. It is much less likely that MSQ will plan in such a way that
   generates correct results.

3) Remove subquery penalty in DruidJoinQueryRel when using sort-merge
   join, because subqueries are always required, so there's no reason
   to penalize them.

4) Move previously-disabled join reordering and manipulation rules to
   FANCY_JOIN_RULES, and enable them when using sort-merge join. Helps
   get to better plans where projections and filters are pushed down.

* Work around compiler problem.

* Updates from static analysis.

* Fix @param tag.

* Fix declared exception.

* Fix spelling.

* Minor adjustments.

* wip

* Merge fixups

* fixes

* Fix CalciteSelectQueryMSQTest

* Empty keys are sortable.

* Address comments from code review. Rename mux -> mix.

* Restore inspection config.

* Restore original doc.

* Reorder imports.

* Adjustments

* Fix.

* Fix imports.

* Adjustments from review.

* Update header.

* Adjust docs.
2023-03-08 14:19:39 -08:00
Gian Merlino f0fb094cc7
Fix start-druid for indexers. (#13891)
There was an unused parameter causing the unpack to fail.
2023-03-08 10:32:07 -08:00
Clint Wylie 68db39d08a
fix ci (#13901)
This PR is #13899 plus spotbugs fix to fix the failures introduced by #13815
2023-03-08 16:55:47 +05:30
Abhishek Agarwal 52bd9e6adb
Improved error message when topic name changes within same supervisor (#13815)
Improved error message when topic name changes within same supervisor

Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>
2023-03-07 18:10:18 -08:00
Adarsh Sanjeev ef82756176
Add validation for aggregations on __time (#13793)
* Add validation for aggregations on __time
2023-03-07 17:16:36 -08:00
Nicholas Lippis faac43eabe
Use base task dir in kubernetes task runner (#13880)
* Use TaskConfig to get task dir in KubernetesTaskRunner

* Use the first path specified in baseTaskDirPaths instead of deprecated baseTaskDirPath

* Use getBaseTaskDirPaths in generate command
2023-03-07 15:30:42 -07:00
Clint Wylie 3924f0eff4
use Calcites.getColumnTypeForRelDataType for SQL CAST operator conversion (#13890)
* use Calcites.getColumnTypeForRelDataType for SQL CAST operator conversion

* fix comment

* intervals are strings but also longs
2023-03-07 13:12:15 -08:00
Vadim Ogievetsky ca4df85941
fix SQL in segment card (#13895) 2023-03-07 13:08:04 -08:00
Gian Merlino fcfb7b8ff6
Add warning comments to Granularity.getIterable. (#13888)
This function is notorious for causing memory exhaustion and excessive
CPU usage; so much so that it was valuable to work around it in the
SQL planner in #13206. Hopefully, a warning comment will encourage
developers to stay away and come up with solutions that do not involve
computing all possible buckets.
2023-03-06 22:57:10 -08:00