* from start to step 3 of Ingest data using Theta sketches
* updated up to "Query the Theta sketch column"
* fixed sentence
* another typo
* using sql ingestion instead of batch-sql
* waiting for explanations on DS_THETA
* Revert "using sql ingestion instead of batch-sql"
This reverts commit b95fcb9b32608dba55deee9910e295f2391d77e2.
* Revert "using sql ingestion instead of batch-sql"
This reverts commit b95fcb9b32608dba55deee9910e295f2391d77e2.
* just copy and pasting to where I was
* updated tutorial
* fixing images, and removing unused
* slightly updating explanation
* Update docs/tutorials/tutorial-sketches-theta.md
* Apply suggestions from code review
* addressing comments in review
* made filter clause consistent with other instances
* Apply suggestions from code review
---------
Co-authored-by: Edgar Melendrez <evmelendrez@gmail.com>
Co-authored-by: Benedict Jin <asdf2014@apache.org>
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* draft update to known issues
* Update known issues
Remove addressed known issues. Clarify the issue with SELECT * queries.
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
Fixes #16587
Streaming ingestion tasks allocate segments before ingesting rows.
Replica tasks may send different allocation requests, but all replicas
must receive the same segment id for a given (datasource, interval,
version, sequenceName).
This patch fixes the bug by ignoring the previousSegmentId when skipLineageCheck is true.
Co-authored-by: AmatyaAvadhanula <amatya.avadhanula@imply.io>
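A minimal sketch of the idea behind the fix, with illustrative names rather than Druid's actual allocation API:

```java
import java.util.Objects;

// Illustrative only: shows why ignoring previousSegmentId when
// skipLineageCheck is true makes allocation deterministic across replicas.
public class SegmentAllocationKeyExample
{
  static String allocationKey(
      String dataSource,
      String interval,
      String version,
      String sequenceName,
      String previousSegmentId,
      boolean skipLineageCheck
  )
  {
    if (skipLineageCheck) {
      // Replicas may disagree on previousSegmentId, so it must not
      // participate in matching when lineage checks are skipped.
      return String.join("|", dataSource, interval, version, sequenceName);
    }
    // Lineage-checked allocation still matches on the previous segment id.
    return String.join("|", dataSource, interval, version, sequenceName,
        Objects.toString(previousSegmentId, ""));
  }

  public static void main(String[] args)
  {
    // Two replicas sending different previousSegmentIds now map to the same key.
    String a = allocationKey("wiki", "2024-01-01/2024-01-02", "v1", "seq_0", "segA", true);
    String b = allocationKey("wiki", "2024-01-01/2024-01-02", "v1", "seq_0", "segB", true);
    System.out.println(a.equals(b)); // true
  }
}
```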
The patch makes the following changes:
1. Fixes a bug causing compaction to fail on array, complex, and other non-primitive-type columns
2. Updates compaction status check to be conscious of partition dimensions when comparing dimension ordering.
3. Ensures only string columns are specified as partition dimensions
4. Ensures `rollup` is true if and only if metricsSpec is non-empty (sketched below)
5. Ensures disjoint intervals aren't submitted for compaction
6. Adds `compactionReason` to compaction task context.
(cherry picked from commit 7e35e50052ee1b4f4d65222e0d5c4883e9fa26da)
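A minimal sketch of checks 3 and 4 above, with hypothetical names rather than the actual compaction code:

```java
import java.util.List;

// Illustrative validation for points 3 and 4 in the list above.
public class CompactionSpecChecks
{
  static void validate(
      List<String> partitionDimensions,
      List<String> stringTypedColumns,
      boolean rollup,
      List<Object> metricsSpec
  )
  {
    // (3) Only string columns may be partition dimensions.
    for (String dim : partitionDimensions) {
      if (!stringTypedColumns.contains(dim)) {
        throw new IllegalArgumentException("Partition dimension [" + dim + "] must be a string column");
      }
    }
    // (4) rollup must be true if and only if metricsSpec is non-empty.
    if (rollup != !metricsSpec.isEmpty()) {
      throw new IllegalArgumentException("rollup must be true if and only if metricsSpec is non-empty");
    }
  }
}
```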
changes:
adds ExpressionProcessing.allowVectorizeFallback() and ExpressionProcessingConfig.allowVectorizeFallback(), defaulting to false until a few remaining bugs can be fixed (mostly complex types and some odd interactions with mixed types)
adds cannotVectorizeUnlessFallback functions to make it easy to toggle the default of this config, and easy to know what to delete when we remove it in the future
Co-authored-by: Clint Wylie <cwylie@apache.org>
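A rough sketch of the flag's shape; only the method names above come from the patch, everything else here is a stand-in:

```java
// Illustrative shape only, not Druid's actual config class.
public class ExpressionProcessingConfigSketch
{
  private final boolean allowVectorizeFallback;

  public ExpressionProcessingConfigSketch(boolean allowVectorizeFallback)
  {
    this.allowVectorizeFallback = allowVectorizeFallback;
  }

  // Defaults to false until the remaining bugs are fixed.
  public boolean allowVectorizeFallback()
  {
    return allowVectorizeFallback;
  }

  // Mirrors the intent of cannotVectorizeUnlessFallback: an expression that
  // cannot normally vectorize may do so only when fallback is enabled.
  public boolean canVectorize(boolean expressionSupportsVectorization)
  {
    return expressionSupportsVectorization || allowVectorizeFallback;
  }
}
```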
* abstract `IncrementalIndex` cursor stuff to prepare for using different "views" of the data based on the cursor build spec (#17064)
* abstract `IncrementalIndex` cursor stuff to prepare to allow for possibility of using different "views" of the data based on the cursor build spec
changes:
* introduce `IncrementalIndexRowSelector` interface to capture how `IncrementalIndexCursor` and `IncrementalIndexColumnSelectorFactory` read data (see the sketch after this list)
* `IncrementalIndex` implements `IncrementalIndexRowSelector`
* move `FactsHolder` interface to separate file
* other minor refactorings
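A rough sketch of the extraction described in the first bullet; method names here are illustrative, not the actual interface:

```java
// Both the cursor and the column selector factory read rows through one
// interface, so alternate "views" of the data can be swapped in later.
interface IncrementalIndexRowSelectorSketch
{
  int getNumRows();

  Object getValue(int rowIndex, String columnName);
}

// IncrementalIndex then implements the interface, so consumers depend on
// the abstraction rather than the concrete index.
class IncrementalIndexViewSketch implements IncrementalIndexRowSelectorSketch
{
  @Override
  public int getNumRows()
  {
    return 0; // placeholder
  }

  @Override
  public Object getValue(int rowIndex, String columnName)
  {
    return null; // placeholder
  }
}
```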
* add DataSchema.Builder to tidy stuff up a bit (#17065)
* add DataSchema.Builder to tidy stuff up a bit
* fixes
* fixes
* more style fixes
* review stuff
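A self-contained sketch of the builder shape this change introduces; fields and method names are illustrative, not DataSchema's actual API:

```java
// Stand-in for DataSchema; the real class has many more fields, which is
// exactly why a builder tidies things up.
public class DataSchemaSketch
{
  private final String dataSource;
  private final String timestampColumn;

  private DataSchemaSketch(String dataSource, String timestampColumn)
  {
    this.dataSource = dataSource;
    this.timestampColumn = timestampColumn;
  }

  public static Builder builder()
  {
    return new Builder();
  }

  public static class Builder
  {
    private String dataSource;
    private String timestampColumn = "__time";

    public Builder withDataSource(String dataSource)
    {
      this.dataSource = dataSource;
      return this;
    }

    public Builder withTimestampColumn(String timestampColumn)
    {
      this.timestampColumn = timestampColumn;
      return this;
    }

    public DataSchemaSketch build()
    {
      return new DataSchemaSketch(dataSource, timestampColumn);
    }
  }

  public static void main(String[] args)
  {
    DataSchemaSketch schema = builder().withDataSource("wiki").build();
    System.out.println(schema.dataSource + " / " + schema.timestampColumn);
  }
}
```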
* Projections prototype (#17214)
In a Dart query, all Historicals are given worker IDs, but not all of them
are going to actually be started or receive work orders.
Attempting to send a getCounters or postFinish command to a worker that
never received a work order is not only wasteful, but also causes errors
because those workers do not know about that query ID.
Co-authored-by: Gian Merlino <gianmerlino@gmail.com>
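A minimal sketch of the guard this implies, with hypothetical names:

```java
import java.util.Set;

// Illustrative guard: only contact workers that actually received a work order.
public class DartWorkerPollSketch
{
  static void postFinishToWorkers(Set<String> allWorkerIds, Set<String> workersWithWorkOrders)
  {
    for (String workerId : allWorkerIds) {
      if (!workersWithWorkOrders.contains(workerId)) {
        // This worker was assigned an id but never started; skipping it avoids
        // wasted round-trips and "unknown query" errors on the worker side.
        continue;
      }
      // postFinish(workerId) would go here (hypothetical call).
    }
  }
}
```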
* MSQ: Properly report errors that occur when starting up RunWorkOrder. (#17069)
* Dart: Smoother handling of stage early-exit. (#17228)
---------
Co-authored-by: Gian Merlino <gianmerlino@gmail.com>
* SQL: Use regular filters for time filtering in subqueries. (#17173)
* RunWorkOrder: Account for two simultaneous statistics collectors. (#17216)
* DartTableInputSpecSlicer: Fix for TLS workers. (#17224)
* Upgrade avro - minor version (#17230)
* SuperSorter: Don't set allDone if it's already set. (#17238)
* Decoupled planning: improve join support (#17039)
---------
Co-authored-by: Gian Merlino <gianmerlino@gmail.com>
Co-authored-by: Abhishek Agarwal <1477457+abhishekagarwal87@users.noreply.github.com>
Co-authored-by: Zoltan Haindrich <kirk@rxd.hu>
Backport of the following patches
* MSQ profile for Brokers and Historicals. (#17140)
* Remove workerId parameter from postWorkerError. (#17072)
---------
Co-authored-by: Gian Merlino <gianmerlino@gmail.com>
* ScanQueryFrameProcessor: Close CursorHolders as we go along. (#17152)
* fix issue with ScanQueryFrameProcessor cursor build not adjusting intervals (#17168)
---------
Co-authored-by: Gian Merlino <gianmerlino@gmail.com>
Co-authored-by: Clint Wylie <cwylie@apache.org>
Backport the following patches for a clean backport of Dart changes
1. Add "targetPartitionsPerWorker" setting for MSQ. (#17048)
2. MSQ: Improved worker cancellation. (#17046)
3. Add "includeAllCounters()" to WorkerContext. (#17047)
4. MSQ: Include worker context maps in WorkOrders. (#17076)
5. TableInputSpecSlicer changes to support running on Brokers. (#17074)
6. Fix call to MemoryIntrospector in IndexerControllerContext. (#17066)
7. MSQ: Add QueryKitSpec to encapsulate QueryKit params. (#17077)
8. MSQ: Use task context flag useConcurrentLocks to determine task lock type (#17193)
* Web console: add support for Dart engine (#17147)
* add console support for Dart engine
This reverts commit 6e46edf15dd55e5c51a1a4068e83deba4f22529b.
* feedback fixes
* surface new fields
* prioritize error over results
* better metadata refresh
* feedback fixes
* Web console: misc fixes to the Explore view (#17213)
* make record table able to hide column
* stickiness
* refactor query log
* fix measure drag
* start nested column dialog
* nested expand
* fix filtering on Measures
* use output name
* fix scrolling
* select all / none
* use ARRAY_CONCAT_AGG
* no need to limit if aggregating
* remove magic number
* better search
* update arg list
* add, don't replace
Currently, TaskDataSegmentProvider fetches the DataSegment from the Coordinator while loading the segment, but discards it afterward. This PR refactors the provider to also return the DataSegment so that workers can use it without a separate fetch.
Co-authored-by: Adarsh Sanjeev <adarshsanjeev@gmail.com>
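A minimal sketch of the refactor's shape, using hypothetical types:

```java
// Hypothetical wrapper: the provider now returns the Coordinator-fetched
// DataSegment metadata together with the loaded segment instead of discarding it.
public class FetchedSegment<S, M>
{
  public final S segment;     // the loaded, queryable segment
  public final M dataSegment; // metadata previously fetched and then discarded

  public FetchedSegment(S segment, M dataSegment)
  {
    this.segment = segment;
    this.dataSegment = dataSegment;
  }
}
```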
The "suggested server memory" figure needs to take into account
maxConcurrentStages. The fix here does not affect the main memory
calculations, but it does affect the accuracy of error messages.
Co-authored-by: Gian Merlino <gianmerlino@gmail.com>
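A sketch of the corrected arithmetic, assuming the suggested figure simply scales per-stage needs by maxConcurrentStages (an assumption; the real calculation has more inputs):

```java
// Illustrative only: per-stage memory needs scale with the number of
// concurrently running stages.
public class SuggestedMemorySketch
{
  static long suggestedServerMemory(long perStageBytes, int maxConcurrentStages)
  {
    // Before the fix, error messages effectively assumed maxConcurrentStages = 1.
    return perStageBytes * maxConcurrentStages;
  }

  public static void main(String[] args)
  {
    System.out.println(suggestedServerMemory(256_000_000L, 2)); // 512000000
  }
}
```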
This patch makes the following adjustments to enable writing larger
single rows to frames:
1) RowBasedFrameWriter: Max out allocation size on the final doubling (sketched below).
i.e., if the final allocation "naturally" would be 1 MiB but the
max frame size is 900 KiB, use 900 KiB rather than failing the 1 MiB
allocation.
2) AppendableMemory: In reserveAdditional, release the last block if it
is empty. This eliminates waste when a frame writer uses a
successive-doubling approach to find the right allocation size.
3) ArenaMemoryAllocator: Reclaim memory from the last allocation when
the last allocation is closed.
Prior to these changes, a single row could be much smaller than the
frame size and still fail to be added to the frame.
Co-authored-by: Gian Merlino <gianmerlino@gmail.com>
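A minimal sketch of change (1) above, with illustrative names:

```java
// When doubling would overshoot the max frame size, clamp to the max
// instead of failing the allocation.
public class FrameAllocationSketch
{
  static long nextAllocationSize(long currentSize, long maxFrameSize)
  {
    long doubled = currentSize * 2;
    // Max out on the final doubling rather than failing: a 900 KiB cap
    // turns a "natural" 1 MiB request into a 900 KiB one.
    return Math.min(doubled, maxFrameSize);
  }

  public static void main(String[] args)
  {
    System.out.println(nextAllocationSize(512 << 10, 900 << 10)); // 921600, not 1048576
  }
}
```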
This patch reworks memory management to better support multi-threaded
workers running in shared JVMs. There are two main changes.
First, processing buffers and threads are moved from a per-JVM model to
a per-worker model. This enables queries to hold processing buffers
without blocking other concurrently-running queries. Changes:
- Introduce ProcessingBuffersSet and ProcessingBuffers to hold the
per-worker and per-work-order processing buffers (respectively). On Peons,
this is the JVM-wide processing pool. On Indexers, this is a per-worker
pool of on-heap buffers. (This change fixes a bug on Indexers where
excessive processing buffers could be used if MSQ tasks ran concurrently
with realtime tasks.)
- Add "bufferPool" argument to GroupingEngine#process so a per-worker pool
can be passed in.
- Add "druid.msq.task.memory.maxThreads" property, which controls the
maximum number of processing threads to use per task. This allows usage of
multiple processing buffers per task if admins desire.
- IndexerWorkerContext acquires processingBuffers when creating the FrameContext
for a work order, and releases them when closing the FrameContext.
- Add "usesProcessingBuffers()" to FrameProcessorFactory so workers know
how many sets of processing buffers are needed to run a given query.
Second, adjustments to how WorkerMemoryParameters slices up bundles, to
favor more memory for sorting and segment generation. Changes:
- Instead of using same-sized bundles for processing and for sorting,
workers now use minimally-sized processing bundles (just enough to read
inputs plus a little overhead). The rest is devoted to broadcast data
buffering, sorting, and segment-building.
- Segment-building is now limited to 1 concurrent segment per work order.
This allows each segment-building action to use more memory. Note that
segment-building is internally multi-threaded to a degree. (Build and
persist can run concurrently.)
- Simplify frame size calculations by removing the distinction between
"standard" and "large" frames. The new default frame size is the same
as the old "standard" frames, 1 MB. The original goal of the large
frames was to reduce the number of temporary files during sorting, but
I think we can achieve the same thing by simply merging a larger number
of standard frames at once.
- Remove the small worker adjustment that was added in #14117 to account
for an extra frame involved in writing to durable storage. Instead,
account for the extra frame whenever we are actually using durable storage.
- Cap super-sorter parallelism using the number of output partitions, rather
than using a hard-coded cap of 4. Note that in practice, so far, this cap
has not been relevant for tasks because they have only been using a single
processing thread anyway.
Co-authored-by: Gian Merlino <gianmerlino@gmail.com>
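A rough sketch of the per-worker pool idea from the first change; names and structure are illustrative, not Druid's actual classes:

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

// Buffers are acquired when a work order's FrameContext is created and
// returned when it closes, so one query cannot starve others sharing the JVM.
public class ProcessingBuffersSketch
{
  private final Deque<ByteBuffer> pool = new ArrayDeque<>();

  public ProcessingBuffersSketch(int numBuffers, int bufferSize)
  {
    for (int i = 0; i < numBuffers; i++) {
      pool.add(ByteBuffer.allocate(bufferSize)); // on-heap, as on Indexers
    }
  }

  public synchronized ByteBuffer acquire()
  {
    ByteBuffer buf = pool.poll();
    if (buf == null) {
      throw new IllegalStateException("No processing buffers available for this worker");
    }
    return buf;
  }

  public synchronized void release(ByteBuffer buf)
  {
    buf.clear();
    pool.add(buf);
  }
}
```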
changes:
* CursorHolder.isPreAggregated method indicates that a cursor has pre-aggregated data for all AggregatorFactory specified in a CursorBuildSpec. If true, engines should rewrite the query to use AggregatorFactory.getCombiningFactory, and column selector factories will provide selectors with the aggregator intermediate type for the aggregator factory name (see the sketch below)
* Added groupby, timeseries, and topN support for CursorHolder.isPreAggregated
* Added a synthetic test since no CursorHolder implementations support isPreAggregated at this point in time
Co-authored-by: Clint Wylie <cwylie@apache.org>
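A minimal sketch of the engine-side rewrite from the first bullet, assuming Druid's AggregatorFactory.getCombiningFactory() and the new CursorHolder.isPreAggregated() are on the classpath:

```java
import java.util.List;
import java.util.stream.Collectors;

import org.apache.druid.query.aggregation.AggregatorFactory;
import org.apache.druid.segment.CursorHolder;

// When the cursor reports pre-aggregated data, swap each aggregator for its
// combining form so it merges intermediate values rather than re-aggregating raw rows.
public class PreAggregatedRewriteSketch
{
  static List<AggregatorFactory> chooseAggregators(CursorHolder holder, List<AggregatorFactory> aggs)
  {
    if (holder.isPreAggregated()) {
      return aggs.stream().map(AggregatorFactory::getCombiningFactory).collect(Collectors.toList());
    }
    return aggs;
  }
}
```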
Tasks control the loading of broadcast datasources via BroadcastDatasourceLoadingSpec getBroadcastDatasourceLoadingSpec(). By default, tasks download all broadcast datasources, unless there's an override, as with kill tasks and the MSQ controller task.
The CliPeon command line option --loadBroadcastSegments is deprecated in favor of --loadBroadcastDatasourceMode.
Broadcast datasources can be specified in SQL queries through JOIN and FROM clauses, or obtained from other sources such as lookups. To this end, we have introduced a BroadcastDatasourceLoadingSpec. Finding the set of broadcast datasources during SQL planning will be done in a follow-up, which will apply only to MSQ tasks, so they load only the required broadcast datasources. This PR primarily focuses on the skeletal changes around BroadcastDatasourceLoadingSpec and integrating it from the Task interface via CliPeon to SegmentBootstrapper.
Currently, only kill tasks and MSQ controller tasks skip loading broadcast datasources.
Co-authored-by: Abhishek Radhakrishnan <abhishek.rb19@gmail.com>
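A rough sketch of the spec's shape; the two modes follow from the behavior described above, and everything else is a stand-in:

```java
// Illustrative shape only, not the actual class.
public class BroadcastDatasourceLoadingSpecSketch
{
  public enum Mode
  {
    ALL,  // default task behavior: download all broadcast datasources
    NONE  // kill tasks and MSQ controller tasks: skip loading entirely
  }

  private final Mode mode;

  public BroadcastDatasourceLoadingSpecSketch(Mode mode)
  {
    this.mode = mode;
  }

  public Mode getMode()
  {
    return mode;
  }
}
```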
* Web console query view improvements (#16991)
* Made maxNumTaskOptions configurable in the Query view
* Updated the copy for taskAssignment options
* Reordered options in engine menu for msq engine
* fixed snapshot
* maxNumTaskOptions -> maxTasksOptions
* added back select destination item
* fixed duplicate menu item
* snapshot
* Added the ability to hide certain engine menu options
* Added the ability to hide/show more menu items
* Make the tooltip better and improve structure (#17132)
* switch to using arrays by default (#17133)
* Web console: add stage graph (#17135)
* Web console: revamp the experimental explore view (#17180)
* explore revamp
* remove ToDo
* fix CodeQL
* add tooltips
* show issue on echart chars
* fix: browser back does not refresh chart
* fix maxRows 0
* be more resilient to missing __time
---------
Co-authored-by: Sébastien <sebastien@imply.io>
Updated Delta Kernel from 3.2.0 to 3.2.1. This upstream version bump contains fixes for reading long columns and the class loader, and a better retry mechanism when reading checkpoint files.
Previously, the processor used "remainingChannels" to track the number of
non-null entries of currentFrames. Now, "remainingChannels" tracks the
number of channels that are unfinished.
The difference is subtle. In the previous code, when an input channel
was blocked upon exiting nextFrame(), the "currentFrames" entry would be
null, and therefore the "remainingChannels" variable would be decremented.
After the next await and call to populateCurrentFramesAndTournamentTree(),
"remainingChannels" would be incremented if the channel had become
unblocked after awaiting.
This means that finished(), which returned true if remainingChannels was
zero, would not be reliable if called between nextFrame() and the
next await + populateCurrentFramesAndTournamentTree().
This patch changes things such that finished() is always reliable. This
fixes a regression introduced in PR #16911, which added a call to
finished() that was, at that time, unsafe.
Co-authored-by: Gian Merlino <gianmerlino@gmail.com>
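A minimal sketch of the corrected bookkeeping, with illustrative names:

```java
// Count channels that are truly finished, not channels whose current frame
// happens to be null while the channel is merely blocked awaiting data.
public class MergeChannelsSketch
{
  private final boolean[] channelFinished;
  private int remainingChannels;

  public MergeChannelsSketch(int numChannels)
  {
    this.channelFinished = new boolean[numChannels];
    this.remainingChannels = numChannels;
  }

  // Called only when a channel has no more frames, ever; a blocked channel
  // no longer decrements the counter.
  public void markFinished(int channel)
  {
    if (!channelFinished[channel]) {
      channelFinished[channel] = true;
      remainingChannels--;
    }
  }

  // Now reliable at any point, including between nextFrame() and the next await.
  public boolean finished()
  {
    return remainingChannels == 0;
  }
}
```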
These are two heavily parameterized tests that, together, account for
about 60% of runtime in the test suite.
FrameFileTest changes:
1) Cache frame files in a static, rather than building the frame file
for each parameterization of the test.
2) Adjust TestArrayCursorFactory to cache the signature, rather than
re-creating it on each call to getColumnCapabilities.
SuperSorterTest changes:
1) Dramatically reduce the number of tests that run with
"maxRowsPerFrame" = 1. These are particularly slow due to writing so
many small files. Some still run, since it's useful to test edge cases,
but much fewer than before.
2) Reduce the "maxActiveProcessors" axis of the test from [1, 2, 4] to
[1, 3]. The aim is to reduce the number of cases while still getting
good coverage of the feature.
3) Reduce the "maxChannelsPerProcessor" axis of the test from [2, 3, 8]
to [2, 7]. The aim is to reduce the number of cases while still getting
good coverage of the feature.
4) Use in-memory input channels rather than file channels.
5) Defer formatting of assertion failure messages until they are needed.
6) Cache the cursor factory and its signature in a static.
7) Cache sorted test rows (used for verification) in a static.
Co-authored-by: Gian Merlino <gianmerlino@gmail.com>
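A minimal sketch of the static-caching pattern from FrameFileTest change 1, with stand-in fixture code:

```java
import java.io.File;
import java.io.IOException;

// Build the expensive fixture once per JVM instead of once per parameterization.
public class FrameFileFixture
{
  private static File cachedFrameFile;

  static synchronized File frameFile() throws IOException
  {
    if (cachedFrameFile == null) {
      cachedFrameFile = buildFrameFile(); // expensive; now shared across parameterizations
    }
    return cachedFrameFile;
  }

  private static File buildFrameFile() throws IOException
  {
    return File.createTempFile("frame", ".dat"); // stand-in for real frame writing
  }
}
```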
changes:
* filter index processing is now automatically ordered based on estimated 'cost', approximated from how many bitmap operations are expected to be required to construct the bitmap used for the 'offset' (sketched below)
* cursorAutoArrangeFilters context flag now defaults to true, but can be set to false to disable cost-based filter index sorting
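A minimal sketch of the ordering step, with hypothetical names:

```java
import java.util.Comparator;
import java.util.List;

// Process the cheapest filter indexes first, where "cost" approximates the
// number of bitmap operations each filter needs to build its offset bitmap.
public class FilterCostOrderingSketch
{
  interface CostedFilter
  {
    long estimatedBitmapOperations();
  }

  static void sortByCost(List<CostedFilter> filters, boolean cursorAutoArrangeFilters)
  {
    if (cursorAutoArrangeFilters) { // defaults to true; set false to disable
      filters.sort(Comparator.comparingLong(CostedFilter::estimatedBitmapOperations));
    }
  }
}
```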
Fixes a mistake introduced in #16533 that could cause CursorGranularizer to incorrectly get values from a selector after calling cursor.advance, due to a missing check for cursor.isDone.
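A minimal sketch of the missing guard, using illustrative types rather than the actual CursorGranularizer:

```java
import java.util.function.LongSupplier;

// Illustrative types only; the real logic lives in CursorGranularizer.
public class GranularizerGuardSketch
{
  interface Cursor
  {
    boolean isDone();

    void advance();
  }

  static int countRowsInBucket(Cursor cursor, long bucketEnd, LongSupplier currentRowTime)
  {
    int rows = 0;
    // Re-checking isDone() after every advance() is the missing guard: without
    // it, the next iteration could read values from an exhausted cursor.
    while (!cursor.isDone() && currentRowTime.getAsLong() < bucketEnd) {
      rows++; // read from selectors here, only while rows remain
      cursor.advance();
    }
    return rows;
  }
}
```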
The existing tests are moved into a "WithMaximalBuffering" subclass,
and a new "WithMinimalBuffering" subclass is added to test cases
where only a single frame is buffered.
Co-authored-by: Gian Merlino <gianmerlino@gmail.com>