druid

Commit Graph

Author	SHA1	Message	Date
Gian Merlino	ee890965f4	LocalInputSource: Serialize File paths without forcing resolution. (#13534) * LocalInputSource: Serialize File paths without forcing resolution. Fixes #13359. * Add one more javadoc.	2022-12-19 11:47:36 +05:30
Adarsh Sanjeev	2b605aa9cf	Multiple fixes for the MSQ stats merging piece which (#13463) * Add validation checks to worker chat handler apis * Merge things and polishing the error messages. * Minor error message change * Fixing race and adding some tests * Fixing controller fetching stats from wrong workers. Fixing race Changing default mode to Parallel Adding logging. Fixing exceptions not propagated properly. * Changing to kernel worker count * Added a better logic to figure out assigned worker for a stage. * Nits * Moving to existing kernel methods * Adding more coverage Co-authored-by: cryptoe <karankumar1100@gmail.com>	2022-12-15 09:35:11 +05:30
Paul Rogers	013a12e86f	Enhanced MSQ table functions (#13360) * Enhanced MSQ table functions * HTTP, LOCALFILES and INLINE table functions powered by catalog metadata. * Documentation	2022-12-08 13:56:02 -08:00
Gian Merlino	91ef9872ec	MSQ: Improve TooManyBuckets error message, improve error docs. (#13525) 1) Edited the TooManyBuckets error message to mention PARTITIONED BY instead of segmentGranularity. 2) Added error-code-specific anchors in the docs. 3) Add information to various error codes in the docs about common causes and solutions.	2022-12-08 13:18:26 -08:00
Adarsh Sanjeev	280a0f7158	Add sequential sketch merging to MSQ (#13205) * Add sketch fetching framework * Refactor code to support sequential merge * Update worker sketch fetcher * Refactor sketch fetcher * Refactor sketch fetcher * Add context parameter and threshold to trigger sequential merge * Fix test * Add integration test for non sequential merge * Address review comments * Address review comments * Address review comments * Resolve maxRetainedBytes * Add new classes * Renamed key statistics information class * Rename fetchStatisticsSnapshotForTimeChunk function * Address review comments * Address review comments * Update documentation and add comments * Resolve build issues * Resolve build issues * Change worker APIs to async * Address review comments * Resolve build issues * Add null time check * Update integration tests * Address review comments * Add log messages and comments * Resolve build issues * Add unit tests * Add unit tests * Fix timing issue in tests	2022-11-22 09:56:32 +05:30
Jill Osborne	a860baf496	Updated docs on front coding (#13387)	2022-11-19 00:01:04 -08:00
Laksh Singla	9e938b5a6f	Add a limit to the number of columns in the CLUSTERED BY clause (#13352) * Add clustered by limit * change semantics, add docs * add fault class to the module * add test * unambiguate test	2022-11-15 22:05:15 +05:30
Andreas Maechler	03175a2b8d	Add missing MSQ error code fields to docs (#13308) * Fix typo * Fix some spacing * Add missing fields * Cleanup table spacing * Remove durable storage docs again Thanks Brian for pointing out previous discussions. * Update docs/multi-stage-query/reference.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Mark codes as code * And even more codes as code * Another set of spaces * Combine `ColumnTypeNotSupported` Thanks Karan. * More whitespaces and typos * Add spelling and fix links Co-authored-by: Charles Smith <techdocsmith@gmail.com>	2022-11-10 21:03:04 +05:30
Laksh Singla	b7a513fe09	Add a OverlordHelper that cleans up durable storage objects in MSQ (#13269) * scratch * s3 ls fix, add docs * add documentation, update method name * Add tests, address commits, change default value of the helper * fix test * update the default value of config, remove initial delay config * Trigger Build * update class * add more tests * docs update * spellcheck * remove ioe from the signature * add back dmmy constructor for initialization * fix guice bindings, intellij inspections	2022-11-09 17:23:35 +05:30
Gian Merlino	9423aa9163	MSQ: Consider PARTITION_STATS_MAX_BYTES in WorkerMemoryParameters. (#13274) * MSQ: Consider PARTITION_STATS_MAX_BYTES in WorkerMemoryParameters. This consideration is important, because otherwise we can run out of memory due to large statistics-tracking objects. * Improved calculations.	2022-11-07 14:27:18 +05:30
Gian Merlino	d1877e41ec	Use lookup memory footprint in MSQ memory computations. (#13271) * Use lookup memory footprint in MSQ memory computations. Two main changes: 1) Add estimateHeapFootprint to LookupExtractor. 2) Use this in MSQ's IndexerWorkerContext when determining the total amount of available memory. It's taken off the top. This prevents MSQ tasks from running out of memory when there are lookups defined in the cluster. * Updates from code review.	2022-11-03 07:36:54 -07:00
317brian	ae638e338c	docs(msq): update insert vs replace for dimension-based segment pruning (#13228) * docs(msq): update insert vs replace to mention dimension-based segment pruning * make suggested changes	2022-11-03 14:17:44 +05:30
Gian Merlino	d851985cf5	MSQ: Add support for indexSpec. (#13275)	2022-10-28 14:27:50 -07:00
Adarsh Sanjeev	4775427e2c	Add task start status to worker report (#13263) * Add task start status to worker report * Address review comments * Address review comments * Update documentation * Update spelling checks	2022-10-28 12:00:15 +05:30
Karan Kumar	9d51e466b1	Minor doc update for BroadcastTablesTooLarge (#13218) Minor doc update for `BroadcastTablesTooLarge`. Now the user will know what to do in case this fault is encountered.	2022-10-14 09:06:55 +05:30
317brian	0edceead80	msq: update known issue about GROUPING SETS and COUNT DISTINCT (#13185) * msq: update known issue about GROUPING SETS and COUNT DISTINCT * address feedback from Gian	2022-10-05 19:47:03 -07:00
Adarsh Sanjeev	92d2633ae6	Update ClusterByStatisticsCollectorImpl to use bytes instead of keys (#12998) * Update clusterByStatistics to use bytes instead of keys * Address review comments * Resolve checkstyle * Increase test coverage * Update test * Update thresholds * Update retained keys function * Update docs * Fix spelling	2022-10-03 12:08:23 +05:30
317brian	7fa35839c0	fix: follow naming convention for msq task engine (#13127) * fix: follow naming convention for msq task engine * more fixes * add back in experimental * fix anchor	2022-09-21 18:46:06 -07:00
Gian Merlino	d9b2968edb	Docs: Clarify the situation with SELECT. (#13109)	2022-09-17 10:47:57 -07:00
Gian Merlino	d4967c38f8	Various documentation updates. (#13107) * Various documentation updates. 1) Split out "data management" from "ingestion". Break it into thematic pages. 2) Move "SQL-based ingestion" into the Ingestion category. Adjust content so all conceptual content is in concepts.md and all syntax content is in reference.md. Shorten the known issues page to the most interesting ones. 3) Add SQL-based ingestion to the ingestion method comparison page. Remove the index task, since index_parallel is just as good when maxNumConcurrentSubTasks: 1. 4) Rename various mentions of "Druid console" to "web console". 5) Add additional information to ingestion/partitioning.md. 6) Remove a mention of Tranquility. 7) Remove a note about upgrading to Druid 0.10.1. 8) Remove no-longer-relevant task types from ingestion/tasks.md. 9) Move ingestion/native-batch-firehose.md to the hidden section. It was previously deprecated. 10) Move ingestion/native-batch-simple-task.md to the hidden section. It is still linked in some places, but it isn't very useful compared to index_parallel, so it shouldn't take up space in the sidebar. 11) Make all br tags self-closing. 12) Certain other cosmetic changes. 13) Update to node-sass 7. * make travis use node12 for docs Co-authored-by: Vadim Ogievetsky <vadim@ogievetsky.com>	2022-09-16 21:58:11 -07:00
Vadim Ogievetsky	2493eb17bf	Doc fixes around msq (#13090) * remove things that do not apply * fix more things * pin node to a working version * fix * fixes * known issues tidy up * revert auto formatting changes * remove management-uis page which is 100% lies * don't mention the Coordinator console (that no longer exits) * goodies * fix typo	2022-09-16 02:15:26 -07:00
317brian	d4233ef2a1	msq: add multi-stage-query docs (#12983) * msq: add multi-stage-query docs * add screenshots add back theta sketches tutoria change filename fix filename fix link fix headings * fixes * fixes * fix spelling issues and update spell file * address feedback from karan * add missing guardrail to known issues * update blurb * fix typo * remove durable storage info * update titles * Restore en.json * Update query view * address comments from vad * Update docs/multi-stage-query/msq-known-issues.md finish sentence * add apache license to docs * add apache license to docs Co-authored-by: Katya Macedo <katya.macedo@imply.io> Co-authored-by: Charles Smith <techdocsmith@gmail.com>	2022-09-06 23:06:09 +05:30

22 Commits