druid

Commit Graph

Author	SHA1	Message	Date
Karan Kumar	9d51e466b1	Minor doc update for BroadcastTablesTooLarge (#13218 ) Minor doc update for `BroadcastTablesTooLarge`. Now the user will know what to do in case this fault is encountered.	2022-10-14 09:06:55 +05:30
zachjsh	2f2fe20089	Improve global-cached-lookups metric reporting (#13219 ) It was found that the namespace/cache/heapSizeInBytes metric that tracks the total heap size in bytes of all lookup caches loaded on a service instance was being underreported. We were not accounting for the memory overhead of the String object, which I've found in testing to be ~40 bytes. While this overhead may be Java version dependent, it should not vary much, and accounting for this provides a better estimate. Also fixed some logging, and made reading bytes from the JDBI result set a little more efficient by saving hash table lookups. Also added some of the lookup metrics to the default statsD emitter metric whitelist.	2022-10-13 18:51:54 -04:00
Rohan Garg	45dfd679e9	Composite approach for checking in-filter values set in column dictionary (#13133 )	2022-10-13 12:32:48 +05:30
Kashif Faraz	346fbf133f	Make DimensionDictionary abstract (#13215 ) This is in preparation for eventually retiring the flag `useMaxMemoryEstimates`, after which the footprint of a value in the dimension dictionary will always be estimated using the `estimateSizeOfValue()` method.	2022-10-13 07:18:46 +05:30
Abhishek Agarwal	548d0d0bb2	Add more information to exceptions occurred while writing temporary data (#13217 ) * Add more information to exceptions when writing tmp data to disk * Better error message	2022-10-13 08:23:51 +08:00
Clint Wylie	6eff6c9ae4	fix json_value sql planning with decimal type, fix vectorized expression math null value handling in default mode (#13214 ) * fix json_value sql planning with decimal type, fix vectorized expression math null value handling in default mode changes: * json_value 'returning' decimal will now plan to native double typed query instead of ending up with default string typing, allowing decimal vector math expressions to work with this type * vector math expressions now zero out 'null' values even in 'default' mode (druid.generic.useDefaultValueForNull=false) to prevent downstream things that do not check the null vector from producing incorrect results * more better * test and why not vectorize * more test, more fix	2022-10-12 16:28:41 -07:00
Tejaswini Bandlamudi	3e13584e0e	Adds Idle feature to `SeekableStreamSupervisor` for inactive stream (#13144 ) * Idle Seekable stream supervisor changes. * nit * nit * nit * Adds unit tests * Supervisor decides its idle state instead of AutoScaler * docs update * nit * nit * docs update * Adds Kafka unit test * Adds Kafka Integration test. * Updates travis config. * Updates kafka-indexing-service dependencies. * updates previous offsets snapshot & doc * Doesn't act if supervisor is suspended. * Fixes highest current offsets fetch bug, adds new Kafka UT tests, doc changes. * Reverts Kinesis Supervisor idle behaviour changes. * nit * nit * Corrects SeekableStreamSupervisorSpec check on idle behaviour config, adds tests. * Fixes getHighestCurrentOffsets to fetch offsets of publishing tasks too * Adds Kafka Supervisor UT * Improves test coverage in druid-server * Corrects IT override config * Doc updates and Syntactic changes * nit * supervisorSpec.ioConfig.idleConfig changes	2022-10-12 18:31:08 +05:30
Clint Wylie	59e2afc566	use object[] instead of string[] for vector expressions to be consistent with vector object selectors (#13209 ) * use object[] instead of string[] for vector expressions to be consistent with vector object selectors * simplify	2022-10-12 02:53:43 -07:00
Sam Rash	80e10ffe22	CompressedBigDecimal Min/Max (#13141 ) This adds min/max functions for CompressedBigDecimal. It exposes these functions via SQL (BIG_MAX, BIG_MIN; see the SqlAggFunction implementations). It also includes various bug fixes and cleanup to the original CompressedBigDecimal code, including the AggregatorFactories. Various null handling was improved. Additional test cases were added for both new and existing code, including a base test case for AggregationFactories. Other tests common across sum, min, and max may also be refactored to share the various cases in the future.	2022-10-11 16:35:21 -07:00
Clint Wylie	9688674ea8	fix issue with nested column null value index incorrectly matching non-null values (#13211 )	2022-10-11 15:54:36 -07:00
Gian Merlino	c19ae13323	Improve direct-memory check on startup. (#13207 ) 1) Better support for Java 9+ in RuntimeInfo. This means that in many cases, an actual validation can be done. 2) Clearer log message in cases where an actual validation cannot be done.	2022-10-12 05:10:25 +08:00
Jonathan Wei	9b8e69c99a	Add inline descriptor Protobuf bytes decoder (#13192 ) * Add inline descriptor Protobuf bytes decoder * PR comments * Update tests, check for IllegalArgumentException * Fix license, add equals test * Update extensions-core/protobuf-extensions/src/main/java/org/apache/druid/data/input/protobuf/InlineDescriptorProtobufBytesDecoder.java Co-authored-by: Frank Chen <frankchen@apache.org> Co-authored-by: Frank Chen <frankchen@apache.org>	2022-10-11 13:37:28 -05:00
317brian	2a24c20454	process: update PR template to include release notes (#13188 ) * process: update PR template to include release notes * Update .github/pull_request_template.md [ci skip] Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update .github/pull_request_template.md Co-authored-by: Clint Wylie <cjwylie@gmail.com> * incorporate feedback from paul Co-authored-by: Charles Smith <techdocsmith@gmail.com> Co-authored-by: Clint Wylie <cjwylie@gmail.com>	2022-10-11 10:29:18 +08:00
Frank Chen	d30cf8c308	Dependency cleanup (#13194 ) * Clean up dependency in extensions * Bump protobuf/aws.sdk * Bump aws-sdk to 1.12.317 * Fix CI * Fix CI * Update license * Update license	2022-10-10 20:34:38 +08:00
Gian Merlino	5b519f3689	Fix null message handling in AllowedRegexErrorResponseTransformStrategy. (#13177 ) Error messages can be null. If the incoming error message is null, then return null.	2022-10-09 07:42:41 -07:00
Vadim Ogievetsky	573e12c75f	Web console: making the cell filter menu more functional, removing the old query view, and updating d3 (#13169 ) * remove old query view * update tests * add filter * fix test * bump d3 things to latest versions * rent too far into the future with d3 * make config dialogs load * goodies * update snapshots * only compute duration when running or pending	2022-10-07 12:44:40 -07:00
Charles Smith	25c1d55dd6	Clarify behavior when decommissioningMaxPercentOfMaxSegmentsToMove = 0 (#13157 ) Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>	2022-10-07 09:01:32 -07:00
Sam Rash	f89496ccac	Revert Accidental Change to Druid.xml (#13190 ) See commit 54a2eb for accidental commit	2022-10-06 14:42:35 -07:00
317brian	0edceead80	msq: update known issue about GROUPING SETS and COUNT DISTINCT (#13185 ) * msq: update known issue about GROUPING SETS and COUNT DISTINCT * address feedback from Gian	2022-10-05 19:47:03 -07:00
AmatyaAvadhanula	41e51b21c3	Make http options the default configurations (#13092 ) Druid currently uses Zookeeper dependent options as the default. This commit updates the following to use HTTP as the default instead. - task runner. `druid.indexer.runner.type=remote -> httpRemote` - load queue peon. `druid.coordinator.loadqueuepeon.type=curator -> http` - server inventory view. `druid.serverview.type=curator -> http`	2022-10-05 05:35:17 +05:30
Xavier Léauté	eff7edb603	update core Apache Kafka dependencies to 3.3.1 (#13176 ) Announcement: - https://blogs.apache.org/kafka/entry/what-rsquo-s-new-in Release notes: - https://archive.apache.org/dist/kafka/3.3.0/RELEASE_NOTES.html - https://downloads.apache.org/kafka/3.3.1/RELEASE_NOTES.html	2022-10-04 12:52:16 -07:00
Abhishek Agarwal	e3f9a0ed44	Lazy initialization of segment killers, movers and archivers (#13170 ) * Lazy initialization of segment killers, movers and archivers * Add test for lazy killer * Add more tests * Intellij fixes	2022-10-04 15:55:46 +05:30
Kashif Faraz	b07f01d645	Set useMaxMemoryEstimates=false by default (#13178 ) A value of `false` denotes that the new flow with improved estimates will be used.	2022-10-04 15:04:23 +05:30
Abhishek Agarwal	7fa53ff4b3	Exclude calcite from dependabot (#13160 ) * Exclude calcite from dependabot * Update .github/dependabot.yml Co-authored-by: Liam Newman <96086065+liam-verta@users.noreply.github.com> * Update dependabot.yml Co-authored-by: Liam Newman <96086065+liam-verta@users.noreply.github.com>	2022-10-04 10:21:11 +08:00
Vadim Ogievetsky	4bfae1deee	Docs: fix doc search (#13164 ) * fix doc search * upgrade website node to 16 * change website travis script * move spellcheck notification * explicit path to npm bin * cd to the correct place	2022-10-03 16:48:13 -07:00
Adarsh Sanjeev	92d2633ae6	Update ClusterByStatisticsCollectorImpl to use bytes instead of keys (#12998 ) * Update clusterByStatistics to use bytes instead of keys * Address review comments * Resolve checkstyle * Increase test coverage * Update test * Update thresholds * Update retained keys function * Update docs * Fix spelling	2022-10-03 12:08:23 +05:30
Vadim Ogievetsky	ebfe1c0c90	Web console: fix DQT import (#13159 ) * fix dqt import * update licenses * update tests	2022-09-30 09:31:06 -07:00
Kashif Faraz	ce5f55e5ce	Fix over-replication caused by balancing when inventory is not updated yet (#13114 ) * Add coordinator test framework * Remove outdated changes * Add more tests * Add option to auto-sync inventory * Minor cleanup * Fix inspections * Add README for simulations, add SegmentLoadingNegativeTest * Fix over-replication from balancing * Fix README * Cleanup unnecessary fields from DruidCoordinator * Add a test * Fix DruidCoordinatorTest * Remove unused import * Fix CuratorDruidCoordinatorTest * Remove test log4j2.xml	2022-09-29 12:06:23 +05:30
Abhishek Agarwal	61b34950e7	Fix assertion error in sql planning for latest aggregators (#13151 ) * Fix sql planning bug for latest aggregators * change test name * Fix error messages * fix error message again	2022-09-28 21:01:32 +05:30
AmatyaAvadhanula	acafd0d1e0	Upgrade kafka version to 3.2.3 to fix CVE (#13142 ) Upgrade to 3.2.3 to fix CVE: https://nvd.nist.gov/vuln/detail/CVE-2022-34917	2022-09-28 10:47:09 +05:30
Jill Osborne	548d810baa	Correct nested columns example (#13150 )	2022-09-28 10:39:56 +05:30
David Palmer	0d7bf66578	Add a note to the documentation about pre-built HLLSketches (#13088 ) * add a note to the documentation about pre-built HLLSketches Druid actually supports ingesting a pre-generated sketch column by using the HLLSketchMerge aggregator. However, this functionality was previously not made clear in the documentation. * copyedit from the King's English to American English * add suggested style changes Co-authored-by: Charles Smith <techdocsmith@gmail.com> Co-authored-by: Charles Smith <techdocsmith@gmail.com>	2022-09-27 10:29:39 +08:00
Apoorv Gupta	c8f4d72fb1	Fix documentation bug about injective lookups (#13147 ) replace mapping to `unique keys` with mapping to `unique values`.	2022-09-27 10:16:48 +08:00
Sam Rash	28b9edc2a8	Add BIG_SUM SQL function (#13102 ) This adds a sql function, "BIG_SUM", that uses CompressedBigDecimal to do a sum. Other misc changes: 1. handle NumberFormatExceptions when parsing a string (default to set to 0, configurable in agg factory to be strict and throw on error) 2. format pom file (whitespace) + add dependency 3. scaleUp -> scale and always require scale as a parameter	2022-09-26 18:02:25 -07:00
Jonathan Wei	1f1fced6d4	Add JsonInputFormat option to assume newline delimited JSON, improve parse exception handling for multiline JSON (#13089 ) * Add JsonInputFormat option to assume newline delimited JSON, improve handling for non-NDJSON * Fix serde and docs * Add PR comment check	2022-09-26 19:51:04 -05:00
imply-cheddar	e839660b6a	Grab the thread name in a poisoned pool (#13143 )	2022-09-26 17:09:10 -07:00
Laksh Singla	0bfa81b7df	Fix the Injector creation in HadoopTask (#13138 ) * Injector fix in HadoopTask * Log the ExtensionsConfig while instantiating the HadoopTask * Log the config in the run() method instead of the ctor	2022-09-24 10:38:25 +05:30
Adarsh Sanjeev	306f612f86	Suppress Calcite CVE (#13119 ) * Suppress Calcite CVE * Update comment	2022-09-23 16:23:26 +05:30
Vadim Ogievetsky	a910764e41	better spec conversion with issues (#13136 )	2022-09-22 10:46:57 -07:00
Vadim Ogievetsky	6c1dc6589e	initialize all counters for stages with input (#13137 )	2022-09-22 08:10:50 -07:00
Laksh Singla	728745a1d3	Add IT for MSQ task engine using the new IT framework (#12992 ) * first test, serde causing problems * serde working * insert and select check * Add cluster annotations for MSQ test cases * Add cluster config for MSQ * Add MSQ config to the pom.xml * cleanup unnecessary changes * Remove model classes * Comments, checkstyle, check queries from file * fixup test case name * build failure fix * review changes * build failure fix * Trigger Build * Log the mismatch in QueryResultsVerifier * Trigger Build * Change the signature of the results verifier * review changes * LGTM fix * build, change pom * Trigger Build * Trigger Build * trigger build with minimal pom changes * guice fix in tests * travis.yml	2022-09-22 16:09:47 +05:30
Sam Rash	044cab5094	Optimize CompressedBigDecimal compareTo() (#13086 ) Optimizes the compareTo() function in CompressedBigDecimal. It directly compares the int[] rather than creating BigDecimal objects and using its compareTo. It handles unequal sized CBDs, but does require the scales to match.	2022-09-21 20:31:02 -07:00
Vadim Ogievetsky	f1d3728371	append to existing callout (#13130 )	2022-09-21 19:39:28 -07:00
Charles Smith	eb760c3d1d	update log4j example (#13095 ) * update log4j example * fix some style issues * Update docs/configuration/logging.md Co-authored-by: Frank Chen <frankchen@apache.org> Co-authored-by: Frank Chen <frankchen@apache.org>	2022-09-22 09:46:49 +08:00
317brian	12f12a13a9	fix: fix broken postgres link (#13135 )	2022-09-22 09:46:20 +08:00
317brian	7fa35839c0	fix: follow naming convention for msq task engine (#13127 ) * fix: follow naming convention for msq task engine * more fixes * add back in experimental * fix anchor	2022-09-21 18:46:06 -07:00
Gian Merlino	2f731f356e	Update pull-deps docs with correct repo list. (#13134 ) There is only one default remote repo at this time.	2022-09-21 12:16:57 -07:00
Jonathan Wei	331e6d707b	Add KafkaConfigOverrides extension point (#13122 ) * Add KafkaConfigOverrides extension point * X	2022-09-21 11:47:19 +05:30
Katya Macedo	90d14f629a	spatial-filters (#13124 )	2022-09-20 22:48:36 -07:00
Kashif Faraz	0039409817	Add test framework to simulate segment loading and balancing (#13074 ) Fixes #12822 The framework added here makes it easy to write tests that verify the behaviour and interactions of the following entities under various conditions: - `DruidCoordinator` - `HttpLoadQueuePeon`, `LoadQueueTaskMaster` - coordinator duties: `BalanceSegments`, `RunRules`, `UnloadUnusedSegments`, etc. - datasource retention rules: `LoadRule`, `DropRule` Changes: Add the following main classes: - `CoordinatorSimulation` and related interfaces to dictate behaviour of simulation - `CoordinatorSimulationBuilder` to build a simulation. - `BlockingExecutorService` to keep submitted tasks in queue and execute them only when explicitly invoked. Add tests: - `CoordinatorSimulationBaseTest`, `SegmentLoadingTest`, `SegmentBalancingTest` - `SegmentLoadingNegativeTest` to contain tests which assert the existing erroneous behaviour of segment loading. Once the behaviour is fixed, these tests will be moved to the regular `SegmentLoadingTest`. Please refer to the README.md in `org.apache.druid.server.coordinator.simulate` for more details.	2022-09-21 09:51:58 +05:30

12112 Commits