druid

Commit Graph

Author	SHA1	Message	Date
Tejaswini Bandlamudi	984904779b	Increase default DatasourceCompactionConfig.inputSegmentSizeBytes to Long.MAX_VALUE (#12381 ) The current default value of inputSegmentSizeBytes is 400MB, which is pretty low for most compaction use cases. Thus most users are forced to override the default. The default value is now increased to Long.MAX_VALUE.	2022-04-04 16:28:53 +05:30
Yuanli Han	f2495a67d2	fix messageGap metric (#12337 )	2022-03-28 09:21:06 -07:00
Maytas Monsereenusorn	ea51d8a16c	Duties in Indexing group (such as Auto Compaction) does not report metrics (#12352 ) * add impl * add unit tests * fix checkstyle * address comments * fix checkstyle	2022-03-23 18:18:28 -07:00
Jihoon Son	b6eeef31e5	Store null columns in the segments (#12279 ) * Store null columns in the segments * fix test * remove NullNumericColumn and unused dependency * fix compile failure * use guava instead of apache commons * split new tests * unused imports * address comments	2022-03-23 16:54:04 -07:00
Maytas Monsereenusorn	dbb9518f50	Fix auto compaction by adjusting compaction task's interval to align with segmentGranularity when segmentGranularity is set (#12334 ) * add impl * add ITs * address comments * address comments * address comments * fix failure * fix checkstyle * fix checkstyle	2022-03-18 12:46:16 -07:00
Jihoon Son	5e23674fe5	Fix a race condition in the '/tasks' Overlord API (#12330 ) * finds complete and active tasks from the same snapshot * overlord resource * unit test * integration test * javadoc and cleanup * more cleanup * fix test and add more	2022-03-17 10:47:45 +09:00
AmatyaAvadhanula	7bf1d8c5c0	Facilitate lazy initialization of connections to mitigate overwhelming of Coordinator (#12298 ) Add config for eager / lazy connection initialization in ResourcePool Description Currently, when multiple tasks are launched, each of them eagerly initializes a full pool's worth of connections to the coordinator. While this is acceptable when the parameter for number of eagerConnections (== maxSize) is small, this can be problematic in environments where it's a large value (say 1000) and multiple tasks are launched simultaneously, which can cause a large number of connections to be created to the coordinator, thereby overwhelming it. Patch Nodes like the broker may require eager initialization of resources and do not create connections with the Coordinator. It is unnecessary to do this with other types of nodes. A config parameter eagerInitialization is added, which when set to true, initializes the max permissible connections when ResourcePool is initialized. If set to false, lazy initialization of connection resources takes place. NOTE: All nodes except the broker have this new parameter set to false in the quickstart as part of this PR Algorithm The current implementation relies on the creation of maxSize resources eagerly. The new implementation's behaviour is as follows: If a resource has been previously created and is available, lend it. Else if the number of created resources is less than the allowed parameter, create and lend it. Else, wait for one of the lent resources to be returned.	2022-03-09 23:17:43 +05:30
Agustin Gonzalez	abe76ccb90	Batch ingestion replace (#12137 ) * Tombstone support for replace functionality * A used segment interval is the interval of a current used segment that overlaps any of the input intervals for the spec * Update compaction test to match replace behavior * Adapt ITAutoCompactionTest to work with tombstones rather than dropping segments. Add support for tombstones in the broker. * Style plus simple queriableindex test * Add segment cache loader tombstone test * Add more tests * Add a method to the LogicalSegment to test whether it has any data * Test filter with some empty logical segments * Refactor more compaction/dropexisting tests * Code coverage * Support for all empty segments * Skip tombstones when looking-up broker's timeline. Discard changes made to tool chest to avoid empty segments since they will no longer have empty segments after lookup because we are skipping over them. * Fix null ptr when segment does not have a queriable index * Add support for empty replace interval (all input data has been filtered out) * Fixed coverage & style * Find tombstone versions from lock versions * Test failures & style * Interner was making this fail since the two segments were consider equal due to their id's being equal * Cleanup tombstone version code * Force timeChunkLock whenever replace (i.e. dropExisting=true) is being used * Reject replace spec when input intervals are empty * Documentation * Style and unit test * Restore test code deleted by mistake * Allocate forces TIME_CHUNK locking and uses lock versions. TombstoneShardSpec added. * Unused imports. Dead code. Test coverage. * Coverage. * Prevent killer from throwing an exception for tombstones. This is the killer used in the peon for killing segments. * Fix OmniKiller + more test coverage. * Tombstones are now marked using a shard spec * Drop a segment factory.json in the segment cache for tombstones * Style * Style + coverage * style * Add TombstoneLoadSpec.class to mapper in test * Update core/src/main/java/org/apache/druid/segment/loading/TombstoneLoadSpec.java Typo Co-authored-by: Jonathan Wei <jon-wei@users.noreply.github.com> * Update docs/configuration/index.md Missing Co-authored-by: Jonathan Wei <jon-wei@users.noreply.github.com> * Typo * Integrated replace with an existing test since the replace part was redundant and more importantly, the test file was very close or exceeding the 10 min default "no output" CI Travis threshold. * Range does not work with multi-dim Co-authored-by: Jonathan Wei <jon-wei@users.noreply.github.com>	2022-03-08 20:07:02 -07:00
Gian Merlino	28f8bcce9b	Always reopen stream in FileUtils.copyLarge, RetryingInputStream. (#12307 ) * Always reopen stream in FileUtils.copyLarge, RetryingInputStream. When an InputStream throws an exception from one of its read methods, we should assume it's bad and reopen it. The main changes here are: - In FileUtils.copyLarge, replace InputStream with InputStreamSupplier. - In RetryingInputStream, collapse retryCondition and resetCondition into a single condition. Also, make it required, since every usage is passing in a specific condition anyway. * Test fixes. * Fix read impl.	2022-03-05 14:39:14 -08:00
Sandeep	61e1ffc7f7	add a new query laning metrics to visualize lane assignment (#12111 ) * add a new query laning metrics to visualize lane assignment * fixes :spotbugs check * Update docs/operations/metrics.md Co-authored-by: Benedict Jin <asdf2014@apache.org> * Update server/src/main/java/org/apache/druid/server/QueryScheduler.java Co-authored-by: Benedict Jin <asdf2014@apache.org> * Update server/src/main/java/org/apache/druid/server/QueryScheduler.java Co-authored-by: Benedict Jin <asdf2014@apache.org> Co-authored-by: Benedict Jin <asdf2014@apache.org>	2022-03-04 15:21:17 +08:00
Laksh Singla	3f709db173	Make ParseExceptions more informative (#12259 ) This PR aims to make the ParseExceptions in Druid more informative, by adding additional information (metadata) to the ParseException, which can contain additional information about the exception. For example - the path of the file generating the issue, the line number (where it can be easily fetched - like CsvReader) Following changes are addressed in this PR: A new class CloseableIteratorWithMetadata has been created which is like CloseableIterator but also has a metadata method that returns a context Map<String, Object> about the current element returned by next(). IntermediateRowParsingReader#read() now attaches the InputEntity and the "record number" which created the exception (while parsing them), and IntermediateRowParsingReader#sample attaches the InputEntity (but not the "record number"). TextReader (and its subclasses), which is a specific implementation of the IntermediateRowParsingReader also include the line number which caused the generation of the error. This will also help in triaging the issues when InputSourceReader generates ParseException because it can point to the specific InputEntity which caused the exception (while trying to read it).	2022-02-28 22:31:15 +05:30
Xavier Léauté	d105519558	Replace use of PowerMock with Mockito (#12282 ) Mockito now supports all our needs and plays much better with recent Java versions. Migrating to Mockito also simplifies running the kind of tests that required PowerMock in the past. * replace all uses of powermock with mockito-inline * upgrade mockito to 4.3.1 and fix use of deprecated methods * import mockito bom to align all our mockito dependencies * add powermock to forbidden-apis to avoid accidentally reintroducing it in the future	2022-02-27 22:47:09 -08:00
Xavier Léauté	1434197ee1	update airline dependency to 2.x (#12270 ) * upgrade Airline to Airline 2 https://github.com/airlift/airline is no longer maintained, updating to https://github.com/rvesse/airline (Airline 2) to use an actively maintained version, while minimizing breaking changes. Note, this is a backwards incompatible change, and extensions relying on the CliCommandCreator extension point will also need to be updated. * fix dependency checks where jakarta.inject is now resolved first instead of javax.inject, due to Airline 2 using jakarta	2022-02-27 15:19:28 -08:00
Jihoon Son	e5ad862665	A new includeAllDimension flag for dimensionsSpec (#12276 ) * includeAllDimensions in dimensionsSpec * doc * address comments * unused import and doc spelling	2022-02-25 18:27:48 -08:00
Maytas Monsereenusorn	6e2eded277	Allow coordinator run auto compaction duty period to be configured separately from other indexing duties (#12263 ) * add impl * add impl * add unit tests * add impl * add impl * add serde test * add tests * add docs * fix test * fix test * fix docs * fix docs * fix spelling	2022-02-18 23:02:57 -08:00
tejaswini-imply	70c40c4281	Fix long overflow in SegmentCostCache.Bucket.toLocalInterval (#12257 ) Problem: When using a `CachingCostBalancerStrategy` with segments of granularity ALL, no segment gets loaded. - With granularity ALL, segments of eternity interval are created which have `start = Long.MIN_VALUE / 2` and `end = Long.MAX_VALUE / 2`. - For cost calculation in the balancer strategy, `toLocalInterval()` method is invoked where `Long.MIN_VALUE / 2` or `Long.MAX_VALUE / 2` cause an overflow thus resulting in no overlap. - The strategy is unable to find any eligible server for loading a given segment. Fix: - Reverse order of operations to divide by `MILLIS_FACTOR` (~10^8) first, then do the subtraction to prevent Long overflow.	2022-02-17 15:13:51 +05:30
Jihoon Son	ab3d994a17	Lazy instantiation for segmentKillers, segmentMovers, and segmentArchivers (#12207 ) * working * Lazily load segmentKillers, segmentMovers, and segmentArchivers * more tests * test-jar plugin * more coverage * lazy client * clean up changes * checkstyle * i did not change the branch condition * adjust failure rate to run tests faster * javadocs * checkstyle	2022-02-08 13:02:06 -08:00
Suneet Saldanha	ced1389d4c	Enable auto kill segments by default (#12187 ) * Enable auto-kill by default * tests * wip * test * fix IT * fix it * remove from docs * make coverage bot happy	2022-02-07 06:57:54 -08:00
Maytas Monsereenusorn	2b8e7fc0b4	Add a flag to allow auto compaction task slot ratio to consider auto scaler slots (#12228 ) * add impl * fix checkstyle * add unit tests * checkstyle * add IT * fix IT * add comments * fix checkstyle	2022-02-06 20:46:05 -08:00
Suneet Saldanha	159f97dcb0	Update docs for druid.processing.numThreads in brokers (#12231 ) * Update docs for druid.processing.numThreads * error msg * one more reference	2022-02-04 17:34:21 -08:00
Clint Wylie	8fd587b28c	remove duplicate Broker ServerInventoryView, improve HttpServerInventoryView logging (#12209 ) * changes: * remove SystemSchema duplicate ServerInventoryView in broker * suppress duplicate segment added/removed warnings in HttpServerInventoryView when doing a full sync * fixes	2022-02-03 12:57:34 -08:00
Kashif Faraz	e648b01afb	Improve memory estimates in Aggregator and DimensionIndexer (#12073 ) Fixes #12022 ### Description The current implementations of memory estimation in `OnHeapIncrementalIndex` and `StringDimensionIndexer` tend to over-estimate which leads to more persistence cycles than necessary. This PR replaces the max estimation mechanism with getting the incremental memory used by the aggregator or indexer at each invocation of `aggregate` or `encode` respectively. ### Changes - Add new flag `useMaxMemoryEstimates` in the task context. This overrides the same flag in DefaultTaskConfig i.e. `druid.indexer.task.default.context` map - Add method `AggregatorFactory.factorizeWithSize()` that returns an `AggregatorAndSize` which contains the aggregator instance and the estimated initial size of the aggregator - Add method `Aggregator.aggregateWithSize()` which returns the incremental memory used by this aggregation step - Update the method `DimensionIndexer.processRowValsToKeyComponent()` to return the encoded key component as well as its effective size in bytes - Update `OnHeapIncrementalIndex` to use the new estimations only if `useMaxMemoryEstimates = false`	2022-02-03 10:34:02 +05:30
Rohan Garg	c4fa3ccfc4	Fix load-drop-load sequence for same segment and historical in http loadqueue peon (#11717 ) Fixes an issue where a load-drop-load sequence for a segment and historical doesn't work correctly for the HTTP-based load queue peon. The first load-drop cycle works fine; the problem comes when there is an attempt to reload the segment. The historical caches load success for some recent segments and makes the reload a no-op, but it doesn't consider the fact that the segment was also dropped in between the load requests. This change invalidates the cache after a client tries to fetch a success result.	2022-01-31 13:16:58 +05:30
Clint Wylie	5d2291991e	use reflection to check for mysql transient exception type (#12205 ) * use reflection to check for mysql transient exception type * better * oops	2022-01-27 13:13:16 -08:00
zachjsh	f906f2f577	Fix HttpRemoteTaskRunner LifecycleStart / LifecycleStop race condition (#12184 ) * * stop workers, remove listener, and call exitStop() on HttpRemoteTaskRunner @LifecycleStop * * fix test failure	2022-01-27 13:15:14 -06:00
TSFenwick	a813816fb1	add module test for QueryableModule to allow for better runtime.properties testing (#12202 ) added a default GetRequestLoggerProviderTest and GetEmitterRequestLoggerProviderTest	2022-01-25 22:26:11 -08:00
Karan Kumar	96b3498a40	Grouping on arrays as arrays (#12078 ) * init multiValue column group by * Changing sorting to Lexicographic as default * Adding initial tests * 1.Fixing test cases adding 2.Optimized inmem structs * Linking SQL layer to native layer * Adding multiDimension support to group by column strategy * 1. Removing array coercion in Calcite layer 2. Removing ResultRowDeserializer * 1. Supporting all primitive array types 2. Removing dimension spec as part of columnSelector * 1. Supporting all primitive array types 2. Removing dimension spec as part of columnSelector * 1. Checkstyle things 2. Removing flag * Minor naming things * CheckStyle Things * Fixing test case * Fixing hashing * 1. Adding the MV function 2. Added few test cases * 1. Adding MV function test cases * Adding Selector strategy function test cases * Fixing ClientQuerySegmentWalkerTest * Adding GroupByQueryRunnerTest test cases * Fixing test cases * Adding few more test cases * Fixing Exception asset statement and intellij inspection * Adding null compatibility tests * Review comments * Fixing few failing tests * Fixing few failing tests * Do no convert to topN Q incase of group by on array * Fixing checkstyle * Fixing differences between jdk's class cast exception message * 1. Fixing ordering if the grouping key is an array * Fixing DefaultLimitSpec * Fixing CalciteArraysQueryTest * Dummy commit for LGTM * changes: * only coerce multi-value string null values when `ExpressionPlan.Trait.NEEDS_APPLIED` is set * correct return type inference for ARRAY_APPEND,ARRAY_PREPEND,ARRAY_SLICE,ARRAY_CONCAT * fix bug with ExprEval.ofType when actual type of object from binding doesn't match its claimed type * Review comments * Fixing test cases * Fixing spot bugs * Fixing strict compile Co-authored-by: Clint Wylie <cwylie@apache.org>	2022-01-25 20:30:56 -08:00
Suneet Saldanha	2b32d86f3b	Enable automatic metadata cleanup by default (#12188 )	2022-01-24 20:04:17 -08:00
Jihoon Son	cc2ffc6c0f	Fix node discovery to ignore unknown DruidServices (#12157 ) * Fix node discovery to ignore unknown DruidServices * ignore all runtime exceptions * fix test * add custom deserializer * custom serializer * log host for unparseable druidService	2022-01-18 22:08:59 -08:00
Maytas Monsereenusorn	bd7fe45da0	Support adding metrics in Auto Compaction (#12125 ) * add impl * add impl * add unit tests * add unit tests * add unit tests * add unit tests * add unit tests * add integration tests * add integration tests * fix LGTM * fix test * remove doc	2022-01-17 20:19:31 -08:00
Marcelo R Costa	c28b2834a1	Add http response status code to org.eclipse.jetty.server.RequestLog (#12116 ) * Add http response status code to org.eclipse.jetty.server.RequestLog * http response code is expressed as an int. Set log msg interpolation based on digit * trying to add an unit test to verify if the logger.debug method is called * trying to add an unit test to verify if the logger.debug method is called * fix compilation issues * remove test	2022-01-06 20:10:01 +08:00
Maytas Monsereenusorn	b53e7f4d12	Support overlapping segment intervals in auto compaction (#12062 ) * add impl * add impl * fix more bugs * add tests * fix checkstyle * address comments * address comments * fix test	2022-01-04 11:47:38 -08:00
somu-imply	c267b65f97	Removing unused processing threadpool on broker (#12070 ) * Thread pool for broker * Updating two tests to improve coverage for new method added * Updating druidProcessingConfigTest to cover coverage * Adding missed spelling errors caused in doc * Adding test to cover lines of new function added	2021-12-21 13:07:53 -08:00
lokesh-lingarajan	60a3a802b6	Modifying index from druid_segments(datasource, used, end) to druid_segments(datasource, used, end, start) to support kill task (#11894 ) This index helps in faster query results during kill task's query on interval based unused segment listing. This can become a bottleneck in some production loads causing coordinator to wait longer for metadata db replies and impacting Kafka ingestion. The modified index has helped reduce the query times for such queries.	2021-12-16 10:28:20 -08:00
Jonathan Wei	229f82a6f0	Add parse error list API for stream supervisors, use structured object for parse exceptions, simplify parse exception message (#11961 ) * Add parse error list API for stream supervisors, simplify parse exception message * Add input string to parse exception * Use structured ParseExceptionReport * Fix tests * Add test * PR comments, add ParseExceptionReport equals verifier * Fix test	2021-12-09 15:42:55 -06:00
Lucas Capistrant	150902b95c	clean up the balancing code around the batched vs deprecated way of sampling segments to balance (#11960 ) * clean up the balancing code around the batched vs deprecated way of sampling segments to balance * fix docs, clarify comments, add deprecated annotations to legacy code * remove unused variable * update dynamic config dialog in console to state percentOfSegmentsToConsiderPerMove deprecated * fix dynamic config text for percentOfSegmentsToConsiderPerMove * run prettier to cleanup coordinator-dynamic-config.tsx changes * update jest snapshot * update documentation per review feedback	2021-12-07 14:47:46 -08:00
Clint Wylie	a8815f671e	Fix druid client timeout zero (#12023 ) * fix bug where queries fail immediately when timeout is 0 instead of using default timeout * fix to use serverside max * more better * less flaky test * oops	2021-12-07 12:41:01 -08:00
zachjsh	65cadbe42a	Fix bad lookup config fails task (#12021 ) This PR fixes an issue in which a lookup that is configured incorrectly, and so does not serialize properly when pulled by a peon node, causes the task to fail. The failure occurs because the peon and other leaf nodes (broker, historical) have retry logic that keeps retrying the lookup loading for 3 minutes by default. The HTTP listener thread on the peon task is not started until lookup loading completes. By default, the overlord waits 1 minute to communicate with the peon task to get the task status, after which it orders the task to shut down, causing the ingestion task to fail. To fix the issue, we catch the serialization exception and do not retry. Also fixed an issue in which a bad lookup config prevented other good lookup configs from being loaded.	2021-12-07 00:55:34 -05:00
Abhishek Agarwal	834aae096a	Human-readable and actionable SQL error messages (#11911 ) This PR does two things 1. It adds the capability to surface missing features in SQL to users - The calcite planner will explore through multiple rules to convert a logical SQL query to a druid native query. Some rules change the shape of the query itself or optimize it, and some rules are responsible for translating the query into a druid native query. These are DruidQueryRule, DruidOuterQueryRule, DruidJoinRule, DruidUnionDataSourceRule, DruidUnionRule etc. These rules will look at SQL and will do the necessary transformation. But if the rule can't transform the query, it returns control to the calcite planner without recording why it was not able to transform the query. E.g. there is a join query with a non-equal join condition. DruidJoinRule will look at the condition, see that it is not supported, and return control. The reason can be that a query can be planned in many different ways so if one rule can't parse it, the query may still be parseable by other rules. In this PR, we are intercepting these gaps and passing them back to the user if the query could not be planned at all. 2. The said capability has been used to generate actionable errors for some common unsupported SQL features. However, not all possible errors are covered and we can keep adding more in the future.	2021-12-07 09:44:08 +05:30
Paul Rogers	34a3d45737	Refactor ResponseContext (#11828 ) * Refactor ResponseContext Fixes a number of issues in preparation for request trailers and the query profile. * Converts keys from an enum to classes for smaller code * Wraps stored values in functions for easier capture for other uses * Reworks the "header squeezer" to handle types other than arrays. * Uses metadata for visibility, and ability to compress, to replace ad-hoc code. * Cleans up JSON serialization for the response context. * Other miscellaneous cleanup. * Handle unknown keys in deserialization Also, make "Visibility" into a boolean. * Revised comment * Renamed variable	2021-12-06 17:03:12 -08:00
Karan Kumar	2539b7a748	Adding ToString() to ExceptionEvent (#12027 ) For readable output for exception events, while generating the report in SeekableStreamSupervisor	2021-12-06 13:37:16 +05:30
Jihoon Son	1f052b43c5	Better serverView exec name; remove SingleServerInventoryView (#11770 ) Druid currently has 2 serverViews, regular serverView and filtered serverView. The regular serverView is used to monitor all segment announcements from all data nodes (historicals, tasks, indexers). The filtered serverView is used when you want to watch segment announcements from particular tiers. Since these server views keep track of different sets of druidServers and segments in memory, they should be maintained separately. However, they currently share the same name for their executorService, which can cause confusion and make debugging harder especially in the broker since it is using both serverViews, the filtered view for normal query processing and the regular view to serve the servers table (I'm unsure whether this is intended or whether this is a good behavior). This PR changes it to a more obvious name. This PR also removes SingleServerInventoryView. This view was deprecated a long time ago and has not been documented at least since 0.13 (#6127). I also don't think this can be better in any case than BatchServerInventoryView. Finally, I merged AbstractCuratorServerInventoryView and BatchServerInventoryView as we no longer need AbstractCuratorServerInventoryView after SingleServerInventoryView is removed.	2021-12-04 18:43:05 +05:30
Jihoon Son	fc9513b6cd	Make NodeRole available during binding; add support for dynamic registration of DruidService (#12012 ) * Make nodeRole available during binding; add support for dynamic registration of DruidService * fix checkstyle and test * fix customRole test * address comments * add more javadoc	2021-12-03 11:59:00 -08:00
Gian Merlino	e0e05aad99	Enhancements to IndexTaskClient. (#12011 ) * Enhancements to IndexTaskClient. 1) Ability to use handlers other than StringFullResponseHandler. This functionality is not used in production code yet, but is useful because it will allow tasks to communicate with each other in non-string-based formats and in streaming fashion. In the future, we'll be able to use this to make task-to-task communication more efficient. 2) Truncate server errors at 1KB, so long errors do not pollute logs. 3) Change error log level for retryable errors from WARN to INFO. (The final error is still WARN.) 4) Harmonize log and exception messages to have a more consistent format. * Additional tests and improvements.	2021-12-03 09:14:32 -08:00
Paul Rogers	a66f10eea1	Code cleanup from query profile project (#11822 ) * Code cleanup from query profile project * Fix spelling errors * Fix Javadoc formatting * Abstract out repeated test code * Reuse constants in place of some string literals * Fix up some parameterized types * Reduce warnings reported by Eclipse * Reverted change due to lack of tests	2021-11-30 11:35:38 -08:00
Gian Merlino	f6e6ca2893	Use intermediate-persist IndexSpec during multiphase merge. (#11940 ) * Use intermediate-persist IndexSpec during multiphase merge. The main change is the addition of an intermediate-persist IndexSpec to the main "merge" method in IndexMerger. There are also a few minor adjustments to the IndexMerger interface to encourage more harmonious usage of its methods in the future. * Additional changes inspired by the test coverage checker. - Remove unused-in-production IndexMerger methods "append" and "convert". - Add additional unit tests to UnifiedIndexerAppenderatorsManager. * Additional adjustments. * Even more additional adjustments. * Test fixes.	2021-11-29 15:08:49 -08:00
Sandeep	9bc18a93a2	warn when segment cannot be loaded by Historical nodes (#11849 )	2021-11-26 17:27:17 +08:00
Gian Merlino	3d72e66f56	Consolidate a bunch of ad-hoc segments metadata SQL; fix some bugs. (#11582 ) * Consolidate a bunch of ad-hoc segments metadata SQL; fix some bugs. This patch gathers together a variety of SQL from SqlSegmentsMetadataManager and IndexerSQLMetadataStorageCoordinator into a new class SqlSegmentsMetadataQuery. It focuses on SQL related to retrieving segment payloads and marking segments used and unused. In addition to cleaning up the code a bit, this patch also fixes a bug with years before 0 or after 9999. The prior SQL did not work properly because dates outside this range cannot be compared as strings. The new code does work for these far-past and far-future years. So, if you're ever interested in using Druid to analyze things from ancient Babylon, you better apply this patch first! * Fix test compiling. * Fixes and improvements. * Fix forbidden API. * Additional fixes.	2021-11-24 14:51:53 -08:00
Maytas Monsereenusorn	bb3d2a433a	Support filtering data in Auto Compaction (#11922 ) * add impl * fix checkstyle * add test * add test * add unit tests * fix unit tests * fix unit tests * fix unit tests * add IT * add IT * add comments * fix spelling	2021-11-24 10:56:38 -08:00
Agustin Gonzalez	311d9a2370	Log correct hydrant count (#11976 )	2021-11-23 08:22:17 -08:00
Gian Merlino	b13f07a057	Harmonize local input sources; fix batch index integration test. (#11965 ) * Make LocalInputSource.files a List instead of Set and adjust wikipedia_index_task to use file list. Rationale: the behavior of wikipedia_index_task.json is order-dependent with regard to its input files; some orders produce 4 segments and some produce 5 segments. Some integration tests, like ITSystemTableBatchIndexTaskTest and ITAutoCompactionTest, are written assuming that the 4-segment case will always happen. Providing the file list in a specific order ensures that this will happen as expected by the tests. I didn't see a specific reason why the LocalInputSource.files parameter needed to be a Set, so changing it to a List was the simplest way to achieve the consistent ordering. I think it will also make the behavior make more sense if someone does specify the same input file multiple times in a spec: I think they'd expect it to be loaded multiple times instead of deduped. This is consistent with the behavior of other input sources like S3, GCS, HTTP. * Sort files in LocalFirehoseFactory.	2021-11-21 22:26:31 -08:00
Nikhil Navadiya	3c51136098	Add worker category dimension (#11554 ) * Add worker category as dimension in TaskSlotCountStatsMonitor * Change description * Add workerConfig as field * Modify HttpRemoteTaskRunnerTest to test worker category in taskslot metrics * Fixing tests * Fixing alerts * Adding unit test in SingleTaskBackgroundRunnerTest for task slot metrics APIs * Resolving false positive spell check * addressing comments * throw UnsupportedOperationException for tasklotmetrics APIs in SingleTaskBackgroundRunner Co-authored-by: Nikhil Navadiya <nnavadiya@twitter.com>	2021-11-18 22:59:07 -08:00
Agustin Gonzalez	a4353aa1f4	Fix bug Unrecognized token 'No': was expecting (JSON String,...) when… (#11934 ) * Fix bug Unrecognized token 'No': was expecting (JSON String,...) when calling the API /druid/indexer/v1/task/taskId/reports and the report is not found * Also log other non-OK statuses	2021-11-18 10:29:28 -07:00
Gian Merlino	a04f99a950	Indexer: Demote WARN to DEBUG for tasks that don't register Appenderators. (#11939 )	2021-11-18 07:54:43 -08:00
TSFenwick	1487f558b1	Use a simple class to sanitize JDBC exceptions and also log them (#11843 ) * Use a simple class to sanitize sanitizable errors and log them The purpose of this is to sanitize JDBC errors, but can sanitize other errors if they implement SanitizableError Interface add a class to log errors and sanitize them added a simple test that tests out that the error gets sanitized add @NonNull annotation to serverconfig's ErrorResponseTransformStrategy * return less information as part of too many connections, and instead only log specific details This is so an end user gets relevant information but not too much info since they might know how many brokers they have * return only runtime exceptions added new error types that need to be sanitized also sanitize deprecated and unsupported exceptions. * don't rewrite exceptions unless necessary for checked exceptions add docs avoid blanket turning all exceptions into runtime exceptions * address comments, to fix up docs. add more javadocs add support UOE sanitization * use try catch instead and sanitize at public methods * checkstyle fixes * throw noSuchStatement and NoSuchConnection as Avatica is affected by those * address comments. move log error back to druid meta clean up bad formatting and commented code. add missed catch for NoSuchStatementException clean up comments for error handler and add comment explaining not wanting to sanitize avatica exceptions * alter test to reflect new error message	2021-11-16 13:13:03 -08:00
Laksh Singla	57ed5127a7	Make subquery IDs more comprehensive (#11809 ) There are 3 types of query IDs - id, subQueryId, sqlQueryId. Currently, whenever a query generates subqueries, the subquery's subQueryId is populated randomly. Also, the subquery's id is not set to the parent query's id. Therefore there is no way of linking the subqueries to the parent query, and one loses the ability to look at an end-to-end view of the query. This PR aims to implement the following two things: Populate the subqueries with their parent's id (and sqlQueryId if present). Populate the subQueryId such that the subqueries form a hierarchical relationship among themselves. For example, if there is a query which launches a subquery, which in turn launches a couple of subqueries, then the ids and subQueryIds should have following structure.	2021-11-11 16:31:56 +05:30
Gian Merlino	14b0b4aee2	RowBasedSegment: Use Sequence instead of Iterable. (#11886 ) * RowBasedSegment: Use Sequence instead of Iterable. The main reason this is good is that Sequences can include baggage that must be closed after iteration is finished. This enables creating RowBasedSegments on top of closeable sequences of rows. To preserve the optimization that allows reversing a List without copying it, this patch also makes SimpleSequence its own class and allows extracting the Iterable that was used to create it. * Fix tests.	2021-11-10 06:06:52 -08:00
Gian Merlino	6c196a5ea2	Remove StorageAdapter.getColumnTypeName. (#11893 ) * Remove StorageAdapter.getColumnTypeName. It was only used by SegmentAnalyzer, and isn't necessary anymore due to the recent improvements to ColumnCapabilities. Also: tidy ColumnDescriptor.read slightly by removing an instanceof check, and moving the relevant logic into ComplexColumnPartSerde. * Fix spellings.	2021-11-09 15:18:07 -08:00
Gian Merlino	babf00f8e3	Migrate File.mkdirs to FileUtils.mkdirp. (#11879 ) * Migrate File.mkdirs to FileUtils.mkdirp. * Remove unused imports. * Fix LookupReferencesManager. * Simplify. * Also migrate usages of forceMkdir. * Fix var name. * Fix incorrect call. * Update test.	2021-11-09 11:10:49 -08:00
Maytas Monsereenusorn	ddc68c6a81	Support changing dimension schema in Auto Compaction (#11874 ) * add impl * add unit tests * fix checkstyle * add impl * add impl * add impl * add impl * add impl * add impl * fix test * add IT * add IT * fix docs * add test * address comments * fix conflict	2021-11-08 21:17:08 -08:00
Clint Wylie	7237dc837c	complex typed expressions (#11853 ) * complex typed expressions * add built-in hll collector expressions to get coverage on druid-processing, more types, more better * rampage!!! * more javadoc * adjustments * oops * lol * remove unused dependency * contradiction? * more test	2021-11-08 00:33:06 -08:00
Jian Wang	8e7e679984	Add more metrics for Jetty server thread pool usage (#11113 ) Add more metrics for jetty server thread pool usage so we know if we have allocated enough http threads to handle requests.	2021-11-07 16:51:44 +05:30
Kashif Faraz	2d77e1a3c6	Add support for multi dimension range partitioning (#11848 ) This PR adds support for range partitioning on multiple dimensions. It extends on the concept and implementation of single dimension range partitioning. The new partition type added is range which corresponds to a set of Dimension Range Partition classes. single_dim is now treated as a range type partition with a single partition dimension. The start and end values of a DimensionRangeShardSpec are represented by StringTuples, where each String in the tuple is the value of a partition dimension.	2021-11-06 12:50:17 +05:30
Gian Merlino	8971056763	Properly count segment references in tests. (#11870 )	2021-11-05 12:49:10 -07:00
Kashif Faraz	a22687ecbe	Add Broker config `druid.broker.segment.watchRealtimeNodes` (#11732 ) The new config is an extension of the concept of "watchedTiers" where the Broker can choose to add the info of only the specified tiers to its timeline. Similarly, with this config, Broker can choose to skip the realtime nodes and thus it would query only Historical processes for any given segment.	2021-11-02 12:38:42 +05:30
Maytas Monsereenusorn	ba2874ee1f	Support changing query granularity in Auto Compaction (#11856 ) * add queryGranularity * fix checkstyle * fix test	2021-11-01 15:18:44 -07:00
Maytas Monsereenusorn	33d9d9bd74	Add rollup config to auto and manual compaction (#11850 ) * add rollup to auto and manual compaction * add unit tests * add unit tests * add IT * fix checkstyle	2021-10-29 10:22:25 -07:00
Lucas Capistrant	43383c73a8	refactor BalanceSegments#balanceServers to exit early if there is no work to be done (#11768 ) * remove useless call to balanceServers for move from decom servers when there are no decom servers * refactor approach to this PR but accomplish the same thing	2021-10-25 10:06:35 -05:00
Gian Merlino	98ecbb21cd	Remove CloseQuietly and migrate its usages to other methods. (#10247 ) * Remove CloseQuietly and migrate its usages to other methods. These other methods include: 1) New method CloseableUtils.closeAndWrapExceptions, which wraps IOExceptions in RuntimeExceptions for callers that just want to avoid dealing with checked exceptions. Most usages were migrated to this method, because it looks like they were mainly attempts to avoid declaring a throws clause, and perhaps were unintentionally suppressing IOExceptions. 2) New method CloseableUtils.closeInCatch, designed to properly close something in a catch block without losing exceptions. Some usages from catch blocks were migrated here, when it seemed that they were intended to avoid checked exception handling, and did not really intend to also suppress IOExceptions. 3) New method CloseableUtils.closeAndSuppressExceptions, which sends all exceptions to a "chomper" that consumes them. Nothing is thrown or returned. The behavior is slightly different: with this method, _all_ exceptions are suppressed, not just IOExceptions. Calls that seemed like they had good reason to suppress exceptions were migrated here. 4) Some calls were migrated to try-with-resources, in cases where it appeared that CloseQuietly was being used to avoid throwing an exception in a finally block. 🎵 You don't have to go home, but you can't stay here... 🎵 * Remove unused import. * Fix up various issues. * Adjustments to tests. * Fix null handling. * Additional test. * Adjustments from review. * Fixup style stuff. * Fix NPE caused by holder starting out null. * Fix spelling. * Chomp Throwables too.	2021-10-23 17:03:21 -07:00
Clint Wylie	187df58e30	better types (#11713 ) * better type system * needle in a haystack * ColumnCapabilities is a TypeSignature instead of having one, INFORMATION_SCHEMA support * fixup merge * more test * fixup * intern * fix * oops * oops again * ... * more test coverage * fix error message * adjust interning, more javadocs * oops * more docs more better	2021-10-19 01:47:25 -07:00
David Bar	7d4841471f	Optimize supervisor history retrieval for specific id (#11807 ) Optimization. Fetch from the metadata store only the relevant history items for the requested supervisor id.	2021-10-19 14:08:25 +05:30
TSFenwick	9c15f938fd	fix test issue where JettyTest would fail if JettyWithResponseFilterEnabledTest ran before it (#11803 ) this change ensures that JettyTest is setting the properties it needs in case some other test overwrites them this also changes up the ordering of the call for setProperties to call super's first in case super is setting the same property	2021-10-18 12:42:41 -07:00
Lucas Capistrant	1930ad1f47	Implement configurable internally generated query context (#11429 ) * Add the ability to add a context to internally generated druid broker queries * fix docs * changes after first CI failure * cleanup after merge with master * change default to empty map and improve unit tests * add doc info and fix checkstyle * refactor DruidSchema#runSegmentMetadataQuery and add a unit test	2021-10-06 09:02:41 -07:00
Kashif Faraz	b688db790b	Add Broker config `druid.broker.segment.ignoredTiers` (#11766 ) The new config is an extension of the concept of "watchedTiers" where the Broker can choose to add the info of only the specified tiers to its timeline. Similarly, with this config, Broker can choose to ignore the segments being served by the specified historical tiers. By default, no tier is ignored. This config is useful when you want a completely isolated tier amongst many other tiers. Say there are several tiers of historicals Tier T1, Tier T2 ... Tier Tn and there are several brokers Broker B1, Broker B2 .... Broker Bm If we want only Broker B1 to query Tier T1, instead of setting a long list of watchedTiers on each of the other Brokers B2 ... Bm, we could just set druid.broker.segment.ignoredTiers=["T1"] for these Brokers, while Broker B1 could have druid.broker.segment.watchedTiers=["T1"]	2021-10-06 10:06:32 +05:30
Maytas Monsereenusorn	a04b08e45c	Add new config to filter internal Druid-related messages from Query API response (#11711 ) * add impl * add impl * add tests * add unit test * fix checkstyle * address comments * fix checkstyle * fix checkstyle * fix checkstyle * fix checkstyle * fix checkstyle * address comments * address comments * address comments * fix test * fix test * fix test * fix test * fix test * change config name * change config name * change config name * address comments * address comments * address comments * address comments * address comments * address comments * fix compile * fix compile * change config * add more tests * fix IT	2021-09-29 12:55:49 +07:00
Agustin Gonzalez	2355a60419	Avoid primary key violation in segment tables under certain conditions when appending data to same interval (#11714 ) * Fix issue of duplicate key under certain conditions when loading late data in streaming. Also fixes a documentation issue with skipSegmentLineageCheck. * maxId may be null at this point, need to check for that * Remove hypothetical case (it cannot happen) * Revert compaction is simply "killing" the compacted segment and previously, used, overshadowed segments are visible again * Add comments	2021-09-22 19:21:48 -05:00
Clint Wylie	5de26cf6d9	add optional system schema authorization (#11720 ) * add optional system schema authorization * remove unused * adjust docs * doc fixes, missing ldap config change for integration tests * style	2021-09-21 13:28:26 -07:00
Clint Wylie	392f0ca1b5	refactor sql authorization to get resource type from schema, resource type to be string (#11692 ) * refactor sql authorization to get resource type from schema, refactor resource type from enum to string * information schema auth filtering adjustments * refactor * minor stuff * Update SqlResourceCollectorShuttle.java	2021-09-17 09:53:25 -07:00
Jonathan Wei	22b41ddbbf	Task reports for parallel task: single phase and sequential mode (#11688 ) * Task reports for parallel task: single phase and sequential mode * Address comments * Add null check for currentSubTaskHolder	2021-09-16 13:58:11 -05:00
Frank Chen	155a0c7a5c	return underlying object instead of the Optional object (#11596 )	2021-09-08 22:30:57 -07:00
Clint Wylie	fe1d8c206a	bump version to 0.23.0-SNAPSHOT (#11670 )	2021-09-08 15:56:04 -07:00
Agustin Gonzalez	9efa6cc9c8	Make persists concurrent with adding rows in batch ingestion (#11536 ) * Make persists concurrent with ingestion * Remove semaphore but keep concurrent persists (with add) and add push in the background as well * Go back to documented default persists (zero) * Move to debug * Remove unnecessary Atomics * Comments on synchronization (or not) for sinks & sinkMetadata * Some cleanup for unit tests but they still need further work * Shutdown & wait for persists and push on close * Provide support for three existing batch appenderators using batchProcessingMode flag * Fix reference to wrong appenderator * Fix doc typos * Add BatchAppenderators class test coverage * Add log message to batchProcessingMode final value, fix typo in enum name * Another typo and minor fix to log message * LEGACY->OPEN_SEGMENTS, Edit docs * Minor update legacy->open segments log message * More code comments, mostly small adjustments to naming etc * fix spelling * Exclude BatchAppenderators from Jacoco since it is fully tested but Jacoco still refuses to ack coverage * Coverage for Appenderators & BatchAppenderators, name change of a method that was still using "legacy" rather than "openSegments" Co-authored-by: Clint Wylie <cjwylie@gmail.com>	2021-09-08 13:31:52 -07:00
Jihoon Son	82049bbf0a	Cancel API for sqls (#11643 ) * initial work * reduce lock in sqlLifecycle * Integration test for sql canceling * javadoc, cleanup, more tests * log level to debug * fix test * checkstyle * fix flaky test; address comments * rowTransformer * cancelled state * use lock * explode instead of noop * oops * unused import * less aggressive with state * fix calcite charset * don't emit metrics when you are not authorized	2021-09-05 10:57:45 -07:00
Agustin Gonzalez	2405a9f25e	Fix create segment phase of batch ingestion to take segment identifiers that have a non UTC interval… (#11635 ) * Fix create segment phase of batch ingestion to take segment identifiers with non UTC time zones * Fix comment and LGTM forbidden error	2021-08-30 23:19:07 -07:00
Caroline1000	adeae3960f	DataSchema: improve rollup WARN message (#11631 ) * improve rollup WARN message * Update server/src/main/java/org/apache/druid/segment/indexing/DataSchema.java Co-authored-by: Suneet Saldanha <suneet@apache.org> * Update server/src/main/java/org/apache/druid/segment/indexing/DataSchema.java Co-authored-by: Suneet Saldanha <suneet@apache.org> * Update server/src/main/java/org/apache/druid/segment/indexing/DataSchema.java Co-authored-by: Caroline <caroline@Caroline-Harris.attlocal.net> Co-authored-by: Suneet Saldanha <suneet@apache.org> Co-authored-by: Caroline <caroline@Caroline-Harris.local>	2021-08-30 20:22:11 -07:00
zhangyue19921010	6d14ea2d14	Dynamic auto scale Kinesis-Stream ingest tasks (#10985 ) * ready to test * revert misc.xml * document kinesis md * Update docs/development/extensions-core/kafka-ingestion.md * Update docs/development/extensions-core/kinesis-ingestion.md * Update docs/development/extensions-core/kinesis-ingestion.md * Update docs/development/extensions-core/kinesis-ingestion.md * Update docs/development/extensions-core/kinesis-ingestion.md * Update docs/development/extensions-core/kinesis-ingestion.md * Update docs/development/extensions-core/kinesis-ingestion.md * Update docs/development/extensions-core/kinesis-ingestion.md * Update docs/development/extensions-core/kinesis-ingestion.md * Update docs/development/extensions-core/kinesis-ingestion.md * Update docs/development/extensions-core/kinesis-ingestion.md * Update kafka-ingestion.md remove leading ` * Update kinesis-ingestion.md add missing ` Co-authored-by: yuezhang <yuezhang@freewheel.tv> Co-authored-by: Charles Smith <techdocsmith@gmail.com>	2021-08-30 15:44:29 -07:00
Maytas Monsereenusorn	ce4dd48bb8	Support custom coordinator duties (#11601 ) * impl * fix checkstyle * fix checkstyle * fix checkstyle * add test * add test * add test * add integration tests * add integration tests * add more docs * address comments * address comments * address comments * add test * fix checkstyle * fix test	2021-08-19 11:54:11 +07:00
Yi Yuan	bf863343f8	delete some code (#11552 ) Co-authored-by: yuanyi <yuanyi@freewheel.tv>	2021-08-16 10:40:40 -07:00
Parag Jain	c7b46671b3	option to use deep storage for storing shuffle data (#11507 ) Fixes #11297. Description Description and design in the proposal #11297 Key changed/added classes in this PR DataSegmentPusher ShuffleClient PartitionStat PartitionLocation *IntermediaryDataManager	2021-08-13 16:40:25 -04:00
Kashif Faraz	aaf0aaad8f	Enable routing of SQL queries at Router (#11566 ) This PR adds a new property druid.router.sql.enable which allows the Router to handle SQL queries when set to true. This change does not affect Avatica JDBC requests and they are still routed by hashing the Connection ID. To allow parsing of the request object as a SqlQuery (contained in module druid-sql), some classes have been moved from druid-server to druid-services with the same package name.	2021-08-13 18:44:39 +05:30
Jihoon Son	e9d964d504	Improve concurrency between DruidSchema and BrokerServerView (#11457 ) * Improve concurrency between DruidSchema and BrokerServerView * unused imports and workaround for error prone failure * count only known segments * add comments	2021-08-06 14:07:13 -07:00
Kashif Faraz	39a3db7943	Add unit test for config `druid.broker.segment.watchedTiers` (#11555 )	2021-08-07 00:12:40 +05:30
Suneet Saldanha	e423e99997	Update default maxSegmentsInNodeLoadingQueue (#11540 ) * Update default maxSegmentsInNodeLoadingQueue Update the default maxSegmentsInNodeLoadingQueue from 0 (unbounded) to 100. An unbounded maxSegmentsInNodeLoadingQueue can cause cluster instability. Since this is the default druid operators need to run into this instability and then look through the docs to see that the recommended value for a large cluster is 1000. This change makes it so the default will prevent clusters from falling over as they grow over time. * update tests * codestyle	2021-08-05 11:26:58 -07:00
Maytas Monsereenusorn	3257913737	Improve query error logging (#11519 ) * Improve query error logging * add docs * address comments * address comments	2021-08-05 22:51:09 +07:00
Maytas Monsereenusorn	4470ca6a92	Fix hostname validation not skipping with `druid.client.https.validateHostnames=false` in java 8u275 and later (#11538 ) * fix skip hostname validation in java 8u275 and later * add unit test * fix checkstyle	2021-08-05 15:42:55 +07:00
Jihoon Son	8ba7f6a48c	Fix incorrect result of exact topN on an inner join with limit (#11517 )	2021-07-31 15:55:49 -07:00
Maytas Monsereenusorn	05a7da792f	compaction/status API retains status for datasources that no longer existed causing in-memory used to grow unbounded (#11510 ) * fix compaction status api * fix checkstyle * address comment	2021-07-30 22:19:24 +07:00
Yuanli Han	b83742179a	Reduce method invocation of reservoir sampling (#11257 ) * reduce method invocation of reservoir sampling * add a dynamic parameter and add benchmark * rebase	2021-07-30 22:09:50 +08:00
Xavier Léauté	4bca7f014e	update error-prone to 2.8.0 with fix for crashing check (#11494 ) * error-prone 2.8.0 fixes https://github.com/google/error-prone/issues/2396 * fix for a few ignored return values * fix unknown args in sub-modules	2021-07-29 09:13:46 -07:00
Jonathan Wei	9b250c54aa	Allow kill task to mark segments as unused (#11501 ) * Allow kill task to mark segments as unused * Add IndexerSQLMetadataStorageCoordinator test * Update docs/ingestion/data-management.md Co-authored-by: Jihoon Son <jihoonson@apache.org> * Add warning to kill task doc Co-authored-by: Jihoon Son <jihoonson@apache.org>	2021-07-29 10:48:43 -05:00

3894 Commits