druid

Commit Graph

Author	SHA1	Message	Date
Clint Wylie	3ee66bb492	allow optimizing sql expressions and virtual columns (#12241 ) * rework sql planner expression and virtual column handling * simplify a bit * add back and deprecate old methods, more tests, fix multi-value string coercion bug and associated tests * spotbugs * fix bugs with multi-value string array expression handling * javadocs and adjust test * better * fix tests	2022-02-09 14:55:50 -08:00
Clint Wylie	f260bbed23	restore and deprecate AggregatorFactory methods (#11917 ) * add back and deprecate aggregator factory methods so i can say i told you so when i delete these later * rename to make less ambiguous, fix fill method * adjust	2021-11-19 15:59:35 -08:00
Clint Wylie	a8805ab60d	add missing json type for ListFilteredVirtualColumn (#11887 ) * add missing json type for ListFilteredVirtualColumn, and tests to try to avoid this happening again * fixes * ugly, but maybe this * oops * too many mappers	2021-11-09 17:25:12 -08:00
Clint Wylie	741b4ed516	add output type information to ExpressionPostAggregator (#11818 ) * add ColumnInspector argument to PostAggregator.getType to allow post-aggs to compute their output type based on input types * add test for test for coverage * simplify * Remove unused imports. Co-authored-by: Gian Merlino <gian@imply.io>	2021-10-22 13:52:51 -07:00
Clint Wylie	187df58e30	better types (#11713 ) * better type system * needle in a haystack * ColumnCapabilities is a TypeSignature instead of having one, INFORMATION_SCHEMA support * fixup merge * more test * fixup * intern * fix * oops * oops again * ... * more test coverage * fix error message * adjust interning, more javadocs * oops * more docs more better	2021-10-19 01:47:25 -07:00
Clint Wylie	f6662b4893	fix count and average SQL aggregators on constant virtual columns (#11208 ) * fix count and average SQL aggregators on constant virtual columns * style * even better, why are we tracking virtual columns in aggregations at all if we have a virtual column registry * oops missed a few * remove unused * this will fix it	2021-05-10 13:41:48 -07:00
Clint Wylie	691d7a1d54	SQL timeseries no longer skip empty buckets with all granularity (#11188 ) * SQL timeseries no longer skip empty buckets with all granularity * add comment, fix tests * the ol switcheroo * revert unintended change * docs and more tests * style * make checkstyle happy * docs fixes and more tests * add docs, tests for array_agg * fixes * oops * doc stuffs * fix compile, match doc style	2021-05-10 10:13:37 -07:00
Jihoon Son	25db8787b3	Fix CAST being ignored when aggregating on strings after cast (#11083 ) * Fix CAST being ignored when aggregating on strings after cast * fix checkstyle and dependency * unused import	2021-04-12 22:21:24 -07:00
Clint Wylie	ab60661008	refactor internal type system (#9638 ) * better type tracking: add typed postaggs, finalized types for agg factories * more javadoc * adjustments * transition to getTypeName to be used exclusively for complex types * remove unused fn * adjust * more better * rename getTypeName to getComplexTypeName * setup expression post agg for type inference existing * more javadocs * fixup * oops * more test * more test * more comments/javadoc * nulls * explicitly handle only numeric and complex aggregators for incremental index * checkstyle * more tests * adjust * more tests to showcase difference in behavior * timeseries longsum array	2020-08-26 10:53:44 -07:00
Clint Wylie	339876b69d	fill out missing test coverage for druid-stats, druid-momentsketch, druid-tdigestsketch postaggs (#9740 ) * postagg test coverage for druid-stats, druid-momentsketch, druid-tdigestsketch and fixes * style fixes * fix comparator for TDigestQuantilePostAggregator	2020-05-07 13:48:33 -07:00
Gian Merlino	1ef25a438f	Broker: Add ability to inline subqueries. (#9533 ) * Broker: Add ability to inline subqueries. The main changes: - ClientQuerySegmentWalker: Add ability to inline queries. - Query: Add "getSubQueryId" and "withSubQueryId" methods. - QueryMetrics: Add "subQueryId" dimension. - ServerConfig: Add new "maxSubqueryRows" parameter, which is used by ClientQuerySegmentWalker to limit how many rows can be inlined per query. - IndexedTableJoinMatcher: Allow creating keys on top of unknown types, by assuming they are strings. This is useful because not all types are known for fields in query results. - InlineDataSource: Store RowSignature rather than component parts. Add more zealous "equals" and "hashCode" methods to ease testing. - Moved QuerySegmentWalker test code from CalciteTests and SpecificSegmentsQueryWalker in druid-sql to QueryStackTests in druid-server. Use this to spin up a new ClientQuerySegmentWalkerTest. * Adjustments from CI. * Fix integration test.	2020-03-18 15:06:45 -07:00
Gian Merlino	ff59d2e78b	Move RowSignature from druid-sql to druid-processing and make use of it. (#9508 ) * Move RowSignature from druid-sql to druid-processing and make use of it. 1) Moved (most of) RowSignature from sql to processing. Left behind the SQL-specific stuff in a RowSignatures utility class. It also picked up some new convenience methods along the way. 2) There were a lot of places in the code where Map<String, ValueType> was used to associate columns with type info. These are now all replaced with RowSignature. 3) QueryToolChest's resultArrayFields method is replaced with resultArraySignature, and it now provides type info. * Fix up extensions. * Various fixes	2020-03-12 11:06:44 -07:00
Clint Wylie	b408a6d774	sql support for dynamic parameters (#6974 ) * sql support for dynamic parameters * fixup * javadocs * fixup from merge * formatting * fixes * fix it * doc fix * remove druid fallback self-join parameterized test * unused imports * ignore test for now * fix imports * fixup * fix merge * merge fixup * fix test that cannot vectorize * fixup and more better * dependency thingo * fix docs * tweaks * fix docs * spelling * unused imports after merge * review stuffs * add comment * add ignore text * review stuffs	2020-02-19 13:09:20 -08:00
Suneet Saldanha	33a97dfaae	Guicify druid sql module (#9279 ) * Guicify druid sql module Break up the SQLModule in to smaller modules and provide a binding that modules can use to register schemas with druid sql. * fix some tests * address code review * tests compile * Working tests * Add all the tests * fix up licenses and dependencies * add calcite dependency to druid-benchmarks * tests pass * rename the schemas	2020-02-04 11:33:48 -08:00
Gian Merlino	b411443d22	SQL join support for lookups. (#9294 ) * SQL join support for lookups. 1) Add LookupSchema to SQL, so lookups show up in the catalog. 2) Add join-related rels and rules to SQL, allowing joins to be planned into native Druid queries. * Add two missing LookupSchema calls in tests. * Fix tests. * Fix typo.	2020-01-31 23:51:16 -08:00
Samarth Jain	9ec9619143	Handle null values for metrics in TDigest aggregators. (#9073 ) Add support for rollup during ingestion.	2019-12-23 17:49:06 -08:00
Clint Wylie	3fcaa1a61b	fix sql compatible null handling config work with runtime.properties (#8876 ) * fix sql compatible null handling config work with runtime.properties * fix npe * fix tests * add friendly error * comment, and friendlier still * fix compile * fix from merges	2019-11-20 03:55:29 -08:00
Roman Leventov	5c0fc0a13a	Fix ambiguity about IndexerSQLMetadataStorageCoordinator.getUsedSegmentsForInterval() returning only non-overshadowed or all used segments (#8564 ) * IndexerSQLMetadataStorageCoordinator.getTimelineForIntervalsWithHandle() don't fetch abutting intervals; simplify getUsedSegmentsForIntervals() * Add VersionedIntervalTimeline.findNonOvershadowedObjectsInInterval() method; Propagate the decision about whether only visible segmetns or visible and overshadowed segments should be returned from IndexerMetadataStorageCoordinator's methods to the user logic; Rename SegmentListUsedAction to RetrieveUsedSegmentsAction, SegmetnListUnusedAction to RetrieveUnusedSegmentsAction, and UsedSegmentLister to UsedSegmentsRetriever * Fix tests * More fixes * Add javadoc notes about returning Collection instead of Set. Add JacksonUtils.readValue() to reduce boilerplate code * Fix KinesisIndexTaskTest, factor out common parts from KinesisIndexTaskTest and KafkaIndexTaskTest into SeekableStreamIndexTaskTestBase * More test fixes * More test fixes * Add a comment to VersionedIntervalTimelineTestBase * Fix tests * Set DataSegment.size(0) in more tests * Specify DataSegment.size(0) in more places in tests * Fix more tests * Fix DruidSchemaTest * Set DataSegment's size in more tests and benchmarks * Fix HdfsDataSegmentPusherTest * Doc changes addressing comments * Extended doc for visibility * Typo * Typo 2 * Address comment	2019-11-06 11:07:04 -08:00
Clint Wylie	1054d85171	add mechanism to control filter optimization in historical query processing (#8209 ) * add support for mechanism to control filter optimization in historical query processing * oops * adjust * woo * javadoc * review comments * fix * default * oops * oof * this will fix it * more nullable, refactor DimFilter.getRequiredColumns to use Set, formatting * extract class DimFilterToStringBuilder with common code from custom DimFilter toString implementations * adjust variable naming * missing nullable * more nullable * fix javadocs * nullable * address review comments * javadocs, precondition * nullable * rename method to be consistent * review comments * remove tuning from ColumnComparisonFilter/ColumnComparisonDimFilter	2019-08-09 16:36:18 -07:00
Samarth Jain	93cf9d4ad4	SQL support for t-digest based sketch aggregators (#8100 ) * SQL support for t-digest based sketch aggregators * Fix teamcity errors * Add missing dependencies * Remove unused dependency * Address code review comments * Add checks for compression param	2019-08-05 12:01:42 -07:00
Gian Merlino	77297f4e6f	GroupBy array-based result rows. (#8196 ) * GroupBy array-based result rows. Fixes #8118; see that proposal for details. Other than the GroupBy changes, the main other "interesting" classes are: - ResultRow: The array-based result type. - BaseQuery: T is no longer required to be Comparable. - QueryToolChest: Adds "decorateObjectMapper" to enable query-aware serialization and deserialization of result rows (necessary due to their positional nature). - QueryResource: Uses the new decoration functionality. - DirectDruidClient: Also uses the new decoration functionality. - QueryMaker (in Druid SQL): Modifications to read ResultRows. These classes weren't changed, but got some new javadocs: - BySegmentQueryRunner - FinalizeResultsQueryRunner - Query * Adjustments for TC stuff.	2019-07-31 16:15:12 -07:00
Clint Wylie	abf9843e2a	fail complex type 'serde' registration when registered type does not match expected type (#7985 ) * make ComplexMetrics.registerSerde type check on register, resolves #7959 * add test * simplify * unused imports :/ * simplify * burned by imports yet again * remove unused constructor * switch to getName * heh oops	2019-07-11 23:03:15 -07:00
Xue Yu	b9c6a26c0e	Use ComplexMetrics.registerSerde() across the codebase (#7925 ) * refactor complexmetric registerserde * fix error * feedback address	2019-06-25 11:39:04 +03:00
SandishKumarHN	01881e3a98	Use only com.google.errorprone.annotations.concurrent.GuardedBy, not javax.annotations.concurrent.GuardedBy (#7889 )	2019-06-17 15:58:51 +02:00
Clint Wylie	ffc2397bcd	fix AggregatorFactory.finalizeComputation implementations to be ok with null inputs (#7731 ) * AggregatorFactory finalizeComputation is nullable with nullable input, make implementations honor this * fixes	2019-05-22 21:13:09 -07:00
Samarth Jain	b542bb9f34	TDigest backed sketch aggregators (#7331 ) * First set of changes for tDigest histogram * Add license * Address code review comments * Add a doc page for new T-Digest sketch aggregators. Minor code cleanup and comments. * Remove synchronization from BufferAggregators. Address code review comments * Fix typo	2019-05-09 17:22:55 -07:00

26 Commits