druid

mirror of https://github.com/apache/druid.git synced 2025-03-02 15:29:10 +00:00

Author	SHA1	Message	Date
Jonathan Wei	a6105cbb86	Add numeric StringComparator (#3270 ) * Add numeric StringComparator * Only use direct long comparison for numeric ordering in BoundFilter, add time filtering benchmark query * Address PR comments, add multithreaded BoundDimFilter test * Add comment on strlen tie handling * Add timeseries interval filter benchmark * Adjust docs * Use jackson for StringComparator, address PR comments * Add new TopNMetricSpec and SearchSortSpec with tests (WIP) * More TopNMetricSpec and SearchSortSpec tests * Fix NewSearchSortSpec serde * Update docs for new DimensionTopNMetricSpec * Delete NumericDimensionTopNMetricSpec * Delete old SearchSortSpec * Rename NewSearchSortSpec to SearchSortSpec * Add TopN numeric comparator benchmark, address PR comments * Refactor OrderByColumnSpec * Add null checks to NumericComparator and String->BigDecimal conversion function * Add more OrderByColumnSpec serde tests	2016-07-29 15:44:16 -07:00
Navis Ryu	884017d981	"all" type search query spec (#3300 ) * "all" type search query spec * addressed comments * added unit test	2016-07-28 18:16:15 -07:00
Gian Merlino	2553997200	Associate groupBy v2 resources with the Sequence lifecycle. (#3296 ) This fixes a potential issue where groupBy resources could be allocated to create a Sequence, but then the Sequence is never used, and thus the resources are never freed. Also simplifies how groupBy handles config overrides (this made the new unit test easier to write).	2016-07-27 18:44:19 -07:00
Gian Merlino	9b5523add3	Reference counting, better error handling for resources in groupBy v2. (#3268 ) Refcounting prevents releasing the merge buffer, or closing the concurrent grouper, before the processing threads have all finished. The better error handling prevents an avalanche of per-runner exceptions when grouping resources are exhausted, by grouping those all up into a single merged exception.	2016-07-27 01:59:02 +05:30
Erik Dubbelboer	76fabcfdb2	Fix #2782 , Unit test failed for DruidProcessingConfigTest.testDeserialization (#3231 ) On systems with only one processor this test fails.	2016-07-25 15:51:09 -07:00
kaijianding	3dc2974894	Add timestampSpec to metadata.drd and SegmentMetadataQuery (#3227 ) * save TimestampSpec in metadata.drd * add timestampSpec info in SegmentMetadataQuery	2016-07-25 15:45:30 -07:00
Jonathan Wei	a42ccb6d19	Support filtering on long columns (including __time) (#3180 ) * Support filtering on __time column * Rename DruidPredicate * Add docs for ValueMatcherFactory, add comment on getColumnCapabilities * Combine ValueMatcherFactory predicate methods to accept DruidCompositePredicate * Address PR comments (support filter on all long columns) * Use predicate factory instead of composite predicate * Address PR comments * Lazily initialize long handling in selector/in filter * Move long value parsing from InFilter to InDimFilter, make long value parsing thread-safe * Add multithreaded selector/in filter test * Fix non-final lock object in SelectorDimFilter	2016-07-20 17:08:49 -07:00
Gian Merlino	06624c40c0	Share query handling between Appenderator and RealtimePlumber. (#3248 ) Fixes inconsistent metric handling between the two implementations. Formerly, RealtimePlumber only emitted query/segmentAndCache/time and query/wait and Appenderator only emitted query/partial/time and query/wait (all per sink). Now they both do the same thing: - query/segmentAndCache/time, query/segment/time are the time spent per sink. - query/cpu/time is the CPU time spent per query. - query/wait/time is the executor waiting time per sink. These generally match historical metrics, except segmentAndCache & segment mean the same thing here, because one Sink may be partially cached and partially uncached and we aren't splitting that out.	2016-07-19 22:15:13 -05:00
Nishant	7995818220	Increase test timeout to prevent failing on slow machines (#3224 ) The test was constantly timing out on one of the slow build machines; increasing the timeout fixed it. Running io.druid.granularity.QueryGranularityTest Tests run: 33, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 11.776 sec - in io.druid.granularity.QueryGranularityTest	2016-07-17 18:44:48 -07:00
Gian Merlino	6cd1f5375b	Better harmonized dimensions for query metrics. (#3245 ) All query metrics now start with toolChest.makeMetricBuilder, and all of those now start with DruidMetrics.makePartialQueryTimeMetric. Also, "id" moved to common code, since all query metrics added it anyway. In particular this will add query-type specific dimensions like "threshold" and "numDimensions" to servlet-originated metrics like query/time.	2016-07-14 11:55:51 -07:00
Gian Merlino	ea03906fcf	Configurable compressRunOnSerialization for Roaring bitmaps. (#3228 ) Defaults to true, which is a change in behavior (this used to be false and unconfigurable).	2016-07-08 10:24:19 +05:30
Gian Merlino	fdc7e88a7d	Allow queries with no aggregators. (#3216 ) This is actually reasonable for a groupBy or lexicographic topN that is being used to do a "COUNT DISTINCT" kind of query. No aggregators are needed for that query, and including a dummy aggregator wastes 8 bytes per row. It's kind of silly for timeseries, but why not.	2016-07-06 20:38:54 +05:30
Jonathan Wei	f3a3662133	Fix compile error in SearchBinaryFnTest (#3201 )	2016-06-29 09:44:45 -05:00
jaehong choi	efbcbf5315	Support alphanumeric sort in search query (#2593 ) * support alphanumeric sort in search query * address a comment about handling equals() and hashCode() * address comments * add Ut for string comparators * address a comment about space indentations.	2016-06-28 15:06:18 -07:00
Hyukjin Kwon	45f553fc28	Replace the deprecated usage of NoneShardSpec (#3166 )	2016-06-25 10:27:25 -07:00
Gian Merlino	4cc39b2ee7	Alternative groupBy strategy. (#2998 ) This patch introduces a GroupByStrategy concept and two strategies: "v1" is the current groupBy strategy and "v2" is a new one. It also introduces a merge buffers concept in DruidProcessingModule, to try to better manage memory used for merging. Both of these are described in more detail in #2987. There are two goals of this patch: 1. Make it possible for historical/realtime nodes to return larger groupBy result sets, faster, with better memory management. 2. Make it possible for brokers to merge streams when there are no order-by columns, avoiding materialization. This patch does not do anything to help with memory management on the broker when there are order-by columns or when there are nested queries. That could potentially be done in a future patch.	2016-06-24 18:06:09 -07:00
Dave Li	8a08398977	Add segment pruning based on secondary partition dimension (#2982 ) * add get dimension rangeset to filters * add get domain to ShardSpec and added chunk filter in caching clustered client * add null check and modified not filter, started with unit test * add filter test with caching * refactor and some comments * extract filtershard to helper function * fixup * minor changes * update javadoc	2016-06-24 14:52:19 -07:00
michaelschiff	66d8ad36d7	adds new coordinator metrics 'segment/unavailable/count' and 'segment/underReplicated/count' (#3176 ) (#3173)	2016-06-23 14:53:15 -07:00
Gian Merlino	da660bb592	DumpSegment tool. (#3182 ) Fixes #2723.	2016-06-23 14:37:50 -07:00
Gian Merlino	a437fb150b	Fix SegmentMetadataQuery when queryGranularity is requested but not present. (#3181 )	2016-06-23 14:30:50 -07:00
Jonathan Wei	24860a1391	Two-stage filtering (#3018 ) * Two-stage filtering * PR comment	2016-06-22 16:08:21 -07:00
Nishant	f46ad9a4cb	support Union Segment metadata queries (#3132 ) * support Union Segment metadata queries fix 3128 * remove extraneous sys out	2016-06-21 10:30:50 -07:00
Dave Li	12be1c0a4b	Add bucket extraction function (#3033 ) * add bucket extraction function * add doc and header * updated doc and test	2016-06-17 09:24:27 -07:00
Gian Merlino	ebf890fe79	Update master version to 0.9.2-SNAPSHOT. (#3133 )	2016-06-13 13:10:38 -07:00
Nishant	0d427923c0	fix caching for search results (#3119 ) * fix caching for search results properly read count when reading from cache. * fix NPE during merging search count and add test * Update cache key to invalidate prev results	2016-06-09 17:49:47 -07:00
Gian Merlino	5998de7d5b	Fix lenient merging of conflicting aggregators. (#3113 ) This should have marked the conflicting aggregator as null, but instead it threw an NPE for the entire query.	2016-06-08 15:56:48 -07:00
Jonathan Wei	37c8a8f186	Speed up filter tests with adapter cache (#3103 )	2016-06-08 07:41:10 -07:00
Gian Merlino	54139c6815	Fix NPE in registeredLookup extractionFn when "optimize" is not provided. (#3064 )	2016-06-03 12:58:17 -05:00
Gian Merlino	6171e078c8	Improve NPE message in LookupDimensionSpec when lookup does not exist. (#3065 ) The message used to be empty, which made things hard to debug.	2016-06-02 19:59:12 -07:00
John Wang	e662efa79f	segment interface refactor for proposal 2965 (#2990 )	2016-05-26 20:36:41 -07:00
Kurt Young	b5bd406597	fix #2991 : race condition in OnheapIncrementalIndex#addToFacts (#3002 ) * fix #2991: race condition in OnheapIncrementalIndex#addToFacts * add missing header * handle parseExceptions when first doing first agg	2016-05-25 19:05:46 -07:00
Jonathan Wei	b72c54c4f8	Add benchmark data generator, basic ingestion/persist/merge/query benchmarks (#2875 )	2016-05-25 16:39:37 -07:00
Dave Li	dcabd4b1ee	Add lookup optimization for InDimFilter (#2938 ) * Add lookup optimization for InDimFilter * tests for in filter with lookup extraction fn * refactor * refactor2 and modified filter test * make optimizeLookup private	2016-05-19 16:29:16 -07:00
Charles Allen	15ccf451f9	Move QueryGranularity static fields to QueryGranularities (#2980 ) * Move QueryGranularity static fields to QueryGranularityUtil * Fixes #2979 * Add test showing #2979 * change name to QueryGranularities	2016-05-17 16:23:48 -07:00
Charles Allen	fb01db4db7	[QTL] Allows RegisteredLookupExtractionFn to find its lookups lazily (#2971 ) * Allows RegisteredLookupExtractionFn to find its lookups lazily * Use raw variables instead of AtomicReference * Make sure to use volatile * Remove extra local variable. * Move from BAOS to ByteBuffer	2016-05-17 11:29:39 -07:00
Himanshu	d3e9c47a5f	use correct ObjectMapper in Index[IO/Merger] in AggregationTestHelper and minor fix in theta sketch SketchMergeAggregatorFactory.getMergingFactory(..) (#2943 )	2016-05-13 10:06:31 +05:30
Himanshu	d821144738	at historicals GpBy query mergeResults does not need merging as results are already merged by GroupByQueryRunnerFactory.mergeRunners(..) (#2962 )	2016-05-12 17:41:24 -07:00
Gian Merlino	01bebf432a	GroupByQuery: Multi-value dimension tests. (#2959 )	2016-05-12 11:31:50 -07:00
Charles Allen	a31348450f	Add toString for LookupConfig (#2935 ) * Helps with operations and getting where the snapshot dir is	2016-05-09 18:20:00 -07:00
Dave Li	79a54283d4	Optimize filter for timeseries, search, and select queries (#2931 ) * Optimize filter for timeseries, search, and select queries * exception at failed toolchest type check * took out query type check * java7 error fix and test improvement	2016-05-09 11:04:06 -07:00
Slim	8b570ab130	make it clear what LookupExtractorFactory start/stop methods return (#2925 )	2016-05-05 10:38:40 -07:00
David Lim	b489f63698	Supervisor for KafkaIndexTask (#2656 ) * supervisor for kafka indexing tasks * cr changes	2016-05-04 23:13:13 -07:00
Himanshu	8e2742b7e8	adding QueryGranularity to segment metadata and optionally expose same from segmentMetadata query (#2873 )	2016-05-03 11:31:10 -07:00
Gian Merlino	40e595c7a0	Remove types from TimeAndDims, they aren't needed. (#2865 )	2016-05-03 13:10:25 -05:00
binlijin	841be5c61f	periodically emit metric segment/scan/pending (#2854 )	2016-05-02 22:38:13 -07:00
Navis Ryu	2729fea84d	Fix parsing fail of segment id with datasource containing underscore (#2797 ) * Fix parsing fail of segment id with underscored datasource (Fix for #2786) * addressed comment * renamed and moved code into api. added log4 dependency for tests * addressed comments * fixed test fails	2016-05-02 22:37:28 -07:00
Gian Merlino	90ce03c66f	Fix integer overflow in SegmentMetadataQuery numRows. (#2890 )	2016-04-27 14:37:04 -07:00
Gian Merlino	6dc7688a29	TimeAndDims equals/hashCode implementation. (#2870 ) Adapted from #2692, thanks @navis for original implementation.	2016-04-22 08:45:20 +08:00
Himanshu	3cfd9c64c9	make singleThreaded groupBy query config overridable at query time (#2828 ) * make isSingleThreaded groupBy query processing overridable at query time * refactor code in GroupByMergedQueryRunner to make processing of single threaded and parallel merging of runners consistent	2016-04-21 17:12:58 -07:00
Slim	984a518c9f	Merge pull request #2734 from b-slim/LookupIntrospection2 [QTL][Lookup] adding introspection endpoint	2016-04-21 12:15:57 -05:00

1535 Commits