druid

Commit Graph

Author	SHA1	Message	Date
Gian Merlino	89d9c61894	Deprecate Aggregator.getName and AggregatorFactory.getAggregatorStartValue. (#3572 )	2016-10-31 15:24:30 -07:00
Navis Ryu	3fca3be9ea	SpecificSegmentQueryRunner misses missing segments from toYielder() (#3617 )	2016-10-30 11:47:29 -07:00
Himanshu	23a8e22836	fix SketchMergeAggregatorFactory.finalizeResults, comparator and more UTs for timeseries, topN (#3613 )	2016-10-28 15:48:33 -07:00
Navis Ryu	898c1c21af	More best-effort parse long (#3603 ) * More best-effort parse long * addressed comments	2016-10-25 10:31:51 -07:00
Akash Dwivedi	4b3bd8bd63	Migrating java-util from Metamarkets. (#3585 ) * Migrating java-util from Metamarkets. * checkstyle and updated license on java-util files. * Removed unused imports from whole project. * cherry pick metamx/java-util@826021f. * Copyright changes on java-util pom, address review comments.	2016-10-21 14:57:07 -07:00
Navis Ryu	8b7ff4409a	Math expressional parameters for aggregator (#2783 ) * Supports expression-paramed aggregator (squashed and rebased on master) also includes math post aggregator (was #2820) * Addressed comments * addressed comments	2016-10-19 13:58:35 -05:00
Roman Leventov	b113a34355	In CPUTimeMetricQueryRunner, account CPU consumed in baseSequence.toYielder() (#3587 )	2016-10-18 09:06:42 -05:00
Charles Allen	2c5c8198db	Make query/cpu/time still report on error (#3535 )	2016-10-18 08:26:21 -05:00
Roman Leventov	9611358f0a	Small topn scan improvements (#3526 ) * Remove unused numProcessed param from PooledTopNAlgorithm.aggregateDimValue() * Replace AtomicInteger with simple int in PooledTopNAlgorithm.scanAndAggregate() and aggregateDimValue() * Remove unused import	2016-10-17 10:36:19 -07:00
Gian Merlino	285516bede	Workaround non-thread-safe use of HLL aggregators. (#3578 ) Despite the non-thread-safety of HyperLogLogCollector, it is actually currently used by multiple threads during realtime indexing. HyperUniquesAggregator's "aggregate" and "get" methods can be called simultaneously by OnheapIncrementalIndex, since its "doAggregate" and "getMetricObjectValue" methods are not synchronized. This means that the optimization of HyperLogLogCollector.fold in #3314 (saving and restoring position rather than duplicating the storage buffer of the right-hand side) could cause corruption in the face of concurrent writes. This patch works around the issue by duplicating the storage buffer in "get" before returning a collector. The returned collector still shares data with the original one, but the situation is no worse than before #3314. In the future we may want to consider making a thread safe version of HLLC that avoids these kinds of problems in realtime indexing. But for now I thought it was best to do a small change that restored the old behavior.	2016-10-17 09:39:12 -07:00
Roman Leventov	5dc95389f7	Add Checkstyle framework (#3551 ) * Add Checkstyle framework * Avoid star import * Need braces for control flow statements * Redundant imports * Add NewLineAtEndOfFile check	2016-10-13 13:37:47 -07:00
Roman Leventov	85ac8eff90	Improve performance of IndexMergerV9 (#3440 ) * Improve performance of StringDimensionMergerV9 and StringDimensionMergerLegacy by avoiding primitive int boxing by using IntIterator in IndexedInts instead of Iterator<Integer>; Extract some common logic for V9 and Legacy mergers; Minor improvements to resource handling in StringDimensionMergerV9 * Don't mask index in MergeIntIterator.makeQueueElement() * DRY conversion RoaringBitmap's IntIterator to fastutil's IntIterator * Do implement skip(n) in IntIterators extending AbstractIntIterator because original implementation is not reliable * Use Test(expected=Exception.class) instead of try { } catch (Exception e) { /* ignore */ }	2016-10-13 08:28:46 -07:00
Charles Allen	76e77cb610	Make segment creation gauva 14 friendly (#3520 )	2016-10-05 15:25:03 -07:00
Gian Merlino	40f2fe7893	Bump versions to 0.9.3-SNAPSHOT (#3524 )	2016-09-29 13:53:32 -07:00
Charles Allen	654e1db309	Add simple test to FunctionalExtractionTest (#3522 )	2016-09-28 23:45:15 -07:00
Gian Merlino	d5a8a35fec	groupBy: GroupByRowProcessor fixes, invert subquery context overrides. (#3502 ) - Fix GroupByRowProcessor config overrides - Fix GroupByRowProcessor resource limit checking - Invert subquery context overrides such that for the subquery, its own keys override keys from the outer query, not the other way around. The last bit is necessary for the test to work, and seems like a better way to do it anyway.	2016-09-23 14:41:09 -07:00
Gian Merlino	7195be32d8	groupBy v2: Fix dangling references. (#3500 ) Acquiring references in the processing task prevents dangling references caused by canceled processing tasks.	2016-09-24 01:59:11 +05:30
Gian Merlino	f8d71fc602	groupBy: Fix maxMergingDictionarySize config. (#3488 )	2016-09-22 10:02:33 -07:00
Gian Merlino	c87ecea975	Fix ListFilteredDimensionSpec blacklisting on non-present values. (#3487 )	2016-09-22 09:12:02 -07:00
Navis Ryu	49c0fe0e8b	Show candidate hosts for the given query (#2282 ) * Show candidate hosts for the given query * Added test cases & minor changes to address comments * Changed path-param to query-pram for intervals/numCandidates	2016-09-22 08:32:38 -07:00
Keuntae Park	54ec4dd584	Support renaming of outputName for cached select and search query results (#3395 ) * support renaming of outputName for cached select and search queries * rebase and resolve conflicts * rollback CacheStrategy interface change * updated based on review comments	2016-09-20 08:19:14 -07:00
Charles Allen	95e08b38ea	[QTL] Reduced Locking Lookups (#3071 ) * Lockless lookups * Fix compile problem * Make stack trace throw instead * Remove non-germane change * * Add better naming to cache keys. Makes logging nicer * Fix #3459 * Move start/stop lock to non-interruptable for readability purposes	2016-09-16 11:54:23 -07:00
Jonathan Wei	df766b2bbd	Add dimension handling interface for ingestion and segment creation (#3217 ) * Add dimension handling interface for ingestion and segment creation * update javadocs for DimensionHandler/DimensionIndexer * Move IndexIO row validation into DimensionHandler * Fix null column skipping in mergerV9 * Add deprecation note for 'numeric_dims' filename pattern in IndexIO v8->v9 conversion * Fix java7 test failure	2016-09-12 12:54:02 -07:00
Gian Merlino	d108461838	groupBy v2: Parallel disk spilling. (#3433 ) In ConcurrentGrouper, when it becomes clear that disk spilling is necessary, switch from hash-based partitioning to thread-based partitioning. This stops processing threads from blocking each other while spilling is occurring.	2016-09-09 16:49:58 -06:00
Gian Merlino	1e3f94237e	groupBy v2: Configurable load factor. (#3437 ) Also change defaults: - bufferGrouperMaxLoadFactor from 0.75 to 0.7. - maxMergingDictionarySize to 100MB from 25MB, should be more appropriate for most heaps.	2016-09-07 14:14:59 -05:00
Roman Leventov	4f0bcdce36	Eager file unmapping in IndexIO, IndexMerger and IndexMergerV9 (#3422 ) * Eager file unmapping in IndexIO, IndexMerger and IndexMergerV9. The exact purpose for this change is to allow running IndexMergeBenchmark in Windows, however should also be universally 'better' than non-deterministic unmapping, done when MappedByteBuffers are garbage-collected (BACKEND-312) * Use Closer with a proper pattern in IndexIO, IndexMerger and IndexMergerV9 * Unmap file in IndexMergerV9.makeInvertedIndexes() using try-with-resources * Reformat IndexIO	2016-09-07 10:43:47 -07:00
Gian Merlino	8d2ae144a8	groupBy: Short-circuit identity preCompute manipulators. (#3434 )	2016-09-06 22:28:44 -06:00
Gian Merlino	1d07964987	LimitedTemporaryStorage: Fix perf bug. (#3432 ) FilterOutputStream has an inefficient implementation of write(byte[], int, int). So let's extend OutputStream directly and use efficient implementations of all methods.	2016-09-06 15:39:36 -07:00
Gian Merlino	8ed1894488	groupBy: Omit timestamp from merge key when granularity = all. (#3416 ) Fixes #3412.	2016-09-01 09:02:54 -07:00
Gian Merlino	6d25c5e053	Avoid materializing all groupBy results with order + limit. (#3410 ) The old TopNFunction code did Sequences.toList on the input sequence before using a priority queue to find the top N items. Now, the priority queue is used in an accumulator, so there is no need to fully materialize the results. Also removed equals/hashCode from the limitFn and remove limitFn from the GroupByQuery's hashCode, since that wasn't necessary and the implementation of hashCode wasn't correct anyway.	2016-08-31 14:08:07 -07:00
Gian Merlino	1268e2902c	Add groupBy test for multiple multi-value dimensions. (#3415 )	2016-08-31 11:21:10 -07:00
Gian Merlino	e9050c2b4c	TimeFormatExtractionFn: Allow null formats (equivalent to ISO8601) and granular bucketing. (#3411 )	2016-08-31 20:58:53 +05:30
Keuntae Park	0076b5fc1a	Interval bug fix for search query (#2903 ) * support query granularity and interval for search query * skip unncessary bitmap calculation when query interval contains whole the data interval of the given segments. * use binary search to find start and end index for the given interval * fix based on comment * bug fix based on the review comments and add unit tests	2016-08-31 20:52:44 +05:30
Dave Li	c4e8440c22	Adds long compression methods (#3148 ) * add read * update deprecated guava calls * add write and vsizeserde * add benchmark * separate encoding and compression * add header and reformat * update doc * address PR comment * fix buffer order * generate benchmark files * separate encoding strategy and format * fix benchmark * modify supplier write to channel * add float NONE handling * address PR comment * address PR comment 2	2016-08-30 16:17:46 -07:00
Jonathan Wei	4e91330a17	Use DimensionSpec in CardinalityAggregatorFactory (#3406 ) * Use DimensionSpec in CardinalityAggregatorFactory * Address PR comments * Fix requiredFields()	2016-08-30 15:54:02 -07:00
Gian Merlino	b11e9544ea	GroupBy v2: Improve hash code distribution. (#3407 ) Without this transformation, distribution of hash % X is poor in general. It is catastrophically poor when X is a multiple of 31 (many slots would be empty).	2016-08-30 12:09:08 +05:30
kaijianding	f037dfcaa4	fix missing segments duplicate retried (#3398 )	2016-08-29 23:46:21 +05:30
jaehong choi	2e0f253c32	introducing lists of existing columns in the fields of select queries' output (#2491 ) * introducing lists of existing columns in the fields of select queries' output * rebase master * address the comment. add test code for select query caching * change the cache code in SelectQueryQueryToolChest to 0x16	2016-08-25 21:37:53 +05:30
rajk-tetration	362b9266f8	Adding filters for TimeBoundary on backend (#3168 ) * Adding filters for TimeBoundary on backend Signed-off-by: Balachandar Kesavan <raj.ksvn@gmail.com> * updating TimeBoundaryQuery constructor in QueryHostFinderTest * add filter helpers * update filterSegments + test * Conditional filterSegment depending on whether a filter exists * Style changes * Trigger rebuild * Adding documentation for timeboundaryquery filtering * added filter serialization to timeboundaryquery cache * code style changes	2016-08-15 10:25:24 -07:00
Gian Merlino	e1b0b7de3e	IndexBuilder: Allow replacing rows, customizable maxRows. (#3359 )	2016-08-12 15:22:45 -07:00
Jonathan Wei	454587857c	Make StringComparator deserialization case-insensitive (#3356 )	2016-08-11 18:00:11 -07:00
Himanshu	043562914d	Update IncrementalIndex.getMetricType() to return type name stored by ComplexMetricsSerde instead of AggregatorFactory.getTypeName() (#3341 )	2016-08-10 11:03:44 -07:00
Gian Merlino	1eb7a7e882	Restore optimizations in BoundFilter. (#3343 )	2016-08-10 08:53:17 -07:00
Gian Merlino	a2bcd97512	IncrementalIndex: Fix multi-value dimensions returned from iterators. (#3344 ) They had arrays as values, which MapBasedRow doesn't understand and toStrings rather than converting to lists.	2016-08-10 08:47:29 -07:00
Jonathan Wei	890e3bdd3f	More informative query unit test names (#3342 )	2016-08-09 22:24:48 -07:00
Gian Merlino	8899affe48	Introduce standardized "Resource limit exceeded" error. (#3338 ) Fixes #3336.	2016-08-09 10:50:56 -07:00
Gian Merlino	21bce96c4c	More useful query errors. (#3335 ) Follow-up to #1773, which meant to add more useful query errors but did not actually do so. Since that patch, any error other than interrupt/cancel/timeout was reported as `{"error":"Unknown exception"}`. With this patch, the error fields are: - error, one of the specific strings "Query interrupted", "Query timeout", "Query cancelled", or "Unknown exception" (same behavior as before). - errorMessage, the message of the topmost non-QueryInterruptedException in the causality chain. - errorClass, the class of the topmost non-QueryInterruptedException in the causality chain. - host, the host that failed the query.	2016-08-09 16:14:52 +08:00
Gian Merlino	1aae5bd67d	Nicer handling for cancelled groupBy v2 queries. (#3330 ) 1. Wrap temporaryStorage in a resource holder, to avoid spurious "Closed" errors from already-running processing tasks. 2. Exit early from the merging accumulator if the query is cancelled.	2016-08-05 14:48:06 -07:00
Jonathan Wei	decefb7477	Add time interval dim filter and retention analysis example (#3315 ) * Add time interval dim filter and retention analysis example * Use closed-open matching for intervals, update cache key generation * Fix time filtering tests for interval boundary change	2016-08-05 07:25:04 -07:00
Navis Ryu	5b3f0ccb1f	Support variance and standard deviation (#2525 ) * Support variance and standard deviation * addressed comments	2016-08-04 17:32:58 -07:00
Gian Merlino	9437a7a313	HLL: Avoid some allocations when possible. (#3314 ) - HLLC.fold avoids duplicating the other buffer by saving and restoring its position. - HLLC.makeCollector(buffer) no longer duplicates incoming BBs. - Updated call sites where appropriate to duplicate BBs passed to HLLC.	2016-08-03 18:08:52 -07:00
Gian Merlino	a4b95af839	Fix grouper closing in GroupByMergingQueryRunnerV2. (#3316 ) The grouperHolder should be closed on failure, not the grouper.	2016-08-02 21:02:30 -07:00
Gian Merlino	0299ac73b8	Fix FilteredAggregators at ingestion time and in groupBy v2 nested queries. (#3312 ) The common theme between the two is they both create "fake" DimensionSelectors that work on top of Rows. They both do it because there isn't really any dictionary for the underlying Rows, they're just a stream of data. The fix for both is to allow a DimensionSelector to tell callers that it has no dictionary by returning CARDINALITY_UNKNOWN from getValueCardinality. The callers, in turn, can avoid using it in ways that assume it has a dictionary. Fixes #3311.	2016-08-02 17:39:40 -07:00
Gian Merlino	ae3e0015b6	Fix ClassCastException in nested v2 groupBys with timeouts. (#3310 ) Add tests for the CCE and for a bunch of other groupBy stuff. Also avoids setting the interrupted flag when InterruptedExceptions happen, since this might interfere with resource closing, no other query does it, and is probably pointless anyway since the thread is likely to be a jetty thread that we don't actually want to set an interrupt flag on. Also fixes toString on OrderByColumnSpec.	2016-08-02 16:02:44 -06:00
kaijianding	50d52a24fc	ability to not rollup at index time, make pre aggregation an option (#3020 ) * ability to not rollup at index time, make pre aggregation an option * rename getRowIndexForRollup to getPriorIndex * fix doc misspelling * test query using no-rollup indexes * fix benchmark fail due to jmh bug	2016-08-02 11:13:05 -07:00
Jonathan Wei	0bdaaa224b	Use Long.compare for NumericComparator when possible (#3309 )	2016-08-01 20:36:56 -07:00
Dave Li	bc20658239	groupBy nested query using v2 strategy (#3269 ) * changed v2 nested query strategy * add test for #3239 * update for new ValueMatcher interface and add benchmarks * enable time filtering * address PR comments * add failing test for outer filter aggregator * add helper class for sharing code * update nested groupby doc * move temporary storage instantiation * address PR comment * address PR comment 2	2016-08-01 18:30:39 -07:00
Jonathan Wei	a6105cbb86	Add numeric StringComparator (#3270 ) * Add numeric StringComparator * Only use direct long comparison for numeric ordering in BoundFilter, add time filtering benchmark query * Address PR comments, add multithreaded BoundDimFilter test * Add comment on strlen tie handling * Add timeseries interval filter benchmark * Adjust docs * Use jackson for StringComparator, address PR comments * Add new TopNMetricSpec and SearchSortSpec with tests (WIP) * More TopNMetricSpec and SearchSortSpec tests * Fix NewSearchSortSpec serde * Update docs for new DimensionTopNMetricSpec * Delete NumericDimensionTopNMetricSpec * Delete old SearchSortSpec * Rename NewSearchSortSpec to SearchSortSpec * Add TopN numeric comparator benchmark, address PR comments * Refactor OrderByColumnSpec * Add null checks to NumericComparator and String->BigDecimal conversion function * Add more OrderByColumnSpec serde tests	2016-07-29 15:44:16 -07:00
Navis Ryu	884017d981	"all" type search query spec (#3300 ) * "all" type search query spec * addressed comments * added unit test	2016-07-28 18:16:15 -07:00
Gian Merlino	2553997200	Associate groupBy v2 resources with the Sequence lifecycle. (#3296 ) This fixes a potential issue where groupBy resources could be allocated to create a Sequence, but then the Sequence is never used, and thus the resources are never freed. Also simplifies how groupBy handles config overrides (this made the new unit test easier to write).	2016-07-27 18:44:19 -07:00
Gian Merlino	9b5523add3	Reference counting, better error handling for resources in groupBy v2. (#3268 ) Refcounting prevents releasing the merge buffer, or closing the concurrent grouper, before the processing threads have all finished. The better error handling prevents an avalanche of per-runner exceptions when grouping resources are exhausted, by grouping those all up into a single merged exception.	2016-07-27 01:59:02 +05:30
Erik Dubbelboer	76fabcfdb2	Fix #2782 , Unit test failed for DruidProcessingConfigTest.testDeserialization (#3231 ) On systems with only once processor this test fails.	2016-07-25 15:51:09 -07:00
kaijianding	3dc2974894	Add timestampSpec to metadata.drd and SegmentMetadataQuery (#3227 ) * save TimestampSpec in metadata.drd * add timestampSpec info in SegmentMetadataQuery	2016-07-25 15:45:30 -07:00
Jonathan Wei	a42ccb6d19	Support filtering on long columns (including __time) (#3180 ) * Support filtering on __time column * Rename DruidPredicate * Add docs for ValueMatcherFactory, add comment on getColumnCapabilities * Combine ValueMatcherFactory predicate methods to accept DruidCompositePredicate * Address PR comments (support filter on all long columns) * Use predicate factory instead of composite predicate * Address PR comments * Lazily initialize long handling in selector/in filter * Move long value parsing from InFilter to InDimFilter, make long value parsing thread-safe * Add multithreaded selector/in filter test * Fix non-final lock object in SelectorDimFilter	2016-07-20 17:08:49 -07:00
Gian Merlino	06624c40c0	Share query handling between Appenderator and RealtimePlumber. (#3248 ) Fixes inconsistent metric handling between the two implementations. Formerly, RealtimePlumber only emitted query/segmentAndCache/time and query/wait and Appenderator only emitted query/partial/time and query/wait (all per sink). Now they both do the same thing: - query/segmentAndCache/time, query/segment/time are the time spent per sink. - query/cpu/time is the CPU time spent per query. - query/wait/time is the executor waiting time per sink. These generally match historical metrics, except segmentAndCache & segment mean the same thing here, because one Sink may be partially cached and partially uncached and we aren't splitting that out.	2016-07-19 22:15:13 -05:00
Nishant	7995818220	Increase test timeout to prevent failing on slow machines (#3224 ) constantly timing out on one of slow build machines, increasing the timeout fixed it. Running io.druid.granularity.QueryGranularityTest Tests run: 33, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 11.776 sec - in io.druid.granularity.QueryGranularityTest	2016-07-17 18:44:48 -07:00
Gian Merlino	6cd1f5375b	Better harmonized dimensions for query metrics. (#3245 ) All query metrics now start with toolChest.makeMetricBuilder, and all of those now start with DruidMetrics.makePartialQueryTimeMetric. Also, "id" moved to common code, since all query metrics added it anyway. In particular this will add query-type specific dimensions like "threshold" and "numDimensions" to servlet-originated metrics like query/time.	2016-07-14 11:55:51 -07:00
Gian Merlino	ea03906fcf	Configurable compressRunOnSerialization for Roaring bitmaps. (#3228 ) Defaults to true, which is a change in behavior (this used to be false and unconfigurable).	2016-07-08 10:24:19 +05:30
Gian Merlino	fdc7e88a7d	Allow queries with no aggregators. (#3216 ) This is actually reasonable for a groupBy or lexicographic topNs that is being used to do a "COUNT DISTINCT" kind of query. No aggregators are needed for that query, and including a dummy aggregator wastes 8 bytes per row. It's kind of silly for timeseries, but why not.	2016-07-06 20:38:54 +05:30
Jonathan Wei	f3a3662133	Fix compile error in SearchBinaryFnTest (#3201 )	2016-06-29 09:44:45 -05:00
jaehong choi	efbcbf5315	Support alphanumeric sort in search query (#2593 ) * support alphanumeric sort in search query * address a comment about handling equals() and hashCode() * address comments * add Ut for string comparators * address a comment about space indentations.	2016-06-28 15:06:18 -07:00
Hyukjin Kwon	45f553fc28	Replace the deprecated usage of NoneShardSpec (#3166 )	2016-06-25 10:27:25 -07:00
Gian Merlino	4cc39b2ee7	Alternative groupBy strategy. (#2998 ) This patch introduces a GroupByStrategy concept and two strategies: "v1" is the current groupBy strategy and "v2" is a new one. It also introduces a merge buffers concept in DruidProcessingModule, to try to better manage memory used for merging. Both of these are described in more detail in #2987. There are two goals of this patch: 1. Make it possible for historical/realtime nodes to return larger groupBy result sets, faster, with better memory management. 2. Make it possible for brokers to merge streams when there are no order-by columns, avoiding materialization. This patch does not do anything to help with memory management on the broker when there are order-by columns or when there are nested queries. That could potentially be done in a future patch.	2016-06-24 18:06:09 -07:00
Dave Li	8a08398977	Add segment pruning based on secondary partition dimension (#2982 ) * add get dimension rangeset to filters * add get domain to ShardSpec and added chunk filter in caching clustered client * add null check and modified not filter, started with unit test * add filter test with caching * refactor and some comments * extract filtershard to helper function * fixup * minor changes * update javadoc	2016-06-24 14:52:19 -07:00
michaelschiff	66d8ad36d7	adds new coordinator metrics 'segment/unavailable/count' and (#3176 ) 'segment/underReplicated/count' (#3173)	2016-06-23 14:53:15 -07:00
Gian Merlino	da660bb592	DumpSegment tool. (#3182 ) Fixes #2723.	2016-06-23 14:37:50 -07:00
Gian Merlino	a437fb150b	Fix SegmentMetadataQuery when queryGranularity is requested but not present. (#3181 )	2016-06-23 14:30:50 -07:00
Jonathan Wei	24860a1391	Two-stage filtering (#3018 ) * Two-stage filtering * PR comment	2016-06-22 16:08:21 -07:00
Nishant	f46ad9a4cb	support Union Segment metadata queries (#3132 ) * support Union Segment metadata queries fix 3128 * remove extraneous sys out	2016-06-21 10:30:50 -07:00
Dave Li	12be1c0a4b	Add bucket extraction function (#3033 ) * add bucket extraction function * add doc and header * updated doc and test	2016-06-17 09:24:27 -07:00
Gian Merlino	ebf890fe79	Update master version to 0.9.2-SNAPSHOT. (#3133 )	2016-06-13 13:10:38 -07:00
Nishant	0d427923c0	fix caching for search results (#3119 ) * fix caching for search results properly read count when reading from cache. * fix NPE during merging search count and add test * Update cache key to invalidate prev results	2016-06-09 17:49:47 -07:00
Gian Merlino	5998de7d5b	Fix lenient merging of conflicting aggregators. (#3113 ) This should have marked the conflicting aggregator as null, but instead it threw an NPE for the entire query.	2016-06-08 15:56:48 -07:00
Jonathan Wei	37c8a8f186	Speed up filter tests with adapter cache (#3103 )	2016-06-08 07:41:10 -07:00
Gian Merlino	54139c6815	Fix NPE in registeredLookup extractionFn when "optimize" is not provided. (#3064 )	2016-06-03 12:58:17 -05:00
Gian Merlino	6171e078c8	Improve NPE message in LookupDimensionSpec when lookup does not exist. (#3065 ) The message used to be empty, which made things hard to debug.	2016-06-02 19:59:12 -07:00
John Wang	e662efa79f	segment interface refactor for proposal 2965 (#2990 )	2016-05-26 20:36:41 -07:00
Kurt Young	b5bd406597	fix #2991 : race condition in OnheapIncrementalIndex#addToFacts (#3002 ) * fix #2991: race condition in OnheapIncrementalIndex#addToFacts * add missing header * handle parseExceptions when first doing first agg	2016-05-25 19:05:46 -07:00
Jonathan Wei	b72c54c4f8	Add benchmark data generator, basic ingestion/persist/merge/query benchmarks (#2875 )	2016-05-25 16:39:37 -07:00
Dave Li	dcabd4b1ee	Add lookup optimization for InDimFilter (#2938 ) * Add lookup optimization for InDimFilter * tests for in filter with lookup extraction fn * refactor * refactor2 and modified filter test * make optimizeLookup private	2016-05-19 16:29:16 -07:00
Charles Allen	15ccf451f9	Move QueryGranularity static fields to QueryGranularities (#2980 ) * Move QueryGranularity static fields to QueryGranularityUtil * Fixes #2979 * Add test showing #2979 * change name to QueryGranularities	2016-05-17 16:23:48 -07:00
Charles Allen	fb01db4db7	[QTL] Allows RegisteredLookupExtractionFn to find its lookups lazily (#2971 ) * Allows RegisteredLookupExtractionFn to find its lookups lazily * Use raw variables instead of AtomicReference * Make sure to use volatile * Remove extra local variable. * Move from BAOS to ByteBuffer	2016-05-17 11:29:39 -07:00
Himanshu	d3e9c47a5f	use correct ObjectMapper in Index[IO/Merger] in AggregationTestHelper and minor fix in theta sketch SketchMergeAggregatorFactory.getMergingFactory(..) (#2943 )	2016-05-13 10:06:31 +05:30
Himanshu	d821144738	at historicals GpBy query mergeResults does not need merging as results are already merged by GroupByQueryRunnerFactory.mergeRunners(..) (#2962 )	2016-05-12 17:41:24 -07:00
Gian Merlino	01bebf432a	GroupByQuery: Multi-value dimension tests. (#2959 )	2016-05-12 11:31:50 -07:00
Charles Allen	a31348450f	Add toString for LookupConfig (#2935 ) * Helps with operations and getting where the snapshot dir is	2016-05-09 18:20:00 -07:00
Dave Li	79a54283d4	Optimize filter for timeseries, search, and select queries (#2931 ) * Optimize filter for timeseries, search, and select queries * exception at failed toolchest type check * took out query type check * java7 error fix and test improvement	2016-05-09 11:04:06 -07:00
Slim	8b570ab130	make it clear what LookupExtractorFactory start/stop methods return (#2925 )	2016-05-05 10:38:40 -07:00
David Lim	b489f63698	Supervisor for KafkaIndexTask (#2656 ) * supervisor for kafka indexing tasks * cr changes	2016-05-04 23:13:13 -07:00
Himanshu	8e2742b7e8	adding QueryGranularity to segment metadata and optionally expose same from segmentMetadata query (#2873 )	2016-05-03 11:31:10 -07:00
Gian Merlino	40e595c7a0	Remove types from TimeAndDims, they aren't needed. (#2865 )	2016-05-03 13:10:25 -05:00
binlijin	841be5c61f	periodically emit metric segment/scan/pending (#2854 )	2016-05-02 22:38:13 -07:00
Navis Ryu	2729fea84d	Fix parsing fail of segment id with datasource containing underscore (#2797 ) * Fix parsing fail of segment id with underscored datasource (Fix for #2786) * addressed comment * renamed and moved code into api. added log4 dependency for tests * addressed comments * fixed test fails	2016-05-02 22:37:28 -07:00
Gian Merlino	90ce03c66f	Fix integer overflow in SegmentMetadataQuery numRows. (#2890 )	2016-04-27 14:37:04 -07:00
Gian Merlino	6dc7688a29	TimeAndDims equals/hashCode implementation. (#2870 ) Adapted from #2692, thanks @navis for original implementation.	2016-04-22 08:45:20 +08:00
Himanshu	3cfd9c64c9	make singleThreaded groupBy query config overridable at query time (#2828 ) * make isSingleThreaded groupBy query processing overridable at query time * refactor code in GroupByMergedQueryRunner to make processing of single threaded and parallel merging of runners consistent	2016-04-21 17:12:58 -07:00
Slim	984a518c9f	Merge pull request #2734 from b-slim/LookupIntrospection2 [QTL][Lookup] adding introspection endpoint	2016-04-21 12:15:57 -05:00
Gian Merlino	c74391e54c	JavaScript: Ability to disable. (#2853 ) Fixes #2852.	2016-04-21 09:43:15 -05:00
Gian Merlino	7d3e55717d	Reduce cost of various toFilter calls. (#2860 ) These happen once per segment and so it's better if they don't do as much work.	2016-04-21 04:28:46 +08:00
Gian Merlino	59460b17cc	Add Filters.matchPredicate helper, use it where appropriate. (#2851 ) This approach simplifies code and is generally faster, due to skipping unnecessary dictionary lookups (see #2850).	2016-04-19 15:54:32 -07:00
Xavier Léauté	b2745befb7	remove obsolete comment (#2858 )	2016-04-19 13:06:58 -07:00
Jisoo Kim	7b65ca7889	refactor ClientQuerySegmentWalker (#2837 ) * refactor ClientQuerySegmentWalker * add header to FluentQueryRunnerBuilder * refactor QueryRunnerTestHelper	2016-04-18 14:00:47 -07:00
Gian Merlino	7c0b1dde3a	DimensionPredicateFilter: Skip unnecessary dictionary lookup. (#2850 )	2016-04-18 12:38:25 -07:00
Jonathan Wei	b534f7203c	Fix performance regression from #2753 in IndexMerger (#2841 )	2016-04-14 21:39:41 -07:00
Jonathan Wei	a26134575b	Fix NPE in TopNLexicographicResultBuilder.addEntry() (#2835 )	2016-04-13 17:27:16 -07:00
Fangjin Yang	abd951df1a	Document how to use roaring bitmaps (#2824 ) * Document how to use roaring bitmaps This fixes #2408. While not all indexSpec properties are explained, it does explain how roaring bitmaps can be turned on. * fix * fix * fix * fix	2016-04-12 19:28:02 -07:00
michaelschiff	db35dd7508	fix issue #2744 . Check for null before combining metrics (#2774 )	2016-04-12 14:46:31 -07:00
Nishant	1bf1dd03a0	Merge pull request #2812 from mrijke/fix-missing-equals-hashcode-filters Add missing equals/hashcode to JS, Regex and SearchQuery DimFilters	2016-04-12 12:00:23 +05:30
Charles Allen	21e406613c	Merge pull request #2809 from metamx/fix2694 Fix test for snapshot taker to better check for lookup perist failure	2016-04-11 14:52:47 -07:00
Maarten Rijke	de68d6b7c4	Add missing equals/hashcode to JS, Regex and SearchQuery DimFilters This commits adds missing equals() and hashcode() methods to the JavascriptDimFilter, RegexDimFilter and the SearchQueryDimFilter.	2016-04-11 12:16:24 +02:00
Nishant	bbb326decf	Merge pull request #2799 from b-slim/fix_snapshot MapLookupFactory need to be Ser/Desr ready.	2016-04-07 13:22:34 +05:30
Slim Bouguerra	bf1eafc4e1	remove all the mock lookupFactory	2016-04-06 15:37:52 -05:00
Slim Bouguerra	59eb2490a0	MapLookupFactory need to be Ser/Desr.	2016-04-06 15:02:18 -05:00
Charles Allen	f915a59138	Merge pull request #2691 from metamx/lookupExtrFn Add ExtractionFn to LookupExtractor bridge	2016-04-06 09:13:08 -07:00
jon-wei	051fd6c0eb	Remove extra println from InFilter	2016-04-05 14:55:49 -07:00
Fangjin Yang	289bb6f885	Merge pull request #2690 from jon-wei/filter_support Allow filters to use extraction functions	2016-04-05 15:40:15 -06:00
jon-wei	0e481d6f93	Allow filters to use extraction functions	2016-04-05 13:24:56 -07:00
Gian Merlino	e060a9f283	Additional ExtractionFn null-handling adjustments. Followup to comments on #2771.	2016-04-01 18:35:26 -07:00
Fangjin Yang	18b9ea62cf	Merge pull request #2771 from gianm/extractionfn-stuff Various ExtractionFn null handling fixes.	2016-04-01 16:35:46 -07:00
Gian Merlino	23d66e5ff9	Merge pull request #2765 from navis/invalid-encode-nullstring Null string is encoded as "null" in incremental index	2016-04-01 14:43:40 -07:00
Gian Merlino	b6e4d8b2c1	Various ExtractionFn null handling fixes. - JavaScriptExtractionFn shouldn't pass empty strings to its JS functions - Upper/LowerExtractionFn properly handles null Objects (DimExtractionFn's implementation works here) - MatchingDimExtractionFn properly returns nulls rather than empties - RegexDimExtractionFn properly attempts matching on nulls and empties - SearchQuerySpecDimExtractionFn properly returns nulls when passed empties	2016-04-01 14:34:47 -07:00
Fangjin Yang	eea7a47870	Merge pull request #2576 from navis/paging-from-next Add option for select query to get next page without modifying returned paging identifiers	2016-04-01 13:50:36 -07:00
Fangjin Yang	4eb5a2c4f1	Merge pull request #2715 from navis/stringformat-null-handling stringFormat extractionFn should be able to return null on null values (Fix for #2706)	2016-04-01 13:45:28 -07:00
Gian Merlino	23364a47fd	BaseFilterTest: Test optimized filters too.	2016-04-01 12:44:59 -07:00
navis.ryu	077522a46f	stringFormat extractionFn should be able to return null on null values (Fix for #2706 )	2016-04-01 13:40:56 +09:00
navis.ryu	f0e55f5d31	Null string is encoded as "null" in incremental index	2016-04-01 09:47:15 +09:00
navis.ryu	29bb00535b	Add option for select query to get next page without modifying returned paging identifiers	2016-04-01 09:03:03 +09:00
Gian Merlino	5f9240fcbc	Merge pull request #2577 from navis/native-in-filter Implement native in filter	2016-03-30 20:02:54 -07:00
Fangjin Yang	3d68da94fe	Merge pull request #2661 from navis/utf8-estimated-length Utility method for length estimation of utf8	2016-03-30 19:56:14 -07:00
navis.ryu	108535fd07	Implement native in filter (Fix for #2577 )	2016-03-31 10:10:57 +09:00
navis.ryu	e0cfd9ee19	Utility method for length estimation of utf8	2016-03-31 10:07:00 +09:00
jon-wei	5503bf1b38	Remove unnecessary type check in TimeAndDimsComp	2016-03-30 17:54:15 -07:00
Fangjin Yang	95733a362f	Merge pull request #2753 from gianm/null-filtering-multi-value-columns More consistent empty-set filtering behavior on multi-value columns.	2016-03-29 18:52:25 -07:00
Charles Allen	95d42cfd9e	Merge pull request #2758 from pjain1/fix_npe_in_filter handle null values in In Filter	2016-03-29 17:53:02 -07:00
Gian Merlino	1853f36e9f	More consistent empty-set filtering behavior on multi-value columns. The behavior is now that filters on "null" will match rows with no values. The behavior in the past was inconsistent; sometimes these filters would match and sometimes they wouldn't. Adds tests for this behavior to SelectorFilterTest and BoundFilterTest, for query-level filters and filtered aggregates. Fixes #2750.	2016-03-29 15:32:13 -07:00
Parag Jain	d892918a3d	handle null values in In Filter	2016-03-29 17:03:26 -05:00
Fangjin Yang	e023df2b92	Merge pull request #2754 from gianm/i-dont-get-it Remove error suppression code from IncrementalIndexAdapter.	2016-03-28 19:29:53 -07:00
Gian Merlino	c7ff0d698e	Remove error suppression code from IncrementalIndexAdapter.	2016-03-28 18:40:27 -07:00
fjy	c418a55638	cleanup distinct count agg	2016-03-28 17:29:41 -07:00
Fangjin Yang	9cb197adec	Merge pull request #2722 from himanshug/fix_hadoop_jar_upload config to explicitly specify classpath for hadoop container during hadoop ingestion	2016-03-28 14:49:03 -07:00
Charles Allen	4a98c4fbac	Fix LookupExtractionFn equals and hashCode	2016-03-28 13:14:43 -07:00
Charles Allen	0ee861d0da	Add ExtractionFn to LookupExtractor bridge	2016-03-28 13:14:43 -07:00
Fangjin Yang	7fe277e6da	Merge pull request #2727 from gianm/optimize-bound-filter BoundFilter optimizations, and related interface changes.	2016-03-26 18:59:05 -07:00
Fangjin Yang	0dae28b6af	Merge pull request #2729 from jon-wei/fix_hyperunique_comparator Fix HyperUniquesAggregatorFactory comparator	2016-03-26 15:39:35 -07:00
Gian Merlino	2970b49adc	BoundFilter optimizations, and related interface changes. BoundFilter: - For lexicographic bounds, use bitmapIndex.getIndex to find the start and end points, then union all bitmaps between those points. - For alphanumeric bounds, iterate through dimValues, and union all bitmaps for values matching the predicate. - Change behavior for nulls: it used to be that the BoundFilter would never match nulls, now it matches nulls if "" is allowed by the lower limit and not excluded by the upper limit. Interface changes: - BitmapIndex: add `int getIndex(value)` to make it possible to get the index for a value without retrieving the bitmap. - BitmapIndex: remove `ImmutableBitmap getBitmap(value)`, change callers to `getBitmap(getIndex(value))`. - BitmapIndexSelector: allow retrieving the underlying BitmapIndex through getBitmapIndex. - Clarified contract of indexOf in Indexed, GenericIndexed. Also added tests for SelectorFilter, NotFilter, and BoundFilter.	2016-03-25 14:11:48 -07:00
jon-wei	9afaa2b94a	Fix HyperUniquesAggregatorFactory comparator	2016-03-25 12:36:42 -07:00
Gian Merlino	4ac9e03161	Fix predicate-based ValueMatcher behavior for IncrementalIndex on missing columns. Missing columns should be treated the same as columns containing 100% nulls.	2016-03-25 10:23:59 -07:00
Himanshu Gupta	e78a469fb7	UTs for ExtensionsConfig	2016-03-25 10:51:28 -05:00
Himanshu Gupta	004b00bb96	config to explicitly specify classpath for hadoop container during hadoop ingestion	2016-03-25 10:51:28 -05:00
Nishant	0b03c9405f	Merge pull request #2614 from sirpkt/calendric_gran Support week, month, quarter, and year in query granularity	2016-03-24 16:21:01 -07:00
Himanshu	56343c6cdc	Merge pull request #2704 from navis/simple-optimize optimize single elemented and/or filter	2016-03-24 16:13:48 -05:00
Gian Merlino	713062053c	Filters: Add filter.toFilter method, use that instead of the instanceof chain in Filters. I believe that the instanceof chain in Filters exists because in the past, Filter and DimFilter were in different packages (DimFilter was in druid-client and Filter was in druid-processing). And since druid-client didn't depend on druid-processing, DimFilter couldn't have a toFilter method. But now it can.	2016-03-23 17:03:49 -07:00
Gian Merlino	dd86198902	All Filters should work with FilteredAggregators. This removes Filter.makeMatcher(ColumnSelectorFactory) and adds a ValueMatcherFactory implementation to FilteredAggregatorFactory so it can take advantage of existing makeMatcher(ValueMatcherFactory) implementations. This patch also removes the Bound-based method from ValueMatcherFactory. Its only user was the SpatialFilter, which could use the Predicate-based method. Fixes #2604.	2016-03-23 12:24:01 -07:00
binlijin	57d78d3293	clean tmp file when index merge fail	2016-03-23 10:55:12 +08:00
navis.ryu	91f6be4884	optimize single elemented and/or filter	2016-03-23 09:29:15 +09:00
Gian Merlino	ff25325f3b	Improved docs for multi-value dimensions. - Add central doc for multi-value dimensions, with some content from other docs. - Link to multi-value dimension doc from topN and groupBy docs. - Fixes a broken link from dimensionspecs.md, which was presciently already linking to this nonexistent doc. - Resolve inconsistent naming in docs & code (sometimes "multi-valued", sometimes "multi-value") in favor of "multi-value".	2016-03-22 14:40:55 -07:00
jon-wei	a59c9ee1b1	Support use of DimensionSchema class in DimensionsSpec	2016-03-21 13:12:04 -07:00
Keuntae Park	7f29f2ac3b	support week, month, quarter, year in query granularity	2016-03-21 17:41:53 +09:00
Charles Allen	5da9a280b6	Query Time Lookup - Dynamic Configuration	2016-03-18 09:45:05 -07:00
Gian Merlino	738dcd8cd9	Update version to 0.9.1-SNAPSHOT. Fixes #2462	2016-03-17 10:34:20 -07:00
Slim	cf342d8d3c	Merge pull request #2517 from b-slim/adding_lookup_snapshot_utility [QTL][Lookup] lookup module with the snapshot utility	2016-03-17 11:39:47 -05:00
Slim Bouguerra	0c86b29ef0	lookup module with the snapshot utility	2016-03-17 09:20:41 -05:00
Charles Allen	2ac8a22173	Merge pull request #2579 from metamx/closerIsCloser Make CloserRule use guava's Closer	2016-03-14 17:18:19 -07:00
Charles Allen	a64979463f	Make CloserRule use guava's Closer	2016-03-14 15:01:24 -07:00
Fangjin Yang	06813b510a	Merge pull request #2571 from himanshug/gp_by_avoid_sort avoid sort while doing groupBy merging when possible	2016-03-14 14:46:51 -07:00
Fangjin Yang	dbdbacaa18	Merge pull request #2260 from navis/cardinality-for-searchquery Support cardinality for search query	2016-03-14 13:24:40 -07:00
Slim	8cc3582e70	Merge pull request #2644 from metamx/optimize-timeboundary optimize timeboundary for min or max bound	2016-03-13 13:16:24 -05:00
navis.ryu	be341bf4e3	Support cardinality for search query (Fix for #2260 )	2016-03-12 09:51:01 +09:00
Xavier Léauté	6f0d6ef0e9	optimize timeboundary for min or max bound	2016-03-11 14:11:47 -08:00
Gian Merlino	8a11161b20	Plumbers: Move plumber.add out of try/catch for ParseException. The incremental indexes handle that now so it's not necessary. Also, add debug logging and more detailed exceptions to the incremental indexes for the case where there are parse exceptions during aggregation.	2016-03-10 16:39:26 -08:00
Himanshu Gupta	dc0214bddb	while GroupBy merging use unsorted facts in IncrementalIndex wherever possible	2016-03-10 16:11:48 -06:00
Himanshu Gupta	02dfd5cd80	update IncrementalIndex to support unsorted facts map that can be used in groupBy merging to improve performance	2016-03-10 16:11:48 -06:00
Xavier Léauté	90d7409e1a	Merge pull request #2611 from himanshug/gp_by_max_limit only allow lowering maxResults and maxIntermediateRows from groupBy query context	2016-03-10 13:44:13 -08:00
Gian Merlino	a2b1652787	Clarify parser docs. - Clarify what parseSpecs are used for. - Avro, Protobuf should use timeAndDims parseSpecs. - Hadoop jobs should use hadoopyString string parsers.	2016-03-10 08:45:04 -08:00
Fangjin Yang	68cffe1d91	Merge pull request #2615 from gianm/timeseries-skipEmptyBuckets-cache Fix caching of skipEmptyBuckets for TimeseriesQuery.	2016-03-09 18:45:59 -08:00
Gian Merlino	708bc674fa	Make specifying query context booleans more consistent. Before, some needed to be strings and some needed to be real booleans. Now they can all be either one.	2016-03-08 19:38:26 -08:00
Gian Merlino	40dad6dff4	Fix caching of skipEmptyBuckets for TimeseriesQuery.	2016-03-08 19:22:12 -08:00
Himanshu Gupta	ca5de3f583	only allow lowering maxResults and maxIntermediateRows from groupBy query context	2016-03-08 15:03:59 -06:00
Himanshu Gupta	099acb4966	allow groupBy max[Intermediate]Rows limit be overridable by context	2016-03-07 15:22:41 -06:00
Himanshu Gupta	c544ebf25e	reintroducing the safety check removed in commit-1d602be so that dim value ids are less than cardinality	2016-03-03 23:34:23 -06:00
Bingkun Guo	4a58462fc7	update querySegmentSpec when passing query to getQueryRunner After finding the FireChief for a specific partition, Druid will need to find the specific queryRunner for each segment being queried by passing the query to FireChief. Currently Druid is passing the original query that contains all the segments need to be queried, it's possible that fireChief.getQueryRunner(query) returns more than 1 queryRunner because query.getIntervals() is not specific to a single segment. In this patch, for each segment being queried, Druid will update the query with its corresponding SpecificSegmentSpec.	2016-03-02 16:44:56 -06:00
Nishant	31b502773a	Merge pull request #2480 from navis/pagingfail-over-segments Select query cannot span to next segment with paging	2016-03-01 11:42:41 +05:30
Fangjin Yang	e5c25725c0	Merge pull request #2562 from himanshug/fix_2556 with nested GpBy query outer query results need to be further merged	2016-02-29 12:17:33 -08:00
Himanshu Gupta	0722ced413	with GpBy query outer query results need to be further merged	2016-02-29 10:16:25 -06:00
navis.ryu	b1ff920831	Lazily initialize predicate for bound filter	2016-02-29 15:35:52 +09:00
navis.ryu	5f1e60324a	Added more complex test case with versioned segments	2016-02-29 14:48:24 +09:00
navis.ryu	2686bfa394	Select query cannot span to next segment with paging	2016-02-29 00:01:46 +09:00
Fangjin Yang	29d29ba98d	Merge pull request #2263 from jon-wei/flex_dims3 Allow IncrementalIndex to store Long/Float dimensions	2016-02-25 17:23:02 -08:00
jon-wei	c17ce02467	Allow IncrementalIndex to store Long/Float dimensions	2016-02-24 13:51:57 -08:00
jon-wei	fd3782522c	Rename 'replaceMissingValues...' parameters in RegexExtractionFn	2016-02-24 13:12:56 -08:00
Nishant	fb7eae34ed	Merge pull request #2249 from metamx/workerExpanded Use Worker instead of ZkWorker whenever possible	2016-02-24 13:23:22 +05:30
Charles Allen	ac13a5942a	Use Worker instead of ZkWorker whenver possible * Moves last run task state information to Worker * Makes WorkerTaskRunner a TaskRunner which has interfaces to help with getting information about a Worker	2016-02-23 15:02:03 -08:00
Gian Merlino	3534483433	Better handling of ParseExceptions. Two changes: - Allow IncrementalIndex to suppress ParseExceptions on "aggregate". - Add "reportParseExceptions" option to realtime tuning configs. By default this is "false". Behavior of the counters should now be: - processed: Number of rows indexed, including rows where some fields could be parsed and some could not. - thrownAway: Number of rows thrown away due to rejection policy. - unparseable: Number of rows thrown away due to being completely unparseable (no fields salvageable at all). If "reportParseExceptions" is true then "unparseable" will always be zero (because a parse error would cause an exception to be thrown). In addition, "processed" will only include fully parseable rows (because even partial parse failures will cause exceptions to be thrown). Fixes #2510.	2016-02-23 10:11:43 -08:00
Fangjin Yang	3bdd757024	Merge pull request #1773 from b-slim/log_details Adding downstream source when throwing QueryInterruptedException	2016-02-22 10:16:07 -08:00
Slim Bouguerra	77925cc061	adding downstream source of QueryInterruptedException	2016-02-20 13:05:14 -06:00
Fangjin Yang	8ee81947cd	Merge pull request #2494 from himanshug/fix_timeseries do not drop post-aggs in TimeseriesQueryToolChest.makePreComputeManipulatorFn	2016-02-20 10:37:32 -08:00
Gian Merlino	d25c46cb9f	Add comparator to HyperUniquesFinalizingPostAggregator. This makes it possible to do groupBys with clauses like "HAVING uniques > 10". Beforehand you couldn't do it with either an aggregator (because it returns an HLLV1 which the havingSpec can't understand) or a finalized postaggregator (because it didn't have a comparator). Now you can at least do it with a finalizing postaggregator. Trying it with the aggregator alone still doesn't work. Added some topN and groupBy tests verifying the comparator, and added an @Ignore test that should pass if havingSpecs are made work on the aggregator directly.	2016-02-19 08:36:08 -08:00
Himanshu Gupta	11b0117422	do not drop post-aggs in timeseries query tool chest makePreComputeManipulatorFn like other query types	2016-02-17 20:51:35 -06:00
Jaehong Choi	32b9d57b23	handle a failing UT in GroupByQueryRunnerTest after merging into the master	2016-02-16 16:56:57 +09:00
Jaehong Choi	b25bca85bc	Merge branch 'master' of https://github.com/druid-io/druid into support-alphanumeric-dimensional-sort-in-gropu-by	2016-02-16 16:42:05 +09:00
Jaehong Choi	e89afc901b	delete System.out.println() in test code	2016-02-16 15:26:37 +09:00
Navis Ryu	cd315627c9	Merge pull request #2393 from CHOIJAEHONG1/support-alphanumeric-dimensional-sort-in-gropu-by support alphanumeric sorting for dimensional columns in groupby (#2393)	2016-02-16 14:11:30 +09:00
Slim	16092eb5e2	Merge pull request #2464 from gianm/print-properties Make startup properties logging optional.	2016-02-14 15:11:35 -06:00
Gian Merlino	e0c049c0b0	Make startup properties logging optional. Off by default, but enabled in the example config files. See also #2452.	2016-02-12 14:12:16 -08:00
Himanshu Gupta	da5fcd0124	before facts get it , indexAndOffsets should already know about it	2016-02-12 13:32:06 -06:00
Jonathan Wei	d63eec65a1	Merge pull request #2208 from navis/metadataquery-minmax Support min/max values for metadata query	2016-02-11 17:28:07 -08:00
Jonathan Wei	e1b022eac9	Merge pull request #2349 from navis/dimensionspec-for-selectquery Support dimension spec for select query	2016-02-11 16:38:16 -08:00
navis.ryu	dd2375477a	Support min/max values for metadata query (#2208 )	2016-02-12 09:35:58 +09:00
Gian Merlino	2d037ef05e	Merge pull request #2453 from DreamLab/fix/topn_sorting_anomaly Fix for unstable behavior of HyperLogLog comparator	2016-02-11 16:05:34 -08:00
navis.ryu	4d63196535	Support dimension spec for select query	2016-02-12 08:54:28 +09:00
Himanshu	47d48e1e67	Merge pull request #2452 from gianm/print-properties PropertiesModule: Print properties, processors, totalMemory on startup.	2016-02-11 16:49:34 -06:00
turu	f277a54a5c	removed unsafe heuristics from hll compareTo and provided unit test for regression	2016-02-11 23:46:24 +01:00
Slim	368988d187	Merge pull request #2291 from druid-io/lookupManager Promoting LookupExtractor state and LookupExtractorFactory to be a first class druid state object.	2016-02-11 16:07:27 -06:00
Gian Merlino	29f7758e74	PropertiesModule: Print properties, processors, totalMemory on startup.	2016-02-11 13:51:08 -08:00
Slim Bouguerra	4e119b7a24	Adding lookup ref manager and lookup dimension spec impl	2016-02-11 12:11:51 -06:00
Jaehong Choi	2f2e2ff5b9	support alphanumeric sorting for dimensional columns in groupby	2016-02-11 17:31:28 +09:00
Keuntae Park	05a144e39a	fix crash with filtered aggregator at ingestion time - only for selector filter because extraction filter is not supported as cardinality is not fixed at ingestion time	2016-02-11 11:25:33 +09:00
Fangjin Yang	b1673ee90e	Merge pull request #2409 from gianm/smq-merged-thing SegmentMetadataQuery: Retain segment id when merging, if possible.	2016-02-08 15:43:39 -08:00
Fangjin Yang	c9c20bb7f3	Merge pull request #2395 from metamx/fixExtractionDimFilterNullTest Actually check cache key null checking in ExtractionDimFilterTest	2016-02-08 14:10:52 -08:00
Gian Merlino	bd9c04244f	SegmentMetadataQuery: Retain segment id when merging, if possible. This is helpful on realtime nodes, where two analyses from two different hydrants are merged together but they are actually from the same segment.	2016-02-08 13:07:02 -08:00
Himanshu Gupta	9fe1b28ee5	provide configuration to enable usage of Off heap merging for groupBy query	2016-02-05 14:18:06 -06:00
Himanshu Gupta	b40c342cd1	make Global stupid pool cache size configurable	2016-02-05 14:18:06 -06:00
Himanshu Gupta	72a1e730a2	OffheapIncrementalIndex updates to do the aggregation merging off-heap	2016-02-05 14:17:05 -06:00
Himanshu Gupta	907dd77483	OffheapIncrementalIndex a copy/paste of OnheapIncrementalIndex	2016-02-05 14:02:31 -06:00
Charles Allen	aac5f9b2c9	Actually check cache key null checking in ExtractionDimFilterTest	2016-02-04 09:44:13 -08:00
fjy	1aa363cea7	new quickstart	2016-02-04 09:37:38 -08:00
Fangjin Yang	da77591129	Merge pull request #2392 from metamx/fix2391 Allow ExtractionDimFilter value to be null	2016-02-03 17:47:14 -08:00
Charles Allen	d4f00096ff	Allow ExtractionDimFilter value to be null * Fixes #2391	2016-02-03 15:51:47 -08:00
Himanshu Gupta	6e7d90cf56	UTs for DefaultLimitSpec	2016-02-03 15:59:12 -06:00
Himanshu Gupta	29e0d7f971	lazily create comparators for row columns when needed	2016-02-03 13:38:20 -06:00
navis.ryu	1d602be0f9	Replace string[] with int[] for dimensions	2016-02-03 15:03:22 +09:00
binlijin	a5ef30ff84	optimize topn on particular situation	2016-02-02 14:20:09 +08:00
Himanshu	93c50d8538	Merge pull request #2094 from navis/simplify-index-merge Simplifying dimension merging	2016-01-29 11:23:14 -06:00
navis.ryu	55a888ea2f	time-descending result of select queries	2016-01-29 10:06:05 +09:00
navis.ryu	dd774ef4dd	one-pass merging of dictionary & index	2016-01-29 10:03:53 +09:00
Himanshu	edd7ce58aa	Merge pull request #2348 from AlexanderSaydakov/fix-aggregator-test-helper fixed createIndex	2016-01-28 16:01:36 -06:00
saydakov	e0860661b1	fixed createIndex	2016-01-28 13:20:50 -08:00
Nishant	99017f4518	Merge pull request #2326 from navis/use-reverse-iterator use reverse-iterator if possible	2016-01-28 19:48:38 +05:30
Nishant	3880f54b87	Merge pull request #2332 from himanshug/configurable_partial make populateUncoveredIntervals a configuration in query context	2016-01-28 10:34:35 +05:30
navis.ryu	7324ece8f9	use reverse-iterator if possible	2016-01-28 09:04:55 +09:00
Xavier Léauté	5a3642bb93	Merge pull request #2247 from metamx/pedanticBuild Enable strict building in travis	2016-01-27 10:27:03 -08:00
Xavier Léauté	2e5004095a	Merge pull request #2341 from gianm/smq-test SegmentMetadataQuery: Fix merging of ColumnAnalysis errors.	2016-01-27 09:37:06 -08:00
Charles Allen	508734c8b0	Long constant reformatting in tests `l` --> `L`	2016-01-27 08:59:19 -08:00
Gian Merlino	b1e6c01762	Make LookupExtractor abstract methods public, they have to work across classloaders.	2016-01-26 23:08:03 -08:00
Gian Merlino	795343f7ef	SegmentMetadataQuery: Fix merging of ColumnAnalysis errors. Also add tests for: - ColumnAnalysis folding - Mixed mmap/incremental merging	2016-01-26 17:16:26 -08:00
Himanshu Gupta	3719b6e3c8	make populateUncoveredIntervals a configuration in query context	2016-01-26 15:13:45 -06:00
Himanshu	3844658fb5	Merge pull request #2323 from druid-io/update-druidapi Update druid-api to 0.3.16	2016-01-26 13:02:10 -06:00
Himanshu Gupta	09d3678667	adding single threaded indexing and querying test for IncrementalIndex	2016-01-23 00:17:14 -06:00
Charles Allen	0000b9fc62	Remove sorting in ProtoBufInputRowParserTest Due to processing/src/test/java/io/druid/data/input/ProtoBufInputRowParserTest.java	2016-01-22 16:02:25 -08:00
Himanshu Gupta	2f7f5119cf	older segments might not have field bitmapSerdeFactory for dimension columns and we must use appropriate default	2016-01-22 13:28:25 -06:00
binlijin	1d1f4d996d	Merge pull request #2111 from binlijin/optimize-create-inverted-indexes optimize create inverted indexes	2016-01-22 11:36:27 +08:00
binlijin	55f7dd4629	optimize create inverted indexes	2016-01-22 10:40:09 +08:00
Gian Merlino	d416279c14	SegmentMetadataQuery support for returning aggregators.	2016-01-21 17:27:25 -08:00
Fangjin Yang	5a9cd89059	Merge pull request #2305 from gianm/segment-metadata-query-multivalues Add StorageAdapter#getColumnTypeName, and various SegmentMetadataQuery adjustments	2016-01-21 17:22:34 -08:00
Gian Merlino	e5913be90e	Merge pull request #2257 from tubemogul/index-merge-bug Adds support for empty merge metrics. fixes #2256	2016-01-21 16:38:00 -08:00
Gian Merlino	87c8046c6c	Add StorageAdapter#getColumnTypeName, and various SegmentMetadataQuery adjustments. SegmentMetadataQuery stuff: - Simplify implementation of SegmentAnalyzer. - Fix type names for realtime complex columns; this used to try to merge a nice type name (like "hyperUnique") from mmapped segments with the word "COMPLEX" from incremental index segments, leading to a merge failure. Now it always uses the nice name. - Add hasMultipleValues to ColumnAnalysis. - Add tests for both mmapped and incremental index segments. - Update docs to include errorMessage.	2016-01-21 15:50:33 -08:00
Fangjin Yang	3f998117a6	Merge pull request #2306 from jon-wei/inherit2 More specific null/empty str handling in IndexMerger	2016-01-21 14:36:09 -08:00
Michael Schiff	1e44445f06	Adds support for empty merge metrics. fixes #2256	2016-01-21 13:21:37 -08:00
jon-wei	459a236067	More specific null/empty str handling in IndexMerger	2016-01-21 12:24:38 -08:00
Slim	201539260c	Merge pull request #2076 from b-slim/issue_2010_upper_lower_extractionFN adding lower and upper extraction fn	2016-01-21 09:58:07 -06:00
Slim Bouguerra	78feb3a13e	adding lower and upper extraction fn	2016-01-21 08:59:05 -06:00
Gian Merlino	5a932d28c1	Merge pull request #2288 from tubemogul/index-merge-bug2 Null check in IncrementalIndexAdapter.getDimValueLookup()	2016-01-20 17:07:15 -08:00
Nishant	59ea186af7	fix reference counting for segments	2016-01-20 17:24:21 +05:30
Michael Schiff	50ceec78a2	null check in IncrementalIndexAdapter.getDimValueLookup()	2016-01-19 23:19:28 -08:00
jon-wei	bc1e9b27c8	Consolidate IndexMergerTest and IndexMergerV9Test	2016-01-19 16:28:35 -08:00
jon-wei	747343e621	Preserve dimension order across indexes during ingestion	2016-01-19 13:34:11 -08:00
Fangjin Yang	0c31f007fc	Merge pull request #1728 from himanshug/aggregators_in_segment_metadata Store AggregatorFactory[] in segment metadata	2016-01-19 12:55:49 -08:00
Himanshu Gupta	a99aef29a1	adding aggregators to segment metadata	2016-01-19 14:23:39 -06:00
Himanshu Gupta	52eb0f04a7	adding a new method getMergingFactory(..) to AggregatorFactory	2016-01-18 22:03:46 -06:00
Himanshu Gupta	77fc86c015	making AggregatorFactory abstract class	2016-01-18 22:03:46 -06:00
Himanshu Gupta	164b0aad7a	removing Map<String,Object> segmentMetadata from methods in Index[Maker/Merger] and using Metadata class instead of a Map to store segment metadata	2016-01-18 22:03:46 -06:00
zhxiaog	3459a202ce	fixed #1873 , add ability to express CONCAT as an extractionFn	2016-01-18 15:03:17 -08:00
Keuntae Park	238dd3be3c	support cascade execution of extraction filters in extraction dimension spec	2016-01-18 11:10:19 +09:00
Fangjin Yang	f6a1a4ae20	Merge pull request #2138 from KurtYoung/feature-build-v9 build v9 directly	2016-01-16 13:35:46 -06:00
Kurt Young	82ff98c2bf	add config for build v9 directly and update docs	2016-01-16 11:26:34 +08:00
Kurt Young	1f2168fae5	add IndexMergerV9 add unit tests for IndexMergerV9 and fix some bugs add more unit tests and fix bugs handle null values and add more tests minor changes & use LoggingProgressIndicator in IndexGeneratorReducer make some static class public from IndexMerger minor changes and add some comments changes for comments	2016-01-16 11:25:28 +08:00
Kurt Young	bb50d2a2b2	add some streaming writers	2016-01-16 11:25:26 +08:00
Fangjin Yang	e0932ba1c2	Merge pull request #2267 from himanshug/fix_topn_multi_val_filter Remap id's returned in XXXFilteredDimensionSpec.getRow() as per reduced cardinality	2016-01-14 17:06:54 -08:00
Fangjin Yang	7704699b40	Merge pull request #2265 from navis/strlen-dimension-ignored Strlen sort spec ignores dimension	2016-01-14 17:06:33 -08:00
Himanshu Gupta	ae6a111444	fix XXXFilteredDimensionSpec to remap the dictionary encodings as per new cardinality	2016-01-13 22:25:02 -06:00
binlijin	a3140b2548	fix topN filtering on multi-valued dimension bug	2016-01-13 22:25:02 -06:00
navis.ryu	ea9fabdf2f	Strlen sort spec ignores dimension	2016-01-14 11:05:44 +09:00
Fangjin Yang	4c014c1574	Merge pull request #2228 from metamx/incremental-index-mem2 Improve heap usage for IncrementalIndex	2016-01-13 14:48:03 -08:00
navis.ryu	18479bb757	time-descending result of timeseries queries	2016-01-13 12:23:01 +09:00
Fangjin Yang	d7ad93debc	Merge pull request #2221 from binlijin/topN_minTopNThreshold Allow change minTopNThreshold per topN query	2016-01-12 16:22:20 -08:00
Nishant	4863e2ca4f	cache metric selectors instead of creating new ones for every metric in each row clear selectors on close. Add comments about thread safety.	2016-01-13 00:45:23 +05:30
Nishant	dfe6abb721	Merge pull request #2250 from himanshug/agg_test_helper_fix remove redundant registering of json modules in AggregationTestHelper	2016-01-12 11:42:00 +05:30
navis.ryu	976ebc45c0	Simplify information in IncrementalIndex	2016-01-12 10:18:11 +09:00
Himanshu Gupta	b973604bf8	remove redundant registering of json modules in AggregationTestHelper	2016-01-11 19:03:22 -06:00
Xavier Léauté	46a7f2660d	fix casing to be consistent with other classes	2016-01-08 10:19:06 -08:00
Fangjin Yang	d0b10c29d7	Merge pull request #2197 from metamx/clearIncIndexClose Make OnHeapIncrementalIndex clean maps on close()	2016-01-07 15:43:47 -08:00
Gian Merlino	4ecd901a1a	Merge pull request #2219 from himanshug/identity_extraction_fn_singleton make IdentityExtractionFn singleton	2016-01-07 10:08:28 -08:00
Fangjin Yang	aaea95ed1b	Merge pull request #2207 from himanshug/theta_sketch_select_query fix bug for thetaSketch metric not working with select queries	2016-01-07 09:46:09 -08:00
binlijin	010c6e959c	add test	2016-01-07 18:01:46 +08:00
binlijin	a6bfcc5bfd	Allow change minTopNThreshold per topN query	2016-01-07 14:51:00 +08:00
Fangjin Yang	4cc81d3eff	Merge pull request #2096 from b-slim/add_use_case_unapply Add use case unapply	2016-01-06 21:58:12 -08:00
Himanshu Gupta	217079d0c7	make IdentityExtractionFn singleton	2016-01-06 22:29:07 -06:00
Himanshu	902f51433d	Merge pull request #2125 from mangeshpardeshiyahoo/master Add extraction function support for Dimension Selector	2016-01-06 14:22:26 -06:00
Mangesh Pardeshi	75ee952197	Add extraction function support for dimension Selector	2016-01-06 13:47:07 -06:00
Slim Bouguerra	032d3bf6e6	Optimization of extraction filter by reversing the lookup	2016-01-06 11:16:11 -06:00
Himanshu Gupta	3f048f0b15	adding support to execute Select queries in AggregationTestHelper so that Select query based UTs can be written for complex aggregator implementations	2016-01-05 21:54:55 -06:00
Charles Allen	91fc32749b	Make OnHeapIncrementalIndex clean maps on close()	2016-01-04 11:18:16 -08:00
Himanshu Gupta	b47d807738	Add support for filtering at DimensionSpec level so that multivalued dimensions can be filtered correctly also adding UTs for multi-valued dimensions	2015-12-30 17:59:47 -06:00
Himanshu Gupta	fa5c3bb014	adding decorate(DimensionSelector) to DimensionSpec to enable support for arbitrary filtering/transformations to returned dimension values	2015-12-30 15:06:24 -06:00
Nishant	b68265399c	Merge pull request #2168 from druid-io/remove-indexmaker Remove IndexMaker	2015-12-30 12:24:29 +05:30
Fangjin Yang	e14ad74088	Merge pull request #1936 from b-slim/between_range_with_predicat adding Upper/Lower Bound Filter	2015-12-29 10:11:22 -08:00
fjy	faf421726b	remove IndexMaker	2015-12-28 14:19:02 -08:00
Gian Merlino	83f4130b5f	SegmentMetadataQuery merging fixes. - Fix merging when the INTERVALS analysisType is disabled, and add a test. - Remove transformFn from CombiningSequence, use MappingSequence instead. transformFn did not work for "accumulate" anyway, which made the tests wrong (the intervals should have been condensed, but were not). - Add analysisTypes to the Druids segmentMetadataQuery builder to make testing simpler.	2015-12-22 07:57:10 -08:00
Robin	dded4441d3	for completeness, add unit test for groupby/having with unrecognized type	2015-12-21 12:06:56 -06:00
Himanshu Gupta	e1631967e3	adding comments to explain merge failure in segmentMetadata query	2015-12-19 11:39:24 -06:00
Himanshu Gupta	7ecad1be24	Fix and UT for testing segment analysis merge	2015-12-19 00:24:02 -06:00
Fangjin Yang	7019d3c421	Merge pull request #2107 from jon-wei/fix_smq More efficient SegmentMetadataQuery	2015-12-18 16:40:47 -08:00
Fangjin Yang	14229ba0f2	Merge pull request #1922 from metamx/jsonIgnoresFinalFields Change DefaultObjectMapper to NOT overwrite final fields unless explicitly asked to	2015-12-18 15:38:32 -08:00
Fangjin Yang	71f554bf80	Merge pull request #2101 from himanshug/fix_extraction_dim_filter_cache_key add extractionFn bytes to cache key in ExtractionDimFilter	2015-12-18 12:05:43 -08:00
Fangjin Yang	9e6874cc7e	Merge pull request #2084 from binlijin/master minor optimize IndexMerger's MMappedIndexRowIterable	2015-12-18 11:42:55 -08:00
Bingkun	cc21a5fac7	Merge pull request #1999 from himanshug/remove_min_max_aggs remove min/max aggregator factory	2015-12-18 13:38:52 -06:00
jon-wei	356b07c6c3	More efficient SegmentMetadataQuery	2015-12-17 12:46:23 -08:00
Jonathan Wei	f8cf84f466	Merge pull request #1995 from himanshug/num_rows_seg_metadata_query add numRows to segment metadata query response	2015-12-17 12:23:46 -08:00
Himanshu Gupta	82ea348003	add extractionFn bytes to cache key in ExtractionDimFilter	2015-12-16 14:00:38 -06:00
Himanshu	628643d80e	Merge pull request #2091 from rasahner/noDefaultForGroupbyHaving take away default for groupBy/having	2015-12-16 01:07:40 -06:00
sahner	3441cf3110	take away default for groupBy/having	2015-12-15 10:32:45 -06:00
Fangjin Yang	e7f06cf61c	Merge pull request #2075 from jon-wei/regex_extract Configurable value replacement on match failure for RegexExtractionFn	2015-12-14 19:10:50 -08:00
jon-wei	c88f75df7c	Configurable value replacement on match failure for RegexExtractionFn	2015-12-14 17:57:41 -08:00
binlijin	362bea1090	minor optimize IndexMerger's MMappedIndexRowIterable	2015-12-11 15:04:46 +08:00
Xavier Léauté	d531e69d1a	Merge pull request #2079 from binlijin/master reduce bytearray copy to minimal optimize VSizeIndexedWriter	2015-12-10 21:30:09 -08:00
Slim Bouguerra	77afdf25e3	adding Bound Filter	2015-12-10 08:47:21 -06:00
Slim Bouguerra	ee1a39801a	adding bulk lookup and reverse lookup	2015-12-10 08:29:41 -06:00
binlijin	0eafbd55b2	reduce bytearray copy to minimal optimize VSizeIndexedWriter	2015-12-10 16:34:39 +08:00
Fangjin Yang	f4ba13a1ac	Merge pull request #2029 from b-slim/add_reverse_fn Adding reverse lookup function to LookupExtractor.	2015-12-09 12:50:13 -08:00
Xavier Léauté	9015a68c03	Merge pull request #2002 from navis/DRUID-2001 fixed #2001 GenericIndexed.fromIterable compares all values even when it's not sorted	2015-12-09 08:56:49 -08:00
Slim Bouguerra	85f339b687	introduction and implem of reverse lookup function unApply.	2015-12-09 10:02:57 -06:00
Nishant	6c23d8edb4	Merge pull request #2043 from mangeshpardeshiyahoo/master Add dimension selector support for groupby/having filters	2015-12-08 12:08:53 +05:30
Mangesh Pardeshi	d7ce120929	Add dimension selector support for groupby/having quries	2015-12-08 01:51:11 +00:00
Himanshu Gupta	431469e9c1	remove min/max aggregator factory which are replaced by double[min/max] aggregator factories	2015-12-05 22:36:49 -06:00
Himanshu Gupta	62ba9ade37	unifying license header in all java files	2015-12-05 22:16:23 -06:00
Gian Merlino	d21a640695	Merge pull request #2034 from b-slim/fix_cache_key Fix getCacheKey for DimFilters	2015-12-04 09:13:06 -08:00
Slim Bouguerra	fb4ff3cf54	fix getCacheKey	2015-12-04 08:07:08 -06:00
Charles Allen	9d02f47201	Update IncrementalIndexTest copyright notice	2015-12-03 18:03:08 -08:00
Charles Allen	be8c6fafb0	Merge pull request #2017 from tubemogul/issue/63 fixes issue #63	2015-12-03 18:01:11 -08:00
Gian Merlino	045df54404	Merge pull request #1961 from metamx/druidMetricsVersion Add the druid artifact version to metrics when emitted	2015-12-03 17:34:57 -08:00

... 5 6 7 8 9 ...

1892 Commits