druid

Commit Graph

Author	SHA1	Message	Date
Gian Merlino	337f3870d8	Fix TimeFormatExtractionFn getCacheKey when tz, locale are not provided. (#4007 ) * Fix TimeFormatExtractionFn getCacheKey when tz, locale are not provided. * Remove unused import. * Use defaults in cache key.	2017-03-04 17:41:59 -08:00
praveev	67d0ae3271	Let toDateTime call fall through for Duration Granularity (#4001 ) * Let toDateTime call fall through for Duration Granularity Added test for the same. * Add duration granularity test to GroupByQueryRunnerTest	2017-03-03 13:27:22 -06:00
Himanshu	e7e3c2dc5a	support singleThreaded flag for groupBy-v2 as well (#3992 )	2017-03-03 23:43:06 +05:30
Roman Leventov	81a5f9851f	TmpFileIOPeons to create files under the merging output directory, instead of java.io.tmpdir (#3990 ) * In IndexMerger and IndexMergerV9, create temporary files under the output directory/tmpPeonFiles, instead of java.io.tmpdir * Use FileUtils.forceMkdir() across the codebase and remove some unused code * Fix test * Fix PullDependencies.run() * Unused import	2017-03-02 14:05:12 -08:00
Jonathan Wei	5fb1638534	Add default configuration for select query 'fromNext' parameter (#3986 ) * Add default configuration for select query 'fromNext' parameter * PR comments * Fix PagingSpec config injection * Injection fix for test	2017-03-01 17:05:35 -08:00
Himanshu	8316b4f48f	fix TimeDimExtractionFn.apply() under concurrency (#3984 )	2017-03-01 13:07:12 -08:00
kaijianding	772de66e79	add filenameBase to log when exceed file size limit to indicate which column it is (#3982 )	2017-03-01 13:05:07 -08:00
Gian Merlino	cc20133e70	Checkstyle rule to outlaw tabs. (#3988 ) Tabs are the worst.	2017-02-28 23:52:53 -08:00
Akash Dwivedi	91344cbe57	Enable GenericIndexed V2 for built-in(druid-io managed) complex columns. (#3987 ) * Enable GenericIndexed V2 for complex columns. * SerializerBuilder to use GenericColumnSerializer.	2017-02-28 22:06:54 -08:00
Jonathan Wei	a08660a9ca	Support ingestion of long/float dimensions (#3966 ) * Support ingestion for long/float dimensions * Allow non-arrays for key components in indexing type strategy interfaces * Add numeric index merge test, fixes * Docs for numeric dims at ingestion * Remove unused import * Adjust docs, add aggregate on numeric dims tests * remove unused imports * Throw exception for bitmap method on numerics * Move typed selector creation to DimensionIndexer interface * unused imports * Fix * Remove unused DimensionSpec from indexer methods, check for dims first in inc index storage adapter * Remove spaces	2017-02-28 19:04:41 -08:00
praveev	5ccfdcc48b	Fix testDeadlock timeout delay (#3979 ) * No more singleton. Reduce iterations * Granularities * Fix the delay in the test * Add license header * Remove unused imports * Lot more unused imports from all the rearranging * CR feedback * Move javadoc to constructor	2017-02-28 12:51:41 -06:00
praveev	c3bf40108d	One granularity (#3850 ) * Refactor Segment Granularity * Beginning of one granularity * Copy the fix for custom periods in segment-grunalrity over here. * Remove the custom serialization for now. * Compilation cleanup * Reformat code * Fixing unit tests * Unify to use a single iterable * Backward compatibility for rolling upgrade * Minor check style. Cosmetic changes. * Rename length and millis to duration * CR feedback * Minor changes.	2017-02-25 01:02:29 -06:00
Jonathan Wei	58b704c3b4	Don't allow '__time' as a GroupBy output field name (#3967 ) * Don't allow '__time' as a GroupBy column field name * Tweak exception message	2017-02-23 14:39:17 -08:00
kaijianding	7ce05d58bc	fix NPE in search query when dimension contains null value (#3968 ) * fix NPE when dimension contains null value in search query * add ut * search with not existed dimension should always return empty result	2017-02-23 08:07:59 -08:00
Gian Merlino	372b84991c	Add virtual columns to timeseries, topN, and groupBy. (#3941 ) * Add virtual columns to timeseries, topN, and groupBy. * Fix GroupByTimeseriesQueryRunnerTest. * Updates from review comments.	2017-02-22 13:16:48 -08:00
Jihoon Son	7200dce112	Atomic merge buffer acquisition for groupBys (#3939 ) * Atomic merge buffer acquisition for groupBys * documentation * documentation * address comments * address comments * fix test failure * Addressed comments - Add InsufficientResourcesException - Renamed GroupByQueryBrokerResource to GroupByQueryResource * addressed comments * Add takeBatch() to BlockingPool	2017-02-22 14:49:37 -06:00
Gian Merlino	985203b634	Finalize fields in postaggs (#3957 ) * initial commits for finalizeFieldAccess #2433 * fix some bugs to run a query * change name of method Queries.verifyAggregations to Queries.prepareAggregations * add Uts * fix Ut failures * rebased to master * address comments and add a Ut for arithmetic post aggregators * rebased to the master * address the comment of injection within arithmetic post aggregator * address comments and introduce decorate() in the PostAggregator interface. * Address comments. 1. Implements getComparator in FinalizingFieldAccessPostAggregator and add Uts for it 2. Some minor changes like renaming a method name. * Fix a code style mismatch. * Rebased to the master	2017-02-21 16:32:14 -08:00
Gian Merlino	a47206eaf8	Ability to filter on virtual columns. (#3942 ) This didn't need much other than having BitmapIndexSelector return null from various methods to trigger cursor based filtering.	2017-02-21 16:03:31 -08:00
Jihoon Son	128274c6f0	Disable caching on brokers for groupBy v2 (#3950 ) * Disable caching on brokers for groupBy v2 * Rename parameter * address comments	2017-02-21 09:49:49 -08:00
Jonathan Wei	bc33b68b51	Use GroupBy V2 as default (#3953 ) * Use GroupBy V2 as default * Remove unused line * Change assert to exception propagation	2017-02-18 07:40:40 -08:00
kaijianding	361d9d9802	fix dynamic schema data can't rollup correctly (#3949 ) * fix dynamic schema data can't rollup correctly * add ut	2017-02-17 15:07:29 -06:00
Akash Dwivedi	797488a677	Removing Integer.MAX column size limit. (#3743 ) * Removing Integer.MAX column size limit. * On demand creation of headerLong, use v2 instead of v3 * Avoid reusing the same object from a previous test. * Avoid reusing the same object from a previous test part#2 * code formatting. * GenericIndexed/Writer code review changes. * GenericIndexed/writer code review requested changes. * checkIndex() to static * native endianess for genericIndexedV2, code review requested changes. * Formatting * Hll fix. * use native endianess during bag size calculation. * Code review requested changes. * IOPeon close() changes. * use different tmp directory path for testing. * Code review requested changes.	2017-02-16 20:09:43 -06:00
Jihoon Son	a459db68b6	Fine grained buffer management for groupby (#3863 ) * Fine-grained buffer management for group by queries * Remove maxQueryCount from GroupByRules * Fix code style * Merge master * Fix compilation failure * Address comments * Address comments - Revert Sequence - Add isInitialized() to Grouper - Initialize the grouper in RowBasedGrouperHelper.Accumulator - Simple refactoring RowBasedGrouperHelper.Accumulator - Add tests for checking the number of used merge buffers - Improve docs * Revert unnecessary changes * change to visible to testing * fix misspelling	2017-02-14 12:55:54 -08:00
Gian Merlino	af67e8904e	PreComputedHyperUniquesSerde: Fix formatting. (#3932 )	2017-02-14 09:32:29 -08:00
DaimonPl	a2875a4d91	pre-computed HLL support for hyperUnique aggregator (#3909 )	2017-02-13 15:26:20 -08:00
Akash Dwivedi	8854ce018e	File.deleteOnExit() (#3923 ) * Less use of File.deleteOnExit() * removed deleteOnExit from most of the tests/benchmarks/iopeon * Made IOpeon closable * Formatting. * Revert DeterminePartitionsJobTest, remove cleanup method from IOPeon	2017-02-13 15:12:14 -08:00
Himanshu	9dfcf0763a	disable javascript execution by default (#3818 )	2017-02-13 15:11:18 -08:00
Pierre	9ab9feced6	Close all aggregators when closing onHeapIncrementalIndex (#3926 ) * Close all aggregators when closing onHeapIncrementalIndex * Aggregators are now handled as Closeables, remove unnecessary mock in test * Fix variable shadowing	2017-02-13 15:01:27 -08:00
Jihoon Son	991e2852da	Add PostAggregators to generator cache keys for top-n queries (#3899 ) * Add PostAggregators to generator cache keys for top-n queries * Add tests for strings * Remove debug comments * Add type keys and list sizes to cache key * Make post aggregators used for sort are considered for cache key generation * Use assertArrayEquals() * Improve findPostAggregatorsForSort() * Address comments * fix test failure * address comments	2017-02-13 12:23:44 -08:00
Parag Jain	33c635aff2	use as() method of base segment in reference counting segment (#3921 )	2017-02-09 20:24:47 -06:00
Jonathan Wei	ca2b04f0fd	Add long/float ColumnSelectorStrategy implementations (#3838 ) * Add long/float ColumnSelectorStrategy implementations * Address PR comments * Add String strategy with internal dictionary to V2 groupby, remove dict from numeric wrapping selectors, more tests * PR comments * Use BaseSingleValueDimensionSelector for long/float wrapping * remove unused import * Address PR comments * PR comments * PR comments * More PR comments * Fix failing calcite histogram subquery tests * ScanQuery test and comment about isInputRaw * Add outputType to extractionDimensionSpec, tweak SQL tests * Fix limit spec optimization for numerics * Add cardinality sanity checks to TopN * Fix import from merge * Add tests for filtered dimension spec outputType * Address PR comments * Allow filtered dimspecs on numerics * More comments	2017-02-08 20:39:29 -08:00
Gian Merlino	97765fdfef	Simplify LikeFilter implementation of getBitmapIndex, estimateSelectivity. (#3910 ) * Simplify LikeFilter implementation of getBitmapIndex, estimateSelectivity. LikeFilter: - Reduce code duplication, and simplify methods, at the cost of incurring an extra box of ImmutableBitmap into a SingletonImmutableList. I think this is fine, since this should be cheap and the code path is not hot (just once per filter). Filters: - Make estimateSelectivity public since it seems intended that they be used by Filter implementations, and Filters from extensions may want to use them too. Removed @VisibleForTesting for the same reason. - Rename one of the estimatePredicateSelectivity overloads to estimateSelectivity, since predicates aren't involved. * Address PR comments. * Remove unused import * Change List to Collection	2017-02-08 13:46:01 -06:00
Gian Merlino	12317fd001	Bump version to 0.10.0-SNAPSHOT. (#3913 )	2017-02-06 17:54:35 -08:00
Jihoon Son	ddd8c9ef97	Add filter selectivity estimation for auto search strategy (#3848 ) * Add filter selectivity estimation for auto search strategy * Addressed comments * Lazy bitmap materialization for bitmap sampling and java docs * Addressed comments. - Fix wrong non-overlap ratio computation and added unit tests. - Change Iterable<Integer> to IntIterable - Remove unnecessary Iterable<Integer> * Addressed comments - Split a long ternary operation into if-else blocks - Add IntListUtils.fromTo() * Fix test failure and add a test for RangeIntList * fix code style * Diabled selectivity estimation for multi-valued dimensions * Address comment	2017-02-06 11:15:03 -08:00
Parag Jain	8a13a85765	Introduce SegmentizerFactory (#3901 ) * Introduce SegmentizerFactory - that knows how to deserialize specific type of segment - Default implementation is MMappedQueryableSegmentizerFactory which creates QueryableIndexSegment - Unit test for the default behavior * review comments	2017-02-06 10:05:12 -08:00
DaimonPl	93b71e265e	Extract HLL related code to separate module (#3900 )	2017-02-03 09:45:11 -08:00
Jonathan Wei	182261f713	Allow configurable temp directory for query processing (#3893 )	2017-02-02 10:22:28 -08:00
Jonathan Wei	e6b95e80aa	Remove deprecated Aggregator/AggregatorFactory methods (#3894 )	2017-02-01 14:43:18 -08:00
Gian Merlino	d3a3b7ba0c	Add virtual column types, holder serde, and safety features. (#3823 ) * Add virtual column types, holder serde, and safety features. Virtual columns: - add long, float, dimension selectors - put cache IDs in VirtualColumnCacheHelper - adjust serde so VirtualColumns can be the holder object for Jackson - add fail-fast validation for cycle detection and duplicates - add expression virtual column in core Storage adapters: - move virtual column hooks before checking base columns, to prevent surprises when a new base column is added that happens to have the same name as a virtual column. * Fix ExtractionDimensionSpecs with virtual dimensions. * Fix unused imports. * CR comments * Merge one more time, with feeling.	2017-01-26 18:15:51 -08:00
Roman Leventov	75d9e5e7a7	DimensionSelector-related bug fixes and optimizations (fixes #3799 , part of #3798 ) (#3858 ) * * Add DimensionSelector.idLookup() and nameLookupPossibleInAdvance() to allow better inspection of features DimensionSelectors supports, and safer code working with DimensionSelectors in BaseTopNAlgorithm, BaseFilteredDimensionSpec, DimensionSelectorUtils; * Add PredicateFilteringDimensionSelector, to make BaseFilteredDimensionSpec to be able to decorate DimensionSelectors with unknown cardinality; * Add DimensionSelector.makeValueMatcher() (two kinds) for DimensionSelector-side specifics-aware optimization of ValueMatchers; * Optimize getRow() in BaseFilteredDimensionSpec's DimensionSelector, StringDimensionIndexer's DimensionSelector and SingleScanTimeDimSelector; * Use two static singletons, TrueValueMatcher and FalseValueMatcher, instead of BooleanValueMatcher; * Add NullStringObjectColumnSelector singleton and use it in MapVirtualColumn * Rename DimensionSelectorUtils.makeNonDictionaryEncodedIndexedIntsBasedValueMatcher to makeNonDictionaryEncodedRowBasedValueMatcher * Make ArrayBasedIndexedInts constructor private, replace it's usages with of() static factory method * Cache baseIdLookup in ForwardingFilteredDimensionSelector * Fix a bug in DimensionSelectorUtils.makeRowBasedValueMatcher(selector, predicate, matchNull) * Employ precomputed BitSet optimization in DimensionSelector.makeValueMatcher(value, matchNull) when lookupId() is not available, but cardinality is known and lookupName() is available * Doc fixes * Addressed comments * Fix * Fix * Adjust javadoc of DimensionSelector.nameLookupPossibleInAdvance() for SingleScanTimeDimSelector * throw UnsupportedOperationException instead of IAE in BaseTopNAlgorithm	2017-01-25 15:28:27 -08:00
Gian Merlino	3136dfa421	LikeFilter: Read value lazily when doing a prefix-based match. (#3880 ) This speeds up cases where we don't actually need to read the value, such as "LIKE 'foo%'".	2017-01-25 13:22:07 -08:00
Roman Leventov	af93a8d189	Sequences refactorings and removed unused code (part of #3798 ) (#3693 ) * Removing unused code from io.druid.java.util.common.guava package; fix #3563 (more consistent and paranoiac resource handing in Sequences subsystem); Add Sequences.wrap() for DRY in MetricsEmittingQueryRunner, CPUTimeMetricQueryRunner and SpecificSegmentQueryRunner; Catch MissingSegmentsException in SpecificSegmentQueryRunner's yielder.next() method (follow up on #3617) * Make Sequences.withEffect() execute the effect if the wrapped sequence throws exception from close() * Fix strange code in MetricsEmittingQueryRunner * Add comment on why YieldingSequenceBase is used in Sequences.withEffect() * Use Closer in OrderedMergeSequence and MergeSequence to close multiple yielders	2017-01-19 20:07:43 -08:00
kaijianding	33ae9dd485	streaming version of select query (#3307 ) * streaming version of select query * use columns instead of dimensions and metrics;prepare for valueVector;remove granularity * respect query limit within historical * use constant * fix thread name corrupted bug when using jetty qtp thread rather than processing thread while working with SpecificSegmentQueryRunner * add some test for scan query * add scan query document * fix merge conflicts * add compactedList resultFormat, this format is better for json ser/der * respect query timeout * respect query limit on broker * use static consts and remove unused code	2017-01-19 16:09:53 -06:00
Slim	558dc365a4	renaming classes to be run by mvn and comment non operational tests (#3847 )	2017-01-17 11:59:12 -08:00
Akash Dwivedi	dd0c4e2ead	Migrating extendedset from Metamarkets. (#3694 ) * Migrating extendedset from Metamarkets. * Notice change * More details in NOTICE * NOTICE formatting. * suppress header checkstlye for extendedset.	2017-01-17 10:10:27 -08:00
Gian Merlino	e86859b228	SQL support for nested groupBys. (#3806 ) * SQL support for nested groupBys. Allows, for example, doing exact count distinct by writing: SELECT COUNT(*) FROM (SELECT DISTINCT col FROM druid.foo) Contrast with approximate count distinct, which is: SELECT COUNT(DISTINCT col) FROM druid.foo Add deeply-nested groupBy docs, tests, and maxQueryCount config. * Extract magic constants into statics. * Rework rules to put preconditions in the "matches" method.	2017-01-11 18:32:53 -08:00
Jihoon Son	d80bec83cc	Enable auto license checking (#3836 ) * Enable license checking * Clean duplicated license headers	2017-01-10 18:13:47 -08:00
Jihoon Son	c099977a5b	Add an option to SearchQuery to choose a search query execution strategy (#3792 ) * Add an option to SearchQuery to choose a search query execution strategy. Supported strategies are 1) Index-only query execution 2) Cursor-based scan 3) Auto: choose an efficient strategy for a given query * Add SearchStrategy and SearchQueryExecutor * Address comments * Rename strategies and set UseIndexesStrategy as the default strategy * Add a cost-based planner for auto strategy * Add document * Fix code style * apply code style * apply comments	2017-01-10 18:04:20 -08:00
Gian Merlino	3c63cff57a	Remove makeMathExpressionSelector from ColumnSelectorFactory. (#3815 ) * Remove makeMathExpressionSelector from ColumnSelectorFactory. * Add @Nullable annotations in places, fix Number.class check. * Break up createBindings, add tests. * Add null check.	2017-01-05 18:06:38 -08:00
Gian Merlino	220ca7ebb6	Ignore DimFilterHavingSpec testConcurrentUsage. (#3814 )	2017-01-03 17:43:58 -07:00

1681 Commits