druid

Commit Graph

Author	SHA1	Message	Date
Jonathan Wei	fafbc4a80e	Set version to 0.15.0-incubating-SNAPSHOT (#7014 )	2019-02-07 14:02:52 -08:00
Justin Borromeo	6723243ed2	Create Scan Benchmark (#6986 ) * Moved Scan Builder to Druids class and started on Scan Benchmark setup * Need to form queries * It runs. * Remove todos * Change number of benchmark iterations * Changed benchmark params * More param changes * Made Jon's changes and removed TODOs * Broke some long lines into two lines * Decrease segment size for less memory usage * Committing a param change to kick teamcity	2019-02-06 14:45:01 -08:00
Jonathan Wei	8bc5eaa908	Set version to 0.14.0-incubating-SNAPSHOT (#7003 )	2019-02-04 19:36:20 -08:00
Roman Leventov	0e926e8652	Prohibit assigning concurrent maps into Map-typed variables and fields and fix a race condition in CoordinatorRuleManager (#6898 ) * Prohibit assigning concurrent maps into Map-types variables and fields; Fix a race condition in CoordinatorRuleManager; improve logic in DirectDruidClient and ResourcePool * Enforce that if compute(), computeIfAbsent(), computeIfPresent() or merge() is called on a ConcurrentHashMap, it's stored in a ConcurrentHashMap-typed variable, not ConcurrentMap; add comments explaining get()-before-computeIfAbsent() optimization; refactor Counters; fix a race condition in Intialization.java * Remove unnecessary comment * Checkstyle * Fix getFromExtensions() * Add a reference to the comment about guarded computeIfAbsent() optimization; IdentityHashMap optimization * Fix UriCacheGeneratorTest * Workaround issue with MaterializedViewQueryQueryToolChest * Strengthen Appenderator's contract regarding concurrency	2019-02-04 09:18:12 -08:00
Roman Leventov	f7df5fedcc	Add several missing inspectRuntimeShape() calls (#6893 ) * Add several missing inspectRuntimeShape() calls * Add lgK to runtime shapes	2019-01-31 20:04:26 -08:00
Furkan KAMACI	30ec608038	Fix mixed up segment ids at SelectBinaryFnTest.java (#6946 )	2019-01-30 20:04:16 -08:00
Clint Wylie	de810286cd	fix bug with expression virtual column selectors backed by a single long column (#6957 ) * fix issue with SingleLongInputCachingExpressionColumnValueSelector when sql compatible null handling enabled * add test with doubles to show same behavior for floats/doubles that lack the optimization of longs * simplify * fix import	2019-01-30 10:13:07 -05:00
Clint Wylie	a6d81c0d16	Adds bloom filter aggregator to 'druid-bloom-filters' extension (#6397 ) * blooming aggs * partially address review * fix docs * minor test refactor after rebase * use copied bloomkfilter * add ByteBuffer methods to BloomKFilter to allow agg to use in place, simplify some things, more tests * add methods to BloomKFilter to get number of set bits, use in comparator, fixes * more docs * fix * fix style * simplify bloomfilter bytebuffer merge, change methods to allow passing buffer offsets * oof, more fixes * more sane docs example * fix it * do the right thing in the right place * formatting * fix * avoid conflict * typo fixes, faster comparator, docs for comparator behavior * unused imports * use buffer comparator instead of deserializing * striped readwrite lock for buffer agg, null handling comparator, other review changes * style fixes * style * remove sync for now * oops * consistency * inspect runtime shape of selector instead of selector plus, static comparator, add inner exception on serde exception * CardinalityBufferAggregator inspect selectors instead of selectorPluses * fix style * refactor away from using ColumnSelectorPlus and ColumnSelectorStrategyFactory to instead use specialized aggregators for each supported column type, other review comments * adjustment * fix teamcity error? * rename nil aggs to empty, change empty agg constructor signature, add comments * use stringutils base64 stuff to be chill with master * add aggregate combiner, comment	2019-01-29 20:05:17 +07:00
Benedict Jin	72a571fbf7	For performance reasons, use `java.util.Base64` instead of Base64 in Apache Commons Codec and Guava (#6913 ) * * Add few methods about base64 into StringUtils * Use `java.util.Base64` instead of others * Add org.apache.commons.codec.binary.Base64 & com.google.common.io.BaseEncoding into druid-forbidden-apis * Rename encodeBase64String & decodeBase64String * Update druid-forbidden-apis	2019-01-25 17:32:29 -08:00
Himanshu Pandey	e1033bb412	Issue#6892- Replaced Math.random() with ThreadLocalRandom.current().nextDouble() (#6914 ) * Replacing Math.random() with ThreadLocalRandom.current().nextDouble() * Added java.lang.Math#random() in forbidden-apis.txt * Minor change in the message - druid-forbidden-apis.txt	2019-01-25 19:49:20 +08:00
Clint Wylie	66f64cd8bd	fix long/float/double dimension filtering for columns with nulls (#6906 ) * fix long,float, double dimension filtering when sql compatible null handling is enabled and the column has null values * revert unintended change * fix tests	2019-01-23 22:36:52 -08:00
Roman Leventov	8eae26fd4e	Introduce SegmentId class (#6370 ) * Introduce SegmentId class * tmp * Fix SelectQueryRunnerTest * Fix indentation * Fixes * Remove Comparators.inverse() tests * Refinements * Fix tests * Fix more tests * Remove duplicate DataSegmentTest, fixes #6064 * SegmentDescriptor doc * Fix SQLMetadataStorageUpdaterJobHandler * Fix DataSegment deserialization for ignoring id * Add comments * More comments * Address more comments * Fix compilation * Restore segment2 in SystemSchemaTest according to a comment * Fix style * fix testServerSegmentsTable * Fix compilation * Add comments about why SegmentId and SegmentIdWithShardSpec are separate classes * Fix SystemSchemaTest * Fix style * Compare SegmentDescriptor with SegmentId in Javadoc and comments rather than with DataSegment * Remove a link, see https://youtrack.jetbrains.com/issue/IDEA-205164 * Fix compilation	2019-01-21 11:11:10 -08:00
Jihoon Son	cc06e7e2df	Fix fallback to cursor-based plan in UseIndexesStrategy (#6875 ) * Fix fallback to cursor-based plan in UseIndexesStrategy * fix build * add a comment	2019-01-18 10:41:01 +08:00
Jonathan Wei	68f744ec0a	Fixed buckets histogram aggregator (#6638 ) * Fixed buckets histogram aggregator * PR comments * More PR comments * Checkstyle * TeamCity * More TeamCity * PR comment * PR comment * Fix doc formatting	2019-01-17 14:51:16 -08:00
Dayue Gao	5b8a221713	Add SQL id, request logs, and metrics (#6302 ) * use SqlLifecyle to manage sql execution, add sqlId * add sql request logger * fix UT * rename sqlId to sqlQueryId, sql/time to sqlQuery/time, etc * add docs and more sql request logger impls * add UT for http and jdbc * fix forbidden use of com.google.common.base.Charsets * fix UT in QuantileSqlAggregatorTest, supressed unused warning of getSqlQueryId * do not use default method in QueryMetrics interface * capitalize 'sql' everywhere in the non-property parts of the docs * use RequestLogger interface to log sql query * minor bugfixes and add switching request logger * add filePattern configs for FileRequestLogger * address review comments, adjust sql request log format * fix inspection error * try SuppressWarnings("RedundantThrows") to fix inspection error on ComposingRequestLoggerProvider	2019-01-15 23:12:59 -08:00
Charles Allen	5d2947cd52	Use Guava Compatible immediate executor service (#6815 ) * Use multi-guava version friendly direct executor implementation * Don't use a singleton * Fix strict compliation complaints * Copy Guava's DirectExecutor * Fix javadoc * Imports are the devil	2019-01-11 10:42:19 -08:00
Richard Startin	99097617a1	use RoaringBitmapWriter for RoaringBitmap construction (#6764 )	2019-01-08 17:18:41 -08:00
dongyifeng	def823124c	add version comparator for StringComparator (#6745 ) * add version comparator for StringComparator * add more test case and docs	2019-01-08 17:17:03 -08:00
Clint Wylie	9505074530	fix log typo (#6755 ) * fix log typo, add DataSegmentUtils.getIdentifiersString util method * fix indecisive oops	2018-12-18 15:10:25 -08:00
Clint Wylie	486c6f3cf9	emit logs that are only useful for debugging at debug level (#6741 ) * make logs that are only useful for debugging be at debug level so log volume is much more chill * info level messages for total merge buffer allocated/free * more chill compaction logs	2018-12-17 14:20:28 +08:00
Atul Mohan	86e3ae5b48	Add fail message (#6720 )	2018-12-11 08:05:50 -08:00
Gian Merlino	b7709e1245	FileUtils: Sync directory entry too on writeAtomically. (#6677 ) * FileUtils: Sync directory entry too on writeAtomically. See the fsync(2) man page for why this is important: https://linux.die.net/man/2/fsync This also plumbs CompressionUtils's "zip" function through writeAtomically, so the code for handling atomic local filesystem writes is all done in the same place. * Remove unused import. * Avoid FileOutputStream. * Allow non-atomic writes to overwrite. * Add some comments. And no need to flush an unbuffered stream.	2018-12-08 17:12:59 +01:00
Furkan KAMACI	bbb283fa34	Double-checked locking bugs (#6662 ) * Double-checked locking bug is fixed. * @Nullable is removed since there is no need to use along with @MonotonicNonNull. * Static import is removed. * Lazy initialization is implemented. * Local variables used instead of volatile ones. * Local variables used instead of volatile ones.	2018-12-07 17:10:29 +01:00
Jihoon Son	d525e5b18e	Fix travis timeout in BufferHashGrouperTest (#6713 ) * Fix travis timeout in BufferHashGrouperTest * adjust buffer size * adjust bufferSize and loadFactor * increase memory * add debug code * cat error * after script * print logs * print per 2 min * use direct mem * clean up	2018-12-07 12:05:27 +08:00
Atul Mohan	ec36f0b82f	Add default comparison to HavingSpecMetricComparator for custom Aggregator types (#6505 ) * Add default comparison * Switch to BigDecimal comparison * Add comparator from AggFactory * Fix indent * Add tests	2018-12-04 13:35:13 -08:00
Clint Wylie	a1c9d0add2	autosize processing buffers based on direct memory sizing by default (#6588 ) * autosize processing buffers based on direct memory sizing * remove oops, more test * max 1gb autosize buffers, test, start of docs * fix oops * revert accidental change * print buffer size in exception * change the things	2018-12-03 18:40:02 -07:00
Roman Leventov	ec38df7575	Simplify DruidNodeDiscoveryProvider; add DruidNodeDiscovery.Listener.nodeViewInitialized() (#6606 ) * Simplify DruidNodeDiscoveryProvider; add DruidNodeDiscovery.Listener.nodeViewInitialized() method; prohibit and eliminate some suboptimal Java 8 patterns * Fix style * Fix HttpEmitterTest.timeoutEmptyQueue() * Add DruidNodeDiscovery.Listener.nodeViewInitialized() calls in tests * Clarify code	2018-12-01 01:12:56 +01:00
Mingming Qiu	849ba867b2	fix missing property in JsonTypeInfo of SegmentWriteOutMediumFactory (#6656 )	2018-11-27 15:59:58 -08:00
Roman Leventov	887c645675	Find duplicate lines with checkstyle; enable some duplicate inspections in IntelliJ (#6558 ) Not putting this to 0.13 milestone because the found bugs are not critical (one is a harmless DI config duplicate, and another is in a benchmark. Change in `DumpSegment` is just an indentation change.	2018-11-26 16:55:42 +01:00
Roman Leventov	87b96fb1fd	Add checkstyle rules about imports and empty lines between members (#6543 ) * Add checkstyle rules about imports and empty lines between members * Add suppressions * Update Eclipse import order * Add empty line * Fix StatsDEmitter	2018-11-20 12:42:15 +01:00
Gian Merlino	fe69da0d95	Expressions: Fix improper supplier reuse with missing columns. (#6600 ) * Expressions: Fix improper supplier reuse with missing columns. ExpressionSelectors has an optimization that skips building a Map when there is only one input supplier. However, this optimization should not be used in the case where the is one input supplier but more than one input identifier (which can happen when only one input identifier corresponds to an actual column). Fixes #6556. * Add underscores to statics.	2018-11-15 22:13:32 -08:00
David Lim	7b41e23cbb	remove backpressure time from DefaultQueryMetrics pending on-going discussion (#6631 )	2018-11-15 19:29:50 -07:00
Roman Leventov	8f3fe9cd02	Prohibit String.replace() and String.replaceAll(), fix and prohibit some toString()-related redundancies (#6607 ) * Prohibit String.replace() and String.replaceAll(), fix and prohibit some toString()-related redundancies * Fix bug * Replace checkstyle regexp with IntelliJ inspection	2018-11-15 13:21:34 -08:00
Jihoon Son	cdae2fe7b5	Deprecate IntervalChunkingQueryRunner (#6591 ) * Deprecate IntervalChunkingQueryRunner * add doc * deprecate metric * fix doc	2018-11-14 06:33:27 +08:00
Gian Merlino	52f6bdc1eb	Optimization for expressions that hit a single long column. (#6599 ) * Optimization for expressions that hit a single long column. There was previously a single-long-input optimization that applied only to the time column. These have been combined together. Also adds type-specific value caching to ExprEval, which allowed simplifying the SingleLongInputCachingExpressionColumnValueSelector code. * Add more benchmarks. * Don't use LRU cache for __time. * Simplify a bit. * Let the cache grow.	2018-11-13 09:36:32 -08:00
Roman Leventov	54351a5c75	Fix various bugs; Enable more IntelliJ inspections and update error-prone (#6490 ) * Fix various bugs; Enable more IntelliJ inspections and update error-prone * Fix NPE * Fix inspections * Remove unused imports	2018-11-06 14:38:08 -08:00
Roman Leventov	a2a1a1c2c9	Hide NullDimensionSelector from public (#6480 )	2018-11-02 04:38:21 -07:00
QiuMM	676f5e6d7f	Prohibit some guava collection APIs and use JDK collection APIs directly (#6511 ) * Prohibit some guava collection APIs and use JDK APIs directly * reset files that changed by accident * sort codestyle/druid-forbidden-apis.txt alphabetically	2018-10-29 13:02:43 +01:00
Samarth Jain	0a90b3d51a	Remove unused code (#6504 ) * Remove unused code * Remove usage of list in setDimensions and setAggregatorSpecs * Fix formatting to adhere to 120 character guideline	2018-10-26 11:31:10 -07:00
Roman Leventov	84ac18dc1b	Catch some incorrect method parameter or call argument formatting patterns with checkstyle (#6461 ) * Catch some incorrect method parameter or call argument formatting patterns with checkstyle * Fix DiscoveryModule * Inline parameters_and_arguments.txt * Fix a bug in PolyBind * Fix formatting	2018-10-23 07:17:38 -03:00
Samarth Jain	359576a80b	Implement force push down for nested group by query (#5471 ) * Force nested query push down * Code review changes	2018-10-22 13:43:47 -07:00
Roman Leventov	789c9a1dc7	Prohibit using Object\|Long\|Float\|DoubleColumnSelector in instanceof statements (#6470 ) * Prohibit using Object\|Long\|Float\|DoubleColumnSelector in instanceof statements * Doc fixes	2018-10-15 15:41:43 -07:00
robertervin	95ab1ea737	Fix Empty InDimFilter Failure (#6330 ) * fix empty InDimFilter failure (#6101) * Add test case for empty values input * Add documentation for empty values in InDimFilter	2018-10-14 20:43:16 -07:00
Clint Wylie	84598fba3b	combine druid-api, druid-common, java-util into druid-core (#6443 ) * combine druid-api, druid-common, java-util * spacing	2018-10-14 20:37:37 -07:00
dongyifeng	b06ac54a5e	add PrefixFilteredDimensionSpec for multi-value dimensions (#6307 ) * add PrefixFilteredDimensionSpec for multi-value dimensions * add docs for PrefixFilteredDimensionSpec * remove unnecessary null handling * add null check to the result of NullHandling	2018-10-12 17:51:09 -07:00
David Lim	20ab213ba6	change project versions to 0.13.0-incubating-SNAPSHOT (#6453 )	2018-10-11 19:28:01 -07:00
Charles Allen	c55b37d7ec	Add optional `name` to top level of FilteredAggregatorFactory (#6219 ) * Add optional `name` to top level of FilteredAggregatorFactory * Add compat constructor for tests * Address comments * Add equals and hash code updates * Rename test * Fix imports and code style	2018-10-11 11:56:53 -07:00
Clint Wylie	f7775d1db3	fixes for LookupReferencesManagerTest (#6444 ) * some fixes for LookupReferencesManagerTest * docs * formatting * more formatting fixes	2018-10-10 18:02:11 -07:00
Roman Leventov	09126c021a	Remove Aggregator.clone() methods (#6437 ) * Remove Aggregator.clone() methods * Remove CardinalityAggregator.name	2018-10-10 10:07:56 -03:00
QiuMM	0b8085aff7	Prohibit jackson ObjectMapper#reader methods which are deprecated (#6386 ) * Prohibit jackson ObjectMapper#reader methods which are deprecated * address comments	2018-10-03 17:55:20 -03:00
Roman Leventov	3ae563263a	Renamed 'Generic Column' -> 'Numeric Column'; Fixed a few resource leaks in processing; misc refinements (#5957 ) This PR accumulates many refactorings and small improvements that I did while preparing the next change set of https://github.com/druid-io/druid/projects/2. I finally decided to make them a separate PR to minimize the volume of the main PR. Some of the changes: - Renamed confusing "Generic Column" term to "Numeric Column" (what it actually implies) in many class names. - Generified `ComplexMetricExtractor`	2018-10-02 14:50:22 -03:00
Jihoon Son	cb14a43038	Remove ConvertSegmentTask, HadoopConverterTask, and ConvertSegmentBackwardsCompatibleTask (#6393 ) * Remove ConvertSegmentTask, HadoopConverterTask, and ConvertSegmentBackwardsCompatibleTask * update doc and remove auto conversion * remove remaining doc * fix teamcity	2018-10-01 12:03:35 -07:00
Shiv Toolsidass	5a894f830b	Added backpressure metric (#6335 ) * Added backpressure metric * Updated channelReadable to AtomicBoolean and fixed broken test * Moved backpressure metric logic to NettyHttpClient * Fix placement of calculating backPressureDuration	2018-09-29 14:24:04 -07:00
Jihoon Son	f09e718c68	Implement MapVirtualColumn.makeDimensionSelector properly (#6396 ) * Implement MapVirtualColumn.makeDimensionSelector properly * address comments	2018-09-29 14:13:05 -07:00
Jihoon Son	faf3f1e426	Fix cache keys of DefaultDimensionSpec and ExtractionDimensionSpec (#6390 )	2018-09-26 20:08:53 -07:00
Nishant Bangarwa	c9d281a2e9	Add ability to pass in Bloom filter from Hive Queries (#6222 ) * Bloom filter initial implementation fix checkstyle review comments Fix wierd failure review comments Revert "Fix wierd failure" This reverts commit a13a83ad7887e679f6d539191b52aeaaea85b613. * fix test * review comment	2018-09-26 16:04:26 -07:00
Jonathan Wei	00b0a156e9	Tweak isInvalidRows behavior in HadoopTuningConfig (#6339 ) * Tweak isInvalidRows behavior in HadoopTuningConfig * Fix tests	2018-09-24 16:13:13 -07:00
Alexander Saydakov	93345064b5	HllSketch module (#5712 ) * HllSketch module * updated license and imports * updated package name * implemented makeAggregateCombiner() * removed json marks * style fix * added module * removed unnecessary import, side effect of package renaming * use TreadLocalRandom * addressing code review points, mostly formatting and comments * javadoc * natural order with nulls * typo * factored out raw input value extraction * singleton * style fix * style fix * use Collections.singletonList instead of Arrays.asList * suppress warning	2018-09-24 08:41:56 -07:00
Jonathan Wei	609da01882	Fix dictionary ID race condition in IncrementalIndexStorageAdapter (#6340 ) Possibly related to https://github.com/apache/incubator-druid/issues/4937 -------- There is currently a race condition in IncrementalIndexStorageAdapter that can lead to exceptions like the following, when running queries with filters on String dimensions that hit realtime tasks: ``` org.apache.druid.java.util.common.ISE: id[5] >= maxId[5] at org.apache.druid.segment.StringDimensionIndexer$1IndexerDimensionSelector.lookupName(StringDimensionIndexer.java:591) at org.apache.druid.segment.StringDimensionIndexer$1IndexerDimensionSelector$2.matches(StringDimensionIndexer.java:562) at org.apache.druid.segment.incremental.IncrementalIndexStorageAdapter$IncrementalIndexCursor.advance(IncrementalIndexStorageAdapter.java:284) ``` When the `filterMatcher` is created in the constructor of `IncrementalIndexStorageAdapter.IncrementalIndexCursor`, `StringDimensionIndexer.makeDimensionSelector` gets called eventually, which calls: ``` final int maxId = getCardinality(); ... @Override public int getCardinality() { return dimLookup.size(); } ``` So `maxId` is set to the size of the dictionary at the time that the `filterMatcher` is created. However, the `maxRowIndex` which is meant to prevent the Cursor from returning rows that were added after the Cursor was created (see https://github.com/apache/incubator-druid/pull/4049) is set after the `filterMatcher` is created. If rows with new dictionary values are added after the `filterMatcher` is created but before `maxRowIndex` is set, then it is possible for the Cursor to return rows that contain the new values, which will have `id >= maxId`. This PR sets `maxRowIndex` before creating the `filterMatcher` to prevent rows with unknown dictionary IDs from being passed to the `filterMatcher`. ----------- The included test triggers the error with a custom Filter + DruidPredicateFactory. The DimensionSelector for predicate-based filter matching is created here in `Filters.makeValueMatcher`: ``` public static ValueMatcher makeValueMatcher( final ColumnSelectorFactory columnSelectorFactory, final String columnName, final DruidPredicateFactory predicateFactory ) { final ColumnCapabilities capabilities = columnSelectorFactory.getColumnCapabilities(columnName); // This should be folded into the ValueMatcherColumnSelectorStrategy once that can handle LONG typed columns. if (capabilities != null && capabilities.getType() == ValueType.LONG) { return getLongPredicateMatcher( columnSelectorFactory.makeColumnValueSelector(columnName), predicateFactory.makeLongPredicate() ); } final ColumnSelectorPlus<ValueMatcherColumnSelectorStrategy> selector = DimensionHandlerUtils.createColumnSelectorPlus( ValueMatcherColumnSelectorStrategyFactory.instance(), DefaultDimensionSpec.of(columnName), columnSelectorFactory ); return selector.getColumnSelectorStrategy().makeValueMatcher(selector.getSelector(), predicateFactory); } ``` The test Filter adds a row to the IncrementalIndex in the test when the predicateFactory creates a new String predicate, after `DimensionHandlerUtils.createColumnSelectorPlus` is called.	2018-09-18 10:43:29 +04:00
Roman Leventov	0c4bd2b57b	Prohibit some Random usage patterns (#6226 ) * Prohibit Random usage patterns * Fix FlattenJSONBenchmarkUtil	2018-09-14 13:35:51 -07:00
Roman Leventov	d50b69e6d4	Prohibit LinkedList (#6112 ) * Prohibit LinkedList * Fix tests * Fix * Remove unused import	2018-09-13 18:07:06 -07:00
Gian Merlino	d6cbdf86c2	Broker backpressure. (#6313 ) * Broker backpressure. Adds a new property "druid.broker.http.maxQueuedBytes" and a new context parameter "maxQueuedBytes". Both represent a maximum number of bytes queued per query before exerting backpressure on the channel to the data server. Fixes #4933. * Fix query context doc.	2018-09-10 09:33:29 -07:00
Himanshu	d61f708ef5	make COMPLEX column optionally filterable in Druid code (#6223 ) * make COMPLEX column filterable in Druid code * Revert "make COMPLEX column filterable in Druid code" This reverts commit `9fc6ec768c`. * complex columns can be optionally made filterable * some types are always filterable * add ColumnCapabilitiesImpl serde tests * add SuppresedWarnings annotation	2018-09-05 12:28:49 -07:00
Gian Merlino	be6c901114	Like filter: Fix escapes escaping themselves. (#6295 ) Escapes should escape themselves.	2018-09-05 09:29:07 -07:00
Gian Merlino	431d3d8497	Rename io.druid to org.apache.druid. (#6266 ) * Rename io.druid to org.apache.druid. * Fix META-INF files and remove some benchmark results. * MonitorsConfig update for metrics package migration. * Reorder some dimensions in inner queries for some reason. * Fix protobuf tests.	2018-08-30 09:56:26 -07:00
Himanshu	1fae6513e1	add "subtotalsSpec" attribute to groupBy query (#5280 ) * add subtotalsSpec attribute to groupBy query * dont sent subtotalsSpec to downstream nodes from broker and other updates * address review comment * fix checkstyle issues after merge to master * add docs for subtotalsSpec feature * address doc review comments	2018-08-28 17:46:38 -07:00
Dayue Gao	fcf8c8d53c	RowBasedKeySerde should use empty dictionary in constructor (#6256 )	2018-08-28 17:22:18 -07:00
Gian Merlino	4a8b09b6a9	Fix NPE on constant null numeric expressions. (#6232 ) The bug was caused by makeExprEvalSelector returning a null object, which it isn't supposed to do. Fixed this by renaming ConstantColumnValueSelector to ConstantExprEvalSelector (it was only used for ExprEval anyway) and putting logic in that class to make sure the selectors behave as expected.	2018-08-27 15:30:56 -07:00
Gian Merlino	71c1a70ff6	FilteredBufferAggregator: Fix missing relocate, isNull methods. (#6233 )	2018-08-27 15:30:45 -07:00
Gian Merlino	157e75a1fe	Minor followup to #6220 . (#6231 ) Adjustments to comments and usage of generics.	2018-08-27 12:01:44 -05:00
Gian Merlino	cb40b6d369	Fix all inspection errors currently reported. (#6236 ) * Fix all inspection errors currently reported. TeamCity builds on master are reporting inspection errors, possibly because there was a while where it was not running due to the Apache migration, and there was some drift. * Fix one more location. * Fix tests. * Another fix.	2018-08-26 18:36:01 -06:00
Gian Merlino	23ba6f7ad7	Fix four bugs with numeric dimension output types. (#6220 ) * Fix four bugs with numeric dimension output types. This patch includes the following bug fixes: - TopNColumnSelectorStrategyFactory: Cast dimension values to the output type during dimExtractionScanAndAggregate instead of updateDimExtractionResults. This fixes a bug where, for example, grouping on doubles-cast-to-longs would fail to merge two doubles that should have been combined into the same long value. - TopNQueryEngine: Use DimExtractionTopNAlgorithm when treating string columns as numeric dimensions. This fixes a similar bug: grouping on string-cast-to-long would fail to merge two strings that should have been combined. - GroupByQuery: Cast numeric types to the expected output type before comparing them in compareDimsForLimitPushDown. This fixes #6123. - GroupByQueryQueryToolChest: Convert Jackson-deserialized dimension values into the proper output type. This fixes an inconsistency between results that came from cache vs. not-cache: for example, Jackson sometimes deserializes integers as Integers and sometimes as Longs. And the following code-cleanup changes, related to the fixes above: - DimensionHandlerUtils: Introduce convertObjectToType, compareObjectsAsType, and converterFromTypeToType to make it easier to handle casting operations. - TopN in general: Rename various "dimName" variables to "dimValue" where they actually represent dimension values. The old names were confusing. * Remove unused imports.	2018-08-25 14:31:46 -07:00
Himanshu	a76bf9ab2a	add ability to do optional rollup in AggregationTestHelper (#6213 )	2018-08-22 16:38:36 -07:00
Benedict Jin	3647d4c94a	Make time-related variables more readable (#6158 ) * Make time-related variables more readable * Patch some improvements from the code reviewer * Remove unnecessary boxing of Long type variables	2018-08-21 15:29:40 -07:00
Kirill Kozlov	62e580050c	Use JUnit TemporaryFolder rule instead of system temp folder (#6070 ) * Use JUnit TemporaryFolder rule instead of system tmp folder * Allow to forbid apis which present not in all mvn modules	2018-08-16 11:05:45 -07:00
Jihoon Son	ecee3e0a24	Further optimize memory for Travis jobs (#6150 ) * Further optimize memory for Travis jobs * fix build * sudo false	2018-08-10 22:03:36 -07:00
Gian Merlino	3525d4059e	Cache: Add maxEntrySize config, make groupBy cacheable by default. (#5108 ) * Cache: Add maxEntrySize config. The idea is this makes it more feasible to cache query types that can potentially generate large result sets, like groupBy and select, without fear of writing too much to the cache per query. Includes a refactor of cache population code in CachingQueryRunner and CachingClusteredClient, such that they now use the same CachePopulator interface with two implementations: one for foreground and one for background. The main reason for splitting the foreground / background impls is that the foreground impl can have a more effective implementation of maxEntrySize. It can stop retaining subvalues for the cache early. * Add CachePopulatorStats. * Fix whitespace. * Fix docs. * Fix various tests. * Add tests. * Fix tests. * Better tests * Remove conflict markers. * Fix licenses.	2018-08-07 10:23:15 -07:00
Clint Wylie	62677212cc	Order rows during incremental index persist when rollup is disabled. (#6107 ) * order using IncrementalIndexRowComparator at persist time when rollup is disabled, allowing increased effectiveness of dimension compression, resolves #6066 * fix stuff from review	2018-08-06 14:17:48 -07:00
Nishant Bangarwa	75c8a87ce1	Part 2 of changes for SQL Compatible Null Handling (#5958 ) * Part 2 of changes for SQL Compatible Null Handling * Review comments - break lines longer than 120 characters * review comments * review comments * fix license * fix test failure * fix CalciteQueryTest failure * Null Handling - Review comments * review comments * review comments * fix checkstyle * fix checkstyle * remove unrelated change * fix test failure * fix failing test * fix travis failures * Make StringLast and StringFirst aggregators nullable and fix travis failures	2018-08-02 08:20:25 -07:00
Jonathan Wei	b9c445c780	Optimize filtered aggs with interval filters in per-segment queries (#5857 ) * Optimize per-segment queries * Always optimize, add unit test * PR comments * Only run IntervalDimFilter optimization on __time column * PR comments * Checkstyle fix * Add test for non __time column	2018-08-01 14:39:38 -07:00
Andrés Gómez	e270362767	Add stringLast and stringFirst aggregators extension (#5789 ) * Add lastString and firstString aggregators extension * Remove duplicated class * Move first-last-string doc page to extensions-contrib * Fix ObjectStrategy compare method * Fix doc bad aggregatos type name * Create FoldingAggregatorFactory classes to fix SegmentMetadataQuery * Add getMaxStringBytes() method to support JSON serialization * Fix null pointer exception at segment creation phase when the string value is null * Control the valueSelector object class on BufferAggregators * Perform all improvements * Add java doc on SerializablePairLongStringSerde * Refactor ObjectStraty compare method * Remove unused ; * Add aggregateCombiner unit tests. Rename BufferAggregators unit tests * Remove unused imports * Add license header * Add class name to java doc class serde * Throw exception if value is unsupported class type * Move first-last-string extension into druid core * Update druid core docs * Fix null pointer exception when pair->string is null * Add null control unit tests * Remove unused imports * Add first/last string folding aggregator on AggregatorsModule to support segment metadata query * Change SerializablePairLongString to extend SerializablePair * Change vars from public to private * Convert vars to primitive type * Clarify compare comment * Change IllegalStateException to ISE * Remove TODO comments * Control possible null pointer exception * Add @Nullable annotation * Remove empty line * Remove unused parameter type * Improve AggregatorCombiner javadocs * Add filterNullValues option at StringLast and StringFirst aggregators * Add filterNullValues option at agg documentation * Fix checkstyle * Update header license * Fix StringFirstAggregatorFactory.VALUE_COMPARATOR * Fix StringFirstAggregatorCombiner * Fix if condition at StringFirstAggregateCombiner * Remove filterNullValues from string first/last aggregators * Add isReset flag in FirstAggregatorCombiner * Change Arrays.asList to Collections.singletonList	2018-08-01 10:52:54 -07:00
Roman Leventov	0754d78a2e	Prohibit Lists.newArrayList() with a single argument (#6068 ) * Prohibit Lists.newArrayList() with a single argument * Test fixes * Add Javadoc to Node constructor	2018-07-31 20:09:10 -07:00
Clint Wylie	20ae8aa626	Fix 'auto' encoded longs + compression serializer (#6045 ) * Fix 'auto' encoded longs + compression serializer Fixes #6044 changes: * Fixes `VSizeLongSerde` serializers to treat 'close' as 'flush' when used with `BlockLayoutColumnarLongsSerializer`, allowing unwritten values to be flushed to the buffer when the block is compressed * Add exhaustive unit test that flexes a variety of value sizes, row counts, and compression strategies to catch issues such as these : * refactor LongSerializer close to be named flush instead * revert and just make new serializers per block	2018-07-30 18:35:20 -07:00
Roman Leventov	f3595c93d9	Fix a bug in GroupByQueryEngine (#6062 )	2018-07-30 14:39:38 -07:00
Gian Merlino	c57f4a5db0	FinalizingFieldAccessPostAggregator: Fix serde. (#6067 ) Fixes #6063.	2018-07-28 08:44:22 -07:00
Benedict Jin	331a0afb98	Remove redundant type parameters and enforce some other style and inspection rules (#5980 ) * Various changes about druid-services module * Patch improvements from reviewer * Add ToArrayCallWithZeroLengthArrayArgument & ArraysAsListWithZeroOrOneArgument into inspection profile * Fix ArraysAsListWithZeroOrOneArgument * Fix conflict * Fix ToArrayCallWithZeroLengthArrayArgument * Fix AliEqualsAvoidNull * Remove blank line * Remove unused import clauses * Fix code style in TopNQueryRunnerTest * Fix conflict * Don't use Collections.singletonList when converting the type of array type * Add argLine into maven-surefire-plugin in druid-process module & increase the timeout value for testMoveSegment testcase * Roll back the latest commit * Add java.io.File#toURL() into druid-forbidden-apis * Using Boolean.parseBoolean instead of Boolean.valueOf for CliCoordinator#isOverlord * Add a new regexp element into stylecode xml file * Fix style error for new regexp * Set the level of ArraysAsListWithZeroOrOneArgument as WARNING * Fix style error for new regexp * Add option BY_LEVEL for ToArrayCallWithZeroLengthArrayArgument in inspection profile * Roll back the level as ToArrayCallWithZeroLengthArrayArgument as ERROR * Add toArray(new Object[0]) regexp into checkstyle config file & fix them * Set the level of ArraysAsListWithZeroOrOneArgument as ERROR & Roll back the level of ToArrayCallWithZeroLengthArrayArgument as WARNING until Youtrack fix it * Add a comment for string equals regexp in checkstyle config * Fix code format * Add RedundantTypeArguments as ERROR level inspection * Fix cannot resolve symbol datasource	2018-07-27 16:56:49 -05:00
kaijianding	7919e4d5df	move rangeSet compare into shardspec (#5688 )	2018-07-26 14:17:57 -07:00
Gian Merlino	04ea3c9f8c	Update license headers. (#5976 ) * Update license headers. For compliance with http://www.apache.org/legal/src-headers.html. * More license adjustments. * Fix mistakenly edited package line.	2018-07-11 09:55:18 -07:00
Gian Merlino	948e73da77	Extend various test timeouts. (#5978 ) False failures on Travis due to spurious timeout (in turn due to noisy neighbors) is a bigger problem than legitimate failures taking too long to time out. So it makes sense to extend timeouts.	2018-07-10 13:02:14 -07:00
Benedict Jin	b3021ec802	Fix bug in SegmentAnalyzer.analyzeComplexColumn() #5939 (#5954 )	2018-07-09 15:36:16 -07:00
Surekha	441c9819d9	Support limit for timeseries query (#5894 ) (#5931 ) * Support limit for timeseries query (#5894) * Fix tests * Address PR comments * Try to fix teamcity inspection checks * Remove unused method from VirtualColumns * Remove unused import statement	2018-07-09 08:58:42 -07:00
Jihoon Son	10a01d6846	[SQL] Fix missing postAggregations for Timeseries and TopN (#5912 ) * [SQL] Fix missing postAggregations for Timeseries and TopN * fix build * fix test	2018-06-29 10:36:55 -07:00
Jonathan Wei	f3e1520360	Fix merge for TrueDimFilter (#5916 ) * Fix merge for TrueDimFilter * remove unused cache ID	2018-06-28 14:46:47 -07:00
scrawfor	bf2a31a5bc	Add new 'true' filter which always returns true. (#5711 ) * Add new 'true' filter which always returns true. * Add support for bitmap index. * Adds documentation. * Removes No-op Filter	2018-06-28 11:52:45 -07:00
zhangxinyu	d857345b7d	add method getRequiredColumns for DimFilter (#5872 ) * add method getRequiredColumns for DimFilter * deal with the NullPointerException when DimFilter is null	2018-06-27 15:45:46 -07:00
陈春斌	7649742943	Use ReentrantReadWriteLock in DimensionDictionary (#5883 )	2018-06-25 12:35:26 -07:00
Nishant Bangarwa	1c031784cb	Align long Aggregator implementation with Double and Float (#5861 ) Add LongMin/Max aggregator combiners Extract common code from LongSum/Min/MaxAggregatorFactories in SimpleLongAggregatorFactory	2018-06-14 01:56:41 +04:00
Gian Merlino	3af95913a9	Lazy-ify IncrementalIndex filtering too. (#5852 ) * Lazy-ify IncrementalIndex filtering too. Follow-up to #5403, which only lazy-ified cursor-based filtering on QueryableIndex. * Fix logic error.	2018-06-06 18:03:34 -07:00
Gian Merlino	78fd27cdb2	Lazy-ify ValueMatcher BitSet optimization for string dimensions. (#5403 ) * Lazy-ify ValueMatcher BitSet optimization for string dimensions. The idea is that if the prior evaluated filters are decently selective, such that they mean we won't see all possible values of the later filters, then the eager version of the optimization is too wasteful. This involves checking an extra bitset, but the overhead is small even if the lazy-ification is useless. * Remove import. * Minor transformation	2018-06-05 09:06:51 -07:00
Clint Wylie	2b45a6a42d	Fix topN lexicographic sort (#5815 ) * fixes #5814 changes: * pass `StorageAdapter` to topn algorithms to get things like if column is 'sorted' or if query interval is smaller than segment granularity, instead of using `io.druid.segment.Capabilities` * remove `io.druid.segment.Capabilities` since it had one purpose, supplying `dimensionValuesSorted` which is now provided directly by `StorageAdapter`. * added test for topn optimization path checking * add Capabilities back since StorageAdapter is marked PublicApi * oops * add javadoc, fix build i think * correctly revert api changes * fix intellij fail * fix typo :(	2018-05-31 09:53:29 -07:00
Jihoon Son	9dca5ec76b	Simple cleanup for ThreadPoolTaskRunner and SetAndVerifyContextQueryRunner / Add ThreadPoolTaskRunnerTest (#5557 ) * Simple fix for ThreadPoolTaskRunner * fix build * address comments * update javadoc * fix build * fix test * add dependency	2018-05-15 22:53:11 +05:30
Alexander Saydakov	15864434be	ArrayOfDoublesSketch module (#5148 ) * ArrayOfDoublesSketch module * UTF-8 fix * javadoc, style fixes * more style fixes * null key selector fix * more style fixes * removed @Override, strict compiler doesn't like it * removed @Override, strict compiler doesn't like it * IndexedInts is not autoclosable? removed one more @0verride * synchronized with upstream master * removed unused imports * addressed review points * null fix * addressed review points * IAE from druid package * synchronized aggregate() and get() * use locks per buffer position * corrected javadoc * style fixes * added lock and narrowed the scope * addressed review comments * conflict resolution went wrong * addressed review comments * javadoc * javadoc links * fully qualified name since there is no import for this class * addressed review points * style fix * StandardCharsets.UTF_8 * addressed review points * added @Override * added equals and hashCode tests for post aggs * formatting * suppress warnings * optimal IndexedInts iteration * suppress SelfEquals * added comments about getClass() in equals()	2018-05-13 15:48:00 +03:00
Dylan Wylie	e8caf02147	Revert "Use a bimap for reverse lookups on injective maps" (#5764 ) * Revert "Consider waiting and pending compaction tasks as well as running tasks in DruidCoordinatorSegmentCompactor (#5704)" This reverts commit `c7a59394e0`. * Revert "Fix metrics for inserting segments (#5749)" This reverts commit `c9d645103b`. * Revert "Typo fix in historical doc (#5753)" This reverts commit `aa23fe6386`. * Revert "Use a bimap for reverse lookups on injective maps (#5681)" This reverts commit `e1277d306c`.	2018-05-09 19:12:36 -07:00
Surekha	2f8904e25f	Check against the real default of maxBytes(1/6 max mem) in AppenderatorImpl's add (#5758 ) * The check for maxBytesInMemory should be >= 0 instead of > 0 * if the default value is 0, the actual check could be skipped * fix the message for persistReasons * Address PR comments * if maxBytes set -1, make is Long.MAX_VAL, so we do not need to check if it's 0 or -1 * set the maxBytesTuningconfig in AppenderatorImpl constructor to avoid duplicate code * fix the failing test cases * Address PR comments	2018-05-09 13:41:51 -07:00
Dylan Wylie	e1277d306c	Use a bimap for reverse lookups on injective maps (#5681 ) * Use a bimap for reverse lookups on injective maps - A BiMap provides constant-time lookups for mapping values to keys * Address comments * Fix Tests	2018-05-07 18:46:21 -07:00
Fokko Driesprong	a95ec92296	Move to the org.lz4 dependency (#5746 ) The net.jpountz.lz4 moved to org.lz4	2018-05-07 08:16:45 -07:00
kaijianding	c12c16385e	support throw duplcate row during realtime ingestion in RealtimePlumber (#5693 )	2018-05-04 10:12:25 -07:00
Surekha	13c616ba24	'maxBytesInMemory' tuningConfig introduced for ingestion tasks (#5583 ) * This commit introduces a new tuning config called 'maxBytesInMemory' for ingestion tasks Currently a config called 'maxRowsInMemory' is present which affects how much memory gets used for indexing.If this value is not optimal for your JVM heap size, it could lead to OutOfMemoryError sometimes. A lower value will lead to frequent persists which might be bad for query performance and a higher value will limit number of persists but require more jvm heap space and could lead to OOM. 'maxBytesInMemory' is an attempt to solve this problem. It limits the total number of bytes kept in memory before persisting. * The default value is 1/3(Runtime.maxMemory()) * To maintain the current behaviour set 'maxBytesInMemory' to -1 * If both 'maxRowsInMemory' and 'maxBytesInMemory' are present, both of them will be respected i.e. the first one to go above threshold will trigger persist * Fix check style and remove a comment * Add overlord unsecured paths to coordinator when using combined service (#5579) * Add overlord unsecured paths to coordinator when using combined service * PR comment * More error reporting and stats for ingestion tasks (#5418) * Add more indexing task status and error reporting * PR comments, add support in AppenderatorDriverRealtimeIndexTask * Use TaskReport instead of metrics/context * Fix tests * Use TaskReport uploads * Refactor fire department metrics retrieval * Refactor input row serde in hadoop task * Refactor hadoop task loader names * Truncate error message in TaskStatus, add errorMsg to task report * PR comments * Allow getDomain to return disjointed intervals (#5570) * Allow getDomain to return disjointed intervals * Indentation issues * Adding feature thetaSketchConstant to do some set operation in PostAgg (#5551) * Adding feature thetaSketchConstant to do some set operation in PostAggregator * Updated review comments for PR #5551 - Adding thetaSketchConstant * Fixed CI build issue * Updated review comments 2 for PR #5551 - Adding thetaSketchConstant * Fix taskDuration docs for KafkaIndexingService (#5572) * With incremental handoff the changed line is no longer true. * Add doc for automatic pendingSegments (#5565) * Add missing doc for automatic pendingSegments * address comments * Fix indexTask to respect forceExtendableShardSpecs (#5509) * Fix indexTask to respect forceExtendableShardSpecs * add comments * Deprecate spark2 profile in pom.xml (#5581) Deprecated due to https://github.com/druid-io/druid/pull/5382 * CompressionUtils: Add support for decompressing xz, bz2, zip. (#5586) Also switch various firehoses to the new method. Fixes #5585. * This commit introduces a new tuning config called 'maxBytesInMemory' for ingestion tasks Currently a config called 'maxRowsInMemory' is present which affects how much memory gets used for indexing.If this value is not optimal for your JVM heap size, it could lead to OutOfMemoryError sometimes. A lower value will lead to frequent persists which might be bad for query performance and a higher value will limit number of persists but require more jvm heap space and could lead to OOM. 'maxBytesInMemory' is an attempt to solve this problem. It limits the total number of bytes kept in memory before persisting. * The default value is 1/3(Runtime.maxMemory()) * To maintain the current behaviour set 'maxBytesInMemory' to -1 * If both 'maxRowsInMemory' and 'maxBytesInMemory' are present, both of them will be respected i.e. the first one to go above threshold will trigger persist * Address code review comments * Fix the coding style according to druid conventions * Add more javadocs * Rename some variables/methods * Other minor issues * Address more code review comments * Some refactoring to put defaults in IndexTaskUtils * Added check for maxBytesInMemory in AppenderatorImpl * Decrement bytes in abandonSegment * Test unit test for multiple sinks in single appenderator * Fix some merge conflicts after rebase * Fix some style checks * Merge conflicts * Fix failing tests Add back check for 0 maxBytesInMemory in OnHeapIncrementalIndex * Address PR comments * Put defaults for maxRows and maxBytes in TuningConfig * Change/add javadocs * Refactoring and renaming some variables/methods * Fix TeamCity inspection warnings * Added maxBytesInMemory config to HadoopTuningConfig * Updated the docs and examples * Added maxBytesInMemory config in docs * Removed references to maxRowsInMemory under tuningConfig in examples * Set maxBytesInMemory to 0 until used Set the maxBytesInMemory to 0 if user does not set it as part of tuningConfing and set to part of max jvm memory when ingestion task starts * Update toString in KafkaSupervisorTuningConfig * Use correct maxBytesInMemory value in AppenderatorImpl * Update DEFAULT_MAX_BYTES_IN_MEMORY to 1/6 max jvm memory Experimenting with various defaults, 1/3 jvm memory causes OOM * Update docs to correct maxBytesInMemory default value * Minor to rename and add comment * Add more details in docs * Address new PR comments * Address PR comments * Fix spelling typo	2018-05-03 16:25:58 -07:00
Jihoon Son	86746f82d8	Use mergeBuffer instead of processingBuffer in parallelCombiner (#5634 ) * Use mergeBuffer instead of processingBuffer in parallelCombiner * Fix test * address comments * fix test * Fix test * Update comment * address comments * fix build * Fix test failure	2018-04-27 18:14:37 -07:00
Roman Leventov	9be000758d	Refactor index merging, replace Rowboats with RowIterators and RowPointers (#5335 ) * Refactor index merging, replace Rowboats with RowIterators and RowPointers * Add javadocs * Fix a bug in QueryableIndexIndexableAdapter * Fixes * Remove unused declarations * Remove unused GenericColumn.isNull() method * Fix test * Address comments * Rearrange some code in MergingRowIterator for more clarity * Self-review * Fix style * Improve docs * Fix docs * Rename IndexMergerV9.writeDimValueAndSetupDimConversion to setUpDimConversion() * Update Javadocs * Minor fixes * Doc fixes, more code comments, cleanup of RowCombiningTimeAndDimsIterator * Fix doc link	2018-04-27 17:34:32 -07:00
Slim Bouguerra	73da7426da	Timeseries results are incoherent for case interval is out of range and case false filter. (#5649 ) * adding some tests Change-Id: I92180498e2e6695212b286d980e349c136c78c86 * added empty sequence runner Change-Id: I20c83095072bbf3b4a3a57dfc1934d528e2c7a1a * treat only granularity ALL Change-Id: I1d88fab500c615bc46db4f4497ce93089976441f * moving toList within If and add expected queries Change-Id: I56cdd980e44f0685806efb45e29031fa2e328ec4 * typo Change-Id: I42fdd28da5471f6ae57d3962f671741b106300cd * adding tests and fix logic of intervals Change-Id: I0bd414d2278e3eddc2810e4f5080e6cf6a117f12 * fix style Change-Id: I99a2380934c9ab350ca934c56041dc343c08b99f * comments review Change-Id: I726a3b905a9520d8b1db70e4ba17853c65c414a4	2018-04-23 15:55:18 -07:00
Roman Leventov	a3a9ada843	Add GenericWhitespace checkstyle check (#5668 )	2018-04-24 01:09:14 +05:30
scrawfor	15f4ab2b31	Expose noop filter to users (#5597 )	2018-04-18 07:57:07 -07:00
Gian Merlino	5d09f76df6	topN: Fix caching of Float dimension values. (#5653 ) Jackson would deserialize them as Doubles, leading to ClassCastExceptions in the topN processing pipeline when it attempted to treat them as Floats.	2018-04-17 15:35:18 -07:00
Gian Merlino	fbf3fc178e	Timeseries: Add "grandTotal" option. (#5640 ) * Timeseries: Add "grandTotal" option. * Modify whitespace. * Checkstyle workaround.	2018-04-16 18:22:19 -07:00
Jihoon Son	f349e03091	Fix NPE in compactionTask (#5613 ) * Fix NPE in compactionTask * more annotations for metadata * better error message for empty input * fix build * revert some null checks * address comments	2018-04-13 00:11:03 -04:00
Gian Merlino	72d6dcda4f	ParallelCombiner: Fix buffer leak on exception in "combine". (#5630 ) Once a buffer is acquired, we need to make sure to release it if an exception is thrown before the closeable iterator is created.	2018-04-11 20:39:39 -04:00
Senthil Kumar L S	371c672828	Adding feature thetaSketchConstant to do some set operation in PostAgg (#5551 ) * Adding feature thetaSketchConstant to do some set operation in PostAggregator * Updated review comments for PR #5551 - Adding thetaSketchConstant * Fixed CI build issue * Updated review comments 2 for PR #5551 - Adding thetaSketchConstant	2018-04-05 22:56:59 -07:00
Niketh Sabbineni	270fd1ea15	Allow getDomain to return disjointed intervals (#5570 ) * Allow getDomain to return disjointed intervals * Indentation issues	2018-04-05 22:12:30 -07:00
Jonathan Wei	969342cd28	More error reporting and stats for ingestion tasks (#5418 ) * Add more indexing task status and error reporting * PR comments, add support in AppenderatorDriverRealtimeIndexTask * Use TaskReport instead of metrics/context * Fix tests * Use TaskReport uploads * Refactor fire department metrics retrieval * Refactor input row serde in hadoop task * Refactor hadoop task loader names * Truncate error message in TaskStatus, add errorMsg to task report * PR comments	2018-04-05 21:38:57 -07:00
Kirill Kozlov	8878a7ff94	Replace guava Charsets with native java StandardCharsets (#5545 )	2018-03-28 21:00:08 -07:00
Niketh Sabbineni	912adcc284	ArrayAggregation: Use long to avoid overflow (#5544 ) * ArrayAggregation: Use long to avoid overflow * Add Tests	2018-03-28 16:37:53 -07:00
Jihoon Son	024e0a9cca	Respect forceHashAggregation in queryContext (#5533 ) * Respect forceHashAggregation in queryContext * address comment	2018-03-28 14:15:38 -07:00
Atul Mohan	ec17a44e09	Add result level caching to Brokers (#5028 ) * Add result level caching to Brokers * Minor doc changes * Simplify sequences * Move etag execution * Modify cacheLimit criteria * Fix incorrect etag computation * Fix docs * Add separate query runner for result level caching * Update docs * Add post aggregated results to result level cache * Fix indents * Check byte size for exceeding cache limit * Fix indents * Fix indents * Add flag for result caching * Remove logs * Make cache object generation synchronous * Avoid saving intermediate cache results to list * Fix changes that handle etag based response * Release bytestream after use * Address PR comments * Discard resultcache stream after use * Fix docs * Address comments * Add comment about fluent workflow issue	2018-03-23 19:11:52 -07:00
Jihoon Son	1ad898bde2	Use the official aws-sdk instead of jet3t (#5382 ) * Use the official aws-sdk instead of jet3t * fix compile and serde tests * address comments and fix test * add http version string * remove redundant dependencies, fix potential NPE, and fix test * resolve TODOs * fix build * downgrade jackson version to 2.6.7 * fix test * resolve the last TODO * support proxy and endpoint configurations * fix build * remove debugging log * downgrade hadoop version to 2.8.3 * fix tests * remove unused log * fix it test * revert KerberosAuthenticator change * change hadoop-aws scope to provided in hdfs-storage * address comments * address comments	2018-03-21 15:36:54 -07:00
Clint Wylie	885b975c95	fix LongsColumnWithNulls and FloatsColumnWithNulls to override isNull in order to actually use nullValueBitmap (#5510 )	2018-03-20 16:04:08 -07:00
Charles Allen	58f110f7f8	Future-proof some Guava usage (#5414 ) * Future-proof some Guava usage * Use a java-util EmptyIterator instead of Guava's * Change some of the guava future handling to do manual async transforms. Guava changes transform into transformAsync by deprecating transform in ONLY Guava 19. Then its gone in 20 * Use `Collections.emptyIterator()` * Pretty formatting * Make listenable future transforms a thing in default druid * Format fix * Add forbidden guava apis * Make the ListenableFutrues.transformAsync have comments * Undo intellij bad pattern matching in comments * Futrues --> Futures * Add empty iterators forbidding * Fix extra `A` * Correct method signature * Address review comments * Finish Gian review comments * Proper syntax from https://github.com/policeman-tools/forbidden-apis/wiki/SignaturesSyntax	2018-03-20 08:59:33 -07:00
Slim	17c71a2a60	Make Doubles aggregators use 64bits by default (#5478 ) * use 64-bit float representation for double based aggregator Change-Id: Ia4f442037052add178f6ac68138c9d52f96c6e09 * review comments Change-Id: I5a588f7364f236bf22f2b138e9d743bfb27c67fe	2018-03-19 19:13:04 -07:00
Roman Leventov	693e3575f9	Remove unused code and exception declarations (#5461 ) * Remove unused code and exception declarations * Address comments * Remove redundant Exception declarations * Make FirehoseFactoryV2.connect() to throw IOException again	2018-03-16 22:11:12 +01:00
Samarth Jain	afa25202a3	Segment filtering should be done by looking at the inner most query o… (#5496 ) * Segment filtering should be done by looking at the inner most query of a nested query * Fixing checkstyle errors * Addressing code review comments	2018-03-16 14:05:14 -07:00
Gian Merlino	a08efe4683	Fix round robining in router. (#5500 ) * Fix round robining in router. Say that ten times fast. For query endpoints, AsyncQueryForwardingServlet called hostFinder.getDefaultServer() to set a default server, followed by hostFinder.getServer(inputQuery) to override it with query-specific routing. Since hostFinder is round-robin, this skips a server. When there are only two servers, one server is _always_ skipped and the router sends all queries to the same broker. * Adjust spacing.	2018-03-15 18:45:59 -07:00
Gian Merlino	16b81fcd53	SegmentMetadataQuery: Fix default interval handling. (#5489 ) * SegmentMetadataQuery: Fix default interval handling. PR #4131 introduced a new copy builder for segmentMetadata that did not retain the value of usingDefaultInterval. This led to it being dropped and the default-interval handling not working as expected. Instead of using the default 1 week history when intervals are not provided, the segmentMetadata query would query _all_ segments, incurring an unexpected performance hit. This patch fixes the bug and adds a test for the copy builder. * Intervals	2018-03-15 10:05:46 -07:00
Niketh Sabbineni	40cc2c8740	Query should not fail because emitter fails or throws Exception (#5484 )	2018-03-13 19:57:05 -07:00
Roman Leventov	6b158abe3f	Enforce optimal IndexedInts iteration (#5456 ) * Enforce optimal IndexedInts iteration * Fix remaining suboptimal usages	2018-03-09 09:42:40 -08:00
Niraja Mishra	ba3dbf2a42	Fixed NPE when dimension is null or empty. https://github.com/druid-io/druid/issues/3007 (#5299 )	2018-03-05 16:27:35 -08:00
Gian Merlino	7416d1d02d	Add "joda" option to timeFormat extractionFn. (#5448 )	2018-03-02 19:59:26 -08:00
Jonathan Wei	cf5f74b013	Fix GroupBy limit push down descending sorting on numeric columns (#5453 )	2018-03-01 18:43:45 -08:00
Gian Merlino	e4eaee3806	Support for disabling bitmap indexes. (#5402 ) * Support for disabling bitmap indexes. Can save space for columns where bitmap indexes are pointless (like free-form text). * Remove import. * Fix CompactionTaskTest. * Update for review comments. * Review comments, tests. * Fix test.	2018-02-28 19:19:56 -08:00
Niraja Mishra	0f009a41e1	Fixed PeriodGranularity for Asia pacific timezones (#5410 )	2018-02-27 10:39:50 -08:00
Nishant Bangarwa	219e77aeac	SQL compatible Null Handling Part - Expressions and Storage Changes (#5278 ) * SQL compatible Null Handling Part - Expressions, Storage and Dimension Selector Changes fix travis strict compilation * fix teamcity error - remove unused method * review comments * review comments * more comments * review comments * review comments * Optimize isNull method * Optimize isNull in ColumnarFloats/Longs/Doubles * review comment - separate classes for null and non-null columns fix intellij inspection * remove unused import * More Review comments * improve comment * More review comments * fix checkstyle * more review comments * review comments. fix javadoc links remove Nullable from ConstantColumnValueSelector * review comments. * satisfy teamcity inspections	2018-02-21 13:27:26 +01:00
Jihoon Son	deeda0dff2	Fix DefaultLimitSpec to respect sortByDimsFirst (#5385 ) * Fix DefaultLimitSpec to respect sortByDimsFirst * fix bug * address comment	2018-02-16 15:26:32 -08:00
Roman Leventov	e64ffb10c2	Standartize on using Integer.BYTES instead of Ints.BYTES from Guava, same for other primitives (#5366 )	2018-02-07 13:24:30 -08:00
Gian Merlino	971d45ab3f	Use a separate snapshot file per lookup tier. (#5358 ) Prevents conflicts if two processes on the same machine use the same lookup snapshot directory but are in different tiers.	2018-02-07 11:28:53 -08:00
Gian Merlino	e255d66b85	Fix two improper casts in HavingSpecMetricComparator. (#5352 ) * Fix two improper casts in HavingSpecMetricComparator. Fixes two things: 1. An improper double-to-long cast when comparing double metrics to any kind of value, which was a regression from #4883. 2. An improper double-to-long cast when comparing a long/int metric to a double/float value: the value was cast to long/int, drawing strange conclusions like int 100 matching a havingSpec of equalTo(100.5). * Add comments. * Remove extraneous comment. * Simplify code a bit.	2018-02-06 13:18:55 -08:00
Gian Merlino	c21ff6e81c	Properly set "identity" in query metrics. (#5330 ) * Properly set "identity" in query metrics. This patch adds an "identity" field to QueryPlus and sets it in QueryLifecycle when the query starts executing. This is important because it allows it to be used for future QueryMetrics created by that QueryPlus object. We also add "identity" to the request-level QueryMetrics object created in emitLogsAndMetrics. * Remove unused method.	2018-02-06 10:53:00 -08:00
Gian Merlino	8c738c7076	Fix races in LookupSnapshotTaker, CoordinatorPollingBasicAuthenticatorCacheManager (#5344 ) * Fix races in LookupSnapshotTaker, CoordinatorPollingBasicAuthenticatorCacheManager. Both were susceptible to the following conditions: 1. Two JVMs on the same machine (perhaps two peons) could conflict by one reading while the other was writing, or by writing to the file at the same time. 2. One JVM could partially write a file, then crash, leaving a truncated file. * Use StringUtils.format	2018-02-06 09:44:06 -08:00
Slim	37c09ce3f8	Use both Joad Ids and Java IDs as Timezone to string readers (#5349 ) * Use both Joad Ids and Java IDs as Timezone to string readers Change-Id: Ieb5c18559879f3f3a0104912ce2f0a354ad0aac3 * move the function to DateTimes and add org.joda.time.DateTimeZone#forID as part of forbidden api Change-Id: Iff97fa044758019ed0c231587d10e31a9cc18da0 * exclude class and remove other usage Change-Id: Ib458c2caaa1865535767e1009fbf017a92c8f615 * remove it from test classes Change-Id: I9b576324f6c7e17a74bd8b13879232c9a8cd40b4 * remove unused Change-Id: If1c5b70c26c2b7c83c20434cb72b2060653f5052	2018-02-06 16:34:11 +05:30
Gian Merlino	9a62b02cb7	Extensions: Option to load classes from extension jars first. (#5321 ) The behavior is configurable through druid.extensions.useExtensionClassloaderFirst. It is useful when extensions want to load a dependency different from one provided by Druid, for example a different version of geoip or protobuf.	2018-02-06 16:14:03 +05:30
Jonathan Wei	285dedd126	More ParseException handling for numeric dimensions (#5312 ) * Discard rows with unparseable numeric dimensions * PR comments * Don't throw away entire row on parse exception * PR comments * Fix import	2018-02-05 21:43:35 -08:00
Gian Merlino	7e02408510	Update versions to 0.13.0-SNAPSHOT. (#5323 )	2018-02-02 12:06:38 -06:00
Himanshu	4cd47de62f	add LookupExtractorFactory.destroy() method (#5287 ) * add LookupExtractorFactory.destroy() method * fix LookupReferencesManagerTest	2018-02-01 22:56:09 -08:00
Gian Merlino	ed47a1e1a9	Lookups: Inherit "injective" from registered lookups, improve docs. (#5316 ) Code changes: - In the lookup-based extractionFns, inherit injective property from the lookup itself if not specified. Doc changes: - Add a "Query execution" section to the lookups doc explaining how injective lookups and their optimizations work. - Remove scary warnings against using registeredLookup extractionFns. They are necessary and important since they work with filters and function cascades -- two things that the dimension specs do not do. They deserve to be first class citizens. - Move the "registeredLookup" fn above the "lookup" fn. It's probably more commonly used, so the docs read better this way.	2018-02-01 18:30:19 -08:00
Jonathan Wei	80419752b5	Add metamx emitter, http clients, and metrics packages to druid java-util (#5289 ) * Add metamx java-util emitter, http clients, and metrics packages to druid java-util * Remove metamx java-util from pom.xml files * Checkstyle fixes * Import fix * TeamCity inspection fixes * Use slf4j, move some version defs to master pom.xml * Use parent jvm-attach-api and maven-surefire-plugin versions * Add ] to log msg, suppress inspection	2018-01-24 22:10:36 +01:00
Roman Leventov	61e6878afd	Check Javadoc reference integrity (#5279 )	2018-01-22 13:51:28 -08:00
Roman Leventov	a346bbc6f3	Enforce spacing around foreach colon with Checkstyle (#5271 )	2018-01-22 11:48:51 -08:00
Roman Leventov	f99c27e9e0	Fix bugs in ImmutableRTree; Merge bytebuffer-collections module into druid-processing (#5275 ) * Fix bugs in ImmutableRTree; optimize ImmmutableRTreeObjectStrategy.writeTo(); Merge bytebuffer-collections module into druid-processing * Remove unused declaration * Fix another bug	2018-01-23 00:49:59 +05:30
Roman Leventov	87c744ac1d	Add MethodParamPad, OneStatementPerLine and EmptyStatement Checkstyle checks (#5272 )	2018-01-18 11:29:23 -08:00
Roman Leventov	ad6cdf5d09	Reuse IndexedInts returned from DimensionSelector.getRow() implementations (#5172 ) * Reuse IndexedInts in DimensionSelector implementations * Remove BaseObjectColumnValueSelector.getObject() doc * typo	2018-01-17 16:01:26 +01:00
Clint Wylie	491f8cca81	fix timewarp query results when using timezones and crossing DST transitions (#5157 ) * timewarp and timezones changes: * `TimewarpOperator` will now compensate for daylight savings time shifts between date translation ranges for queries using a `PeriodGranularity` with a timezone defined * introduces a new abstract query type `TimeBucketedQuery` for all queries which have a `Granularity` (100% not attached to this name). `GroupByQuery`, `SearchQuery`, `SelectQuery`, `TimeseriesQuery`, and `TopNQuery` all extend `TimeBucke tedQuery`, cutting down on some duplicate code and providing a mechanism for `TimewarpOperator` (and anything else) that needs to be aware of granularity * move precondition check to TimeBucketedQuery, add Granularities.nullToAll, add getTimezone to TimeBucketQuery * formatting * more formatting * unused import * changes: * add 'getGranularity' and 'getTimezone' to 'Query' interface * merge 'TimeBucketedQuery' into 'BaseQuery' * fixup tests from resulting serialization changes * dedupe * fix after merge * suppress warning	2018-01-11 12:39:33 -08:00
Roman Leventov	8877ce38d6	Enforce modifier order with Checkstyle (#5246 )	2018-01-11 09:50:42 +01:00
Roman Leventov	535ec437e9	Apply 'power of 2' optimization to BlockLayoutIndexedDoubleSupplier (#5176 ) * Apply 'power of 2' optimization to BlockLayoutIndexedDoubleSupplier; slight optimization of buffer.get() in block layout indexed suppliers * Fix byte order	2018-01-05 16:08:07 +09:00
Jonathan Wei	935ac646f4	Upgrade to Calcite 1.15.0 (#5210 ) * Upgrade to Calcite 1.15.0 * Use Filtration.eternity()	2018-01-04 12:11:24 -08:00
Roman Leventov	579f9fbedf	Add IndexedInts.debugToString() and AbstractIndex.toString(); Add Sequence.toList() and limit() (#5175 ) * Add IndexedInts.debugToString() and AbstractIndex.toString() * Fix AppenderatorTest	2018-01-04 09:56:47 +09:00
Roman Leventov	dc87e4fda1	Renamed IndexedFloats/Doubles/Longs to ColumnarFloats/..., IndexedMultivalue to ColumnarMultiInts, separate IndexedInts from ColumnarInts, many other renames for consistency in io.druid.segment.data package (#5171 )	2017-12-20 18:50:07 -08:00
Clint Wylie	1181411901	small optimization in timeseries if 'skipEmptyBuckets' is true and cursor completed (#5178 )	2017-12-19 16:47:00 -06:00
Roman Leventov	f18eba50ee	Remove Aggregator.reset() (#5177 )	2017-12-19 14:09:17 -08:00
Roman Leventov	5787d04fad	Bump Druid version to 0.12.0 (#5138 )	2017-12-15 07:37:01 -08:00
Roman Leventov	64848c7ebf	DataSegment memory optimizations (#5094 ) * Deduplicate DataSegments contents (loadSpec's keys, dimensions and metrics lists as a whole) more aggressively; use ArrayMap instead of default LinkedHashMap for DataSegment.loadSpec, because they have only 3 entries on average; prune DataSegment.loadSpec on brokers * Fix DataSegmentTest * Refinements * Try to fix * Fix the second DataSegmentTest * Nullability * Fix tests * Fix tests, unify to use TestHelper.getJsonMapper() * Revert TestUtil as ServerTestHelper, fix tests * Add newline * Fix indexing tests * Fix s3 tests * Try to fix tests, remove lazy caching of ObjectMapper in TestHelper, rename TestHelper.getJsonMapper() to makeJsonMapper() * Fix HDFS tests * Fix HdfsDataSegmentPusherTest * Capitalize constant names	2017-12-12 11:41:40 -08:00
Charles Allen	4365390310	Remove duplicate fastutil dependency in processing pom.xml (#5142 )	2017-12-07 08:54:48 +09:00
Alexander Saydakov	45f91a241e	numeric quantiles sketch aggregator (#5002 ) * numeric quantiles sketch aggregator * it seems that we need to synchronize all methods, which modify the state * Seems like a false positive with -Pstrict * code style fix * code style fix * use sketches-core-0.10.3 * moved cache ids to the central place * better class names * support large columns * explained autodetection, added exception * added comments regarding sketches moving on heap * support reindexing * implemented suggestions from jihoonson * style fix * use max(k, other.k) for better accuracy * check for NilColumnValueSelector instead of null * throw exceptions instead of providing no-op comparators	2017-12-06 08:18:08 +09:00
Roman Leventov	a7a6a0487e	Replace IOPeon with SegmentWriteOutMedium; Improve buffer compression (#4762 ) * Replace IOPeon with OutputMedium; Improve compression * Fix test * Cleanup CompressionStrategy * Javadocs * Add OutputBytesTest * Address comments * Random access in OutputBytes and GenericIndexedWriter * Fix bugs * Fixes * Test OutputBytes.readFully() * Address comments * Rename OutputMedium to SegmentWriteOutMedium and OutputBytes to WriteOutBytes * Add comments to ByteBufferInputStream * Remove unused declarations	2017-12-04 18:04:27 -08:00
Parag Jain	7c01f77b04	Parse Batch support (#5081 ) * add parseBatch and deprecate parse method in InputRowParser add addAll method, skip max rows in memory check for it remove parse method from implemetations transform transformers add string multiplier input row parser fix withParseSpec fix kafka batch indexing fix isPersistRequired comments * add unit test * make persist async * review comments	2017-12-04 16:06:16 -06:00
Roman Leventov	aacc57131b	Don't use HistoricalXxxPrototypes in PooledTopNAlgorithm when Cursor's offset is FilteredOffset (#5133 ) * Don't use HistoricalXxxPrototypes in PooledTopNAlgorithm when Cursor's offset is FilteredOffset, because it doens't support cloning * Add test	2017-12-01 17:04:22 -08:00
Gian Merlino	5f6bdd940b	SQL: Improve translation of time floor expressions. (#5107 ) * SQL: Improve translation of time floor expressions. The main change is to TimeFloorOperatorConversion.applyTimestampFloor. - Prefer timestamp_floor expressions to timeFormat extractionFns, to avoid turning things into strings when it isn't necessary. - Collapse CAST(FLOOR(X TO Y) AS DATE) to FLOOR(X TO Y) if appropriate. * Fix tests.	2017-11-29 12:06:03 -08:00
Chuanlei Ni	368d03146b	assign granularity.all to SelectQuery by default (#5091 )	2017-11-21 17:10:19 -08:00
zhangxinyu1	590633c595	fix bug in method reset of ByteBufferHashTable.java (#5100 )	2017-11-17 13:19:16 -03:00
Jonathan Wei	ec6774039e	Fix numLookupLoadingThreads default value (#5097 )	2017-11-16 15:13:52 -08:00
Gian Merlino	77df5e0673	ExpressionSelectors: Add optimized selectors. (#5048 ) * ExpressionSelectors: Add caching selectors. - SingleLongInputCaching selector for expressions on the __time column, using a similar optimization to SingleScanTimeDimSelector - SingleStringInputDimensionSelector for expressions on string columns that return strings, using a similar optimization to ExtractionFn based DimensionSelectors. - SingleStringInputCaching selector for expressions on string columns that return primitives. Also, in the SQL planner, prefer expressions for time operations rather than extractionFns. * Code review comments.	2017-11-13 20:24:24 -08:00
Akash Dwivedi	c1538f29fc	maxQueryTimeout property in runtime properties. (#4852 ) * maxQueryTimeout property in runtime properties. * extra line * move withTimeoutAndMaxScatterGatherBytes method to QueryLifeCycle. * Fix initialize method. * remove unused import. * doc update. * some more details in doc about query failure.. * minor fix. * decorating QueryRunner to set and verify context. Added by servers. * remove whitespace.	2017-11-13 19:23:11 -06:00
Gian Merlino	9444da5038	SQL: Improved behavior when implicitly casting strings to date/time literals. (#5023 ) * SQL: Improved behavior when implicitly casting strings to date/time literals. - Handle all flavors of ISO8601 and SQL literals. - Throw errors on other literals instead of silently transforming them to 0. * Respect timeZone when format is null.	2017-11-10 17:43:22 +09:00
Roman Leventov	3541b7544b	Prohibit and remove unused declarations in the processing module (#4930 ) * Prohibit and remove unused declarations in the processing module * Fix tests * Fix integration tests * Suppress unused * Try to remove SuppressWarnings unused in VirtualColumn * Remove reset 'false positives' * Annotate CliCommandCreator as ExtensionPoint * Unused import warning instead of error in IntelliJ * Fixes * Add comment * Fix AzureBlob * Fix CloudFilesBlob * Address comments * Add Project SDK section to INTELLIJ_SETUP.md * Fix image	2017-11-09 09:27:27 -08:00
Roman Leventov	a8dc056c09	Add retries for coordinator fetch and lookup start in LookupReferencesManager (#5029 ) * Add retries for coordinator fetch and lookup start in LookupReferencesManager * Fix LookupConfigTest * Address comments * Address more comments * And address more comments * Address comms * Recognize 'not found' lookups in LookupReferencesManager.tryGetLookupListFromCoordinator(), by @egor-ryashin	2017-11-09 02:30:36 -03:00
Jihoon Son	5f3c863d5e	Add compaction task (#4985 ) * Add compaction task * added doc * use combining aggregators * address comments * add support for dimensionsSpec * fix getUniqueDims and getUniqueMetics * find unique dimensionsSpec * fix compilation * add unit test * fix test * fix test * test for different dimension orderings and types, and doc for type and ordering * add control for custom ordering and type * update doc * fix compile * fix compile * add segments param * fix serde error * fix build	2017-11-03 21:55:27 -06:00
Roman Leventov	5eb08c27cb	Add Emitter monitoring (#4973 ) * Add Emitter monitoring * Fix typo * Fixes * testing new emitter * Fix failed test (#71) * testing new emitter * fix on failed test * Remove emitter's readTimeout from docs * Update docs * Add HttpEmittingMonitor * Update java-util to 1.3.2	2017-11-03 21:27:57 -06:00
Gian Merlino	6c725a7e06	Fix havingSpec on complex aggregators. (#5024 ) * Fix havingSpec on complex aggregators. - Uses the technique from #4883 on DimFilterHavingSpec too. - Also uses Transformers from #4890, necessitating a move of that and other related classes from druid-server to druid-processing. They probably make more sense there anyway. - Adds a SQL query test. Fixes #4957. * Remove unused import.	2017-11-01 12:58:08 -04:00
Gian Merlino	1df458b35e	Fix improper handling of empty arrays in StringDimensionIndexer. (#5012 ) * Fix improper handling of empty arrays in StringDimensionIndexer. This bug was able to introduce data errors: if the input rows to an IncrementalIndex contained entirely empty arrays and single values, then upon persisting to disk, the empty arrays would be replaced with the lexicographically smallest single value, rather than nulls like they should have been. * Style fix. * Add tests for bitmap indexes too.	2017-10-27 14:21:48 -07:00
Jihoon Son	d7024f22e1	Upgrade fastutil to 8.1.0 (#4988 ) * Upgrade failutil to 8.1.0 * unused import	2017-10-19 23:37:43 -05:00
Jihoon Son	52d7f74226	Add streaming aggregation as the last step of ConcurrentGrouper if data are spilled (#4704 ) * Add steaming grouper * Fix doc * Use a single dictionary while combining * Revert GroupByBenchmark * Removed unused code * More cleanup * Remove unused config * Fix some typos and bugs * Refactor Groupers.mergeIterators() * Add comments for combining tree * Refactor buildCombineTree * Refactor iterator * Add ParallelCombiner * Add ParallelCombinerTest * Handle InterruptedException * use AbstractPrioritizedCallable * Address comments * [maven-release-plugin] prepare release druid-0.11.0-sg * [maven-release-plugin] prepare for next development iteration * Address comments * Revert "[maven-release-plugin] prepare for next development iteration" This reverts commit `5c6b31e488`. * Revert "[maven-release-plugin] prepare release druid-0.11.0-sg" This reverts commit `0f5c3a8b82`. * Fix build failure * Change list to array * rename sortableIds * Address comments * change to foreach loop * Fix comment * Revert keyEquals() * Remove loop * Address comments * Fix build fail * Address comments * Remove unused imports * Fix method name * Split intermediate and leaf combine degrees * Add comments to StreamingMergeSortedGrouper * Add more comments and fix overflow * Address comments * ConcurrentGrouperTest cleanup * add thread number configuration for parallel combining * improve doc * address comments * fix build	2017-10-17 23:24:08 -07:00
Slim	af2bc5f814	Make float default representation for DoubleSum/Min/Max aggregators (#4944 ) * Introduce System wide property to select how to store double. Set the default to store as float Change-Id: Id85cca04ed0e7ecbce78624168c586dcc2adafaa * fix tests Change-Id: Ib42db724b8a8f032d204b58c366caaeabdd0d939 * Change the property name Change-Id: I3ed69f79fc56e3735bc8f3a097f52a9f932b4734 * add tests and make default distribution store doubles as 64bits Change-Id: I237b07829117ac61e247a6124423b03992f550f2 * adding mvn argument to parallel-test profile Change-Id: Iae5d1328f901c4876b133894fa37e0d9a4162b05 * move property name and helper function to io.druid.segment.column.Column Change-Id: I62ea903d332515de2b7ca45c02587a1b015cb065 * fix docs and clean style Change-Id: I726abb8f52d25dc9dc62ad98814c5feda5e4d065 * fix docs Change-Id: If10f4cf1e51a58285a301af4107ea17fe5e09b6d	2017-10-16 17:17:22 -07:00
Himanshu	a7e802c9d4	greater-than/less-than/equal-to havingSpec to call AggregatorFactory.finalizeComputation(..) (#4883 ) * greater-than/less-than/equal-to havingSpec to call AggregatorFactory.finalizeComputation(..) * fix the unit test and expect having to work on hyperUnique agg * test fix * fix style errors	2017-10-16 12:02:30 -07:00
Roman Leventov	dc7cb117a1	Refactor ColumnSelectorFactory; Rely on ColumnValueSelector's polymorphism (#4886 ) * Refactor ColumnSelectorFactory; Rely on ColumnValueSelector's polymorphism * Fix MapVirtualColumn.makeColumnValueSelector() * Minor fixes * Fix IndexGeneratorCombinerTest * DimensionSelector to return zeros when treated as numeric ColumnValueSelector * Fix IncrementalIndexTest * Fix IncrementalIndex.makeColumnSelectorFactory() * Optimize MapBasedRow.getMetric() * Fix VarianceAggregatorTest * Simplify IncrementalIndex.makeColumnSelectorFactory() * Address comments * More comments * Test	2017-10-13 21:44:17 -05:00
Jihoon Son	8d9902831e	Refactoring PrefetchableTextFilesFirehoseFactory (#4836 ) * Refactoring prefetchable firehose * Fix to read cache when prefetch is disabled * More tests * Cleanup codes * Add Fetcher * Fix test failure * Count file size * Fix test * rename generic parameter * address comments * address comments * reuse buffer * move Execs to java-util * use execs * Fix build	2017-10-13 21:39:28 -05:00
Jihoon Son	675c6c00dd	Add checkstyle and intellij rule to prohibit unnecessary qualifiers in interfaces (#4958 ) * add checkstyle and intellij rule * fix tc fail	2017-10-13 07:56:19 -07:00
Atul Mohan	c07678b143	Synchronization of lookups during startup of druid processes (#4758 ) * Changes for lookup synchronization * Refactor of Lookup classes * Minor refactors and doc update * Change coordinator instance to be retrieved by DruidLeaderClient * Wait before thread shutdown * Make disablelookups flag true by default * Update docs * Rename flag * Move executorservice shutdown to finally block * Update LookupConfig * Refactoring and doc changes * Remove lookup config constructor * Revert Lookupconfig constructor changes * Add tests to LookupConfig * Make executorservice local * Update LRM * Move ListeningScheduledExecutorService to ExecutorCompletionService * Move exception to outer block * Remove check to see future is done * Remove unnecessary assignment * Add logging	2017-10-12 21:22:24 -05:00
Gian Merlino	32f36beaae	QueryableIndexStorageAdapter: Lift column cache to Cursor sequence. (#4950 ) * QueryableIndexStorageAdapter: Lift column cache to Cursor sequence. This is where it was before #4710, when its was moved to the individual Cursors, leading to higher than expected memory usage. It could be extreme for finer query granularities like "second". * Comment.	2017-10-12 16:44:33 -05:00
Jihoon Son	56fb11ce0b	Lazy initialization for JavaScript functions (#4871 ) * Lazy initialization of JavaScript functions * Fix test failure * Fix thread-safety and postpone js conf check * Fix test fail * Fix test * Fix KafkaIndexTaskTest * Move config check	2017-10-10 21:52:42 -07:00
Jonathan Wei	18635a19b3	Remove unused limitFn in GroupByQuery (#4935 ) * Remove unused limitFn in GroupByQuery * Remove unused limitFn creation logic	2017-10-10 15:56:30 -07:00
Gian Merlino	b20e3038b6	SQL: Upgrade to Calcite 1.14.0, some refactoring of internals. (#4889 ) * SQL: Upgrade to Calcite 1.14.0, some refactoring of internals. This brings benefits: - Ability to do GROUP BY and ORDER BY with ordinals. - Ability to support IN filters beyond 19 elements (fixes #4203). Some refactoring of druid-sql internals: - Builtin aggregators and operators are implemented as SqlAggregators and SqlOperatorConversions rather being special cases. This simplifies the Expressions and GroupByRules code, which were becoming complex. - SqlAggregator implementations are no longer responsible for filtering. Added new functions: - Expressions: strpos. - SQL: TRUNCATE, TRUNC, LENGTH, CHAR_LENGTH, STRLEN, STRPOS, SUBSTR, and DATE_TRUNC. * Add missing @Override annotation. * Adjustments for forbidden APIs. * Adjustments for forbidden APIs. * Disable GROUP BY alias. * Doc reword.	2017-10-10 12:44:05 -07:00
chunghochen	0614b92df1	adding new post aggregators for test statistics to druid-stats extension (#4532 ) * adding new post aggregators of test stats to druid-stats extension * changes to address code review comments * fix checkstyle violations using druid_intellij_formatting.xml after merge upstream/master * add @Override annotation per CI log * make changes per review comments/discussions * remove some blocks per review comments	2017-10-09 23:43:27 -07:00
Akash Dwivedi	716a5ec1a8	Add identity to DefaultSearchQueryMetrics and DefaultSelectQueryMetrics. (#4906 )	2017-10-04 20:28:23 -05:00
Akash Dwivedi	2ee32399ff	granularity method in QueryMetrics. (#4570 ) * granularity method in QueryMetrics. PR to emit granularity dimension for timeseries, search, groupBy, select and topN queries. * QueryMetricsFactory classes for search and select queries. * Empty implementation for Granularity() method. * Review comment changes. * Remove unused import. * empty query() method. * checkstyle fix. * Import fix.	2017-10-04 09:42:52 -07:00
Jonathan Wei	5fbec5b435	Fix limit push down comparator bug (#4868 )	2017-10-02 11:44:23 -07:00
Gian Merlino	1f2074c247	Bump versions in master to 0.11.1-SNAPSHOT. (#4878 ) * Bump versions in master to 0.11.1-SNAPSHOT. * Missed a few.	2017-09-28 17:09:51 -05:00
Gian Merlino	a19f22b5bb	Add identity to query metrics, logs. (#4862 ) * Add identity to query metrics, logs. Also fix a bug where unauthorized requests would not emit any logs or metrics, and instead would log a "Tried to emit logs and metrics twice" warning. Also rename QueryResource's "getServer" to "cancelQuery", because that's what it does. * Do not emit identity by default.	2017-09-28 11:45:23 -07:00
Goh Wei Xiang	2c30d5ba55	Add org.joda.time.DateTime.parse() to forbidden APIs (#4857 ) * Added org.joda.time.DateTime#(java.lang.String) to forbidden API. * Added org.joda.time.DateTime#(java.lang.String, org.joda.time.format.DateTimeFormatter) to forbidden API. * Add additional APIs that may create DateTime with default time zone * Add helper function that accepts formatter to parse String. * Add additional forbidden APIs * Replace existing usage of forbidden APIs * Use wrapper class to enforce Chronology on DateTimeFormatter. * Creates constant UtcFormatter for constant ISODateTimeFormat.	2017-09-27 17:46:44 -05:00
Roman Leventov	9c126e2aa9	Forbid MapMaker (#4845 ) * Forbid MapMaker * Shorter syntax * Forbid Maps.newConcurrentMap()	2017-09-27 06:49:47 -07:00
Roman Leventov	e267f3901b	Enforce Indentation with Checkstyle (#4799 )	2017-09-21 13:06:48 -07:00
Roman Leventov	a9d8539802	Remove IndexedInts.iterator() (#4811 ) * Remove IndexedInts.iterator() * Retain IndexedInts.iterator(), but don't extend Iterable * Add BitmapValues * Fix tests	2017-09-20 21:25:52 -07:00
Roman Leventov	88e9a80636	Rename ObjectValueSelector.get() to getObject(); Add getObject() and classOfObject() to ColumnValueSelector (#4801 )	2017-09-19 14:47:20 -05:00
Roman Leventov	24646ac76a	LZ4 decompression forward compatibility (#4824 )	2017-09-19 10:18:37 -07:00
Charles Allen	00d39ce7a5	Move checks for bitmap size == 0 to isEmpty (#4820 )	2017-09-19 21:45:16 +05:30
Jonathan Wei	c2a0e753b6	Extension points for authentication/authorization (#4271 ) * Extension points for authentication/authorization * Address some PR comments * Authorization result caching * Add unit tests for SecuritySanityCheckFilter and PreResponseAuthorizationCheckFilter * Use Set for auth caching, close outputstreams in filters * Don't close output stream on success in sanity check filter * Add ConfigResourceFilter to coordinator lookups * Fix filtering authorization check for empty resource list * HttpClient users must explicitly escalate the client * Remove response modification from PreResponseAuthorizationCheckFilter * Remove extraneous pom.xml * Fix unit test * Better lifecycle management * Rename AuthorizationManager to Authorizer * Fix authorization denials for empty supervisor list * Address some PR comments * Address more PR comments * Small cleanup * Add Jetty HttpClient wrapper to Authenticator * Remove Authorizer start/stop * Restore immutable context map in DruidConnection, UT fix * Fix/update docs * Add authorization checks to EventReceiverFirehose * Fix router authorization check failure, restore PreResponseAuthorizationFilter changes * Compile fixes * Test fixes * Update Authenticator/Authorizer doc comments * Merge fixes * PR comments * Fix test * Fix IT * More PR comments * PR comments * SSL fix	2017-09-15 23:45:48 -07:00
Roman Leventov	3f92184dd8	Inspection fixes (#4809 )	2017-09-15 17:48:29 -07:00
Roman Leventov	b61248fdb1	Replace HistoricalFloatColumnSelector with more generic HistoricalColumnSelector (#4796 )	2017-09-14 13:52:06 -07:00
Akash Dwivedi	a17e48fe69	search package name correction. (#4785 ) * search package name correction. * Refactor search.search pkg to search. * remove unused import.	2017-09-14 13:50:23 -07:00
Gian Merlino	2ce8123bdb	Move scan-query from a contrib extension into core. (#4751 ) * Move scan-query from a contrib extension into core. Based on a proposal at: https://groups.google.com/d/topic/druid-development/ME_OatUDnbk/discussion This patch also adds support for virtual columns to the Scan query, and updates Druid SQL to use Scan instead of Select. This patch also makes some behavioral changes to handling of the __time column. In particular, it is now is returned as "__time" rather than "timestamp"; it is no longer included if you do not specifically ask for it in your "columns"; and it is returned as a long rather than a string. Users can revert time handling to the legacy extension behavior by setting "legacy" : true in their queries, or setting the property druid.query.scan.legacy = true. This is meant to provide a migration path for users that were formerly using the contrib extension. * Adjustments from review. * Add back Select query. * Adjust SQL docs. * Restore SelectQuery link.	2017-09-13 09:51:24 -07:00
Gian Merlino	c3a1ce6933	SQL: Fix toTimeseriesQuery and toTopNQuery. (#4780 ) The former would sometimes eat limits, and the latter would sometimes use the wrong dimension comparator.	2017-09-12 14:37:27 -07:00
Jonathan Wei	3a29521273	Fix GroupBy limit push down error when buffer is too small (#4745 ) * Fix GroupBy limit push down error when buffer is too small * Address PR comments	2017-09-12 12:34:50 -07:00
Roman Leventov	832cc293ef	Refactoring of ReferenceCountingSegment and FireHydrant (#4154 ) * Refactoring of ReferenceCountingSegment and FireHydrant * Address comment * Fix FireHydrant.closeSegment() * Address comment * Added comments to ReferenceCountingSegment	2017-09-12 14:28:35 -05:00
Gian Merlino	4909c48b0c	SQL: Full TRIM support. (#4750 ) * SQL: Full TRIM support. - Support trimming arbitrary characters - Support BOTH, LEADING, and TRAILING * Remove unused import. * Fix tests, add RTRIM / LTRIM. * Remove unused imports. * BTRIM and docs. * Replace for with foreach.	2017-09-12 11:49:08 -07:00
Gian Merlino	23c0357816	BufferHashGrouperTest: Better behavior with regard to large buffers. (#4779 ) * BufferHashGrouperTest: Better behavior with regard to large buffers. 1) Free buffers after each test 2) Avoid mmaping past the end of a file * Use CloserRule.	2017-09-11 12:10:31 -07:00
Andy Sloane	706747cc8c	Fix for sort order in select/topN query cache (#4766 ) When historical caching is enabled, and a select or topN query is issued, and then a following query with "descending": true is set, the cached query returns the ascending result (or vice versa), often resulting in invalid paging identifiers. The CacheKey for these queries doesn't include the "descending" flag; this change adds it, and fixes the problem.	2017-09-10 11:33:00 +09:00
Charles Allen	bdfc6fe25e	Move common TypeReference into JacksonUtils (#4738 )	2017-08-31 13:40:16 -07:00
Niketh Sabbineni	beecb9e210	Fix failing build, remove unused import (#4726 ) LGTM	2017-08-29 14:46:38 +09:00
Roman Leventov	4d109a358a	Refactoring of Storage Adapters (#4710 ) * Factor QueryableIndexColumnSelectorFactory and IncrementalIndexColumnSelectorFactory out of QueryableIndexStorageAdapter and IncrementalIndexStorageAdapter; Add Offset.getBaseReadableOffset(); Remove OffsetHolder interface; Replace Cursor extends ColumnSelectorFactory with composition; Reduce indirection in ColumnValueSelectors created by QueryableIndexColumnSelectorFactory * Don't override clone() in FilteredOffset (the prev. implementation was broken); Some warnings fixed * Simplify Cursors in QueryableIndexStorageAdapter * Address comments * Remove unused and unimplemented methods from GenericColumn interface * Comments	2017-08-28 18:07:31 -07:00
Gian Merlino	daf3c5f927	Add "round" option to cardinality and hyperUnique aggregators. (#4720 ) * Add "round" option to cardinality and hyperUnique aggregators. Also turn it on by default in SQL, to make math on distinct counts work more as expected. * Fix some compile errors. * Fix test. * Formatting.	2017-08-28 14:52:11 -07:00
Gian Merlino	9fbfc1be32	Add @ExtensionPoint and @PublicApi annotations. (#4433 ) * Add @ExtensionPoint and @PublicApi annotations. * Clean up wording. * Remove unused import. * Remove unused imports. * Only types can be extension points. * Adjust annotations some more. * Remove unused import. * Make ServletFilterHolder an extension point. * Add a couple extension points, and update docs.	2017-08-28 14:50:58 -07:00
Gian Merlino	43488df975	Fix dimension selectors with extractionFns on missing columns. (#4717 ) * Fix dimension selectors with extractionFns on missing columns. This patch properly applies the requested extractionFn to missing columns. It's important when the extractionFn maps null to something other than null. * Extract helper method. * Change contracts of VirtualColumns and VirtualColumn methods based on review comments. * Remove unused import. * Remove unused method. * Adjust helper function. * Adjustments	2017-08-25 18:34:42 -05:00
Roman Leventov	598cc46bae	Replace HashMap with Obj2IntMap in StringDimensionIndexer; Small optimization in StringDimensionMergerV9 (#4721 )	2017-08-25 12:30:39 -07:00
Roman Leventov	326a85a9a4	Add Offset.reset() and remove unused Offset implementations (#4706 ) * Add Offset.reset() and remove unused Offset implementations * Fix BitmapOffset * Address comments	2017-08-22 17:43:29 -07:00
Roman Leventov	cacf63b007	Add AggregateCombiners (#4676 ) * Add MetricCombiners * Rename MetricCombiner to AggregateCombiner * Spelling * Fix TimestampAggregatorFactory.combine() and add makeAggregateCombiner() implementation * Rename AggregateCombiner.combine() to fold()	2017-08-21 16:45:29 -07:00
Roman Leventov	cbd1902db8	Add forbidden-apis plugin; prohibit using system time zone (#4611 ) * Forbidden APIs WIP * Remove some tests * Restore io.druid.math.expr.Function * Integration tests fix * Add comments * Fix in SimpleWorkerProvisioningStrategy * Formatting * Replace String.format() with StringUtils.format() in RemoteTaskRunnerTest * Address comments * Fix GroupByMultiSegmentTest	2017-08-21 13:02:42 -07:00
Roman Leventov	fa87eaa6e8	Remove IndexedInts.fill() (#4705 )	2017-08-21 13:01:34 -07:00
Asif Mansoor Amanullah	37f85b08d2	move row up/down for null metric ordering (#4681 ) * move row up/down for null metric ordering * addressed comments * addressed changes	2017-08-17 11:36:19 -05:00
Jonathan Wei	ab28dc3b97	free() dictionary merging buffers in IndexMerger (#4684 ) * free() dictionary merging buffers in IndexMerger * Use close() for dictionary merge iterators * Add comments on buffer free	2017-08-15 10:11:29 -07:00
Jonathan Wei	e91d4d1b80	Remove makeObjectColumnSelector() from DimensionIndexer (#4679 )	2017-08-11 14:39:00 -07:00
Jonathan Wei	1bddfc089c	Additional docs/log for direct memory usage (#4631 ) * Additional docs/log for direct memory usage * Tweak docs * Doc rewording	2017-08-10 23:33:20 -07:00
Roman Leventov	bf28d0775b	Remove QueryRunner.run(Query, responseContext) and related legacy methods (#4482 ) * Remove QueryRunner.run(Query, responseContext) and related legacy methods * Remove local var	2017-08-11 09:12:38 +09:00
Jihoon Son	65c1d6c797	Add IntGrouper to avoid unnecessary boxing/unboxing in array-based aggregation (#4668 ) * Add IntGrouper * Fix build * Address comments * Add a benchmark query	2017-08-10 07:41:39 -07:00
solimant	de9ba97d54	Move equals() from Float[Sum\|Min\|Max]AggregatorFactory to SimpleFloat... (#4675 ) Addresses #4671	2017-08-10 07:22:22 -07:00
Gian Merlino	7c89e12ca9	Replace Guava Enum.getIfPresent with builtin version. (#4659 ) * Replace Guava Enum.getIfPresent with builtin version. This is useful for running in Hadoop environments that use Guava 11. Some code is also simplified. * Code review	2017-08-09 17:20:00 -07:00
Jihoon Son	fe3421032b	Parallel sort for ConcurrentGrouper (#4660 ) * Multi-thread sort * Address comments	2017-08-09 16:24:36 -07:00
Goh Wei Xiang	42569e65e2	Minor fix in ExpressionSelectors to avoid potential NPE. (#4669 )	2017-08-09 10:13:31 -07:00
Roman Leventov	7454fd86a0	Polymorphic numeric getters for ColumnValueSelector (#4623 ) * Add methods getFloat(), getDouble() and getLong() to ColumnValueSelector * Fix copy-paste mistake in docs * Spelling	2017-08-08 18:38:06 -07:00
Roman Leventov	f5d4171459	Prohibit for loops which could be foreach with IntelliJ (#4653 ) * Replace for with foreach * Replace for with for-each in GroupByQueryEngineV2 * Remove io.druid.collections.IntList	2017-08-08 18:05:33 -07:00
Roman Leventov	aa7e4ae5e4	Enforce correct spacing with Checkstyle (#4651 )	2017-08-05 10:18:25 -07:00
Jonathan Wei	aa8d75004c	More informative QueryInterruptedException toString() (#4642 ) * More informative QueryInterruptedException toString() * Use StringUtils.format	2017-08-04 16:00:20 -07:00
Jonathan Wei	9650d80f3e	Don't use limit push down with having spec (#4630 ) * Don't use limit push down with having spec * Throw exception when forcing limit push down with having * Tests for having and limit push down * Fix pool sizes in unit test	2017-08-04 15:13:29 -07:00
Roman Leventov	7a005088d9	Add HistoricalCursor.getReadableOffset() to access unwrapped offset in selectors (#4633 ) * Add HistoricalCursor.getReadableOffset() to access unwrapped offset in selectors, when the 'main' offset if FilteredOffset (fixes #4628) * Stack overflow test	2017-08-04 12:51:48 -07:00
Jihoon Son	f3f2cd35e1	Array-based aggregation for groupBy query (#4576 ) * Array-based aggregation * Fix handling missing grouping key * Handle invalid offset * Fix compilation * Add cardinality check * Fix cardinality check * Address comments * Address comments * Address comments * Address comments * Cleanup GroupByQueryEngineV2.process * Change to Byte.SIZE * Add flatMap	2017-08-03 20:04:54 +03:00

... 3 4 5 6 7 ...

2228 Commits