druid

Commit Graph

Author	SHA1	Message	Date
Gian Merlino	3d6f409fc8	Fix groupBy on double dimensions. (#4596 ) * Fix groupBy on double dimensions. * Fix tests. * Fix tests. * Fix Scan tests.	2017-07-24 23:18:06 -07:00
Gian Merlino	8a4185897e	Add filter tests for both floats and doubles. (#4597 )	2017-07-24 17:02:02 -07:00
Atul Mohan	4bd0f174ba	Changes for deduplication (#4581 )	2017-07-24 11:12:23 -05:00
Roman Leventov	7408a7c4ed	Refactor CachingClusteredClient.run() (#4489 ) * Refactor CachingClusteredClient * Comments * Refactoring * Readability fixes	2017-07-23 23:10:36 +09:00
Roman Leventov	c0beb78ffd	Enforce brace formatting with Checkstyle (#4564 )	2017-07-21 10:26:59 -05:00
Slim	71e7a4c054	Adding double columns support (#4491 ) * add double columns support * Fix numbers and expected results in UTs * adding float aggregators * fix IT expected test results * fix comments * more fixes * fix comp * fix test * refactor double and float aggregator factories * fix * fix UTs * fix comments * clean unused code * fix more comments * undo unnecessary changes * fix null issue * refactor TopNColumnSelectorStrategyFactory * fix docs * refactor NumericTopNColumnSelectorStrategy * fix return * fix comments * handle the null case in DimensionIndexer * more null fixing * cosmetic changes	2017-07-20 10:14:14 +03:00
Roman Leventov	ae86323dbd	Remove unnecessary qualifier (#4565 )	2017-07-18 17:40:54 +09:00
Roman Leventov	60cdf94677	Add PMD and prohibit unnecessary fully qualified class names in code (#4350 ) * Add PMD and prohibit unnecessary fully qualified class names in code * Extra fixes * Remove extra unnecessary fully-qualified names * Remove qualifiers * Remove qualifier	2017-07-17 22:22:29 +09:00
Chris Gavin	960cb07ea6	Fix some unnecessary use of boxed types and incorrect format strings spotted by lgtm. (#4474 ) * Remove some unnecessary use of boxed types. * Fix some incorrect format strings. * Enable IDEA's MalformedFormatString inspection. * Add a Checkstyle check for finding uses of incorrect logging packages. * Fix some incorrect usages of the metamx logger. * Bypass incorrect logger Checkstyle check where using the correct logger is not simple. * Fix some more places where the wrong number of arguments are provided to format strings. * Suppress `MalformedFormatString` inspection on legacy logging test. * Use @SuppressWarnings rather than a noinspection suppression comment. * Fix some more incorrect format strings. * Suppress some more incorrect format string warnings where the incorrect string is intentional. * Log the aggregator when closing it fails. * Remove some unneeded log lines.	2017-07-13 12:15:32 -07:00
Roman Leventov	b2865b7c7b	Make possible to start Peon without DI loading of any querying-related stuff (#4516 ) * Make QueryRunnerFactoryConglomerate injection lazy in TaskToolbox/TaskToolboxFactory * Extract QueryablePeonModule and add druid.modules.excludeList config * Typo	2017-07-12 13:18:25 -05:00
Goh Wei Xiang	53e6b5cb9b	Removal of TopNResultMerger because it is vestigial. (#4520 )	2017-07-12 13:24:07 +03:00
Roman Leventov	ad76f7a1ab	Make Filter.getBitmapResult() abstract (#4481 )	2017-07-11 12:39:32 -07:00
Akash Dwivedi	a108d05f76	Use GenericIndexed v2 supported read() during deserializeColumn (#4463 )	2017-07-11 10:18:25 -05:00
Roman Leventov	d168a4271e	Use Double.NEGATIVE_INFINITY and Double.POSITIVE_INFINITY (#4496 ) * Use Double.NEGATIVE_INFINITY and Double.POSITIVE_INFINITY instead of Double.MIN_VALUE and Double.MAX_VALUE, same for Float * Replace usages in comments * Fix RTree * Remove commented code * Add tests	2017-07-07 09:10:13 -06:00
Roman Leventov	fc4fe24dd5	Incorrect use of Long.TYPE and Float.TYPE as return type of ObjectColumnSelector.classOfObject() (#4501 )	2017-07-05 08:54:06 -07:00
Jonathan Wei	97a79f4478	Fix GroupBy type cast when ChainedExecutionQueryRunner merges results (#4488 ) * Fix GroupBy type cast error when ChainedExecutionQueryRunner merges multiple runners * Move conversion step to separate method * Remove unnecessary comment * Use compute to update map	2017-06-30 17:33:03 -07:00
Roman Leventov	9ae457f7ad	Avoid using the default system Locale and printing to System.out in production code (#4409 ) * Avoid usages of Default system Locale and printing to System.out or System.err in production code * Fix Charset in DruidKerberosUtil * Remove redundant string format in GenericIndexed * Rename StringUtils.safeFormat() to unimportantSafeFormat(); add StringUtils.format() which fails as well as String.format() * Fix testSafeFormat() * More fixes of redundant StringUtils.format() inside ISE * Rename unimportantSafeFormat() to nonStrictFormat()	2017-06-29 14:06:19 -07:00
Roman Leventov	ae900a4934	Update versions to 0.11.0-SNAPSHOT (#4483 )	2017-06-28 17:05:58 -07:00
Roman Leventov	6173570425	Add ExtensionsConfig.excludeModules (#4438 ) * Add ExtensionsConfig.excludeModules * Add branch * Refactor Initialization.getFromExtensions() * excludeModules -> moduleExcludeList * Initialization.getFromExtensions() and getLoadedModules() should return Collection, not Set * Fix doc	2017-06-28 14:01:31 -07:00
Gian Merlino	4c33d0a00f	Add some new expression functions and macros. (#4442 ) * Add some new expression functions and macros. See misc/math-expr.md for the list of added functions, except for "like", which previously existed but was not documented. * Add easymock to datasketches tests. * Add easymock to distinctcount tests. * Add easymock to virtual-columns tests. * Code review comments. * Clean up code a bit. * Add easymock to scan-query tests. * Rework ExprMacros that have multiple impls. * Improve test coverage.	2017-06-28 10:15:58 -07:00
Roman Leventov	2fa4b10145	More fine-grained DI for management node types. Don't allocate processing resources on Router (#4429 ) * Remove DruidProcessingModule, QueryableModule and QueryRunnerFactoryModule from DI for coordinator, overlord, middle-manager. Add RouterDruidProcessing not to allocate processing resources on router * Fix examples * Fixes * Revert Peon configs and add comments * Remove qualifier	2017-06-27 22:58:01 -07:00
Roman Leventov	05d58689ad	Remove the ability to create segments in v8 format (#4420 ) * Remove ability to create segments in v8 format * Fix IndexGeneratorJobTest * Fix parameterized test name in IndexMergerTest * Remove extra legacy merging stuff * Remove legacy serializer builders * Remove ConciseBitmapIndexMergerTest and RoaringBitmapIndexMergerTest	2017-06-26 13:21:39 -07:00
Jihoon Son	3e60c9125d	Increase timeout of GroupByQueryMergeBufferTest and AppenderatorDriverTest (#4441 )	2017-06-22 06:09:52 -07:00
Jihoon Son	3a5c480405	Split IndexMergerTest and ImmutableConciseSetTest (#4427 )	2017-06-21 20:55:51 -07:00
Gian Merlino	34d2f9ebfe	Queries: Restore old prepareAggregations method. (#4432 ) For backwards compatibility, post-#4394.	2017-06-21 05:36:32 -07:00
Gian Merlino	679cf277c0	Add ExpressionFilter. (#4405 ) * Add ExpressionFilter. The expression filter expects a single argument, "expression", and matches rows where that expression is true. * Code review comments. * CR comment. * Fix logic. * Fix test. * Remove unused import.	2017-06-20 12:42:26 -07:00
Gian Merlino	22aad08a59	ExpressionPostAggregator: Automatically finalize inputs. (#4406 ) * ExpressionPostAggregator: Automatically finalize inputs. Raw HyperLogLogCollectors and such aren't very useful. When writing expressions like `x / y` users will expect `x` and `y` to be finalized. * Fix un-merge. * Code review comments. * Remove unnecessary ImmutableMap.copyOf.	2017-06-17 13:22:47 -07:00
Goh Wei Xiang	f68a0693f3	Allow use of non-threadsafe ObjectCachingColumnSelectorFactory (#4397 ) * Adding a flag to indicate when ObjectCachingColumnSelectorFactory need not be threadsafe. * - Use of computeIfAbsent over putIfAbsent - Replace Maps.newXXXMap() with normal instantiation - Documentations on when is thread-safe required. - Use Builders for On/OffheapIncrementalIndex * - Optimization on computeIfAbsent - Constant EMPTY DimensionsSpec - Improvement on IncrementalIndexSchema.Builder - Remove setting of default values - Use var args for metrics - Correction on On/OffheapIncrementalIndex Builders - Combine On/OffheapIncrementalIndex Builders * - Removing unused imports. * - Helper method for testing with IncrementalIndex.Builder * - Correction on javadoc. * Style fix	2017-06-16 16:04:19 -05:00
Gian Merlino	054cf8a183	Limit random access in compressed column tests. (#4414 ) * Limit random access in compressed column tests. Random access leads to lots of block decompressions for reading single elements, which is time prohibitive for the large column tests. For those tests, limit the number of randomly accessed elements to 1000. * Random -> ThreadLocalRandom	2017-06-15 14:48:06 -07:00
Jonathan Wei	7fe295009e	Faster ByteBufferMinMaxOffsetHeapTest (#4404 )	2017-06-15 13:14:29 -05:00
Gian Merlino	6edee7f434	Expressions work better with strings. (#4394 ) * Expressions work better with strings. - ExpressionObjectSelector able to read from string columns, and able to return strings. - ExpressionVirtualColumn able to offer string (and long for that matter) as its native type. - ExpressionPostAggregator able to return strings. - groupBy, topN: Allow post-aggregators to accept dimensions as inputs, making ExpressionPostAggregator more useful. - topN: Use DimExtractionTopNAlgorithm for STRING columns that do not have dictionaries, allowing it to work with STRING-type expression virtual columns. - Adjusts null handling to better match the rest of Druid: null and empty string treated the same; nulls implicitly treated as zeroes in numeric context. * Code review comments. * More code review. * Fix test. * Adjust annotations.	2017-06-14 14:50:18 -07:00
Roman Leventov	113b8007b7	Increase timeout for QueryGranularityTest.testDeadLock() (#4395 )	2017-06-12 13:28:21 -07:00
Gian Merlino	1f2afccdf8	Expressions: Add ExprMacros. (#4365 ) * Expressions: Add ExprMacros, which have the same syntax as functions, but can convert themselves to any kind of Expr at parse-time. ExprMacroTable is an extension point for adding new ExprMacros. Anything that might need to parse expressions needs an ExprMacroTable, which can be injected through Guice. * Address code review comments.	2017-06-08 09:32:10 -04:00
Jonathan Wei	9ae04b7375	Remove queryMetricsFactory from GroupByQueryConfig (#4383 )	2017-06-07 21:35:26 -05:00
Roman Leventov	63a897c278	Enable most IntelliJ 'Probable bugs' inspections (#4353 ) * Enable most IntelliJ 'Probable bugs' inspections * Fix in RemoteTestNG * Fix IndexSpec's equals() and hashCode() to include longEncoding * Fix inspection errors * Extract global instance of natural().nullsFirst(); address comments * Fix * Use noinspection comments instead of SuppressWarnings on method for IntelliJ-specific inspections * Prohibit Ordering.natural().nullsFirst() using Checkstyle	2017-06-07 09:54:25 -07:00
Roman Leventov	b487fa355b	More methods in QueryMetrics and TopNQueryMetrics (the last part of #3798 ) (#4284 ) * Add more methods to QueryMetrics and TopNQueryMetrics, add BitmapResultFactory * Add implementor expectations section to BitmapResultFactory javadoc	2017-06-07 09:49:08 -07:00
kaijianding	551a89bd67	serialize DateTime As Long to improve json serde performance (#4038 )	2017-06-06 10:08:51 -07:00
Gian Merlino	d22db30db4	VirtualColumns: Block virtual columns with empty names. (#4367 ) * VirtualColumns: Block virtual columns with empty names. * Spelling.	2017-06-06 08:05:47 -07:00
Roman Leventov	31d33b333e	Make using implicit system Charset an error (#4326 ) * Make using implicit system charset an error * Use StringUtils.toUtf8() and fromUtf8() instead of String.getBytes() and new String() * Use English locale in StringUtils.safeFormat() * Restore comment	2017-06-05 23:57:25 -07:00
Jonathan Wei	b90c28e861	Support limit push down for GroupBy (#3873 ) * Support limit push down for GroupBy V2 * Use orderBy spec ordering when applying limit push down * PR Comments * Remove unused var * Checkstyle fixes * Fix test * Add comment on non-final variables, fix checkstyle * Address PR comments * PR comments * Remove unnecessary buffer reset * Fix missing @JsonProperty annotation	2017-06-02 15:39:04 -07:00
praveev	290ed3ab9d	Make DateTime timezone aware (#4343 ) * Make DateTime timezone aware * Change unit tests to make DateTime timezone aware for PeriodGranularity	2017-06-02 12:45:52 -07:00
kaijianding	0efd18247b	explicitly unmap hydrant files when abandonSegment to recycle mmap memory (#4341 ) * fix TestKafkaExtractionCluster fail due to port already used * explicitly unmap hydrant files when abandonSegment to recycle mmap memory * address the comments * apply to AppenderatorImpl	2017-06-01 18:15:30 -05:00
Roman Leventov	50e72c6aea	Fix bugs (core) (#4339 ) * Fix bugs * Add test for GoogleDataSegmentPusher.buildPath() * Exclude extension changes * Address comments * Brace	2017-06-02 06:47:59 +09:00
Roman Leventov	78179ef74d	Inject QueryMetrics factories via PolyBind (#4336 )	2017-05-31 09:07:03 -07:00
Kenji Noguchi	3400f601db	Protobuf extension (#4039 ) * move ProtoBufInputRowParser from processing module to protobuf extensions * Ported PR #3509 * add DynamicMessage * fix local test stuff that slipped in * add license header * removed redundant type name * removed commented code * fix code style * rename ProtoBuf -> Protobuf * pom.xml: shade protobuf classes, handle .desc resource file as binary file * clean up error messages * pick first message type from descriptor if not specified * fix protoMessageType null check. add test case * move protobuf-extension from contrib to core * document: add new configuration keys, and descriptions * update document. add examples * move protobuf-extension from contrib to core (2nd try) * touch * include protobuf extensions in the distribution * fix whitespace * include protobuf example in the distribution * example: create new pb obj everytime * document: use properly quoted json * fix whitespace * bump parent version to 0.10.1-SNAPSHOT * ignore Override check * touch	2017-05-30 13:11:58 -07:00
Kamal Gurala	dcb07d6958	Option to configure default analysis types in SegmentMetadataQuery (#4259 ) * Option to configure default analysis types * Updated Docs and renamed * Added serde tests and Null handling * Fixed Documentation * Updated implementation * Updated implementation * Updated implementation * Added usingDefaultIntervals in Builder * Updated implementation * Updated implementation and added failing test * filterSegments implementation updated * Updated implementation * Padding * Add missing Override * Updated implementation * Fixed a naming bug * Fixed bug * Removed comment	2017-05-26 12:12:39 -07:00
Gian Merlino	1eaa7887bd	Fix integer overflow in BufferGrouper. (#4333 ) Would have led to out of bounds buffer access with large buffers. Also added tests using large buffers.	2017-05-25 23:30:20 -07:00
Gian Merlino	2bd4c0930f	Fix "quarter" granularity serialization. (#4316 )	2017-05-23 10:06:17 -07:00
Gian Merlino	9283807ad7	GroupByQuery: Fix type-spanning comparisons. (#4317 ) Jackson deserializes integers sometimes as int and sometimes as long, depending on how big they are. This leads to ClassCastException when comparing deserialized values as part of groupBy merging on the broker.	2017-05-23 10:06:04 -07:00
Gian Merlino	22e5f52d00	Workaround for non-thread-safe use of CardinalityAggregator. (#4304 )	2017-05-23 10:33:03 +09:00
Roman Leventov	8ec3a29af0	Don't pass QueryMetrics down in concurrent and async QueryRunners (fixes #4279 ) (#4288 ) * Don't pass QueryMetrics down in concurrent and async QueryRunners * Rename QueryPlus.threadSafe() to withoutThreadUnsafeState(); Update QueryPlus.withQueryMetrics() Javadocs; Fix generics in MetricsEmittingQueryRunner and CpuTimeMetricQueryRunner; Make DefaultQueryMetrics to fail fast on modifications from concurrent threads	2017-05-22 13:42:09 -05:00
Maksim Logvinenko	d45dad2b44	Remove boxing/unboxing in indexer (#4269 ) * Remove boxing/unboxing in indexer * Fix rowIndex visibility * Cleanup	2017-05-17 19:13:53 -05:00
Roman Leventov	d9f423f55d	Make QueryMetrics factories configurable (#4268 ) * Ensure QueryMetrics factories accept Json ObjectMapper; Make QueryMetrics factories configurable * Update QueryMetrics Javadocs * Add javadocs to QueryMetrics factories * Move queryMetricsFactory defaults to getter methods of config classes	2017-05-17 08:41:59 -07:00
Gian Merlino	ddc2e68998	Remove cache keys from HavingSpecs. (#4280 ) * Remove cache keys from HavingSpecs. They weren't used, since they aren't part of the groupBy cache key. Also, it's good that they weren't used, since many of them had value truncation bugs. * Fix imports. * Fix test.	2017-05-16 22:13:02 -07:00
Roman Leventov	d400f23791	Monomorphic processing of TopN queries with simple double aggregators over historical segments (part of #3798 ) (#4079 ) * Monomorphic processing of topN queries with simple double aggregators and historical segments * Add CalledFromHotLoop annotations to specialized methods in SimpleDoubleBufferAggregator * Fix a bug in Historical1SimpleDoubleAggPooledTopNScannerPrototype * Fix a bug in SpecializationService * In SpecializationService, emit maxSpecializations warning only once * Make GenericIndexed.theBuffer final * Address comments * Newline * Reapply `439c906` (Make GenericIndexed.theBuffer final) * Remove extra PooledTopNAlgorithm.capabilities field * Improve CachingIndexed.inspectRuntimeShape() * Fix CompressedVSizeIntsIndexedSupplier.inspectRuntimeShape() * Don't override inspectRuntimeShape() in subclasses of CompressedVSizeIndexedInts * Annotate methods in specializations of DimensionSelector and FloatColumnSelector with @CalledFromHotLoop * Make ValueMatcher to implement HotLoopCallee * Doc fix * Fix inspectRuntimeShape() impl in ExpressionSelectors * INFO logging of specialization events * Remove modificator * Fix OrFilter * Fix AndFilter * Refactor PooledTopNAlgorithm.scanAndAggregate() * Small refactoring * Add 'nothing to inspect' messages in empty HotLoopCallee.inspectRuntimeShape() implementations * Don't care about runtime shape in tests * Fix accessor bugs in Historical1SimpleDoubleAggPooledTopNScannerPrototype and HistoricalSingleValueDimSelector1SimpleDoubleAggPooledTopNScannerPrototype, cover them with tests * Doc wording * Address comments * Remove MagicAccessorBridge and ensure Offset subclasses are public * Attach error message to element	2017-05-16 16:19:55 -07:00
Roman Leventov	b7a52286e8	Make @Override annotation obligatory (#4274 ) * Make MissingOverride an error * Make travis script to fail fast * Add missing Override annotations * Comment	2017-05-16 13:30:30 -05:00
Himanshu	136b2fae72	improve query timeout handling and limit max scatter-gather bytes (#4229 ) * improve query timeout handling and limit max scatter-gather bytes * address review comments	2017-05-16 12:47:32 -05:00
Benedict Jin	e823085866	Improve `collection`-related code by reusing an immutable object instead of creating a new object (#4135 )	2017-05-17 01:38:51 +09:00
Jihoon Son	50a4ec2b0b	Add support for headers and skipping thereof for CSV and TSV (#4254 ) * initial commit * small fixes * fix bug * fix bug * address code review * more cr * more cr * more cr * fix * Skip head rows for CSV and TSV * Move checking skipHeadRows to FileIteratingFirehose * Remove checking null iterators * Remove unused imports * Address comments * Fix compilation error * Address comments * Add more tests * Add a comment to ReplayableFirehose * Addressing comments * Add docs and fix typos	2017-05-15 22:57:31 -07:00
Roman Leventov	1ebfa22955	Update Error prone configuration; Fix bugs (#4252 ) * Make Errorprone the default compiler * Address comments * Make Error Prone's ClassCanBeStatic rule an error * Preconditions allow only %s pattern * Fix DruidCoordinatorBalancerTester * Try to give the compiler more memory * Remove distribution module activation on jdk 1.8 because only jdk 1.8 is used now * Don't show compiler warnings * Try different travis script * Fix travis.yml * Make Error Prone optional again * For error-prone compiler * Increase compiler's maxmem * Don't run Error Prone for benchmarks because of OOM * Skip install step in Travis * Remove MetricHolder.writeToChannel() * In travis.yml, check compilation before tests, because it may fail faster	2017-05-12 15:55:17 +09:00
Roman Leventov	e09e892477	Refactor QueryRunner to accept QueryPlus: Query + QueryMetrics (part of #3798 ) (#4184 ) * Add QueryPlus. Add QueryRunner.run(QueryPlus, Map) method with default implementation, to replace QueryRunner.run(Query, Map). * Fix GroupByMergingQueryRunnerV2 * Fix QueryResourceTest * Expand the comment to Query.run(walker, context) * Remove legacy version of BySegmentSkippingQueryRunner.doRun() * Add LegacyApiQueryRunnerTest and be more specific about legacy API removal plans in Druid 0.11 in Javadocs	2017-05-10 12:25:00 -07:00
Himanshu	462f6482df	optionally add extensions to explicitly specified hadoopContainerClassPath (#4230 ) * optionally add extensions to explicitly specified hadoopContainerClassPath * note extensions always pushed in hadoop container when druid.extensions.hadoopContainerDruidClasspath is not provided explicitly	2017-05-08 14:24:14 -05:00
Roman Leventov	8277284d67	Add Checkstyle rule to force comments to classes and methods to be Javadoc comments (#4239 )	2017-05-04 11:14:41 -07:00
Roman Leventov	5e85fcc0f5	Restore BaseQuery.computeOverridenContext() for compatibility (#4241 )	2017-05-02 10:22:02 -07:00
Himanshu	5a5a2749cd	improvements to coordinator lookups management (#3855 ) * coordinator lookups mgmt improvements * revert replaces removal, deprecate it instead * convert and use older specs stored in db * more tests and updates * review comments * add behavior for 0.10.0 to 0.9.2 downgrade * incorporating more review comments * remove explicit lock and use LifecycleLock in LookupReferencesManager. use LifecycleLock in LookupCoordinatorManager as well * wip on LookupCoordinatorManager * lifecycle lock * refactor thread creation into utility method * more review comments addressed * support smooth roll back of lookup snapshots from 0.10.0 to 0.9.2 * correctly use LifecycleLock in LookupCoordinatorManager and remove synchronization from start/stop * run lookup mgmt on leader coordinator only * wip: changes to do multiple start() and stop() on LookupCoordinatorManager * lifecycleLock fix usage in LookupReferencesManagerTest * add LifecycleLock back * fix license hdr * some fixes * make LookupReferencesManager.getAllLookupsState() consistent while still being lockless * address review comments * addressing leventov's comments * address charle's comments * add IOE.java * for safety in LookupReferencesManager mainThread check for lifecycle started state on each loop in addition to interrupt * move thread creation utility method to Execs * fix names * add tests for LookupCoordinatorManager.lookupManagementLoop() * add further tests for figuring out toBeLoaded and toBeDropped on LookupCoordinatorManager * address leventov comments * remove LookupsStateWithMap and parameterize LookupsState * address review comments * address more review comments * misc fixes	2017-04-28 08:41:38 -05:00
Roman Leventov	b9fd30e90a	Add Checkstyle check to prohibit IntelliJ-style commented code lines (#4220 ) * Add Checkstyle check to prohibit IntelliJ-style commented code lines * Address comment * Restore issue link	2017-04-27 18:11:25 -07:00
kaijianding	c47cfed0ec	Significantly improve LongEncodingStrategy.AUTO build performance (#4215 ) * Significantly improve LongEncodingStrategy.AUTO build performance * use numInserted instead of tempIn.available * fix bug	2017-04-27 15:11:07 +03:00
Roman Leventov	ee9b5a619a	Fix bugs in query builders and in TimeBoundaryQuery.getFilter() (#4131 ) * Add queryMetrics property to Query interface; Fix bugs and removed unused code in Druids * Fix a bug in TimeBoundaryQuery.getFilter() and remove TimeBoundaryQuery.getDimensionsFilter() * Don't reassign query's queryMetrics if already present in CPUTimeMetricQueryRunner and MetricsEmittingQueryRunner * Add compatibility constructor to BaseQuery * Remove Query.queryMetrics property * Move nullToNoopLimitSpec() method to LimitSpec interface * Rename GroupByQuery.applyLimit() to postProcess(); Fix inconsistencies in GroupByQuery.Builder	2017-04-25 16:32:02 -05:00
kaijianding	336089563d	skip rows which are added after cursor created (#4049 ) * fix can't get dim value via IncrementalIndexStorageAdapter cursor * address the comment * add ut * address ut comments * fix bug and fix ut	2017-04-26 03:26:46 +09:00
Jonathan Wei	723a855ab9	Fix nested groupBys with outer exfns on inner numeric columns (#4182 )	2017-04-21 19:47:46 -07:00
Gian Merlino	2ca7b00346	Update versions to 0.10.1-SNAPSHOT. (#4191 )	2017-04-20 18:12:28 -07:00
Gian Merlino	60caa641f3	Restore backwards compatibility of Query. (#4185 )	2017-04-19 19:47:50 +03:00
Jihoon Son	5b69f2eff2	Make timeout behavior consistent to document (#4134 ) * Make timeout behavior consistent to document * Refactoring BlockingPool and add more methods to QueryContexts * remove unused imports * Addressed comments * Address comments * remove unused method * Make default query timeout configurable * Fix test failure * Change timeout from period to millis	2017-04-19 09:47:53 +09:00
Gian Merlino	b2954d5fea	Better groupBy error messages and docs around resource limits. (#4162 ) * Better groupBy error messages and docs around resource limits. * Fix BufferGrouper test from datasketches. * Further clarify.	2017-04-13 10:38:53 -07:00
Ram iyer	2e9589215e	removing unused var (#4163 )	2017-04-13 04:03:41 +09:00
kaijianding	676af79044	don't do postAgg in TimeseriesQueryQueryToolChest when not necessary (#4155 ) * don't do postAgg in TimeseriesQueryQueryToolChest when not necessary * set postAggs to empty list in TimeseriesQueryQueryToolChest instead of checking finalizing fn * fix ut * fix ut again	2017-04-12 15:46:46 +05:30
Roman Leventov	15f3a94474	Copy closer into Druid codebase (fixes #3652 ) (#4153 )	2017-04-10 09:38:45 +09:00
Roman Leventov	73d9b31664	GenericIndexed minor bug fixes, optimizations and refactoring (#3951 ) * Minor bug fixes in GenericIndexed; Refactor and optimize GenericIndexed; Remove some unnecessary ByteBuffer duplications in some deserialization paths; Add ZeroCopyByteArrayOutputStream * Fixes * Move GenericIndexedWriter.writeLongValueToOutputStream() and writeIntValueToOutputStream() to SerializerUtils * Move constructors * Add GenericIndexedBenchmark * Comments * Typo * Note in Javadoc that IntermediateLongSupplierSerializer, LongColumnSerializer and LongMetricColumnSerializer are thread-unsafe * Use primitive collections in IntermediateLongSupplierSerializer instead of BiMap * Optimize TableLongEncodingWriter * Add checks to SerializerUtils methods * Don't restrict byte order in SerializerUtils.writeLongToOutputStream() and writeIntToOutputStream() * Update GenericIndexedBenchmark * SerializerUtils.writeIntToOutputStream() and writeLongToOutputStream() separate for big-endian and native-endian * Add GenericIndexedBenchmark.indexOf() * More checks in methods in SerializerUtils * Use helperBuffer.arrayOffset() * Optimizations in SerializerUtils	2017-03-27 14:17:31 -05:00
Gian Merlino	dd6c0ab509	Add SQL REGEXP_EXTRACT function; add "index" to "regex" extractionFn. (#4055 ) * Add SQL REGEXP_EXTRACT function; add "index" to "regex" extractionFn. * Fix tests.	2017-03-24 17:38:36 -07:00
Erik Dubbelboer	2cbc4764f8	Comparing dimensions to each other in a filter (#3928 ) Comparing dimensions to each other using a select filter	2017-03-23 18:23:46 -07:00
Roman Leventov	4b5ae31207	QueryMetrics: abstraction layer of query metrics emitting (part of #3798 ) (#3954 ) * QueryMetrics: abstraction layer of query metrics emitting * Minor fixes * QueryMetrics.emit() for bulk emit and improve Javadoc * Fixes * Fix * Javadoc fixes * Typo * Use DefaultObjectMapper * Add tests * Address PR comments * Remove QueryMetrics.userDimensions(); Rename QueryMetric.register() to report() * Dedicated TopNQueryMetricsFactory, GroupByQueryMetricsFactory and TimeseriesQueryMetricsFactory * Typo * More elaborate Javadoc of QueryMetrics * Formatting * Replace QueryMetric enum with lambdas * Add comments and VisibleForTesting annotations	2017-03-23 17:23:59 -07:00
Jonathan Wei	79f1a1d7f0	Allow float parameters for Bound/Selector/In filters on long columns (#4074 ) * Allow float parameters for long filters * Use BigDecimal intermediate form for string->long conversions * PR comments * PR comments	2017-03-23 14:18:05 -07:00
Akash Dwivedi	ff7f90b02d	relocate method in BufferAggregator. (#4071 ) * relocate method in BufferAggregator. * Unused import. * Detailed javadoc. * using Int2ObjectMap. * batch relocate. * Revert batch relocate. * Unused import. * code comments. * code comment.	2017-03-23 13:07:59 -07:00
David Lim	f68ba4128f	Exclude pagingIdentifiers that don't apply to a datasource (#4078 ) * exclude pagingIdentifiers that don't apply to a datasource to support union datasources * code review changes * code review changes	2017-03-22 12:32:27 -07:00
Gian Merlino	1f48198607	Fix some query cache key collisions. (#4094 ) The query caches generally store dimensions and aggregators positionally, so appendCacheablesIgnoringOrder could lead to incorrect results being pulled from the cache.	2017-03-22 11:08:48 -07:00
Gian Merlino	77b6213222	Remove unused Filters.getLongValueMatcher method. (#4086 )	2017-03-21 13:46:07 -06:00
Gian Merlino	ad477cb454	Fix topNs with extractionFns but no aggregators. (#4070 ) The result sets were empty because of an aggs.length > 0 check. I'm not sure if it was there for any good reason, but there didn't seem to be one.	2017-03-20 11:31:30 -07:00
Roman Leventov	84fe91ba0b	Monomorphic processing of TopN queries with 1 and 2 aggregators (key part of #3798 ) (#3889 ) * Monomorphic processing: add HotLoopCallee, CalledFromHotLoop, RuntimeShapeInspector, SpecializationService. Specialize topN queries with 1 or 2 aggregators. Add Cursor.advanceUninterruptibly() and isDoneOrInterrupted() for exception-free query processing. * Use Execs.singleThreaded() * RuntimeShapeInspector to support nullable fields * Make CalledFromHotLoop annotation Inherited * Remove unnecessary conversion of array of ColumnSelectorPluses to list and back to array in CardinalityAggregatorFactory * Close InputStream in SpecializationService * Formatting * Test specialized PooledTopNScanners * Set flags in PooledTopNAlgorithm directly * Fix tests, dependent on CountAggregatorFactory toString() form * Fix * Revert CountAggregatorFactory changes * Implement inspectRuntimeShape() for LongWrappingDimensionSelector and FloatWrappingDimensionSelector * Remove duplicate RoaringBitmap dependency in the extendedset pom.xml * Fix * Treat ByteBuffers specially in StringRuntimeShape * Doc fix * Annotate BufferAggregator.init() with CalledFromHotLoop * Make triggerSpecializationIterationsThreshold an int * Remove SpecializationService.PerPrototypeClassState.of() * Add comments * Limit the amount of specializations that SpecializationService could make * Add default implementation for BufferAggregator.inspectRuntimeShape(), for compatibility with extensions * Use more efficient ConcurrentMap's idioms in SpecializationService	2017-03-17 14:44:36 -05:00
Gian Merlino	3ec1877887	Fix BucketExtractionFn on objects that are strings. (#4072 )	2017-03-16 22:59:11 -07:00
Charles Allen	805d85afda	Allow compilation as Java8 source and target (#3328 ) * Allow compilation as Java8 source and target for everything except API * Remove conditions in tests which assume that we may run with Java 7 * Update easymock to 3.4 * Make Animal Sniffer to check Java 1.8 usage; remove redundant druid-caffeine-cache configuration * Use try-with-resources in LargeColumnSupportedComplexColumnSerializerTest.testSanity() * Remove java7 special for druid-api	2017-03-14 22:23:47 -06:00
Gian Merlino	e5c0dab12c	groupBy v2: Better error message when resources are exhausted. (#4046 ) * groupBy v2: Better error message when resources are exhausted. Fixes #4043. * Fix tests.	2017-03-15 00:37:49 +05:30
Jihoon Son	dfe4bda7fd	add doc (#4030 )	2017-03-10 12:49:20 -08:00
Gian Merlino	a5170666b6	groupBy v2: Always merge queries. (#4023 ) This fixes #4020 because it means the timestamp will always be included for outermost queries. Historicals receiving queries from older brokers will think they're outermost (because CTX_KEY_OUTERMOST isn't set to "false"), so they'll include a timestamp, so the older brokers will be OK.	2017-03-08 12:47:46 -06:00
Gian Merlino	4ca5270e88	Ignore chunkPeriod for groupBy v2, fix chunkPeriod for irregular periods. (#4004 ) * Ignore chunkPeriod for groupBy v2, fix chunkPeriod for irregular periods. Includes two fixes: - groupBy v2 now ignores chunkPeriod, since it wouldn't have helped anyway (its mergeResults returns a lazy sequence) and it generates incorrect results. - Fix chunkPeriod handling for periods of irregular length, like "P1M" or "P1Y". Also includes doc and test fixes: - groupBy v1 was no longer being tested by GroupByQueryRunnerTest since #3953, now it is once again. - chunkPeriod documentation was misleading due to its checkered past. Updated it to be more accurate. * Remove unused import. * Restore buffer size.	2017-03-06 12:27:02 -06:00
Gian Merlino	7b9e6c29cd	Fix float, long dimension indexer object selectors. (#4012 ) Their "convertUnsortedEncodedKeyComponentToActualArrayOrList" methods didn't respect the contract, which says they should return single values (not array/list) if there is only a single value to return. This affects the behavior of ObjectColumnSelectors on realtime segments.	2017-03-06 10:01:30 -08:00
Gian Merlino	337f3870d8	Fix TimeFormatExtractionFn getCacheKey when tz, locale are not provided. (#4007 ) * Fix TimeFormatExtractionFn getCacheKey when tz, locale are not provided. * Remove unused import. * Use defaults in cache key.	2017-03-04 17:41:59 -08:00
praveev	67d0ae3271	Let toDateTime call fall through for Duration Granularity (#4001 ) * Let toDateTime call fall through for Duration Granularity Added test for the same. * Add duration granularity test to GroupByQueryRunnerTest	2017-03-03 13:27:22 -06:00
Himanshu	e7e3c2dc5a	support singleThreaded flag for groupBy-v2 as well (#3992 )	2017-03-03 23:43:06 +05:30
Roman Leventov	81a5f9851f	TmpFileIOPeons to create files under the merging output directory, instead of java.io.tmpdir (#3990 ) * In IndexMerger and IndexMergerV9, create temporary files under the output directory/tmpPeonFiles, instead of java.io.tmpdir * Use FileUtils.forceMkdir() across the codebase and remove some unused code * Fix test * Fix PullDependencies.run() * Unused import	2017-03-02 14:05:12 -08:00
Jonathan Wei	5fb1638534	Add default configuration for select query 'fromNext' parameter (#3986 ) * Add default configuration for select query 'fromNext' parameter * PR comments * Fix PagingSpec config injection * Injection fix for test	2017-03-01 17:05:35 -08:00
Himanshu	8316b4f48f	fix TimeDimExtractionFn.apply() under concurrency (#3984 )	2017-03-01 13:07:12 -08:00
kaijianding	772de66e79	add filenameBase to log when exceed file size limit to indicate which column it is (#3982 )	2017-03-01 13:05:07 -08:00
Gian Merlino	cc20133e70	Checkstyle rule to outlaw tabs. (#3988 ) Tabs are the worst.	2017-02-28 23:52:53 -08:00
Akash Dwivedi	91344cbe57	Enable GenericIndexed V2 for built-in(druid-io managed) complex columns. (#3987 ) * Enable GenericIndexed V2 for complex columns. * SerializerBuilder to use GenericColumnSerializer.	2017-02-28 22:06:54 -08:00
Jonathan Wei	a08660a9ca	Support ingestion of long/float dimensions (#3966 ) * Support ingestion for long/float dimensions * Allow non-arrays for key components in indexing type strategy interfaces * Add numeric index merge test, fixes * Docs for numeric dims at ingestion * Remove unused import * Adjust docs, add aggregate on numeric dims tests * remove unused imports * Throw exception for bitmap method on numerics * Move typed selector creation to DimensionIndexer interface * unused imports * Fix * Remove unused DimensionSpec from indexer methods, check for dims first in inc index storage adapter * Remove spaces	2017-02-28 19:04:41 -08:00
praveev	5ccfdcc48b	Fix testDeadlock timeout delay (#3979 ) * No more singleton. Reduce iterations * Granularities * Fix the delay in the test * Add license header * Remove unused imports * Lot more unused imports from all the rearranging * CR feedback * Move javadoc to constructor	2017-02-28 12:51:41 -06:00
praveev	c3bf40108d	One granularity (#3850 ) * Refactor Segment Granularity * Beginning of one granularity * Copy the fix for custom periods in segment-granularity over here. * Remove the custom serialization for now. * Compilation cleanup * Reformat code * Fixing unit tests * Unify to use a single iterable * Backward compatibility for rolling upgrade * Minor check style. Cosmetic changes. * Rename length and millis to duration * CR feedback * Minor changes.	2017-02-25 01:02:29 -06:00
Jonathan Wei	58b704c3b4	Don't allow '__time' as a GroupBy output field name (#3967 ) * Don't allow '__time' as a GroupBy column field name * Tweak exception message	2017-02-23 14:39:17 -08:00
kaijianding	7ce05d58bc	fix NPE in search query when dimension contains null value (#3968 ) * fix NPE when dimension contains null value in search query * add ut * search with not existed dimension should always return empty result	2017-02-23 08:07:59 -08:00
Gian Merlino	372b84991c	Add virtual columns to timeseries, topN, and groupBy. (#3941 ) * Add virtual columns to timeseries, topN, and groupBy. * Fix GroupByTimeseriesQueryRunnerTest. * Updates from review comments.	2017-02-22 13:16:48 -08:00
Jihoon Son	7200dce112	Atomic merge buffer acquisition for groupBys (#3939 ) * Atomic merge buffer acquisition for groupBys * documentation * documentation * address comments * address comments * fix test failure * Addressed comments - Add InsufficientResourcesException - Renamed GroupByQueryBrokerResource to GroupByQueryResource * addressed comments * Add takeBatch() to BlockingPool	2017-02-22 14:49:37 -06:00
Gian Merlino	985203b634	Finalize fields in postaggs (#3957 ) * initial commits for finalizeFieldAccess #2433 * fix some bugs to run a query * change name of method Queries.verifyAggregations to Queries.prepareAggregations * add Uts * fix Ut failures * rebased to master * address comments and add a Ut for arithmetic post aggregators * rebased to the master * address the comment of injection within arithmetic post aggregator * address comments and introduce decorate() in the PostAggregator interface. * Address comments. 1. Implements getComparator in FinalizingFieldAccessPostAggregator and add Uts for it 2. Some minor changes like renaming a method name. * Fix a code style mismatch. * Rebased to the master	2017-02-21 16:32:14 -08:00
Gian Merlino	a47206eaf8	Ability to filter on virtual columns. (#3942 ) This didn't need much other than having BitmapIndexSelector return null from various methods to trigger cursor based filtering.	2017-02-21 16:03:31 -08:00
Jihoon Son	128274c6f0	Disable caching on brokers for groupBy v2 (#3950 ) * Disable caching on brokers for groupBy v2 * Rename parameter * address comments	2017-02-21 09:49:49 -08:00
Jonathan Wei	bc33b68b51	Use GroupBy V2 as default (#3953 ) * Use GroupBy V2 as default * Remove unused line * Change assert to exception propagation	2017-02-18 07:40:40 -08:00
kaijianding	361d9d9802	fix dynamic schema data can't rollup correctly (#3949 ) * fix dynamic schema data can't rollup correctly * add ut	2017-02-17 15:07:29 -06:00
Akash Dwivedi	797488a677	Removing Integer.MAX column size limit. (#3743 ) * Removing Integer.MAX column size limit. * On demand creation of headerLong, use v2 instead of v3 * Avoid reusing the same object from a previous test. * Avoid reusing the same object from a previous test part#2 * code formatting. * GenericIndexed/Writer code review changes. * GenericIndexed/writer code review requested changes. * checkIndex() to static * native endianess for genericIndexedV2, code review requested changes. * Formatting * Hll fix. * use native endianess during bag size calculation. * Code review requested changes. * IOPeon close() changes. * use different tmp directory path for testing. * Code review requested changes.	2017-02-16 20:09:43 -06:00
Jihoon Son	a459db68b6	Fine grained buffer management for groupby (#3863 ) * Fine-grained buffer management for group by queries * Remove maxQueryCount from GroupByRules * Fix code style * Merge master * Fix compilation failure * Address comments * Address comments - Revert Sequence - Add isInitialized() to Grouper - Initialize the grouper in RowBasedGrouperHelper.Accumulator - Simple refactoring RowBasedGrouperHelper.Accumulator - Add tests for checking the number of used merge buffers - Improve docs * Revert unnecessary changes * change to visible to testing * fix misspelling	2017-02-14 12:55:54 -08:00
Gian Merlino	af67e8904e	PreComputedHyperUniquesSerde: Fix formatting. (#3932 )	2017-02-14 09:32:29 -08:00
DaimonPl	a2875a4d91	pre-computed HLL support for hyperUnique aggregator (#3909 )	2017-02-13 15:26:20 -08:00
Akash Dwivedi	8854ce018e	File.deleteOnExit() (#3923 ) * Less use of File.deleteOnExit() * removed deleteOnExit from most of the tests/benchmarks/iopeon * Made IOpeon closable * Formatting. * Revert DeterminePartitionsJobTest, remove cleanup method from IOPeon	2017-02-13 15:12:14 -08:00
Himanshu	9dfcf0763a	disable javascript execution by default (#3818 )	2017-02-13 15:11:18 -08:00
Pierre	9ab9feced6	Close all aggregators when closing onHeapIncrementalIndex (#3926 ) * Close all aggregators when closing onHeapIncrementalIndex * Aggregators are now handled as Closeables, remove unnecessary mock in test * Fix variable shadowing	2017-02-13 15:01:27 -08:00
Jihoon Son	991e2852da	Add PostAggregators to generator cache keys for top-n queries (#3899 ) * Add PostAggregators to generator cache keys for top-n queries * Add tests for strings * Remove debug comments * Add type keys and list sizes to cache key * Make post aggregators used for sort are considered for cache key generation * Use assertArrayEquals() * Improve findPostAggregatorsForSort() * Address comments * fix test failure * address comments	2017-02-13 12:23:44 -08:00
Parag Jain	33c635aff2	use as() method of base segment in reference counting segment (#3921 )	2017-02-09 20:24:47 -06:00
Jonathan Wei	ca2b04f0fd	Add long/float ColumnSelectorStrategy implementations (#3838 ) * Add long/float ColumnSelectorStrategy implementations * Address PR comments * Add String strategy with internal dictionary to V2 groupby, remove dict from numeric wrapping selectors, more tests * PR comments * Use BaseSingleValueDimensionSelector for long/float wrapping * remove unused import * Address PR comments * PR comments * PR comments * More PR comments * Fix failing calcite histogram subquery tests * ScanQuery test and comment about isInputRaw * Add outputType to extractionDimensionSpec, tweak SQL tests * Fix limit spec optimization for numerics * Add cardinality sanity checks to TopN * Fix import from merge * Add tests for filtered dimension spec outputType * Address PR comments * Allow filtered dimspecs on numerics * More comments	2017-02-08 20:39:29 -08:00
Gian Merlino	97765fdfef	Simplify LikeFilter implementation of getBitmapIndex, estimateSelectivity. (#3910 ) * Simplify LikeFilter implementation of getBitmapIndex, estimateSelectivity. LikeFilter: - Reduce code duplication, and simplify methods, at the cost of incurring an extra box of ImmutableBitmap into a SingletonImmutableList. I think this is fine, since this should be cheap and the code path is not hot (just once per filter). Filters: - Make estimateSelectivity public since it seems intended that they be used by Filter implementations, and Filters from extensions may want to use them too. Removed @VisibleForTesting for the same reason. - Rename one of the estimatePredicateSelectivity overloads to estimateSelectivity, since predicates aren't involved. * Address PR comments. * Remove unused import * Change List to Collection	2017-02-08 13:46:01 -06:00
Gian Merlino	12317fd001	Bump version to 0.10.0-SNAPSHOT. (#3913 )	2017-02-06 17:54:35 -08:00
Jihoon Son	ddd8c9ef97	Add filter selectivity estimation for auto search strategy (#3848 ) * Add filter selectivity estimation for auto search strategy * Addressed comments * Lazy bitmap materialization for bitmap sampling and java docs * Addressed comments. - Fix wrong non-overlap ratio computation and added unit tests. - Change Iterable<Integer> to IntIterable - Remove unnecessary Iterable<Integer> * Addressed comments - Split a long ternary operation into if-else blocks - Add IntListUtils.fromTo() * Fix test failure and add a test for RangeIntList * fix code style * Disabled selectivity estimation for multi-valued dimensions * Address comment	2017-02-06 11:15:03 -08:00
Parag Jain	8a13a85765	Introduce SegmentizerFactory (#3901 ) * Introduce SegmentizerFactory - that knows how to deserialize specific type of segment - Default implementation is MMappedQueryableSegmentizerFactory which creates QueryableIndexSegment - Unit test for the default behavior * review comments	2017-02-06 10:05:12 -08:00
DaimonPl	93b71e265e	Extract HLL related code to separate module (#3900 )	2017-02-03 09:45:11 -08:00
Jonathan Wei	182261f713	Allow configurable temp directory for query processing (#3893 )	2017-02-02 10:22:28 -08:00
Jonathan Wei	e6b95e80aa	Remove deprecated Aggregator/AggregatorFactory methods (#3894 )	2017-02-01 14:43:18 -08:00
Gian Merlino	d3a3b7ba0c	Add virtual column types, holder serde, and safety features. (#3823 ) * Add virtual column types, holder serde, and safety features. Virtual columns: - add long, float, dimension selectors - put cache IDs in VirtualColumnCacheHelper - adjust serde so VirtualColumns can be the holder object for Jackson - add fail-fast validation for cycle detection and duplicates - add expression virtual column in core Storage adapters: - move virtual column hooks before checking base columns, to prevent surprises when a new base column is added that happens to have the same name as a virtual column. * Fix ExtractionDimensionSpecs with virtual dimensions. * Fix unused imports. * CR comments * Merge one more time, with feeling.	2017-01-26 18:15:51 -08:00
Roman Leventov	75d9e5e7a7	DimensionSelector-related bug fixes and optimizations (fixes #3799 , part of #3798 ) (#3858 ) * Add DimensionSelector.idLookup() and nameLookupPossibleInAdvance() to allow better inspection of features DimensionSelectors support, and safer code working with DimensionSelectors in BaseTopNAlgorithm, BaseFilteredDimensionSpec, DimensionSelectorUtils; * Add PredicateFilteringDimensionSelector, to make BaseFilteredDimensionSpec to be able to decorate DimensionSelectors with unknown cardinality; * Add DimensionSelector.makeValueMatcher() (two kinds) for DimensionSelector-side specifics-aware optimization of ValueMatchers; * Optimize getRow() in BaseFilteredDimensionSpec's DimensionSelector, StringDimensionIndexer's DimensionSelector and SingleScanTimeDimSelector; * Use two static singletons, TrueValueMatcher and FalseValueMatcher, instead of BooleanValueMatcher; * Add NullStringObjectColumnSelector singleton and use it in MapVirtualColumn * Rename DimensionSelectorUtils.makeNonDictionaryEncodedIndexedIntsBasedValueMatcher to makeNonDictionaryEncodedRowBasedValueMatcher * Make ArrayBasedIndexedInts constructor private, replace its usages with of() static factory method * Cache baseIdLookup in ForwardingFilteredDimensionSelector * Fix a bug in DimensionSelectorUtils.makeRowBasedValueMatcher(selector, predicate, matchNull) * Employ precomputed BitSet optimization in DimensionSelector.makeValueMatcher(value, matchNull) when lookupId() is not available, but cardinality is known and lookupName() is available * Doc fixes * Addressed comments * Fix * Fix * Adjust javadoc of DimensionSelector.nameLookupPossibleInAdvance() for SingleScanTimeDimSelector * throw UnsupportedOperationException instead of IAE in BaseTopNAlgorithm	2017-01-25 15:28:27 -08:00
Gian Merlino	3136dfa421	LikeFilter: Read value lazily when doing a prefix-based match. (#3880 ) This speeds up cases where we don't actually need to read the value, such as "LIKE 'foo%'".	2017-01-25 13:22:07 -08:00
Roman Leventov	af93a8d189	Sequences refactorings and removed unused code (part of #3798 ) (#3693 ) * Removing unused code from io.druid.java.util.common.guava package; fix #3563 (more consistent and paranoiac resource handling in Sequences subsystem); Add Sequences.wrap() for DRY in MetricsEmittingQueryRunner, CPUTimeMetricQueryRunner and SpecificSegmentQueryRunner; Catch MissingSegmentsException in SpecificSegmentQueryRunner's yielder.next() method (follow up on #3617) * Make Sequences.withEffect() execute the effect if the wrapped sequence throws exception from close() * Fix strange code in MetricsEmittingQueryRunner * Add comment on why YieldingSequenceBase is used in Sequences.withEffect() * Use Closer in OrderedMergeSequence and MergeSequence to close multiple yielders	2017-01-19 20:07:43 -08:00
kaijianding	33ae9dd485	streaming version of select query (#3307 ) * streaming version of select query * use columns instead of dimensions and metrics;prepare for valueVector;remove granularity * respect query limit within historical * use constant * fix thread name corrupted bug when using jetty qtp thread rather than processing thread while working with SpecificSegmentQueryRunner * add some test for scan query * add scan query document * fix merge conflicts * add compactedList resultFormat, this format is better for json ser/der * respect query timeout * respect query limit on broker * use static consts and remove unused code	2017-01-19 16:09:53 -06:00
Slim	558dc365a4	renaming classes to be run by mvn and comment non operational tests (#3847 )	2017-01-17 11:59:12 -08:00
Akash Dwivedi	dd0c4e2ead	Migrating extendedset from Metamarkets. (#3694 ) * Migrating extendedset from Metamarkets. * Notice change * More details in NOTICE * NOTICE formatting. * suppress header checkstyle for extendedset.	2017-01-17 10:10:27 -08:00
Gian Merlino	e86859b228	SQL support for nested groupBys. (#3806 ) * SQL support for nested groupBys. Allows, for example, doing exact count distinct by writing: SELECT COUNT(*) FROM (SELECT DISTINCT col FROM druid.foo) Contrast with approximate count distinct, which is: SELECT COUNT(DISTINCT col) FROM druid.foo Add deeply-nested groupBy docs, tests, and maxQueryCount config. * Extract magic constants into statics. * Rework rules to put preconditions in the "matches" method.	2017-01-11 18:32:53 -08:00
Jihoon Son	d80bec83cc	Enable auto license checking (#3836 ) * Enable license checking * Clean duplicated license headers	2017-01-10 18:13:47 -08:00
Jihoon Son	c099977a5b	Add an option to SearchQuery to choose a search query execution strategy (#3792 ) * Add an option to SearchQuery to choose a search query execution strategy. Supported strategies are 1) Index-only query execution 2) Cursor-based scan 3) Auto: choose an efficient strategy for a given query * Add SearchStrategy and SearchQueryExecutor * Address comments * Rename strategies and set UseIndexesStrategy as the default strategy * Add a cost-based planner for auto strategy * Add document * Fix code style * apply code style * apply comments	2017-01-10 18:04:20 -08:00
Gian Merlino	3c63cff57a	Remove makeMathExpressionSelector from ColumnSelectorFactory. (#3815 ) * Remove makeMathExpressionSelector from ColumnSelectorFactory. * Add @Nullable annotations in places, fix Number.class check. * Break up createBindings, add tests. * Add null check.	2017-01-05 18:06:38 -08:00
Gian Merlino	220ca7ebb6	Ignore DimFilterHavingSpec testConcurrentUsage. (#3814 )	2017-01-03 17:43:58 -07:00
Gian Merlino	d8702ebece	Filters: Use ColumnSelectorFactory directly for building row-based matchers. (#3797 ) * Filters: Use ColumnSelectorFactory directly for building row-based matchers. * Adjustments based on code review. - BoundDimFilter: fewer volatiles, rename matchesAnything to !matchesNothing. - HavingSpecs: Clarify that they are not thread-safe, and make DimFilterHavingSpec not thread safe. - Renamed rowType to rowSignature. - Added specializations for time-based vs non-time-based DimensionSelector in RBCSF. - Added convenience method DimensionHandlerUtils.createColumnSelectorPlus. - Added singleton ZeroIndexedInts. - Added test cases for DimFilterHavingSpec. * Make ValueMatcherColumnSelectorStrategy actually use the associated selector. * Add RangeIndexedInts. * DimFilterHavingSpec: Fix concurrent usage guard on jdk7. * Add assertion to ZeroIndexedInts. * Rename no-longer-volatile members.	2017-01-03 14:30:22 -08:00
Roman Leventov	33800122ad	Don't return leaked Objects back to StupidPool, because this is dangerous. Reuse Cleaners in StupidPool. Make StupidPools named. Add StupidPool.leakedObjectCount(). Minor fixes (#3631 )	2016-12-26 00:35:35 -06:00
Jonathan Wei	0e5bd8b4d4	Add dimension type-based interface for query processing (#3570 ) * Add dimension type-based interface for query processing * PR comment changes * Address PR comments * Use getters for QueryDimensionInfo * Split DimensionQueryHelper into base interface and query-specific interfaces * Treat empty rows as nulls in v2 groupby * Reduce boxing in SearchQueryRunner * Add GroupBy empty row handling to MultiValuedDimensionTest * Address PR comments * PR comments and refactoring * More PR comments * PR comments	2016-12-21 20:11:37 -07:00
Jonathan Wei	2bfcc8a592	First and Last Aggregator (#3566 ) * add first and last aggregator * add test and fix * moving around * separate aggregator valueType * address PR comment * add finalize inner query and adjust v1 inner indexing * better test and fixes * java-util import fixes * PR comments * Add first/last aggs to ITWikipediaQueryTest	2016-12-16 15:26:40 -08:00
Himanshu	ed322a4beb	remove size from default analysisTypes list for segmentMetadata query (#3773 )	2016-12-13 18:01:21 -08:00
Jonathan Wei	880a021a7a	Fix missed travis failures from PR 3567 and 2798 (#3761 ) * Fix checkstyle failures from PR 3567 * Fix GranularityPathSpecTest compile failure	2016-12-07 19:07:31 -08:00
Erik Dubbelboer	bb9e35e1af	Add Greatest and Least post aggregations (#3567 )	2016-12-07 17:58:23 -08:00
Roman Leventov	dc8f814acc	Optimize Iterator<ImmutableBitmap> implementation inside Filters.matchPredicate() so that it doesn't emit empty bitmap in the end of the iteration, and make it to follow Iterator contract, that is throw NoSuchElementException from next() if there are no more bitmaps (#3754 )	2016-12-07 12:54:09 -08:00
Jonathan Wei	d1896a2d62	Disable flush after every ObjectMapper write (#3748 )	2016-12-06 16:45:23 -08:00
Gian Merlino	b1bac9f2d3	groupBy v2: Ignore timestamp completely when granularity = all, except for the final merge. (#3740 ) * GroupByBenchmark: Add serde, spilling, all-gran benchmarks. Also use more iterations. * groupBy v2: Ignore timestamp completely when granularity = all, except for the final merge. Specifically: - Remove timestamp from RowBasedKey when not needed - Set timestamp to null in MapBasedRows that are not part of the final merge.	2016-12-06 16:17:32 -08:00
Himanshu	45da7e48f1	groupBy sort results by (dimensions,timestamp) instead of (timestamp,dimension) (#3672 ) * sortByDimsFirst flag for groupBy query * Remove need for KeyType in Grouper<KeyType> to be Comparable<KeyType> * fix review comments * fix review comments regarding removing code duplication of dim/time comparison * move comparator for KeyType object to KeySerdeFactory so that creation of comparator does not need KeySerde * remove unnecessary system.out.println * make access static var NATURAL_NULLS_FIRST directly * further review comments addressing	2016-12-06 09:48:56 -08:00
Navis Ryu	c74d267f50	Support virtual column for select query (#2511 ) * Support virtual column for select query * Addressed comments	2016-12-05 15:14:35 -08:00
Gian Merlino	b64e06704e	Fix SingleScanTimeDimSelector when an extractionFn returns null for a timestamp. (#3732 )	2016-12-02 15:27:54 -08:00
Gian Merlino	f4cc8c2b2f	IndexBuilder: Close IncrementalIndex when done. (#3734 )	2016-12-02 16:56:34 -06:00
Gian Merlino	353fee79dd	Add "asMillis" option to "timeFormat" extractionFn. (#3733 ) This is useful for chaining extractionFns that all want to treat time as millis, such as having a javascript extractionFn after a timeFormat.	2016-12-02 13:45:16 -08:00
Gian Merlino	102375d9bb	Add "strlen" extractionFn. (#3731 )	2016-12-02 12:08:51 -08:00
Gian Merlino	4c5d10f8a3	Add DimFilterHavingSpec. (#3727 ) * Add DimFilterHavingSpec. * Add test for DimFilterHavingSpec with extractionFns.	2016-12-02 10:04:30 -08:00
Gian Merlino	68735829ca	Add, fix equals, hashCode, toString on various classes. (#3723 ) * TimeFormatExtractionFn: Add toString. * InDimFilter: Add toString, allow accepting any Collection of values. * DimensionTopNMetricSpec: Fix toString. * InvertedTopNMetricSpec: Add toString. * HyperUniqueFinalizingPostAggregator: Add equals, hashCode, toString.	2016-11-30 19:00:14 -08:00
Gian Merlino	477e0cab7c	Filter fixes and tests (#3724 ) * More robust Filter tests. All Filter tests now exercise the CNF and post-filtering features. * Fixes to RowBasedValueMatcherFactory and to bound filters. - Change Comparables to Strings in ValueMatcher related code. - Break out RowBasedValueMatcherFactory, fix a variety of issues around nulls, and add tests. - Fix bound filters on long columns with non-numeric bounds, and add tests.	2016-11-30 16:10:05 -08:00
Gian Merlino	6922d684bf	GroupBy: Validation of output names, and a gross hack for v1 subqueries. (#3686 ) v1 subqueries try to use aggregators to "transfer" values from the inner results to an incremental index, but aggregators can't transfer all kinds of values (strings are a common one). This is a workaround that selectively ignores what the outer aggregators ask for and instead assumes that we know best. These are in the same commit because the name validation changed the kinds of errors that were thrown by v1 subqueries.	2016-11-29 12:35:03 +05:30
Roman Leventov	c070b4a816	Fix concurrency defects, remove unnecessary volatiles (#3701 )	2016-11-22 16:42:28 -08:00
Roman Leventov	7b56cec3b9	Fix resource leaks (#3702 )	2016-11-18 21:21:36 +05:30
Gian Merlino	7e80d1045a	Exercise v2 engine in the groupBy aggregator and multi-value dimension tests. (#3698 ) This also involved some other test changes: - Added a factory.mergeRunners step to AggregationTestHelper's groupBy chain, since the v2 engine does merging there. - Changed test byteBuffer pools from on-heap to off-heap to work around https://github.com/DataSketches/sketches-core/pull/116 for datasketches tests.	2016-11-16 20:02:25 -08:00
Keuntae Park	094f5b851b	Support Min/Max for Timestamp (#3299 ) * Min/Max aggregator for Timestamp * remove unused imports and method * rebase and zip the test data * add docs	2016-11-14 23:00:21 -08:00
Gian Merlino	9ad34a3f03	groupBy v1: Force all dimensions to strings. (#3685 ) Fixes #3683.	2016-11-14 09:30:18 -08:00
Jisoo Kim	7c0f462fbc	fix bug in StringDimensionHandler and add a cli tool for validating segments (#3666 )	2016-11-11 18:46:25 -08:00
Roman Leventov	fbbb55f867	Update emitter dependency to 0.4.0 and emit "version" dimension for all druid metrics (#3679 ) * Update emitter dependency to 0.4.0 and emit "version" dimension for all druid metrics, not only query metrics * Remove unused imports * Use empty string instead of "testing-version" as a version placeholder	2016-11-11 17:17:27 -06:00
Akash Dwivedi	3e408497b3	Migrating bytebuffercollections from Metamarkets. (#3647 ) * Migrating bytebuffercollections from Metamarkets. * resolving code conflicts and removing <p> from bytebuffer-collections.	2016-11-11 10:51:07 -08:00
Gian Merlino	fd5451486c	Short-circuiting AndFilter. (#3676 ) If any of the bitmaps are empty, the result will be false.	2016-11-11 10:14:56 -08:00
Gian Merlino	657e4512d2	Checkstyle checks for AvoidStaticImport, UnusedImports. (#3660 ) Excludes tests from AvoidStaticImport, since those are used often there and I didn't want to make this changeset too large. Production code use was minimal and I switched those to non-static imports.	2016-11-05 11:34:36 -07:00
Gian Merlino	4cbebd0931	SubstringDimExtractionFn, BoundDimFilter: Implement typical style toString. (#3658 )	2016-11-04 13:31:47 -07:00
Gian Merlino	600bbd4a17	BucketExtractionFn: Implement hashCode, fix toString. (#3656 )	2016-11-04 11:24:02 -07:00
Gian Merlino	8b3c86f41f	Fix FilteredAggregatorFactory toString formatting. (#3657 )	2016-11-04 11:23:55 -07:00
Gian Merlino	2c504b6258	Add "like" filter. (#3642 ) * Add "like" filter. * Addressed some PR comments. * Slight simplifications to LikeFilter. * Additional simplifications. * Fix comment in LikeFilter. * Clarify comment in LikeFilter. * Simplify LikeMatcher a bit. * No use going through the optimized path if prefix is empty. * Add more tests.	2016-11-04 23:25:03 +05:30
Navis Ryu	b99e14e732	Support configuration for handling multi-valued dimension (#2541 ) * Support configuration for handling multi-valued dimension * Addressed comments * use MultiValueHandling.ofDefault() for missing policy	2016-11-03 22:38:54 -06:00
Navis Ryu	e10def32f2	Support string type in math expression (#2836 ) * Support string type in math expression addressed comments addressed comments Addressed comments * Updated math function document * Addressed comments	2016-11-02 21:10:48 -06:00
kaijianding	2961406b90	fix zero period in PeriodGranularity causing gran.iterable(start, end) infinite loop (#3644 )	2016-11-02 15:40:07 +05:30
Roman Leventov	4b0d6cf789	Fix resource leaks (ComplexColumn and GenericColumn) (#3629 ) * Remove unused ComplexColumnImpl class * Remove throws IOException from close() in GenericColumn, ComplexColumn, IndexedFloats and IndexedLongs * Use concise try-with-resources syntax in several places * Fix resource leaks (ComplexColumn and GenericColumn) in SegmentAnalyzer, SearchQueryRunner, QueryableIndexIndexableAdapter and QueryableIndexStorageAdapter * Use Closer in Iterable, returned from QueryableIndexIndexableAdapter.getRows(), in order to try to close everything even if closing some parts threw exceptions	2016-11-02 09:23:52 +05:30
Gian Merlino	45940d6e40	Math expressions support for missing columns. (#3630 ) Also add SchemaEvolutionTest to help test this kind of thing. Fixes #3627 and includes test for #3625.	2016-11-01 09:40:25 -07:00
Gian Merlino	89d9c61894	Deprecate Aggregator.getName and AggregatorFactory.getAggregatorStartValue. (#3572 )	2016-10-31 15:24:30 -07:00
Navis Ryu	3fca3be9ea	SpecificSegmentQueryRunner misses missing segments from toYielder() (#3617 )	2016-10-30 11:47:29 -07:00
Himanshu	23a8e22836	fix SketchMergeAggregatorFactory.finalizeResults, comparator and more UTs for timeseries, topN (#3613 )	2016-10-28 15:48:33 -07:00
Navis Ryu	898c1c21af	More best-effort parse long (#3603 ) * More best-effort parse long * addressed comments	2016-10-25 10:31:51 -07:00
Akash Dwivedi	4b3bd8bd63	Migrating java-util from Metamarkets. (#3585 ) * Migrating java-util from Metamarkets. * checkstyle and updated license on java-util files. * Removed unused imports from whole project. * cherry pick metamx/java-util@826021f. * Copyright changes on java-util pom, address review comments.	2016-10-21 14:57:07 -07:00
Navis Ryu	8b7ff4409a	Math expressional parameters for aggregator (#2783 ) * Supports expression-paramed aggregator (squashed and rebased on master) also includes math post aggregator (was #2820) * Addressed comments * addressed comments	2016-10-19 13:58:35 -05:00
Roman Leventov	b113a34355	In CPUTimeMetricQueryRunner, account CPU consumed in baseSequence.toYielder() (#3587 )	2016-10-18 09:06:42 -05:00
Charles Allen	2c5c8198db	Make query/cpu/time still report on error (#3535 )	2016-10-18 08:26:21 -05:00
Roman Leventov	9611358f0a	Small topn scan improvements (#3526 ) * Remove unused numProcessed param from PooledTopNAlgorithm.aggregateDimValue() * Replace AtomicInteger with simple int in PooledTopNAlgorithm.scanAndAggregate() and aggregateDimValue() * Remove unused import	2016-10-17 10:36:19 -07:00
Gian Merlino	285516bede	Workaround non-thread-safe use of HLL aggregators. (#3578 ) Despite the non-thread-safety of HyperLogLogCollector, it is actually currently used by multiple threads during realtime indexing. HyperUniquesAggregator's "aggregate" and "get" methods can be called simultaneously by OnheapIncrementalIndex, since its "doAggregate" and "getMetricObjectValue" methods are not synchronized. This means that the optimization of HyperLogLogCollector.fold in #3314 (saving and restoring position rather than duplicating the storage buffer of the right-hand side) could cause corruption in the face of concurrent writes. This patch works around the issue by duplicating the storage buffer in "get" before returning a collector. The returned collector still shares data with the original one, but the situation is no worse than before #3314. In the future we may want to consider making a thread safe version of HLLC that avoids these kinds of problems in realtime indexing. But for now I thought it was best to do a small change that restored the old behavior.	2016-10-17 09:39:12 -07:00
Roman Leventov	5dc95389f7	Add Checkstyle framework (#3551 ) * Add Checkstyle framework * Avoid star import * Need braces for control flow statements * Redundant imports * Add NewLineAtEndOfFile check	2016-10-13 13:37:47 -07:00
Roman Leventov	85ac8eff90	Improve performance of IndexMergerV9 (#3440 ) * Improve performance of StringDimensionMergerV9 and StringDimensionMergerLegacy by avoiding primitive int boxing by using IntIterator in IndexedInts instead of Iterator<Integer>; Extract some common logic for V9 and Legacy mergers; Minor improvements to resource handling in StringDimensionMergerV9 * Don't mask index in MergeIntIterator.makeQueueElement() * DRY conversion RoaringBitmap's IntIterator to fastutil's IntIterator * Do implement skip(n) in IntIterators extending AbstractIntIterator because original implementation is not reliable * Use Test(expected=Exception.class) instead of try { } catch (Exception e) { /* ignore */ }	2016-10-13 08:28:46 -07:00
Charles Allen	76e77cb610	Make segment creation guava 14 friendly (#3520 )	2016-10-05 15:25:03 -07:00
Gian Merlino	40f2fe7893	Bump versions to 0.9.3-SNAPSHOT (#3524 )	2016-09-29 13:53:32 -07:00
Charles Allen	654e1db309	Add simple test to FunctionalExtractionTest (#3522 )	2016-09-28 23:45:15 -07:00
Gian Merlino	d5a8a35fec	groupBy: GroupByRowProcessor fixes, invert subquery context overrides. (#3502 ) - Fix GroupByRowProcessor config overrides - Fix GroupByRowProcessor resource limit checking - Invert subquery context overrides such that for the subquery, its own keys override keys from the outer query, not the other way around. The last bit is necessary for the test to work, and seems like a better way to do it anyway.	2016-09-23 14:41:09 -07:00
Gian Merlino	7195be32d8	groupBy v2: Fix dangling references. (#3500 ) Acquiring references in the processing task prevents dangling references caused by canceled processing tasks.	2016-09-24 01:59:11 +05:30
Gian Merlino	f8d71fc602	groupBy: Fix maxMergingDictionarySize config. (#3488 )	2016-09-22 10:02:33 -07:00
Gian Merlino	c87ecea975	Fix ListFilteredDimensionSpec blacklisting on non-present values. (#3487 )	2016-09-22 09:12:02 -07:00
Navis Ryu	49c0fe0e8b	Show candidate hosts for the given query (#2282 ) * Show candidate hosts for the given query * Added test cases & minor changes to address comments * Changed path-param to query-param for intervals/numCandidates	2016-09-22 08:32:38 -07:00
Keuntae Park	54ec4dd584	Support renaming of outputName for cached select and search query results (#3395 ) * support renaming of outputName for cached select and search queries * rebase and resolve conflicts * rollback CacheStrategy interface change * updated based on review comments	2016-09-20 08:19:14 -07:00
Charles Allen	95e08b38ea	[QTL] Reduced Locking Lookups (#3071 ) * Lockless lookups * Fix compile problem * Make stack trace throw instead * Remove non-germane change * * Add better naming to cache keys. Makes logging nicer * Fix #3459 * Move start/stop lock to non-interruptable for readability purposes	2016-09-16 11:54:23 -07:00
Jonathan Wei	df766b2bbd	Add dimension handling interface for ingestion and segment creation (#3217 ) * Add dimension handling interface for ingestion and segment creation * update javadocs for DimensionHandler/DimensionIndexer * Move IndexIO row validation into DimensionHandler * Fix null column skipping in mergerV9 * Add deprecation note for 'numeric_dims' filename pattern in IndexIO v8->v9 conversion * Fix java7 test failure	2016-09-12 12:54:02 -07:00
Gian Merlino	d108461838	groupBy v2: Parallel disk spilling. (#3433 ) In ConcurrentGrouper, when it becomes clear that disk spilling is necessary, switch from hash-based partitioning to thread-based partitioning. This stops processing threads from blocking each other while spilling is occurring.	2016-09-09 16:49:58 -06:00
Gian Merlino	1e3f94237e	groupBy v2: Configurable load factor. (#3437 ) Also change defaults: - bufferGrouperMaxLoadFactor from 0.75 to 0.7. - maxMergingDictionarySize to 100MB from 25MB, should be more appropriate for most heaps.	2016-09-07 14:14:59 -05:00
Roman Leventov	4f0bcdce36	Eager file unmapping in IndexIO, IndexMerger and IndexMergerV9 (#3422 ) * Eager file unmapping in IndexIO, IndexMerger and IndexMergerV9. The exact purpose for this change is to allow running IndexMergeBenchmark in Windows, however should also be universally 'better' than non-deterministic unmapping, done when MappedByteBuffers are garbage-collected (BACKEND-312) * Use Closer with a proper pattern in IndexIO, IndexMerger and IndexMergerV9 * Unmap file in IndexMergerV9.makeInvertedIndexes() using try-with-resources * Reformat IndexIO	2016-09-07 10:43:47 -07:00
Gian Merlino	8d2ae144a8	groupBy: Short-circuit identity preCompute manipulators. (#3434 )	2016-09-06 22:28:44 -06:00
Gian Merlino	1d07964987	LimitedTemporaryStorage: Fix perf bug. (#3432 ) FilterOutputStream has an inefficient implementation of write(byte[], int, int). So let's extend OutputStream directly and use efficient implementations of all methods.	2016-09-06 15:39:36 -07:00
Gian Merlino	8ed1894488	groupBy: Omit timestamp from merge key when granularity = all. (#3416 ) Fixes #3412.	2016-09-01 09:02:54 -07:00
Gian Merlino	6d25c5e053	Avoid materializing all groupBy results with order + limit. (#3410 ) The old TopNFunction code did Sequences.toList on the input sequence before using a priority queue to find the top N items. Now, the priority queue is used in an accumulator, so there is no need to fully materialize the results. Also removed equals/hashCode from the limitFn and remove limitFn from the GroupByQuery's hashCode, since that wasn't necessary and the implementation of hashCode wasn't correct anyway.	2016-08-31 14:08:07 -07:00
Gian Merlino	1268e2902c	Add groupBy test for multiple multi-value dimensions. (#3415 )	2016-08-31 11:21:10 -07:00
Gian Merlino	e9050c2b4c	TimeFormatExtractionFn: Allow null formats (equivalent to ISO8601) and granular bucketing. (#3411 )	2016-08-31 20:58:53 +05:30
Keuntae Park	0076b5fc1a	Interval bug fix for search query (#2903 ) * support query granularity and interval for search query * skip unnecessary bitmap calculation when query interval contains the whole data interval of the given segments. * use binary search to find start and end index for the given interval * fix based on comment * bug fix based on the review comments and add unit tests	2016-08-31 20:52:44 +05:30
Dave Li	c4e8440c22	Adds long compression methods (#3148 ) * add read * update deprecated guava calls * add write and vsizeserde * add benchmark * separate encoding and compression * add header and reformat * update doc * address PR comment * fix buffer order * generate benchmark files * separate encoding strategy and format * fix benchmark * modify supplier write to channel * add float NONE handling * address PR comment * address PR comment 2	2016-08-30 16:17:46 -07:00
Jonathan Wei	4e91330a17	Use DimensionSpec in CardinalityAggregatorFactory (#3406 ) * Use DimensionSpec in CardinalityAggregatorFactory * Address PR comments * Fix requiredFields()	2016-08-30 15:54:02 -07:00
Gian Merlino	b11e9544ea	GroupBy v2: Improve hash code distribution. (#3407 ) Without this transformation, distribution of hash % X is poor in general. It is catastrophically poor when X is a multiple of 31 (many slots would be empty).	2016-08-30 12:09:08 +05:30
kaijianding	f037dfcaa4	fix missing segments duplicate retried (#3398 )	2016-08-29 23:46:21 +05:30
jaehong choi	2e0f253c32	introducing lists of existing columns in the fields of select queries' output (#2491 ) * introducing lists of existing columns in the fields of select queries' output * rebase master * address the comment. add test code for select query caching * change the cache code in SelectQueryQueryToolChest to 0x16	2016-08-25 21:37:53 +05:30
rajk-tetration	362b9266f8	Adding filters for TimeBoundary on backend (#3168 ) * Adding filters for TimeBoundary on backend Signed-off-by: Balachandar Kesavan <raj.ksvn@gmail.com> * updating TimeBoundaryQuery constructor in QueryHostFinderTest * add filter helpers * update filterSegments + test * Conditional filterSegment depending on whether a filter exists * Style changes * Trigger rebuild * Adding documentation for timeboundaryquery filtering * added filter serialization to timeboundaryquery cache * code style changes	2016-08-15 10:25:24 -07:00
Gian Merlino	e1b0b7de3e	IndexBuilder: Allow replacing rows, customizable maxRows. (#3359 )	2016-08-12 15:22:45 -07:00
Jonathan Wei	454587857c	Make StringComparator deserialization case-insensitive (#3356 )	2016-08-11 18:00:11 -07:00
Himanshu	043562914d	Update IncrementalIndex.getMetricType() to return type name stored by ComplexMetricsSerde instead of AggregatorFactory.getTypeName() (#3341 )	2016-08-10 11:03:44 -07:00
Gian Merlino	1eb7a7e882	Restore optimizations in BoundFilter. (#3343 )	2016-08-10 08:53:17 -07:00
Gian Merlino	a2bcd97512	IncrementalIndex: Fix multi-value dimensions returned from iterators. (#3344 ) They had arrays as values, which MapBasedRow doesn't understand and toStrings rather than converting to lists.	2016-08-10 08:47:29 -07:00
Jonathan Wei	890e3bdd3f	More informative query unit test names (#3342 )	2016-08-09 22:24:48 -07:00
Gian Merlino	8899affe48	Introduce standardized "Resource limit exceeded" error. (#3338 ) Fixes #3336.	2016-08-09 10:50:56 -07:00
Gian Merlino	21bce96c4c	More useful query errors. (#3335 ) Follow-up to #1773, which meant to add more useful query errors but did not actually do so. Since that patch, any error other than interrupt/cancel/timeout was reported as `{"error":"Unknown exception"}`. With this patch, the error fields are: - error, one of the specific strings "Query interrupted", "Query timeout", "Query cancelled", or "Unknown exception" (same behavior as before). - errorMessage, the message of the topmost non-QueryInterruptedException in the causality chain. - errorClass, the class of the topmost non-QueryInterruptedException in the causality chain. - host, the host that failed the query.	2016-08-09 16:14:52 +08:00
Gian Merlino	1aae5bd67d	Nicer handling for cancelled groupBy v2 queries. (#3330 ) 1. Wrap temporaryStorage in a resource holder, to avoid spurious "Closed" errors from already-running processing tasks. 2. Exit early from the merging accumulator if the query is cancelled.	2016-08-05 14:48:06 -07:00
Jonathan Wei	decefb7477	Add time interval dim filter and retention analysis example (#3315 ) * Add time interval dim filter and retention analysis example * Use closed-open matching for intervals, update cache key generation * Fix time filtering tests for interval boundary change	2016-08-05 07:25:04 -07:00
Navis Ryu	5b3f0ccb1f	Support variance and standard deviation (#2525 ) * Support variance and standard deviation * addressed comments	2016-08-04 17:32:58 -07:00
Gian Merlino	9437a7a313	HLL: Avoid some allocations when possible. (#3314 ) - HLLC.fold avoids duplicating the other buffer by saving and restoring its position. - HLLC.makeCollector(buffer) no longer duplicates incoming BBs. - Updated call sites where appropriate to duplicate BBs passed to HLLC.	2016-08-03 18:08:52 -07:00
Gian Merlino	a4b95af839	Fix grouper closing in GroupByMergingQueryRunnerV2. (#3316 ) The grouperHolder should be closed on failure, not the grouper.	2016-08-02 21:02:30 -07:00
Gian Merlino	0299ac73b8	Fix FilteredAggregators at ingestion time and in groupBy v2 nested queries. (#3312 ) The common theme between the two is they both create "fake" DimensionSelectors that work on top of Rows. They both do it because there isn't really any dictionary for the underlying Rows, they're just a stream of data. The fix for both is to allow a DimensionSelector to tell callers that it has no dictionary by returning CARDINALITY_UNKNOWN from getValueCardinality. The callers, in turn, can avoid using it in ways that assume it has a dictionary. Fixes #3311.	2016-08-02 17:39:40 -07:00
Gian Merlino	ae3e0015b6	Fix ClassCastException in nested v2 groupBys with timeouts. (#3310 ) Add tests for the CCE and for a bunch of other groupBy stuff. Also avoids setting the interrupted flag when InterruptedExceptions happen, since this might interfere with resource closing, no other query does it, and is probably pointless anyway since the thread is likely to be a jetty thread that we don't actually want to set an interrupt flag on. Also fixes toString on OrderByColumnSpec.	2016-08-02 16:02:44 -06:00
kaijianding	50d52a24fc	ability to not rollup at index time, make pre aggregation an option (#3020 ) * ability to not rollup at index time, make pre aggregation an option * rename getRowIndexForRollup to getPriorIndex * fix doc misspelling * test query using no-rollup indexes * fix benchmark fail due to jmh bug	2016-08-02 11:13:05 -07:00
Jonathan Wei	0bdaaa224b	Use Long.compare for NumericComparator when possible (#3309 )	2016-08-01 20:36:56 -07:00
Dave Li	bc20658239	groupBy nested query using v2 strategy (#3269 ) * changed v2 nested query strategy * add test for #3239 * update for new ValueMatcher interface and add benchmarks * enable time filtering * address PR comments * add failing test for outer filter aggregator * add helper class for sharing code * update nested groupby doc * move temporary storage instantiation * address PR comment * address PR comment 2	2016-08-01 18:30:39 -07:00
Jonathan Wei	a6105cbb86	Add numeric StringComparator (#3270 ) * Add numeric StringComparator * Only use direct long comparison for numeric ordering in BoundFilter, add time filtering benchmark query * Address PR comments, add multithreaded BoundDimFilter test * Add comment on strlen tie handling * Add timeseries interval filter benchmark * Adjust docs * Use jackson for StringComparator, address PR comments * Add new TopNMetricSpec and SearchSortSpec with tests (WIP) * More TopNMetricSpec and SearchSortSpec tests * Fix NewSearchSortSpec serde * Update docs for new DimensionTopNMetricSpec * Delete NumericDimensionTopNMetricSpec * Delete old SearchSortSpec * Rename NewSearchSortSpec to SearchSortSpec * Add TopN numeric comparator benchmark, address PR comments * Refactor OrderByColumnSpec * Add null checks to NumericComparator and String->BigDecimal conversion function * Add more OrderByColumnSpec serde tests	2016-07-29 15:44:16 -07:00
Navis Ryu	884017d981	"all" type search query spec (#3300 ) * "all" type search query spec * addressed comments * added unit test	2016-07-28 18:16:15 -07:00
Gian Merlino	2553997200	Associate groupBy v2 resources with the Sequence lifecycle. (#3296 ) This fixes a potential issue where groupBy resources could be allocated to create a Sequence, but then the Sequence is never used, and thus the resources are never freed. Also simplifies how groupBy handles config overrides (this made the new unit test easier to write).	2016-07-27 18:44:19 -07:00
Gian Merlino	9b5523add3	Reference counting, better error handling for resources in groupBy v2. (#3268 ) Refcounting prevents releasing the merge buffer, or closing the concurrent grouper, before the processing threads have all finished. The better error handling prevents an avalanche of per-runner exceptions when grouping resources are exhausted, by grouping those all up into a single merged exception.	2016-07-27 01:59:02 +05:30
Erik Dubbelboer	76fabcfdb2	Fix #2782 , Unit test failed for DruidProcessingConfigTest.testDeserialization (#3231 ) On systems with only one processor this test fails.	2016-07-25 15:51:09 -07:00
kaijianding	3dc2974894	Add timestampSpec to metadata.drd and SegmentMetadataQuery (#3227 ) * save TimestampSpec in metadata.drd * add timestampSpec info in SegmentMetadataQuery	2016-07-25 15:45:30 -07:00
Jonathan Wei	a42ccb6d19	Support filtering on long columns (including __time) (#3180 ) * Support filtering on __time column * Rename DruidPredicate * Add docs for ValueMatcherFactory, add comment on getColumnCapabilities * Combine ValueMatcherFactory predicate methods to accept DruidCompositePredicate * Address PR comments (support filter on all long columns) * Use predicate factory instead of composite predicate * Address PR comments * Lazily initialize long handling in selector/in filter * Move long value parsing from InFilter to InDimFilter, make long value parsing thread-safe * Add multithreaded selector/in filter test * Fix non-final lock object in SelectorDimFilter	2016-07-20 17:08:49 -07:00
Gian Merlino	06624c40c0	Share query handling between Appenderator and RealtimePlumber. (#3248 ) Fixes inconsistent metric handling between the two implementations. Formerly, RealtimePlumber only emitted query/segmentAndCache/time and query/wait and Appenderator only emitted query/partial/time and query/wait (all per sink). Now they both do the same thing: - query/segmentAndCache/time, query/segment/time are the time spent per sink. - query/cpu/time is the CPU time spent per query. - query/wait/time is the executor waiting time per sink. These generally match historical metrics, except segmentAndCache & segment mean the same thing here, because one Sink may be partially cached and partially uncached and we aren't splitting that out.	2016-07-19 22:15:13 -05:00
Nishant	7995818220	Increase test timeout to prevent failing on slow machines (#3224 ) constantly timing out on one of slow build machines, increasing the timeout fixed it. Running io.druid.granularity.QueryGranularityTest Tests run: 33, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 11.776 sec - in io.druid.granularity.QueryGranularityTest	2016-07-17 18:44:48 -07:00
Gian Merlino	6cd1f5375b	Better harmonized dimensions for query metrics. (#3245 ) All query metrics now start with toolChest.makeMetricBuilder, and all of those now start with DruidMetrics.makePartialQueryTimeMetric. Also, "id" moved to common code, since all query metrics added it anyway. In particular this will add query-type specific dimensions like "threshold" and "numDimensions" to servlet-originated metrics like query/time.	2016-07-14 11:55:51 -07:00
Gian Merlino	ea03906fcf	Configurable compressRunOnSerialization for Roaring bitmaps. (#3228 ) Defaults to true, which is a change in behavior (this used to be false and unconfigurable).	2016-07-08 10:24:19 +05:30
Gian Merlino	fdc7e88a7d	Allow queries with no aggregators. (#3216 ) This is actually reasonable for a groupBy or lexicographic topNs that is being used to do a "COUNT DISTINCT" kind of query. No aggregators are needed for that query, and including a dummy aggregator wastes 8 bytes per row. It's kind of silly for timeseries, but why not.	2016-07-06 20:38:54 +05:30
Jonathan Wei	f3a3662133	Fix compile error in SearchBinaryFnTest (#3201 )	2016-06-29 09:44:45 -05:00
jaehong choi	efbcbf5315	Support alphanumeric sort in search query (#2593 ) * support alphanumeric sort in search query * address a comment about handling equals() and hashCode() * address comments * add Ut for string comparators * address a comment about space indentations.	2016-06-28 15:06:18 -07:00
Hyukjin Kwon	45f553fc28	Replace the deprecated usage of NoneShardSpec (#3166 )	2016-06-25 10:27:25 -07:00
Gian Merlino	4cc39b2ee7	Alternative groupBy strategy. (#2998 ) This patch introduces a GroupByStrategy concept and two strategies: "v1" is the current groupBy strategy and "v2" is a new one. It also introduces a merge buffers concept in DruidProcessingModule, to try to better manage memory used for merging. Both of these are described in more detail in #2987. There are two goals of this patch: 1. Make it possible for historical/realtime nodes to return larger groupBy result sets, faster, with better memory management. 2. Make it possible for brokers to merge streams when there are no order-by columns, avoiding materialization. This patch does not do anything to help with memory management on the broker when there are order-by columns or when there are nested queries. That could potentially be done in a future patch.	2016-06-24 18:06:09 -07:00
Dave Li	8a08398977	Add segment pruning based on secondary partition dimension (#2982 ) * add get dimension rangeset to filters * add get domain to ShardSpec and added chunk filter in caching clustered client * add null check and modified not filter, started with unit test * add filter test with caching * refactor and some comments * extract filtershard to helper function * fixup * minor changes * update javadoc	2016-06-24 14:52:19 -07:00
michaelschiff	66d8ad36d7	adds new coordinator metrics 'segment/unavailable/count' and (#3176 ) 'segment/underReplicated/count' (#3173)	2016-06-23 14:53:15 -07:00
Gian Merlino	da660bb592	DumpSegment tool. (#3182 ) Fixes #2723.	2016-06-23 14:37:50 -07:00
Gian Merlino	a437fb150b	Fix SegmentMetadataQuery when queryGranularity is requested but not present. (#3181 )	2016-06-23 14:30:50 -07:00
Jonathan Wei	24860a1391	Two-stage filtering (#3018 ) * Two-stage filtering * PR comment	2016-06-22 16:08:21 -07:00
Nishant	f46ad9a4cb	support Union Segment metadata queries (#3132 ) * support Union Segment metadata queries fix 3128 * remove extraneous sys out	2016-06-21 10:30:50 -07:00
Dave Li	12be1c0a4b	Add bucket extraction function (#3033 ) * add bucket extraction function * add doc and header * updated doc and test	2016-06-17 09:24:27 -07:00
Gian Merlino	ebf890fe79	Update master version to 0.9.2-SNAPSHOT. (#3133 )	2016-06-13 13:10:38 -07:00
Nishant	0d427923c0	fix caching for search results (#3119 ) * fix caching for search results properly read count when reading from cache. * fix NPE during merging search count and add test * Update cache key to invalidate prev results	2016-06-09 17:49:47 -07:00
Gian Merlino	5998de7d5b	Fix lenient merging of conflicting aggregators. (#3113 ) This should have marked the conflicting aggregator as null, but instead it threw an NPE for the entire query.	2016-06-08 15:56:48 -07:00
Jonathan Wei	37c8a8f186	Speed up filter tests with adapter cache (#3103 )	2016-06-08 07:41:10 -07:00
Gian Merlino	54139c6815	Fix NPE in registeredLookup extractionFn when "optimize" is not provided. (#3064 )	2016-06-03 12:58:17 -05:00
Gian Merlino	6171e078c8	Improve NPE message in LookupDimensionSpec when lookup does not exist. (#3065 ) The message used to be empty, which made things hard to debug.	2016-06-02 19:59:12 -07:00
John Wang	e662efa79f	segment interface refactor for proposal 2965 (#2990 )	2016-05-26 20:36:41 -07:00
Kurt Young	b5bd406597	fix #2991 : race condition in OnheapIncrementalIndex#addToFacts (#3002 ) * fix #2991: race condition in OnheapIncrementalIndex#addToFacts * add missing header * handle parseExceptions when first doing first agg	2016-05-25 19:05:46 -07:00
Jonathan Wei	b72c54c4f8	Add benchmark data generator, basic ingestion/persist/merge/query benchmarks (#2875 )	2016-05-25 16:39:37 -07:00
Dave Li	dcabd4b1ee	Add lookup optimization for InDimFilter (#2938 ) * Add lookup optimization for InDimFilter * tests for in filter with lookup extraction fn * refactor * refactor2 and modified filter test * make optimizeLookup private	2016-05-19 16:29:16 -07:00
Charles Allen	15ccf451f9	Move QueryGranularity static fields to QueryGranularities (#2980 ) * Move QueryGranularity static fields to QueryGranularityUtil * Fixes #2979 * Add test showing #2979 * change name to QueryGranularities	2016-05-17 16:23:48 -07:00
Charles Allen	fb01db4db7	[QTL] Allows RegisteredLookupExtractionFn to find its lookups lazily (#2971 ) * Allows RegisteredLookupExtractionFn to find its lookups lazily * Use raw variables instead of AtomicReference * Make sure to use volatile * Remove extra local variable. * Move from BAOS to ByteBuffer	2016-05-17 11:29:39 -07:00
Himanshu	d3e9c47a5f	use correct ObjectMapper in Index[IO/Merger] in AggregationTestHelper and minor fix in theta sketch SketchMergeAggregatorFactory.getMergingFactory(..) (#2943 )	2016-05-13 10:06:31 +05:30
Himanshu	d821144738	at historicals GpBy query mergeResults does not need merging as results are already merged by GroupByQueryRunnerFactory.mergeRunners(..) (#2962 )	2016-05-12 17:41:24 -07:00
Gian Merlino	01bebf432a	GroupByQuery: Multi-value dimension tests. (#2959 )	2016-05-12 11:31:50 -07:00
Charles Allen	a31348450f	Add toString for LookupConfig (#2935 ) * Helps with operations and getting where the snapshot dir is	2016-05-09 18:20:00 -07:00
Dave Li	79a54283d4	Optimize filter for timeseries, search, and select queries (#2931 ) * Optimize filter for timeseries, search, and select queries * exception at failed toolchest type check * took out query type check * java7 error fix and test improvement	2016-05-09 11:04:06 -07:00
Slim	8b570ab130	make it clear what LookupExtractorFactory start/stop methods return (#2925 )	2016-05-05 10:38:40 -07:00
David Lim	b489f63698	Supervisor for KafkaIndexTask (#2656 ) * supervisor for kafka indexing tasks * cr changes	2016-05-04 23:13:13 -07:00
Himanshu	8e2742b7e8	adding QueryGranularity to segment metadata and optionally expose same from segmentMetadata query (#2873 )	2016-05-03 11:31:10 -07:00
Gian Merlino	40e595c7a0	Remove types from TimeAndDims, they aren't needed. (#2865 )	2016-05-03 13:10:25 -05:00
binlijin	841be5c61f	periodically emit metric segment/scan/pending (#2854 )	2016-05-02 22:38:13 -07:00
Navis Ryu	2729fea84d	Fix parsing fail of segment id with datasource containing underscore (#2797 ) * Fix parsing fail of segment id with underscored datasource (Fix for #2786) * addressed comment * renamed and moved code into api. added log4 dependency for tests * addressed comments * fixed test fails	2016-05-02 22:37:28 -07:00
Gian Merlino	90ce03c66f	Fix integer overflow in SegmentMetadataQuery numRows. (#2890 )	2016-04-27 14:37:04 -07:00
Gian Merlino	6dc7688a29	TimeAndDims equals/hashCode implementation. (#2870 ) Adapted from #2692, thanks @navis for original implementation.	2016-04-22 08:45:20 +08:00
Himanshu	3cfd9c64c9	make singleThreaded groupBy query config overridable at query time (#2828 ) * make isSingleThreaded groupBy query processing overridable at query time * refactor code in GroupByMergedQueryRunner to make processing of single threaded and parallel merging of runners consistent	2016-04-21 17:12:58 -07:00
Slim	984a518c9f	Merge pull request #2734 from b-slim/LookupIntrospection2 [QTL][Lookup] adding introspection endpoint	2016-04-21 12:15:57 -05:00
Gian Merlino	c74391e54c	JavaScript: Ability to disable. (#2853 ) Fixes #2852.	2016-04-21 09:43:15 -05:00
Gian Merlino	7d3e55717d	Reduce cost of various toFilter calls. (#2860 ) These happen once per segment and so it's better if they don't do as much work.	2016-04-21 04:28:46 +08:00
Gian Merlino	59460b17cc	Add Filters.matchPredicate helper, use it where appropriate. (#2851 ) This approach simplifies code and is generally faster, due to skipping unnecessary dictionary lookups (see #2850).	2016-04-19 15:54:32 -07:00
Xavier Léauté	b2745befb7	remove obsolete comment (#2858 )	2016-04-19 13:06:58 -07:00
Jisoo Kim	7b65ca7889	refactor ClientQuerySegmentWalker (#2837 ) * refactor ClientQuerySegmentWalker * add header to FluentQueryRunnerBuilder * refactor QueryRunnerTestHelper	2016-04-18 14:00:47 -07:00
Gian Merlino	7c0b1dde3a	DimensionPredicateFilter: Skip unnecessary dictionary lookup. (#2850 )	2016-04-18 12:38:25 -07:00
Jonathan Wei	b534f7203c	Fix performance regression from #2753 in IndexMerger (#2841 )	2016-04-14 21:39:41 -07:00
Jonathan Wei	a26134575b	Fix NPE in TopNLexicographicResultBuilder.addEntry() (#2835 )	2016-04-13 17:27:16 -07:00
Fangjin Yang	abd951df1a	Document how to use roaring bitmaps (#2824 ) * Document how to use roaring bitmaps This fixes #2408. While not all indexSpec properties are explained, it does explain how roaring bitmaps can be turned on. * fix * fix * fix * fix	2016-04-12 19:28:02 -07:00
michaelschiff	db35dd7508	fix issue #2744 . Check for null before combining metrics (#2774 )	2016-04-12 14:46:31 -07:00
Nishant	1bf1dd03a0	Merge pull request #2812 from mrijke/fix-missing-equals-hashcode-filters Add missing equals/hashcode to JS, Regex and SearchQuery DimFilters	2016-04-12 12:00:23 +05:30
Charles Allen	21e406613c	Merge pull request #2809 from metamx/fix2694 Fix test for snapshot taker to better check for lookup persist failure	2016-04-11 14:52:47 -07:00
Maarten Rijke	de68d6b7c4	Add missing equals/hashcode to JS, Regex and SearchQuery DimFilters This commit adds missing equals() and hashcode() methods to the JavascriptDimFilter, RegexDimFilter and the SearchQueryDimFilter.	2016-04-11 12:16:24 +02:00
Nishant	bbb326decf	Merge pull request #2799 from b-slim/fix_snapshot MapLookupFactory need to be Ser/Desr ready.	2016-04-07 13:22:34 +05:30
Slim Bouguerra	bf1eafc4e1	remove all the mock lookupFactory	2016-04-06 15:37:52 -05:00
Slim Bouguerra	59eb2490a0	MapLookupFactory need to be Ser/Desr.	2016-04-06 15:02:18 -05:00
Charles Allen	f915a59138	Merge pull request #2691 from metamx/lookupExtrFn Add ExtractionFn to LookupExtractor bridge	2016-04-06 09:13:08 -07:00
jon-wei	051fd6c0eb	Remove extra println from InFilter	2016-04-05 14:55:49 -07:00
Fangjin Yang	289bb6f885	Merge pull request #2690 from jon-wei/filter_support Allow filters to use extraction functions	2016-04-05 15:40:15 -06:00
jon-wei	0e481d6f93	Allow filters to use extraction functions	2016-04-05 13:24:56 -07:00
Gian Merlino	e060a9f283	Additional ExtractionFn null-handling adjustments. Followup to comments on #2771.	2016-04-01 18:35:26 -07:00
Fangjin Yang	18b9ea62cf	Merge pull request #2771 from gianm/extractionfn-stuff Various ExtractionFn null handling fixes.	2016-04-01 16:35:46 -07:00
Gian Merlino	23d66e5ff9	Merge pull request #2765 from navis/invalid-encode-nullstring Null string is encoded as "null" in incremental index	2016-04-01 14:43:40 -07:00
Gian Merlino	b6e4d8b2c1	Various ExtractionFn null handling fixes. - JavaScriptExtractionFn shouldn't pass empty strings to its JS functions - Upper/LowerExtractionFn properly handles null Objects (DimExtractionFn's implementation works here) - MatchingDimExtractionFn properly returns nulls rather than empties - RegexDimExtractionFn properly attempts matching on nulls and empties - SearchQuerySpecDimExtractionFn properly returns nulls when passed empties	2016-04-01 14:34:47 -07:00
Fangjin Yang	eea7a47870	Merge pull request #2576 from navis/paging-from-next Add option for select query to get next page without modifying returned paging identifiers	2016-04-01 13:50:36 -07:00
Fangjin Yang	4eb5a2c4f1	Merge pull request #2715 from navis/stringformat-null-handling stringFormat extractionFn should be able to return null on null values (Fix for #2706)	2016-04-01 13:45:28 -07:00
Gian Merlino	23364a47fd	BaseFilterTest: Test optimized filters too.	2016-04-01 12:44:59 -07:00
navis.ryu	077522a46f	stringFormat extractionFn should be able to return null on null values (Fix for #2706)	2016-04-01 13:40:56 +09:00
navis.ryu	f0e55f5d31	Null string is encoded as "null" in incremental index	2016-04-01 09:47:15 +09:00
navis.ryu	29bb00535b	Add option for select query to get next page without modifying returned paging identifiers	2016-04-01 09:03:03 +09:00
Gian Merlino	5f9240fcbc	Merge pull request #2577 from navis/native-in-filter Implement native in filter	2016-03-30 20:02:54 -07:00
Fangjin Yang	3d68da94fe	Merge pull request #2661 from navis/utf8-estimated-length Utility method for length estimation of utf8	2016-03-30 19:56:14 -07:00
navis.ryu	108535fd07	Implement native in filter (Fix for #2577)	2016-03-31 10:10:57 +09:00
navis.ryu	e0cfd9ee19	Utility method for length estimation of utf8	2016-03-31 10:07:00 +09:00
jon-wei	5503bf1b38	Remove unnecessary type check in TimeAndDimsComp	2016-03-30 17:54:15 -07:00
Fangjin Yang	95733a362f	Merge pull request #2753 from gianm/null-filtering-multi-value-columns More consistent empty-set filtering behavior on multi-value columns.	2016-03-29 18:52:25 -07:00
Charles Allen	95d42cfd9e	Merge pull request #2758 from pjain1/fix_npe_in_filter handle null values in In Filter	2016-03-29 17:53:02 -07:00
Gian Merlino	1853f36e9f	More consistent empty-set filtering behavior on multi-value columns. The behavior is now that filters on "null" will match rows with no values. The behavior in the past was inconsistent; sometimes these filters would match and sometimes they wouldn't. Adds tests for this behavior to SelectorFilterTest and BoundFilterTest, for query-level filters and filtered aggregates. Fixes #2750.	2016-03-29 15:32:13 -07:00
Parag Jain	d892918a3d	handle null values in In Filter	2016-03-29 17:03:26 -05:00
Fangjin Yang	e023df2b92	Merge pull request #2754 from gianm/i-dont-get-it Remove error suppression code from IncrementalIndexAdapter.	2016-03-28 19:29:53 -07:00
Gian Merlino	c7ff0d698e	Remove error suppression code from IncrementalIndexAdapter.	2016-03-28 18:40:27 -07:00
fjy	c418a55638	cleanup distinct count agg	2016-03-28 17:29:41 -07:00
Fangjin Yang	9cb197adec	Merge pull request #2722 from himanshug/fix_hadoop_jar_upload config to explicitly specify classpath for hadoop container during hadoop ingestion	2016-03-28 14:49:03 -07:00
Charles Allen	4a98c4fbac	Fix LookupExtractionFn equals and hashCode	2016-03-28 13:14:43 -07:00
Charles Allen	0ee861d0da	Add ExtractionFn to LookupExtractor bridge	2016-03-28 13:14:43 -07:00
Fangjin Yang	7fe277e6da	Merge pull request #2727 from gianm/optimize-bound-filter BoundFilter optimizations, and related interface changes.	2016-03-26 18:59:05 -07:00
Fangjin Yang	0dae28b6af	Merge pull request #2729 from jon-wei/fix_hyperunique_comparator Fix HyperUniquesAggregatorFactory comparator	2016-03-26 15:39:35 -07:00
Gian Merlino	2970b49adc	BoundFilter optimizations, and related interface changes. BoundFilter: - For lexicographic bounds, use bitmapIndex.getIndex to find the start and end points, then union all bitmaps between those points. - For alphanumeric bounds, iterate through dimValues, and union all bitmaps for values matching the predicate. - Change behavior for nulls: it used to be that the BoundFilter would never match nulls, now it matches nulls if "" is allowed by the lower limit and not excluded by the upper limit. Interface changes: - BitmapIndex: add `int getIndex(value)` to make it possible to get the index for a value without retrieving the bitmap. - BitmapIndex: remove `ImmutableBitmap getBitmap(value)`, change callers to `getBitmap(getIndex(value))`. - BitmapIndexSelector: allow retrieving the underlying BitmapIndex through getBitmapIndex. - Clarified contract of indexOf in Indexed, GenericIndexed. Also added tests for SelectorFilter, NotFilter, and BoundFilter.	2016-03-25 14:11:48 -07:00
jon-wei	9afaa2b94a	Fix HyperUniquesAggregatorFactory comparator	2016-03-25 12:36:42 -07:00
Gian Merlino	4ac9e03161	Fix predicate-based ValueMatcher behavior for IncrementalIndex on missing columns. Missing columns should be treated the same as columns containing 100% nulls.	2016-03-25 10:23:59 -07:00
Himanshu Gupta	e78a469fb7	UTs for ExtensionsConfig	2016-03-25 10:51:28 -05:00
Himanshu Gupta	004b00bb96	config to explicitly specify classpath for hadoop container during hadoop ingestion	2016-03-25 10:51:28 -05:00
Nishant	0b03c9405f	Merge pull request #2614 from sirpkt/calendric_gran Support week, month, quarter, and year in query granularity	2016-03-24 16:21:01 -07:00
Himanshu	56343c6cdc	Merge pull request #2704 from navis/simple-optimize optimize single elemented and/or filter	2016-03-24 16:13:48 -05:00
Gian Merlino	713062053c	Filters: Add filter.toFilter method, use that instead of the instanceof chain in Filters. I believe that the instanceof chain in Filters exists because in the past, Filter and DimFilter were in different packages (DimFilter was in druid-client and Filter was in druid-processing). And since druid-client didn't depend on druid-processing, DimFilter couldn't have a toFilter method. But now it can.	2016-03-23 17:03:49 -07:00
Gian Merlino	dd86198902	All Filters should work with FilteredAggregators. This removes Filter.makeMatcher(ColumnSelectorFactory) and adds a ValueMatcherFactory implementation to FilteredAggregatorFactory so it can take advantage of existing makeMatcher(ValueMatcherFactory) implementations. This patch also removes the Bound-based method from ValueMatcherFactory. Its only user was the SpatialFilter, which could use the Predicate-based method. Fixes #2604.	2016-03-23 12:24:01 -07:00
binlijin	57d78d3293	clean tmp file when index merge fail	2016-03-23 10:55:12 +08:00
navis.ryu	91f6be4884	optimize single elemented and/or filter	2016-03-23 09:29:15 +09:00
Gian Merlino	ff25325f3b	Improved docs for multi-value dimensions. - Add central doc for multi-value dimensions, with some content from other docs. - Link to multi-value dimension doc from topN and groupBy docs. - Fixes a broken link from dimensionspecs.md, which was presciently already linking to this nonexistent doc. - Resolve inconsistent naming in docs & code (sometimes "multi-valued", sometimes "multi-value") in favor of "multi-value".	2016-03-22 14:40:55 -07:00
jon-wei	a59c9ee1b1	Support use of DimensionSchema class in DimensionsSpec	2016-03-21 13:12:04 -07:00
Keuntae Park	7f29f2ac3b	support week, month, quarter, year in query granularity	2016-03-21 17:41:53 +09:00
Charles Allen	5da9a280b6	Query Time Lookup - Dynamic Configuration	2016-03-18 09:45:05 -07:00
Gian Merlino	738dcd8cd9	Update version to 0.9.1-SNAPSHOT. Fixes #2462	2016-03-17 10:34:20 -07:00
Slim	cf342d8d3c	Merge pull request #2517 from b-slim/adding_lookup_snapshot_utility [QTL][Lookup] lookup module with the snapshot utility	2016-03-17 11:39:47 -05:00
Slim Bouguerra	0c86b29ef0	lookup module with the snapshot utility	2016-03-17 09:20:41 -05:00
Charles Allen	2ac8a22173	Merge pull request #2579 from metamx/closerIsCloser Make CloserRule use guava's Closer	2016-03-14 17:18:19 -07:00
Charles Allen	a64979463f	Make CloserRule use guava's Closer	2016-03-14 15:01:24 -07:00
Fangjin Yang	06813b510a	Merge pull request #2571 from himanshug/gp_by_avoid_sort avoid sort while doing groupBy merging when possible	2016-03-14 14:46:51 -07:00
Fangjin Yang	dbdbacaa18	Merge pull request #2260 from navis/cardinality-for-searchquery Support cardinality for search query	2016-03-14 13:24:40 -07:00
Slim	8cc3582e70	Merge pull request #2644 from metamx/optimize-timeboundary optimize timeboundary for min or max bound	2016-03-13 13:16:24 -05:00
navis.ryu	be341bf4e3	Support cardinality for search query (Fix for #2260)	2016-03-12 09:51:01 +09:00
Xavier Léauté	6f0d6ef0e9	optimize timeboundary for min or max bound	2016-03-11 14:11:47 -08:00
Gian Merlino	8a11161b20	Plumbers: Move plumber.add out of try/catch for ParseException. The incremental indexes handle that now so it's not necessary. Also, add debug logging and more detailed exceptions to the incremental indexes for the case where there are parse exceptions during aggregation.	2016-03-10 16:39:26 -08:00
Himanshu Gupta	dc0214bddb	while GroupBy merging use unsorted facts in IncrementalIndex wherever possible	2016-03-10 16:11:48 -06:00
Himanshu Gupta	02dfd5cd80	update IncrementalIndex to support unsorted facts map that can be used in groupBy merging to improve performance	2016-03-10 16:11:48 -06:00
Xavier Léauté	90d7409e1a	Merge pull request #2611 from himanshug/gp_by_max_limit only allow lowering maxResults and maxIntermediateRows from groupBy query context	2016-03-10 13:44:13 -08:00
Gian Merlino	a2b1652787	Clarify parser docs. - Clarify what parseSpecs are used for. - Avro, Protobuf should use timeAndDims parseSpecs. - Hadoop jobs should use hadoopyString string parsers.	2016-03-10 08:45:04 -08:00
Fangjin Yang	68cffe1d91	Merge pull request #2615 from gianm/timeseries-skipEmptyBuckets-cache Fix caching of skipEmptyBuckets for TimeseriesQuery.	2016-03-09 18:45:59 -08:00
Gian Merlino	708bc674fa	Make specifying query context booleans more consistent. Before, some needed to be strings and some needed to be real booleans. Now they can all be either one.	2016-03-08 19:38:26 -08:00
Gian Merlino	40dad6dff4	Fix caching of skipEmptyBuckets for TimeseriesQuery.	2016-03-08 19:22:12 -08:00
Himanshu Gupta	ca5de3f583	only allow lowering maxResults and maxIntermediateRows from groupBy query context	2016-03-08 15:03:59 -06:00
Himanshu Gupta	099acb4966	allow groupBy max[Intermediate]Rows limit be overridable by context	2016-03-07 15:22:41 -06:00
Himanshu Gupta	c544ebf25e	reintroducing the safety check removed in commit-1d602be so that dim value ids are less than cardinality	2016-03-03 23:34:23 -06:00
Bingkun Guo	4a58462fc7	update querySegmentSpec when passing query to getQueryRunner After finding the FireChief for a specific partition, Druid will need to find the specific queryRunner for each segment being queried by passing the query to FireChief. Currently Druid is passing the original query that contains all the segments need to be queried, it's possible that fireChief.getQueryRunner(query) returns more than 1 queryRunner because query.getIntervals() is not specific to a single segment. In this patch, for each segment being queried, Druid will update the query with its corresponding SpecificSegmentSpec.	2016-03-02 16:44:56 -06:00
Nishant	31b502773a	Merge pull request #2480 from navis/pagingfail-over-segments Select query cannot span to next segment with paging	2016-03-01 11:42:41 +05:30
Fangjin Yang	e5c25725c0	Merge pull request #2562 from himanshug/fix_2556 with nested GpBy query outer query results need to be further merged	2016-02-29 12:17:33 -08:00
Himanshu Gupta	0722ced413	with GpBy query outer query results need to be further merged	2016-02-29 10:16:25 -06:00
navis.ryu	b1ff920831	Lazily initialize predicate for bound filter	2016-02-29 15:35:52 +09:00
navis.ryu	5f1e60324a	Added more complex test case with versioned segments	2016-02-29 14:48:24 +09:00
navis.ryu	2686bfa394	Select query cannot span to next segment with paging	2016-02-29 00:01:46 +09:00
Fangjin Yang	29d29ba98d	Merge pull request #2263 from jon-wei/flex_dims3 Allow IncrementalIndex to store Long/Float dimensions	2016-02-25 17:23:02 -08:00
jon-wei	c17ce02467	Allow IncrementalIndex to store Long/Float dimensions	2016-02-24 13:51:57 -08:00
jon-wei	fd3782522c	Rename 'replaceMissingValues...' parameters in RegexExtractionFn	2016-02-24 13:12:56 -08:00
Nishant	fb7eae34ed	Merge pull request #2249 from metamx/workerExpanded Use Worker instead of ZkWorker whenever possible	2016-02-24 13:23:22 +05:30
Charles Allen	ac13a5942a	Use Worker instead of ZkWorker whenver possible * Moves last run task state information to Worker * Makes WorkerTaskRunner a TaskRunner which has interfaces to help with getting information about a Worker	2016-02-23 15:02:03 -08:00
Gian Merlino	3534483433	Better handling of ParseExceptions. Two changes: - Allow IncrementalIndex to suppress ParseExceptions on "aggregate". - Add "reportParseExceptions" option to realtime tuning configs. By default this is "false". Behavior of the counters should now be: - processed: Number of rows indexed, including rows where some fields could be parsed and some could not. - thrownAway: Number of rows thrown away due to rejection policy. - unparseable: Number of rows thrown away due to being completely unparseable (no fields salvageable at all). If "reportParseExceptions" is true then "unparseable" will always be zero (because a parse error would cause an exception to be thrown). In addition, "processed" will only include fully parseable rows (because even partial parse failures will cause exceptions to be thrown). Fixes #2510.	2016-02-23 10:11:43 -08:00
Fangjin Yang	3bdd757024	Merge pull request #1773 from b-slim/log_details Adding downstream source when throwing QueryInterruptedException	2016-02-22 10:16:07 -08:00
Slim Bouguerra	77925cc061	adding downstream source of QueryInterruptedException	2016-02-20 13:05:14 -06:00
Fangjin Yang	8ee81947cd	Merge pull request #2494 from himanshug/fix_timeseries do not drop post-aggs in TimeseriesQueryToolChest.makePreComputeManipulatorFn	2016-02-20 10:37:32 -08:00
Gian Merlino	d25c46cb9f	Add comparator to HyperUniquesFinalizingPostAggregator. This makes it possible to do groupBys with clauses like "HAVING uniques > 10". Beforehand you couldn't do it with either an aggregator (because it returns an HLLV1 which the havingSpec can't understand) or a finalized postaggregator (because it didn't have a comparator). Now you can at least do it with a finalizing postaggregator. Trying it with the aggregator alone still doesn't work. Added some topN and groupBy tests verifying the comparator, and added an @Ignore test that should pass if havingSpecs are made work on the aggregator directly.	2016-02-19 08:36:08 -08:00
Himanshu Gupta	11b0117422	do not drop post-aggs in timeseries query tool chest makePreComputeManipulatorFn like other query types	2016-02-17 20:51:35 -06:00
Jaehong Choi	32b9d57b23	handle a failing UT in GroupByQueryRunnerTest after merging into the master	2016-02-16 16:56:57 +09:00
Jaehong Choi	b25bca85bc	Merge branch 'master' of https://github.com/druid-io/druid into support-alphanumeric-dimensional-sort-in-gropu-by	2016-02-16 16:42:05 +09:00
Jaehong Choi	e89afc901b	delete System.out.println() in test code	2016-02-16 15:26:37 +09:00
Navis Ryu	cd315627c9	Merge pull request #2393 from CHOIJAEHONG1/support-alphanumeric-dimensional-sort-in-gropu-by support alphanumeric sorting for dimensional columns in groupby (#2393)	2016-02-16 14:11:30 +09:00
Slim	16092eb5e2	Merge pull request #2464 from gianm/print-properties Make startup properties logging optional.	2016-02-14 15:11:35 -06:00
Gian Merlino	e0c049c0b0	Make startup properties logging optional. Off by default, but enabled in the example config files. See also #2452.	2016-02-12 14:12:16 -08:00
Himanshu Gupta	da5fcd0124	before facts get it , indexAndOffsets should already know about it	2016-02-12 13:32:06 -06:00
Jonathan Wei	d63eec65a1	Merge pull request #2208 from navis/metadataquery-minmax Support min/max values for metadata query	2016-02-11 17:28:07 -08:00
Jonathan Wei	e1b022eac9	Merge pull request #2349 from navis/dimensionspec-for-selectquery Support dimension spec for select query	2016-02-11 16:38:16 -08:00
navis.ryu	dd2375477a	Support min/max values for metadata query (#2208)	2016-02-12 09:35:58 +09:00
Gian Merlino	2d037ef05e	Merge pull request #2453 from DreamLab/fix/topn_sorting_anomaly Fix for unstable behavior of HyperLogLog comparator	2016-02-11 16:05:34 -08:00
navis.ryu	4d63196535	Support dimension spec for select query	2016-02-12 08:54:28 +09:00
Himanshu	47d48e1e67	Merge pull request #2452 from gianm/print-properties PropertiesModule: Print properties, processors, totalMemory on startup.	2016-02-11 16:49:34 -06:00
turu	f277a54a5c	removed unsafe heuristics from hll compareTo and provided unit test for regression	2016-02-11 23:46:24 +01:00
Slim	368988d187	Merge pull request #2291 from druid-io/lookupManager Promoting LookupExtractor state and LookupExtractorFactory to be a first class druid state object.	2016-02-11 16:07:27 -06:00
Gian Merlino	29f7758e74	PropertiesModule: Print properties, processors, totalMemory on startup.	2016-02-11 13:51:08 -08:00
Slim Bouguerra	4e119b7a24	Adding lookup ref manager and lookup dimension spec impl	2016-02-11 12:11:51 -06:00
Jaehong Choi	2f2e2ff5b9	support alphanumeric sorting for dimensional columns in groupby	2016-02-11 17:31:28 +09:00
Keuntae Park	05a144e39a	fix crash with filtered aggregator at ingestion time - only for selector filter because extraction filter is not supported as cardinality is not fixed at ingestion time	2016-02-11 11:25:33 +09:00
Fangjin Yang	b1673ee90e	Merge pull request #2409 from gianm/smq-merged-thing SegmentMetadataQuery: Retain segment id when merging, if possible.	2016-02-08 15:43:39 -08:00
Fangjin Yang	c9c20bb7f3	Merge pull request #2395 from metamx/fixExtractionDimFilterNullTest Actually check cache key null checking in ExtractionDimFilterTest	2016-02-08 14:10:52 -08:00
Gian Merlino	bd9c04244f	SegmentMetadataQuery: Retain segment id when merging, if possible. This is helpful on realtime nodes, where two analyses from two different hydrants are merged together but they are actually from the same segment.	2016-02-08 13:07:02 -08:00
Himanshu Gupta	9fe1b28ee5	provide configuration to enable usage of Off heap merging for groupBy query	2016-02-05 14:18:06 -06:00
Himanshu Gupta	b40c342cd1	make Global stupid pool cache size configurable	2016-02-05 14:18:06 -06:00
Himanshu Gupta	72a1e730a2	OffheapIncrementalIndex updates to do the aggregation merging off-heap	2016-02-05 14:17:05 -06:00
Himanshu Gupta	907dd77483	OffheapIncrementalIndex a copy/paste of OnheapIncrementalIndex	2016-02-05 14:02:31 -06:00
Charles Allen	aac5f9b2c9	Actually check cache key null checking in ExtractionDimFilterTest	2016-02-04 09:44:13 -08:00
fjy	1aa363cea7	new quickstart	2016-02-04 09:37:38 -08:00
Fangjin Yang	da77591129	Merge pull request #2392 from metamx/fix2391 Allow ExtractionDimFilter value to be null	2016-02-03 17:47:14 -08:00
Charles Allen	d4f00096ff	Allow ExtractionDimFilter value to be null * Fixes #2391	2016-02-03 15:51:47 -08:00
Himanshu Gupta	6e7d90cf56	UTs for DefaultLimitSpec	2016-02-03 15:59:12 -06:00
Himanshu Gupta	29e0d7f971	lazily create comparators for row columns when needed	2016-02-03 13:38:20 -06:00
navis.ryu	1d602be0f9	Replace string[] with int[] for dimensions	2016-02-03 15:03:22 +09:00
binlijin	a5ef30ff84	optimize topn on particular situation	2016-02-02 14:20:09 +08:00
Himanshu	93c50d8538	Merge pull request #2094 from navis/simplify-index-merge Simplifying dimension merging	2016-01-29 11:23:14 -06:00
navis.ryu	55a888ea2f	time-descending result of select queries	2016-01-29 10:06:05 +09:00
navis.ryu	dd774ef4dd	one-pass merging of dictionary & index	2016-01-29 10:03:53 +09:00
Himanshu	edd7ce58aa	Merge pull request #2348 from AlexanderSaydakov/fix-aggregator-test-helper fixed createIndex	2016-01-28 16:01:36 -06:00
saydakov	e0860661b1	fixed createIndex	2016-01-28 13:20:50 -08:00
Nishant	99017f4518	Merge pull request #2326 from navis/use-reverse-iterator use reverse-iterator if possible	2016-01-28 19:48:38 +05:30
Nishant	3880f54b87	Merge pull request #2332 from himanshug/configurable_partial make populateUncoveredIntervals a configuration in query context	2016-01-28 10:34:35 +05:30
navis.ryu	7324ece8f9	use reverse-iterator if possible	2016-01-28 09:04:55 +09:00
Xavier Léauté	5a3642bb93	Merge pull request #2247 from metamx/pedanticBuild Enable strict building in travis	2016-01-27 10:27:03 -08:00
Xavier Léauté	2e5004095a	Merge pull request #2341 from gianm/smq-test SegmentMetadataQuery: Fix merging of ColumnAnalysis errors.	2016-01-27 09:37:06 -08:00
Charles Allen	508734c8b0	Long constant reformatting in tests `l` --> `L`	2016-01-27 08:59:19 -08:00
Gian Merlino	b1e6c01762	Make LookupExtractor abstract methods public, they have to work across classloaders.	2016-01-26 23:08:03 -08:00
Gian Merlino	795343f7ef	SegmentMetadataQuery: Fix merging of ColumnAnalysis errors. Also add tests for: - ColumnAnalysis folding - Mixed mmap/incremental merging	2016-01-26 17:16:26 -08:00
Himanshu Gupta	3719b6e3c8	make populateUncoveredIntervals a configuration in query context	2016-01-26 15:13:45 -06:00
Himanshu	3844658fb5	Merge pull request #2323 from druid-io/update-druidapi Update druid-api to 0.3.16	2016-01-26 13:02:10 -06:00
Himanshu Gupta	09d3678667	adding single threaded indexing and querying test for IncrementalIndex	2016-01-23 00:17:14 -06:00
Charles Allen	0000b9fc62	Remove sorting in ProtoBufInputRowParserTest Due to processing/src/test/java/io/druid/data/input/ProtoBufInputRowParserTest.java	2016-01-22 16:02:25 -08:00
Himanshu Gupta	2f7f5119cf	older segments might not have field bitmapSerdeFactory for dimension columns and we must use appropriate default	2016-01-22 13:28:25 -06:00
binlijin	1d1f4d996d	Merge pull request #2111 from binlijin/optimize-create-inverted-indexes optimize create inverted indexes	2016-01-22 11:36:27 +08:00
binlijin	55f7dd4629	optimize create inverted indexes	2016-01-22 10:40:09 +08:00
Gian Merlino	d416279c14	SegmentMetadataQuery support for returning aggregators.	2016-01-21 17:27:25 -08:00
Fangjin Yang	5a9cd89059	Merge pull request #2305 from gianm/segment-metadata-query-multivalues Add StorageAdapter#getColumnTypeName, and various SegmentMetadataQuery adjustments	2016-01-21 17:22:34 -08:00
Gian Merlino	e5913be90e	Merge pull request #2257 from tubemogul/index-merge-bug Adds support for empty merge metrics. fixes #2256	2016-01-21 16:38:00 -08:00
Gian Merlino	87c8046c6c	Add StorageAdapter#getColumnTypeName, and various SegmentMetadataQuery adjustments. SegmentMetadataQuery stuff: - Simplify implementation of SegmentAnalyzer. - Fix type names for realtime complex columns; this used to try to merge a nice type name (like "hyperUnique") from mmapped segments with the word "COMPLEX" from incremental index segments, leading to a merge failure. Now it always uses the nice name. - Add hasMultipleValues to ColumnAnalysis. - Add tests for both mmapped and incremental index segments. - Update docs to include errorMessage.	2016-01-21 15:50:33 -08:00
Fangjin Yang	3f998117a6	Merge pull request #2306 from jon-wei/inherit2 More specific null/empty str handling in IndexMerger	2016-01-21 14:36:09 -08:00
Michael Schiff	1e44445f06	Adds support for empty merge metrics. fixes #2256	2016-01-21 13:21:37 -08:00
jon-wei	459a236067	More specific null/empty str handling in IndexMerger	2016-01-21 12:24:38 -08:00
Slim	201539260c	Merge pull request #2076 from b-slim/issue_2010_upper_lower_extractionFN adding lower and upper extraction fn	2016-01-21 09:58:07 -06:00
Slim Bouguerra	78feb3a13e	adding lower and upper extraction fn	2016-01-21 08:59:05 -06:00
Gian Merlino	5a932d28c1	Merge pull request #2288 from tubemogul/index-merge-bug2 Null check in IncrementalIndexAdapter.getDimValueLookup()	2016-01-20 17:07:15 -08:00
Nishant	59ea186af7	fix reference counting for segments	2016-01-20 17:24:21 +05:30
Michael Schiff	50ceec78a2	null check in IncrementalIndexAdapter.getDimValueLookup()	2016-01-19 23:19:28 -08:00
jon-wei	bc1e9b27c8	Consolidate IndexMergerTest and IndexMergerV9Test	2016-01-19 16:28:35 -08:00
jon-wei	747343e621	Preserve dimension order across indexes during ingestion	2016-01-19 13:34:11 -08:00
Fangjin Yang	0c31f007fc	Merge pull request #1728 from himanshug/aggregators_in_segment_metadata Store AggregatorFactory[] in segment metadata	2016-01-19 12:55:49 -08:00
Himanshu Gupta	a99aef29a1	adding aggregators to segment metadata	2016-01-19 14:23:39 -06:00
Himanshu Gupta	52eb0f04a7	adding a new method getMergingFactory(..) to AggregatorFactory	2016-01-18 22:03:46 -06:00
Himanshu Gupta	77fc86c015	making AggregatorFactory abstract class	2016-01-18 22:03:46 -06:00
Himanshu Gupta	164b0aad7a	removing Map<String,Object> segmentMetadata from methods in Index[Maker/Merger] and using Metadata class instead of a Map to store segment metadata	2016-01-18 22:03:46 -06:00
zhxiaog	3459a202ce	fixed #1873, add ability to express CONCAT as an extractionFn	2016-01-18 15:03:17 -08:00
Keuntae Park	238dd3be3c	support cascade execution of extraction filters in extraction dimension spec	2016-01-18 11:10:19 +09:00
Fangjin Yang	f6a1a4ae20	Merge pull request #2138 from KurtYoung/feature-build-v9 build v9 directly	2016-01-16 13:35:46 -06:00
Kurt Young	82ff98c2bf	add config for build v9 directly and update docs	2016-01-16 11:26:34 +08:00
Kurt Young	1f2168fae5	add IndexMergerV9 add unit tests for IndexMergerV9 and fix some bugs add more unit tests and fix bugs handle null values and add more tests minor changes & use LoggingProgressIndicator in IndexGeneratorReducer make some static class public from IndexMerger minor changes and add some comments changes for comments	2016-01-16 11:25:28 +08:00
Kurt Young	bb50d2a2b2	add some streaming writers	2016-01-16 11:25:26 +08:00
Fangjin Yang	e0932ba1c2	Merge pull request #2267 from himanshug/fix_topn_multi_val_filter Remap id's returned in XXXFilteredDimensionSpec.getRow() as per reduced cardinality	2016-01-14 17:06:54 -08:00
Fangjin Yang	7704699b40	Merge pull request #2265 from navis/strlen-dimension-ignored Strlen sort spec ignores dimension	2016-01-14 17:06:33 -08:00
Himanshu Gupta	ae6a111444	fix XXXFilteredDimensionSpec to remap the dictionary encodings as per new cardinality	2016-01-13 22:25:02 -06:00
binlijin	a3140b2548	fix topN filtering on multi-valued dimension bug	2016-01-13 22:25:02 -06:00
navis.ryu	ea9fabdf2f	Strlen sort spec ignores dimension	2016-01-14 11:05:44 +09:00
Fangjin Yang	4c014c1574	Merge pull request #2228 from metamx/incremental-index-mem2 Improve heap usage for IncrementalIndex	2016-01-13 14:48:03 -08:00
navis.ryu	18479bb757	time-descending result of timeseries queries	2016-01-13 12:23:01 +09:00
Fangjin Yang	d7ad93debc	Merge pull request #2221 from binlijin/topN_minTopNThreshold Allow change minTopNThreshold per topN query	2016-01-12 16:22:20 -08:00
Nishant	4863e2ca4f	cache metric selectors instead of creating new ones for every metric in each row clear selectors on close. Add comments about thread safety.	2016-01-13 00:45:23 +05:30
Nishant	dfe6abb721	Merge pull request #2250 from himanshug/agg_test_helper_fix remove redundant registering of json modules in AggregationTestHelper	2016-01-12 11:42:00 +05:30
navis.ryu	976ebc45c0	Simplify information in IncrementalIndex	2016-01-12 10:18:11 +09:00
Himanshu Gupta	b973604bf8	remove redundant registering of json modules in AggregationTestHelper	2016-01-11 19:03:22 -06:00
Xavier Léauté	46a7f2660d	fix casing to be consistent with other classes	2016-01-08 10:19:06 -08:00
Fangjin Yang	d0b10c29d7	Merge pull request #2197 from metamx/clearIncIndexClose Make OnHeapIncrementalIndex clean maps on close()	2016-01-07 15:43:47 -08:00
Gian Merlino	4ecd901a1a	Merge pull request #2219 from himanshug/identity_extraction_fn_singleton make IdentityExtractionFn singleton	2016-01-07 10:08:28 -08:00
Fangjin Yang	aaea95ed1b	Merge pull request #2207 from himanshug/theta_sketch_select_query fix bug for thetaSketch metric not working with select queries	2016-01-07 09:46:09 -08:00
binlijin	010c6e959c	add test	2016-01-07 18:01:46 +08:00
binlijin	a6bfcc5bfd	Allow change minTopNThreshold per topN query	2016-01-07 14:51:00 +08:00
Fangjin Yang	4cc81d3eff	Merge pull request #2096 from b-slim/add_use_case_unapply Add use case unapply	2016-01-06 21:58:12 -08:00
Himanshu Gupta	217079d0c7	make IdentityExtractionFn singleton	2016-01-06 22:29:07 -06:00
Himanshu	902f51433d	Merge pull request #2125 from mangeshpardeshiyahoo/master Add extraction function support for Dimension Selector	2016-01-06 14:22:26 -06:00
Mangesh Pardeshi	75ee952197	Add extraction function support for dimension Selector	2016-01-06 13:47:07 -06:00
Slim Bouguerra	032d3bf6e6	Optimization of extraction filter by reversing the lookup	2016-01-06 11:16:11 -06:00
Himanshu Gupta	3f048f0b15	adding support to execute Select queries in AggregationTestHelper so that Select query based UTs can be written for complex aggregator implementations	2016-01-05 21:54:55 -06:00
Charles Allen	91fc32749b	Make OnHeapIncrementalIndex clean maps on close()	2016-01-04 11:18:16 -08:00
Himanshu Gupta	b47d807738	Add support for filtering at DimensionSpec level so that multivalued dimensions can be filtered correctly also adding UTs for multi-valued dimensions	2015-12-30 17:59:47 -06:00
Himanshu Gupta	fa5c3bb014	adding decorate(DimensionSelector) to DimensionSpec to enable support for arbitrary filtering/transformations to returned dimension values	2015-12-30 15:06:24 -06:00
Nishant	b68265399c	Merge pull request #2168 from druid-io/remove-indexmaker Remove IndexMaker	2015-12-30 12:24:29 +05:30
Fangjin Yang	e14ad74088	Merge pull request #1936 from b-slim/between_range_with_predicat adding Upper/Lower Bound Filter	2015-12-29 10:11:22 -08:00
fjy	faf421726b	remove IndexMaker	2015-12-28 14:19:02 -08:00
Gian Merlino	83f4130b5f	SegmentMetadataQuery merging fixes. - Fix merging when the INTERVALS analysisType is disabled, and add a test. - Remove transformFn from CombiningSequence, use MappingSequence instead. transformFn did not work for "accumulate" anyway, which made the tests wrong (the intervals should have been condensed, but were not). - Add analysisTypes to the Druids segmentMetadataQuery builder to make testing simpler.	2015-12-22 07:57:10 -08:00
Robin	dded4441d3	for completeness, add unit test for groupby/having with unrecognized type	2015-12-21 12:06:56 -06:00
Himanshu Gupta	e1631967e3	adding comments to explain merge failure in segmentMetadata query	2015-12-19 11:39:24 -06:00
Himanshu Gupta	7ecad1be24	Fix and UT for testing segment analysis merge	2015-12-19 00:24:02 -06:00
Fangjin Yang	7019d3c421	Merge pull request #2107 from jon-wei/fix_smq More efficient SegmentMetadataQuery	2015-12-18 16:40:47 -08:00
Fangjin Yang	14229ba0f2	Merge pull request #1922 from metamx/jsonIgnoresFinalFields Change DefaultObjectMapper to NOT overwrite final fields unless explicitly asked to	2015-12-18 15:38:32 -08:00
Fangjin Yang	71f554bf80	Merge pull request #2101 from himanshug/fix_extraction_dim_filter_cache_key add extractionFn bytes to cache key in ExtractionDimFilter	2015-12-18 12:05:43 -08:00
Fangjin Yang	9e6874cc7e	Merge pull request #2084 from binlijin/master minor optimize IndexMerger's MMappedIndexRowIterable	2015-12-18 11:42:55 -08:00
Bingkun	cc21a5fac7	Merge pull request #1999 from himanshug/remove_min_max_aggs remove min/max aggregator factory	2015-12-18 13:38:52 -06:00
jon-wei	356b07c6c3	More efficient SegmentMetadataQuery	2015-12-17 12:46:23 -08:00
Jonathan Wei	f8cf84f466	Merge pull request #1995 from himanshug/num_rows_seg_metadata_query add numRows to segment metadata query response	2015-12-17 12:23:46 -08:00
Himanshu Gupta	82ea348003	add extractionFn bytes to cache key in ExtractionDimFilter	2015-12-16 14:00:38 -06:00
Himanshu	628643d80e	Merge pull request #2091 from rasahner/noDefaultForGroupbyHaving take away default for groupBy/having	2015-12-16 01:07:40 -06:00
sahner	3441cf3110	take away default for groupBy/having	2015-12-15 10:32:45 -06:00
Fangjin Yang	e7f06cf61c	Merge pull request #2075 from jon-wei/regex_extract Configurable value replacement on match failure for RegexExtractionFn	2015-12-14 19:10:50 -08:00
jon-wei	c88f75df7c	Configurable value replacement on match failure for RegexExtractionFn	2015-12-14 17:57:41 -08:00
binlijin	362bea1090	minor optimize IndexMerger's MMappedIndexRowIterable	2015-12-11 15:04:46 +08:00
Xavier Léauté	d531e69d1a	Merge pull request #2079 from binlijin/master reduce bytearray copy to minimal optimize VSizeIndexedWriter	2015-12-10 21:30:09 -08:00
Slim Bouguerra	77afdf25e3	adding Bound Filter	2015-12-10 08:47:21 -06:00
Slim Bouguerra	ee1a39801a	adding bulk lookup and reverse lookup	2015-12-10 08:29:41 -06:00
binlijin	0eafbd55b2	reduce bytearray copy to minimal optimize VSizeIndexedWriter	2015-12-10 16:34:39 +08:00
Fangjin Yang	f4ba13a1ac	Merge pull request #2029 from b-slim/add_reverse_fn Adding reverse lookup function to LookupExtractor.	2015-12-09 12:50:13 -08:00
Xavier Léauté	9015a68c03	Merge pull request #2002 from navis/DRUID-2001 fixed #2001 GenericIndexed.fromIterable compares all values even when it's not sorted	2015-12-09 08:56:49 -08:00
Slim Bouguerra	85f339b687	introduction and implem of reverse lookup function unApply.	2015-12-09 10:02:57 -06:00
Nishant	6c23d8edb4	Merge pull request #2043 from mangeshpardeshiyahoo/master Add dimension selector support for groupby/having filters	2015-12-08 12:08:53 +05:30
Mangesh Pardeshi	d7ce120929	Add dimension selector support for groupby/having queries	2015-12-08 01:51:11 +00:00
Himanshu Gupta	431469e9c1	remove min/max aggregator factory which are replaced by double[min/max] aggregator factories	2015-12-05 22:36:49 -06:00
Himanshu Gupta	62ba9ade37	unifying license header in all java files	2015-12-05 22:16:23 -06:00
Gian Merlino	d21a640695	Merge pull request #2034 from b-slim/fix_cache_key Fix getCacheKey for DimFilters	2015-12-04 09:13:06 -08:00
Slim Bouguerra	fb4ff3cf54	fix getCacheKey	2015-12-04 08:07:08 -06:00
Charles Allen	9d02f47201	Update IncrementalIndexTest copyright notice	2015-12-03 18:03:08 -08:00
Charles Allen	be8c6fafb0	Merge pull request #2017 from tubemogul/issue/63 fixes issue #63	2015-12-03 18:01:11 -08:00
Gian Merlino	045df54404	Merge pull request #1961 from metamx/druidMetricsVersion Add the druid artifact version to metrics when emitted	2015-12-03 17:34:57 -08:00
Michael Schiff	b6cc2428e1	fixes issue #63	2015-12-03 17:30:47 -08:00
Himanshu	0eab8417cb	Merge pull request #2008 from codingwhatever/regex-search-query Regex search query	2015-12-03 09:57:34 -06:00
Sam Groth	596b7ebd9a	Adding RegexSearchQuerySpec	2015-12-03 09:16:02 -06:00
Himanshu	d02be6194d	Merge pull request #1967 from metamx/realtime-metrics-improvements Add datasource and taskId to metrics emitted by peons	2015-12-02 23:48:13 -06:00
Himanshu	00c6027777	Merge pull request #1986 from metamx/substring fixes #1874 adding a substring extraction function, tests, and documentation	2015-12-02 23:45:47 -06:00
Clint Wylie	68ef5f437a	fixes #1874 adding a substring extraction function, tests, and documentation	2015-12-01 23:50:32 -08:00
navis.ryu	87357a0534	fixed #2001 GenericIndexed.fromIterable compares all values even when it's not sorted	2015-12-02 15:11:14 +09:00
Nishant	1eb8211346	Add datasource and taskId to metrics emitted by peons This PR adds the datasource and taskId to the jvm and sys metrics emitted by the peons. fix spelling review comment review comment	2015-12-01 23:20:59 +05:30
Gian Merlino	cd2cff24ff	Fix serde for FragmentSearchQuerySpec and add some tests.	2015-11-30 17:34:35 -08:00
navis.ryu	c73418c181	fixed #2003 ColumnSelectorBitmapIndexSelector throws NPE for dimension not supporting bitmap	2015-11-24 10:45:36 +09:00
Himanshu Gupta	7a89b2e1a6	add numRows to segment metadata query response	2015-11-20 01:25:02 -06:00
Himanshu	d93640bfcb	Merge pull request #1974 from jon-wei/dim_order_merge Allow IndexMerger to use non-lexicographic dim order when merging indexes	2015-11-18 19:51:34 -06:00
Xavier Léauté	e3e6159336	Merge pull request #1985 from metamx/FixLookupCacheKey Change LookupExtractionFn cache key to be unique	2015-11-18 10:13:55 -08:00
Charles Allen	7abe999418	Change LookupExtractionFn cache key to be unique	2015-11-17 18:02:40 -08:00
jon-wei	4afc62be29	Allow IndexMerger to use non-lexicographic dim order when merging indexes	2015-11-17 13:02:31 -08:00
Xavier Léauté	d7eb2f717e	enable query caching on intermediate realtime persists	2015-11-17 10:58:00 -08:00
Gian Merlino	57f213d536	Better toString for groupBy, segmentMetadata queries.	2015-11-16 12:54:59 -08:00
jon-wei	cdceaf2d26	Fix IncrementalIndexAdapter getRows() Iterable	2015-11-12 13:10:42 -08:00
Charles Allen	af34e9c8cb	Add the druid artifact version to metrics when emitted	2015-11-12 12:11:27 -08:00
binlijin	286b8f8c6f	optimize index merge	2015-11-12 11:08:54 +08:00
Xavier Léauté	fa6142e217	cleanup and remove unused imports	2015-11-11 12:25:21 -08:00
dclim	fd0935ecb9	fix spatial dimension transformer to work with hadoop	2015-11-10 19:16:51 -07:00
Slim Bouguerra	c511273efd	adding in filter	2015-11-06 16:23:24 -06:00
Charles Allen	929b981710	Change DefaultObjectMapper to NOT overwrite final fields unless explicitly asked to	2015-11-05 18:10:13 -08:00
fjy	8f231fd3e3	cleanup druid codebase	2015-11-04 13:59:53 -08:00
Gian Merlino	8defe29270	Merge pull request #1901 from guobingkun/fix_typo_and_rename Fix metadata typo and rename default extension directory	2015-11-03 14:02:11 -08:00
Bingkun Guo	962f65cc76	fix metadata typo and rename default extension directory	2015-11-03 14:50:42 -06:00
Fangjin Yang	cec09a9967	Merge pull request #1804 from himanshug/objectify_index_creators static to non-static conversion for methods in Index[Merger/Maker/IO]	2015-11-03 11:25:32 -08:00
Himanshu Gupta	8b67417ac8	make methods in Index[Merger,Maker,IO] non-static so that they can have appropriate ObjectMapper injected instead of creating one statically	2015-11-02 23:24:26 -06:00
navis.ryu	e03fc2032f	changed equals/hashCode implementation	2015-11-02 17:21:35 +09:00
navis.ryu	69c86716d6	addressed comments	2015-11-02 14:23:13 +09:00
navis.ryu	032c3e986d	Make 'search' filter have a case sensitive option (#1878)	2015-10-30 16:38:54 +09:00
Fangjin Yang	25a0eb7ed5	Merge pull request #1799 from dclim/nested-groupby-aggregator-fix Support multiple outer aggregators of same type and provide more help…	2015-10-29 18:01:31 -07:00
Xavier Léauté	59872bd0cd	Merge pull request #1809 from metamx/fifoPriorityExecutorService Make PrioritizedExecutorService optionally FIFO	2015-10-27 15:19:32 -07:00
Charles Allen	060402a216	Merge pull request #1855 from himanshug/fix_having_specs fix [GreaterThan,LessThan,Equals] HavingSpecs	2015-10-27 14:46:04 -07:00
Charles Allen	ecdafa87c5	Make PrioritizedExecutorService optionally FIFO	2015-10-27 14:16:22 -07:00
Himanshu Gupta	a71c7270b9	making [GreaterThan,LessThan,Equals] HavingSpecs more robust by carefully using long vs float for comparison	2015-10-27 13:15:13 -05:00
Fangjin Yang	5a082b2f5e	Merge pull request #1824 from metamx/UniformGranularitySpecHashEquals Add hashCode and equals to UniformGranularitySpec	2015-10-26 09:34:01 -07:00
Fangjin Yang	5f23703216	Merge pull request #1638 from guobingkun/remove_maven_client_code Remove Maven client at runtime + Provide a way to load Druid extensions through local file system	2015-10-26 09:30:05 -07:00
Nishant	7cecc55045	Add segment merge time as a metric Add merge and persist cpu time Fix typo review comment move cpu time measuring to VMUtils review comments.	2015-10-22 12:28:03 +05:30
Bingkun Guo	4914925d65	New extension loading mechanism 1) Remove maven client from downloading extensions at runtime. 2) Provide a way to load Druid extensions and hadoop dependencies through file system. 3) Refactor pull-deps so that it can download extensions into extension directories. 4) Add documents on how to use this new extension loading mechanism. 5) Change the way how Druid tarball is generated. Now all the extensions + hadoop-client 2.3.0 are packaged within the Druid tarball.	2015-10-21 14:22:36 -05:00
Xavier Léauté	e4ac78e43d	bump next snapshot to 0.9.0	2015-10-20 13:46:13 -07:00
dclim	46ecdfa757	add comment explaining logic	2015-10-15 16:04:06 -06:00
Xavier Léauté	4c2c7a2c37	update version to 0.8.3	2015-10-14 21:40:55 -07:00
Charles Allen	f432b8e3f9	Add hashCode and equals to UniformGranularitySpec * Also add hashCode != 0 to AllGranularity and NoneGranularity	2015-10-13 16:42:21 -07:00
Gian Merlino	c9d6994040	Merge pull request #1821 from himanshug/storage_adapter_update cache max data timestamp in QueryableIndexStorageAdapter	2015-10-13 10:52:43 -07:00
Himanshu Gupta	490de1f98a	support multiple non-consecutive intervals in outer query of nested group-by	2015-10-13 10:16:06 -05:00
Himanshu Gupta	fbba30eb60	cache max data timestamp in QueryableIndexStorageAdapter so that TimestampCheckingOffset does not have to get it per cursor.	2015-10-12 15:34:22 -05:00
Charles Allen	8ed5d2c06a	Add hashCode and equals to stock lookups	2015-10-12 10:29:39 -07:00
Himanshu Gupta	2737fd83f5	in the IndexSizeExceededException put maxRowCount to confirm if it is correctly picked up from configuration	2015-10-06 15:23:14 -05:00
Himanshu Gupta	8654732ef6	make IndexSizeExceededException constructor take formatString and arguments than just fixed String like ISE, IAE etc	2015-10-06 13:44:22 -05:00
dclim	f4e0a76820	Support multiple outer aggregators of same type and provide more helpful exception when the same inner aggregator is referenced by multiple types of outer aggregators	2015-10-01 15:15:12 -06:00
Gian Merlino	774765dc40	GroupByQueryRunnerTest for hyperUnique finalizing post aggregators	2015-10-01 00:09:29 -04:00
Gian Merlino	e3bb93e8c7	Revert "Merge pull request #1781 from dclim/nested-groupby-multiple-same-aggregator-fix-v2" This reverts commit `dae488b7c0`, reversing changes made to `397be4b897`.	2015-10-01 00:05:59 -04:00
dclim	8e20a1e1f3	Use DoubleSumAggregatorFactory instead of CountAggregatorFactory, add test for non-integers	2015-09-30 17:11:39 -06:00
David Lim	70ae5ca922	Fix failure in nested groupBy with multiple aggregators with same fieldName Version 2 - Throws an exception if an outer query references an aggregator that doesn't exist in the inner query, and then uses the inner query aggregator names to form the columns for the intermediate incremental index. Also deleted all the getRequiredColumns() methods which are no longer being used. We do something wacky by adding an aggregator factory for the post aggregators when building the intermediate incremental index, otherwise queries on post aggregate results fail because the data isn't in the incremental index. Closes #1419	2015-09-30 15:43:11 -06:00
Charles Allen	8199ecf1a4	Merge pull request #1782 from jon-wei/smq_cachekey Add analysisTypes to SegmentMetadataQuery cache key	2015-09-29 15:51:35 -07:00
jon-wei	41ff271339	Add analysisTypes to SegmentMetadataQuery cache key	2015-09-29 14:33:35 -07:00
Charles Allen	2d847ad654	Merge pull request #1730 from metamx/union-queries-fix fix #1727 - Union bySegment queries fix	2015-09-29 12:23:25 -07:00
Nishant	573aa96bd6	fix #1727 - Union bySegment queries fix Fixes #1727. revert to doing merging for results for union queries on broker. revert unrelated changes Add test for union query runner Add test remove unused imports fix imports fix renamed file fix test update docs.	2015-09-29 23:32:36 +05:30
Gian Merlino	62d4ced4dd	Separate ListColumnIncluderator cache key parts with nul bytes	2015-09-29 13:59:58 -04:00
jon-wei	e6a6284ebd	Allow SegmentMetadataQuery to skip cardinality and size calculations	2015-09-22 13:51:55 -07:00
Gian Merlino	aaa8a88464	Merge pull request #1739 from jon-wei/segment_realtime Allow SegmentAnalyzer to read columns from StorageAdapter, allow SegmentMetadataQuery to query IncrementalIndexSegments on realtime node	2015-09-17 18:36:53 -07:00
Charles Allen	df4c2bab10	Soften concurrency requirements on IncrementalIndexTest	2015-09-17 15:51:07 -07:00
jon-wei	367c50d4ba	Allow SegmentAnalyzer to read columns from StorageAdapter, allow SegmentMetadataQuery to query IncrementalIndexSegments on realtime node	2015-09-16 18:39:31 -07:00
Charles Allen	6e1eb3b7fe	Add better concurrency testing to IncrementalIndexTest	2015-09-16 14:04:20 -07:00
Gian Merlino	9705c5139b	Merge pull request #1732 from jon-wei/segmentmeta Add support for a configurable default segment history period for segmentMetadata queries and GET /datasources/<datasourceName> lookups	2015-09-16 12:36:25 -07:00
Fangjin Yang	8b071a7230	Merge pull request #1710 from metamx/incrementalIndexConcurrentTestLatching Add some basic latching to concurrency testing in IncrementalIndexTest	2015-09-15 13:55:52 -07:00
jon-wei	193fb4fdfc	Add support for a configurable default segment history period for segmentMetadata queries and GET /datasources/<datasourceName> lookups	2015-09-14 19:41:42 -07:00
Charles Allen	bd605a097e	Merge pull request #1731 from metamx/regex-extraction-npe fix NPE with regex extraction function	2015-09-14 15:55:05 -07:00
Xavier Léauté	08a527d01a	fix NPE with regex extraction function	2015-09-14 14:45:30 -07:00
Charles Allen	e569f4b6a7	Add dimension extraction functionality to SearchQuery * Add IdentityExtractionFn	2015-09-14 11:36:15 -07:00
Himanshu	5ff92664f8	Merge pull request #1696 from metamx/cpuTimeReporting Add CPU time to metrics for segment scanning.	2015-09-14 10:53:55 -05:00
Fangjin Yang	34ef81572d	Merge pull request #1700 from himanshug/update_agg_test_helper update indexing in the helper to use multiple persists and merge	2015-09-14 06:56:29 -07:00
Charles Allen	8d3cdd8572	Don't check for sortedness if we already know GenericIndexedWriter isn't sorted	2015-09-11 16:32:09 -07:00
Charles Allen	d6849805ea	Add some basic latching to concurrency testing in IncrementalIndexTest	2015-09-10 10:06:51 -07:00
Himanshu Gupta	5da58e48e0	use Rule based TemporaryFolder for cleanup of temp directory/files	2015-09-09 11:10:33 -05:00
Himanshu Gupta	44911039c5	update indexing in the helper to use multiple persists and final merge to catch further issues in aggregator implementations	2015-09-09 11:10:33 -05:00
Charles Allen	fcf5cae81d	Add CPU time to metrics for segment scanning.	2015-09-08 13:34:19 -07:00
cheddar	4f61b42f40	Merge pull request #1578 from b-slim/fix_extraction_filter_2 Fix UT and documentation to the extraction filter	2015-09-01 10:46:20 -07:00
Himanshu	04ff6cd355	Merge pull request #1685 from gianm/close-loudly Close output streams and channels loudly when creating segments.	2015-08-28 23:32:22 -05:00
Gian Merlino	940e1aa3eb	Replace funky imports with standard ones. 1) Lots of Guava imports were not coming from the actual Guava 2) junit.framework.Assert should be org.junit.Assert	2015-08-28 18:02:05 -07:00
Gian Merlino	7d6fa2ba50	Close output streams and channels loudly when creating segments.	2015-08-28 17:14:03 -07:00
Himanshu Gupta	2e0dd1d792	adding UTs and addressing review comments to firehoseV2 addition to Realtime[Manager|Plumber], essential segment metadata persist support, kafka-simple-consumer-firehose extension patch	2015-08-27 20:50:46 -05:00
lvjq	2237a8cf0f	kafka 8 simple consumer firehose	2015-08-27 20:50:46 -05:00
Charles Allen	c1388a1685	Merge pull request #1632 from Hailei/fix-subquery-innerquery-demension Inner Query should build on sub query	2015-08-27 10:25:38 -07:00
Gian Merlino	2a866f49df	Downgrade Jackson to 2.4.6.	2015-08-26 18:25:55 -07:00
Charles Allen	24aa762c79	Add test for #1632	2015-08-25 20:50:30 -07:00
Xavier Léauté	51f6a9a2c9	update jackson to 2.6.1	2015-08-25 16:07:01 -07:00
Himanshu Gupta	c57c07f28a	add ability for client code to provide InputStream of input data in addition to File It would be needed when input data file does not reside in the same jar but you could still use getResourceAsStream() to read the data inside a file	2015-08-20 00:54:58 -05:00
Xavier Léauté	3b2e41e42a	update for next release	2015-08-18 17:16:46 -07:00
Slim Bouguerra	7549f02578	support the case filter value is null	2015-08-17 15:09:37 -05:00
zhanghailei	234a958817	Inner Query should build on sub query	2015-08-17 18:18:26 +08:00
Charles Allen	db19d2d547	Revert "Update to guice 4.0"	2015-08-14 09:26:07 -07:00
Charles Allen	be89105621	Merge pull request #1602 from metamx/more-code-cleanup Some perf Improvements in Broker	2015-08-11 13:51:49 -07:00
Xavier Léauté	fbdb841928	Merge pull request #1603 from metamx/optimize-lexicographic-topN Optimizations for LexicographicTopNs	2015-08-11 13:35:34 -07:00
Nishant	b8d8a8da9e	Optimisations for LexicographicTopNs initial review for perf optimizations for lexicographic TopNs fix compilation create map with proper size review comment review comment review comments	2015-08-12 00:37:48 +05:30
Charles Allen	7e61216287	Update to guice 4.0 - Mark a lot of `@Provides` methods as final since guice 4.0 disallows overriding them	2015-08-10 13:57:18 -07:00
Slim Bouguerra	f0bc362981	clean code if is not needed anymore	2015-08-07 12:38:41 -05:00
Slim Bouguerra	64d638a386	optimize makeMatcher	2015-08-06 17:04:36 -05:00
Nishant	1a46c4c71c	avoid creating mergeSeqence when not required	2015-08-06 14:25:13 +05:30
Slim Bouguerra	83de5a4716	addressing reviewers comments	2015-08-03 09:03:28 -05:00
Slim Bouguerra	dda0790a60	Fix extractionFilter by implementing make matcher Fix getBitmapIndex to consider the case where dim is null Unit Test for extractionFn with empty result and null_column UT for TopN queries with Extraction filter refactor in Extraction filter makematcher for realtime segment and clean code in b/processing/src/test/java/io/druid/query/groupby/GroupByQueryRunnerTest.java fix to make sure that empty strings are converted to null	2015-08-03 09:02:17 -05:00
Himanshu Gupta	d11d9b6c45	dont waste memory in storing all lines from input CharSource.readLines() reads all lines from input into a in-memory list Since we need an iterator here, so this wastage can be easily prevented	2015-07-20 21:59:38 -05:00
Fangjin Yang	0481c8ca26	Merge pull request #1406 from zhaown/fix-breaking-while-exceeding-max-intermediate-rows Fix breaking while exceeding max intermediate rows.	2015-07-20 13:41:22 -07:00
Himanshu Gupta	f7a92db332	generic byte[] serde for InputRow	2015-07-20 12:01:53 -05:00
Himanshu Gupta	0439e8ec23	adding serde methods for intermediate aggregation object to ComplexMetricSerde This provides the alternative to using ComplexMetricSerde.getObjectStrategy() and using the serde methods from ObjectStrategy as that usage pattern is deprecated.	2015-07-20 12:01:53 -05:00
zhaown	524b05f073	Fix breaking while exceeding max intermediate rows.	2015-07-19 10:41:53 +08:00
Fangjin Yang	e21195f987	Merge pull request #1469 from guobingkun/table_config Inconsistent property names for "druid.metadata.storage.tables.xxx"	2015-07-17 07:43:19 -07:00
Himanshu	19af3bc9bc	Merge pull request #1535 from metamx/alphanum-docs-tests Update alphanumeric sort docs + more tests / examples	2015-07-16 22:09:41 -05:00
Xavier Léauté	2c464ad936	correct reference in docs + more tests / examples	2015-07-16 19:50:05 -07:00
Xavier Léauté	9616c10b1d	remove import static	2015-07-16 17:46:21 -07:00
Xavier Léauté	c1308203b8	Merge pull request #1532 from metamx/fixTopNDimExtractionDoubleApply Fix TopN dimension extractions being applied twice	2015-07-16 13:39:02 -07:00
Xavier Léauté	3a0793aaf9	Merge pull request #1533 from metamx/extraCheckGroupByDimExtraction Add more unit tests for group by	2015-07-15 21:09:00 -07:00
Charles Allen	7d0b77c261	Add more unit tests for group by	2015-07-15 20:15:21 -07:00
Xavier Léauté	a15a2c4047	fix histogram aggregator cache key	2015-07-15 17:33:36 -07:00
Charles Allen	9092c665b7	Fix TopN dimension extractions being applied twice	2015-07-15 16:58:15 -07:00
Charles Allen	456ad9ffba	Merge pull request #1529 from metamx/update-versions increment version	2015-07-15 13:25:31 -07:00
Xavier Léauté	4cfb00bc8a	increment version	2015-07-15 13:09:05 -07:00
Charles Allen	5eadd395e2	Move lots of executor service creation to Execs	2015-07-14 15:38:49 -07:00
Nishant	184b12bee8	fix groupBy caching to work with renamed aggregators Issue - while storing results in cache we store the event map which contains aggregator names mapped to values. Now when someone fire same query after renaming aggs, the cache key will be same but the event will contain metric values mapped to older names which leads to wrong results. Fix - modify cache to not store raw event but the actual list of values only. review comments + fix dimension renaming review comment	2015-07-09 11:48:26 +05:30
Xavier Léauté	9789417612	ModuleList is already part of Initialization	2015-07-01 11:37:40 -07:00
Xavier Léauté	2c463ae435	Merge pull request #1489 from metamx/moveTestPackages Move some test packages	2015-07-01 11:18:09 -07:00
Charles Allen	5e19a615f1	Add comments to DimExtractionTopNAlgorithm	2015-07-01 10:32:45 -07:00
Charles Allen	7a2a8a3d6e	Move extraction tests to more reasonable package	2015-07-01 10:30:50 -07:00
Bingkun Guo	4a0ae7d8d5	Fix inconsistent druid property names for "druid.metadata.storage.tables.xxx" between document and code	2015-06-29 10:12:30 -05:00
Xavier Léauté	28fa1642b9	add node time metrics to DirectDruidClient	2015-06-26 17:57:44 -07:00
Xavier Léauté	36b4453789	Merge pull request #1455 from druid-io/fix-protobuf Fix protobuf impl and docs	2015-06-22 23:15:40 -07:00
nishant	f9cdb0ad61	test for #1120 Make the changes described in #1120 to add test for the issue described there.	2015-06-21 23:34:21 +05:30
fjy	9c74993559	fix protobuf impl and docs	2015-06-20 21:59:38 -07:00
Xavier Léauté	0a5bb909a2	[maven-release-plugin] prepare for next development iteration	2015-06-18 17:35:19 -07:00
Xavier Léauté	59c6b2b279	[maven-release-plugin] prepare release druid-0.8.0-rc1	2015-06-18 17:35:14 -07:00
Charles Allen	6230ac90ae	Use IndexMerger for conversion	2015-06-10 11:34:58 -07:00
Xavier Léauté	395ba79f8b	Merge pull request #1403 from metamx/mergerMakerTests Improvements around resource handling in IndexMerger / IndexIO / QueryableIndex	2015-06-04 15:59:10 -07:00
Charles Allen	ed8eb5c991	Improvements around resource handling in IndexMerger / IndexIO / QueryableIndex * Fix resource leak in `io.druid.segment.IndexIO.DefaultIndexIOHandler#validateTwoSegments(java.io.File, java.io.File)` * Un-deprecate `close()` in `QueryableIndex` and make it inherit `Closeable` * Fix resource leaks in various unit tests * Add `CloserRule` for closing out resources	2015-06-04 14:18:27 -07:00
Himanshu	50ad0e6474	Merge pull request #1412 from pjain1/alphaNumericTopN_NPE_fix NPE fix for TopN query with alphaNumericTopN metric spec	2015-06-04 09:49:31 -05:00
Parag Jain	a7b09e857c	NPE fix for alphaNumericTopN when previous stop is not specified	2015-06-04 09:30:31 -05:00
Xavier Léauté	35e2fde18e	Merge pull request #1386 from himanshug/aggregation_testing1 General class for testing any Aggregation Implementation	2015-06-03 23:43:36 -07:00
Xavier Léauté	92d7316ed8	Merge pull request #1414 from metamx/timeout2TIMEOUT Replace "timeout" with QueryContextKeys.TIMEOUT	2015-06-02 17:11:09 -07:00
Charles Allen	1c4d42bc15	Replace "timeout" with QueryContextKeys.TIMEOUT	2015-06-02 14:49:21 -07:00
Charles Allen	f48db09e35	Add optimizations for ExtractionFn by enabling MANY_TO_ONE vs ONE_TO_ONE codepaths * Also adds LookupExtractionFn and MapLookupExtractor which takes in an explicit mapping of renames * Add injective to javascript extraction fn	2015-06-02 12:22:56 -07:00
Himanshu Gupta	215c1ab01e	UTs for hyperUnique aggregation	2015-06-01 12:52:40 -05:00
Himanshu Gupta	160d5fe6b7	a general class for testing any [complex] aggregation implementation	2015-06-01 12:52:40 -05:00
Charles Allen	55292bba13	Add more IndexMergerTests	2015-05-28 18:18:20 -07:00
Charles Allen	1ebe622c7d	Add check in GroupByQuery for null DimensionSpec in dimension list	2015-05-28 14:55:34 -07:00
Xavier Léauté	f9c624c7db	Merge pull request #1361 from mrijke/groupby-limithavingorder-unittest GroupBy Query with Having/Limit/Orderingspec inconsistencies (UnitTest)	2015-05-27 14:49:18 -07:00
Xavier Léauté	1a3f04f0ed	Merge pull request #1354 from metamx/multi-valued-dimension-compression Enabling compression for multiValued dimension	2015-05-26 23:43:53 -07:00
Charles Allen	fd64c24e43	Fix roaring extraction filter on empty values	2015-05-26 13:54:18 -07:00
nishant	81415282aa	Enabling compression for multiValued dimension Add test and refactoring Add benchmark tests	2015-05-27 00:09:14 +05:30
Charles Allen	e97d22a10a	Fix Extraction Filter cast problems for empty results	2015-05-22 15:20:11 -07:00
Charles Allen	e1399b7ce4	Add unit test to show breaking Dimension Extraction Filter	2015-05-22 15:02:11 -07:00
Xavier Léauté	75c092ccb1	Merge pull request #1375 from metamx/MetricManipulatorFnInstances Modify MetricManipulatorFns to use instanced classes	2015-05-22 15:56:47 -04:00
Charles Allen	042653ebcb	Modify MetricManipulatorFns to use instanced classes	2015-05-22 12:38:38 -07:00
Himanshu Gupta	723df735e9	force eagerness of processing of SegmentMetadata queries on the processing executor by converting the Sequence into List	2015-05-22 13:46:26 -05:00
Himanshu Gupta	5852b64852	adding UT for SegmentMetadata bySegment query which catches following regression caused by commit `55ebf0cfdf` it fails when we issue the SegmentMetadataQuery by setting {"bySegment" : true} in context with exception - java.lang.ClassCastException: io.druid.query.Result cannot be cast to io.druid.query.metadata.metadata.SegmentAnalysis at io.druid.query.metadata.SegmentMetadataQueryQueryToolChest$4.compare(SegmentMetadataQueryQueryToolChest.java:222) ~[druid-processing-0.7.3-SNAPSHOT.jar:0.7.3-SNAPSHOT] at com.google.common.collect.NullsFirstOrdering.compare(NullsFirstOrdering.java:44) ~[guava-16.0.1.jar:?] at com.metamx.common.guava.MergeIterator$1.compare(MergeIterator.java:46) ~[java-util-0.27.0.jar:?] at com.metamx.common.guava.MergeIterator$1.compare(MergeIterator.java:42) ~[java-util-0.27.0.jar:?] at java.util.PriorityQueue.siftUpUsingComparator(PriorityQueue.java:649) ~[?:1.7.0_80]	2015-05-22 13:45:54 -05:00
Himanshu Gupta	da0cc32bc8	Revert commit `55ebf0cfdf` which caused following regression it fails when we issue the SegmentMetadataQuery by setting {"bySegment" : true} in context with exception - java.lang.ClassCastException: io.druid.query.Result cannot be cast to io.druid.query.metadata.metadata.SegmentAnalysis at io.druid.query.metadata.SegmentMetadataQueryQueryToolChest$4.compare(SegmentMetadataQueryQueryToolChest.java:222) ~[druid-processing-0.7.3-SNAPSHOT.jar:0.7.3-SNAPSHOT] at com.google.common.collect.NullsFirstOrdering.compare(NullsFirstOrdering.java:44) ~[guava-16.0.1.jar:?] at com.metamx.common.guava.MergeIterator$1.compare(MergeIterator.java:46) ~[java-util-0.27.0.jar:?] at com.metamx.common.guava.MergeIterator$1.compare(MergeIterator.java:42) ~[java-util-0.27.0.jar:?] at java.util.PriorityQueue.siftUpUsingComparator(PriorityQueue.java:649) ~[?:1.7.0_80]	2015-05-22 13:39:34 -05:00
Maarten Rijke	82da479464	Fix for GroupBy with Having+Limit+Orderspec * Inverted function arguments to compose postProcFn for GroupBy queries with havingspec + limitspec. * Replaced query.getLimitSpec() with null in GroupByQueryToolChest's mergeGroupByResults * Added unittest to verify functionality	2015-05-19 18:35:48 +02:00
Himanshu Gupta	2fd3e9e8e5	return size = 0 in ColumnAnalysis if it's unknown, that is, if complex agg did not implement inputSizeFn(), so that segment metadata query shows at least some information. also instead of COMPLEX, return type of data stored.	2015-05-15 20:11:56 -05:00
Xavier Léauté	3c3db7229c	Merge pull request #1355 from himanshug/long_max_min_aggregators Long max/min aggregators	2015-05-13 12:08:11 -07:00
Himanshu Gupta	cebb550796	additional UTs for [DoubleMax/DoubleMin] aggregation	2015-05-13 09:25:41 -05:00
Himanshu Gupta	d0ec945129	adding aliases doubleMax and doubleMin for max and min respectively renamed all [Max/Min].java to [DoubleMax/DoubleMin].java and created [Max/Min]AggregatorFactory.java which can be removed when we dont need the min/max aggregator type backward compatibility	2015-05-13 09:25:41 -05:00
Himanshu Gupta	2de38f7d29	UTs for long[Max/Min] aggregation	2015-05-13 09:25:22 -05:00
Himanshu Gupta	00436f93e2	long max/min aggregators implementation	2015-05-13 09:25:22 -05:00
fjy	7a6acf5c1b	update pom to 0.8	2015-05-11 19:41:58 -06:00
Xavier Léauté	33265d63e1	Merge pull request #1262 from metamx/fix-null-dimension fix handling of dimension having only null values	2015-05-06 13:51:26 -07:00
nishant	34be1e96fa	fix NPE review comments Add test fix test for java8	2015-05-05 23:11:13 +05:30
Neo	8f8400e24e	fix handling of dimension having only null values fixes #1211 fix value matcher more improvements more fixes for partial null column fix handling of dimension having only null values fixes #1211 fix value matcher more improvements more fixes for partial null column review comment IndexMaker speedups * About 15% speedup Conflicts: processing/src/main/java/io/druid/segment/IndexMaker.java fix handling of dimension having only null values fixes #1211 fix value matcher more improvements more fixes for partial null column fix handling of dimension having only null values fixes #1211 fix value matcher more improvements more fixes for partial null column review comment review comments review comment fix failing tests review comment fix compilation	2015-05-04 22:07:45 +05:30
nishant	50158357ff	fixes #1330 fixes #1330, Avoid creating Period instance as creating a Period from Long.MAX_VALUE throws arithmetic exception. After this query metric will emit duration in seconds instead of minutes.	2015-05-04 20:34:28 +05:30
Xavier Léauté	721505c017	Merge pull request #1208 from druid-io/rework-metrics Schemaless metrics + additional metrics for things we care about	2015-04-27 15:04:54 -07:00
fjy	963e5765bf	Schemaless metrics + additional metrics for things we care about	2015-04-27 13:39:40 -07:00
Charles Allen	27016c0289	Fix IndexIO segment validator to account for timestamp mismatches.	2015-04-27 12:42:16 -07:00
Charles Allen	633fdb029e	Add option to ConvertSegmentTask to skip validation * Validation is enabled by default	2015-04-27 08:37:55 -07:00
Charles Allen	303727e6a9	IndexMaker speedups * About 15% speedup Conflicts: processing/src/main/java/io/druid/segment/IndexMaker.java	2015-04-23 13:19:21 -07:00
Charles Allen	f2300430d1	Cleanup some code in index creation. * Add some unit tests * Add io.druid.segment.IndexMerger.reprocess for quick re-indexing of data * Add dim-value validation to validation checker (instead of ONLY index #) * General code refactoring to make things a little easier to read	2015-04-23 12:41:42 -07:00
Xavier Léauté	7939f43681	Merge pull request #1296 from druid-io/limit-test Add test for order by metric and limit across multiple days	2015-04-22 11:28:06 -07:00
fjy	97d87a06d0	Add another test for limit across multiple days	2015-04-22 11:27:37 -07:00
Fangjin Yang	28f69d6bd3	Merge pull request #1299 from metamx/improve-filter-datasource-metadata Improve filtering of segments for dataSourceMetadataQuery	2015-04-22 11:07:35 -07:00
Xavier Léauté	a0a28de551	fix serde issue when pulling timestamps from cache	2015-04-22 11:03:26 -07:00
Xavier Léauté	2b4406671e	Merge pull request #1301 from druid-io/fix-type fix count agg factory type	2015-04-21 09:24:20 -07:00
fjy	7805357ab1	fix count agg factory type	2015-04-21 09:23:04 -07:00
nishant	bb8c0cb50b	Improve filtering of segments for dataSourceMetadataQuery dataSourceMetadataQuery only needs to be executed on latest segments at present, modify filterSegments and add test.	2015-04-21 09:31:13 +05:30
Xavier Léauté	f73f14ab91	Merge pull request #1297 from metamx/versionConverterTaskUpdates Update VersionConverterTask for IndexSpec and allowing Forced updates	2015-04-20 16:44:35 -07:00
Charles Allen	7479ac9012	Update VersionConverterTask for IndexSpec and allowing Forced updates	2015-04-20 16:17:06 -07:00
fjy	d260515a43	update druid-api version	2015-04-17 14:58:35 -07:00
Bingkun Guo	cf155e4eba	Fix an issue that after broker forwards GroupByQuery to historical, havingSpec is still applied on postAggregations which are removed in the forwarded query. Add a unit test to replicate the issue. Add a query that can replicate this issue into integration test.	2015-04-17 13:00:41 -05:00
fjy	f0a19349bf	fix up some comments for contributed test	2015-04-16 15:07:09 -07:00
Fangjin Yang	90b17a5259	Merge pull request #1285 from venkateshk/limitspec-tests Unit test to surface bug with limit-spec order by over specific query intervals	2015-04-16 13:52:58 -07:00
Xavier Léauté	1d153674b6	remove overzealous check for backwards compatibility	2015-04-15 22:11:55 -07:00
Xavier Léauté	ea5572d001	Merge pull request #1271 from metamx/strictErrorChecking Add stricter checking for potential coding errors	2015-04-15 15:21:41 -07:00
Charles Allen	abdeaa0746	Add stricter checking for potential coding errors Can use via `mvn clean compile test-compile -P strict'	2015-04-15 14:52:25 -07:00
vkavuluri	a2ba5b6183	Unit test to surface bug with limit-spec order by over specific query intervals	2015-04-15 06:31:22 -07:00
Xavier Léauté	3a3046ccf3	add support for dimension compression - compression for single-value dimensions using CompressedVSizeIntsIndexedSupplier - makes dimension compression configurable via IndexSpec - IndexSpec also enables configuring bitmap and metric compression	2015-04-14 10:44:18 -07:00
Xavier Léauté	bafc5114b4	add toString, equals, and hashCode to BitmapSerdeFactory	2015-04-14 10:44:18 -07:00
Xavier Léauté	d20128b89b	add compressed variable-size ints column type	2015-04-14 10:44:18 -07:00
Xavier Léauté	ce928d9636	add compressed ints column type	2015-04-14 10:44:17 -07:00
Xavier Léauté	5c23679238	add WritableSupplier and IndexedMultivalue	2015-04-14 10:44:17 -07:00
Xavier Léauté	1abb9cce7c	make IndexedInts closeable + add fill method	2015-04-14 10:44:17 -07:00
Xavier Léauté	ed0d49933e	fix memory leak in CompressedXXXIndexedSupplierTest	2015-04-14 10:44:16 -07:00
Xavier Léauté	6790e6cf0f	add fromList to CompressedLongsIndexedSupplier	2015-04-14 10:44:16 -07:00
Eric Tschetter	7517f0d0f0	Add some javadoc to the two Query processing interfaces to help aid in implementations of new Queries. Also, remove some comments that did not have enough context to actually make sense to anyone but the original author (at least, I hope they make sense to the author, I definitely don't know what was being said).	2015-04-09 18:11:42 -07:00
Fangjin Yang	208e307915	Merge pull request #1251 from metamx/uriSegmentLoaders Revert "Revert "Overhaul of SegmentPullers to add consistency and retries""	2015-03-30 17:43:51 -07:00
fjy	aea7f9d192	[maven-release-plugin] prepare for next development iteration	2015-03-30 16:35:24 -07:00
fjy	060d7aef03	[maven-release-plugin] prepare release druid-0.7.1	2015-03-30 16:35:20 -07:00
Charles Allen	1c6cbea89c	Revert "Revert "Overhaul of SegmentPullers to add consistency and retries"" This reverts commit `f904bc7858`.	2015-03-30 13:40:04 -07:00
Fangjin Yang	f904bc7858	Revert "Overhaul of SegmentPullers to add consistency and retries"	2015-03-30 13:15:50 -07:00
Charles Allen	6d407e8677	Add URI handling to SegmentPullers * Requires https://github.com/druid-io/druid-api/pull/37 * Requires https://github.com/metamx/java-util/pull/22 * Moves the puller logic to use a more standard workflow going through java-util helpers instead of re-writing the handlers for each impl * General workflow goes like this: 1) LoadSpec makes sure the correct Puller is called with the correct parameters. 2) The Puller sets up general information like how to make an InputStream, how to find a file name (for .gz files for example), and when to retry. 3) CompressionUtils does most of the heavy lifting when it can	2015-03-30 12:33:23 -07:00
Fangjin Yang	e5653f0752	Merge pull request #1190 from vigiglobe/master Fix NPE when partionNumber 0 does not exist.	2015-03-26 13:25:39 -07:00
Xavier Léauté	389ea4c32f	Merge pull request #1245 from b-slim/fix_injector_plus_ut Bug fix @DruidSecondaryModule plus unit test	2015-03-26 10:04:44 -07:00
Fangjin Yang	a9c47de571	Merge pull request #1243 from metamx/fix-union-timeline-lookup fixes TimeboundaryQuery and DataSourceMetadata queries returning wrong values for union queries	2015-03-26 10:02:56 -07:00
Slim Bouguerra	1e6be7796e	bug fix @DruidSecondaryModule plus unit test	2015-03-26 10:44:52 -05:00
nishantmonu51	638bf9d4e9	return sorted List of TimeLineObjectHolder	2015-03-26 11:51:09 +05:30
msprunck	942c17a2aa	Remove timeline chunk count assumptions. * Replace with generic iterables	2015-03-24 22:40:49 +01:00
Prajwal Tuladhar	9983216871	use https maven repo URL to download dependencies	2015-03-20 14:09:07 -04:00
fjy	b389cfe404	[maven-release-plugin] prepare for next development iteration	2015-03-19 12:38:17 -07:00
fjy	60e7d543cc	[maven-release-plugin] prepare release druid-0.7.1-rc1	2015-03-19 12:38:13 -07:00
nishantmonu51	39e60b3405	fix race in groupByParallelQueryRunner add UT and use a queue for better concurrency	2015-03-17 20:57:05 +05:30
Xavier Léauté	127b6fd857	Merge pull request #1172 from himanshug/segment_metadata_eager force eager the processing of segment metadata query on the processing executor	2015-03-12 10:19:48 -07:00
Xavier Léauté	0a5a3fe2dc	fix file missing from rebase	2015-03-11 17:30:11 -07:00
Xavier Léauté	e01ed16030	serde tests + equals/hashCode fixes for extraction functions	2015-03-11 16:48:28 -07:00
Xavier Léauté	d3f5bddc5c	Add ability to apply extraction functions to the time dimension - Moves DimExtractionFn under a more generic ExtractionFn interface to support extracting dimension values other than strings - pushes down extractionFn to the storage adapter from query engine - 'dimExtractionFn' parameter has been deprecated in favor of 'extractionFn' - adds a TimeFormatExtractionFn, allowing to project the '__time' dimension - JavascriptDimExtractionFn renamed to JavascriptExtractionFn, adding support for any dimension value types that map directly to Javascript - update documentation for time column extraction and related changes	2015-03-11 16:45:42 -07:00
Himanshu Gupta	55ebf0cfdf	force eager the processing of segment metadata query on the processing threadpool by using ChainedExecutionQueryRunner in SegmentMetadataQueryRunnerFactory.mergeRunners(..)	2015-03-11 12:58:58 -05:00
Xavier Léauté	217e674063	Handling aggregators and post aggregators with duplicate names * add test for same-name groupBy hyperUniques post-agg * add test for same-name post-agg in groupby with approx histogram * Fixes https://github.com/druid-io/druid/issues/1045 * Throws an error if post aggs and aggs do not have unique names * Add more groupBy tests for Having filters	2015-03-10 17:10:43 -07:00
Fangjin Yang	0b467624ec	Merge pull request #694 from druid-io/arithmetic-op-strategies normal division & configurable ordering for ArithmeticPostAggregator	2015-03-10 13:48:27 -07:00
Fangjin Yang	2abdce1dc0	Merge pull request #1180 from metamx/logging-groupBy-NPE add null check early to catch root cause for groupBy NPE while running bySegment query	2015-03-09 09:16:33 -07:00
nishantmonu51	6e935cca0a	add null check early to catch root cause	2015-03-09 21:10:28 +05:30
Xavier Léauté	0d47c0c36d	normal division and configurable ordering for ArithmeticPostAggregator Fixes #510	2015-03-04 12:44:24 -08:00
Fangjin Yang	d685e2ab04	Merge pull request #1165 from friedhardware/fix-NPerror-select Added null check for the pagingSpec on a Select Query.	2015-03-02 14:17:06 -08:00
Fangjin Yang	e8605c63a9	Merge pull request #1150 from himanshug/broker-parallel-chunk-process interval chunk query runner now processes individual chunk in a threadpool	2015-03-02 13:50:23 -08:00
Himanshu Gupta	29039fd541	interval chunk query runner now processes individual chunk in a thread pool and prints metrics query/time per chunk	2015-03-02 15:45:09 -06:00
Joshua Schumacher	e6130e0fdc	Added null check for the pagingSpec on a Select Query.	2015-03-02 12:41:59 -08:00
Fangjin Yang	005f4da2c0	Merge pull request #1143 from metamx/update-rhino-1.7rc5 Update Rhino to 1.7RC5	2015-02-25 12:50:23 -08:00
Xavier Léauté	b167dcf82c	[maven-release-plugin] prepare for next development iteration	2015-02-23 14:28:06 -08:00
Xavier Léauté	e81ac2ba43	[maven-release-plugin] prepare release druid-0.7.0	2015-02-23 14:27:58 -08:00
James Estes	562de6c621	Update docs and examples for log4j2 usage. - Put configs early in classpath in examples so log4j2.xml will get picked up properly - Add an example log4j2.xml file. - Update Logging doc.	2015-02-19 11:40:56 -07:00
Xavier Léauté	c4d721fffd	update Rhino to 1.7RC5	2015-02-19 09:48:18 -08:00
Xavier Léauté	78df7f6165	Move Druid release artifacts to Sonatype - Switch to using Druid parent POM - Add required fields for Sonatype - Common plugin versions and settings have been moved to the parent pom - Cleanup artifacts and POMs for consistent formatting - Remove org.hyperic.sigar dependency and update docs to reflect necessary jars to add at runtime when sigar is needed	2015-02-13 14:26:31 -08:00
fjy	d29740ed9f	[maven-release-plugin] prepare for next development iteration	2015-02-12 16:16:00 -08:00
fjy	211fd15b7e	[maven-release-plugin] prepare release druid-0.7.0-rc3	2015-02-12 16:15:56 -08:00
Fangjin Yang	90bc62eb5c	Merge pull request #1108 from metamx/improve-groupby-perf Improve groupby by removing conversion to case insensitive row	2015-02-12 11:45:20 -08:00
nishantmonu51	15cf432b74	remove conversion to case insensitive row this is not required after death to casing in 0.7	2015-02-11 19:40:36 +05:30
Xavier Léauté	c5e99bf6ec	Merge pull request #1105 from metamx/fixEmptyExtractionFilter Fix empty results on ExtractionFilter.	2015-02-10 14:25:58 -08:00
Charles Allen	b9cb311a52	Fix empty results on ExtractionFilter. * Now returns empty results rather than erroring out * Added unit tests for multiples case	2015-02-10 14:04:38 -08:00
fjy	708759e1e0	Update http-client to 1.0.0	2015-02-10 13:36:47 -08:00
Xavier Léauté	a7dcaffb53	fix `__time` column selector for incremental index - also adds tests for selecting the time column	2015-02-06 12:06:05 -08:00
Fangjin Yang	42e902b6e3	Merge pull request #1090 from metamx/alphanum-attribution update code attribution	2015-02-04 15:51:34 -08:00
Xavier Léauté	0fbc6071c9	update code attribution	2015-02-04 15:28:44 -08:00
Fangjin Yang	25cf15824b	Merge pull request #1085 from gianm/dsmrv-fix DataSourceMetadataResultValue fixes and JodaUtils adjustments.	2015-02-03 17:51:33 -08:00
Gian Merlino	085ad8d345	Fix DataSourceMetadataResultValue serde.	2015-02-03 17:39:42 -08:00
fjy	1f12c5b2f1	[maven-release-plugin] prepare for next development iteration	2015-02-03 12:06:49 -08:00
fjy	e82d431be7	[maven-release-plugin] prepare release druid-0.7.0-rc2	2015-02-03 12:06:41 -08:00
Xavier Léauté	4eff269536	Merge pull request #1079 from druid-io/cleanup-deps Remove non friendly dependencies from Druid	2015-02-03 11:56:41 -08:00
fjy	3e5d338c8e	Remove non friendly dependencies from Druid	2015-02-03 11:36:08 -08:00
Fangjin Yang	71b4c5fa86	Merge pull request #1076 from metamx/remove-threadlocals remove thread-locals in GenericIndexed in favor of wrapped objects	2015-02-02 20:02:33 -08:00
Xavier Léauté	cb2e300eba	remove thread-locals in GenericIndexed in favor of wrapped objects to reduce GC pressure	2015-02-02 15:59:30 -08:00
Eric Tschetter	42eba986ce	Towards consistent null handling This commit also includes 1) the addition of a context parameter on timeseries queries that allows it to ignore empty buckets instead of generating results for them 2) A cleanup of an unused method on an interface	2015-02-02 12:53:07 -08:00
Fangjin Yang	92e616de11	Merge pull request #1077 from metamx/remove-unused-imports remove unused imports	2015-02-02 10:45:27 -08:00
nishantmonu51	ba932bb1f2	remove unused imports	2015-02-02 21:53:39 +05:30
fjy	d05032b98a	towards a community led druid	2015-01-31 20:57:36 -08:00
Xavier Léauté	f24a89a22a	fix NPE for topN over missing hyperUniques column	2015-01-27 16:12:41 -08:00
Charles Allen	226dd91a31	Add a hash map for storing groupBy partition index * Improves groupBy performance by approx 15%	2015-01-26 08:42:02 -08:00
fjy	1f94de22c6	[maven-release-plugin] prepare for next development iteration	2015-01-20 14:23:55 -08:00
fjy	17476edc31	[maven-release-plugin] prepare release druid-0.7.0-rc1	2015-01-20 14:23:51 -08:00
Charles Allen	3d27747f7e	Upgrade to log4j2 Default behavior is as before. Added documentation for how to enable synchronous logging for select chatty classes: * io.druid.client.ServerInventoryView * io.druid.client.BatchServerInventoryView * io.druid.curator.inventory.CuratorInventoryManager * com.metamx.http.client.pool.ChannelResourceFactory	2015-01-20 12:35:18 -08:00
Fangjin Yang	91a79dbf95	Merge pull request #1031 from metamx/ingestmetadata-query DataSourceMetadata query	2015-01-19 21:55:35 -08:00
Charles Allen	7bb038756c	Account for very slow writer threads in IncrementalIndexTest	2015-01-17 13:02:59 -08:00
Fangjin Yang	b4041c13e5	Merge pull request #1029 from metamx/fixChainedExecutionQueryRunnerTest Address spurious test failures	2015-01-16 13:08:32 -08:00
Xavier Léauté	3b3aad78cb	Merge pull request #1027 from metamx/concurrentOnHeapIncrementalIndexFix Fix concurrency issues in OnheapIncrementalIndex	2015-01-16 12:54:42 -08:00
Charles Allen	197af967ef	Fix concurrency issues in OnheapIncrementalIndex * Was encountering weird errors when fast writes were coming in while queries were happening. * Added unit tests which tend to cause concurrency query problems	2015-01-16 12:01:46 -08:00
Charles Allen	ebafa2a786	Fix spurious test failures in ChainedExecutionQueryRunnerTest	2015-01-15 16:49:16 -08:00
Fangjin Yang	5bfcc43377	Merge pull request #1008 from metamx/stringConversionJavaUtilUpdate Update all String conversions to and from byte[] to use the java-util StringUtils functions	2015-01-15 13:50:27 -08:00
nishantmonu51	c7452b75f6	Merge branch 'master' into ingestmetadata-query	2015-01-15 18:00:31 +05:30
Xavier Léauté	d5f4182de4	global test timeouts + fix test race condition	2015-01-07 23:36:57 -08:00
Fangjin Yang	852e863425	Merge pull request #981 from druid-io/strictModuleTyping Use Module instead of generic Object in Guice related items	2015-01-05 12:43:20 -08:00
Charles Allen	b1b5c9099e	Update all String conversions to and from byte[] to use the java-util StringUtils functions * Speedup of GroupBy with javaScript filters by ~10% * Requires https://github.com/metamx/java-util/pull/15	2015-01-05 11:22:32 -08:00
Xavier Léauté	3fc6cf918d	add test for large chunks	2015-01-02 14:31:22 -08:00
Xavier Léauté	f2f9cbeca8	throw error rather than returning garbage results	2015-01-02 14:29:21 -08:00
Xavier Léauté	071943a367	fix LZF compression with buffers exceeding LZF chunk size	2015-01-02 11:39:50 -08:00
Xavier Léauté	f2439899e7	fix bitmap factory serde	2014-12-23 15:07:32 -08:00
Xavier Léauté	27a3169312	increase test timeouts	2014-12-19 17:09:43 -08:00
Charles Allen	971afab36f	Lengthen CompressionStrategyTest::testKnownSizeConcurrency() to have 2m timeout on its test to account for shared Jenkins build lag	2014-12-19 12:53:20 -08:00
Charles Allen	7c8d4a7433	Use Module instead of generic Object in Guice related items	2014-12-19 10:54:06 -08:00
Fangjin Yang	be507b8cb4	Merge pull request #943 from mrijke/partialdimextractfn-nullpointer Fix NullPointerException in PartialDimExtractionFn	2014-12-16 12:29:27 -07:00
nishantmonu51	80e4b68ee7	review comments	2014-12-16 21:16:48 +05:30
Fangjin Yang	b3fe91bb50	Merge pull request #830 from metamx/union-merge-on-historical Union merge on historical	2014-12-15 13:36:47 -07:00
fjy	3cb7999eb9	i hate hadoop dependencies	2014-12-15 09:52:46 -08:00
nishantmonu51	a0d3579a92	add docs + fix tests	2014-12-11 17:58:01 +05:30
nishantmonu51	7ad03087c0	Merge branch 'master' into ingestmetadata-query	2014-12-11 16:54:38 +05:30
nishantmonu51	32b4f55b8a	review comments refactoring	2014-12-11 16:33:14 +05:30
nishantmonu51	3763357f6e	Ingest metadata query implementation	2014-12-10 19:44:00 +05:30
Fangjin Yang	d6d3ec6846	Merge pull request #948 from metamx/ingestion-docs Redocumenting ingestion	2014-12-09 15:30:03 -07:00
fjy	9596c11f42	address cr	2014-12-09 14:19:18 -08:00
nishantmonu51	1a1b0e6f23	merge from master and review comments	2014-12-09 13:16:45 +05:30
xvrl	1392e2731f	Merge pull request #936 from metamx/cachingRunnerImprovements General Caching Query Runners cleanup (40% query time reduction for HLL)	2014-12-08 14:07:52 -08:00
Charles Allen	7b65f0635d	General Caching Query Runners cleanup * Add type strictness to CachingClusteredClient. * Add background caching to CachingClusteredClient. Gives between 0% and 5% query speed increase. * Add @BackgroundCaching annotation for injected ExecutorService items * Add `numBackgroundThreads' configuration options to CacheConfig (default 0 aka same thread legacy behavior) * Add unit tests for CacheConfig * Add an abstract caching query runner class, currently it doesn't do anything except simply make the two caching queries distinct. * Add caching to CachingQueryRunner. Gives up to a WHOPPING 40% reduction in query time on HLL queries * Updated docs with more info on cache settings.	2014-12-08 13:29:32 -08:00
Maarten Rijke	90670a9c7e	Fix NullPointerException in PartialDimExtractionFn by explicitly checking for dimValue == null, attempt 2	2014-12-08 22:26:35 +01:00
Maarten Rijke	bd9bbf396c	Fix NullPointerException in PartialDimExtractionFn by explicitly checking for dimValue == null	2014-12-08 20:11:58 +01:00
Xavier Léauté	ad23e49777	use fixed-size mapdb cache to avoid heap growing uncontrollably	2014-12-05 15:34:50 -08:00
Xavier Léauté	7cd45a6e1f	IncrementalIndex throws exception if limit exceeded - For now uses a hardcoded ratio of aggregator to timeanddim buffer sizes - canAppendRow is a workaround for realtime index since the Firehose currently does not have a way of rolling back the last event in case of error - canAppendRow needs a fudge factor; there is a race between checking if we can add a row and actually adding a row, because of the way MapDB reports its size.	2014-12-04 14:38:16 -08:00
Xavier Léauté	c7dbe6116c	write byte data as is in smile	2014-12-04 10:57:56 -08:00
Xavier Léauté	c21a82a697	upgrade LZ4 to operate directly on ByteBuffers	2014-12-04 10:57:56 -08:00
Xavier Léauté	0c521e0a77	update joda-time and fix min/max instant	2014-12-04 10:57:56 -08:00
nishantmonu51	269a51964e	fix size calculation	2014-12-04 17:22:24 +05:30
nishantmonu51	4dc0fdba8a	consider mapped size in limit calculation & review comments	2014-12-03 23:47:30 +05:30
Charles Allen	529e7e0272	Merge pull request #927 from metamx/speedup-smile-bytes Improve Smile serde performance by writing binary data as is	2014-12-03 10:02:08 -08:00
Charles Allen	0f5d5840da	Merge pull request #924 from metamx/update-joda-time Update Joda-Time and fix min/max instant overflow	2014-12-03 09:15:39 -08:00
nishantmonu51	da8bd7836b	Introduce buffer size	2014-12-03 16:28:22 +05:30
Xavier Léauté	5fece517fa	write byte data as is in smile	2014-12-03 00:01:01 -08:00
Xavier Léauté	18f50097a9	upgrade LZ4 to operate directly on ByteBuffers	2014-12-02 23:53:56 -08:00
fjy	bc173d14fc	a whole bunch of cleanup and fixes	2014-12-02 17:32:05 -08:00
Xavier Léauté	a79389a9e5	update joda-time and fix min/max instant	2014-12-02 17:27:22 -08:00
nishantmonu51	b65933ffb8	make tests parameterised	2014-12-02 23:55:29 +05:30
nishantmonu51	6dc69c2f30	code cleanups & formatting	2014-12-02 22:44:33 +05:30
nishantmonu51	eac776f1a7	tests passing with on heap incremental index	2014-12-02 22:29:28 +05:30
Xavier Léauté	4eee7e69b9	fix cardinality aggregator caching	2014-11-26 15:00:37 -08:00
xvrl	5bc1be5ba0	Merge pull request #850 from metamx/druid-0.7.x-compressionstrategy Compression strategy changes	2014-11-25 12:58:39 -08:00
Charles Allen	c6043afa32	Removed empty function from CompressionStrategyTest	2014-11-25 12:57:06 -08:00


2576 Commits