7501 Commits

Author SHA1 Message Date
kaijianding
2961406b90 fix zero period in PeriodGranularity causing gran.iterable(start, end) infinite loop (#3644) 2016-11-02 15:40:07 +05:30
Roman Leventov
4b0d6cf789 Fix resource leaks (ComplexColumn and GenericColumn) (#3629)
* Remove unused ComplexColumnImpl class

* Remove throws IOException from close() in GenericColumn, ComplexColumn, IndexedFloats and IndexedLongs

* Use concise try-with-resources syntax in several places

* Fix resource leaks (ComplexColumn and GenericColumn) in SegmentAnalyzer, SearchQueryRunner, QueryableIndexIndexableAdapter and QueryableIndexStorageAdapter

* Use Closer in Iterable, returned from QueryableIndexIndexableAdapter.getRows(), in order to try to close everything even if closing some parts thew exceptions
2016-11-02 09:23:52 +05:30
Himanshu
eb70a12e43 fix cleanup of tmp dir in HdfsDataSegmentPusher (#3636) 2016-11-01 12:45:38 -05:00
kaijianding
f1dee037d6 fix 'No Such File' error when execute script out of druid installation directory (#3517) 2016-11-01 09:57:09 -07:00
Gian Merlino
45940d6e40 Math expressions support for missing columns. (#3630)
Also add SchemaEvolutionTest to help test this kind of thing.

Fixes #3627 and includes test for #3625.
2016-11-01 09:40:25 -07:00
Himanshu
0e269ce72a max row limit is necessary is connector is setup for streaming (#3635) 2016-11-01 09:39:55 -07:00
Gian Merlino
89d9c61894 Deprecate Aggregator.getName and AggregatorFactory.getAggregatorStartValue. (#3572) 2016-10-31 15:24:30 -07:00
Himanshu
32c5494e97 eagerly allocate the intermediate computation buffers (#3628) 2016-10-31 15:24:07 -07:00
Gian Merlino
9f5c895d5f Fix imports in DefaultOfflineAppenderatorFactoryTest. (#3624) 2016-10-31 12:23:19 -07:00
Slim
f6995bc908 offline appenderator factory. (#3483)
* adding default offline appenderator

* adding test

* fix comments

* fix comments
2016-10-31 10:05:58 -07:00
Aveplatter
317d62e18c teeny tiny wording change (#3623) 2016-10-31 09:46:54 -07:00
Navis Ryu
3fca3be9ea SpecificSegmentQueryRunner misses missing segments from toYielder() (#3617) 2016-10-30 11:47:29 -07:00
Himanshu
23a8e22836 fix SketchMergeAggregatorFactory.finalizeResults, comparator and more UTs for timeseries, topN (#3613) 2016-10-28 15:48:33 -07:00
Akash Dwivedi
6a845e1f7b Adding getDelegate() to directly access delegate. (#3616)
👍
2016-10-27 15:57:36 -07:00
Charles Allen
78159d7ca4 Move off-heap QTL global cache delete lock outside of subclass lock (#3597)
* Move off-heap QTL global cache delete lock outside of subclass lock

* Make `delete` thread safe
2016-10-27 22:23:53 +05:30
Navis Ryu
0799640299 Faster interval comparator (#3605) 2016-10-26 14:20:27 +05:30
Parag Jain
98465f47b5 start stop metamx lifecycle annotated objects as well (#3610) 2016-10-25 15:56:16 -07:00
Navis Ryu
898c1c21af More best-effort parse long (#3603)
* More best-effort parse long

* addressed comments
2016-10-25 10:31:51 -07:00
David Lim
3c56cbdf82 fix timing issue with KafkaLookupExtractorFactoryTest (#3604) 2016-10-25 07:04:51 -07:00
Himanshu
641469fc38 manage overshadowing efficiently at coordinator (#3584)
* manage overshadowing efficiently at coordinator

* take readlock in VersionedIntervalTimeline.isOvershadowed()
2016-10-24 22:49:08 +05:30
Charles Allen
9bb735133f Move copyright to proper druid.io form (#3602) 2016-10-21 16:23:53 -07:00
Akash Dwivedi
4b3bd8bd63 Migrating java-util from Metamarkets. (#3585)
* Migrating java-util from Metamarkets.

* checkstyle and updated license on java-util files.

* Removed unused imports from whole project.

* cherry pick metamx/java-util@826021f.

* Copyright changes on java-util pom, address review comments.
2016-10-21 14:57:07 -07:00
Navis Ryu
8b7ff4409a Math expressional parameters for aggregator (#2783)
* Supports expression-paramed aggregator (squashed and rebased on master) also includes math post aggregator (was #2820)

* Addressed comments

* addressed comments
2016-10-19 13:58:35 -05:00
Roman Leventov
b113a34355 In CPUTimeMetricQueryRunner, account CPU consumed in baseSequence.toYielder() (#3587) 2016-10-18 09:06:42 -05:00
Charles Allen
2c5c8198db Make query/cpu/time still report on error (#3535) 2016-10-18 08:26:21 -05:00
Nishant
8ea5f9324d Integration Tests - fix middlemanager property name in doc (#3586) 2016-10-18 08:23:34 -05:00
Gian Merlino
dd0bb6da1e Unit test for #3544: Avoid exceptions for dataSource spec when using s3. (#3571) 2016-10-17 12:41:43 -07:00
Roman Leventov
9611358f0a Small topn scan improvements (#3526)
* Remove unused numProcessed param from PooledTopNAlgorithm.aggregateDimValue()

* Replace AtomicInteger with simple int in PooledTopNAlgorithm.scanAndAggregate() and aggregateDimValue()

* Remove unused import
2016-10-17 10:36:19 -07:00
Gian Merlino
285516bede Workaround non-thread-safe use of HLL aggregators. (#3578)
Despite the non-thread-safety of HyperLogLogCollector, it is actually currently used
by multiple threads during realtime indexing. HyperUniquesAggregator's "aggregate" and
"get" methods can be called simultaneously by OnheapIncrementalIndex, since its
"doAggregate" and "getMetricObjectValue" methods are not synchronized.

This means that the optimization of HyperLogLogCollector.fold in #3314 (saving and
restoring position rather than duplicating the storage buffer of the right-hand side)
could cause corruption in the face of concurrent writes.

This patch works around the issue by duplicating the storage buffer in "get" before
returning a collector. The returned collector still shares data with the original one,
but the situation is no worse than before #3314. In the future we may want to consider
making a thread safe version of HLLC that avoids these kinds of problems in realtime
indexing. But for now I thought it was best to do a small change that restored the old
behavior.
2016-10-17 09:39:12 -07:00
David Lim
c2ae734848 KafkaIndexTask: Allow run thread to stop gracefully instead of interrupting (#3534)
* allow run thread to gracefully complete instead of interrupting when stopGracefully() is called

* add comments
2016-10-17 10:52:19 -04:00
Gian Merlino
c1d3b8a30c Remove dropwizard-jdbc dependency from lookups-cached-single. (#3573)
Fixes #3548.
2016-10-17 10:37:47 -04:00
Gian Merlino
0ce33bc95f HdfsDataSegmentPusher: Properly include scheme, host in output path if necessary. (#3577)
Fixes #3576.
2016-10-17 10:37:18 -04:00
David Lim
472c409b99 KafkaLookupExtractorFactory: shutdown kafka consumer on close() (#3539)
* shutdown kafka consumer on close

* handle close() race condition
2016-10-15 09:55:51 -07:00
Charles Allen
3b6261c690 Add druid-lookups-cached-single to default distribution build (#3550)
Fixes #3527
2016-10-15 08:11:04 -07:00
Navis Ryu
4554c1214b Avoid exceptions for dataSource spec when using s3 (#3544) 2016-10-14 18:24:19 -07:00
Roman Leventov
5dc95389f7 Add Checkstyle framework (#3551)
* Add Checkstyle framework

* Avoid star import

* Need braces for control flow statements

* Redundant imports

* Add NewLineAtEndOfFile check
2016-10-13 13:37:47 -07:00
Roman Leventov
85ac8eff90 Improve performance of IndexMergerV9 (#3440)
* Improve performance of StringDimensionMergerV9 and StringDimensionMergerLegacy by avoiding primitive int boxing by using IntIterator in IndexedInts instead of Iterator<Integer>; Extract some common logic for V9 and Legacy mergers; Minor improvements to resource handling in StringDimensionMergerV9

* Don't mask index in MergeIntIterator.makeQueueElement()

* DRY conversion RoaringBitmap's IntIterator to fastutil's IntIterator

* Do implement skip(n) in IntIterators extending AbstractIntIterator because original implementation is not reliable

* Use Test(expected=Exception.class) instead of try { } catch (Exception e) { /* ignore */ }
2016-10-13 08:28:46 -07:00
Gian Merlino
ddc856214d When inserting segments, mark unused if already overshadowed. (#3499)
This is useful for the insert-segment-to-db tool, which would otherwise
potentially insert a lot of overshadowed segments as "used", causing
load and drop churn in the cluster.
2016-10-10 18:10:18 -07:00
jaehong choi
6f21778364 Support finding segments in AWS S3. (#3399)
* support finding segments from a AWS S3 storage.

* add more Uts

* address comments and add a document for the feature.

* update docs indentation

* update docs indentation

* address comments.
1. add a Ut for json ser/deser for the config object.
2. more informant error message in a Ut.

* address comments.
1. use @Min to validate the configuration object
2. change updateDescriptor to a string as it does not take an argument otherwise

* fix a Ut failure - delete a Ut for testing default max length.
2016-10-10 17:27:09 -07:00
Parag Jain
1e79a1be82 fix useExplicitVersion (#3559) 2016-10-10 14:28:06 -05:00
Akash Dwivedi
3a83e0513e Doc update(batch-ingestion) to include useExplicitVersion. (#3557) 2016-10-07 14:48:00 -07:00
Parag Jain
c255dd8b19 fix datasegment metadata (#3555) 2016-10-07 16:30:33 -05:00
Akash Dwivedi
078de4fcf9 Use explicit version from HadoopIngestionSpec. (#3554) 2016-10-07 13:59:14 -07:00
Himanshu
7e6824501c fix: QueryResource thread name includes whole inner query string for nested query (#3549)
* print exception details from QueryInterruptedException

* in QueryResource.java, set thread name to include dataSource names and not whole query string e.g. from QueryDataSource
2016-10-06 18:30:52 -07:00
Parag Jain
76a60a007e create parent dir on HDFS if it does not exist (#3547) 2016-10-06 16:14:00 -07:00
Charles Allen
76e77cb610 Make segment creation gauva 14 friendly (#3520) 2016-10-05 15:25:03 -07:00
Himanshu
1523de08fb SketchAggregatorFactory.combine(..) returns Union object now so that it can be reused across multiple combine(..) calls (#3471) 2016-10-05 08:40:14 -07:00
Parag Jain
592903571a add context to kafka supervisor for the kafka indexing task (#3464) 2016-10-04 20:08:43 -05:00
Parag Jain
e419407eba handle supervisor spec metadata failures (#3456)
close kafka consumer in case supervisor start fails
2016-10-04 10:15:28 -07:00
praveev
43cdc675c7 Add support for timezone in segment granularity (#3528)
* Add support for timezone in segment granularity

* CR feedback. Handle null timezone during equals check.

* Include timezone in docs.
Add timezone for ArbitraryGranularitySpec.
2016-10-03 08:15:42 -07:00