7484 Commits

Author SHA1 Message Date
Navis Ryu
898c1c21af More best-effort parse long (#3603)
* More best-effort parse long

* addressed comments
2016-10-25 10:31:51 -07:00
David Lim
3c56cbdf82 fix timing issue with KafkaLookupExtractorFactoryTest (#3604) 2016-10-25 07:04:51 -07:00
Himanshu
641469fc38 manage overshadowing efficiently at coordinator (#3584)
* manage overshadowing efficiently at coordinator

* take readlock in VersionedIntervalTimeline.isOvershadowed()
2016-10-24 22:49:08 +05:30
Charles Allen
9bb735133f Move copyright to proper druid.io form (#3602) 2016-10-21 16:23:53 -07:00
Akash Dwivedi
4b3bd8bd63 Migrating java-util from Metamarkets. (#3585)
* Migrating java-util from Metamarkets.

* checkstyle and updated license on java-util files.

* Removed unused imports from whole project.

* cherry pick metamx/java-util@826021f.

* Copyright changes on java-util pom, address review comments.
2016-10-21 14:57:07 -07:00
Navis Ryu
8b7ff4409a Math expressional parameters for aggregator (#2783)
* Supports expression-paramed aggregator (squashed and rebased on master) also includes math post aggregator (was #2820)

* Addressed comments

* addressed comments
2016-10-19 13:58:35 -05:00
Roman Leventov
b113a34355 In CPUTimeMetricQueryRunner, account CPU consumed in baseSequence.toYielder() (#3587) 2016-10-18 09:06:42 -05:00
Charles Allen
2c5c8198db Make query/cpu/time still report on error (#3535) 2016-10-18 08:26:21 -05:00
Nishant
8ea5f9324d Integration Tests - fix middlemanager property name in doc (#3586) 2016-10-18 08:23:34 -05:00
Gian Merlino
dd0bb6da1e Unit test for #3544: Avoid exceptions for dataSource spec when using s3. (#3571) 2016-10-17 12:41:43 -07:00
Roman Leventov
9611358f0a Small topn scan improvements (#3526)
* Remove unused numProcessed param from PooledTopNAlgorithm.aggregateDimValue()

* Replace AtomicInteger with simple int in PooledTopNAlgorithm.scanAndAggregate() and aggregateDimValue()

* Remove unused import
2016-10-17 10:36:19 -07:00
Gian Merlino
285516bede Workaround non-thread-safe use of HLL aggregators. (#3578)
Despite the non-thread-safety of HyperLogLogCollector, it is actually currently used
by multiple threads during realtime indexing. HyperUniquesAggregator's "aggregate" and
"get" methods can be called simultaneously by OnheapIncrementalIndex, since its
"doAggregate" and "getMetricObjectValue" methods are not synchronized.

This means that the optimization of HyperLogLogCollector.fold in #3314 (saving and
restoring position rather than duplicating the storage buffer of the right-hand side)
could cause corruption in the face of concurrent writes.

This patch works around the issue by duplicating the storage buffer in "get" before
returning a collector. The returned collector still shares data with the original one,
but the situation is no worse than before #3314. In the future we may want to consider
making a thread safe version of HLLC that avoids these kinds of problems in realtime
indexing. But for now I thought it was best to do a small change that restored the old
behavior.
2016-10-17 09:39:12 -07:00
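The buffer-duplication idea in #3578 can be illustrated with plain java.nio. This is a minimal sketch, not Druid's code: the class and method names are hypothetical, and a bare ByteBuffer stands in for the collector's storage. It shows that duplicate() gives a reader its own position and limit while still sharing the underlying bytes, which is why the commit describes the returned collector as still sharing data with the original.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for a collector's storage; not Druid's HyperLogLogCollector.
public class BufferDuplicateSketch {

    // Sums every byte up to the limit by reading through a duplicate,
    // so the caller's buffer position is never moved.
    static int sumWithDuplicate(ByteBuffer storage) {
        ByteBuffer view = storage.duplicate(); // shared content, independent position/limit
        view.position(0);
        int sum = 0;
        while (view.hasRemaining()) {
            sum += view.get();
        }
        return sum;
    }

    public static void main(String[] args) {
        ByteBuffer storage = ByteBuffer.allocate(4);
        storage.put((byte) 1).put((byte) 2); // writer's position is now 2

        int sum = sumWithDuplicate(storage);
        System.out.println("sum = " + sum);
        System.out.println("writer position = " + storage.position());

        // The duplicate still shares bytes with the original:
        ByteBuffer view = storage.duplicate();
        view.put(0, (byte) 9);
        System.out.println("first byte seen by original = " + storage.get(0));
    }
}
```

Reading through a duplicate avoids the save-and-restore of position on a shared buffer, which is the pattern the commit identifies as unsafe under concurrent writes. It does not make the contents themselves thread safe.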
David Lim
c2ae734848 KafkaIndexTask: Allow run thread to stop gracefully instead of interrupting (#3534)
* allow run thread to gracefully complete instead of interrupting when stopGracefully() is called

* add comments
2016-10-17 10:52:19 -04:00
Gian Merlino
c1d3b8a30c Remove dropwizard-jdbc dependency from lookups-cached-single. (#3573)
Fixes #3548.
2016-10-17 10:37:47 -04:00
Gian Merlino
0ce33bc95f HdfsDataSegmentPusher: Properly include scheme, host in output path if necessary. (#3577)
Fixes #3576.
2016-10-17 10:37:18 -04:00
David Lim
472c409b99 KafkaLookupExtractorFactory: shutdown kafka consumer on close() (#3539)
* shutdown kafka consumer on close

* handle close() race condition
2016-10-15 09:55:51 -07:00
Charles Allen
3b6261c690 Add druid-lookups-cached-single to default distribution build (#3550)
Fixes #3527
2016-10-15 08:11:04 -07:00
Navis Ryu
4554c1214b Avoid exceptions for dataSource spec when using s3 (#3544) 2016-10-14 18:24:19 -07:00
Roman Leventov
5dc95389f7 Add Checkstyle framework (#3551)
* Add Checkstyle framework

* Avoid star import

* Need braces for control flow statements

* Redundant imports

* Add NewLineAtEndOfFile check
2016-10-13 13:37:47 -07:00
Roman Leventov
85ac8eff90 Improve performance of IndexMergerV9 (#3440)
* Improve performance of StringDimensionMergerV9 and StringDimensionMergerLegacy by avoiding primitive int boxing by using IntIterator in IndexedInts instead of Iterator<Integer>; Extract some common logic for V9 and Legacy mergers; Minor improvements to resource handling in StringDimensionMergerV9

* Don't mask index in MergeIntIterator.makeQueueElement()

* DRY conversion RoaringBitmap's IntIterator to fastutil's IntIterator

* Do implement skip(n) in IntIterators extending AbstractIntIterator because original implementation is not reliable

* Use Test(expected=Exception.class) instead of try { } catch (Exception e) { /* ignore */ }
2016-10-13 08:28:46 -07:00
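The boxing point in #3440 can be sketched with the JDK alone. The commit itself uses fastutil's IntIterator; the example below substitutes java.util.PrimitiveIterator.OfInt as an assumed stand-in, and the class and method names are hypothetical. It contrasts an Iterator<Integer>, where each element is an Integer object, with a primitive iterator that returns int directly.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.PrimitiveIterator;

// Hypothetical illustration; Druid's mergers use fastutil's IntIterator instead.
public class IntIterationSketch {

    // Boxed path: next() returns an Integer that is unboxed on every step.
    static long sumBoxed(Iterator<Integer> it) {
        long sum = 0;
        while (it.hasNext()) {
            sum += it.next();
        }
        return sum;
    }

    // Primitive path: nextInt() returns an int, so no Integer is created.
    static long sumPrimitive(PrimitiveIterator.OfInt it) {
        long sum = 0;
        while (it.hasNext()) {
            sum += it.nextInt();
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] rowIds = {3, 5, 8};
        long boxed = sumBoxed(Arrays.stream(rowIds).boxed().iterator());
        long primitive = sumPrimitive(Arrays.stream(rowIds).iterator());
        System.out.println(boxed + " " + primitive);
    }
}
```

Both paths compute the same result; the difference is only in per-element allocation, which is the cost the commit describes avoiding.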
Gian Merlino
ddc856214d When inserting segments, mark unused if already overshadowed. (#3499)
This is useful for the insert-segment-to-db tool, which would otherwise
potentially insert a lot of overshadowed segments as "used", causing
load and drop churn in the cluster.
2016-10-10 18:10:18 -07:00
jaehong choi
6f21778364 Support finding segments in AWS S3. (#3399)
* support finding segments from AWS S3 storage.

* add more Uts

* address comments and add a document for the feature.

* update docs indentation

* update docs indentation

* address comments.
1. add a Ut for json ser/deser for the config object.
2. more informative error message in a Ut.

* address comments.
1. use @Min to validate the configuration object
2. change updateDescriptor to a string as it does not take an argument otherwise

* fix a Ut failure - delete a Ut for testing default max length.
2016-10-10 17:27:09 -07:00
Parag Jain
1e79a1be82 fix useExplicitVersion (#3559) 2016-10-10 14:28:06 -05:00
Akash Dwivedi
3a83e0513e Doc update(batch-ingestion) to include useExplicitVersion. (#3557) 2016-10-07 14:48:00 -07:00
Parag Jain
c255dd8b19 fix datasegment metadata (#3555) 2016-10-07 16:30:33 -05:00
Akash Dwivedi
078de4fcf9 Use explicit version from HadoopIngestionSpec. (#3554) 2016-10-07 13:59:14 -07:00
Himanshu
7e6824501c fix: QueryResource thread name includes whole inner query string for nested query (#3549)
* print exception details from QueryInterruptedException

* in QueryResource.java, set thread name to include dataSource names and not whole query string e.g. from QueryDataSource
2016-10-06 18:30:52 -07:00
Parag Jain
76a60a007e create parent dir on HDFS if it does not exist (#3547) 2016-10-06 16:14:00 -07:00
Charles Allen
76e77cb610 Make segment creation guava 14 friendly (#3520) 2016-10-05 15:25:03 -07:00
Himanshu
1523de08fb SketchAggregatorFactory.combine(..) returns Union object now so that it can be reused across multiple combine(..) calls (#3471) 2016-10-05 08:40:14 -07:00
Parag Jain
592903571a add context to kafka supervisor for the kafka indexing task (#3464) 2016-10-04 20:08:43 -05:00
Parag Jain
e419407eba handle supervisor spec metadata failures (#3456)
close kafka consumer in case supervisor start fails
2016-10-04 10:15:28 -07:00
praveev
43cdc675c7 Add support for timezone in segment granularity (#3528)
* Add support for timezone in segment granularity

* CR feedback. Handle null timezone during equals check.

* Include timezone in docs.
Add timezone for ArbitraryGranularitySpec.
2016-10-03 08:15:42 -07:00
Gian Merlino
40f2fe7893 Bump versions to 0.9.3-SNAPSHOT (#3524) 2016-09-29 13:53:32 -07:00
Charles Allen
654e1db309 Add simple test to FunctionalExtractionTest (#3522) 2016-09-28 23:45:15 -07:00
John Zhang
78b06a7d7e make global http client worker threads configurable (#3514) 2016-09-28 23:18:51 -07:00
Maciej Bryński
d0ea84149f Changing num threads to 9 (#3492) 2016-09-28 10:44:51 -06:00
Xiaoyao
91e6ab4fcf LRU cache guarantee to keep size under limit (#3510)
* LRU cache guarantee to keep size under limit

* address comments

* fix failed tests in jdk7
2016-09-27 17:13:06 -07:00
Parag Jain
56b0586097 secure BrokerQueryResource endpoints (#3506) 2016-09-26 11:27:24 -07:00
Parag Jain
15c9918c65 log exceptions while trying to pause task (#3504) 2016-09-23 16:53:23 -07:00
Gian Merlino
d5a8a35fec groupBy: GroupByRowProcessor fixes, invert subquery context overrides. (#3502)
- Fix GroupByRowProcessor config overrides
- Fix GroupByRowProcessor resource limit checking
- Invert subquery context overrides such that for the subquery, its own
  keys override keys from the outer query, not the other way around.

The last bit is necessary for the test to work, and seems like a better
way to do it anyway.
2016-09-23 14:41:09 -07:00
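The override direction described in #3502 can be sketched as a map merge. This is an assumption-laden illustration, not the actual GroupBy code: the class name, method name, and the context keys used are hypothetical. It shows the subquery's own keys winning over keys inherited from the outer query.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the precedence rule; not Druid's implementation.
public class SubqueryContextSketch {

    // Start from the outer query's context, then let the subquery's own
    // entries overwrite any keys that both contexts define.
    static Map<String, Object> contextForSubquery(
            Map<String, Object> outerContext,
            Map<String, Object> subqueryContext) {
        Map<String, Object> merged = new HashMap<>(outerContext);
        merged.putAll(subqueryContext);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> outer = new HashMap<>();
        outer.put("timeout", 1000);   // illustrative key
        outer.put("priority", 1);     // illustrative key

        Map<String, Object> inner = new HashMap<>();
        inner.put("timeout", 500);

        // The subquery's timeout is kept; priority is inherited from the outer query.
        System.out.println(contextForSubquery(outer, inner));
    }
}
```

Swapping the two arguments would give the pre-fix behavior, where the outer query's keys override the subquery's.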
Gian Merlino
7195be32d8 groupBy v2: Fix dangling references. (#3500)
Acquiring references in the processing task prevents dangling references
caused by canceled processing tasks.
2016-09-24 01:59:11 +05:30
David Lim
9226d4af3c configurable shutdownTimeout for Kafka supervisor (#3497)
* configurable shutdownTimeout

* cr change
2016-09-23 13:26:45 -06:00
David Lim
ca9114b41b add supervisor reset API (#3484)
* add supervisor reset API

* CR doc changes and kill running tasks / clear offsets from supervisor
2016-09-22 17:51:06 -07:00
Nishant
6099d20303 [FIX] ReleaseException when the path is being written by multiple tasks (#3494)
* fix ReleaseException when the path is being written by multiple tasks

* Do not throw IOException if another replica wins the race for segment creation

fix if check

* handle logging comments

* fix test
2016-09-22 14:25:41 -05:00
Gian Merlino
f8d71fc602 groupBy: Fix maxMergingDictionarySize config. (#3488) 2016-09-22 10:02:33 -07:00
Gian Merlino
c87ecea975 Fix ListFilteredDimensionSpec blacklisting on non-present values. (#3487) 2016-09-22 09:12:02 -07:00
Navis Ryu
74e1243c7e Fix test fail of PollingLookupTest.testApplyAfterDataChange (#3489) 2016-09-22 08:33:59 -07:00
Navis Ryu
49c0fe0e8b Show candidate hosts for the given query (#2282)
* Show candidate hosts for the given query

* Added test cases & minor changes to address comments

* Changed path-param to query-param for intervals/numCandidates
2016-09-22 08:32:38 -07:00
Fokko Driesprong
67920c114e Fixed info message (#3481) 2016-09-21 15:50:29 -07:00