Commit Graph

7815 Commits

Author SHA1 Message Date
Roman Leventov 85ac8eff90 Improve performance of IndexMergerV9 (#3440)
* Improve performance of StringDimensionMergerV9 and StringDimensionMergerLegacy by avoiding primitive int boxing by using IntIterator in IndexedInts instead of Iterator<Integer>; Extract some common logic for V9 and Legacy mergers; Minor improvements to resource handling in StringDimensionMergerV9

* Don't mask index in MergeIntIterator.makeQueueElement()

* DRY conversion RoaringBitmap's IntIterator to fastutil's IntIterator

* Do implement skip(n) in IntIterators extending AbstractIntIterator because original implementation is not reliable

* Use @Test(expected=Exception.class) instead of try { } catch (Exception e) { /* ignore */ }
2016-10-13 08:28:46 -07:00
Gian Merlino ddc856214d When inserting segments, mark unused if already overshadowed. (#3499)
This is useful for the insert-segment-to-db tool, which would otherwise
potentially insert a lot of overshadowed segments as "used", causing
load and drop churn in the cluster.
2016-10-10 18:10:18 -07:00
jaehong choi 6f21778364 Support finding segments in AWS S3. (#3399)
* support finding segments from an AWS S3 storage.

* add more Uts

* address comments and add a document for the feature.

* update docs indentation

* update docs indentation

* address comments.
1. add a Ut for json ser/deser for the config object.
2. more informative error message in a Ut.

* address comments.
1. use @Min to validate the configuration object
2. change updateDescriptor to a string as it does not take an argument otherwise

* fix a Ut failure - delete a Ut for testing default max length.
2016-10-10 17:27:09 -07:00
Parag Jain 1e79a1be82 fix useExplicitVersion (#3559) 2016-10-10 14:28:06 -05:00
Akash Dwivedi 3a83e0513e Doc update(batch-ingestion) to include useExplicitVersion. (#3557) 2016-10-07 14:48:00 -07:00
Parag Jain c255dd8b19 fix datasegment metadata (#3555) 2016-10-07 16:30:33 -05:00
Akash Dwivedi 078de4fcf9 Use explicit version from HadoopIngestionSpec. (#3554) 2016-10-07 13:59:14 -07:00
Himanshu 7e6824501c fix: QueryResource thread name includes whole inner query string for nested query (#3549)
* print exception details from QueryInterruptedException

* in QueryResource.java, set thread name to include dataSource names and not whole query string e.g. from QueryDataSource
2016-10-06 18:30:52 -07:00
Parag Jain 76a60a007e create parent dir on HDFS if it does not exist (#3547) 2016-10-06 16:14:00 -07:00
Charles Allen 76e77cb610 Make segment creation guava 14 friendly (#3520) 2016-10-05 15:25:03 -07:00
Himanshu 1523de08fb SketchAggregatorFactory.combine(..) returns Union object now so that it can be reused across multiple combine(..) calls (#3471) 2016-10-05 08:40:14 -07:00
Parag Jain 592903571a add context to kafka supervisor for the kafka indexing task (#3464) 2016-10-04 20:08:43 -05:00
Parag Jain e419407eba handle supervisor spec metadata failures (#3456)
close kafka consumer in case supervisor start fails
2016-10-04 10:15:28 -07:00
praveev 43cdc675c7 Add support for timezone in segment granularity (#3528)
* Add support for timezone in segment granularity

* CR feedback. Handle null timezone during equals check.

* Include timezone in docs.
Add timezone for ArbitraryGranularitySpec.
2016-10-03 08:15:42 -07:00
Gian Merlino 40f2fe7893 Bump versions to 0.9.3-SNAPSHOT (#3524) 2016-09-29 13:53:32 -07:00
Charles Allen 654e1db309 Add simple test to FunctionalExtractionTest (#3522) 2016-09-28 23:45:15 -07:00
John Zhang 78b06a7d7e make global http client worker threads configurable (#3514) 2016-09-28 23:18:51 -07:00
Maciej Bryński d0ea84149f Changing num threads to 9 (#3492) 2016-09-28 10:44:51 -06:00
Xiaoyao 91e6ab4fcf LRU cache guarantee to keep size under limit (#3510)
* LRU cache guarantee to keep size under limit

* address comments

* fix failed tests in jdk7
2016-09-27 17:13:06 -07:00
Parag Jain 56b0586097 secure BrokerQueryResource endpoints (#3506) 2016-09-26 11:27:24 -07:00
Parag Jain 15c9918c65 log exceptions while trying to pause task (#3504) 2016-09-23 16:53:23 -07:00
Gian Merlino d5a8a35fec groupBy: GroupByRowProcessor fixes, invert subquery context overrides. (#3502)
- Fix GroupByRowProcessor config overrides
- Fix GroupByRowProcessor resource limit checking
- Invert subquery context overrides such that for the subquery, its own
  keys override keys from the outer query, not the other way around.

The last bit is necessary for the test to work, and seems like a better
way to do it anyway.
2016-09-23 14:41:09 -07:00
Gian Merlino 7195be32d8 groupBy v2: Fix dangling references. (#3500)
Acquiring references in the processing task prevents dangling references
caused by canceled processing tasks.
2016-09-24 01:59:11 +05:30
David Lim 9226d4af3c configurable shutdownTimeout for Kafka supervisor (#3497)
* configurable shutdownTimeout

* cr change
2016-09-23 13:26:45 -06:00
David Lim ca9114b41b add supervisor reset API (#3484)
* add supervisor reset API

* CR doc changes and kill running tasks / clear offsets from supervisor
2016-09-22 17:51:06 -07:00
Nishant 6099d20303 [FIX] ReleaseException when the path is being written by multiple tasks (#3494)
* fix ReleaseException when the path is being written by multiple tasks

* Do not throw IOException if another replica wins the race for segment creation

fix if check

* handle logging comments

* fix test
2016-09-22 14:25:41 -05:00
Gian Merlino f8d71fc602 groupBy: Fix maxMergingDictionarySize config. (#3488) 2016-09-22 10:02:33 -07:00
Gian Merlino c87ecea975 Fix ListFilteredDimensionSpec blacklisting on non-present values. (#3487) 2016-09-22 09:12:02 -07:00
Navis Ryu 74e1243c7e Fix test fail of PollingLookupTest.testApplyAfterDataChange (#3489) 2016-09-22 08:33:59 -07:00
Navis Ryu 49c0fe0e8b Show candidate hosts for the given query (#2282)
* Show candidate hosts for the given query

* Added test cases & minor changes to address comments

* Changed path-param to query-param for intervals/numCandidates
2016-09-22 08:32:38 -07:00
Fokko Driesprong 67920c114e Fixed info message (#3481) 2016-09-21 15:50:29 -07:00
Gian Merlino 27bd5cb13a Add forceExtendableShardSpecs option to Hadoop indexing, IndexTask. (#3473)
Fixes #3241.
2016-09-21 13:40:04 -06:00
Himanshu 05ea88df5c fix kafka-indexing-service pom to not reference specific version but parent version for druid core dependencies (#3472) 2016-09-20 15:18:21 -07:00
David Lim 96fcca18ea update KafkaSupervisor to make HTTP requests to tasks in parallel where possible (#3452) 2016-09-20 22:51:15 +05:30
Keuntae Park 54ec4dd584 Support renaming of outputName for cached select and search query results (#3395)
* support renaming of outputName for cached select and search queries

* rebase and resolve conflicts

* rollback CacheStrategy interface change

* updated based on review comments
2016-09-20 08:19:14 -07:00
Slim 3175e17a3b Cached lookup module. first cut implementing JDBC cache (#2819) 2016-09-16 13:45:54 -07:00
Charles Allen 95e08b38ea [QTL] Reduced Locking Lookups (#3071)
* Lockless lookups

* Fix compile problem

* Make stack trace throw instead

* Remove non-germane change

* * Add better naming to cache keys. Makes logging nicer
* Fix #3459

* Move start/stop lock to non-interruptable for readability purposes
2016-09-16 11:54:23 -07:00
Gian Merlino 76fcbd8fc5 Update Curator, ZK to latest stable versions. (#3461) 2016-09-16 09:16:14 -07:00
Gleb Smirnov d981a2aa02 Avoid interrupting ZookeeperConsumerConnector.shutdown() #3346 (#3403) 2016-09-14 17:44:27 -07:00
Gian Merlino 7a2a4bc6de JavaScript: Disable now affects worker selection and router strategy too. (#3458) 2016-09-13 16:37:42 -07:00
Gian Merlino e0e28866ee JavaScript docs: Fix links and typos, add to TOC. (#3457) 2016-09-13 15:26:44 -07:00
Himanshu a069257d37 avro-extension -- feature to specify multiple avro reader schemas inline (#3368)
* rename SimpleAvroBytesDecoder to InlineSchemaAvroBytesDecoder

* feature to specify multiple schemas inline in avro module
2016-09-13 14:54:31 -07:00
Gian Merlino 76a24054e3 JavaScript docs, including docs for globals. (#3454) 2016-09-13 13:46:55 -07:00
Gian Merlino bcff08826b KafkaIndexTask: Treat null values as unparseable. (#3453) 2016-09-13 10:56:38 -07:00
Slim ba6ddf307e Adding hadoop kerberos authentication. (#3419)
* adding kerberos authentication

* make the 2 functions identical
2016-09-13 10:42:50 -07:00
Jonathan Wei df766b2bbd Add dimension handling interface for ingestion and segment creation (#3217)
* Add dimension handling interface for ingestion and segment creation

* update javadocs for DimensionHandler/DimensionIndexer

* Move IndexIO row validation into DimensionHandler

* Fix null column skipping in mergerV9

* Add deprecation note for 'numeric_dims' filename pattern in IndexIO v8->v9 conversion

* Fix java7 test failure
2016-09-12 12:54:02 -07:00
Alexander Saydakov 1a5042ca26 updated dependency on sketches-core (#3443)
* updated dependency on sketches-core to 0.7.0

* Use sketches-core-0.4.1, which is the latest version still compatible
with JDK7
2016-09-09 16:21:32 -07:00
Slim 6a1cd7fc66 avoid throwing exceptions fix #3389 (#3441)
* avoid throwing exceptions

* log alert

* fix comments
2016-09-09 16:19:50 -07:00
Gian Merlino d108461838 groupBy v2: Parallel disk spilling. (#3433)
In ConcurrentGrouper, when it becomes clear that disk spilling is necessary, switch
from hash-based partitioning to thread-based partitioning. This stops processing
threads from blocking each other while spilling is occurring.
2016-09-09 16:49:58 -06:00
Gian Merlino 1344e3c3af Clearer filter docs. (#3448) 2016-09-09 13:47:13 -07:00