1362 Commits

Author SHA1 Message Date
Parag Jain
8e31a465ad report hand off count finite appenderator driver (#3925) 2017-02-13 10:41:24 -08:00
Gian Merlino
12317fd001 Bump version to 0.10.0-SNAPSHOT. (#3913) 2017-02-06 17:54:35 -08:00
DaimonPl
93b71e265e Extract HLL related code to separate module (#3900) 2017-02-03 09:45:11 -08:00
Parag Jain
1aabb45a09 auto reset option for Kafka Indexing service (#3842)
* auto reset option for Kafka Indexing service in case message at the offset being fetched is not present anymore at kafka brokers

* review comments

* review comments

* reverted last change

* review comments

* review comments

* fix typo
2017-02-02 14:57:45 -06:00
David Lim
ff52581bd3 IndexTask improvements (#3611)
* index task improvements

* code review changes

* add null check
2017-01-18 14:24:37 -08:00
Jihoon Son
d80bec83cc Enable auto license checking (#3836)
* Enable license checking

* Clean duplicated license headers
2017-01-10 18:13:47 -08:00
Charles Allen
229559b46a Make TaskLockbox's ReentrantLock fair (#3828) 2017-01-07 12:34:47 -08:00
Himanshu
4ca3b7f1e4 overlord helpers framework and tasklog auto cleanup (#3677)
* overlord helpers framework and tasklog auto cleanup

* review comment changes

* further review comments addressed
2016-12-21 15:18:55 -08:00
Gian Merlino
6440ddcbca Fix #3795 (Java 7 compatibility). (#3796)
* Fix #3795 (Java 7 compatibility).

Also introduce Animal Sniffer checks during build, which would
have caught the original problems.

* Add Animal Sniffer on caffeine-cache for JDK8.
2016-12-21 10:19:13 -08:00
Roman Leventov
70e83bea6d Fix PathChildrenCache's ExecutorService leak (#3726)
* Fix PathChildrenCache's executorService leak in Announcer, CuratorInventoryManager and RemoteTaskRunner

* Use a single ExecutorService for all workerStatusPathChildrenCaches in RemoteTaskRunner
2016-12-07 13:00:10 -08:00
Gian Merlino
4e67dd28c0 RemoteTaskRunnerConfig: Fix Guice error on startup. (#3737) 2016-12-06 00:19:53 +05:30
Charles Allen
27ab23ef44 Don't update segment metadata if archive doesn't move anything (#3476)
* Don't update segment metadata if archive doesn't move anything

* Fix restore task to handle potential null values

* Don't try to update empty metadata

* Address review comments

* Move to druid-io java-util
2016-12-01 07:49:28 -08:00
Niketh Sabbineni
2640d170c3 Blacklist workers if they fail for too many times (#3643)
* Blacklist workers if they fail for too many times

* Adding documentation

* Changing timeout to period and updating docs

* 1. Add configurable maxPercentageBlacklistWorkers
2. Rename variable

* Change maxPercentageBlacklistWorkers to double

* Remove thread.sleep
2016-11-29 12:38:56 +05:30
Roman Leventov
c070b4a816 Fix concurrency defects, remove unnecessary volatiles (#3701) 2016-11-22 16:42:28 -08:00
Roman Leventov
7b56cec3b9 Fix resource leaks (#3702) 2016-11-18 21:21:36 +05:30
Gian Merlino
bcd20441be Make buildV9Directly the default. (#3688) 2016-11-14 09:29:32 -08:00
Roman Leventov
fbbb55f867 Update emitter dependency to 0.4.0 and emit "version" dimension for all druid metrics (#3679)
* Update emitter dependency to 0.4.0 and emit "version" dimension for all druid metrics, not only query metrics

* Remove unused imports

* Use empty string instead of "testing-version" as a version placeholder
2016-11-11 17:17:27 -06:00
Himanshu
b76b3f8d85 reset-cluster command to clean up druid state stored on metadata and deep storage (#3670) 2016-11-09 11:07:01 -06:00
Akash Dwivedi
4b3bd8bd63 Migrating java-util from Metamarkets. (#3585)
* Migrating java-util from Metamarkets.

* checkstyle and updated license on java-util files.

* Removed unused imports from whole project.

* cherry pick metamx/java-util@826021f.

* Copyright changes on java-util pom, address review comments.
2016-10-21 14:57:07 -07:00
Parag Jain
1e79a1be82 fix useExplicitVersion (#3559) 2016-10-10 14:28:06 -05:00
Akash Dwivedi
078de4fcf9 Use explicit version from HadoopIngestionSpec. (#3554) 2016-10-07 13:59:14 -07:00
Parag Jain
e419407eba handle supervisor spec metadata failures (#3456)
close kafka consumer in case supervisor start fails
2016-10-04 10:15:28 -07:00
Gian Merlino
40f2fe7893 Bump versions to 0.9.3-SNAPSHOT (#3524) 2016-09-29 13:53:32 -07:00
David Lim
ca9114b41b add supervisor reset API (#3484)
* add supervisor reset API

* CR doc changes and kill running tasks / clear offsets from supervisor
2016-09-22 17:51:06 -07:00
Gian Merlino
27bd5cb13a Add forceExtendableShardSpecs option to Hadoop indexing, IndexTask. (#3473)
Fixes #3241.
2016-09-21 13:40:04 -06:00
Gian Merlino
7a2a4bc6de JavaScript: Disable now affects worker selection and router strategy too. (#3458) 2016-09-13 16:37:42 -07:00
Dave Li
c4e8440c22 Adds long compression methods (#3148)
* add read

* update deprecated guava calls

* add write and vsizeserde

* add benchmark

* separate encoding and compression

* add header and reformat

* update doc

* address PR comment

* fix buffer order

* generate benchmark files

* separate encoding strategy and format

* fix benchmark

* modify supplier write to channel

* add float NONE handling

* address PR comment

* address PR comment 2
2016-08-30 16:17:46 -07:00
Nishant
4c2b8d29d3 Make RTR assign pending tasks by insertion order (#3405) 2016-08-30 12:22:44 -07:00
Gian Merlino
2f46effc8e FileTaskLogsTest: Throw unthrown exception. (#3352) 2016-08-11 09:40:28 -07:00
Himanshu
03cfcf002b fix the race described in #3174 (#3205) 2016-08-10 11:29:50 -07:00
kaijianding
50d52a24fc ability to not rollup at index time, make pre aggregation an option (#3020)
* ability to not rollup at index time, make pre aggregation an option

* rename getRowIndexForRollup to getPriorIndex

* fix doc misspelling

* test query using no-rollup indexes

* fix benchmark fail due to jmh bug
2016-08-02 11:13:05 -07:00
David Lim
d5ed3f1347 change expected response from ACCEPTED to OK (#3280) 2016-07-23 19:48:30 -07:00
Gian Merlino
06624c40c0 Share query handling between Appenderator and RealtimePlumber. (#3248)
Fixes inconsistent metric handling between the two implementations. Formerly,
RealtimePlumber only emitted query/segmentAndCache/time and query/wait and
Appenderator only emitted query/partial/time and query/wait (all per sink).

Now they both do the same thing:
- query/segmentAndCache/time, query/segment/time are the time spent per sink.
- query/cpu/time is the CPU time spent per query.
- query/wait/time is the executor waiting time per sink.

These generally match historical metrics, except segmentAndCache & segment
mean the same thing here, because one Sink may be partially cached and
partially uncached and we aren't splitting that out.
2016-07-19 22:15:13 -05:00
Hyukjin Kwon
55e7a52475 Replace deprecated usage for StringInputRowParser and JSONParseSpec (#3215) 2016-07-14 09:19:17 -07:00
Gian Merlino
ea03906fcf Configurable compressRunOnSerialization for Roaring bitmaps. (#3228)
Defaults to true, which is a change in behavior (this used to be false and unconfigurable).
2016-07-08 10:24:19 +05:30
Xavier Léauté
485e381387 remove datasource from hadoop output path (#3196)
fixes #2083, follow-up to #1702
2016-06-29 08:53:45 -07:00
Hyukjin Kwon
45f553fc28 Replace the deprecated usage of NoneShardSpec (#3166) 2016-06-25 10:27:25 -07:00
Charles Allen
6be18376c0 Make forking task runner have more informative thread names during the long-blocking part (#3172)
* Make forking task runner have more informative thread names during the long-blocking part

* Make string.format do the work
2016-06-24 08:56:01 -07:00
Gian Merlino
ebf890fe79 Update master version to 0.9.2-SNAPSHOT. (#3133) 2016-06-13 13:10:38 -07:00
David Lim
5a3db634ff add synchronization to SupervisorManager (#3077) 2016-06-07 00:29:23 -06:00
David Lim
a2290a8f05 support seamless config changes (#3051) 2016-06-03 13:50:19 -07:00
Charles Allen
474286bbce Make TaskMaster giant lock fair (#3050) 2016-06-02 12:10:40 -07:00
David Lim
3ef24c03b3 Validate X-Druid-Task-Id header in request/response and support retrying on outdated TaskLocation information, add KafkaIndexTaskClient unit tests (#3006)
* validate X-Druid-Task-Id header in request and add header to response

* modify KafkaIndexTaskClient to take a TaskLocationProvider as the TaskLocation may not remain constant
2016-05-25 22:05:18 -07:00
Charles Allen
15ccf451f9 Move QueryGranularity static fields to QueryGranularities (#2980)
* Move QueryGranularity static fields to QueryGranularityUtil
* Fixes #2979

* Add test showing #2979

* change name to QueryGranularities
2016-05-17 16:23:48 -07:00
Charles Allen
eaaad01de7 [QTL] Datasource as lookupTier (#2955)
* Datasource as lookup tier
* Adds an option to let indexing service tasks pull their lookup tier from the datasource they are working for.

* Fix bad docs for lookups lookupTier

* Add Datasource name holder

* Move task and datasource to be pulled from Task file

* Make LookupModule pull from bound dataSource

* Fix test

* Fix code style on imports

* Fix formatting

* Make naming better

* Address code comments about naming
2016-05-17 15:44:42 -07:00
David Lim
b489f63698 Supervisor for KafkaIndexTask (#2656)
* supervisor for kafka indexing tasks

* cr changes
2016-05-04 23:13:13 -07:00
Gian Merlino
f8ddfb9a4b Split SegmentInsertAction and SegmentTransactionalInsertAction for backwards compat. (#2922)
Fixes #2912.
2016-05-04 13:54:34 -07:00
Himanshu
50065c8288 fix spurious failure of RTR concurrency test (#2915) 2016-05-04 10:30:20 -07:00
Charles Allen
3f71a4a302 Fix missing log arguments in PendingTaskBasedWorkerResourceManagementStrategy (#2898) 2016-04-28 18:15:41 -07:00
Parag Jain
0d745ee120 Basic authorization support in Druid (#2424)
- Introduce `AuthorizationInfo` interface, specific implementations of which would be provided by extensions
- If `druid.auth.enabled` is set to `true`, the `isAuthorized` method of `AuthorizationInfo` is called to perform authorization checks
- An `AuthorizationInfo` object is created in the servlet filters of the specific extension and passed as a request attribute named `AuthConfig.DRUID_AUTH_TOKEN`
- Within the scope of this PR, all resources that need to be secured are divided into 3 types - `DATASOURCE`, `CONFIG` and `STATE`. For any type of resource, the possible actions are `READ` or `WRITE`
- Specific ResourceFilters perform auth checks for all endpoints that correspond to a specific resource type. This prevents duplication of logic and the need to inject HttpServletRequest inside each endpoint. For example:
 - `DatasourceResourceFilter` is used for endpoints where the datasource information is present after "datasources" segment in the request Path such as `/druid/coordinator/v1/datasources/`, `/druid/coordinator/v1/metadata/datasources/`, `/druid/v2/datasources/`
 - `RulesResourceFilter` is used where the datasource information is present after "rules" segment in the request Path such as `/druid/coordinator/v1/rules/`
 - `TaskResourceFilter` is used for endpoints where the datasource information is present after "task" segment in the request Path such as `druid/indexer/v1/task`
 - `ConfigResourceFilter` is used for endpoints like `/druid/coordinator/v1/config`, `/druid/indexer/v1/worker`, `/druid/worker/v1` etc
 - `StateResourceFilter` is used for endpoints like `/druid/broker/v1/loadstatus`, `/druid/coordinator/v1/leader`, `/druid/coordinator/v1/loadqueue`, `/druid/coordinator/v1/rules` etc
- For endpoints that return a list of resources, such as `/druid/coordinator/v1/datasources`, `/druid/indexer/v1/completeTasks` etc., the list is filtered to return only the resources to which the requesting user has access. In these cases, an `HttpServletRequest` instance needs to be injected in the endpoint method.

Note -
The JAX-RS specification provides an interface called `SecurityContext`. However, we did not use it and instead provided our own interface, `AuthorizationInfo`, mainly because it is more flexible. For example, `SecurityContext` has a method called `isUserInRole(String role)` that would be used for auth checks. Using it would require modeling the mapping of roles to accessible resources inside Druid, by convention or some other means, which is not very flexible because Druid has dynamic resources such as datasources. Fixes #2355 with PR #2424
2016-04-28 16:50:28 -07:00
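
The authorization model described in the message for #2424 above can be illustrated with a minimal sketch. Only the names `AuthorizationInfo`, `isAuthorized`, the resource types `DATASOURCE`, `CONFIG`, `STATE` and the actions `READ`, `WRITE` come from the commit message. Everything else here is a hypothetical stand-in: the `Resource` class, the `boolean` return type, and the `readOnly` and `filterReadable` helpers. It is not the signature used in the Druid code base.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Illustrative sketch only; not the actual Druid classes.
final class AuthSketch {
  // Resource types and actions named in the commit message.
  enum ResourceType { DATASOURCE, CONFIG, STATE }
  enum Action { READ, WRITE }

  // Assumed shape of a secured resource: a name plus its type.
  static final class Resource {
    final String name;
    final ResourceType type;

    Resource(String name, ResourceType type) {
      this.name = name;
      this.type = type;
    }
  }

  // Interface that an extension would implement. The boolean return
  // type is a simplification assumed for this sketch.
  interface AuthorizationInfo {
    boolean isAuthorized(Resource resource, Action action);
  }

  // Hypothetical implementation: read-only access to an allow-list of datasources.
  static AuthorizationInfo readOnly(Set<String> allowedDatasources) {
    return (resource, action) ->
        resource.type == ResourceType.DATASOURCE
            && action == Action.READ
            && allowedDatasources.contains(resource.name);
  }

  // Mirrors the list-filtering behavior described for endpoints that return
  // a list of resources: keep only what the caller is allowed to read.
  static List<String> filterReadable(List<String> datasources, AuthorizationInfo auth) {
    return datasources.stream()
        .filter(ds -> auth.isAuthorized(new Resource(ds, ResourceType.DATASOURCE), Action.READ))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    AuthorizationInfo auth = readOnly(Collections.singleton("wikipedia"));
    System.out.println(filterReadable(Arrays.asList("wikipedia", "clicks"), auth));
  }
}
```

In this sketch the check is keyed on a resource and an action, not on a role. That matches the stated reason for not using `SecurityContext.isUserInRole`: datasources are dynamic, so a fixed role-to-resource mapping would be hard to maintain.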