7258 Commits

Author SHA1 Message Date
michaelschiff 2203a812bc statsd-emitter (#2410) 2016-04-28 18:41:02 -07:00
David Lim 5f0a9ccc57 fix ClassCastException in FiniteAppenderatorDriver (#2896) 2016-04-28 18:39:24 -07:00
Charles Allen 3f71a4a302 Fix missing log arguments in PendingTaskBasedWorkerResourceManagementStrategy (#2898) 2016-04-28 18:15:41 -07:00
Gian Merlino 67b47c982f Datasketches: Remove isInputThetaSketch from cache key. (#2899) 2016-04-28 18:14:52 -07:00
Parag Jain 0d745ee120 Basic authorization support in Druid (#2424)
- Introduce `AuthorizationInfo` interface, specific implementations of which would be provided by extensions
- If `druid.auth.enabled` is set to `true`, the `isAuthorized` method of `AuthorizationInfo` will be called to perform authorization checks
- An `AuthorizationInfo` object will be created in the servlet filters of the specific extension and passed as a request attribute named `AuthConfig.DRUID_AUTH_TOKEN`
- As per the scope of this PR, all resources that need to be secured are divided into 3 types: `DATASOURCE`, `CONFIG` and `STATE`. For any type of resource, the possible actions are `READ` or `WRITE`
- Specific ResourceFilters are used to perform auth checks for all endpoints that correspond to a specific resource type. This prevents duplication of logic and the need to inject HttpServletRequest inside each endpoint. For example:
 - `DatasourceResourceFilter` is used for endpoints where the datasource information is present after the "datasources" segment in the request path, such as `/druid/coordinator/v1/datasources/`, `/druid/coordinator/v1/metadata/datasources/`, `/druid/v2/datasources/`
 - `RulesResourceFilter` is used where the datasource information is present after the "rules" segment in the request path, such as `/druid/coordinator/v1/rules/`
 - `TaskResourceFilter` is used for endpoints where the datasource information is present after the "task" segment in the request path, such as `druid/indexer/v1/task`
 - `ConfigResourceFilter` is used for endpoints like `/druid/coordinator/v1/config`, `/druid/indexer/v1/worker`, `/druid/worker/v1` etc
 - `StateResourceFilter` is used for endpoints like `/druid/broker/v1/loadstatus`, `/druid/coordinator/v1/leader`, `/druid/coordinator/v1/loadqueue`, `/druid/coordinator/v1/rules` etc
- For endpoints that return a list of resources, like `/druid/coordinator/v1/datasources`, `/druid/indexer/v1/completeTasks` etc., the list is filtered to return only the resources to which the requesting user has access. In these cases, an `HttpServletRequest` instance needs to be injected in the endpoint method.

Note -
The JAX-RS specification provides an interface called `SecurityContext`. However, we did not use it and instead provided our own interface, `AuthorizationInfo`, mainly because it offers more flexibility. For example, `SecurityContext` has a method called `isUserInRole(String role)` that would be used for auth checks. Using it would require modeling the mapping from roles to accessible resources inside Druid, by convention or some other means, which is not very flexible because Druid has dynamic resources such as datasources. Fixes #2355 with PR #2424
2016-04-28 16:50:28 -07:00
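
The resource model and list filtering described in the commit message above can be illustrated with a minimal Python sketch. This is not Druid's Java code: the `AllowListAuthorizationInfo` class, the `is_authorized(resource, action)` signature, the `Resource` type, and the datasource names are all assumptions made for illustration. Only the three resource types, the two actions, and the idea of an `isAuthorized` check come from the message.

```python
from dataclasses import dataclass
from enum import Enum


class ResourceType(Enum):
    # The three resource types named in the commit message.
    DATASOURCE = "DATASOURCE"
    CONFIG = "CONFIG"
    STATE = "STATE"


class Action(Enum):
    # The two possible actions named in the commit message.
    READ = "READ"
    WRITE = "WRITE"


@dataclass(frozen=True)
class Resource:
    # Hypothetical value type pairing a resource name with its type.
    name: str
    type: ResourceType


class AllowListAuthorizationInfo:
    """Hypothetical stand-in for an extension-provided AuthorizationInfo."""

    def __init__(self, grants):
        # grants: iterable of (Resource, Action) pairs the user may perform.
        self._grants = set(grants)

    def is_authorized(self, resource, action):
        # Assumed signature; the real interface may differ.
        return (resource, action) in self._grants


def filter_readable_datasources(names, auth_info):
    """Keep only the datasources the requesting user may READ."""
    return [
        name
        for name in names
        if auth_info.is_authorized(
            Resource(name, ResourceType.DATASOURCE), Action.READ
        )
    ]


# Example: the user may read "wikipedia" but nothing else.
auth = AllowListAuthorizationInfo(
    {(Resource("wikipedia", ResourceType.DATASOURCE), Action.READ)}
)
visible = filter_readable_datasources(["wikipedia", "clickstream"], auth)
print(visible)
```

Because the check takes a concrete resource rather than a role name, a newly created datasource needs no role mapping to be defined in advance, which is the flexibility argument made in the note.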
David Lim 890bdb543d doc fixes (#2897) 2016-04-28 15:34:58 -07:00
Himanshu 9669e79df2 fix misleading error log due to race in RTR and concurrency test (#2878) 2016-04-28 10:28:00 -07:00
Gian Merlino 16080dc54f Adjust colliding aggregator cache IDs. (#2891)
- Renumbered ApproximateHistogramAggregatorFactory from 8 to 12,
  8 was taken by CardinalityAggregatorFactory
- Renumbered ApproximateHistogramFoldingAggregatorFactory from 9 to 13,
  9 was taken by FilteredAggregatorFactory
2016-04-28 10:11:33 -07:00
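
Why colliding IDs matter can be shown with a small Python sketch. It assumes a simplified cache key layout (one type-ID byte followed by the field name), which is a hypothetical stand-in rather than Druid's actual key format. The factory names and the ID values 8, 9, 12 and 13 are taken from the commit message above.

```python
def cache_key(type_id: int, field_name: str) -> bytes:
    # Hypothetical layout: a single type-ID byte, then the UTF-8 field name.
    return bytes([type_id]) + field_name.encode("utf-8")


# IDs before the renumbering, as described in the commit message.
OLD_IDS = {
    "CardinalityAggregatorFactory": 8,
    "FilteredAggregatorFactory": 9,
    "ApproximateHistogramAggregatorFactory": 8,
    "ApproximateHistogramFoldingAggregatorFactory": 9,
}

# IDs after the renumbering.
NEW_IDS = {
    **OLD_IDS,
    "ApproximateHistogramAggregatorFactory": 12,
    "ApproximateHistogramFoldingAggregatorFactory": 13,
}


def colliding_ids(ids):
    """Return {type_id: [factories]} for every ID used by more than one factory."""
    by_id = {}
    for factory, type_id in ids.items():
        by_id.setdefault(type_id, []).append(factory)
    return {tid: names for tid, names in by_id.items() if len(names) > 1}


# With the old IDs, two different aggregators on the same field share a key.
old_clash = cache_key(OLD_IDS["CardinalityAggregatorFactory"], "x") == cache_key(
    OLD_IDS["ApproximateHistogramAggregatorFactory"], "x"
)
new_clash = cache_key(NEW_IDS["CardinalityAggregatorFactory"], "x") == cache_key(
    NEW_IDS["ApproximateHistogramAggregatorFactory"], "x"
)
print(sorted(colliding_ids(OLD_IDS)), colliding_ids(NEW_IDS), old_clash, new_clash)
```

Under this assumed layout, a shared type ID lets two unrelated aggregators produce identical keys, so a cached result for one could be returned for the other.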
Gian Merlino 909abd17f3 Sketch cache key should include size, isInputThetaSketch. (#2893) 2016-04-28 10:10:46 -07:00
Gian Merlino 90ce03c66f Fix integer overflow in SegmentMetadataQuery numRows. (#2890) 2016-04-27 14:37:04 -07:00
Nishant c29cb7d711 add pending task based resource management strategy (#2086) 2016-04-27 10:40:53 -07:00
Nishant bf5e5e7b75 fix #2886 (#2887)
Fixes https://github.com/druid-io/druid/issues/2886
2016-04-27 08:29:41 -07:00
Slim 58510d826b fix emit wait time (#2869) 2016-04-26 17:07:03 -07:00
John Wang 5658bd99eb added contextual time parse (#2867) 2016-04-25 13:35:10 -07:00
Slim 55785267e4 postAgg fieldName must match name of AGG (#2874) 2016-04-22 11:11:54 -07:00
binlijin 9151099e08 add document for druid.segmentCache.numBootstrapThreads (#2872) 2016-04-22 12:06:08 +08:00
Gian Merlino 6dc7688a29 TimeAndDims equals/hashCode implementation. (#2870)
Adapted from #2692, thanks @navis for original implementation.
2016-04-22 08:45:20 +08:00
Himanshu 3cfd9c64c9 make singleThreaded groupBy query config overridable at query time (#2828)
* make isSingleThreaded groupBy query processing overridable at query time

* refactor code in GroupByMergedQueryRunner to make processing of single threaded and parallel merging of runners consistent
2016-04-21 17:12:58 -07:00
David Lim 7641f2628f add control and status endpoints to KafkaIndexTask (#2730) 2016-04-21 15:34:59 -07:00
Slim 984a518c9f Merge pull request #2734 from b-slim/LookupIntrospection2
[QTL][Lookup] adding introspection endpoint
2016-04-21 12:15:57 -05:00
Gian Merlino c74391e54c JavaScript: Ability to disable. (#2853)
Fixes #2852.
2016-04-21 09:43:15 -05:00
Xavier Léauté 5938d9085b Stream segments from database (#2859)
* Avoids fetching all segment records into heap by JDBC driver
* Set connection to read-only to help database optimize queries
* Update JDBC drivers (MySQL has fixes for streaming results)
2016-04-21 05:40:07 +08:00
Gian Merlino 7d3e55717d Reduce cost of various toFilter calls. (#2860)
These happen once per segment and so it's better if they don't do
as much work.
2016-04-21 04:28:46 +08:00
Xavier Léauté fc91120b54 Merge pull request #2857 from metamx/upgrade-zk
upgrade zookeeper client dependency to 3.4.8
2016-04-20 10:36:07 +05:30
Gian Merlino 59460b17cc Add Filters.matchPredicate helper, use it where appropriate. (#2851)
This approach simplifies code and is generally faster, due to skipping
unnecessary dictionary lookups (see #2850).
2016-04-19 15:54:32 -07:00
Xavier Léauté b2745befb7 remove obsolete comment (#2858) 2016-04-19 13:06:58 -07:00
Nishant f80a5dc4ef Avoid creating multiple NoneShardSpec objects (#2855)
* Avoid creating multiple NoneShardSpec objects
* deprecate NoneShardSpec constructor
2016-04-19 10:30:14 -07:00
Nishant dbf63f738f Add ability to filter segments for specific dataSources on broker without creating tiers (#2848)
* Add back FilteredServerView removed in a32906c7fd to reduce memory usage using watched tiers.

* Add functionality to specify "druid.broker.segment.watchedDataSources"
2016-04-19 10:10:06 -07:00
Gian Merlino 08c784fbf6 KafkaIndexTask: Use a separate sequence per Kafka partition in order to make (#2844)
segment creation deterministic.

This means that each segment will contain data from just one Kafka
partition. So, users will probably not want to have a super high number
of Kafka partitions...

Fixes #2703.
2016-04-18 22:29:52 -07:00
Jisoo Kim 7b65ca7889 refactor ClientQuerySegmentWalker (#2837)
* refactor ClientQuerySegmentWalker

* add header to FluentQueryRunnerBuilder

* refactor QueryRunnerTestHelper
2016-04-18 14:00:47 -07:00
Xavier Léauté 838768c632 upgrade curator, fixes #2829 (#2849) 2016-04-18 13:17:36 -07:00
Gian Merlino 7c0b1dde3a DimensionPredicateFilter: Skip unnecessary dictionary lookup. (#2850) 2016-04-18 12:38:25 -07:00
Gaurav Kumar f5822faca3 Fixed wrong parseSpec in Avro Hadoop Parser (#2846)
`parseSpec` should contain `format` instead of `type`. It was wrongly defaulting to `tsv`
2016-04-16 11:34:54 -07:00
binlijin c1e690288c Improve some log (#2807) 2016-04-15 09:34:26 -07:00
Nishant 632b21472b fix test failure (#2818)
formatting changes
2016-04-14 21:40:19 -07:00
Jonathan Wei b534f7203c Fix performance regression from #2753 in IndexMerger (#2841) 2016-04-14 21:39:41 -07:00
Jonathan Wei a26134575b Fix NPE in TopNLexicographicResultBuilder.addEntry() (#2835) 2016-04-13 17:27:16 -07:00
du00cs 639d0630b8 jackson conflict workaround in hadoop ingestion & parquet extension coordinate update (#2817) 2016-04-13 14:20:33 -07:00
Fangjin Yang 0c4a42bb6f change toc entry (#2834) 2016-04-13 13:45:07 -07:00
Gian Merlino e320d13385 Fix various broken links in the docs. (#2833) 2016-04-13 13:30:01 -07:00
Gian Merlino 725ee1401d Update tranquility version in the docs. (#2832) 2016-04-13 11:33:59 -07:00
Gian Merlino aa25cc1f68 Fix up Kafka tutorial (#2831)
1) Remove extraneous section
2) Remove -SNAPSHOT version
2016-04-13 11:33:45 -07:00
Fangjin Yang 22c2cd57fb update readme (#2830) 2016-04-13 11:33:31 -07:00
Fangjin Yang abd951df1a Document how to use roaring bitmaps (#2824)
* Document how to use roaring bitmaps

This fixes #2408.
While not all indexSpec properties are explained, it does explain how roaring bitmaps can be turned on.

* fix

* fix

* fix

* fix
2016-04-12 19:28:02 -07:00
michaelschiff db35dd7508 fix issue #2744. Check for null before combining metrics (#2774) 2016-04-12 14:46:31 -07:00
Charles Allen ed5377465a add AirBnB Caravel to list of libraries (#2719) 2016-04-12 12:53:50 -07:00
Sébastien Launay 37d2ab623e Merge pull request #2815 from slaunay/documentation/hadoop-classpath-issue-fix-with-configuration
Doc for mapreduce.job.user.classpath.first=true
2016-04-12 10:51:51 -07:00
Fangjin Yang bf4aa965fb Merge pull request #2777 from metamx/postgresql-upsert
support PostgreSQL >= 9.5 upsert capability
2016-04-12 10:24:08 -07:00
Fangjin Yang 886ee4e30d Merge pull request #2821 from metamx/review-comments-2784
handle review comments for PR 2784
2016-04-12 10:20:43 -07:00
Fangjin Yang b486eff6b7 Merge pull request #2805 from metamx/query-time-start
request log should reflect time the query was received
2016-04-12 09:44:42 -07:00