7216 Commits

Author SHA1 Message Date
Gian Merlino
e680665f1c Fix Avro parseSpec example, "type" should be "format". (#2918) 2016-05-03 09:22:43 -07:00
binlijin
841be5c61f periodically emit metric segment/scan/pending (#2854) 2016-05-02 22:38:13 -07:00
Navis Ryu
2729fea84d Fix parsing fail of segment id with datasource containing underscore (#2797)
* Fix parsing fail of segment id with underscored datasource (Fix for #2786)

* addressed comment

* renamed and moved code into api. added log4 dependency for tests

* addressed comments

* fixed test fails
2016-05-02 22:37:28 -07:00
Himanshu
6c5bf91f9a publish metrics numJettyConns to see how number of active jetty connections change over time (#2839)
this can be compared with numer of active queries to see if requests are waiting in jetty queue
2016-05-02 14:08:25 -07:00
Charles Allen
6b957aa072 [QTL] Make URI Exctraction Namespace take more sane arguments (#2738)
* Make URI Exctraction Namespace take more sane arguments
* Fixes https://github.com/druid-io/druid/issues/2669

* Update docs

* Rename error message

* Undo overzealous deletion of docs

* Explain caching mechanism a bit more in docs
2016-05-02 12:54:34 -07:00
Charles Allen
54b717bdc3 [QTL] Move kafka-extraction-namespace to the Lookup framework. (#2800)
* Move kafka-extraction-namespace to the Lookup framework.

* Address comments

* Fix missing kafka introspection

* Fix tests to be less racy

* Make testing a bit more leniant

* Make tests even more forgiving

* Add comments to kafka lookup cache method

* Move startStopLock to just use started

* Make start() and stop() idempotent

* Forgot to update test after last change, test now accounts for idempotency

* Add extra idempotency on stop check

* Add more descriptive docs of behavior
2016-05-02 09:45:13 -07:00
John Wang
e1eb3b1d95 Merge pull request #2905 from javasoze/eclipse_formatting
eclipse formatting fixes and added import order specs
2016-04-29 18:42:03 -07:00
Gian Merlino
488d12d592 CombiningSequence: Delay making next yielder on creation until it is actually asked for. (#2892)
This fixes the behavior of limited combining sequences (otherwise limit = 1 would
actually yield 2 elements).
2016-04-29 11:12:58 -07:00
michaelschiff
2203a812bc statsd-emitter (#2410) 2016-04-28 18:41:02 -07:00
David Lim
5f0a9ccc57 fix ClassCastException in FiniteAppenderatorDriver (#2896) 2016-04-28 18:39:24 -07:00
Charles Allen
3f71a4a302 Fix missing log arguments in PendingTaskBasedWorkerResourceManagementStrategy (#2898) 2016-04-28 18:15:41 -07:00
Gian Merlino
67b47c982f Datasketches: Remove isInputThetaSketch from cache key. (#2899) 2016-04-28 18:14:52 -07:00
Parag Jain
0d745ee120 Basic authorization support in Druid (#2424)
- Introduce `AuthorizationInfo` interface, specific implementations of which would be provided by extensions
- If the `druid.auth.enabled` is set to `true` then the `isAuthorized` method of `AuthorizationInfo` will be called to perform authorization checks
-  `AuthorizationInfo` object will be created in the servlet filters of specific extension and will be passed as a request attribute with attribute name as `AuthConfig.DRUID_AUTH_TOKEN`
- As per the scope of this PR, all resources that needs to be secured are divided into 3 types - `DATASOURCE`, `CONFIG` and `STATE`. For any type of resource, possible actions are  - `READ` or `WRITE`
- Specific ResourceFilters are used to perform auth checks for all endpoints that corresponds to a specific resource type. This prevents duplication of logic and need to inject HttpServletRequest inside each endpoint. For example
 - `DatasourceResourceFilter` is used for endpoints where the datasource information is present after "datasources" segment in the request Path such as `/druid/coordinator/v1/datasources/`, `/druid/coordinator/v1/metadata/datasources/`, `/druid/v2/datasources/`
 - `RulesResourceFilter` is used where the datasource information is present after "rules" segment in the request Path such as `/druid/coordinator/v1/rules/`
 - `TaskResourceFilter` is used for endpoints is used where the datasource information is present after "task" segment in the request Path such as `druid/indexer/v1/task`
 - `ConfigResourceFilter` is used for endpoints like `/druid/coordinator/v1/config`, `/druid/indexer/v1/worker`, `/druid/worker/v1` etc
 - `StateResourceFilter` is used for endpoints like `/druid/broker/v1/loadstatus`, `/druid/coordinator/v1/leader`, `/druid/coordinator/v1/loadqueue`, `/druid/coordinator/v1/rules` etc
- For endpoints where a list of resources is returned like `/druid/coordinator/v1/datasources`, `/druid/indexer/v1/completeTasks` etc. the list is filtered to return only the resources to which the requested user has access. In these cases, `HttpServletRequest` instance needs to be injected in the endpoint method.

Note -
JAX-RS specification provides an interface called `SecurityContext`. However, we did not use this but provided our own interface `AuthorizationInfo` mainly because it provides more flexibility. For example, `SecurityContext` has a method called `isUserInRole(String role)` which would be used for auth checks and if used then the mapping of what roles can access what resource needs to be modeled inside Druid either using some convention or some other means which is not very flexible as Druid has dynamic resources like datasources. Fixes #2355 with PR #2424
2016-04-28 16:50:28 -07:00
David Lim
890bdb543d doc fixes (#2897) 2016-04-28 15:34:58 -07:00
Himanshu
9669e79df2 fix misleading error log due to race in RTR and concurrency test (#2878) 2016-04-28 10:28:00 -07:00
Gian Merlino
16080dc54f Adjust colliding aggregator cache IDs. (#2891)
- Renumbered ApproximateHistogramAggregatorFactory from 8 to 12,
  8 was taken by CardinalityAggregatorFactory
- Renumbered ApproximateHistogramFoldingAggregatorFactory from 9 to 13,
  9 was taken by FilteredAggregatorFactory
2016-04-28 10:11:33 -07:00
Gian Merlino
909abd17f3 Sketch cache key should include size, isInputThetaSketch. (#2893) 2016-04-28 10:10:46 -07:00
Gian Merlino
90ce03c66f Fix integer overflow in SegmentMetadataQuery numRows. (#2890) 2016-04-27 14:37:04 -07:00
Nishant
c29cb7d711 add pending task based resource management strategy (#2086) 2016-04-27 10:40:53 -07:00
Nishant
bf5e5e7b75 fix #2886 (#2887)
Fixes https://github.com/druid-io/druid/issues/2886
2016-04-27 08:29:41 -07:00
Slim
58510d826b fix emit wait time (#2869) 2016-04-26 17:07:03 -07:00
John Wang
5658bd99eb added contextual time parse (#2867) 2016-04-25 13:35:10 -07:00
Slim
55785267e4 postAgg filedName must match name of AGG (#2874) 2016-04-22 11:11:54 -07:00
binlijin
9151099e08 add document for druid.segmentCache.numBootstrapThreads (#2872) 2016-04-22 12:06:08 +08:00
Gian Merlino
6dc7688a29 TimeAndDims equals/hashCode implementation. (#2870)
Adapted from #2692, thanks @navis for original implementation.
2016-04-22 08:45:20 +08:00
Himanshu
3cfd9c64c9 make singleThreaded groupBy query config overridable at query time (#2828)
* make isSingleThreaded groupBy query processing overridable at query time

* refactor code in GroupByMergedQueryRunner to make processing of single threaded and parallel merging of runners consistent
2016-04-21 17:12:58 -07:00
David Lim
7641f2628f add control and status endpoints to KafkaIndexTask (#2730) 2016-04-21 15:34:59 -07:00
Slim
984a518c9f Merge pull request #2734 from b-slim/LookupIntrospection2
[QTL][Lookup] adding introspection endpoint
2016-04-21 12:15:57 -05:00
Gian Merlino
c74391e54c JavaScript: Ability to disable. (#2853)
Fixes #2852.
2016-04-21 09:43:15 -05:00
Xavier Léauté
5938d9085b Stream segments from database (#2859)
* Avoids fetching all segment records into heap by JDBC driver
* Set connection to read-only to help database optimize queries
* Update JDBC drivers (MySQL has fixes for streaming results)
2016-04-21 05:40:07 +08:00
Gian Merlino
7d3e55717d Reduce cost of various toFilter calls. (#2860)
These happen once per segment and so it's better if they don't do
as much work.
2016-04-21 04:28:46 +08:00
Xavier Léauté
fc91120b54 Merge pull request #2857 from metamx/upgrade-zk
upgrade zookeeper client dependency to 3.4.8
2016-04-20 10:36:07 +05:30
Gian Merlino
59460b17cc Add Filters.matchPredicate helper, use it where appropriate. (#2851)
This approach simplifies code and is generally faster, due to skipping
unnecessary dictionary lookups (see #2850).
2016-04-19 15:54:32 -07:00
Xavier Léauté
b2745befb7 remove obsolete comment (#2858) 2016-04-19 13:06:58 -07:00
Nishant
f80a5dc4ef Avoid creating multiple NoneShardSpec objects (#2855)
* Avoid creating multiple NoneShardSpec objects
* deprecate NoneShardSpec constructor
2016-04-19 10:30:14 -07:00
Nishant
dbf63f738f Add ability to filter segments for specific dataSources on broker without creating tiers (#2848)
* Add back FilteredServerView removed in a32906c7fd11c9a8554df2621a172353a523a9dd to reduce memory usage using watched tiers.

* Add functionality to specify "druid.broker.segment.watchedDataSources"
2016-04-19 10:10:06 -07:00
Gian Merlino
08c784fbf6 KafkaIndexTask: Use a separate sequence per Kafka partition in order to make (#2844)
segment creation deterministic.

This means that each segment will contain data from just one Kafka
partition. So, users will probably not want to have a super high number
of Kafka partitions...

Fixes #2703.
2016-04-18 22:29:52 -07:00
Jisoo Kim
7b65ca7889 refactor ClientQuerySegmentWalker (#2837)
* refactor ClientQuerySegmentWalker

* add header to FluentQueryRunnerBuilder

* refactor QueryRunnerTestHelper
2016-04-18 14:00:47 -07:00
Xavier Léauté
838768c632 upgrade curator, fixes #2829 (#2849) 2016-04-18 13:17:36 -07:00
Gian Merlino
7c0b1dde3a DimensionPredicateFilter: Skip unnecessary dictionary lookup. (#2850) 2016-04-18 12:38:25 -07:00
Gaurav Kumar
f5822faca3 Fixed wrong parseSpec in Avro Hadoop Parser (#2846)
`parseSpec` should contain `format` instead of `type`. It was wrongly defaulting to `tsv`
2016-04-16 11:34:54 -07:00
binlijin
c1e690288c Improve some log (#2807) 2016-04-15 09:34:26 -07:00
Nishant
632b21472b fix test failure (#2818)
formatting changes
2016-04-14 21:40:19 -07:00
Jonathan Wei
b534f7203c Fix performance regression from #2753 in IndexMerger (#2841) 2016-04-14 21:39:41 -07:00
Jonathan Wei
a26134575b Fix NPE in TopNLexicographicResultBuilder.addEntry() (#2835) 2016-04-13 17:27:16 -07:00
du00cs
639d0630b8 jackson conflict workaround in hadooop ingestio & parquet extension coordinate update (#2817) 2016-04-13 14:20:33 -07:00
Fangjin Yang
0c4a42bb6f change toc entry (#2834) 2016-04-13 13:45:07 -07:00
Gian Merlino
e320d13385 Fix various broken links in the docs. (#2833) 2016-04-13 13:30:01 -07:00
Gian Merlino
725ee1401d Update tranquility version in the docs. (#2832) 2016-04-13 11:33:59 -07:00
Gian Merlino
aa25cc1f68 Fix up Kafka tutorial (#2831)
1) Remove extraneous section
2) Remove -SNAPSHOT version
2016-04-13 11:33:45 -07:00