125 Commits

Author SHA1 Message Date
Nishant
f576a0ff14 Contrib Extension for Ambari Metrics Emitter (#3767)
* Contrib Extension for Ambari Metrics Emitter

extension to enable druid to send metrics to ambari metrics server
(https://cwiki.apache.org/confluence/display/AMBARI/Metrics)

review comments

switch to public repo

* review comments

* add docs

* fix pom version

* Add link for doc page in extensions.md

* remove unused imports

* review comments

review comments

remove unused dependency

review comment
2016-12-19 11:12:47 -08:00
David Lim
8eee259629 add documentation on segments generated (#3785) 2016-12-19 09:41:47 -08:00
Ninglin Du
469ab21091 [Feature] Thrift support for realtime and batch ingestion (#3418)
* Thrift ingestion plugin

1. thrift binary is platform dependent, use scrooge to generate java files to avoid style check failure
2. stream and hadoop ingestion are both supported, input format can be sequence file and lzo thrift block file.
3. base64 and protocol aware

change header

* fix conflicts in pom
2016-12-13 10:05:15 -08:00
Erik Dubbelboer
9f7050e221 Fix some grammar and spelling mistakes (#3717) 2016-11-28 11:49:30 -08:00
Himanshu
7d37f675ba fix the documented property name for specifying avro reader schema (#3708) 2016-11-22 15:02:41 -08:00
Parag Jain
7ee6bb7410 option to reset offset automatically in case of OffsetOutOfRangeException (#3678)
* option to reset offset automatically in case of OffsetOutOfRangeException
if the next offset is less than the earliest available offset for that partition

* review comments

* refactoring

* refactor

* review comments
2016-11-21 16:29:46 -06:00
Erik Dubbelboer
7d36f540e8 WIP: Add Google Storage support (#2458)
Also excludes the correct artifacts from #2741
2016-11-16 14:06:45 +05:30
Keuntae Park
094f5b851b Support Min/Max for Timestamp (#3299)
* Min/Max aggregator for Timestamp

* remove unused imports and method

* rebase and zip the test data

* add docs
2016-11-14 23:00:21 -08:00
Gian Merlino
bcd20441be Make buildV9Directly the default. (#3688) 2016-11-14 09:29:32 -08:00
Mark
575aeb843a Metadata Storage extension for Microsoft SqlServer (sqlserver-metadata-storage) (#3421) 2016-11-08 14:56:52 -08:00
Nicolas Colomer
37ecffb648 Add support for Confluent Schema Registry in the avro extension (#3529) 2016-11-08 16:10:45 -06:00
cheddar
c49a9d5693 Call out semver expectations for modules (#3659)
* Call out semver expectations for modules

* Update modules.md

* Link to versioning
2016-11-04 12:52:05 -07:00
Gian Merlino
4203580290 URIExtractionNamespace: Treat null values in lookup maps as missing entries. (#3512)
* URIExtractionNamespace: Treat null values in lookup maps as missing entries.

This is useful when many logical lookups are derived from the same base JSON file,
and some lookups' values may be unknown sometimes.

* Add test, logging message, and address other comments.

* Update docs.
2016-11-03 13:53:04 -07:00
David Lim
9226d4af3c configurable shutdownTimeout for Kafka supervisor (#3497)
* configurable shutdownTimeout

* cr change
2016-09-23 13:26:45 -06:00
David Lim
ca9114b41b add supervisor reset API (#3484)
* add supervisor reset API

* CR doc changes and kill running tasks / clear offsets from supervisor
2016-09-22 17:51:06 -07:00
Gian Merlino
27bd5cb13a Add forceExtendableShardSpecs option to Hadoop indexing, IndexTask. (#3473)
Fixes #3241.
2016-09-21 13:40:04 -06:00
David Lim
96fcca18ea update KafkaSupervisor to make HTTP requests to tasks in parallel where possible (#3452) 2016-09-20 22:51:15 +05:30
Slim
3175e17a3b Cached lookup module. first cut implementing JDBC cache (#2819) 2016-09-16 13:45:54 -07:00
Gian Merlino
e0e28866ee JavaScript docs: Fix links and typos, add to TOC. (#3457) 2016-09-13 15:26:44 -07:00
Himanshu
a069257d37 avro-extension -- feature to specify multiple avro reader schemas inline (#3368)
* rename SimpleAvroBytesDecoder to InlineSchemaAvroBytesDecoder

* feature to specify multiple schemas inline in avro module
2016-09-13 14:54:31 -07:00
Gian Merlino
76a24054e3 JavaScript docs, including docs for globals. (#3454) 2016-09-13 13:46:55 -07:00
Slim
ba6ddf307e Adding hadoop kerberos authentication. (#3419)
* adding kerberos authentication

* make the 2 functions identical
2016-09-13 10:42:50 -07:00
David Lim
3a97fd4d6c doc fix (#3430) 2016-09-06 13:13:30 -06:00
Stéphane Derosiaux
48dce88aab Add flag binaryAsString for parquet ingestion (#3381) 2016-08-30 17:30:50 -07:00
Dave Li
c4e8440c22 Adds long compression methods (#3148)
* add read

* update deprecated guava calls

* add write and vsizeserde

* add benchmark

* separate encoding and compression

* add header and reformat

* update doc

* address PR comment

* fix buffer order

* generate benchmark files

* separate encoding strategy and format

* fix benchmark

* modify supplier write to channel

* add float NONE handling

* address PR comment

* address PR comment 2
2016-08-30 16:17:46 -07:00
Fangjin Yang
edb0eca3a9 fix docs (#3370) 2016-08-16 16:25:50 -07:00
Fangjin Yang
6beb8ac342 fix some docs and add new content (#3369) 2016-08-16 15:00:18 -07:00
Himanshu
46da682231 avro-extensions -- feature to specify avro reader schema inline in the task json for all events (#3249) 2016-08-10 10:49:26 -07:00
Jonathan Wei
decefb7477 Add time interval dim filter and retention analysis example (#3315)
* Add time interval dim filter and retention analysis example

* Use closed-open matching for intervals, update cache key generation

* Fix time filtering tests for interval boundary change
2016-08-05 07:25:04 -07:00
Navis Ryu
5b3f0ccb1f Support variance and standard deviation (#2525)
* Support variance and standard deviation

* addressed comments
2016-08-04 17:32:58 -07:00
Fangjin Yang
d51ec398d4 fix parquet docs (#3304) 2016-08-01 07:54:48 -07:00
Keuntae Park
95a58097e2 Hadoop InputRowParser for Orc file (#3019)
* InputRowParser to decode OrcStruct from OrcNewInputFormat

* add unit test for orc hadoop indexing

* update docs and fix test code bug

* doc updated

* resolve maven dependency conflict

* remove unused imports

* fix returning array type from Object[] to correct primitive array type

* fix to support getDimension() of MapBasedRow : changing return type of orc list from array to list

* rebase and updated based on comments

* updated based on comments

* on reflecting review comments

* fix bug in typeStringFromParseSpec() and add unit test

* add license header
2016-07-26 09:42:56 -07:00
Gian Merlino
ea03906fcf Configurable compressRunOnSerialization for Roaring bitmaps. (#3228)
Defaults to true, which is a change in behavior (this used to be false and unconfigurable).
2016-07-08 10:24:19 +05:30
Charles Allen
3f1681c16c Caffeine cache extension (#3028)
* Initial commit of caffeine cache

* Address code comments

* Move and fixup README.md a bit

* Improve caffeine readme information

* Cleanup caffeine pom

* Address review comments

* Bump caffeine to 2.3.1

* Bump druid version to 0.9.2-SNAPSHOT

* Make test not fail randomly.

See https://github.com/ben-manes/caffeine/pull/93#issuecomment-227617998 for an explanation

* Fix distribution and documentation

* Add caffeine to extensions.md

* Fix links in extensions.md

* Lexicographic
2016-07-06 15:42:54 -07:00
Charles Allen
8b7d9750ee Update extension docs for global lookup module (#3206) 2016-06-29 12:51:52 -07:00
David Lim
b24425a280 update docs with new behavior (#3200) 2016-06-28 16:17:04 -07:00
Gian Merlino
c12712e8b8 Move "libraries.md" out of docs, onto the main site. (#3159) 2016-06-16 18:14:35 -07:00
michaelschiff
7294ea87c3 link to statsd metrics emitter docs from development/extensions.html doc page (#3125) 2016-06-10 16:27:16 -07:00
Gian Merlino
99ee3f4dc3 Fixups, clarifications to lookup docs. (#3060) 2016-06-07 10:43:35 -07:00
Charles Allen
fa41a6466a Cleanup the base lookup cluster wide config docs (#3061)
* Cleanup the base lookup cluster wide config docs

* Add better examples in lookups-cached-global.md

* Use actual valid stock lookups

* Fixed maps with :

* Add mix of lookups

* Better examples in extension

* Remove unneeded namespace requirement

* Add extra line space

* Add link to lookup tiers

* Renamed header
2016-06-07 10:42:41 -07:00
Charles Allen
8cac710546 Async lookups-cached-global by default (#3074)
* Async lookups-cached-global by default
* Also better lookup docs

* Fix test timeouts

* Fix timing of deserialized test

* Fix problem with 0 wait failing immediately
2016-06-03 15:58:10 -05:00
David Lim
a2290a8f05 support seamless config changes (#3051) 2016-06-03 13:50:19 -07:00
Erik Dubbelboer
b4737336e5 Added info about Google Cloud Storage (#3056) 2016-06-02 10:06:07 -07:00
David Lim
f6c39cc844 Kafka task minimum message time (#3035)
* add KafkaIndexTask support for minimumMessageTime

* add Kafka supervisor support for lateMessageRejectionPeriod
2016-05-31 11:37:00 -07:00
scusjs
ebb6831770 rm , of jobProperties. jackson can not parse it (#3012) 2016-05-26 09:46:33 -07:00
Charles Allen
245077b47f Fix formatting in lookups-cached-global.md (#3009) 2016-05-24 17:28:39 -07:00
Charles Allen
c738c0e1cd Silly Typo in docs 2016-05-24 13:31:58 -07:00
Charles Allen
8024b915e2 [QTL] Implement LookupExtractorFactory of namespaced lookup (#2926)
* support LookupReferencesManager registration of namespaced lookup and eliminate static configurations for lookup from namespaced lookup extensions

- druid-namespace-lookup and druid-kafka-extraction-namespace are modified
- However, druid-namespace-lookup still has configuration for ON/OFF
  HEAP cache manager selection, which is node-wide rather than
  namespace-wide configuration, because multiple namespaces share
  the same cache manager

* update KafkaExtractionNamespaceTest to reflect argument signature changes

* Add more synchronization functionality to NamespaceLookupExtractorFactory

* Remove old way of using extraction namespaces

* resolve compile error by supporting LookupIntrospectHandler

* Remove kafka lookups

* Remove unused stuff

* Fix start and stop behavior to be consistent with new javadocs

* Remove unused strings

* Add timeout option

* Address comments on configurations and improve docs

* Add more options and update hash key and replaces

* Move monitoring to the overriding classes

* Add better start/stop logging

* Remove old docs about namespace names

* Fix bad comma

* Add `@JsonIgnore` to lookup factory

* Address code review comments

* Remove ExtractionNamespace from module json registration

* Fix problems with naming and initialization. Add tests

* Optimize imports / reformat

* Fix future not being properly cancelled on failed initial scheduling

* Fix delete returns

* Add more docs about whole introspection

* Add `/version` introspection point for lookups

* Add more tests and address comments

* Add StaticMap extraction namespace for testing. Also add a bunch of tests

* Move cache system property to `druid.lookup.namespace.cache.type`

* Make VERSION lower case

* Change poll period to 0ms for StaticMap

* Move cache key to bytebuffer

* Change hashCode and equals on static map extraction fn

* Add more comments on StaticMap

* Address comments

* Make scheduleAndWait use a latch

* Sanity renames and fix imports

* Remove extra info in docs

* Fix review comments

* Strengthen failure on start from warn to error

* Address comments

* Rename namespace-lookup to lookups-cached-global

* Fix injective mis-naming
* Also add serde test
2016-05-24 10:56:40 -07:00
Nishant
dea4391a49 fix broken links (#3003) 2016-05-23 06:38:21 -07:00
Fangjin Yang
00de26c76a fix extensions docs (#2995)
* fix extensions docs

* fix mistakes
2016-05-19 14:01:06 -07:00