Commit Graph

153 Commits

Author SHA1 Message Date
Bartosz Ługowski 8dddccc687 Graphite emitter - add plaintext protocol ()
* Graphite emitter - add plaintext protocol. Configurable option of replacing slash to dot in metric name.

* Graphite emitter - fix misspelling in docs.

* Graphite emitter - extend docs.

* Graphite emitter - fix code style.
2017-08-29 06:23:06 -07:00
Gian Merlino 9fbfc1be32 Add @ExtensionPoint and @PublicApi annotations. ()
* Add @ExtensionPoint and @PublicApi annotations.

* Clean up wording.

* Remove unused import.

* Remove unused imports.

* Only types can be extension points.

* Adjust annotations some more.

* Remove unused import.

* Make ServletFilterHolder an extension point.

* Add a couple extension points, and update docs.
2017-08-28 14:50:58 -07:00
QiuMM 59a48a560a Redis cache extension doc ()
* Redis cache extension doc

* link redis cache doc in extensions.md
2017-08-24 09:53:51 -05:00
Yuewen Wang c821bc9a5a Implement "earlyMessageRejectionPeriod" config discussed in issue ()
* Implement "earlyMessageRejectionPeriod" config discussed in issue 
    * implement the logics of this param
    * Added doc of this config
    * Added unit tests of it

* Update KafkaSupervisor.java

ameliorate comment

* fix format

* fix bug when rebasing
2017-08-11 09:12:08 +09:00
Peter Cunningham ede7cf9eef Added support for where clauses to JDBC lookups. ()
* Added support for where clauses to filter lookup values on ingestion.

Added a filter field to the JDBC lookups that is used to generate a
where clause so that only rows matching the filter value will be
brought into Druid. Example being filter="SOMECOLUMN=1"

* Required changes based on code review.

* Required changes based on code review.

* Added support for where clauses to filter lookup values on ingestion.

Added a filter field to the JDBC lookups that is used to generate a
where clause so that only rows matching the filter value will be
brought into Druid. Example being filter="SOMECOLUMN=1"

* Updates based on code review, mainly formatting and small refactor of
the buildLookupQuery method.

* Fixed broken buildLookupQuery method

* Removed empty line.

* Updates per review comments
2017-08-09 10:47:46 -07:00
Parag Jain 6e2f78f552 TLS support () 2017-07-06 17:40:12 -07:00
Roman Leventov 2fa4b10145 More fine-grained DI for management node types. Don't allocate processing resources on Router ()
* Remove DruidProcessingModule, QueryableModule and QueryRunnerFactoryModule from DI for coordinator, overlord, middle-manager. Add RouterDruidProcessing not to allocate processing resources on router

* Fix examples

* Fixes

* Revert Peon configs and add comments

* Remove qualifier
2017-06-27 22:58:01 -07:00
Roman Leventov 05d58689ad Remove the ability to create segments in v8 format ()
* Remove ability to create segments in v8 format

* Fix IndexGeneratorJobTest

* Fix parameterized test name in IndexMergerTest

* Remove extra legacy merging stuff

* Remove legacy serializer builders

* Remove ConciseBitmapIndexMergerTest and RoaringBitmapIndexMergerTest
2017-06-26 13:21:39 -07:00
Fokko Driesprong ff501e8f13 Add Date support to the parquet reader ()
* Add Date support to the parquet reader

Add support for the Date logical type. Currently this is not supported. Since the parquet
date is number of days since epoch gets interpreted as seconds since epoch, it will fails
on indexing the data because it will not map to the appriopriate bucket.

* Cleaned up code and tests

Got rid of unused json files in the examples, cleaned up the tests by
using try-with-resources. Now get the filenames from the json file
instead of hard coding them and integrated general improvements from
the feedback provided by leventov.

* Got rid of the caching

Remove the caching of the logical type of the time dimension column
and cleaned up the code a bit.
2017-06-22 15:56:08 -05:00
Yuya Fujiwara 152d4e89ab Fix typo in the avro.md. () 2017-06-06 07:14:08 -07:00
David Lim 13ecf90923 Report Kafka lag information in supervisor status report ()
* refactor lag reporting and report lag at status endpoint

* refactor offset reporting logic to fetch offsets periodically vs. at request time

* remove JavaCompatUtils

* code review changes

* code review changes
2017-06-05 13:26:25 -07:00
Jihoon Son 1150bf7a2c Refactoring Appenderator Driver ()
* Refactoring Appenderator

1) Added publishExecutor and handoffExecutor for background publishing and handing segments off
2) Change add() to not move segments out in it

* Address comments

1) Remove publishTimeout for KafkaIndexTask
2) Simplifying registerHandoff()
3) Add increamental handoff test

* Remove unused variable

* Add persist() to Appenderator and more tests for AppenderatorDriver

* Remove unused imports

* Fix strict build

* Address comments
2017-06-02 07:09:11 +09:00
Kenji Noguchi 3400f601db Protobuf extension ()
* move ProtoBufInputRowParser from processing module to protobuf extensions

* Ported PR 

* add DynamicMessage

* fix local test stuff that slipped in

* add license header

* removed redundant type name

* removed commented code

* fix code style

* rename ProtoBuf -> Protobuf

* pom.xml: shade protobuf classes, handle .desc resource file as binary file

* clean up error messages

* pick first message type from descriptor if not specified

* fix protoMessageType null check. add test case

* move protobuf-extension from contrib to core

* document: add new configuration keys, and descriptions

* update document. add examples

* move protobuf-extension from contrib to core (2nd try)

* touch

* include protobuf extensions in the distribution

* fix whitespace

* include protobuf example in the distribution

* example: create new pb obj everytime

* document: use properly quoted json

* fix whitespace

* bump parent version to 0.10.1-SNAPSHOT

* ignore Override check

* touch
2017-05-30 13:11:58 -07:00
Jihoon Son 11b7b1bea6 Add support for HttpFirehose ()
* Add support for HttpFirehose

* Fix document

* Add documents
2017-05-25 16:13:04 -05:00
Jihoon Son 733dfc9b30 Add PrefetchableTextFilesFirehoseFactory for cloud storage types ()
* Add PrefetcheableTextFilesFirehoseFactory

* fix comment

* exception handling

* Fix wrong json property

* Remove ReplayableFirehoseFactory and fix misspelling

* Defer object initialization

* Add a temporaryDirectory parameter to FirehoseFactory.connect()

* fix when cache and fetch are disabled

* Address comments

* Add more test

* Increase timeout for test

* Add wrapObjectStream

* Move methods to Firehose from PrefetchableFirehoseFactory

* Cleanup comment

* add directory listing to s3 firehose

* Rename a variable

* Addressing comments

* Update document

* Support disabling prefetch

* Fix race condition

* Add fetchLock

* Remove ReplayableFirehoseFactoryTest

* Fix compilation error

* Fix test failure

* Address comments

* Add default implementation for new method
2017-05-18 15:37:18 +09:00
David Lim 8333043b7b add skipOffsetGaps flag () 2017-05-16 12:19:28 -06:00
Jihoon Son 50a4ec2b0b Add support for headers and skipping thereof for CSV and TSV ()
* initial commit

* small fixes

* fix bug

* fix bug

* address code review

* more cr

* more cr

* more cr

* fix

* Skip head rows for CSV and TSV

* Move checking skipHeadRows to FileIteratingFirehose

* Remove checking null iterators

* Remove unused imports

* Address comments

* Fix compilation error

* Address comments

* Add more tests

* Add a comment to ReplayableFirehose

* Addressing comments

* Add docs and fix typos
2017-05-15 22:57:31 -07:00
hzy001 0c464f4a84 Fix docs ()
* Fix one typo

Signed-off-by: Hao Ziyu <haoziyu@qiyi.com>

* Fix deprecated links

Signed-off-by: Hao Ziyu <haoziyu@qiyi.com>
2017-05-01 09:55:43 -07:00
Gian Merlino 631068b099 Fix broken DataSketches link. ()
* Fix broken DataSketches link.

* Better fixed link.
2017-04-27 17:37:12 -07:00
satishbhor d51097c809 Fix lz4 library incompatibility in kafka-indexing-service extension ()
* Fix lz4 library incompatibility in kafka-indexing-service extension 

* Bumped Kafka version to 0.10.2.0 for : Fix lz4 library incompatibility in kafka-indexing-service extension 

* Replaced Lists.newArrayList() with Collections.singletonList() For Fix lz4 library incompatibility in kafka-indexing-service extension 
2017-04-25 12:23:51 +09:00
Dongkyu Hwangbo 0d2e91ed50 Adding Kafka-emitter ()
* Initial commit

* Apply another config: clustername

* Rename variable

* Fix bug

* Add retry logic

* Edit retry logic

* Upgrade kafka-clients version to the most recent release

* Make callback single object

* Write documentation

* Rewrite error message and emit logic

* Handling AlertEvent

* Override toString()

* make clusterName more optional

* bump up druid version

* add producer.config option which make user can apply another optional config value of kafka producer

* remove potential blocking in emit()

* using MemoryBoundLinkedBlockingQueue

* Fixing coding convention

* Remove logging every exception and just increment counting

* refactoring

* trivial modification

* logging when callback has exception

* Replace kafka-clients 0.10.1.1 with 0.10.2.0

* Resolve the problem related of classloader

* adopt try statement

* code reformatting

* make variables final

* rewrite toString
2017-04-04 14:07:43 -07:00
Fokko Driesprong add17fa7db Remove the metadataUpdateSpec from specfile ()
Get rid of the metadataUpdateSpec section in the json example to
ingest parquet into druid. When this element is present, it will
fail start an indexing job.
2017-03-01 14:24:36 -08:00
Akash Dwivedi 94da5e80f9 Namespace optimization for hdfs data segments. ()
* NN optimization for hdfs data segments.

* HdfsDataSegmentKiller, HdfsDataSegment finder changes to use new storage
format.Docs update.

* Common utility function in DataSegmentPusherUtil.

* new static method `makeSegmentOutputPathUptoVersionForHdfs` in JobHelper

* reuse getHdfsStorageDirUptoVersion in
DataSegmentPusherUtil.getHdfsStorageDir()

* Addressed comments.

* Review comments.

* HdfsDataSegmentKiller requested changes.

* extra newline

* Add maprfs.
2017-03-01 09:51:20 -08:00
michaelschiff e5fb0e1ff5 New property for each metric that tells the StatsDEmitter to convert metric values from range 0-1 to 0-100. This ()
prevents rates and percentages expressed as Doubles (0.xx) from being rounded down to 0.
2017-02-16 13:55:56 -08:00
Himanshu 9dfcf0763a disable javascript execution by default () 2017-02-13 15:11:18 -08:00
Erik Dubbelboer 2aa2fa57b5 Simple doc fix () 2017-02-06 15:52:17 +05:30
Nishant Bangarwa a457cded28 Druid Extension to enable Authentication using Kerberos. ()
* Add extension for supporting kerberos security

- This PR adds an extension for supporting druid authentication via
Kerberos.
- Working on the docs.

* Add docs

* review comments

* more review comments

* Block all paths by default

* more review comments - use proper Oid

* Allow extensions to override httpclient for integration tests

* Add kerberos lock to prevent multithreaded issues.

* review comment - remove enabled flag and fix router injection

* Add Cookie Handling and more detailed docs

* review comment - rename DruidKerberosConfig -> AuthKerberosConfig

* review comments

* fix travis failure on jdk7
2017-02-02 14:55:21 -06:00
kaijianding 33ae9dd485 streaming version of select query ()
* streaming version of select query

* use columns instead of dimensions and metrics;prepare for valueVector;remove granularity

* respect query limit within historical

* use constant

* fix thread name corrupted bug when using jetty qtp thread rather than processing thread while working with SpecificSegmentQueryRunner

* add some test for scan query

* add scan query document

* fix merge conflicts

* add compactedList resultFormat, this format is better for json ser/der

* respect query timeout

* respect query limit on broker

* use static consts and remove unused code
2017-01-19 16:09:53 -06:00
Nishant f576a0ff14 Contrib Extension for Ambari Metrics Emitter ()
* Contrib Extension for Ambari Metrics Emitter

extension to enable druid to send metrics to ambari metrics server
(https://cwiki.apache.org/confluence/display/AMBARI/Metrics)

review comments

switch to public repo

* review comments

* add docs

* fix pom version

* Add link for doc page in extensions.md

* remove unused imports

* review comments

review comments

remove unused dependency

review comment
2016-12-19 11:12:47 -08:00
David Lim 8eee259629 add documentation on segments generated () 2016-12-19 09:41:47 -08:00
Ninglin Du 469ab21091 [Feature] Thrift support for realtime and batch ingestion ()
* Thrift ingestion plugin

1. thrift binary is platform dependent, use scrooge to generate java files to avoid style check failure
2. stream and hadoop ingesion are both supported, input format can be sequence file and lzo thrift block file.
3. base64 and protocol aware

change header

* fix conlicts in pom
2016-12-13 10:05:15 -08:00
Erik Dubbelboer 9f7050e221 Fix some grammar and spelling mistakes () 2016-11-28 11:49:30 -08:00
Himanshu 7d37f675ba fix the documented property name for specifying avro reader schema () 2016-11-22 15:02:41 -08:00
Parag Jain 7ee6bb7410 option to reset offest automatically in case of OffsetOutOfRangeException ()
* option to reset offset automatically in case of OffsetOutOfRangeException
if the next offset is less than the earliest available offset for that partition

* review comments

* refactoring

* refactor

* review comments
2016-11-21 16:29:46 -06:00
Erik Dubbelboer 7d36f540e8 WIP: Add Google Storage support ()
Also excludes the correct artifacts from 
2016-11-16 14:06:45 +05:30
Keuntae Park 094f5b851b Support Min/Max for Timestamp ()
* Min/Max aggregator for Timestamp

* remove unused imports and method

* rebase and zip the test data

* add docs
2016-11-14 23:00:21 -08:00
Gian Merlino bcd20441be Make buildV9Directly the default. () 2016-11-14 09:29:32 -08:00
Mark 575aeb843a Metadata Storage extension for Microsoft SqlServer (sqlserver-metadata-storage) () 2016-11-08 14:56:52 -08:00
Nicolas Colomer 37ecffb648 Add support for Confluent Schema Registry in the avro extension () 2016-11-08 16:10:45 -06:00
cheddar c49a9d5693 Call out semver expectations for modules ()
* Call out semver expectations for modules

* Update modules.md

* Link to versioning
2016-11-04 12:52:05 -07:00
Gian Merlino 4203580290 URIExtractionNamespace: Treat null values in lookup maps as missing entries. ()
* URIExtractionNamespace: Treat null values in lookup maps as missing entries.

This is useful when many logical lookups are derived from the same base JSON file,
and some lookups' values may be unknown sometimes.

* Add test, logging message, and address other comments.

* Update docs.
2016-11-03 13:53:04 -07:00
David Lim 9226d4af3c configurable shutdownTimeout for Kakfa supervisor ()
* configurable shutdownTimeout

* cr change
2016-09-23 13:26:45 -06:00
David Lim ca9114b41b add supervisor reset API ()
* add supervisor reset API

* CR doc changes and kill running tasks / clear offsets from supervisor
2016-09-22 17:51:06 -07:00
Gian Merlino 27bd5cb13a Add forceExtendableShardSpecs option to Hadoop indexing, IndexTask. ()
Fixes .
2016-09-21 13:40:04 -06:00
David Lim 96fcca18ea update KafkaSupervisor to make HTTP requests to tasks in parallel where possible () 2016-09-20 22:51:15 +05:30
Slim 3175e17a3b Cached lookup module. first cut implementing JDBC cache () 2016-09-16 13:45:54 -07:00
Gian Merlino e0e28866ee JavaScript docs: Fix links and typos, add to TOC. () 2016-09-13 15:26:44 -07:00
Himanshu a069257d37 avro-extension -- feature to specify multiple avro reader schemas inline ()
* rename SimpleAvroBytesDecoder to InlineSchemaAvroBytesDecoder

* feature to specify multiple schemas inline in avro module
2016-09-13 14:54:31 -07:00
Gian Merlino 76a24054e3 JavaScript docs, including docs for globals. () 2016-09-13 13:46:55 -07:00
Slim ba6ddf307e Adding hadoop kerberos authentification. ()
* adding kerberos authentication

* make the 2 functions identical
2016-09-13 10:42:50 -07:00