druid

Commit Graph

Author	SHA1	Message	Date
David Glasser	c08f391605	statsd-emitter: support constant DogStatsD tags (#6791 ) PR #6605 added support to the statsd emitter for DogStatsD tags. This commit lets you specify "constant tags" in the config file which are included with every event. This is helpful if you are running in an environment where you cannot configure your datadog-agent with tags like "cluster name" --- eg, a Kubernetes cluster with a datadog-agent on each node and different Druid deployments in different namespaces but sharing the same datadog-agent daemonset. Also fix the name of an existing boolean getter to start with 'is'.	2019-01-04 15:35:37 +08:00
Mingming Qiu	6761663509	make kafka poll timeout can be configured (#6773 ) * make kafka poll timeout can be configured * add doc * rename DEFAULT_POLL_TIMEOUT to DEFAULT_POLL_TIMEOUT_MILLIS	2019-01-03 12:16:02 +08:00
Joshua Sun	7c7997e8a1	Add Kinesis Indexing Service to core Druid (#6431 ) * created seekablestream classes * created seekablestreamsupervisor class * first attempt to integrate kafa indexing service to use SeekableStream * seekablestream bug fixes * kafkarecordsupplier * integrated kafka indexing service with seekablestream * implemented resume/suspend and refactored some package names * moved kinesis indexing service into core druid extensions * merged some changes from kafka supervisor race condition * integrated kinesis-indexing-service with seekablestream * unite tests for kinesis-indexing-service * various bug fixes for kinesis-indexing-service * refactored kinesisindexingtask * finished up more kinesis unit tests * more bug fixes for kinesis-indexing-service * finsihed refactoring kinesis unit tests * removed KinesisParititons and KafkaPartitions to use SeekableStreamPartitions * kinesis-indexing-service code cleanup and docs * merge #6291 merge #6337 merge #6383 * added more docs and reordered methods * fixd kinesis tests after merging master and added docs in seekablestream * fix various things from pr comment * improve recordsupplier and add unit tests * migrated to aws-java-sdk-kinesis * merge changes from master * fix pom files and forbiddenapi checks * checkpoint JavaType bug fix * fix pom and stuff * disable checkpointing in kinesis * fix kinesis sequence number null in closed shard * merge changes from master * fixes for kinesis tasks * capitalized <partitionType, sequenceType> * removed abstract class loggers * conform to guava api restrictions * add docker for travis other modules test * address comments * improve RecordSupplier to supply records in batch * fix strict compile issue * add test scope for localstack dependency * kinesis indexing task refactoring * comments * github comments * minor fix * removed unneeded readme * fix deserialization bug * fix various bugs * KinesisRecordSupplier unable to catch up to earliest position in stream bug fix * minor changes to kinesis * implement deaggregate for kinesis * Merge remote-tracking branch 'upstream/master' into seekablestream * fix kinesis offset discrepancy with kafka * kinesis record supplier disable getPosition * pr comments * mock for kinesis tests and remove docker dependency for unit tests * PR comments * avg lag in kafkasupervisor #6587 * refacotred SequenceMetadata in taskRunners * small fix * more small fix * recordsupplier resource leak * revert .travis.yml formatting * fix style * kinesis docs * doc part2 * more docs * comments * comments2 revert string replace changes * comments * teamcity * comments part 1 * comments part 2 * comments part 3 * merge #6754 * fix injection binding * comments * KinesisRegion refactor * comments part idk lol * can't think of a commit msg anymore * remove possiblyResetDataSourceMetadata() for IncrementalPublishingTaskRunner * commmmmmmmmmments * extra error handling in KinesisRecordSupplier getRecords * comments * quickfix * typo * oof	2018-12-21 12:49:24 -07:00
Jonathan Wei	c713116a75	Use @Coordinator leader client in CoordinatorRuleManager (#6729 )	2018-12-16 15:18:09 -08:00
Clint Wylie	4ec068642d	move parquet extension input formats up a level to `org.apache.druid.data.input.parquet.DruidParquetInputFormat` for `parquet` and `org.apache.druid.data.input.parquet.DruidParquetAvroInputFormat` for `parquet-avro` (#6727 )	2018-12-13 16:33:42 -08:00
David Lim	f7bbee2e65	Front Matter header needs to be on the first line for md to be rendered properly by jekyll (#6733 )	2018-12-13 11:47:20 -08:00
Vadim Ogievetsky	da4836f38c	Added titles and harmonized docs to improve usability and SEO (#6731 ) * added titles and harmonized docs * manually fixed some titles	2018-12-12 20:42:12 -08:00
David Lim	e2bedab665	fix links to use relative references (#6696 )	2018-11-30 16:32:10 -08:00
David Lim	b332021c49	remove extensions from default configs that have configuration/library dependencies and update docs (#6694 )	2018-11-30 12:52:46 -08:00
Mingming Qiu	849ba867b2	fix missing property in JsonTypeInfo of SegmentWriteOutMediumFactory (#6656 )	2018-11-27 15:59:58 -08:00
Clint Wylie	efdec50847	bloom filter sql (#6502 ) * bloom filter sql support * docs * style fix * style fixes after rebase * use copied/patched bloomkfilter * remove context literal lookup function, changes from review * fix build * rename LookupOperatorConversion to QueryLookupOperatorConversion * remove doc * revert unintended change * add internal exception to bloom filter deserialization exception	2018-11-27 14:11:18 +08:00
Jonathan Wei	e285b1103d	Use PasswordProvider for basic HTTP escalator (#6650 )	2018-11-21 07:34:15 -08:00
Deiwin Sarjas	e0d1dc5846	Support DogStatsD style tags in statsd-emitter (#6605 ) * Replace StatsD client library The [Datadog package][1] is a StatsD compatible drop-in replacement for the client library, but it seems to be [better maintained][2] and has support for Datadog DogStatsD specific features, which will be made use of in a subsequent commit. The `count`, `time`, and `gauge` methods are actually exactly compatible with the previous library and the modifications shouldn't be required, but EasyMock seems to have a hard time dealing with the variable arguments added by the DogStatsD library and causes tests to fail if no arguments are provided for the last String vararg. Passing an empty array fixes the test failures. [1]: https://github.com/DataDog/java-dogstatsd-client [2]: https://github.com/tim-group/java-statsd-client/issues/37#issuecomment-248698856 * Retain dimension key information for StatsD metrics This doesn't change behavior, but allows separating dimensions from the metric name in subsequent commits. There is a possible order change for values from `dimsBuilder.build().values()`, but from the tests it looks like it doesn't affect actual behavior and the order of user dimensions is also retained. * Support DogStatsD style tags in statsd-emitter Datadog [doesn't support name-encoded dimensions and uses a concept of _tags_ instead.][1] This change allows Datadog users to send the metrics without having to encode the various dimensions in the metric names. This enables building graphs and monitors with and without aggregation across various dimensions from the same data. As tests in this commit verify, the behavior remains the same for users who don't enable the `druid.emitter.statsd.dogstatsd` configuration flag. [1]: https://www.datadoghq.com/blog/the-power-of-tagged-metrics/#tags-decouple-collection-and-reporting * Disable convertRange behavior for DogStatsD users DogStatsD, unlike regular StatsD, supports floating-point values, so this behavior is unnecessary. It would be possible to still support `convertRange`, even with `dogstatsd` enabled, but that would mean that people using the default mapping would have some of the gauges unnecessarily converted. `time` is in milliseconds and doesn't support floating-point values.	2018-11-19 09:47:57 -08:00
Gian Merlino	7cd457f41c	Kafka: Add warning to doc for earlyMessageRejectionPeriod. (#6644 )	2018-11-18 15:47:38 -07:00
David Lim	afb239b17a	add missing license headers, in particular to MD files; clean up RAT … (#6563 ) * add missing license headers, in particular to MD files; clean up RAT exclusions * revert inadvertent doc changes * docs * cr changes * fix modified druid-production.svg	2018-11-13 09:38:37 -08:00
Clint Wylie	1224d8b746	overhaul 'druid-parquet-extensions' module, promoting from 'contrib' to 'core' (#6360 ) * move parquet-extensions from contrib to core, adds new hadoop parquet parser that does not convert to avro first and supports flattenSpec and int96 columns, add support for flattenSpec for parquet-avro conversion parser, much test with a bunch of files lifted from spark-sql * fix avro flattener to support nullable primitives for auto discovery and now only supports primitive arrays instead of all arrays * remove leftover print * convert micro timestamp to millis * checkstyle * add ignore for .parquet and .parq to rat exclude * fix legit test failure from avro flattern behavior change * fix rebase * add exclusions to pom to cut down on redundant jars * refactor tests, add support for unwrapping lists for parquet-avro, review comments * more comment * fix oops * tweak parquet-avro list handling * more docs * fix style * grr styles	2018-11-05 21:33:42 -08:00
Caroline1000	26d992840c	correct default tier name (#6568 )	2018-11-01 17:51:13 -07:00
Jihoon Son	a92c2a197b	Move supervisor APIs to api-reference (#6555 ) * Move supervisor APIs to api-reference * fix kafka-specific docs * add ingestion stats report	2018-11-01 13:10:05 -07:00
taiii	b1159174b7	Update mysql.md (#6545 )	2018-10-30 14:01:32 -07:00
Jonathan Wei	b2d9b6f23d	Allow custom TLS cert checks (#6432 ) * Allow custom TLS cert checks * PR comment * Checkstyle, PR comment	2018-10-24 16:31:52 -07:00
David Lim	822e564f54	include mysql-metadata-storage extension in distribution, but without… (#6497 ) * include mysql-metadata-storage extension in distribution, but without the GPL-licensed connector library * Install mysql connector package * use symlinks to avoid versioning issues * add documentation for fetching the mysql connector	2018-10-20 18:18:58 -07:00
Clint Wylie	84598fba3b	combine druid-api, druid-common, java-util into druid-core (#6443 ) * combine druid-api, druid-common, java-util * spacing	2018-10-14 20:37:37 -07:00
Atul Mohan	ab7b4798cc	Securing passwords used for SSL connections to Kafka (#6285 ) * Secure credentials in consumer properties * Merge master * Refactor property population into separate method * Fix property setter * Fix tests	2018-10-11 10:03:01 -07:00
QiuMM	f8f4526b16	Add suspend\|resume\|terminate all supervisors endpoints. (#6272 ) * ability to showdown all supervisors * add doc * address comments * fix code style * address comments * change ternary assignment to if statement * better docs	2018-10-10 21:41:59 -07:00
QiuMM	d559dfecb2	replace deprecated druid.port by druid.plaintextPort in docs (#6427 )	2018-10-09 10:57:01 -07:00
Nishant Bangarwa	c9d281a2e9	Add ability to pass in Bloom filter from Hive Queries (#6222 ) * Bloom filter initial implementation fix checkstyle review comments Fix wierd failure review comments Revert "Fix wierd failure" This reverts commit a13a83ad7887e679f6d539191b52aeaaea85b613. * fix test * review comment	2018-09-26 16:04:26 -07:00
Benedict Jin	e5d9fcfe8f	Add maven.exec.xxx.skip option for exec-maven-plugin (#6162 ) * Fix conflicts * Modify io.druid into org.apache.druid	2018-09-25 10:05:26 -07:00
Jonathan Wei	ee7b565469	Docs for ingestion stat reports and new parse exception handling (#6373 )	2018-09-24 17:45:05 -07:00
Alexander Saydakov	93345064b5	HllSketch module (#5712 ) * HllSketch module * updated license and imports * updated package name * implemented makeAggregateCombiner() * removed json marks * style fix * added module * removed unnecessary import, side effect of package renaming * use TreadLocalRandom * addressing code review points, mostly formatting and comments * javadoc * natural order with nulls * typo * factored out raw input value extraction * singleton * style fix * style fix * use Collections.singletonList instead of Arrays.asList * suppress warning	2018-09-24 08:41:56 -07:00
Jonathan Wei	8972244c68	Mutual TLS support (#6076 ) * Mutual TLS support * Kafka test fixes * TeamCity fix * Split integration tests * Use localhost DOCKER_IP * Increase server thread count * Increase SSL handshake timeouts * Add broken pipe retries, use injected client config params * PR comments, Rat license check exclusion	2018-09-19 09:56:15 -07:00
QiuMM	85391e9fb3	fix opentsdb emitter always be running and fail sending tags whose value contains colon (#6251 ) * fix opentsdb emitter always be running * check if emitter started * add more details about consumeDelay in doc * fix possible thread unsafe * fix fail sending tags whose value contain colon	2018-09-14 12:14:15 -07:00
Jonathan Wei	fd6786ac6c	Fix endpoint permissions section in basic-security docs (#6331 )	2018-09-13 15:23:41 -07:00
Clint Wylie	91a37c692d	'suspend' and 'resume' support for supervisors (kafka indexing service, materialized views) (#6234 ) * 'suspend' and 'resume' support for kafka indexing service changes: * introduces `SuspendableSupervisorSpec` interface to describe supervisors which support suspend/resume functionality controlled through the `SupervisorManager`, which will gracefully shutdown the supervisor and it's tasks, update it's `SupervisorSpec` with either a suspended or running state, and update with the toggled spec. Spec updates are provided by `SuspendableSupervisorSpec.createSuspendedSpec` and `SuspendableSupervisorSpec.createRunningSpec` respectively. * `KafkaSupervisorSpec` extends `SuspendableSupervisorSpec` and now supports suspend/resume functionality. The difference in behavior between 'running' and 'suspended' state is whether the supervisor will attempt to ensure that indexing tasks are or are not running respectively. Behavior is identical otherwise. * `SupervisorResource` now provides `/druid/indexer/v1/supervisor/{id}/suspend` and `/druid/indexer/v1/supervisor/{id}/resume` which are used to suspend/resume suspendable supervisors * Deprecated `/druid/indexer/v1/supervisor/{id}/shutdown` and moved it's functionality to `/druid/indexer/v1/supervisor/{id}/terminate` since 'shutdown' is ambiguous verbage for something that effectively stops a supervisor forever * Added ability to get all supervisor specs from `/druid/indexer/v1/supervisor` by supplying the 'full' query parameter `/druid/indexer/v1/supervisor?full` which will return a list of json objects of the form `{"id":<id>, "spec":<SupervisorSpec>}` * Updated overlord console ui to enable suspend/resume, and changed 'shutdown' to 'terminate' * move overlord console status to own column in supervisor table so does not look like garbage * spacing * padding * other kind of spacing * fix rebase fail * fix more better * all supervisors now suspendable, updated materialized view supervisor to support suspend, more tests * fix log	2018-09-13 14:42:18 -07:00
Clint Wylie	e6e068ce60	Add support for 'maxTotalRows' to incremental publishing kafka indexing task and appenderator based realtime task (#6129 ) * resolves #5898 by adding maxTotalRows to incremental publishing kafka index task and appenderator based realtime indexing task, as available in IndexTask * address review comments * changes due to review * merge fail	2018-09-07 13:17:49 -07:00
Jonathan Wei	60cbc64472	Use PasswordProvider, fix info on initial passwords in basic security extension docs (#6303 ) * Fix info on initial passwords in basic security extension docs * Use PasswordProvider * Compile fix	2018-09-05 17:07:16 -07:00
Jonathan Wei	180e3ccfad	Docs consistency cleanup (#6259 )	2018-09-04 12:54:41 -07:00
Gian Merlino	431d3d8497	Rename io.druid to org.apache.druid. (#6266 ) * Rename io.druid to org.apache.druid. * Fix META-INF files and remove some benchmark results. * MonitorsConfig update for metrics package migration. * Reorder some dimensions in inner queries for some reason. * Fix protobuf tests.	2018-08-30 09:56:26 -07:00
Ryan Plessner	9c500fb69f	Add PostgreSQLConnectorConfig to expose SSL configuration options (#6181 ) * Add PostgreSQLConnectorConfig to expose SSL configuration options for the Postgres Metadata Storage module. * Fix checkstyle violations and add license header * Convert properties in the postgres docs to be the full property path and fix typo * Fix grammar in sslFactory docs	2018-08-21 16:45:27 -07:00
pdeva	c028d18d74	update redis-cache documentation (#6109 ) * update redis-cache documentation added clarifying info on setup and enablement * added link	2018-08-09 13:44:59 -07:00
Jihoon Son	56ab4363ea	Native parallel batch indexing without shuffle (#5492 ) * Native parallel indexing without shuffle * fix build * fix ci * fix ingestion without intervals * fix retry * fix retry * add it test * use chat handler * fix build * add docs * fix ITUnionQueryTest * fix failures * disable metrics reporting * working * Fix split of static-s3 firehose * Add endpoints to supervisor task and a unit test for endpoints * increase timeout in test * Added doc * Address comments * Fix overlapping locks * address comments * Fix static s3 firehose * Fix test * fix build * fix test * fix typo in docs * add missing maxBytesInMemory to doc * address comments * fix race in test * fix test * Rename to ParallelIndexSupervisorTask * fix teamcity * address comments * Fix license * addressing comments * addressing comments * indexTaskClient-based segmentAllocator instead of CountingActionBasedSegmentAllocator * Fix race in TaskMonitor and move HTTP endpoints to supervisorTask from runner * Add more javadocs * use StringUtils.nonStrictFormat for logging * fix typo and remove unused class * fix tests * change package * fix strict build * tmp * Fix overlord api according to the recent change in master * Fix it test	2018-08-06 23:59:42 -07:00
Eyal Yurman	94d6c9a0a5	Remove JDK 7 from build documentation. (#6031 ) See issue #6030	2018-07-26 17:05:07 -07:00
Caroline1000	ee4a5aafb0	add config values for GCS deep storage (#5875 ) * add config values for GCS deep storage * fix config values for GCS deep storage	2018-07-05 09:53:41 -07:00
zhangxinyu	e43e5ebbcd	Materialized view implementation (#5556 ) * implement materialized view * modify code according to jihoonson's comments * modify code according to jihoonson's comments - 2 * add documentation about materialized view * use new HadoopTuningConfig in pr 5583 * add minDataLag and fix optimizer bug * correct value of DEFAULT_MIN_DATA_LAG_MS * modify code according to jihoonson's comments - 3 * use the boolean expression instead of if-else	2018-06-09 12:24:54 -07:00
Siddharth Subramanian	37409dc2f4	Fix minor documentation error (#5851 ) Adding a required `,` in the sample JSON	2018-06-06 12:51:56 -07:00
awelsh93	1a4707f09c	Remove extra slash in endpoint (#5822 )	2018-06-05 13:11:26 -07:00
Alexander Saydakov	d1cdcd4895	Datasketches doc correction (#5816 ) * func was renamed to operation during code review * added missing descriptions, some cleanup	2018-06-05 17:52:37 +05:30
Jihoon Son	67ff7dacbd	Support server-side encryption for s3 (#5740 ) * Support server-side encryption for s3 * fix teamcity * typo * address comments * Refactoring configuration injection * fix doc * fix doc	2018-05-28 20:22:08 -07:00
Dylan Wylie	2c5f0038fd	Make lookup offheap buffer configurable (#5696 ) * Make lookup offheap buffer configurable Fixes #3663 * Address comments * Update docs * Update docs	2018-05-04 10:00:55 -07:00
Surekha	13c616ba24	'maxBytesInMemory' tuningConfig introduced for ingestion tasks (#5583 ) * This commit introduces a new tuning config called 'maxBytesInMemory' for ingestion tasks Currently a config called 'maxRowsInMemory' is present which affects how much memory gets used for indexing.If this value is not optimal for your JVM heap size, it could lead to OutOfMemoryError sometimes. A lower value will lead to frequent persists which might be bad for query performance and a higher value will limit number of persists but require more jvm heap space and could lead to OOM. 'maxBytesInMemory' is an attempt to solve this problem. It limits the total number of bytes kept in memory before persisting. * The default value is 1/3(Runtime.maxMemory()) * To maintain the current behaviour set 'maxBytesInMemory' to -1 * If both 'maxRowsInMemory' and 'maxBytesInMemory' are present, both of them will be respected i.e. the first one to go above threshold will trigger persist * Fix check style and remove a comment * Add overlord unsecured paths to coordinator when using combined service (#5579) * Add overlord unsecured paths to coordinator when using combined service * PR comment * More error reporting and stats for ingestion tasks (#5418) * Add more indexing task status and error reporting * PR comments, add support in AppenderatorDriverRealtimeIndexTask * Use TaskReport instead of metrics/context * Fix tests * Use TaskReport uploads * Refactor fire department metrics retrieval * Refactor input row serde in hadoop task * Refactor hadoop task loader names * Truncate error message in TaskStatus, add errorMsg to task report * PR comments * Allow getDomain to return disjointed intervals (#5570) * Allow getDomain to return disjointed intervals * Indentation issues * Adding feature thetaSketchConstant to do some set operation in PostAgg (#5551) * Adding feature thetaSketchConstant to do some set operation in PostAggregator * Updated review comments for PR #5551 - Adding thetaSketchConstant * Fixed CI build issue * Updated review comments 2 for PR #5551 - Adding thetaSketchConstant * Fix taskDuration docs for KafkaIndexingService (#5572) * With incremental handoff the changed line is no longer true. * Add doc for automatic pendingSegments (#5565) * Add missing doc for automatic pendingSegments * address comments * Fix indexTask to respect forceExtendableShardSpecs (#5509) * Fix indexTask to respect forceExtendableShardSpecs * add comments * Deprecate spark2 profile in pom.xml (#5581) Deprecated due to https://github.com/druid-io/druid/pull/5382 * CompressionUtils: Add support for decompressing xz, bz2, zip. (#5586) Also switch various firehoses to the new method. Fixes #5585. * This commit introduces a new tuning config called 'maxBytesInMemory' for ingestion tasks Currently a config called 'maxRowsInMemory' is present which affects how much memory gets used for indexing.If this value is not optimal for your JVM heap size, it could lead to OutOfMemoryError sometimes. A lower value will lead to frequent persists which might be bad for query performance and a higher value will limit number of persists but require more jvm heap space and could lead to OOM. 'maxBytesInMemory' is an attempt to solve this problem. It limits the total number of bytes kept in memory before persisting. * The default value is 1/3(Runtime.maxMemory()) * To maintain the current behaviour set 'maxBytesInMemory' to -1 * If both 'maxRowsInMemory' and 'maxBytesInMemory' are present, both of them will be respected i.e. the first one to go above threshold will trigger persist * Address code review comments * Fix the coding style according to druid conventions * Add more javadocs * Rename some variables/methods * Other minor issues * Address more code review comments * Some refactoring to put defaults in IndexTaskUtils * Added check for maxBytesInMemory in AppenderatorImpl * Decrement bytes in abandonSegment * Test unit test for multiple sinks in single appenderator * Fix some merge conflicts after rebase * Fix some style checks * Merge conflicts * Fix failing tests Add back check for 0 maxBytesInMemory in OnHeapIncrementalIndex * Address PR comments * Put defaults for maxRows and maxBytes in TuningConfig * Change/add javadocs * Refactoring and renaming some variables/methods * Fix TeamCity inspection warnings * Added maxBytesInMemory config to HadoopTuningConfig * Updated the docs and examples * Added maxBytesInMemory config in docs * Removed references to maxRowsInMemory under tuningConfig in examples * Set maxBytesInMemory to 0 until used Set the maxBytesInMemory to 0 if user does not set it as part of tuningConfing and set to part of max jvm memory when ingestion task starts * Update toString in KafkaSupervisorTuningConfig * Use correct maxBytesInMemory value in AppenderatorImpl * Update DEFAULT_MAX_BYTES_IN_MEMORY to 1/6 max jvm memory Experimenting with various defaults, 1/3 jvm memory causes OOM * Update docs to correct maxBytesInMemory default value * Minor to rename and add comment * Add more details in docs * Address new PR comments * Address PR comments * Fix spelling typo	2018-05-03 16:25:58 -07:00
Jihoon Son	d4311b4a5a	Support enablePathStyleAccess, disableChunkedEncoding, and forceGlobalBucketAccessEnabled for aws client (#5702 ) * Support enablePathStyleAccess and disableChunkedEncoding for aws client * add an option for forceGlobalBucketAccessEnabled * add missing doc	2018-05-02 10:45:38 -07:00
Gian Merlino	0f8493846e	Replace dev list references in docs. (#5723 )	2018-04-30 11:25:45 -07:00
Nishant Bangarwa	b32aad9ab4	Fix some broken links in druid docs (#5622 ) * Fix some broken links in druid docs * review comment	2018-04-11 10:27:33 -07:00
Nishant Bangarwa	80fa5094e8	Fix Kerberos Authentication failing requests without cookies and excludedPaths config. (#5596 ) * Fix Kerberos Authentication failing requests without cookies. KerberosAuthenticator was failing `First` request from the clients. After authentication we were setting the cookie properly but not setting the the authenticated flag in the request. This PR fixed that. Additional Fixes - * Removing of Unused SpnegoFilterConfig - replaced by KerberosAuthenticator * Unused internalClientKeytab and principal from KerberosAuthenticator * Fix docs accordingly and add docs for configuring an escalated client. * Fix excluded path config behavior * spelling correction * Revert "spelling correction" This reverts commit `fb754b43d8`. * Revert "Fix excluded path config behavior" This reverts commit `3901047769`.	2018-04-09 20:45:35 -07:00
Alexander T	ad6f234e1e	Update lookups-cached-global.md (#5525 ) Update lookup creation example to work with version 0.12.0	2018-04-06 16:13:17 -07:00
Dylan Wylie	ddd23a11e6	Fix taskDuration docs for KafkaIndexingService (#5572 ) * With incremental handoff the changed line is no longer true.	2018-04-05 23:52:58 -07:00
Nathan Hartwell	ea30c05355	Adding ParserSpec for Influx Line Protocol (#5440 ) * Adding ParserSpec for Influx Line Protocol * Addressing PR feedback - Remove extraneous TODO - Better handling of parse errors (e.g. invalid timestamp) - Handle sub-millisecond timestamps * Adding documentation for Influx parser * Fixing docs	2018-03-26 14:28:46 -07:00
Gian Merlino	0851f2206c	Expanded documentation for DataSketches aggregators. (#5513 ) Originally written by @AlexanderSaydakov in druid-io/druid-io.github.io#448. I also added redirects and updated links to point to the new datasketches-extension.html landing page for the extension, rather than to the old page about theta sketches.	2018-03-21 18:19:27 -07:00
Jihoon Son	1ad898bde2	Use the official aws-sdk instead of jet3t (#5382 ) * Use the official aws-sdk instead of jet3t * fix compile and serde tests * address comments and fix test * add http version string * remove redundant dependencies, fix potential NPE, and fix test * resolve TODOs * fix build * downgrade jackson version to 2.6.7 * fix test * resolve the last TODO * support proxy and endpoint configurations * fix build * remove debugging log * downgrade hadoop version to 2.8.3 * fix tests * remove unused log * fix it test * revert KerberosAuthenticator change * change hadoop-aws scope to provided in hdfs-storage * address comments * address comments	2018-03-21 15:36:54 -07:00
Christoph Hösler	34f655599d	Let MySQLConnector accept all UTF charsets and recommend utf8mb4 (#5411 ) * Let MySQLConnector accept all UTF charsets and recommend utf8mb4 * Fix regex and remove newline in log statement	2018-03-13 01:16:10 -07:00
Parag Jain	fba13d8978	time based checkpointing for Kafka Indexing Service (#5255 ) * time based checkpointing * add test and fix issue * fix comments * fix formatting * update docs	2018-02-15 20:57:02 -08:00
David Lim	20a3164180	Support for router forwarding requests to active coordinator/overlord (#5369 ) * allow router to forward requests to coordinator and overlord * fix forbidden API * more forbidden api fixes * code review changes	2018-02-15 14:38:58 -08:00
Dan Suzuki	472ba14dfe	Support Map type in ORC extension (#5363 ) * Support map type in orc extension. Added getMapObject in OrcHadoopInputRowParser Updated parse tests to parse map-type field in OrcHadoopInputRowParserTest * changed from for-loop to foreach * added resolution of column names when map types are exploded to several columns. updated the document as well -- orc.md. * Update orc.md change from review	2018-02-15 13:03:15 -08:00
Parag Jain	b9b3be6965	fix segment info in Kafka indexing service docs (#5390 ) * fix segment info in Kafka indexing service docs * review updates	2018-02-15 09:57:30 -08:00
QiuMM	aa7aee53ce	Opentsdb emitter extension (#5380 ) * opentsdb emitter extension * doc for opentsdb emitter extension * update opentsdb emitter doc * add the ms unit to the constant name * add a configurable event limit * fix version to 0.13.0-SNAPSHOT * using a thread to consume metric event * rename method and parameter	2018-02-13 13:10:22 -08:00
Gian Merlino	ed47a1e1a9	Lookups: Inherit "injective" from registered lookups, improve docs. (#5316 ) Code changes: - In the lookup-based extractionFns, inherit injective property from the lookup itself if not specified. Doc changes: - Add a "Query execution" section to the lookups doc explaining how injective lookups and their optimizations work. - Remove scary warnings against using registeredLookup extractionFns. They are necessary and important since they work with filters and function cascades -- two things that the dimension specs do not do. They deserve to be first class citizens. - Move the "registeredLookup" fn above the "lookup" fn. It's probably more commonly used, so the docs read better this way.	2018-02-01 18:30:19 -08:00
Jonathan Wei	80419752b5	Add metamx emitter, http clients, and metrics packages to druid java-util (#5289 ) * Add metamx java-util emitter, http clients, and metrics packages to druid java-util * Remove metamx java-util from pom.xml files * Checkstyle fixes * Import fix * TeamCity inspection fixes * Use slf4j, move some version defs to master pom.xml * Use parent jvm-attach-api and maven-surefire-plugin versions * Add ] to log msg, suppress inspection	2018-01-24 22:10:36 +01:00
Fokko Driesprong	cc32640642	Update the example of the dimensionsSpec (#5293 ) The example was outdated with the dateSpec	2018-01-24 11:28:54 -08:00
Jihoon Son	241efafbb2	Automatic compaction by coordinators (#5102 ) * Automatic compaction by coordinator * add links * skip compaction for very recent segments if they are small * fix finding search interval * fix finding search interval * fix TimelineHolder iteration * add test for newestSegmentFirstPolicy * add CompactionSegmentIterator * add numTargetCompactionSegments * add missing config * fix skipping huge shards * fix handling large number of segments per shard * fix test failure * change recursive call to loop * fix logging * fix build * fix test failure * address comments * change dataSources type * check running pendingTasks at each run * fix test * address comments * fix build * fix test * address comments * address comments * add doc for segment size optimization * address comment	2018-01-13 13:52:37 +09:00
Atul Mohan	3cc4a0ab19	Support for encryption of MySQL connections (#5122 ) * Encrypting MySQL connections * Update docs * Make verifyServerCertificate a configurable parameter * Change password parameter and doc update * Make server certificate verification disabled by default * Update tostring * Update docs * Add check for trust store passwords * Add warning for null password	2018-01-10 11:33:54 -08:00
Jonathan Wei	02544f9197	Add missing auth doc links (#5224 )	2018-01-05 16:23:13 -06:00
Nishant Bangarwa	494e0b79ed	Allow configuring header size for druid requests (#5174 ) * Allow configuring header size for druid requests * fix configuration name in doc. * add more info to docs. * Add info to kerberos doc.	2017-12-20 18:51:40 -08:00
Jonathan Wei	f48c9d7be1	Basic auth extension (#5099 ) * Basic auth extension * Add auth configuration integration test * Fix missing authorizerName property * PR comments * Fix missing @JsonProperty annotation * PR comments * more PR comments	2017-12-14 10:36:04 -08:00
Roman Leventov	a7a6a0487e	Replace IOPeon with SegmentWriteOutMedium; Improve buffer compression (#4762 ) * Replace IOPeon with OutputMedium; Improve compression * Fix test * Cleanup CompressionStrategy * Javadocs * Add OutputBytesTest * Address comments * Random access in OutputBytes and GenericIndexedWriter * Fix bugs * Fixes * Test OutputBytes.readFully() * Address comments * Rename OutputMedium to SegmentWriteOutMedium and OutputBytes to WriteOutBytes * Add comments to ByteBufferInputStream * Remove unused declarations	2017-12-04 18:04:27 -08:00
chaoqiang	50140ce820	StatsD Emitter Doc on blankHolder (#5101 ) * fix equalDistribution worker select strategy * replace anonymous Comparator * keep previous version sorting comment * fix code style * update comment * move JsonProperty * fix statsD emitter with blank character * Add blankHolder doc On statsD monitor	2017-11-18 12:00:47 -08:00
Parag Jain	cb03efeb14	Kafka Index Task that supports Incremental handoffs (#4815 ) * Kafka Index Task that supports Incremental handoffs - Incrementally handoff segments when they hit maxRowsPerSegment limit - Decouple segment partitioning from Kafka partitioning, all records from consumed partitions go to a single druid segment - Support for restoring task on middle manager restarts by check pointing end offsets for segments * take care of review comments * make getCurrentOffsets call async, keep track of publishing sequence, review comments * fix setEndoffset duplicate request handling, formatting * fix unit test * backward compatibility * make AppenderatorDriverMetadata backwards compatible * add unit test * fix deadlock between persist and push executors in AppenderatorImpl * fix formatting * use persist dir instead of work dir * review comments * fix deadlock * actually fix deadlock	2017-11-17 16:05:20 -06:00
Jonathan Wei	6840eabd87	Add Router connection balancers for Avatica queries (#4983 ) * Add Router connection balancers for Avatica queries * PR comments * Adjust test bounds * PR comments * Add doc comments * PR comments * PR comment * Checkstyle fix	2017-11-01 14:01:13 -07:00
elloooooo	52a162e302	define earlyMessegeRejectPeriod as the period after the taskduration (#4990 )	2017-10-27 01:13:46 +05:30
chunghochen	0614b92df1	adding new post aggregators for test statistics to druid-stats extension (#4532 ) * adding new post aggregators of test stats to druid-stats extension * changes to address code review comments * fix checkstyle violations using druid_intellij_formatting.xml after merge upstream/master * add @Override annotation per CI log * make changes per review comments/discussions * remove some blocks per review comments	2017-10-09 23:43:27 -07:00
Gian Merlino	bf8fd4c203	Add flattenSpec support to the Avro parser. (#4832 ) * Add flattenSpec support to the Avro parser. Also: - Refactor the JSONPathParser a bit so it can share flattening code with Avro (see ObjectFlatteners). - Remove the JSONParser. It was only used in two places: by UriNamespaceExtractor, and as a base for JSONToLowerParser. Migrated the former to JSONPathParser and made the latter a standalone. - Move GenericRecordAsMap to the Parquet extension, since the Avro extension no longer uses it. * Fix indentation. * Fix equals/hashCode.	2017-09-26 09:26:06 -07:00
Roman Leventov	b56a907145	Add namespace extraction thread config (#4833 )	2017-09-25 09:52:36 -07:00
Charles Allen	a6470c1d03	Move caffeine out of extension and make it the default cache implementation. (#4810 ) * Move caffeine out of extension. * Remove `JsonTypeName` from the class itself * Fix bad docs * Fix distribution pom * Fix unused import * Make caffeine default * Address code comments * Add more description around the jre version in the readme * Add suggested comments	2017-09-22 10:46:55 -07:00
Jonathan Wei	09fcb75583	Add RequestLogEvent emitters config to graphite-emitter (#4678 ) * Add RequestLogEvent emitters config to graphite-emitter * eagerly compute emitter list * use lambdas * checkstyle	2017-09-22 06:14:32 -07:00
Jonathan Wei	164c73f2b2	Fix kerberos authenticator docs (#4822 )	2017-09-19 14:32:22 -05:00
Jonathan Wei	c2a0e753b6	Extension points for authentication/authorization (#4271 ) * Extension points for authentication/authorization * Address some PR comments * Authorization result caching * Add unit tests for SecuritySanityCheckFilter and PreResponseAuthorizationCheckFilter * Use Set for auth caching, close outputstreams in filters * Don't close output stream on success in sanity check filter * Add ConfigResourceFilter to coordinator lookups * Fix filtering authorization check for empty resource list * HttpClient users must explicitly escalate the client * Remove response modification from PreResponseAuthorizationCheckFilter * Remove extraneous pom.xml * Fix unit test * Better lifecycle management * Rename AuthorizationManager to Authorizer * Fix authorization denials for empty supervisor list * Address some PR comments * Address more PR comments * Small cleanup * Add Jetty HttpClient wrapper to Authenticator * Remove Authorizer start/stop * Restore immutable context map in DruidConnection, UT fix * Fix/update docs * Add authorization checks to EventReceiverFirehose * Fix router authorization check failure, restore PreResponseAuthorizationFilter changes * Compile fixes * Test fixes * Update Authenticator/Authorizer doc comments * Merge fixes * PR comments * Fix test * Fix IT * More PR comments * PR comments * SSL fix	2017-09-15 23:45:48 -07:00
Gian Merlino	2ce8123bdb	Move scan-query from a contrib extension into core. (#4751 ) * Move scan-query from a contrib extension into core. Based on a proposal at: https://groups.google.com/d/topic/druid-development/ME_OatUDnbk/discussion This patch also adds support for virtual columns to the Scan query, and updates Druid SQL to use Scan instead of Select. This patch also makes some behavioral changes to handling of the __time column. In particular, it is now is returned as "__time" rather than "timestamp"; it is no longer included if you do not specifically ask for it in your "columns"; and it is returned as a long rather than a string. Users can revert time handling to the legacy extension behavior by setting "legacy" : true in their queries, or setting the property druid.query.scan.legacy = true. This is meant to provide a migration path for users that were formerly using the contrib extension. * Adjustments from review. * Add back Select query. * Adjust SQL docs. * Restore SelectQuery link.	2017-09-13 09:51:24 -07:00
Bartosz Ługowski	8dddccc687	Graphite emitter - add plaintext protocol (#4265 ) * Graphite emitter - add plaintext protocol. Configurable option of replacing slash to dot in metric name. * Graphite emitter - fix misspelling in docs. * Graphite emitter - extend docs. * Graphite emitter - fix code style.	2017-08-29 06:23:06 -07:00
Gian Merlino	9fbfc1be32	Add @ExtensionPoint and @PublicApi annotations. (#4433 ) * Add @ExtensionPoint and @PublicApi annotations. * Clean up wording. * Remove unused import. * Remove unused imports. * Only types can be extension points. * Adjust annotations some more. * Remove unused import. * Make ServletFilterHolder an extension point. * Add a couple extension points, and update docs.	2017-08-28 14:50:58 -07:00
QiuMM	59a48a560a	Redis cache extension doc (#4702 ) * Redis cache extension doc * link redis cache doc in extensions.md	2017-08-24 09:53:51 -05:00
Yuewen Wang	c821bc9a5a	Implement "earlyMessageRejectionPeriod" config discussed in issue #4599 (#4607 ) * Implement "earlyMessageRejectionPeriod" config discussed in issue #4599 * implement the logics of this param * Added doc of this config * Added unit tests of it * Update KafkaSupervisor.java ameliorate comment * fix format * fix bug when rebasing	2017-08-11 09:12:08 +09:00
Peter Cunningham	ede7cf9eef	Added support for where clauses to JDBC lookups. (#4643 ) * Added support for where clauses to filter lookup values on ingestion. Added a filter field to the JDBC lookups that is used to generate a where clause so that only rows matching the filter value will be brought into Druid. Example being filter="SOMECOLUMN=1" * Required changes based on code review. * Required changes based on code review. * Added support for where clauses to filter lookup values on ingestion. Added a filter field to the JDBC lookups that is used to generate a where clause so that only rows matching the filter value will be brought into Druid. Example being filter="SOMECOLUMN=1" * Updates based on code review, mainly formatting and small refactor of the buildLookupQuery method. * Fixed broken buildLookupQuery method * Removed empty line. * Updates per review comments	2017-08-09 10:47:46 -07:00
Parag Jain	6e2f78f552	TLS support (#4270 )	2017-07-06 17:40:12 -07:00
Roman Leventov	2fa4b10145	More fine-grained DI for management node types. Don't allocate processing resources on Router (#4429 ) * Remove DruidProcessingModule, QueryableModule and QueryRunnerFactoryModule from DI for coordinator, overlord, middle-manager. Add RouterDruidProcessing not to allocate processing resources on router * Fix examples * Fixes * Revert Peon configs and add comments * Remove qualifier	2017-06-27 22:58:01 -07:00
Roman Leventov	05d58689ad	Remove the ability to create segments in v8 format (#4420 ) * Remove ability to create segments in v8 format * Fix IndexGeneratorJobTest * Fix parameterized test name in IndexMergerTest * Remove extra legacy merging stuff * Remove legacy serializer builders * Remove ConciseBitmapIndexMergerTest and RoaringBitmapIndexMergerTest	2017-06-26 13:21:39 -07:00
Fokko Driesprong	ff501e8f13	Add Date support to the parquet reader (#4423 ) * Add Date support to the parquet reader Add support for the Date logical type. Currently this is not supported. Since the parquet date is number of days since epoch gets interpreted as seconds since epoch, it will fails on indexing the data because it will not map to the appriopriate bucket. * Cleaned up code and tests Got rid of unused json files in the examples, cleaned up the tests by using try-with-resources. Now get the filenames from the json file instead of hard coding them and integrated general improvements from the feedback provided by leventov. * Got rid of the caching Remove the caching of the logical type of the time dimension column and cleaned up the code a bit.	2017-06-22 15:56:08 -05:00
Yuya Fujiwara	152d4e89ab	Fix typo in the avro.md. (#4370 )	2017-06-06 07:14:08 -07:00
David Lim	13ecf90923	Report Kafka lag information in supervisor status report (#4314 ) * refactor lag reporting and report lag at status endpoint * refactor offset reporting logic to fetch offsets periodically vs. at request time * remove JavaCompatUtils * code review changes * code review changes	2017-06-05 13:26:25 -07:00
Jihoon Son	1150bf7a2c	Refactoring Appenderator Driver (#4292 ) * Refactoring Appenderator 1) Added publishExecutor and handoffExecutor for background publishing and handing segments off 2) Change add() to not move segments out in it * Address comments 1) Remove publishTimeout for KafkaIndexTask 2) Simplifying registerHandoff() 3) Add increamental handoff test * Remove unused variable * Add persist() to Appenderator and more tests for AppenderatorDriver * Remove unused imports * Fix strict build * Address comments	2017-06-02 07:09:11 +09:00
Kenji Noguchi	3400f601db	Protobuf extension (#4039 ) * move ProtoBufInputRowParser from processing module to protobuf extensions * Ported PR #3509 * add DynamicMessage * fix local test stuff that slipped in * add license header * removed redundant type name * removed commented code * fix code style * rename ProtoBuf -> Protobuf * pom.xml: shade protobuf classes, handle .desc resource file as binary file * clean up error messages * pick first message type from descriptor if not specified * fix protoMessageType null check. add test case * move protobuf-extension from contrib to core * document: add new configuration keys, and descriptions * update document. add examples * move protobuf-extension from contrib to core (2nd try) * touch * include protobuf extensions in the distribution * fix whitespace * include protobuf example in the distribution * example: create new pb obj everytime * document: use properly quoted json * fix whitespace * bump parent version to 0.10.1-SNAPSHOT * ignore Override check * touch	2017-05-30 13:11:58 -07:00
Jihoon Son	11b7b1bea6	Add support for HttpFirehose (#4297 ) * Add support for HttpFirehose * Fix document * Add documents	2017-05-25 16:13:04 -05:00
Jihoon Son	733dfc9b30	Add PrefetchableTextFilesFirehoseFactory for cloud storage types (#4193 ) * Add PrefetcheableTextFilesFirehoseFactory * fix comment * exception handling * Fix wrong json property * Remove ReplayableFirehoseFactory and fix misspelling * Defer object initialization * Add a temporaryDirectory parameter to FirehoseFactory.connect() * fix when cache and fetch are disabled * Address comments * Add more test * Increase timeout for test * Add wrapObjectStream * Move methods to Firehose from PrefetchableFirehoseFactory * Cleanup comment * add directory listing to s3 firehose * Rename a variable * Addressing comments * Update document * Support disabling prefetch * Fix race condition * Add fetchLock * Remove ReplayableFirehoseFactoryTest * Fix compilation error * Fix test failure * Address comments * Add default implementation for new method	2017-05-18 15:37:18 +09:00

288 Commits