druid

Commit Graph

Author	SHA1	Message	Date
Jihoon Son	8e3a58f723	Improve druid.storage.sse.kms.keyId and druid.s3.protocol (#7012 ) * Improve druid.storage.sse.kms.keyId and druid.s3.protocol * fix article	2019-02-06 15:00:51 -08:00
Jihoon Son	75c70c2ccc	Add doc for S3 permissions settings (#7011 ) * Add doc for S3 permissions settings * add a comment about additional settings	2019-02-05 11:52:09 -08:00
Clint Wylie	7a5827e12e	bloom filter sql aggregator (#6950 ) * adds sql aggregator for bloom filter, adds complex value serde for sql results * fix tests * checkstyle * fix copy-paste	2019-02-01 13:54:46 -08:00
Gian Merlino	54735a5ad1	Kafka indexing: Remove experimental notice. (#6970 )	2019-01-31 09:54:22 -08:00
Jonathan Wei	82137874ea	Add master/data/query server concepts to docs/packaging (#6916 ) * Add master/data/query server concepts to docs/packaging * PR comments * TOC and markdown fix * Update image legend * PR comment * More PR comments	2019-01-30 19:41:07 -08:00
Jihoon Son	d4fbbb8deb	Support protocol configuration for S3 (#6954 ) * Support protocol configuration for S3 * Add doc	2019-01-30 19:32:00 -08:00
Clint Wylie	a6d81c0d16	Adds bloom filter aggregator to 'druid-bloom-filters' extension (#6397 ) * blooming aggs * partially address review * fix docs * minor test refactor after rebase * use copied bloomkfilter * add ByteBuffer methods to BloomKFilter to allow agg to use in place, simplify some things, more tests * add methods to BloomKFilter to get number of set bits, use in comparator, fixes * more docs * fix * fix style * simplify bloomfilter bytebuffer merge, change methods to allow passing buffer offsets * oof, more fixes * more sane docs example * fix it * do the right thing in the right place * formatting * fix * avoid conflict * typo fixes, faster comparator, docs for comparator behavior * unused imports * use buffer comparator instead of deserializing * striped readwrite lock for buffer agg, null handling comparator, other review changes * style fixes * style * remove sync for now * oops * consistency * inspect runtime shape of selector instead of selector plus, static comparator, add inner exception on serde exception * CardinalityBufferAggregator inspect selectors instead of selectorPluses * fix style * refactor away from using ColumnSelectorPlus and ColumnSelectorStrategyFactory to instead use specialized aggregators for each supported column type, other review comments * adjustment * fix teamcity error? * rename nil aggs to empty, change empty agg constructor signature, add comments * use stringutils base64 stuff to be chill with master * add aggregate combiner, comment	2019-01-29 20:05:17 +07:00
Clint Wylie	af3cbc3687	add bloom filter druid expression (#6904 ) * add "bloom_filter_test" druid expression to support bloom filters in ExpressionVirtualColumn and ExpressionDimFilter and sql expressions * more docs * use java.util.Base64, doc fixes	2019-01-28 08:41:45 -05:00
Navin Kumar	ae4dba7785	Fix Configuration options (#6884 ) Change `druid.metadata.postgres.` to `druid.metadata.postgres.ssl.`	2019-01-27 12:35:27 -08:00
Justin Borromeo	86e171a234	Doc change and commands tested command on v5 and v8 (#6886 )	2019-01-18 15:13:11 -08:00
Jonathan Wei	68f744ec0a	Fixed buckets histogram aggregator (#6638 ) * Fixed buckets histogram aggregator * PR comments * More PR comments * Checkstyle * TeamCity * More TeamCity * PR comment * PR comment * Fix doc formatting	2019-01-17 14:51:16 -08:00
David Glasser	c08f391605	statsd-emitter: support constant DogStatsD tags (#6791 ) PR #6605 added support to the statsd emitter for DogStatsD tags. This commit lets you specify "constant tags" in the config file which are included with every event. This is helpful if you are running in an environment where you cannot configure your datadog-agent with tags like "cluster name" --- eg, a Kubernetes cluster with a datadog-agent on each node and different Druid deployments in different namespaces but sharing the same datadog-agent daemonset. Also fix the name of an existing boolean getter to start with 'is'.	2019-01-04 15:35:37 +08:00
Mingming Qiu	6761663509	make kafka poll timeout can be configured (#6773 ) * make kafka poll timeout can be configured * add doc * rename DEFAULT_POLL_TIMEOUT to DEFAULT_POLL_TIMEOUT_MILLIS	2019-01-03 12:16:02 +08:00
Joshua Sun	7c7997e8a1	Add Kinesis Indexing Service to core Druid (#6431 ) * created seekablestream classes * created seekablestreamsupervisor class * first attempt to integrate kafa indexing service to use SeekableStream * seekablestream bug fixes * kafkarecordsupplier * integrated kafka indexing service with seekablestream * implemented resume/suspend and refactored some package names * moved kinesis indexing service into core druid extensions * merged some changes from kafka supervisor race condition * integrated kinesis-indexing-service with seekablestream * unite tests for kinesis-indexing-service * various bug fixes for kinesis-indexing-service * refactored kinesisindexingtask * finished up more kinesis unit tests * more bug fixes for kinesis-indexing-service * finsihed refactoring kinesis unit tests * removed KinesisParititons and KafkaPartitions to use SeekableStreamPartitions * kinesis-indexing-service code cleanup and docs * merge #6291 merge #6337 merge #6383 * added more docs and reordered methods * fixd kinesis tests after merging master and added docs in seekablestream * fix various things from pr comment * improve recordsupplier and add unit tests * migrated to aws-java-sdk-kinesis * merge changes from master * fix pom files and forbiddenapi checks * checkpoint JavaType bug fix * fix pom and stuff * disable checkpointing in kinesis * fix kinesis sequence number null in closed shard * merge changes from master * fixes for kinesis tasks * capitalized <partitionType, sequenceType> * removed abstract class loggers * conform to guava api restrictions * add docker for travis other modules test * address comments * improve RecordSupplier to supply records in batch * fix strict compile issue * add test scope for localstack dependency * kinesis indexing task refactoring * comments * github comments * minor fix * removed unneeded readme * fix deserialization bug * fix various bugs * KinesisRecordSupplier unable to catch up to earliest position in stream bug fix * minor changes to kinesis * implement deaggregate for kinesis * Merge remote-tracking branch 'upstream/master' into seekablestream * fix kinesis offset discrepancy with kafka * kinesis record supplier disable getPosition * pr comments * mock for kinesis tests and remove docker dependency for unit tests * PR comments * avg lag in kafkasupervisor #6587 * refacotred SequenceMetadata in taskRunners * small fix * more small fix * recordsupplier resource leak * revert .travis.yml formatting * fix style * kinesis docs * doc part2 * more docs * comments * comments2 revert string replace changes * comments * teamcity * comments part 1 * comments part 2 * comments part 3 * merge #6754 * fix injection binding * comments * KinesisRegion refactor * comments part idk lol * can't think of a commit msg anymore * remove possiblyResetDataSourceMetadata() for IncrementalPublishingTaskRunner * commmmmmmmmmments * extra error handling in KinesisRecordSupplier getRecords * comments * quickfix * typo * oof	2018-12-21 12:49:24 -07:00
Jonathan Wei	c713116a75	Use @Coordinator leader client in CoordinatorRuleManager (#6729 )	2018-12-16 15:18:09 -08:00
Clint Wylie	4ec068642d	move parquet extension input formats up a level to `org.apache.druid.data.input.parquet.DruidParquetInputFormat` for `parquet` and `org.apache.druid.data.input.parquet.DruidParquetAvroInputFormat` for `parquet-avro` (#6727 )	2018-12-13 16:33:42 -08:00
David Lim	f7bbee2e65	Front Matter header needs to be on the first line for md to be rendered properly by jekyll (#6733 )	2018-12-13 11:47:20 -08:00
Vadim Ogievetsky	da4836f38c	Added titles and harmonized docs to improve usability and SEO (#6731 ) * added titles and harmonized docs * manually fixed some titles	2018-12-12 20:42:12 -08:00
David Lim	e2bedab665	fix links to use relative references (#6696 )	2018-11-30 16:32:10 -08:00
David Lim	b332021c49	remove extensions from default configs that have configuration/library dependencies and update docs (#6694 )	2018-11-30 12:52:46 -08:00
Mingming Qiu	849ba867b2	fix missing property in JsonTypeInfo of SegmentWriteOutMediumFactory (#6656 )	2018-11-27 15:59:58 -08:00
Clint Wylie	efdec50847	bloom filter sql (#6502 ) * bloom filter sql support * docs * style fix * style fixes after rebase * use copied/patched bloomkfilter * remove context literal lookup function, changes from review * fix build * rename LookupOperatorConversion to QueryLookupOperatorConversion * remove doc * revert unintended change * add internal exception to bloom filter deserialization exception	2018-11-27 14:11:18 +08:00
Jonathan Wei	e285b1103d	Use PasswordProvider for basic HTTP escalator (#6650 )	2018-11-21 07:34:15 -08:00
Deiwin Sarjas	e0d1dc5846	Support DogStatsD style tags in statsd-emitter (#6605 ) * Replace StatsD client library The [Datadog package][1] is a StatsD compatible drop-in replacement for the client library, but it seems to be [better maintained][2] and has support for Datadog DogStatsD specific features, which will be made use of in a subsequent commit. The `count`, `time`, and `gauge` methods are actually exactly compatible with the previous library and the modifications shouldn't be required, but EasyMock seems to have a hard time dealing with the variable arguments added by the DogStatsD library and causes tests to fail if no arguments are provided for the last String vararg. Passing an empty array fixes the test failures. [1]: https://github.com/DataDog/java-dogstatsd-client [2]: https://github.com/tim-group/java-statsd-client/issues/37#issuecomment-248698856 * Retain dimension key information for StatsD metrics This doesn't change behavior, but allows separating dimensions from the metric name in subsequent commits. There is a possible order change for values from `dimsBuilder.build().values()`, but from the tests it looks like it doesn't affect actual behavior and the order of user dimensions is also retained. * Support DogStatsD style tags in statsd-emitter Datadog [doesn't support name-encoded dimensions and uses a concept of _tags_ instead.][1] This change allows Datadog users to send the metrics without having to encode the various dimensions in the metric names. This enables building graphs and monitors with and without aggregation across various dimensions from the same data. As tests in this commit verify, the behavior remains the same for users who don't enable the `druid.emitter.statsd.dogstatsd` configuration flag. [1]: https://www.datadoghq.com/blog/the-power-of-tagged-metrics/#tags-decouple-collection-and-reporting * Disable convertRange behavior for DogStatsD users DogStatsD, unlike regular StatsD, supports floating-point values, so this behavior is unnecessary. It would be possible to still support `convertRange`, even with `dogstatsd` enabled, but that would mean that people using the default mapping would have some of the gauges unnecessarily converted. `time` is in milliseconds and doesn't support floating-point values.	2018-11-19 09:47:57 -08:00
Gian Merlino	7cd457f41c	Kafka: Add warning to doc for earlyMessageRejectionPeriod. (#6644 )	2018-11-18 15:47:38 -07:00
David Lim	afb239b17a	add missing license headers, in particular to MD files; clean up RAT … (#6563 ) * add missing license headers, in particular to MD files; clean up RAT exclusions * revert inadvertent doc changes * docs * cr changes * fix modified druid-production.svg	2018-11-13 09:38:37 -08:00
Clint Wylie	1224d8b746	overhaul 'druid-parquet-extensions' module, promoting from 'contrib' to 'core' (#6360 ) * move parquet-extensions from contrib to core, adds new hadoop parquet parser that does not convert to avro first and supports flattenSpec and int96 columns, add support for flattenSpec for parquet-avro conversion parser, much test with a bunch of files lifted from spark-sql * fix avro flattener to support nullable primitives for auto discovery and now only supports primitive arrays instead of all arrays * remove leftover print * convert micro timestamp to millis * checkstyle * add ignore for .parquet and .parq to rat exclude * fix legit test failure from avro flattern behavior change * fix rebase * add exclusions to pom to cut down on redundant jars * refactor tests, add support for unwrapping lists for parquet-avro, review comments * more comment * fix oops * tweak parquet-avro list handling * more docs * fix style * grr styles	2018-11-05 21:33:42 -08:00
Caroline1000	26d992840c	correct default tier name (#6568 )	2018-11-01 17:51:13 -07:00
Jihoon Son	a92c2a197b	Move supervisor APIs to api-reference (#6555 ) * Move supervisor APIs to api-reference * fix kafka-specific docs * add ingestion stats report	2018-11-01 13:10:05 -07:00
taiii	b1159174b7	Update mysql.md (#6545 )	2018-10-30 14:01:32 -07:00
Jonathan Wei	b2d9b6f23d	Allow custom TLS cert checks (#6432 ) * Allow custom TLS cert checks * PR comment * Checkstyle, PR comment	2018-10-24 16:31:52 -07:00
David Lim	822e564f54	include mysql-metadata-storage extension in distribution, but without… (#6497 ) * include mysql-metadata-storage extension in distribution, but without the GPL-licensed connector library * Install mysql connector package * use symlinks to avoid versioning issues * add documentation for fetching the mysql connector	2018-10-20 18:18:58 -07:00
Clint Wylie	84598fba3b	combine druid-api, druid-common, java-util into druid-core (#6443 ) * combine druid-api, druid-common, java-util * spacing	2018-10-14 20:37:37 -07:00
Atul Mohan	ab7b4798cc	Securing passwords used for SSL connections to Kafka (#6285 ) * Secure credentials in consumer properties * Merge master * Refactor property population into separate method * Fix property setter * Fix tests	2018-10-11 10:03:01 -07:00
QiuMM	f8f4526b16	Add suspend|resume|terminate all supervisors endpoints. (#6272 ) * ability to showdown all supervisors * add doc * address comments * fix code style * address comments * change ternary assignment to if statement * better docs	2018-10-10 21:41:59 -07:00
QiuMM	d559dfecb2	replace deprecated druid.port by druid.plaintextPort in docs (#6427 )	2018-10-09 10:57:01 -07:00
Nishant Bangarwa	c9d281a2e9	Add ability to pass in Bloom filter from Hive Queries (#6222 ) * Bloom filter initial implementation fix checkstyle review comments Fix wierd failure review comments Revert "Fix wierd failure" This reverts commit a13a83ad7887e679f6d539191b52aeaaea85b613. * fix test * review comment	2018-09-26 16:04:26 -07:00
Benedict Jin	e5d9fcfe8f	Add maven.exec.xxx.skip option for exec-maven-plugin (#6162 ) * Fix conflicts * Modify io.druid into org.apache.druid	2018-09-25 10:05:26 -07:00
Jonathan Wei	ee7b565469	Docs for ingestion stat reports and new parse exception handling (#6373 )	2018-09-24 17:45:05 -07:00
Alexander Saydakov	93345064b5	HllSketch module (#5712 ) * HllSketch module * updated license and imports * updated package name * implemented makeAggregateCombiner() * removed json marks * style fix * added module * removed unnecessary import, side effect of package renaming * use TreadLocalRandom * addressing code review points, mostly formatting and comments * javadoc * natural order with nulls * typo * factored out raw input value extraction * singleton * style fix * style fix * use Collections.singletonList instead of Arrays.asList * suppress warning	2018-09-24 08:41:56 -07:00
Jonathan Wei	8972244c68	Mutual TLS support (#6076 ) * Mutual TLS support * Kafka test fixes * TeamCity fix * Split integration tests * Use localhost DOCKER_IP * Increase server thread count * Increase SSL handshake timeouts * Add broken pipe retries, use injected client config params * PR comments, Rat license check exclusion	2018-09-19 09:56:15 -07:00
QiuMM	85391e9fb3	fix opentsdb emitter always be running and fail sending tags whose value contains colon (#6251 ) * fix opentsdb emitter always be running * check if emitter started * add more details about consumeDelay in doc * fix possible thread unsafe * fix fail sending tags whose value contain colon	2018-09-14 12:14:15 -07:00
Jonathan Wei	fd6786ac6c	Fix endpoint permissions section in basic-security docs (#6331 )	2018-09-13 15:23:41 -07:00
Clint Wylie	91a37c692d	'suspend' and 'resume' support for supervisors (kafka indexing service, materialized views) (#6234 ) * 'suspend' and 'resume' support for kafka indexing service changes: * introduces `SuspendableSupervisorSpec` interface to describe supervisors which support suspend/resume functionality controlled through the `SupervisorManager`, which will gracefully shutdown the supervisor and it's tasks, update it's `SupervisorSpec` with either a suspended or running state, and update with the toggled spec. Spec updates are provided by `SuspendableSupervisorSpec.createSuspendedSpec` and `SuspendableSupervisorSpec.createRunningSpec` respectively. * `KafkaSupervisorSpec` extends `SuspendableSupervisorSpec` and now supports suspend/resume functionality. The difference in behavior between 'running' and 'suspended' state is whether the supervisor will attempt to ensure that indexing tasks are or are not running respectively. Behavior is identical otherwise. * `SupervisorResource` now provides `/druid/indexer/v1/supervisor/{id}/suspend` and `/druid/indexer/v1/supervisor/{id}/resume` which are used to suspend/resume suspendable supervisors * Deprecated `/druid/indexer/v1/supervisor/{id}/shutdown` and moved it's functionality to `/druid/indexer/v1/supervisor/{id}/terminate` since 'shutdown' is ambiguous verbage for something that effectively stops a supervisor forever * Added ability to get all supervisor specs from `/druid/indexer/v1/supervisor` by supplying the 'full' query parameter `/druid/indexer/v1/supervisor?full` which will return a list of json objects of the form `{"id":<id>, "spec":<SupervisorSpec>}` * Updated overlord console ui to enable suspend/resume, and changed 'shutdown' to 'terminate' * move overlord console status to own column in supervisor table so does not look like garbage * spacing * padding * other kind of spacing * fix rebase fail * fix more better * all supervisors now suspendable, updated materialized view supervisor to support suspend, more tests * fix log	2018-09-13 14:42:18 -07:00
Clint Wylie	e6e068ce60	Add support for 'maxTotalRows' to incremental publishing kafka indexing task and appenderator based realtime task (#6129 ) * resolves #5898 by adding maxTotalRows to incremental publishing kafka index task and appenderator based realtime indexing task, as available in IndexTask * address review comments * changes due to review * merge fail	2018-09-07 13:17:49 -07:00
Jonathan Wei	60cbc64472	Use PasswordProvider, fix info on initial passwords in basic security extension docs (#6303 ) * Fix info on initial passwords in basic security extension docs * Use PasswordProvider * Compile fix	2018-09-05 17:07:16 -07:00
Jonathan Wei	180e3ccfad	Docs consistency cleanup (#6259 )	2018-09-04 12:54:41 -07:00
Gian Merlino	431d3d8497	Rename io.druid to org.apache.druid. (#6266 ) * Rename io.druid to org.apache.druid. * Fix META-INF files and remove some benchmark results. * MonitorsConfig update for metrics package migration. * Reorder some dimensions in inner queries for some reason. * Fix protobuf tests.	2018-08-30 09:56:26 -07:00
Ryan Plessner	9c500fb69f	Add PostgreSQLConnectorConfig to expose SSL configuration options (#6181 ) * Add PostgreSQLConnectorConfig to expose SSL configuration options for the Postgres Metadata Storage module. * Fix checkstyle violations and add license header * Convert properties in the postgres docs to be the full property path and fix typo * Fix grammar in sslFactory docs	2018-08-21 16:45:27 -07:00
pdeva	c028d18d74	update redis-cache documentation (#6109 ) * update redis-cache documentation added clarifying info on setup and enablement * added link	2018-08-09 13:44:59 -07:00

249 Commits