Commit Graph

8949 Commits

Author SHA1 Message Date
Gian Merlino e497141e92 Update Curator to 4.1.0. (#6862) 2019-01-15 14:12:07 -08:00
Jonathan Wei 8537a771b0 Some fixes and tests for spaces/non-ASCII chars in datasource names (#6761)
* Fixes and tests for spaces/non-ASCII datasource names

* Some unit test fixes

* Fix ITRealtimeIndexTaskTest

* Checkstyle

* TeamCity

* PR comments
2019-01-15 08:33:31 -08:00
Surekha f72f33f84a Fix num_replicas count in sys.segments table (#6804)
* Fix num_replicas count from sys.segments

* Adjust unit test for num_replica > 1

* Pass named arguments instead of passing boolean constants

* Address PR comments

* PR comments
2019-01-15 08:31:29 -08:00
Jonathan Wei 9a8bade2fb Update approximate aggregators docs (#6848) 2019-01-11 21:50:51 -08:00
Charles Allen 5d2947cd52 Use Guava Compatible immediate executor service (#6815)
* Use multi-guava version friendly direct executor implementation

* Don't use a singleton

* Fix strict compliation complaints

* Copy Guava's DirectExecutor

* Fix javadoc

* Imports are the devil
2019-01-11 10:42:19 -08:00
Jihoon Son 39780d914c Propagate errors to RemoteTaskActionClient (#6837)
* Propagate errors to RemoteTaskActionClient

* style
2019-01-10 20:53:18 -08:00
Jonathan Wei b18d681551 Use kafka_2.12-0.10.2.2 (#6846) 2019-01-10 20:52:55 -08:00
Furkan KAMACI 55927bf8e3 Kafka version is updated (#6835)
Update Kafka version in tutorial from 0.10.2.0 to 0.10.2.2
2019-01-10 17:58:40 -08:00
pzhdfy ea3f426171 bugfix: Materialized view not support post-aggregator (#6689)
* bugfix: Materialized view not support post post-aggregator

* add unit test
2019-01-10 14:25:09 -08:00
Jihoon Son c35a39d70b
Add support maxRowsPerSegment for auto compaction (#6780)
* Add support maxRowsPerSegment for auto compaction

* fix build

* fix build

* fix teamcity

* add test

* fix test

* address comment
2019-01-10 09:50:14 -08:00
Clint Wylie ccfd1244d1 fix parquet parse performance issue (#6833)
* check that value is present before conversion to prevent silent, expensive exception and fix another bug

* cleanup

* now with less parenthesis
2019-01-10 09:18:57 -08:00
Jihoon Son 934c83bca6 Fix TaskLockbox when there are multiple intervals of the same start but differerent end (#6822)
* Fix TaskLockbox when there are multiple intervals of the same start but differernt end

* fix build

* fix npe
2019-01-09 19:38:27 -08:00
Furkan KAMACI ea973fee6b Tranquility version is updated (#6824) 2019-01-10 09:46:58 +08:00
Richard Startin 99097617a1 use RoaringBitmapWriter for RoaringBitmap construction (#6764) 2019-01-08 17:18:41 -08:00
dongyifeng def823124c add version comparator for StringComparator (#6745)
* add version comparator for StringComparator

* add more test case and docs
2019-01-08 17:17:03 -08:00
Jihoon Son c4716d1639 Fix ParallelIndexTask when publishing empty segments (#6807)
* Fix ParallelIndexTask when publishing empty segments

* unused import
2019-01-08 17:15:16 -08:00
Benjamin Hopp ef80c4e036 Update sql.md (#6821)
Corrected defaults for druid.sql.avatica.maxStatementsPerConnection and druid.sql.avatica.maxConnections
2019-01-08 10:15:12 -08:00
Mingming Qiu 8ebb7b558b Handoff should ignore segments that are dropped by drop rules (#6676)
* Handoff should ignore segments that are dropped by drop rules

* fix travis-ci

* fix tests

* address comments

* remove line added by accident

* address comments

* add javadoc and logging the full stack trace of exception

* add error message
2019-01-07 14:43:11 -08:00
Janek Lasocki-Biczysko b88e6304c4 Fix broken link in ingestion/schema-design.md docs (#6810) 2019-01-06 18:20:53 -08:00
Mingming Qiu 636964fcb5 Fix issue that tasks failed because of no sink for identifier (#6724)
* Fix issue that tasks failed because of no sink for identifier

* make find sinks to persist run in one callable together with the actual persist work

* Revert "make find sinks to persist run in one callable together with the actual persist work"

This reverts commit a24a2d80ae.
2019-01-04 17:09:11 -08:00
David Glasser c08f391605 statsd-emitter: support constant DogStatsD tags (#6791)
PR #6605 added support to the statsd emitter for DogStatsD tags. This commit
lets you specify "constant tags" in the config file which are included with
every event. This is helpful if you are running in an environment where you
cannot configure your datadog-agent with tags like "cluster name" --- eg, a
Kubernetes cluster with a datadog-agent on each node and different Druid
deployments in different namespaces but sharing the same datadog-agent
daemonset.

Also fix the name of an existing boolean getter to start with 'is'.
2019-01-04 15:35:37 +08:00
thomask 0e04acca43 Show how to include classpath in command (#6802)
Would have saved me some time
2019-01-03 18:31:55 -08:00
elloooooo 832a3b16ed Improve slfj logger input for MDC field:datasource (#6787)
* improve slfj logger  MDC datasource input

* add some UT and isNested field
2019-01-03 18:00:04 -08:00
Jihoon Son 9ad6a733a5 Add support segmentGranularity for CompactionTask (#6758)
* Add support segmentGranularity

* add doc and fix combination of options

* improve doc
2019-01-03 17:50:45 -08:00
Gian Merlino bc671ac436
SQL: Fix ordering of sort, sortProject in DruidSemiJoin. (#6769)
They were added in the wrong order, leading to this error message
when evaluating rules:

"Cannot move from stage[AGGREGATE] to stage[SORT_PROJECT]"
2019-01-03 10:36:28 -08:00
Jihoon Son a7391b396b Copy splitToList from Guava (#6800)
* Copy splitToList from Guava

* add comment

* fix commit tag
2019-01-03 18:42:13 +08:00
Mingming Qiu 6761663509 make kafka poll timeout can be configured (#6773)
* make kafka poll timeout can be configured

* add doc

* rename DEFAULT_POLL_TIMEOUT to DEFAULT_POLL_TIMEOUT_MILLIS
2019-01-03 12:16:02 +08:00
Benedict Jin e8ddd9942d Fix wrong counter getFailedSendingTimeCounter method (#6793)
* Fix wrong counter getFailedSendingTimeCounter method

* Add testcases

* Add getTimeSumAndCount for testcases
2019-01-02 23:50:54 +08:00
Mingming Qiu 114a9fc38f change propertyBase in ServerViewModule (#6774) 2019-01-02 16:44:02 +08:00
Clint Wylie 67f832957b add bloom filter operator to general sql docs (#6785) 2018-12-31 11:30:33 -08:00
Jihoon Son fa7cb906e4 Fix auto compaction to consider intervals of running tasks (#6767)
* Fix auto compaction to consider intervals of running tasks

* adjust initial collection size
2018-12-27 18:03:44 -08:00
Joshua Sun 7c7997e8a1 Add Kinesis Indexing Service to core Druid (#6431)
* created seekablestream classes

* created seekablestreamsupervisor class

* first attempt to integrate kafa indexing service to use SeekableStream

* seekablestream bug fixes

* kafkarecordsupplier

* integrated kafka indexing service with seekablestream

* implemented resume/suspend and refactored some package names

* moved kinesis indexing service into core druid extensions

* merged some changes from kafka supervisor race condition

* integrated kinesis-indexing-service with seekablestream

* unite tests for kinesis-indexing-service

* various bug fixes for kinesis-indexing-service

* refactored kinesisindexingtask

* finished up more kinesis unit tests

* more bug fixes for kinesis-indexing-service

* finsihed refactoring kinesis unit tests

* removed KinesisParititons and KafkaPartitions to use SeekableStreamPartitions

* kinesis-indexing-service code cleanup and docs

* merge #6291

merge #6337

merge #6383

* added more docs and reordered methods

* fixd kinesis tests after merging master and added docs in seekablestream

* fix various things from pr comment

* improve recordsupplier and add unit tests

* migrated to aws-java-sdk-kinesis

* merge changes from master

* fix pom files and forbiddenapi checks

* checkpoint JavaType bug fix

* fix pom and stuff

* disable checkpointing in kinesis

* fix kinesis sequence number null in closed shard

* merge changes from master

* fixes for kinesis tasks

* capitalized <partitionType, sequenceType>

* removed abstract class loggers

* conform to guava api restrictions

* add docker for travis other modules test

* address comments

* improve RecordSupplier to supply records in batch

* fix strict compile issue

* add test scope for localstack dependency

* kinesis indexing task refactoring

* comments

* github comments

* minor fix

* removed unneeded readme

* fix deserialization bug

* fix various bugs

* KinesisRecordSupplier unable to catch up to earliest position in stream bug fix

* minor changes to kinesis

* implement deaggregate for kinesis

* Merge remote-tracking branch 'upstream/master' into seekablestream

* fix kinesis offset discrepancy with kafka

* kinesis record supplier disable getPosition

* pr comments

* mock for kinesis tests and remove docker dependency for unit tests

* PR comments

* avg lag in kafkasupervisor #6587

* refacotred SequenceMetadata in taskRunners

* small fix

* more small fix

* recordsupplier resource leak

* revert .travis.yml formatting

* fix style

* kinesis docs

* doc part2

* more docs

* comments

* comments*2

* revert string replace changes

* comments

* teamcity

* comments part 1

* comments part 2

* comments part 3

* merge #6754

* fix injection binding

* comments

* KinesisRegion refactor

* comments part idk lol

* can't think of a commit msg anymore

* remove possiblyResetDataSourceMetadata() for IncrementalPublishingTaskRunner

* commmmmmmmmmments

* extra error handling in KinesisRecordSupplier getRecords

* comments

* quickfix

* typo

* oof
2018-12-21 12:49:24 -07:00
Christoph Hösler bd44e2971f fix(indexer): don't flush closed stream (#6748) 2018-12-20 15:40:07 -08:00
Clint Wylie 522ef1a013 update roaring to latest (#6750) 2018-12-20 15:10:24 -08:00
Surekha 5e5aad49e6 Set is_available to false by default for published segment (#6757)
* Set is_available to false by default for published segment

* Address comments

Fix the is_published value for segments not in metadata store

* Remove unused import

* Use non-null sharSpec for a segment in test

* Fix checkstyle

* Modify comment
2018-12-20 13:29:00 -08:00
Gian Merlino 6fbf3d635b Add IntelliJ codestyle setting for "blank lines before package". (#6766)
Required to get IntelliJ to automatically format code for compliance
with the check introduced in #6543.
2018-12-20 10:13:03 -08:00
Gian Merlino f0b7c272b9 Broker: Start up DruidSchema immediately if there are no segments. (#6765)
Fixes a bug introduced in #6742, where the broker would delay startup
indefinitely if there were no segments at all being served by any
data servers.
2018-12-20 11:07:35 -07:00
David Glasser 9bbd992885 Update two Javadocs to cite druid.generic.useDefaultValueForNull (#6760)
See #4349.
2018-12-20 09:39:37 -08:00
Jihoon Son 78defa436b Fix missing DataNodeService for historical (#6762) 2018-12-20 09:13:38 -08:00
Gian Merlino 7a09cde4de
Broker: Await initialization before finishing startup. (#6742)
* Broker: Await initialization before finishing startup.

In particular, hold off on announcing the service and starting the
HTTP server until the server view and SQL metadata cache are finished
initializing. This closes a window of time where a Broker could return
partial results shortly after startup.

As part of this, some simplification of server-lifecycle service
announcements. This helps ensure that the two different kinds of
announcements we do (legacy and new-style) stay in sync.

* Remove unused imports.

* Fix NPE in ServerRunnable.
2018-12-18 20:32:31 -08:00
Gian Merlino f12a1aa993 SQL: Add support for queries with project-after-semijoin. (#6756)
* SQL: Add support for queries with project-after-semijoin.

These didn't work before, since the top Project rel wasn't getting
merged into the DruidSemiJoin rel. This patch allows that to happen.

* Null handling

* Null handling

* Null handling
2018-12-18 17:53:14 -08:00
Jihoon Son 4591c56afb Fix error handling after pause request in Kafka supervisor (#6754)
* Fix error handling after pause request in kafka supervisor

* fix test

* fix test
2018-12-18 17:52:44 -08:00
Clint Wylie 9505074530 fix log typo (#6755)
* fix log typo, add DataSegmentUtils.getIdentifiersString util method

* fix indecisive oops
2018-12-18 15:10:25 -08:00
Jihoon Son f0ee6bf898 Fix auto compaction when the firstSegment is in skipOffset (#6738)
* Fix auto compaction when the firstSegment is in skipOffset

* remove duplicate
2018-12-18 19:10:46 +08:00
Jihoon Son 2c380e3a26 Fix doc for automatic compaction (#6749) 2018-12-17 11:44:33 -08:00
Clint Wylie 486c6f3cf9 emit logs that are only useful for debugging at debug level (#6741)
* make logs that are only useful for debugging be at debug level so log volume is much more chill

* info level messages for total merge buffer allocated/free

* more chill compaction logs
2018-12-17 14:20:28 +08:00
Jonathan Wei c713116a75 Use @Coordinator leader client in CoordinatorRuleManager (#6729) 2018-12-16 15:18:09 -08:00
Gian Merlino 04e7c7fbdc FilteredRequestLogger: Fix start/stop, invalid delegate behavior. (#6637)
* FilteredRequestLogger: Fix start/stop, invalid delegate behavior.

Fixes two bugs:

1) FilteredRequestLogger did not start/stop the delegate.

2) FilteredRequestLogger would ignore an invalid delegate type, and
instead silently substitute the "noop" logger. This was due to a larger
problem with RequestLoggerProvider setup in general; the fix here is
to remove "defaultImpl" from the RequestLoggerProvider interface, and
instead have JsonConfigurator be responsible for creating the
default implementations. It is stricter about things than the old system
was, and is only willing to make a noop logger if it doesn't see any
request logger configs. Otherwise, it'll raise a provision error.

* Remove unneeded annotations.
2018-12-14 16:55:44 +08:00
Clint Wylie 4ec068642d move parquet extension input formats up a level to `org.apache.druid.data.input.parquet.DruidParquetInputFormat` for `parquet` and `org.apache.druid.data.input.parquet.DruidParquetAvroInputFormat` for `parquet-avro` (#6727) 2018-12-13 16:33:42 -08:00
David Lim f7bbee2e65 Front Matter header needs to be on the first line for md to be rendered properly by jekyll (#6733) 2018-12-13 11:47:20 -08:00