8968 Commits

Author SHA1 Message Date
Joshua Sun 7c7997e8a1 Add Kinesis Indexing Service to core Druid (#6431)
* created seekablestream classes

* created seekablestreamsupervisor class

* first attempt to integrate kafka indexing service to use SeekableStream

* seekablestream bug fixes

* kafkarecordsupplier

* integrated kafka indexing service with seekablestream

* implemented resume/suspend and refactored some package names

* moved kinesis indexing service into core druid extensions

* merged some changes from kafka supervisor race condition

* integrated kinesis-indexing-service with seekablestream

* unit tests for kinesis-indexing-service

* various bug fixes for kinesis-indexing-service

* refactored kinesisindexingtask

* finished up more kinesis unit tests

* more bug fixes for kinesis-indexing-service

* finished refactoring kinesis unit tests

* removed KinesisPartitions and KafkaPartitions to use SeekableStreamPartitions

* kinesis-indexing-service code cleanup and docs

* merge #6291

merge #6337

merge #6383

* added more docs and reordered methods

* fixed kinesis tests after merging master and added docs in seekablestream

* fix various things from pr comment

* improve recordsupplier and add unit tests

* migrated to aws-java-sdk-kinesis

* merge changes from master

* fix pom files and forbiddenapi checks

* checkpoint JavaType bug fix

* fix pom and stuff

* disable checkpointing in kinesis

* fix kinesis sequence number null in closed shard

* merge changes from master

* fixes for kinesis tasks

* capitalized <partitionType, sequenceType>

* removed abstract class loggers

* conform to guava api restrictions

* add docker for travis other modules test

* address comments

* improve RecordSupplier to supply records in batch

* fix strict compile issue

* add test scope for localstack dependency

* kinesis indexing task refactoring

* comments

* github comments

* minor fix

* removed unneeded readme

* fix deserialization bug

* fix various bugs

* KinesisRecordSupplier unable to catch up to earliest position in stream bug fix

* minor changes to kinesis

* implement deaggregate for kinesis

* Merge remote-tracking branch 'upstream/master' into seekablestream

* fix kinesis offset discrepancy with kafka

* kinesis record supplier disable getPosition

* pr comments

* mock for kinesis tests and remove docker dependency for unit tests

* PR comments

* avg lag in kafkasupervisor #6587

* refactored SequenceMetadata in taskRunners

* small fix

* more small fix

* recordsupplier resource leak

* revert .travis.yml formatting

* fix style

* kinesis docs

* doc part2

* more docs

* comments

* comments*2

* revert string replace changes

* comments

* teamcity

* comments part 1

* comments part 2

* comments part 3

* merge #6754

* fix injection binding

* comments

* KinesisRegion refactor

* comments part idk lol

* can't think of a commit msg anymore

* remove possiblyResetDataSourceMetadata() for IncrementalPublishingTaskRunner

* commmmmmmmmmments

* extra error handling in KinesisRecordSupplier getRecords

* comments

* quickfix

* typo

* oof
2018-12-21 12:49:24 -07:00
Christoph Hösler bd44e2971f fix(indexer): don't flush closed stream (#6748) 2018-12-20 15:40:07 -08:00
Clint Wylie 522ef1a013 update roaring to latest (#6750) 2018-12-20 15:10:24 -08:00
Surekha 5e5aad49e6 Set is_available to false by default for published segment (#6757)
* Set is_available to false by default for published segment

* Address comments

Fix the is_published value for segments not in metadata store

* Remove unused import

* Use non-null shardSpec for a segment in test

* Fix checkstyle

* Modify comment
2018-12-20 13:29:00 -08:00
Gian Merlino 6fbf3d635b Add IntelliJ codestyle setting for "blank lines before package". (#6766)
Required to get IntelliJ to automatically format code for compliance
with the check introduced in #6543.
2018-12-20 10:13:03 -08:00
Gian Merlino f0b7c272b9 Broker: Start up DruidSchema immediately if there are no segments. (#6765)
Fixes a bug introduced in #6742, where the broker would delay startup
indefinitely if there were no segments at all being served by any
data servers.
2018-12-20 11:07:35 -07:00
David Glasser 9bbd992885 Update two Javadocs to cite druid.generic.useDefaultValueForNull (#6760)
See #4349.
2018-12-20 09:39:37 -08:00
Jihoon Son 78defa436b Fix missing DataNodeService for historical (#6762) 2018-12-20 09:13:38 -08:00
Gian Merlino 7a09cde4de Broker: Await initialization before finishing startup. (#6742)
* Broker: Await initialization before finishing startup.

In particular, hold off on announcing the service and starting the
HTTP server until the server view and SQL metadata cache are finished
initializing. This closes a window of time where a Broker could return
partial results shortly after startup.

As part of this, some simplification of server-lifecycle service
announcements. This helps ensure that the two different kinds of
announcements we do (legacy and new-style) stay in sync.

* Remove unused imports.

* Fix NPE in ServerRunnable.
2018-12-18 20:32:31 -08:00
Gian Merlino f12a1aa993 SQL: Add support for queries with project-after-semijoin. (#6756)
* SQL: Add support for queries with project-after-semijoin.

These didn't work before, since the top Project rel wasn't getting
merged into the DruidSemiJoin rel. This patch allows that to happen.

* Null handling

* Null handling

* Null handling
2018-12-18 17:53:14 -08:00
Jihoon Son 4591c56afb Fix error handling after pause request in Kafka supervisor (#6754)
* Fix error handling after pause request in kafka supervisor

* fix test

* fix test
2018-12-18 17:52:44 -08:00
Clint Wylie 9505074530 fix log typo (#6755)
* fix log typo, add DataSegmentUtils.getIdentifiersString util method

* fix indecisive oops
2018-12-18 15:10:25 -08:00
Jihoon Son f0ee6bf898 Fix auto compaction when the firstSegment is in skipOffset (#6738)
* Fix auto compaction when the firstSegment is in skipOffset

* remove duplicate
2018-12-18 19:10:46 +08:00
Jihoon Son 2c380e3a26 Fix doc for automatic compaction (#6749) 2018-12-17 11:44:33 -08:00
Clint Wylie 486c6f3cf9 emit logs that are only useful for debugging at debug level (#6741)
* make logs that are only useful for debugging be at debug level so log volume is much more chill

* info level messages for total merge buffer allocated/free

* more chill compaction logs
2018-12-17 14:20:28 +08:00
Jonathan Wei c713116a75 Use @Coordinator leader client in CoordinatorRuleManager (#6729) 2018-12-16 15:18:09 -08:00
Gian Merlino 04e7c7fbdc FilteredRequestLogger: Fix start/stop, invalid delegate behavior. (#6637)
* FilteredRequestLogger: Fix start/stop, invalid delegate behavior.

Fixes two bugs:

1) FilteredRequestLogger did not start/stop the delegate.

2) FilteredRequestLogger would ignore an invalid delegate type, and
instead silently substitute the "noop" logger. This was due to a larger
problem with RequestLoggerProvider setup in general; the fix here is
to remove "defaultImpl" from the RequestLoggerProvider interface, and
instead have JsonConfigurator be responsible for creating the
default implementations. It is stricter about things than the old system
was, and is only willing to make a noop logger if it doesn't see any
request logger configs. Otherwise, it'll raise a provision error.

* Remove unneeded annotations.
2018-12-14 16:55:44 +08:00
Clint Wylie 4ec068642d move parquet extension input formats up a level to `org.apache.druid.data.input.parquet.DruidParquetInputFormat` for `parquet` and `org.apache.druid.data.input.parquet.DruidParquetAvroInputFormat` for `parquet-avro` (#6727) 2018-12-13 16:33:42 -08:00
David Lim f7bbee2e65 Front Matter header needs to be on the first line for md to be rendered properly by jekyll (#6733) 2018-12-13 11:47:20 -08:00
Vadim Ogievetsky da4836f38c Added titles and harmonized docs to improve usability and SEO (#6731)
* added titles and harmonized docs

* manually fixed some titles
2018-12-12 20:42:12 -08:00
Clint Wylie 55914687bb Fix broken link in docs toc (#6728)
Change 'peon.html' to the correct link, 'peons.html'. No redirect is needed because the file has always been 'peons', just an incorrect link was introduced in the toc here https://github.com/apache/incubator-druid/pull/6259/files#diff-45297643736c5fb6da0e92f2c3df5d68R89
2018-12-12 15:14:38 -08:00
dongyifeng 91e3cf7196 add charset UTF-8 to log api (#6709)
When I retrieve the task log in browser, the Chinese characters all end up as garbage.
![image](https://user-images.githubusercontent.com/1322134/49502749-bd614080-f8b0-11e8-839e-07f7117eebfd.png)
After adding charset UTF-8, it was correct.
![image](https://user-images.githubusercontent.com/1322134/49502804-dc5fd280-f8b0-11e8-916b-bda8f1e7f318.png)
2018-12-12 16:31:04 +01:00
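The garbling described in the commit above can be reproduced in isolation. The following is a minimal sketch, not the code from the commit: the class `CharsetSketch` and its methods are hypothetical names, and ISO-8859-1 is assumed as the fallback charset a client might pick when the response declares none.

```java
import java.nio.charset.Charset;

// Hypothetical helper, for illustration only; not part of Druid.
final class CharsetSketch {
    // Decodes a response body the way a client would, given the charset it believes applies.
    static String decode(byte[] body, Charset charset) {
        return new String(body, charset);
    }

    // Builds a Content-Type header value that states the charset explicitly.
    static String contentType(String mimeType, Charset charset) {
        return mimeType + "; charset=" + charset.name();
    }
}
```

Decoding UTF-8 bytes of Chinese text with the wrong charset yields unrelated characters, while declaring `text/plain; charset=UTF-8` lets the client pick the matching decoder.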
Atul Mohan 86e3ae5b48 Add fail message (#6720) 2018-12-11 08:05:50 -08:00
David Lim 3443f9a008 add missing contrib extensions to distribution packaging, fix influx package name, fix checkstyle plugin config to support Maven < 3.3.9 (#6717) 2018-12-10 17:56:32 -08:00
Jihoon Son f727333b70 Fix shutdownAllTasks API for non-existing dataSource (#6706) 2018-12-08 09:54:01 -08:00
Gian Merlino b7709e1245 FileUtils: Sync directory entry too on writeAtomically. (#6677)
* FileUtils: Sync directory entry too on writeAtomically.

See the fsync(2) man page for why this is important:
https://linux.die.net/man/2/fsync

This also plumbs CompressionUtils's "zip" function through
writeAtomically, so the code for handling atomic local filesystem
writes is all done in the same place.

* Remove unused import.

* Avoid FileOutputStream.

* Allow non-atomic writes to overwrite.

* Add some comments. And no need to flush an unbuffered stream.
2018-12-08 17:12:59 +01:00
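The idea in the commit above (write to a temporary file, sync it, rename it into place, then sync the parent directory so the rename itself is durable) can be sketched as follows. This is an illustration under stated assumptions, not Druid's `FileUtils.writeAtomically`: the class name `AtomicWriteSketch` and the method signature are hypothetical, and opening a directory as a `FileChannel` is assumed to work only on POSIX-like platforms.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, for illustration only; not Druid's implementation.
final class AtomicWriteSketch {
    static void writeAtomically(Path target, byte[] data) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        // The temp file lives in the same directory so the rename stays on one filesystem.
        Path tmp = Files.createTempFile(dir, ".tmp-", ".part");
        try {
            try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.WRITE)) {
                ByteBuffer buf = ByteBuffer.wrap(data);
                while (buf.hasRemaining()) {
                    ch.write(buf);
                }
                ch.force(true); // sync file contents and metadata
            }
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
            // Sync the directory entry as well, so the rename survives a crash.
            try (FileChannel dirCh = FileChannel.open(dir, StandardOpenOption.READ)) {
                dirCh.force(true);
            } catch (IOException e) {
                // Assumption: some platforms cannot open a directory this way; skip the sync there.
            }
        } finally {
            Files.deleteIfExists(tmp); // no-op after a successful move
        }
    }
}
```

A reader of `target` then sees either the old contents or the complete new contents, never a partially written file.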
Furkan KAMACI bbb283fa34 Double-checked locking bugs (#6662)
* Double-checked locking bug is fixed.

* @Nullable is removed since there is no need to use along with @MonotonicNonNull.

* Static import is removed.

* Lazy initialization is implemented.

* Local variables used instead of volatile ones.

* Local variables used instead of volatile ones.
2018-12-07 17:10:29 +01:00
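For readers unfamiliar with the pattern named in the commit above, here is a minimal sketch of safe double-checked locking with lazy initialization. It is a generic illustration, not the code changed in the commit: `LazyValue` is a hypothetical class, and the supplier is assumed to return a non-null value.

```java
import java.util.function.Supplier;

// Hypothetical holder, for illustration only; not part of Druid.
final class LazyValue<T> {
    private final Supplier<T> initializer;
    // volatile makes the fully constructed value visible to other threads.
    private volatile T value;

    LazyValue(Supplier<T> initializer) {
        this.initializer = initializer;
    }

    T get() {
        // Copy into a local so the fast path performs a single volatile read.
        T local = value;
        if (local == null) {
            synchronized (this) {
                local = value; // re-check under the lock
                if (local == null) {
                    local = initializer.get();
                    value = local;
                }
            }
        }
        return local;
    }
}
```

Without `volatile` on the field, another thread could observe a non-null reference to a partially constructed object, which is the classic double-checked locking bug.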
Jihoon Son d525e5b18e Fix travis timeout in BufferHashGrouperTest (#6713)
* Fix travis timeout in BufferHashGrouperTest

* adjust buffer size

* adjust bufferSize and loadFactor

* increase memory

* add debug code

* cat error

* after script

* print logs

* print per 2 min

* use direct mem

* clean up
2018-12-07 12:05:27 +08:00
Vincent Newkirk cc44a4a28f Correct Documentation for lowerStrict/upperStrict (#6707)
The documentation for Bound filter's lowerStrict/upperStrict is incorrect. It is not consistent with the examples provided and actual behaviour of the bound filter. Correct this.
2018-12-06 10:14:50 -08:00
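As a reading aid for the commit above: in the Druid documentation, `lowerStrict` and `upperStrict` switch the bound comparison from inclusive (`>=`, `<=`) to strict (`>`, `<`). The sketch below illustrates that semantics for numeric values only; `BoundSketch` is a hypothetical class and does not reproduce the bound filter's other orderings.

```java
// Hypothetical helper, for illustration only; not Druid's bound filter.
final class BoundSketch {
    // A null bound means that side is unbounded.
    static boolean matches(long value, Long lower, boolean lowerStrict, Long upper, boolean upperStrict) {
        if (lower != null) {
            boolean ok = lowerStrict ? value > lower : value >= lower;
            if (!ok) {
                return false;
            }
        }
        if (upper != null) {
            boolean ok = upperStrict ? value < upper : value <= upper;
            if (!ok) {
                return false;
            }
        }
        return true;
    }
}
```

With both flags false the bounds are inclusive, so a value equal to either bound matches. Setting a flag to true excludes the value equal to that bound.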
Mingming Qiu e8dd3716b8 add close method in Cache interface (#6540)
* add close method in Cache interface

* address comments

* address comments and fix travis-ci

* use try-finally
2018-12-06 17:28:41 +08:00
Mingming Qiu 607339003b Add TaskCountStatsMonitor to monitor task count stats (#6657)
* Add TaskCountStatsMonitor to monitor task count stats

* address comments

* add file header

* tweak test
2018-12-04 13:37:17 -08:00
Atul Mohan ec36f0b82f Add default comparison to HavingSpecMetricComparator for custom Aggregator types (#6505)
* Add default comparison

* Switch to BigDecimal comparison

* Add comparator from AggFactory

* Fix indent

* Add tests
2018-12-04 13:35:13 -08:00
Clint Wylie a1c9d0add2 autosize processing buffers based on direct memory sizing by default (#6588)
* autosize processing buffers based on direct memory sizing

* remove oops, more test

* max 1gb autosize buffers, test, start of docs

* fix oops

* revert accidental change

* print buffer size in exception

* change the things
2018-12-03 18:40:02 -07:00
Clint Wylie 43adb391c2 remove AbstractResourceFilter.isApplicable because it is not (#6691)
* remove AbstractResourceFilter.isApplicable because it is not, add tests for OverlordResource.doShutdown and OverlordResource.shutdownTasksForDatasource

* cleanup
2018-12-01 21:52:31 +08:00
David Lim e2bedab665 fix links to use relative references (#6696) 2018-11-30 16:32:10 -08:00
Roman Leventov ec38df7575 Simplify DruidNodeDiscoveryProvider; add DruidNodeDiscovery.Listener.nodeViewInitialized() (#6606)
* Simplify DruidNodeDiscoveryProvider; add DruidNodeDiscovery.Listener.nodeViewInitialized() method; prohibit and eliminate some suboptimal Java 8 patterns

* Fix style

* Fix HttpEmitterTest.timeoutEmptyQueue()

* Add DruidNodeDiscovery.Listener.nodeViewInitialized() calls in tests

* Clarify code
2018-12-01 01:12:56 +01:00
David Lim b332021c49 remove extensions from default configs that have configuration/library dependencies and update docs (#6694) 2018-11-30 12:52:46 -08:00
rcgarcia74 9bf835b84f remove #658 doc reference for Schema-less design (#6693) 2018-11-30 12:53:57 -07:00
Shimi Kiviti 5fae522f12 replace Files.map() with FileUtils.map() (#6692) 2018-11-30 07:40:41 -08:00
Jihoon Son d6539abd0a Fix overlord api and console (#6686)
* Fix overlord APIs and console

* remove getRunningTasksByDataSource

* add missing path to isApplicable
2018-11-29 23:45:28 -08:00
陈春斌 624f328ea1 lazy create descriptor in ProtobufInputRowParser (#6678) 2018-11-28 21:59:29 -08:00
Mingming Qiu c81cb94226 fix cannot resolve param at OverlordResource#getTasks (#6679) 2018-11-29 10:07:21 +08:00
Mingming Qiu c5405bb592 emit maxLag/avgLag in KafkaSupervisor (#6587)
* emit maxLag/totalLag/avgLag in KafkaSupervisor

* modify ingest/kafka/totalLag to ingest/kafka/lag for backwards compatibility
2018-11-28 02:11:14 -08:00
hate13 f4b49f01ff add rule count on log (#6467)
* add rule count on log

* add final
2018-11-28 16:08:38 +08:00
David Glasser d150483fd3 indexing-service: fix HTML title on overlord console (#6671)
This follows a similar fix to the body of the page done in #5627.
2018-11-27 22:34:26 -08:00
Mingming Qiu 849ba867b2 fix missing property in JsonTypeInfo of SegmentWriteOutMediumFactory (#6656) 2018-11-27 15:59:58 -08:00
Jihoon Son 422b76b33c Fix IndexTaskClient to retry on ChannelException (#6649)
* Fix IndexTaskClient to retry on ChannelException

* fix travis and add javadoc

* address comment
2018-11-27 15:54:38 -08:00
Mingming Qiu 9a89200607 Emit query metrics even if the ETags are equal (#6663) 2018-11-27 15:18:01 -08:00
Jihoon Son 219f0965dc Remove duplicate DataSegmentTest (#6669) 2018-11-27 15:13:39 -08:00
Clint Wylie 8f8a569aa2 faster flattening for non-existent paths (#6654)
* faster flattening for non-existent properties to circumvent upstream json-path issue

* fix json provider

* revert to using null instead of undefined
2018-11-27 14:14:11 -08:00