* orc extension reworked to use apache orc map-reduce lib, moved to core extensions, support for flattenSpec, tests, docs
* change binary handling to be compatible with avro and parquet, Rows.objectToStrings now converts byte[] to base64, change date handling
* better docs and tests
* fix it
* formatting
* doc fix
* fix it
* exclude redundant dependencies
* use latest orc-mapreduce, add hadoop jobProperties recommendations to docs
* doc fix
* review stuff and fix binaryAsString
* cache for root level fields
* more better
* Support kafka transactional topics
* update kafka to version 2.0.0
* Remove the skipOffsetGaps option since it's not used anymore
* Adjust kafka consumer to use transactional semantics
* Update tests
* Remove unused import from test
* Fix compilation
* Invoke transaction api to fix a unit test
* temporary modification of travis.yml for debugging
* another attempt to get travis tasklogs
* update kafka to 2.0.1 at all places
* Remove druid-kafka-eight dependency from integration-tests, remove the kafka firehose test and deprecate kafka-eight classes
* Add deprecated in docs for kafka-eight and kafka-simple extensions
* Remove skipOffsetGaps and code changes for transaction support
* Fix indentation
* remove skipOffsetGaps from kinesis
* Add transaction api to KafkaRecordSupplierTest
* Fix indent
* Fix test
* update kafka version to 2.1.0
* blooming aggs
* partially address review
* fix docs
* minor test refactor after rebase
* use copied bloomkfilter
* add ByteBuffer methods to BloomKFilter to allow agg to use in place, simplify some things, more tests
* add methods to BloomKFilter to get number of set bits, use in comparator, fixes
* more docs
* fix
* fix style
* simplify bloomfilter bytebuffer merge, change methods to allow passing buffer offsets
* oof, more fixes
* more sane docs example
* fix it
* do the right thing in the right place
* formatting
* fix
* avoid conflict
* typo fixes, faster comparator, docs for comparator behavior
* unused imports
* use buffer comparator instead of deserializing
* striped readwrite lock for buffer agg, null handling comparator, other review changes
* style fixes
* style
* remove sync for now
* oops
* consistency
* inspect runtime shape of selector instead of selector plus, static comparator, add inner exception on serde exception
* CardinalityBufferAggregator inspect selectors instead of selectorPluses
* fix style
* refactor away from using ColumnSelectorPlus and ColumnSelectorStrategyFactory to instead use specialized aggregators for each supported column type, other review comments
* adjustment
* fix teamcity error?
* rename nil aggs to empty, change empty agg constructor signature, add comments
* use stringutils base64 stuff to be chill with master
* add aggregate combiner, comment
* add "bloom_filter_test" druid expression to support bloom filters in ExpressionVirtualColumn and ExpressionDimFilter and sql expressions
* more docs
* use java.util.Base64, doc fixes
* created seekablestream classes
* created seekablestreamsupervisor class
* first attempt to integrate kafka indexing service to use SeekableStream
* seekablestream bug fixes
* kafkarecordsupplier
* integrated kafka indexing service with seekablestream
* implemented resume/suspend and refactored some package names
* moved kinesis indexing service into core druid extensions
* merged some changes from kafka supervisor race condition
* integrated kinesis-indexing-service with seekablestream
* unit tests for kinesis-indexing-service
* various bug fixes for kinesis-indexing-service
* refactored kinesisindexingtask
* finished up more kinesis unit tests
* more bug fixes for kinesis-indexing-service
* finished refactoring kinesis unit tests
* removed KinesisPartitions and KafkaPartitions to use SeekableStreamPartitions
* kinesis-indexing-service code cleanup and docs
* merge #6291
merge #6337
merge #6383
* added more docs and reordered methods
* fixed kinesis tests after merging master and added docs in seekablestream
* fix various things from pr comment
* improve recordsupplier and add unit tests
* migrated to aws-java-sdk-kinesis
* merge changes from master
* fix pom files and forbiddenapi checks
* checkpoint JavaType bug fix
* fix pom and stuff
* disable checkpointing in kinesis
* fix kinesis sequence number null in closed shard
* merge changes from master
* fixes for kinesis tasks
* capitalized <partitionType, sequenceType>
* removed abstract class loggers
* conform to guava api restrictions
* add docker for travis other modules test
* address comments
* improve RecordSupplier to supply records in batch
* fix strict compile issue
* add test scope for localstack dependency
* kinesis indexing task refactoring
* comments
* github comments
* minor fix
* removed unneeded readme
* fix deserialization bug
* fix various bugs
* KinesisRecordSupplier unable to catch up to earliest position in stream bug fix
* minor changes to kinesis
* implement deaggregate for kinesis
* Merge remote-tracking branch 'upstream/master' into seekablestream
* fix kinesis offset discrepancy with kafka
* kinesis record supplier disable getPosition
* pr comments
* mock for kinesis tests and remove docker dependency for unit tests
* PR comments
* avg lag in kafkasupervisor #6587
* refactored SequenceMetadata in taskRunners
* small fix
* more small fix
* recordsupplier resource leak
* revert .travis.yml formatting
* fix style
* kinesis docs
* doc part2
* more docs
* comments
* comments*2
* revert string replace changes
* comments
* teamcity
* comments part 1
* comments part 2
* comments part 3
* merge #6754
* fix injection binding
* comments
* KinesisRegion refactor
* comments part idk lol
* can't think of a commit msg anymore
* remove possiblyResetDataSourceMetadata() for IncrementalPublishingTaskRunner
* commmmmmmmmmments
* extra error handling in KinesisRecordSupplier getRecords
* comments
* quickfix
* typo
* oof
* move parquet-extensions from contrib to core, adds new hadoop parquet parser that does not convert to avro first and supports flattenSpec and int96 columns, add support for flattenSpec for parquet-avro conversion parser, much test with a bunch of files lifted from spark-sql
* fix avro flattener to support nullable primitives for auto discovery and now only supports primitive arrays instead of all arrays
* remove leftover print
* convert micro timestamp to millis
* checkstyle
* add ignore for .parquet and .parq to rat exclude
* fix legit test failure from avro flattener behavior change
* fix rebase
* add exclusions to pom to cut down on redundant jars
* refactor tests, add support for unwrapping lists for parquet-avro, review comments
* more comment
* fix oops
* tweak parquet-avro list handling
* more docs
* fix style
* grr styles
* include mysql-metadata-storage extension in distribution, but without the GPL-licensed connector library
* Install mysql connector package
* use symlinks to avoid versioning issues
* add documentation for fetching the mysql connector
* 'suspend' and 'resume' support for kafka indexing service
changes:
* introduces `SuspendableSupervisorSpec` interface to describe supervisors which support suspend/resume functionality controlled through the `SupervisorManager`, which will gracefully shut down the supervisor and its tasks, update its `SupervisorSpec` with either a suspended or running state, and update with the toggled spec. Spec updates are provided by `SuspendableSupervisorSpec.createSuspendedSpec` and `SuspendableSupervisorSpec.createRunningSpec` respectively.
* `KafkaSupervisorSpec` extends `SuspendableSupervisorSpec` and now supports suspend/resume functionality. The difference in behavior between 'running' and 'suspended' state is whether the supervisor will attempt to ensure that indexing tasks are or are not running respectively. Behavior is identical otherwise.
* `SupervisorResource` now provides `/druid/indexer/v1/supervisor/{id}/suspend` and `/druid/indexer/v1/supervisor/{id}/resume` which are used to suspend/resume suspendable supervisors
* Deprecated `/druid/indexer/v1/supervisor/{id}/shutdown` and moved its functionality to `/druid/indexer/v1/supervisor/{id}/terminate` since 'shutdown' is ambiguous verbiage for something that effectively stops a supervisor forever
* Added ability to get all supervisor specs from `/druid/indexer/v1/supervisor` by supplying the 'full' query parameter `/druid/indexer/v1/supervisor?full` which will return a list of json objects of the form `{"id":<id>, "spec":<SupervisorSpec>}`
* Updated overlord console ui to enable suspend/resume, and changed 'shutdown' to 'terminate'
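
A minimal sketch of how the supervisor paths listed above fit together, for illustration only. The helper names `supervisor_action_path` and `all_specs_path` are hypothetical and not part of Druid. Only the paths come from the change list, and the HTTP method is deliberately left out because the list does not state it.

```python
from urllib.parse import quote

# Base path taken from the change list above.
SUPERVISOR_BASE = "/druid/indexer/v1/supervisor"

# Actions named in the change list; "shutdown" is the deprecated
# predecessor of "terminate".
ACTIONS = {"suspend", "resume", "terminate", "shutdown"}


def supervisor_action_path(supervisor_id: str, action: str) -> str:
    """Build the path for a per-supervisor action (hypothetical helper)."""
    if action not in ACTIONS:
        raise ValueError(f"unknown supervisor action: {action}")
    # Percent-encode the id so characters such as '/' cannot alter the path.
    return f"{SUPERVISOR_BASE}/{quote(supervisor_id, safe='')}/{action}"


def all_specs_path() -> str:
    """Path that returns every supervisor as {"id": ..., "spec": ...}."""
    return f"{SUPERVISOR_BASE}?full"
```

For example, `supervisor_action_path("my-kafka", "suspend")` yields the suspend path for a supervisor whose id is assumed to be `my-kafka`.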
* move overlord console status to own column in supervisor table so it does not look like garbage
* spacing
* padding
* other kind of spacing
* fix rebase fail
* fix more better
* all supervisors now suspendable, updated materialized view supervisor to support suspend, more tests
* fix log
* resolves #5898 by adding maxTotalRows to incremental publishing kafka index task and appenderator based realtime indexing task, as available in IndexTask
* address review comments
* changes due to review
* merge fail
* Rename io.druid to org.apache.druid.
* Fix META-INF files and remove some benchmark results.
* MonitorsConfig update for metrics package migration.
* Reorder some dimensions in inner queries for some reason.
* Fix protobuf tests.
* Add PostgreSQLConnectorConfig to expose SSL configuration options for the Postgres Metadata Storage module.
* Fix checkstyle violations and add license header
* Convert properties in the postgres docs to be the full property path and fix typo
* Fix grammar in sslFactory docs
* Native parallel indexing without shuffle
* fix build
* fix ci
* fix ingestion without intervals
* fix retry
* fix retry
* add it test
* use chat handler
* fix build
* add docs
* fix ITUnionQueryTest
* fix failures
* disable metrics reporting
* working
* Fix split of static-s3 firehose
* Add endpoints to supervisor task and a unit test for endpoints
* increase timeout in test
* Added doc
* Address comments
* Fix overlapping locks
* address comments
* Fix static s3 firehose
* Fix test
* fix build
* fix test
* fix typo in docs
* add missing maxBytesInMemory to doc
* address comments
* fix race in test
* fix test
* Rename to ParallelIndexSupervisorTask
* fix teamcity
* address comments
* Fix license
* addressing comments
* addressing comments
* indexTaskClient-based segmentAllocator instead of CountingActionBasedSegmentAllocator
* Fix race in TaskMonitor and move HTTP endpoints to supervisorTask from runner
* Add more javadocs
* use StringUtils.nonStrictFormat for logging
* fix typo and remove unused class
* fix tests
* change package
* fix strict build
* tmp
* Fix overlord api according to the recent change in master
* Fix it test