* Doc update for new input source and input format.
- The input source and input format are promoted in all docs under docs/ingestion
- All input sources including core extension ones are located in docs/ingestion/native-batch.md
- All input formats and parsers including core extension ones are localted in docs/ingestion/data-formats.md
- New behavior of the parallel task with different partitionsSpecs are documented in docs/ingestion/native-batch.md
* parquet
* add warning for range partitioning with sequential mode
* hdfs + s3, gs
* add fs impl for gs
* address comments
* address comments
* gcs
* Tutorials use new ingestion spec where possible
There are 2 main changes
* Use task type index_parallel instead of index
* Remove the use of parser + firehose in favor of inputFormat + inputSource
index_parallel is the preferred method starting in 0.17. Setting the job to
index_parallel with the default maxNumConcurrentSubTasks(1) is the equivalent
of an index task
Instead of using a parserSpec, dimensionSpec and timestampSpec have been
promoted to the dataSchema. The format is described in the ioConfig as the
inputFormat.
There are a few cases where the new format is not supported
* Hadoop must use firehoses instead of the inputSource and inputFormat
* There is no equivalent of a combining firehose as an inputSource
* A Combining firehose does not support index_parallel
* fix typo
* add prefixes support to google input source, making it symmetrical-ish with s3
* docs
* more better, and tests
* unused
* formatting
* javadoc
* dependencies
* oops
* review comments
* better javadoc
* add s3 input source for native batch ingestion
* add docs
* fixes
* checkstyle
* lazy splits
* fixes and hella tests
* fix it
* re-use better iterator
* use key
* javadoc and checkstyle
* exception
* oops
* refactor to use S3Coords instead of URI
* remove unused code, add retrying stream to handle s3 stream
* remove unused parameter
* update to latest master
* use list of objects instead of object
* serde test
* refactor and such
* now with the ability to compile
* fix signature and javadocs
* fix conflicts yet again, fix S3 uri stuffs
* more tests, enforce uri for bucket
* javadoc
* oops
* abstract class instead of interface
* null or empty
* better error
* add parquet support to native batch
* cleanup
* implement toJson for sampler support
* better binaryAsString test
* docs
* i hate spellcheck
* refactor toMap conversion so can be shared through flattenerMaker, default impls should be good enough for orc+avro, fixup for merge with latest
* add comment, fix some stuff
* adjustments
* fix accident
* tweaks
If the JDBC drivers are missing from the lookup extensions, throw an
exception that directs the user how to resolve the issue. This change is
a follow up to #8825.
* Add reference to `druid.storage.type`
This should be in here. Without setting storage type to S3 globally it will obviously not be used, even if all other parameters are correct.
* Update s3.md
Add global storage parameter to knob table.
* Update s3.md
* Add option lateMessageRejectionStartDate
* Use option lateMessageRejectionStartDate
* Fix tests
* Add lateMessageRejectionStartDate to kafka indexing service
* Update tests kafka indexing service
* Fix tests for KafkaSupervisorTest
* Add lateMessageRejectionStartDate to KinesisSupervisorIOConfig
* Fix var name
* Update documentation
* Add check lateMessageRejectionStartDateTime and lateMessageRejectionPeriod, fails if both were specified.
Since it hasn't received updates or community interest in a while, it makes sense
to de-emphasize it in the distribution and most documentation (outside of simple
mentions of its existence).
* Support LDAP authentication/authorization
* fixed integration-tests
* fixed Travis CI build errors related to druid-security module
* fixed failing test
* fixed failing test header
* added comments, force build
* fixes for strict compilation spotbugs checks
* removed authenticator rolling credential update feature
* removed escalator rolling credential update feature
* fixed teamcity inspection deprecated API usage error
* fixed checkstyle execution error, removed unused import
* removed cached config as part of removing authenticator rolling credential update feature
* removed config bundle entity as part of removing authenticator rolling credential update feature
* refactored ldao configuration
* added support for SSLContext configuration and TLSCertificateChecker
* removed check to return authentication failure when user has no group assigned, will be checked and handled by the authorizer
* Separate out authorizer checks between metadata-backed store user and LDAP user/groups
* refactored BasicSecuritySSLSocketFactory usage to fix strict compilation spotbugs checks
* fixes build issue
* final review comments updates
* final review comments updates
* fixed LGTM and spellcheck alerts
* Fixed Avatica auth failure error message check
* Updated metadata credentials validator exception message string, replaced DB with metadata store
* move google ext docs from contrib to core
* fix links
* revert unintended change
* more links, add note to example ext doc that it was removed, unlink from sidebar