Commit Graph

183 Commits

Author SHA1 Message Date
Jihoon Son 153495068b Doc update for the new input source and the new input format (#9171)
* Doc update for new input source and input format.

- The input source and input format are promoted in all docs under docs/ingestion
- All input sources including core extension ones are located in docs/ingestion/native-batch.md
- All input formats and parsers including core extension ones are localted in docs/ingestion/data-formats.md
- New behavior of the parallel task with different partitionsSpecs are documented in docs/ingestion/native-batch.md

* parquet

* add warning for range partitioning with sequential mode

* hdfs + s3, gs

* add fs impl for gs

* address comments

* address comments

* gcs
2020-01-17 15:52:05 -08:00
singh 936b9bdfd0 add deets about the keyfile (#9209) 2020-01-17 11:24:49 -08:00
Suneet Saldanha 92ac22d060 Link javaOpts to middlemanager runtime.properties docs (#9101)
* Link javaOpts to middlemanager runtime.properties docs

* fix broken link

* reword config links
2020-01-15 21:22:49 -08:00
Suneet Saldanha 85a3d416b0 Tutorials use new ingestion spec where possible (#9155)
* Tutorials use new ingestion spec where possible

There are 2 main changes
  * Use task type index_parallel instead of index
  * Remove the use of parser + firehose in favor of inputFormat + inputSource

index_parallel is the preferred method starting in 0.17. Setting the job to
index_parallel with the default maxNumConcurrentSubTasks(1) is the equivalent
of an index task

Instead of using a parserSpec, dimensionSpec and timestampSpec have been
promoted to the dataSchema. The format is described in the ioConfig as the
inputFormat.

There are a few cases where the new format is not supported
 * Hadoop must use firehoses instead of the inputSource and inputFormat
 * There is no equivalent of a combining firehose as an inputSource
 * A Combining firehose does not support index_parallel

* fix typo
2020-01-15 14:08:29 -08:00
Jonathan Wei d1500c1328 Update Kinesis resharding information about task failures (#9104) 2020-01-07 15:44:48 -08:00
Jonathan Wei aa539177ec De-incubation cleanup in code, docs, packaging (#9108)
* De-incubation cleanup in code, docs, packaging

* remove unused docs script
2020-01-03 12:33:19 -05:00
Clint Wylie 5ecdf94d83
add 'prefixes' support to google input source (#8930)
* add prefixes support to google input source, making it symmetrical-ish with s3

* docs

* more better, and tests

* unused

* formatting

* javadoc

* dependencies

* oops

* review comments

* better javadoc
2019-12-04 21:01:10 -08:00
Jonathan Wei 00ce18a0ea
Additional Kinesis resharding fixes (#8870)
* Additional Kinesis resharding fixes

* Address PR comments

* Remove unused method

* Adjust SegmentTransactionalInsertAction null handling

* Check for unchanged metadata on empty publish

* Add logs for empty publish

* Fix javadoc

* Clear offset when invalid endOffsets are seen

* Fix LGTM alert

* Fix build

* Add resharding note to Kinesis docs

* Checkstyle

* Spelling

* Address PR comments

* Checkstyle
2019-11-28 12:59:01 -08:00
Clint Wylie 4458113375
S3 input source (#8903)
* add s3 input source for native batch ingestion

* add docs

* fixes

* checkstyle

* lazy splits

* fixes and hella tests

* fix it

* re-use better iterator

* use key

* javadoc and checkstyle

* exception

* oops

* refactor to use S3Coords instead of URI

* remove unused code, add retrying stream to handle s3 stream

* remove unused parameter

* update to latest master

* use list of objects instead of object

* serde test

* refactor and such

* now with the ability to compile

* fix signature and javadocs

* fix conflicts yet again, fix S3 uri stuffs

* more tests, enforce uri for bucket

* javadoc

* oops

* abstract class instead of interface

* null or empty

* better error
2019-11-25 22:31:19 -08:00
Clint Wylie 7250010388 add parquet support to native batch (#8883)
* add parquet support to native batch

* cleanup

* implement toJson for sampler support

* better binaryAsString test

* docs

* i hate spellcheck

* refactor toMap conversion so can be shared through flattenerMaker, default impls should be good enough for orc+avro, fixup for merge with latest

* add comment, fix some stuff

* adjustments

* fix accident

* tweaks
2019-11-22 10:49:16 -08:00
Clint Wylie d67c3c7aed document SQL compatible null handling mode (#8894)
* document SQL compatible null handling mode

* adjustments

* fix docs

* review changes
2019-11-20 06:52:20 -08:00
Clint Wylie 074a45219d add google cloud storage InputSource for native batch (#8907)
* add google cloud storage InputSource for native batch

* rename

* checkstyle

* fix

* fix spelling

* review comments
2019-11-19 19:49:43 -08:00
Chi Cao Minh d60978343a Improve missing JDBC driver error for lookups (#8872)
If the JDBC drivers are missing from the lookup extensions, throw an
exception that directs the user how to resolve the issue. This change is
a follow up to #8825.
2019-11-18 11:42:38 -08:00
fst0 80dbf44fca Add reference to druid.storage.type (#8857)
* Add reference to `druid.storage.type`

This should be in here. Without setting storage type to S3 globally it will obviously not be used, even if all other parameters are correct.

* Update s3.md

Add global storage parameter to knob table.

* Update s3.md
2019-11-13 10:03:41 -08:00
Lucas Capistrant a066cc5648 Fix groupMapping endpoint URIs in druid-basic-security doc (#8847) 2019-11-12 21:12:34 +05:30
Jad Naous ce3c0dae4d Add note on JDBC libs for lookups (#8825)
* Add note on JDBC libs for lookups

* Fix directory and additional "the"
2019-11-06 13:31:26 -08:00
Giuseppe Martino 9c171e2b1f Message rejection absolute date (#8656)
* Add option lateMessageRejectionStartDate

* Use option lateMessageRejectionStartDate

* Fix tests

* Add lateMessageRejectionStartDate to kafka indexing service

* Update tests kafka indexing service

* Fix tests for KafkaSupervisorTest

* Add lateMessageRejectionStartDate to KinesisSupervisorIOConfig

* Fix var name

* Update documentation

* Add check lateMessageRejectionStartDateTime and lateMessageRejectionPeriod, fails if both were specified.
2019-10-31 15:13:02 -07:00
Gian Merlino 7605c23354 Remove Tranquility configs and certain doc references. (#8793)
Since it hasn't received updates or community interest in a while, it makes sense
to de-emphasize it in the distribution and most documentation (outside of simple
mentions of its existence).
2019-10-30 16:30:16 -07:00
Gian Merlino aa81253cf4 Fix typos. (#8767) 2019-10-28 12:47:01 -07:00
Gian Merlino b65d2ac648 Add HDFS firehose (#8754)
* Add HDFS firehose.

* Tests, support for lists of paths.

* Fixups.

* Update list of firehoses.

* Wildcards is a word.
2019-10-28 08:07:38 -07:00
Eyal Yurman 14e33428f0 Moving Average extention: Add Sum averagers (#8511)
* Add sum averagers.

* avoid casting double to long.
2019-10-24 16:37:24 -07:00
Kamal Gurala 3ed5f9698a gcs prefix doc fix (#8699) 2019-10-21 08:29:54 -07:00
Jonathan Wei d88075237a
Add initial SQL support for non-expression sketch postaggs (#8487)
* Add initial SQL support for non-expression sketch postaggs

* Checkstyle, spotbugs

* checkstyle

* imports

* Update SQL docs

* Checkstyle

* Fix theta sketch operator docs

* PR comments

* Checkstyle fixes

* Add missing entries for HLL sketch module

* PR comments, add round param to HLL estimate operator, fix optional HLL param
2019-10-18 14:59:44 -07:00
Jonathan Wei 89ce6384f5
More Kinesis resharding adjustments (#8671)
* More Kinesis resharding adjustments

* Fix TC inspection

* Fix comment'

* Adjust comment, small refactor

* Make repartition transition time configurable

* Add spellcheck exclusion

* Spelling fix
2019-10-15 23:19:17 -07:00
Mitch Lloyd 1a78a0c98a Add credentials for ECS (#8651)
* Add credentials for ECS

* Fix import order

* Update S3 authentication methods table

* Update .spelling for new documentation
2019-10-12 09:12:14 -07:00
Mohammad J. Khan 18758f5228 Support LDAP authentication/authorization (#6972)
* Support LDAP authentication/authorization

* fixed integration-tests

* fixed Travis CI build errors related to druid-security module

* fixed failing test

* fixed failing test header

* added comments, force build

* fixes for strict compilation spotbugs checks

* removed authenticator rolling credential update feature

* removed escalator rolling credential update feature

* fixed teamcity inspection deprecated API usage error

* fixed checkstyle execution error, removed unused import

* removed cached config as part of removing authenticator rolling credential update feature

* removed config bundle entity as part of removing authenticator rolling credential update feature

* refactored ldao configuration

* added support for SSLContext configuration and TLSCertificateChecker

* removed check to return authentication failure when user has no group assigned, will be checked and handled by the authorizer

* Separate out authorizer checks between metadata-backed store user and LDAP user/groups

* refactored BasicSecuritySSLSocketFactory usage to fix strict compilation spotbugs checks

* fixes build issue

* final review comments updates

* final review comments updates

* fixed LGTM and spellcheck alerts

* Fixed Avatica auth failure error message check

* Updated metadata credentials validator exception message string, replaced DB with metadata store
2019-10-08 17:08:27 -07:00
Vadim Ogievetsky 94298f7809 Update Kafka loading docs to use the streaming data loader (#8544)
* fix redirects

* remove useless page

* fix Single server reference configurations formatting

* update batch data loading

* update Kafka docs

* fix typos and tests

* add more links

* fix spelling
2019-09-22 15:00:52 -07:00
Xavier Léauté e184d24a74
add support for dogstatsd events in statsd-emitter (#8546)
* add support for dogstatsd events in statsd-emitter
* add option to turn on alert events (off by default)
* updated docs
2019-09-19 08:12:30 -07:00
Chi Cao Minh 7dcbaca658 Spellcheck docs (#8548)
* Spellcheck docs

Fix spelling mistakes in docs and add CI job for running spellcheck on
docs.

* Add missing license header
2019-09-17 12:47:30 -07:00
Clint Wylie 75978e5b98 move google ext docs from contrib to core (#8512)
* move google ext docs from contrib to core

* fix links

* revert unintended change

* more links, add note to example ext doc that it was removed, unlink from sidebar
2019-09-12 09:40:39 -07:00
Clint Wylie fb078eea1e
fix web-console build in src distribution, fix kafka doc minimum version (#8502) 2019-09-10 21:01:07 -07:00
Clint Wylie fd58fbc8d3
fix statds dogstatsdServiceAsTag docs example to match behavior (#8477) 2019-09-05 19:05:25 -07:00
Gian Merlino d007477742
Docusaurus build framework + ingestion doc refresh. (#8311)
* Docusaurus build framework + ingestion doc refresh.

* stick to npm instead of yarn

* fix typos

* restore some _bin

* Adjustments.

* detect and fix redirect anchors

* update anchor lint

* Web-console: remove specific column filters (#8343)

* add clear filter

* update tool kit

* remove usless check

* auto run

* add %

* Fix resource leak (#8337)

* Fix resource leak

* Patch comments

* Enable Spotbugs NP_NONNULL_RETURN_VIOLATION (#8234)

* Fixes from PR review.

* Fix more anchors.

* Preamble nix.

* Fix more anchors, headers

* clean up placeholder page

* add to website lint to travis config

* better broken link checking

* travis fix

* Fixed more broken links

* better redirects

* unfancy catch

* fix LGTM error

* link fixes

* fix md issues

* Addl fixes
2019-08-20 21:48:59 -07:00