Commit Graph

249 Commits

Author SHA1 Message Date
Clint Wylie 2e54755a03
add docker tutorial, friendlier docker-compose.yml, experimental java 11 dockerfile (#9262)
* add docker tutorial, experimental java 11 dockerfile

* fix typo

* spelling

* doc adjustments
2020-02-13 21:24:45 -08:00
Atul Mohan 7968524b01
Add Pig-specific file handling to Avro parser (#9258)
* Add processing for data files from AvroStorage

* Add words to spellings file
2020-02-10 21:53:11 -08:00
Zhenxiao Luo 479c09751c Add MostAvailableSizeStorageLocationSelectorStrategy (#8879)
* Add MostAvailableSize LocationSelectorStrategy

* Add doc for mostAvailableSize strategy

* Fix docs for mostAvailableSize
2020-01-23 13:42:03 -08:00
Gian Merlino d21054f7c5
Remove the deprecated interval-chunking stuff. (#9216)
* Remove the deprecated interval-chunking stuff.

See https://github.com/apache/druid/pull/6591, https://github.com/apache/druid/pull/4004#issuecomment-284171911 for details.

* Remove unused import.

* Remove chunkInterval too.
2020-01-19 17:14:23 -08:00
Jihoon Son 153495068b Doc update for the new input source and the new input format (#9171)
* Doc update for new input source and input format.

- The input source and input format are promoted in all docs under docs/ingestion
- All input sources including core extension ones are located in docs/ingestion/native-batch.md
- All input formats and parsers including core extension ones are localted in docs/ingestion/data-formats.md
- New behavior of the parallel task with different partitionsSpecs are documented in docs/ingestion/native-batch.md

* parquet

* add warning for range partitioning with sequential mode

* hdfs + s3, gs

* add fs impl for gs

* address comments

* address comments

* gcs
2020-01-17 15:52:05 -08:00
Jonathan Wei 58d337186b
Graduation update for ASF release process guide and download links (#9126)
* Graduation update for ASF release process guide and download links

* Fix release vote thread typo

* Fix pom.xml
2020-01-06 15:00:33 -06:00
Jonathan Wei aa539177ec De-incubation cleanup in code, docs, packaging (#9108)
* De-incubation cleanup in code, docs, packaging

* remove unused docs script
2020-01-03 12:33:19 -05:00
Jonathan Wei 4e8368a5d9 Set version to 0.18.0-SNAPSHOT (#9109) 2020-01-02 17:55:10 -05:00
Himanshu 9236dd9467
optionally enable Jetty ForwardedRequestCustomizer (#9010)
* optionally enable Jetty ForwardedRequestCustomizer

* fix doc build
2019-12-12 17:00:08 -08:00
Jonathan Wei 8af41d7cd0 Update version to 0.18.0-incubating-SNAPSHOT (#9009) 2019-12-11 14:04:03 -08:00
Jonathan Wei c949a25210
Add DruidInputSource (replacement for IngestSegmentFirehose) (#8982)
* Add Druid input source and format

* Inherit dims/metrics from segment

* Add ingest segment firehose reindexing test

* Remove unnecessary module

* Fix unit tests, checkstyle

* Add doc entry

* Fix dimensionExclusions handling, add parallel index integration test

* Add spelling exclusion

* Address some PR comments

* Checkstyle

* wip

* Address rest of PR comments

* Address PR comments
2019-12-05 16:50:00 -08:00
Jonathan Wei 00ce18a0ea
Additional Kinesis resharding fixes (#8870)
* Additional Kinesis resharding fixes

* Address PR comments

* Remove unused method

* Adjust SegmentTransactionalInsertAction null handling

* Check for unchanged metadata on empty publish

* Add logs for empty publish

* Fix javadoc

* Clear offset when invalid endOffsets are seen

* Fix LGTM alert

* Fix build

* Add resharding note to Kinesis docs

* Checkstyle

* Spelling

* Address PR comments

* Checkstyle
2019-11-28 12:59:01 -08:00
Clint Wylie 7250010388 add parquet support to native batch (#8883)
* add parquet support to native batch

* cleanup

* implement toJson for sampler support

* better binaryAsString test

* docs

* i hate spellcheck

* refactor toMap conversion so can be shared through flattenerMaker, default impls should be good enough for orc+avro, fixup for merge with latest

* add comment, fix some stuff

* adjustments

* fix accident

* tweaks
2019-11-22 10:49:16 -08:00
Clint Wylie 074a45219d add google cloud storage InputSource for native batch (#8907)
* add google cloud storage InputSource for native batch

* rename

* checkstyle

* fix

* fix spelling

* review comments
2019-11-19 19:49:43 -08:00
Vadim Ogievetsky 98580ffe71 adding search to docs (#8906) 2019-11-19 07:04:40 -08:00
Vadim Ogievetsky e9e1625e96 Docs: Add docsearch version (#8850)
* Add docsearch version

* remove open snas stylesheet
2019-11-09 19:51:06 -08:00
Clint Wylie 7aafcf8bca parallel broker merges on fork join pool (#8578)
* sketch of broker parallel merges done in small batches on fork join pool

* fix non-terminating sequences, auto compute parallelism

* adjust benches

* adjust benchmarks

* now hella more faster, fixed dumb

* fix

* remove comments

* log.info for debug

* javadoc

* safer block for sequence to yielder conversion

* refactor LifecycleForkJoinPool into LifecycleForkJoinPoolProvider which wraps a ForkJoinPool

* smooth yield rate adjustment, more logs to help tune

* cleanup, less logs

* error handling, bug fixes, on by default, more parallel, more tests

* remove unused var

* comments

* timeboundary mergeFn

* simplify, more javadoc

* formatting

* pushdown config

* use nanos consistently, move logs back to debug level, bit more javadoc

* static terminal result batch

* javadoc for nullability of createMergeFn

* cleanup

* oops

* fix race, add docs

* spelling, remove todo, add unhandled exception log

* cleanup, revert unintended change

* another unintended change

* review stuff

* add ParallelMergeCombiningSequenceBenchmark, fixes

* hyper-threading is the enemy

* fix initial start delay, lol

* parallelism computer now balances partition sizes to partition counts using sqrt of sequence count instead of sequence count by 2

* fix those important style issues with the benchmarks code

* lazy sequence creation for benchmarks

* more benchmark comments

* stable sequence generation time

* update defaults to use 100ms target time, 4096 batch size, 16384 initial yield, also update user docs

* add jmh thread based benchmarks, cleanup some stuff

* oops

* style

* add spread to jmh thread benchmark start range, more comments to benchmarks parameters and purpose

* retool benchmark to allow modeling more typical heterogenous heavy workloads

* spelling

* fix

* refactor benchmarks

* formatting

* docs

* add maxThreadStartDelay parameter to threaded benchmark

* why does catch need to be on its own line but else doesnt
2019-11-07 11:58:46 -08:00
Himanshu 5adc8212b4
add documentation for druid docker and k8s operator (#8802)
* add documentation for druid docker and k8s operator

* address review comment and add Kubernetes to spelling file
2019-11-06 12:56:21 -08:00
Gian Merlino b65d2ac648 Add HDFS firehose (#8754)
* Add HDFS firehose.

* Tests, support for lists of paths.

* Fixups.

* Update list of firehoses.

* Wildcards is a word.
2019-10-28 08:07:38 -07:00
Vadim Ogievetsky 9eb68b1d53
Add version to canonical URL (#8747) 2019-10-25 12:46:39 -07:00
Clint Wylie 09f92818d4 update druid expression docs to indicate that array functions do not work at indexing time (#8734)
* update druid expression docs to indicate that array functions are not supported in transformSpec

* fix unrelated spelling check
2019-10-24 22:04:08 -07:00
Vadim Ogievetsky 774ce3ce6d add canonical to the docs (#8731) 2019-10-24 15:25:57 -07:00
Vadim Ogievetsky cc3650ee3b fix doc headers (#8729) 2019-10-24 11:17:39 -07:00
Surekha 98f59ddd7e Add `sys.supervisors` table to system tables (#8547)
* Add supervisors table to SystemSchema

* Add docs

* fix checkstyle

* fix test

* fix CI

* Add comments

* Fix javadoc teamcity error

* comments

* fix links in docs

* fix links

* rename fullStatus query param to system and remove it from docs
2019-10-18 15:16:42 -07:00
Jihoon Son 30c15900be
Auto compaction based on parallel indexing (#8570)
* Auto compaction based on parallel indexing

* javadoc and doc

* typo

* update spell

* addressing comments

* address comments

* fix log

* fix build

* fix test

* increase default max input segment bytes per task

* fix test
2019-10-18 13:24:14 -07:00
Mingming Qiu 2c758ef5ff Support assign tasks to run on different categories of MiddleManagers (#7066)
* Support assign tasks to run on different tiers of MiddleManagers

* address comments

* address comments

* rename tier to category and docs

* doc

* fix doc

* fix spelling errors

* docs
2019-10-17 12:57:19 -07:00
Jonathan Wei 89ce6384f5
More Kinesis resharding adjustments (#8671)
* More Kinesis resharding adjustments

* Fix TC inspection

* Fix comment'

* Adjust comment, small refactor

* Make repartition transition time configurable

* Add spellcheck exclusion

* Spelling fix
2019-10-15 23:19:17 -07:00
Mitch Lloyd 1a78a0c98a Add credentials for ECS (#8651)
* Add credentials for ECS

* Fix import order

* Update S3 authentication methods table

* Update .spelling for new documentation
2019-10-12 09:12:14 -07:00
Jihoon Son 96d8523ecb Use hash of Segment IDs instead of a list of explicit segments in auto compaction (#8571)
* IOConfig for compaction task

* add javadoc, doc, unit test

* fix webconsole test

* add spelling

* address comments

* fix build and test

* address comments
2019-10-09 11:12:00 -07:00
Clint Wylie 8bda3afea4 fix spelling errors triggered by another doc PR (#8653) 2019-10-08 23:43:58 -07:00
Mohammad J. Khan 18758f5228 Support LDAP authentication/authorization (#6972)
* Support LDAP authentication/authorization

* fixed integration-tests

* fixed Travis CI build errors related to druid-security module

* fixed failing test

* fixed failing test header

* added comments, force build

* fixes for strict compilation spotbugs checks

* removed authenticator rolling credential update feature

* removed escalator rolling credential update feature

* fixed teamcity inspection deprecated API usage error

* fixed checkstyle execution error, removed unused import

* removed cached config as part of removing authenticator rolling credential update feature

* removed config bundle entity as part of removing authenticator rolling credential update feature

* refactored ldao configuration

* added support for SSLContext configuration and TLSCertificateChecker

* removed check to return authentication failure when user has no group assigned, will be checked and handled by the authorizer

* Separate out authorizer checks between metadata-backed store user and LDAP user/groups

* refactored BasicSecuritySSLSocketFactory usage to fix strict compilation spotbugs checks

* fixes build issue

* final review comments updates

* final review comments updates

* fixed LGTM and spellcheck alerts

* Fixed Avatica auth failure error message check

* Updated metadata credentials validator exception message string, replaced DB with metadata store
2019-10-08 17:08:27 -07:00
Clint Wylie 2f20799868 merge recommendations into basic-cluster-tuning, add additional info (#8649)
* merge recommendations into basic-cluster-tuning, add additional info

* stupid sidebar
2019-10-08 16:33:54 -07:00
Nishant Bangarwa 8537fbeca7 Implementing dropwizard emitter for druid (#7363)
* Implementing dropwizard emitter for druid

making metric manager and alert emitters as optional

* Refactor and make things work

more improvements

improve docs

refactrings

* Fix teamcity inspections

* review comments

* more review comments

* add limit to max number of gauges

* update pom version

* fix pom

* review comments

* review comment

* review comments

* fix broken doc link

review comments

review comments

* review comments

* fix checkstyle

* more spell check fixes

* fix travis failures
2019-10-01 14:59:30 -07:00
Sashidhar Thallam 51a7235ebc Making optimal usage of multiple segment cache locations (#8038)
* #7641 - Changing segment distribution algorithm to distribute segments to multiple segment cache locations

* Fixing indentation

* WIP

* Adding interface for location strategy selection, least bytes used strategy impl, round-robin strategy impl, locationSelectorStrategy config with least bytes used strategy as the default strategy

* fixing code style

* Fixing test

* Adding a method visible only for testing, fixing tests

* 1. Changing the method contract to return an iterator of locations instead of a single best location. 2. Check style fixes

* fixing the conditional statement

* Added testSegmentDistributionUsingLeastBytesUsedStrategy, fixed testSegmentDistributionUsingRoundRobinStrategy

* to trigger CI build

* Add documentation for the selection strategy configuration

* to re trigger CI build

* updated docs as per review comments, made LeastBytesUsedStorageLocationSelectorStrategy.getLocations a synchronzied method, other minor fixes

* In checkLocationConfigForNull method, using getLocations() to check for null instead of directly referring to the locations variable so that tests overriding getLocations() method do not fail

* Implementing review comments. Added tests for StorageLocationSelectorStrategy

* Checkstyle fixes

* Adding java doc comments for StorageLocationSelectorStrategy interface

* checkstyle

* empty commit to retrigger build

* Empty commit

* Adding suppressions for words leastBytesUsed and roundRobin of ../docs/configuration/index.md file

* Impl review comments including updating docs as suggested

* Removing checkLocationConfigForNull(), @NotEmpty annotation serves the purpose

* Round robin iterator to keep track of the no. of iterations, impl review comments, added tests for round robin strategy

* Fixing the round robin iterator

* Removed numLocationsToTry, updated java docs

* changing property attribute value from tier to type

* Fixing assert messages
2019-09-28 00:17:44 -06:00
Jonathan Wei 5f83cd879e Fix download link in header links (#8600) 2019-09-26 18:06:12 -07:00
Vadim Ogievetsky f4605f45be Website: stricter replace (#8593)
* stricter replace

* better fix
2019-09-25 17:25:39 -07:00
Nishant Bangarwa a75ddaad9e Add TrustedDomain Authenticator (#8248)
* Add TrustedDomain Authenticator

update javadoc

Add nullable annotations

Add cautionary note

fix travis failure

* add IP to spell checker
2019-09-25 11:25:03 -07:00
Vadim Ogievetsky 563718c5b2 add druid version correctly (#8586) 2019-09-24 18:19:06 -07:00
Vadim Ogievetsky 52f3f2c229 fix docs version interpolation (#8568) 2019-09-22 17:38:55 -07:00
Vadim Ogievetsky 94298f7809 Update Kafka loading docs to use the streaming data loader (#8544)
* fix redirects

* remove useless page

* fix Single server reference configurations formatting

* update batch data loading

* update Kafka docs

* fix typos and tests

* add more links

* fix spelling
2019-09-22 15:00:52 -07:00
Chi Cao Minh 99b6eedab5 Rename partition spec fields (#8507)
* Rename partition spec fields

Rename partition spec fields to be consistent across the various types
(hashed, single_dim, dynamic). Specifically, use targetNumRowsPerSegment
and maxRowsPerSegment in favor of targetPartitionSize and
maxSegmentSize. Consistent and clearer names are easier for users to
understand and use.

Also fix various IntelliJ inspection warnings and doc spelling mistakes.

* Fix test

* Improve docs

* Add targetRowsPerSegment to HashedPartitionsSpec
2019-09-20 14:59:18 -06:00
Chi Cao Minh 7dcbaca658 Spellcheck docs (#8548)
* Spellcheck docs

Fix spelling mistakes in docs and add CI job for running spellcheck on
docs.

* Add missing license header
2019-09-17 12:47:30 -07:00
Chi Cao Minh 8b7b412d9d Add 'edit' and 'copy code' buttons to docs (#8483)
Having an 'edit' button will help encourage doc contributions from the
community.

Having 'copy code' buttons on the code blocks is especially convenient
for the quickstart tutorials in the docs.
2019-09-13 16:56:50 -07:00
Clint Wylie 75978e5b98 move google ext docs from contrib to core (#8512)
* move google ext docs from contrib to core

* fix links

* revert unintended change

* more links, add note to example ext doc that it was removed, unlink from sidebar
2019-09-12 09:40:39 -07:00
Clint Wylie 067c30c8b2
add option to control version of website build (#8482) 2019-09-06 19:18:00 -07:00
Clint Wylie c73a489335
bump master version to 0.17.0-incubating-SNAPSHOT (#8421) 2019-08-28 01:58:36 -07:00
Clint Wylie 7afe473fd3 How to asf release (#8370)
* add ASF release manager guide

* fix broken link

* fix bold

* fix order

* clean up

* oops

* pom

* more

* fix

* fixes

* fix

* fix
2019-08-27 18:36:13 -07:00
Benedict Jin e1f94a5e26 Reduce the size of images with lossless compression (#8358) 2019-08-21 13:29:30 -07:00
Gian Merlino d007477742
Docusaurus build framework + ingestion doc refresh. (#8311)
* Docusaurus build framework + ingestion doc refresh.

* stick to npm instead of yarn

* fix typos

* restore some _bin

* Adjustments.

* detect and fix redirect anchors

* update anchor lint

* Web-console: remove specific column filters (#8343)

* add clear filter

* update tool kit

* remove usless check

* auto run

* add %

* Fix resource leak (#8337)

* Fix resource leak

* Patch comments

* Enable Spotbugs NP_NONNULL_RETURN_VIOLATION (#8234)

* Fixes from PR review.

* Fix more anchors.

* Preamble nix.

* Fix more anchors, headers

* clean up placeholder page

* add to website lint to travis config

* better broken link checking

* travis fix

* Fixed more broken links

* better redirects

* unfancy catch

* fix LGTM error

* link fixes

* fix md issues

* Addl fixes
2019-08-20 21:48:59 -07:00