9894 Commits

Author SHA1 Message Date
Jihoon Son
96d8523ecb Use hash of Segment IDs instead of a list of explicit segments in auto compaction (#8571)
* IOConfig for compaction task

* add javadoc, doc, unit test

* fix webconsole test

* add spelling

* address comments

* fix build and test

* address comments
2019-10-09 11:12:00 -07:00
Jonathan Wei
526f04c47c Fix missing jackson jars for hadoop ingestion (#8652)
* Fix missing jackson jars for hadoop ingestion

* PR comments

* pom ordering

* New approach

* Remove all jackson-core/mapper-asl exclusions from hdfs storage
2019-10-08 23:54:55 -07:00
Clint Wylie
8bda3afea4 fix spelling errors triggered by another doc PR (#8653) 2019-10-08 23:43:58 -07:00
Chi Cao Minh
b6b5517c20 Speed up ParallelIndexSupervisorTask tests (#8633)
Previously, some tests for ParallelIndexSupervisorTask were being run
twice unnecessarily.
2019-10-08 19:56:12 -07:00
Nishant Bangarwa
0853273091 Add tier based usage metrics for historical nodes to help with autoscaling (#8636)
* Add tier based usage metrics for historical nodes to help with druid historical autoscaling

Add tier-based usage metrics for historical nodes to help Druid cluster orchestration systems understand historical node usage and requirements. The following metrics would be helpful:

tier/required/capacity - total capacity in bytes required in each tier. Dimension - tier
tier/total/capacity - total capacity in bytes available in a given tier. Dimension - tier
tier/historical/count - number of historical nodes available in each tier. Dimension - tier
tier/replication/factor - configured maximum replication factor in a given tier. Dimension - tier

* fix unit test failures
2019-10-08 19:55:32 -07:00
Mohammad J. Khan
18758f5228 Support LDAP authentication/authorization (#6972)
* Support LDAP authentication/authorization

* fixed integration-tests

* fixed Travis CI build errors related to druid-security module

* fixed failing test

* fixed failing test header

* added comments, force build

* fixes for strict compilation spotbugs checks

* removed authenticator rolling credential update feature

* removed escalator rolling credential update feature

* fixed teamcity inspection deprecated API usage error

* fixed checkstyle execution error, removed unused import

* removed cached config as part of removing authenticator rolling credential update feature

* removed config bundle entity as part of removing authenticator rolling credential update feature

* refactored LDAP configuration

* added support for SSLContext configuration and TLSCertificateChecker

* removed check to return authentication failure when user has no group assigned, will be checked and handled by the authorizer

* Separate out authorizer checks between metadata-backed store user and LDAP user/groups

* refactored BasicSecuritySSLSocketFactory usage to fix strict compilation spotbugs checks

* fixes build issue

* final review comments updates

* final review comments updates

* fixed LGTM and spellcheck alerts

* Fixed Avatica auth failure error message check

* Updated metadata credentials validator exception message string, replaced DB with metadata store
2019-10-08 17:08:27 -07:00
Himanshu
46ddaf3aa1 fix sorting for resultRow object when numeric dimension not in limitSpec (#8645) 2019-10-08 16:37:15 -07:00
Clint Wylie
2f20799868 merge recommendations into basic-cluster-tuning, add additional info (#8649)
* merge recommendations into basic-cluster-tuning, add additional info

* stupid sidebar
2019-10-08 16:33:54 -07:00
Himanshu
c078ed40fd groupBy query: optional limit push down to segment scan (#8426)
* groupBy query: optional limit push down to segment scan

* make segment level limit push down configurable

* fix teamcity errors

* fix segment limit pushdown flag handling on query level config override

* use equals for comparator check

* fix sql and null handling

* fix unused imports

* handle null offset in NullableValueGroupByColumnSelectorStrategy for buffer comparator similar to RowBasedGrouperHelper.NullableRowBasedKeySerdeHelper
2019-10-08 15:35:07 -07:00
Lucas Capistrant
d801ce2f29 Update rollup table to properly reflect 0.16.0 (#8638)
This table stated that `index_parallel` tasks were best-effort only. However, this changed with #8061 and this documentation update was simply missed.
2019-10-07 12:37:15 -07:00
Xavier Léauté
1d42551d95 Fix statsd types (#8628)
* fix segment underReplicated/unavailable counts to be gauges instead of counters

* fix jvm/gc/cpu to be a counter instead of timer

jvm/gc/cpu represents the total cpu time spent for multiple gc
invocations, not the time spent in each gc cycle.

the number needs to be divided by jvm/gc/count to get the average gc
time per cycle

* update docs

* fix spellcheck
2019-10-06 14:14:09 -07:00
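The jvm/gc/cpu note in the "Fix statsd types" commit (#8628) above comes down to one division. A minimal sketch, assuming both metric values have already been collected as totals over the same period. The function name is hypothetical and not part of Druid or the statsd emitter, and the time unit is whatever jvm/gc/cpu is reported in, which the commit message does not state.

```python
def average_gc_time_per_cycle(gc_cpu_total, gc_count):
    """Average GC time per cycle: total GC CPU time / number of GC invocations.

    gc_cpu_total: summed jvm/gc/cpu value over a period (a counter, per the commit).
    gc_count:     summed jvm/gc/count value over the same period.
    Returns None when no GC cycles were recorded, to avoid dividing by zero.
    """
    if gc_count <= 0:
        return None
    return gc_cpu_total / gc_count
```

Treating jvm/gc/cpu as a counter rather than a timer matters here because the value is a sum across cycles, so per-cycle figures only appear after this division.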
Himanshu
d91d1c8699 make TaskMonitor continue to monitor in the face of transient errors (#8625) 2019-10-04 09:42:20 -07:00
Parag Jain
f0d74b240d password provider for basic authentication of HttpEmitterConfig (#8618) 2019-10-02 15:59:17 -07:00
Clint Wylie
0a20caf177 update how to release doc (#8590)
* update how to release docs for staged website, updated email templates with checksums and docker

* fix typo

* review stuff

* review stuff

* hmm

* nope
2019-10-02 08:51:25 -07:00
Himanshu
0f7e0ff030 CuratorDruidNodeDiscoveryProvider: do not ignore exception in listener execution and log it (#8616) 2019-10-02 08:50:28 -07:00
Nishant Bangarwa
8537fbeca7 Implementing dropwizard emitter for druid (#7363)
* Implementing dropwizard emitter for druid

making metric manager and alert emitters as optional

* Refactor and make things work

more improvements

improve docs

refactorings

* Fix teamcity inspections

* review comments

* more review comments

* add limit to max number of gauges

* update pom version

* fix pom

* review comments

* review comment

* review comments

* fix broken doc link

review comments

review comments

* review comments

* fix checkstyle

* more spell check fixes

* fix travis failures
2019-10-01 14:59:30 -07:00
Fokko Driesprong
82bfe86d0c Make more package EverythingIsNonnullByDefault by default (#8198)
* Make more package EverythingIsNonnullByDefault by default

* Fixed additional violations after pulling in master

* Change iterator to list.addAll

* Fix annotations
2019-09-30 18:53:18 -06:00
pdeva
db65068c42 add reference to indexer nodes (#8607) 2019-09-30 16:45:33 -06:00
Vadim Ogievetsky
ce7e77e8bd readme gifs (#8608) 2019-09-29 19:32:39 -07:00
Sashidhar Thallam
51a7235ebc Making optimal usage of multiple segment cache locations (#8038)
* #7641 - Changing segment distribution algorithm to distribute segments to multiple segment cache locations

* Fixing indentation

* WIP

* Adding interface for location strategy selection, least bytes used strategy impl, round-robin strategy impl, locationSelectorStrategy config with least bytes used strategy as the default strategy

* fixing code style

* Fixing test

* Adding a method visible only for testing, fixing tests

* 1. Changing the method contract to return an iterator of locations instead of a single best location. 2. Check style fixes

* fixing the conditional statement

* Added testSegmentDistributionUsingLeastBytesUsedStrategy, fixed testSegmentDistributionUsingRoundRobinStrategy

* to trigger CI build

* Add documentation for the selection strategy configuration

* to re trigger CI build

* updated docs as per review comments, made LeastBytesUsedStorageLocationSelectorStrategy.getLocations a synchronized method, other minor fixes

* In checkLocationConfigForNull method, using getLocations() to check for null instead of directly referring to the locations variable so that tests overriding getLocations() method do not fail

* Implementing review comments. Added tests for StorageLocationSelectorStrategy

* Checkstyle fixes

* Adding java doc comments for StorageLocationSelectorStrategy interface

* checkstyle

* empty commit to retrigger build

* Empty commit

* Adding suppressions for words leastBytesUsed and roundRobin of ../docs/configuration/index.md file

* Impl review comments including updating docs as suggested

* Removing checkLocationConfigForNull(), @NotEmpty annotation serves the purpose

* Round robin iterator to keep track of the no. of iterations, impl review comments, added tests for round robin strategy

* Fixing the round robin iterator

* Removed numLocationsToTry, updated java docs

* changing property attribute value from tier to type

* Fixing assert messages
2019-09-28 00:17:44 -06:00
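The two location selection strategies named in the commit above (#8038), least bytes used and round-robin, can be illustrated with a short sketch. This is not Druid's implementation (Druid is written in Java). The class and function names below are hypothetical, and the only details taken from the commit message are that a strategy returns an iterator of candidate locations rather than a single best one, and that the round-robin iterator keeps track of its position between calls.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class Location:
    """Hypothetical stand-in for a segment cache location."""
    path: str
    bytes_used: int


def least_bytes_used(locations: List[Location]) -> Iterator[Location]:
    """Yield locations ordered from least to most bytes used."""
    return iter(sorted(locations, key=lambda loc: loc.bytes_used))


class RoundRobin:
    """Yield every location once per call, rotating the starting point."""

    def __init__(self, locations: List[Location]):
        self._locations = list(locations)
        self._start = 0  # index of the location to try first on the next call

    def get_locations(self) -> Iterator[Location]:
        n = len(self._locations)
        start = self._start
        # Advance the starting index so the next call begins one position later.
        self._start = (self._start + 1) % n if n else 0
        return iter(self._locations[(start + i) % n] for i in range(n))
```

Returning an iterator lets the caller fall through to the next candidate when the first location cannot hold the segment.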
Evan Ren
17d9d7daed Increase column size for taskID and createdTime, and decrease Type and Duration (#8594) 2019-09-27 14:19:04 -07:00
Faxian Zhao
e1b4a3ab71 bug fix for lookup leak when we remove the last lookup from lookup tier (#8598)
* bug fix for lookup leak when we remove the last lookup from lookup tier

* warnings about lookups that will never be loaded

* fix unit test
2019-09-27 03:55:02 -07:00
Fokko Driesprong
a2363b6b61 Remove commons-httpclient (#8407) 2019-09-27 02:14:58 -07:00
Fokko Driesprong
99c3e0bb3f Bump HttpClient to 4.5.10 (#8404)
* Bump HttpClient to 4.5.9

* Remove Licenses file

* Revert license

* Remove duplicate dependency

* Bump HttpClient to 4.5.10
2019-09-27 02:14:36 -07:00
Jonathan Wei
5f83cd879e Fix download link in header links (#8600) 2019-09-26 18:06:12 -07:00
Clint Wylie
7781820dea JsonParserIterator.init future timeout (#8550)
* add timeout support for JsonParserIterator init future

* add queryId

* should be less than 1

* fix

* fix npe

* fix lgtm

* adjust exception, nullable

* fix test

* refactor

* revert queryId change

* add log.warn to tie exception to json parser iterator
2019-09-27 09:13:37 +09:00
elloooooo
7f2b6577ef get active task by datasource when supervisor discover tasks (#8450)
* get active task by datasource when supervisor discover tasks

* fix ut

* fix ut

* fix ut

* remove unnecessary condition check

* fix ut

* remove stream in hot loop
2019-09-26 16:15:24 -07:00
Fangyuan Deng
a280c5dc03 fix queuedSize not decreasing in HttpLoadQueuePeon when load failed (#8596) 2019-09-26 08:48:00 -07:00
Himanshu
9f1f5e115c doubleMean aggregator to be used at query time (#8459)
* doubleMean aggregator for computing mean

* make docs

* build fixes

* address review comment: handle null args
2019-09-26 08:04:33 -07:00
Vadim Ogievetsky
f4605f45be Website: stricter replace (#8593)
* stricter replace

* better fix
2019-09-25 17:25:39 -07:00
Nishant Bangarwa
a75ddaad9e Add TrustedDomain Authenticator (#8248)
* Add TrustedDomain Authenticator

update javadoc

Add nullable annotations

Add cautionary note

fix travis failure

* add IP to spell checker
2019-09-25 11:25:03 -07:00
Vadim Ogievetsky
563718c5b2 add druid version correctly (#8586) 2019-09-24 18:19:06 -07:00
Evan Ren
0467cce7a0 Web console: Add frontend buttons to remove group by (#8537)
* Add frontend buttons to remove group by

* Change icon for remove group by

* Update web console to use latest toolkit

* Add test cases to verify that remove group by buttons are rendered

* Correct mistake of using incorrect components

* Update tests for two cases

* Put remove button after group by
2019-09-24 16:32:02 -07:00
Clint Wylie
eabddffd6e fix http firehose factory leaky connection in constructor (#8576)
* fix http firehose factory leaky connection in constructor

* stylin
2019-09-24 17:08:43 -06:00
Vadim Ogievetsky
7c14fa08f8 Web console: Expand filter UI (#8579)
* add controls to the filter UI

* fix double base
2019-09-24 12:27:46 -07:00
Himanshu
b6a16b5eb6 make it possible to not emit cache metrics and disable by default (#8561) 2019-09-24 22:12:09 +08:00
Rye
f2a444321b Added live reports for Kafka and Native batch task (#8557)
* Added live reports for Kafka and Native batch task

* Removed unused local variables

* Added the missing unit test

* Refine unit test logic, add implementation for HttpRemoteTaskRunner

* checkstyle fixes

* Update doc descriptions for updated API

* remove unnecessary files

* Fix spellcheck complaints

* More details for api descriptions
2019-09-23 21:08:36 -07:00
Vadim Ogievetsky
52f3f2c229 fix docs version interpolation (#8568) 2019-09-22 17:38:55 -07:00
Vadim Ogievetsky
2104cee79b Web console: prevent extra trim in auto complete (#8543)
* prevent extra trim in auto complete

* add unit test
2019-09-22 15:18:51 -07:00
Vadim Ogievetsky
94298f7809 Update Kafka loading docs to use the streaming data loader (#8544)
* fix redirects

* remove useless page

* fix Single server reference configurations formatting

* update batch data loading

* update Kafka docs

* fix typos and tests

* add more links

* fix spelling
2019-09-22 15:00:52 -07:00
Roman Leventov
e7cc968749 Add new items to concurrency code review checklist (#8493) 2019-09-22 14:19:37 +03:00
SandishKumarHN
ade8d1922d #8156 : StructuralSearchInspection, Prohibit check on Thread.ge… (#8394)
* StructuralSearchInspection, Prohibit check on Thread.getState()

* review changes - 1

* review changes 2

* review changes 3

* test fix

* review changes-2

* review changes-3
2019-09-22 14:12:05 +03:00
Vadim Ogievetsky
868bb42301 clean up default values and add infos (#8567) 2019-09-21 21:00:21 -07:00
Chi Cao Minh
aeac0d4fd3 Adjust defaults for hashed partitioning (#8565)
* Adjust defaults for hashed partitioning

If neither the partition size nor the number of shards are specified,
default to partitions of 5,000,000 rows (similar to the behavior of
dynamic partitions). Previously, both could be null and cause incorrect
behavior.

Specifying both a partition size and a number of shards now results in
an error instead of ignoring the partition size in favor of using the
number of shards. This is a behavior change that makes it more apparent
to the user that only one of the two properties will be honored
(previously, a message was just logged when the specified partition size
was ignored).

* Fix test

* Handle -1 as null

* Add -1 as null tests for single dim partitioning

* Simplify logic to handle -1 as null

* Address review comments
2019-09-21 20:57:40 -07:00
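The defaulting rules described in the "Adjust defaults for hashed partitioning" commit (#8565) above can be summarized in a small sketch. It follows only what the commit message states: -1 is treated as null, neither value set falls back to 5,000,000 rows per partition, and both values set is an error. The function and parameter names are hypothetical and do not mirror Druid's Java classes.

```python
DEFAULT_ROWS_PER_PARTITION = 5_000_000  # default stated in the commit message


def _normalize(value):
    """Treat -1 the same as an unset (None) value."""
    return None if value is None or value == -1 else value


def resolve_hashed_partitioning(rows_per_partition=None, num_shards=None):
    """Return (rows_per_partition, num_shards) after applying the defaults.

    Raises ValueError when both properties are specified, instead of
    silently ignoring the partition size as the old behavior did.
    """
    rows = _normalize(rows_per_partition)
    shards = _normalize(num_shards)
    if rows is not None and shards is not None:
        raise ValueError("Specify either a partition size or a number of shards, not both")
    if rows is None and shards is None:
        rows = DEFAULT_ROWS_PER_PARTITION
    return rows, shards
```

Failing fast when both are set makes it clear to the user that only one of the two properties can be honored.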
Chi Cao Minh
99b6eedab5 Rename partition spec fields (#8507)
* Rename partition spec fields

Rename partition spec fields to be consistent across the various types
(hashed, single_dim, dynamic). Specifically, use targetNumRowsPerSegment
and maxRowsPerSegment in favor of targetPartitionSize and
maxSegmentSize. Consistent and clearer names are easier for users to
understand and use.

Also fix various IntelliJ inspection warnings and doc spelling mistakes.

* Fix test

* Improve docs

* Add targetRowsPerSegment to HashedPartitionsSpec
2019-09-20 14:59:18 -06:00
Chi Cao Minh
187b507b3d Fix CuratorModule flaky test (#8562)
For CuratorModuleTest.exitsJvmWhenMaxRetriesExceeded(), the expected
log message is intermittently not the first one in the list of captured
log messages. For example, it is the second one in
https://travis-ci.org/apache/incubator-druid/jobs/586792178#L754.
2019-09-20 13:36:22 -07:00
Himanshu
62afbca7b9 update HRTR to account for task known to be running on a worker when it shows up (#8427) 2019-09-19 10:19:17 -07:00
Xavier Léauté
e184d24a74 add support for dogstatsd events in statsd-emitter (#8546)
* add support for dogstatsd events in statsd-emitter
* add option to turn on alert events (off by default)
* updated docs
2019-09-19 08:12:30 -07:00
Vadim Ogievetsky
36a6365d9f Web console: polish the data loader (#8554)
* rearrange ioConfig prop ordering in Tune step

* make useEarliestOffset mandatory

* required intent

* make segmentGranularity required for batch ingestion

* update tests
2019-09-18 20:53:18 -07:00
Evan Ren
8650ee9fd0 Web console: Druid status displayed in a table (#8484)
* Retrieved data from endpoint and displayed on table

* Added view raw button, removed json, and fixed formatting for table

* Remove error var

* Fixed snapshot for updated status dialog

* Made changes based on PR review

* Added version tag

* Updated snapshot to match changes

* Made more changes based on review

* Fix filter and formatting

* Fix filter and add unit test

* fix styling of dialog

* Fix footer height

* Fixed testing of filtering
2019-09-17 15:26:51 -06:00