10775 Commits

Author SHA1 Message Date
Charles Smith 99494e3d16
suggest index parallel for native batch reindexing > 1GB (#10788) 2021-01-22 21:54:28 -08:00
Clint Wylie cd6af93274
add leftover tests from #10743 (#10766) 2021-01-22 09:20:48 -08:00
zhangyue19921010 bf1d1d583b
modify (#10778)
Co-authored-by: yuezhang <yuezhang@freewheel.tv>
2021-01-22 09:20:13 -08:00
zhangyue19921010 8c6153d511
[Bug Fix] Broker does not wait for its SQL metadata view to fully initialize before starting up, even when awaitInitializationOnStart is set to true (#10779)
* enhance the logic of Start up DruidSchema immediately if there are no segments.

* add UT to test DruidSchema init

Co-authored-by: yuezhang <yuezhang@freewheel.tv>
2021-01-22 08:48:21 -08:00
Gian Merlino 8b808c4879
Retain order of AND, OR filter children. (#10758)
* Retain order of AND, OR filter children.

If we retain the order, it enables short-circuiting. People can put a
more selective filter earlier in the list and lower the chance that
later filters will need to be evaluated.

Short-circuiting was working before #9608, which switched to unordered
sets to solve a different problem. This patch tries to solve that
problem a different way.

This patch moves filter simplification logic from "optimize" to
"toFilter", because that allows the code to be shared with Filters.and
and Filters.or. The simplification has become more complicated and so
it's useful to share it.

This patch also removes code from CalciteCnfHelper that is no longer
necessary because Filters.and and Filters.or are now doing the work.

* Fixes for inspections.

* Fix tests.

* Back to a Set.
2021-01-20 08:59:20 -08:00
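
The short-circuiting argument in #10758 above can be illustrated with a minimal sketch. All names here (`OrderedAndSketch`, `matchesAll`, `costlyCalls`) are hypothetical and are not Druid's filter classes; the sketch only assumes that children of an AND are evaluated in list order and that evaluation stops at the first child that rejects a row.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntPredicate;

// Hypothetical sketch; these names are illustrative and are not Druid classes.
class OrderedAndSketch {
    // Evaluates children in list order and stops at the first child that rejects the row.
    static boolean matchesAll(List<IntPredicate> children, int row) {
        for (IntPredicate child : children) {
            if (!child.test(row)) {
                return false;
            }
        }
        return true;
    }

    // Returns how many times the costly child runs over rows 0..rows-1.
    static int costlyCalls(boolean selectiveFirst, int rows) {
        AtomicInteger calls = new AtomicInteger();
        // Selective child: accepts only one row in every hundred.
        IntPredicate selective = row -> row % 100 == 0;
        // Costly child: accepts everything, but we count each evaluation.
        IntPredicate costly = row -> {
            calls.incrementAndGet();
            return true;
        };
        List<IntPredicate> children = selectiveFirst
            ? List.of(selective, costly)
            : List.of(costly, selective);
        for (int row = 0; row < rows; row++) {
            matchesAll(children, row);
        }
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println(costlyCalls(true, 1000));  // prints 10
        System.out.println(costlyCalls(false, 1000)); // prints 1000
    }
}
```

With the selective child first, the costly child runs only for the rows that survive it. An unordered set of children cannot guarantee that ordering.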
zhangyue19921010 2590ad4f67
Historical unloads damaged segments automatically when lazy on start. (#10688)
* ready to test

* tested on dev cluster

* tested

* code review

* add UTs

* add UTs

* ut passed

* ut passed

* opti imports

* done

* done

* fix checkstyle

* modify uts

* modify logs

* changing the package of SegmentLazyLoadFailCallback.java to org.apache.druid.segment

* merge from master

* modify import orders

* merge from master

* merge from master

* modify logs

* modify docs

* modify logs to rerun ci

* modify logs to rerun ci

* modify logs to rerun ci

* modify logs to rerun ci

* modify logs to rerun ci

* modify logs to rerun ci

* modify logs to rerun ci

* modify logs to rerun ci

Co-authored-by: yuezhang <yuezhang@freewheel.tv>
2021-01-16 19:53:30 -08:00
Gian Merlino 2b24dc3764
SegmentAnalyzer: Properly close column after retrieving it. (#10772) 2021-01-16 19:26:34 -08:00
Jihoon Son 95065bdf1a
Bump dev version to 0.22.0-SNAPSHOT (#10759) 2021-01-15 13:16:23 -08:00
Xavier Léauté c5ecbf6794
fix task metric types in statsd emitter (#10764)
except success and failure stats, task count metrics should all be
gauges, since they represent the current state and not some aggregate
counter over time.
2021-01-15 11:39:51 -08:00
Gian Merlino a82910e065
OrFilter: Properly handle child matchers that return the original mask. (#10754)
* OrFilter: Properly handle child matchers that return the original mask.

This happens when a child matcher is literally true (for example,
BooleanVectorValueMatcher). In this case, OrFilter would throw this
exception from its call to removeAll while processing the next filter:

  java.lang.IllegalStateException: 'other' must be a different instance from 'this'

Also update the javadocs for VectorValueMatcher to call out that the
returned object may be the same as the input mask.

* Fix style.
2021-01-14 23:28:13 -08:00
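
The aliasing problem described in #10754 above can be reproduced in miniature. This is a sketch under stated assumptions: `Mask`, `Matcher`, and `removeMatched` are hypothetical stand-ins built on `java.util.BitSet`, not Druid's `VectorMatch` or `VectorValueMatcher`, and the copy-before-remove guard is just one possible way to handle the case, not necessarily what the patch does.

```java
import java.util.BitSet;

// Hypothetical sketch; Mask and Matcher are stand-ins, not Druid classes.
class SelfAliasSketch {
    static final class Mask {
        final BitSet rows;

        Mask(BitSet rows) {
            this.rows = rows;
        }

        // Mirrors the precondition quoted in the commit message.
        void removeAll(Mask other) {
            if (other == this) {
                throw new IllegalStateException("'other' must be a different instance from 'this'");
            }
            rows.andNot(other.rows);
        }

        Mask copy() {
            return new Mask((BitSet) rows.clone());
        }
    }

    interface Matcher {
        Mask match(Mask mask);
    }

    // A literally-true matcher may hand back the very mask it was given.
    static final Matcher ALWAYS_TRUE = mask -> mask;

    // Removes the rows matched by child from remaining and returns how many rows are left.
    // Assumed guard: copy the result when the child returned the input mask itself.
    static int removeMatched(Matcher child, Mask remaining) {
        Mask matched = child.match(remaining);
        if (matched == remaining) {
            matched = remaining.copy();
        }
        remaining.removeAll(matched);
        return remaining.rows.cardinality();
    }

    public static void main(String[] args) {
        BitSet bits = new BitSet();
        bits.set(0, 4); // rows 0..3 are still unmatched
        Mask remaining = new Mask(bits);
        System.out.println(removeMatched(ALWAYS_TRUE, remaining)); // prints 0
    }
}
```

Without the identity check, the call to `removeAll` would receive the same instance and throw the `IllegalStateException` quoted in the commit message.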
Gian Merlino 7354953b1b
VectorMatch: Disallow "copyFrom", "addAll" on self; improve tests. (#10755)
No existing code relies on being able to call these methods in this way.

The new tests exhaustively test all vectors up to size 7, and also test
the run-on-self behavior that has been adjusted by this patch.
2021-01-14 18:29:13 -08:00
Gian Merlino 2bbf89db81
Remove FalseVectorMatcher, TrueVectorMatcher in favor of BooleanVectorValueMatcher. (#10757) 2021-01-14 18:28:25 -08:00
Vadim Ogievetsky e52db19823
treat null as not defined (#10751) 2021-01-14 18:22:59 -08:00
kaijianding 4437c6af60
use actual dataInterval in cache key (#10714)
* use actual dataInterval in cache key

* fix ut fail

* fix segmentMaxTime exclusive
2021-01-13 18:31:36 -08:00
Jihoon Son b3325c1601
Add a config for monitorScheduler type (#10732)
* Add a config for monitorScheduler type

* check interrupted

* null check

* do not schedule monitor if the previous one is still running

* checkstyle

* clean up names

* change default back to basic

* fix test
2021-01-13 17:20:43 -08:00
Jihoon Son 149306c9db
Tidy up HTTP status codes for query errors (#10746)
* Tidy up query error codes

* fix tests

* Restore query exception type in JsonParserIterator

* address review comments; add a comment explaining the ugly switch

* fix test
2021-01-13 17:20:00 -08:00
Clint Wylie 8c3c9b4060
fix limited queries with subtotals (#10743)
* i put my thing down, flip it and reverse it

* oops
2021-01-13 12:55:24 -08:00
Clint Wylie 9362dc7968
re-use expression vector evaluation results for the same offset in expression vector selectors (#10614)
* cache expression selector results by associating vector expression bindings to underlying vector offset

* better coverage, fix floats

* style

* stupid bot

* stupid me

* more test

* intellij threw me under the bus when it generated those junit methods

* narrow interface instead of passing around offset
2021-01-13 12:44:56 -08:00
Vadim Ogievetsky 2fc2938b01
Web console: fix bad results if there is not native bigint (#10741)
* fix bigint when it does not exist

* add test
2021-01-12 16:32:23 -08:00
Lucas Capistrant aecc9e5e7e
Remove legacy code from LogUsedSegments duty (#10287)
* allow the LogUsedSegments duty to be skipped

* Fixes for TravisCI coverage checks and documentation spell checking

* parameterize DruidCoordinatorTest in order to achieve coverage

* update config name to remove duty ref and improve documentation

* refine documentation for new config with reviewer advice

* add default column to docs for new config

* remove legacy code in LogUsedSegments and remove config to disable duty

* fix makeHistoricalMangementDuties now that the returned list is always the same
2021-01-12 14:09:19 -08:00
Jihoon Son ca32652932
Fix potential deadlock in batch ingestion (#10736)
* Fix potential deadlock in batch ingestion

* fix checkstyle and comment

* this is better
2021-01-12 12:50:45 -08:00
Jihoon Son 3984457e5b
Add missing unit tests for segment loading in historicals (#10737)
* Add missing unit tests for segment loading in historicals

* unused import
2021-01-11 18:20:13 -06:00
Lucas Capistrant fe0511b16a
Coordinator Dynamic Config changes to ease upgrading with new config value (#10724)
* Coordinator Dynamic Config changes to ease upgrading with new config value

* change a log to debug level following review

* changes based on review feedback

* fix checkstyle
2021-01-10 20:05:39 -08:00
Xavier Léauté 118b50195e
Introduce KafkaRecordEntity to support Kafka headers in InputFormats (#10730)
Today Kafka message support in streaming indexing tasks is limited to
message values, and does not provide a way to expose Kafka headers,
timestamps, or keys, which may be of interest to more specialized
Druid input formats. For instance, Kafka headers may be used to indicate
payload format/encoding or additional metadata, and timestamps are often
omitted from values in Kafka streams applications, since they are
included in the record.

This change proposes to introduce KafkaRecordEntity as InputEntity,
which would give input formats full access to the underlying Kafka record,
including headers, key, timestamps. It would also open access to low-level
information such as topic, partition, offset if needed.

KafkaRecordEntity is a subclass of ByteEntity for backwards compatibility with
existing input formats, and to avoid introducing unnecessary complexity
for Kinesis indexing tasks.
2021-01-08 16:04:37 -08:00
zhangyue19921010 2837a9b62f
[Minor Doc Fix] Correct the default value of `druid.server.http.gracefulShutdownTimeout` (#10661)
* done

* done

* done

Co-authored-by: yuezhang <yuezhang@freewheel.tv>
2021-01-08 15:23:08 -08:00
zhangyue19921010 d5192640cb
remove extra comma (#10670)
Co-authored-by: yuezhang <yuezhang@freewheel.tv>
2021-01-08 15:15:08 -08:00
Yi Yuan 3624acbcf8
fix web-console show json bug (#10710)
* fix web-console show json bug

* replace all JSON.stringify

Co-authored-by: yuanyi <yuanyi@freewheel.tv>
2021-01-08 14:55:55 -08:00
秦臻 c62b7c19c3
javascript filter result convert to java boolean (#10721)
* javascript filter result convert to java boolean

* use type convert replace script convert, and add more unit test

Co-authored-by: qinzhen <qinzhen@kuaishou.com>
2021-01-08 14:30:09 -08:00
Abhishek Agarwal f66fdbfa5d
add offsetFetchPeriod to kinesis ingestion doc (#10734) 2021-01-08 14:19:26 -08:00
Gian Merlino 6eef0e4c9f
Fix collision between #10689 and #10593. (#10738) 2021-01-08 09:52:27 -08:00
Aleksey Plekhanov 26bcd47e51
Thread-safety for ResponseContext.REGISTERED_KEYS (#9667) 2021-01-08 00:37:49 -08:00
Liran Funaro 08ab82f55c
IncrementalIndex Tests and Benchmarks Parametrization (#10593)
* Remove redundant IncrementalIndex.Builder

* Parametrize incremental index tests and benchmarks

- Reveal and fix a bug in OffheapIncrementalIndex

* Fix forbiddenapis error: Forbidden method invocation: java.lang.String#format(java.lang.String,java.lang.Object[]) [Uses default locale]

* Fix Intellij errors: declared exception is never thrown

* Add documentation and validate before closing objects on tearDown.

* Add documentation to OffheapIncrementalIndexTestSpec

* Doc corrections and minor changes.

* Add logging for generated rows.

* Refactor new tests/benchmarks.

* Improve IncrementalIndexCreator documentation

* Add required tests for DataGenerator

* Revert "rollupOpportunity" to be a string
2021-01-07 22:18:47 -08:00
kaijianding 01e25f1e69
reuse DataSegment object when a segment found on another server (#10715) 2021-01-07 21:55:25 -08:00
Jonathan Wei c7f2d3fbb5
Update deps for CVE-2020-28168 and CVE-2020-28052 (#10733)
* Update deps for CVE-2020-28168 and CVE-2020-28052

* Make BC runtime scope
2021-01-07 20:31:44 -08:00
Makdon 1905b80ec3
Update badge for travis in README.md (#10717)
* Update README for updating travis badge

Update README cause
> This repository was migrated and is now building on travis-ci.com

* Update README.md
2021-01-07 18:39:58 -08:00
Himanshu c7b1212a43
AWS RDS token based password provider (#9518)
* refresh db pwd

* aws iam token password provider

* fix analyze-dependencies build

* fix doc build

* add ut for BasicDataSourceExt

* more doc updates

* more doc update

* moving aws token password provider to new extension

* remove duplicate changes

* make all config inline

* extension docs

* refresh db password in SQL Firehose code path as well

* add ut

* fix build

* add new extension to distribution

* rds lib is not provided

* fix license build

* add version to license

* change parent version to 0.19.0-snapshot

* address review comments

* fix core/ code coverage

* Update server/src/main/java/org/apache/druid/metadata/BasicDataSourceExt.java

Co-authored-by: Clint Wylie <cjwylie@gmail.com>

* address review comments

* fix spellchecker

* remove inadvertent website file change

Co-authored-by: Clint Wylie <cjwylie@gmail.com>
2021-01-06 21:15:29 -08:00
Gian Merlino 48e576a307
Scan query: More accurate error message when segment per time chunk limit is exceeded. (#10630)
* Scan query: More accurate error message when segment per time chunk limit is exceeded.

* Add guardrail test.
2021-01-06 14:11:28 -08:00
Makdon f9fc1892d1
Typo: missing comma in json (#10711) 2021-01-06 13:49:50 -08:00
Jonathan Wei 68bb038b31
Multiphase segment merge for IndexMergerV9 (#10689)
* Multiphase merge for IndexMergerV9

* JSON fix

* Cleanup temp files

* Docs

* Address logging and add IT

* Fix spelling and test unloader datasource name
2021-01-05 22:19:09 -08:00
Himanshu d2e6240cac
k8s-int-test-build: zk-less druid cluster and http based segment/task management (#10686)
* zk-less druid cluster in k8s build

* attempt to fix build and use http based remote task management

* mm/router logs for debugging

* add default account k8s role and binding for pod, configMap access

* fix issue

* change router port to 8088 for common readinessProbe

* break build_run_k8s_cluster.sh into separate scripts

* revert changes to K8sDruidNodeAnnouncer.java

* k8s extension doc update

* add license to new file

* address review comments

* do not try to load lookups at startup to improve cluster startup time
2021-01-05 18:51:47 -08:00
Jihoon Son ea2d51d61f
Better error message for compaction task when it sees no segments for compaction (#10728) 2021-01-05 16:49:57 -08:00
Jonathan Wei 769c21cc87
Add sample method to IndexingServiceClient (#10729)
* Add sample method to IndexingServiceClient

* Add unit test

* Fix LGTM
2021-01-05 15:02:44 -08:00
Franklyn Dsouza 045b29fa95
Correctly handle null values in time column results (#10642)
* handle null case

* test this case

* test sql resource

* fix style
2021-01-04 22:22:46 -08:00
Lucas Capistrant 26b911a384
Make some additions to IT suite to make Hadoop related testing more understandable (#10667)
* Make some additions to IT suite to make Hadoop related testing more understandable

* add start.hadoop.docker to mvn arg tips in doc

* fix issues preventing ITIndexHadoopTest from running in local mode
2020-12-28 12:25:47 -06:00
Clint Wylie edfbdbfc97
fix NPE when calling TaskLocation.hashCode with null host (#10708) 2020-12-24 15:30:54 -08:00
Clint Wylie 74fbdd322d
refactor NodeRole so extensions can participate in disco and announcement (#10700)
* refactor NodeRole so extensions can participate in disco and announcement

* fixes, maybe

* retries

* javadoc

* fix

* spelling
2020-12-24 15:29:32 -08:00
Charles Smith 797371598d
update syntax for global cached uri lookups (#10629) 2020-12-24 09:49:01 -08:00
Xavier Léauté b7a16d08a6
Update Apache Kafka to 2.7.0 (#10701)
- align scala versions to match Kafka
2020-12-22 13:56:00 -08:00
Lucas Capistrant 58ce2e55d8
Add dynamic coordinator config that allows control over how many segments are considered when picking a segment to move. (#10284)
* dynamic coord config adding more balancing control

add new dynamic coordinator config, maxSegmentsToConsiderPerMove. This
config caps the number of segments that are iterated over when selecting
a segment to move. The default value combined with current balancing
strategies will still iterate over all provided segments. However,
setting this value to something > 0 will cap the number of segments
visited. This could make sense in cases where a cluster has a very large
number of segments and the admins prefer fewer iterations vs a thorough
consideration of all segments provided.

* fix checkstyle failure

* Make doc more detailed for admin to understand when/why to use new config

* refactor PR to use a % of segments instead of raw number

* update the docs

* remove bad doc line

* fix typo in name of new dynamic config

* update ReservoirSegmentSampler to gracefully deal with values > 100%

* add handler for <= 0 in ReservoirSegmentSampler

* fixup CoordinatorDynamicConfigTest naming and argument ordering

* fix items in docs after spellcheck flags

* Fix lgtm flag on missing space in string literal

* improve documentation for new config

* Add default value to config docs and add advice in cluster tuning doc

* Add percentOfSegmentsToConsiderPerMove to web console coord config dialog

* update jest snapshot after console change

* fix spell checker errors

* Improve debug logging in getRandomSegmentBalancerHolder to cover all bad inputs for % of segments to consider

* add new config back to web console module after merge with master

* fix ReservoirSegmentSamplerTest

* fix line breaks in coordinator console dialog

* Add a test that helps ensure no regressions for percentOfSegmentsToConsiderPerMove

* Make improvements based off of feedback in review

* additional cleanup coming from review

* Add a warning log if limit on segments to consider for move can't be calculated

* remove unused import

* fix tests for CoordinatorDynamicConfig

* remove precondition test that is redundant in CoordinatorDynamicConfig Builder class
2020-12-22 08:27:55 -08:00
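
The cap described in #10284 above can be sketched as a small calculation. This is a hypothetical illustration, not the code in ReservoirSegmentSampler: the class and method names are invented, and falling back to "consider every segment" for values <= 0 or > 100 is an assumption based only on the bullets saying those inputs are handled gracefully.

```java
// Hypothetical sketch; not Druid's ReservoirSegmentSampler.
class SegmentsToConsiderSketch {
    // Converts a percentage into the number of segments to iterate over.
    // Assumption: out-of-range percentages fall back to considering all segments.
    static int limit(int totalSegments, double percentOfSegmentsToConsider) {
        if (percentOfSegmentsToConsider <= 0 || percentOfSegmentsToConsider > 100) {
            return totalSegments;
        }
        return (int) Math.ceil(totalSegments * (percentOfSegmentsToConsider / 100.0));
    }

    public static void main(String[] args) {
        System.out.println(limit(1000, 25));  // prints 250
        System.out.println(limit(1000, 100)); // prints 1000
        System.out.println(limit(1000, 150)); // prints 1000 (assumed fallback)
    }
}
```

Rounding up means a small positive percentage still considers at least one segment when any exist.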
Maytas Monsereenusorn 5bd7924296
Fix kinesis integration test (#10696)
* fix kinesis IT

* fix checkstyle
2020-12-21 12:57:40 -08:00