10304 Commits

Author SHA1 Message Date
Aleksey Plekhanov 9341ea828a
Fixed flaky BlockingPoolTest.testConcurrentTakeBatch() (#9692) 2020-05-03 12:54:27 -07:00
BIGrey ee9a721acc
fix npe in IncrementalIndexReadBenchmark (#9754)
Co-authored-by: 黄辉 <huanghui.bigrey@bytedance.com>
2020-05-03 12:52:50 -07:00
Jihoon Son 9ab49b34db
Update notice; fix version of druid-query-toolkit (#9799) 2020-05-02 20:00:43 -07:00
Clint Wylie 9a293d554d
remove UnionMergeRule rules from SQL planner (#9797) 2020-05-01 12:50:11 -07:00
Jonathan Wei 61295bd002
More Hadoop integration tests (#9714)
* More Hadoop integration tests

* Add missing s3 instructions

* Address PR comments

* Address PR comments

* PR comments

* Fix typo
2020-04-30 14:33:01 -07:00
sthetland c61365c1e0
Druid Quickstart refactor and update (#9766)
* Update data-formats.md

Per Suneet, "Since you're editing this file can you also fix the json on line 177 please - it's missing a comma after the }"

* Light text cleanup

* Removing discussion of sample data, since it's repeated in the data loading tutorial, and not immediately relevant here.

* Update index.md

* original quickstart full first pass

* original quickstart full first pass

* first pass all the way through

* straggler

* image touchups and finished old tutorial

* a bit of finishing up

* Review comments

* fixing links

* spell checking gymnastics
2020-04-30 12:07:28 -07:00
Jihoon Son 39722bd064
Integration tests for stream ingestion with various data formats (#9783)
* Integration tests for stream ingestion with various data formats

* fix npe

* better logging; fix tsv

* fix tsv

* exclude kinesis from travis

* some readme
2020-04-29 13:18:01 -07:00
Suneet Saldanha 7510e6e722
Fix potential NPEs in joins (#9760)
* Fix potential NPEs in joins

intelliJ reported issues with potential NPEs. This was first hit in testing
with a filter being pushed down to the left hand table when joining against
an indexed table.

* More null check cleanup

* Optimize filter value rewrite for IndexedTable

* Add unit tests for LookupJoinable

* Add tests for IndexedTableJoinable

* Add non null assert for dimension selector

* Suppress null warning in LookupJoinMatcher

* remove some null checks on hot path
2020-04-29 11:03:13 -07:00
Aleksei Chumagin 0642f778fa
changed Preview to Apply (#9757) 2020-04-29 09:53:25 -07:00
Maytas Monsereenusorn 6bc64b731f
Improve "waiting for tasks complete" logic in integration tests (#9759)
* improve waiting for tasks complete logic in integration tests

* improve waiting for tasks complete logic in integration tests

* fix forbidden check
2020-04-29 08:53:45 -07:00
Maytas Monsereenusorn a107ee3ed2
Fix problem when running single integration test using -Dit.test= (#9778)
* fix running single it

* fix checkstyle
2020-04-29 08:53:25 -07:00
James Dalton b279e04a31
table fix (#9769) 2020-04-28 11:23:24 -07:00
Francesco Nidito e7e41e3a36
Adding support for autoscaling in GCE (#8987)
* Adding support for autoscaling in GCE

* adding extra google deps also in gce pom

* fix link in doc

* remove unused deps

* adding terms to spelling file

* version in pom 0.17.0-incubating-SNAPSHOT --> 0.18.0-SNAPSHOT

* GCEXyz -> GceXyz in naming for consistency

* add preconditions

* add VisibleForTesting annotation

* typos in comments

* use StringUtils.format instead of String.format

* use custom exception instead of exit

* factorize interval time between retries

* making literal value a constant

* iter all network interfaces

* use provided on google (non api) deps

* adding missing dep

* removing unneeded this and use Objects methods instead of 3-way if in hash and comparison

* adding import

* adding retries around getRunningInstances and adding limit for operation end waiting

* refactor GceEnvironmentConfig.hashCode

* 0.18.0-SNAPSHOT -> 0.19.0-SNAPSHOT

* removing unused config

* adding tests to hash and equals

* adding nullable to waitForOperationEnd

* adding testTerminate

* adding unit tests for createComputeService

* increasing retries in unrelated integration-test to prevent sporadic failure (hopefully)

* reverting queryResponseTemplate change

* adding comment for Compute.Builder.build() returning null
2020-04-28 03:13:39 -07:00
Maytas Monsereenusorn 8b78eebdbd
Test reading from empty kafka/kinesis partitions (#9729)
* add test for stream sequence number returns null

* fix checkstyle

* add index test for when stream returns null

* retrigger test
2020-04-27 10:23:56 -07:00
Jonathan Wei fe000a9e4b
Adjust string comparators used for ingestion (#9742)
* Adjust string comparators used for ingestion

* Small tweak

* Fix inspection, more javadocs

* Address PR comment

* Add rollup comment

* Add ordering test

* Fix IncrementalIndexRowCompTest
2020-04-25 13:47:07 -07:00
Clint Wylie 7711f776a0
fix issue where CloseableIterator.flatMap does not close inner CloseableIterator (#9761)
* fix issue where CloseableIterator.flatMap does not close inner CloseableIterator

* more test

* style

* clarify test
2020-04-24 13:52:50 -07:00
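
The kind of leak described in this commit can be sketched as follows. This is a minimal illustration under stated assumptions, not Druid's code: `ClosingIterator`, `tracked`, `flatMap`, and `run` are hypothetical names invented here. The idea shown is that a flat-mapping iterator should close each inner iterator once it is exhausted, and close any still-open inner iterator along with the outer one when it is itself closed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Function;

public class FlatMapCloseDemo {
    // Hypothetical stand-in for a closeable iterator; not Druid's interface.
    interface ClosingIterator<T> extends Iterator<T>, AutoCloseable {
        @Override
        void close();
    }

    // Wraps a list and records `name` in `closed` when close() is called.
    static ClosingIterator<Integer> tracked(String name, List<Integer> values, List<String> closed) {
        final Iterator<Integer> delegate = values.iterator();
        return new ClosingIterator<Integer>() {
            @Override public boolean hasNext() { return delegate.hasNext(); }
            @Override public Integer next() { return delegate.next(); }
            @Override public void close() { closed.add(name); }
        };
    }

    // Flat-maps `outer`, closing every inner iterator it opens.
    static <T, R> ClosingIterator<R> flatMap(
            final ClosingIterator<T> outer,
            final Function<T, ClosingIterator<R>> fn) {
        return new ClosingIterator<R>() {
            private ClosingIterator<R> inner = null;

            @Override public boolean hasNext() {
                while (inner == null || !inner.hasNext()) {
                    if (inner != null) {
                        inner.close(); // exhausted inner iterator is closed before moving on
                        inner = null;
                    }
                    if (!outer.hasNext()) {
                        return false;
                    }
                    inner = fn.apply(outer.next());
                }
                return true;
            }

            @Override public R next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return inner.next();
            }

            @Override public void close() {
                if (inner != null) {
                    inner.close(); // a partially consumed inner iterator must not leak
                    inner = null;
                }
                outer.close();
            }
        };
    }

    // Consumes a small flat-mapped iterator and returns the order in which things were closed.
    static List<String> run() {
        final List<String> closed = new ArrayList<>();
        ClosingIterator<Integer> outer = tracked("outer", Arrays.asList(1, 2), closed);
        try (ClosingIterator<Integer> it =
                 flatMap(outer, n -> tracked("inner" + n, Arrays.asList(n, n * 10), closed))) {
            while (it.hasNext()) {
                it.next();
            }
        }
        return closed;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In this sketch the inner iterators are closed eagerly as they run out, so only one inner resource is held open at a time.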
Clint Wylie fc5383cd00
revert datasketches-java version to 1.1.0-incubating until new version is released (#9751)
* revert datasketches-java version to 1.1.0-incubating until fix is in place

* fix tests

* checkstyle
2020-04-24 12:52:12 -07:00
Jihoon Son 7fa72fbf15
Initialize SettableByteEntityReader only when inputFormat is not null (#9734)
* Lazy initialization of SettableByteEntityReader to avoid NPE

* toInputFormat for tsv

* address comments

* common code
2020-04-24 10:22:51 -07:00
BIGrey c5bfe36011
Optimize FileWriteOutBytes to avoid high system cpu usage (#9722)
* optimize FileWriteOutBytes to avoid high sys cpu

* optimize FileWriteOutBytes to avoid high sys cpu -- remove IOException

* optimize FileWriteOutBytes to avoid high sys cpu -- remove IOException in writeOutBytes.size

* Revert "optimize FileWriteOutBytes to avoid high sys cpu -- remove IOException in writeOutBytes.size"

This reverts commit 965f7421

* Revert "optimize FileWriteOutBytes to avoid high sys cpu -- remove IOException"

This reverts commit 149e08c0

* optimize FileWriteOutBytes to avoid high sys cpu -- avoid IOException never thrown check

* Fix size counting to handle IOE in FileWriteOutBytes + tests

* remove unused throws IOException in WriteOutBytes.size()

* Remove redundant throws IOException clauses

* Parameterize IndexMergeBenchmark

Co-authored-by: huanghui.bigrey <huanghui.bigrey@bytedance.com>
Co-authored-by: Suneet Saldanha <suneet.saldanha@imply.io>
2020-04-23 20:18:42 -07:00
Gian Merlino 4087a015e8
Datasource doc structure adjustments. (#9716)
- Reorder both the datasource and query-execution page orderings to
table, lookup, union, inline, query, join. (Roughly increasing order
of conceptual "fanciness".)
- Add more crosslinks from datasource page to query-execution page:
one per datasource type.
2020-04-23 16:04:59 -07:00
Maytas Monsereenusorn 16f5ae4405
Add integration tests for kafka ingestion (#9724)
* add kafka admin and kafka writer

* refactor kinesis IT

* fix typo refactor

* parallel

* parallel

* parallel

* parallel works now

* add kafka it

* add doc to readme

* fix tests

* fix failing test

* test

* test

* test

* test

* address comments

* addressed comments
2020-04-22 10:43:34 -07:00
Gian Merlino 479c290fb9
Add QueryResource to log4j2 template. (#9735) 2020-04-22 09:18:45 -07:00
Maytas Monsereenusorn cff39892ba
Fixes intermittent failure in ITAutoCompactionTest (#9739)
* fix intermittent failure in ITAutoCompactionTest

* fix typo

* update javadoc
2020-04-21 20:56:17 -07:00
calvinhkf b146f8a2a7
Align library version (#9636)
* align JUnitParams version 1.1.1,1.0.4 to 1.1.1

* align junit version 4.8.1,4.12 to 4.12

* exclude explicitly specified version
2020-04-21 20:19:38 -07:00
Abhishek Radhakrishnan 8abcbf671d
Fix numbered list formatting in markdown. (#9664) 2020-04-21 20:18:12 -07:00
Clint Wylie 68cc0b2e1c
fixes for inline subqueries when multi-value dimension is present (#9698)
* fixes for inline subqueries when multi-value dimension is present

* fix test

* allow missing capabilities for vectorized group by queries to be treated as single dims since it means that column doesn't exist

* add comment
2020-04-21 18:44:26 -07:00
Clint Wylie 28f56978ab
web-console clean coverage report on build clean (#9718) 2020-04-21 17:02:05 -07:00
Jenson b9ad250c00
Fix misuse of Integer.SIZE in FileWriteOutBytes.writeInt (#9723)
* change Integer.SIZE to Integer.BYTES in FileWriteOutBytes#writeInt

* Add ASF header

Co-authored-by: jenson <junstan@paypal.com>
2020-04-19 18:16:53 +08:00
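
The distinction behind this fix can be shown with a short sketch. The class `IntWidthDemo` and the helper `writeInt` are hypothetical and are not taken from Druid's FileWriteOutBytes. `Integer.SIZE` is the width of an `int` in bits (32), while `Integer.BYTES` is its width in bytes (4), so code that reasons about buffer space in bytes needs the latter.

```java
import java.nio.ByteBuffer;

public class IntWidthDemo {
    // Hypothetical helper: writes one int and returns how many bytes the buffer advanced.
    static int writeInt(ByteBuffer buffer, int value) {
        int before = buffer.position();
        buffer.putInt(value);
        return buffer.position() - before;
    }

    public static void main(String[] args) {
        // A buffer of exactly Integer.BYTES bytes is enough to hold one int.
        ByteBuffer buffer = ByteBuffer.allocate(Integer.BYTES);
        int written = writeInt(buffer, 42);

        System.out.println("Integer.SIZE  (bits)  = " + Integer.SIZE);
        System.out.println("Integer.BYTES (bytes) = " + Integer.BYTES);
        System.out.println("bytes written by putInt = " + written);
    }
}
```

A space check written against `Integer.SIZE` would demand eight times more room than one `int` actually needs.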
Clint Wylie e677c62484
document useFilterCNF query context parameter (#9647)
* document useFilterCNF query context parameter

* move context key to QueryContexts

* Update .spelling
2020-04-16 22:12:20 -07:00
mcbrewster b7fdb29423
add joins to column tree menu (#9705)
* add joins to column tree menu

* fix capitalization

* add keyword, keep columns if replaced

* actually fix capitalization

* add keywords
2020-04-16 17:51:59 -07:00
Clint Wylie b89ad49396
disable group by config applyLimitPushDownToSegment by default (#9711)
* disable group by config applyLimitPushDownToSegment by default

* document
2020-04-16 03:03:35 -07:00
Gian Merlino 42590ae64b
Refresh query docs. (#9704)
* Refresh query docs.

Larger changes:

- New doc: querying/datasource.md describes the various kinds of
datasources you can use, and has examples for both SQL and native.
- New doc: querying/query-execution.md describes how native queries
are executed at a high level. It doesn't go into the details of specific
query engines or how queries run at a per-segment level. But I think it
would be good to add or link that content here in the future.
- Refreshed doc: querying/sql.md updated to refer to joins, reformatted
a bit, added a new "Query translation" section that explains how
queries are translated from SQL to native, and removed configuration
details (moved to configuration/index.md).
- Refreshed doc: querying/joins.md updated to refer to join datasources.

Smaller changes:

- Add helpful banners to the top of query documentation pages telling
people whether a given page describes SQL, native, or both.
- Add SQL metrics to operations/metrics.md.
- Add some color and cross-links in various places.
- Add native query component docs to the sidebar, and renamed them so
they look nicer.
- Remove Select query from the sidebar.
- Fix Broker SQL configs in configuration/index.md. Remove them from
querying/sql.md.
- Combined querying/searchquery.md and querying/searchqueryspec.md.

* Updates.

* Fix numbering.

* Fix glitches.

* Add new words to spellcheck file.

* Assorted changes.

* Further adjustments.

* Add missing punctuation.
2020-04-15 16:12:20 -07:00
Himanshu b082262a2a
druid-pac4j: add custom SSL handling to com.nimbusds.oauth2.sdk.http.HTTPRequest objects (#9695) 2020-04-15 15:59:24 -07:00
Maytas Monsereenusorn 8328d91b30
Add missing integration tests for the compaction by the coordinator (#9644)
* Add API to trigger a compaction by the coordinator for integration tests

* Add missing integration tests for the compaction by the coordinator

* address comments
2020-04-15 14:27:33 -07:00
Jihoon Son b8f7128b2d
Revert "remove ServerDiscoverySelector from DruidLeaderClient (#9481)" (#9702)
* Revert "remove ServerDiscoverySelector from DruidLeaderClient (#9481)"

This reverts commit 072bbe210f.

* fix build
2020-04-14 20:42:56 -07:00
Will Salisbury cda9f41e69
s/S3/GCS/g (#9700)
fix typo [ at least I hope this was a typo… ]
2020-04-14 18:39:54 -07:00
Chi Cao Minh 2262e33316
Fix flaky web console E2E test (#9685)
web-console/e2e-tests/tutorial-batch.spec.ts would occasionally timeout
between the transition from the data loader "configure schema" and
"partition" steps due to missing waits when toggling the rollup setting.

Also, fix shellcheck warnings for script/druid.
2020-04-14 15:27:16 -07:00
Maytas Monsereenusorn d930f04e6a
Test file format extensions for inputSource (orc, parquet) (#9632)
* Test file format extensions for inputSource (orc, parquet)

* Test file format extensions for inputSource (orc, parquet)

* fix path

* resolve merge conflict

* fix typo
2020-04-13 13:03:56 -07:00
Jihoon Son 6a52bdc605
Skip license check for dependency reduced pom files (#9687) 2020-04-11 18:11:53 -07:00
Chi Cao Minh e6dd6a4119
Skip node dev dependency vulnerability scan (#9684)
Since they are not production dependencies, security vulnerabilities in
the dev dependencies can be ignored.
2020-04-11 14:24:25 -07:00
Abhishek Radhakrishnan cbbfd63bed
Add 0.18.0 to .backportrc.json to facilitate backport. (#9661) 2020-04-11 13:49:04 -07:00
Clint Wylie 0ff926b1a1
fix issue with group by limit pushdown for extractionFn, expressions, joins, etc (#9662)
* fix issue with group by limit pushdown for extractionFn, expressions, joins, etc

* remove unused

* fix test

* revert unintended change

* more tests

* consider capabilities for StringGroupByColumnSelectorStrategy

* fix test

* fix and more test

* revert because im scared
2020-04-11 01:18:11 -07:00
Jihoon Son 1b60148ec6
Missing license changes for sources in licenses.yaml (#9678) 2020-04-10 23:06:33 -07:00
Gian Merlino 5249155284
Fix off-by-one in IndexedTableJoinMatcher.getCardinality. (#9674)
* Fix off-by-one in IndexedTableJoinMatcher.getCardinality.

It would report a cardinality that is one lower than the actual cardinality.
The missing value is the phantom null that can be generated by outer joins.

* Fix tests.
2020-04-10 18:11:05 -07:00
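
The off-by-one described in this commit can be illustrated with a toy model. It assumes, purely for illustration, that cardinality is derived from a list of distinct column values; the class and method names are hypothetical and do not reflect how IndexedTableJoinMatcher is implemented.

```java
import java.util.Arrays;
import java.util.List;

public class PhantomNullCardinalityDemo {
    // Off-by-one version: counts only the values physically present in the table.
    static int cardinalityWithoutNull(List<String> distinctValues) {
        return distinctValues.size();
    }

    // Corrected idea: reserve one extra slot for the null that an outer join
    // can emit for rows with no match.
    static int cardinalityWithNull(List<String> distinctValues) {
        return distinctValues.size() + 1;
    }

    public static void main(String[] args) {
        List<String> countries = Arrays.asList("CA", "MX", "US");
        System.out.println("without phantom null: " + cardinalityWithoutNull(countries));
        System.out.println("with phantom null:    " + cardinalityWithNull(countries));
    }
}
```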
Himanshu ca369e5768
druid-pac4j: add ability to use custom ssl trust store while talking to auth server (#9637)
* druid-pac4j: add ability for custom ssl trust store for talking to auth
server

* fix nimbusds DefaultResourceRetriever name in comment
2020-04-10 18:01:59 -07:00
Suneet Saldanha 332ca19621
Fix potential integer overflow issues (#9609)
ApproximateHistogram - seems unlikely
SegmentAnalyzer - unclear if this is an actual issue
GenericIndexedWriter - unclear if this is an actual issue
IncrementalIndexRow and OnheapIncrementalIndex are non-issues because it's very
unlikely for the number of dims to be large enough to hit the overflow
condition
2020-04-10 11:47:08 -07:00
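
As a general illustration of the class of bug this commit addresses, the sketch below shows how `int` arithmetic silently wraps and two common ways to guard against it. The class and method names are hypothetical, and the commit itself does not say which technique was applied in each of the listed classes.

```java
public class OverflowDemo {
    // Plain int multiplication: silently wraps around on overflow.
    static int wrapped(int count, int bytesPerItem) {
        return count * bytesPerItem;
    }

    // Widening one operand to long before multiplying avoids the wrap.
    static long widened(int count, int bytesPerItem) {
        return (long) count * bytesPerItem;
    }

    // Math.multiplyExact throws ArithmeticException instead of wrapping.
    static int checked(int count, int bytesPerItem) {
        return Math.multiplyExact(count, bytesPerItem);
    }

    public static void main(String[] args) {
        int count = 1 << 30; // 2^30 items
        int bytesPerItem = 4; // true product is 2^32, which does not fit in an int

        System.out.println("wrapped: " + wrapped(count, bytesPerItem));
        System.out.println("widened: " + widened(count, bytesPerItem));
        try {
            checked(count, bytesPerItem);
        } catch (ArithmeticException e) {
            System.out.println("checked: overflow detected");
        }
    }
}
```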
Suneet Saldanha 22d3eed80c
Do not use external input in format strings (#9665)
https://lgtm.com/rules/7900080/
2020-04-10 10:46:04 -07:00
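
The rule named in this commit title can be sketched in a few lines. The class and method names below are hypothetical and the example uses plain `String.format` rather than any Druid utility. When externally supplied text is used as the format string itself, any `%` sequence in it is interpreted as a format specifier; passing the text as an argument to a constant format string avoids that.

```java
import java.util.IllegalFormatException;

public class FormatStringDemo {
    // Unsafe: the caller-supplied text is interpreted as a format string.
    static String unsafe(String externalInput) {
        return String.format(externalInput);
    }

    // Safe: the external text is only an argument to a constant format string.
    static String safe(String externalInput) {
        return String.format("%s", externalInput);
    }

    public static void main(String[] args) {
        String externalInput = "task-%d"; // imagine this arrived in a request

        System.out.println("safe:   " + safe(externalInput));
        try {
            System.out.println("unsafe: " + unsafe(externalInput));
        } catch (IllegalFormatException e) {
            // "%d" has no matching argument, so formatting fails at runtime.
            System.out.println("unsafe: failed with " + e.getClass().getSimpleName());
        }
    }
}
```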
Suneet Saldanha bd1cff24a2
Remove no-op assert statement in ClientQuerySegmentWalker (#9607)
* Remove no-op assert statement

The assert statement in ClientQuerySegmentWalker will always be true because
of the preceding while loop which has the same condition.
This change removes dead code to fix an error reported by LGTM

* Suppress lgtm

* cleanup whitespace
2020-04-10 10:41:29 -07:00
Suneet Saldanha 642fe83897
Indexing Service validates externally received taskId (#9666)
Addresses issues flagged by https://lgtm.com/rules/5970070/
2020-04-10 10:36:26 -07:00
Suneet Saldanha 1ced3b33fb
IntelliJ inspections cleanup (#9339)
* IntelliJ inspections cleanup

* Standard Charset object can be used
* Redundant Collection.addAll() call
* String literal concatenation missing whitespace
* Statement with empty body
* Redundant Collection operation
* StringBuilder can be replaced with String
* Type parameter hides visible type

* fix warnings in test code

* more test fixes

* remove string concatenation inspection error

* fix extra curly brace

* cleanup AzureTestUtils

* fix charsets for RangerAdminClient

* review comments
2020-04-10 10:04:40 -07:00