Commit Graph

10596 Commits

Author SHA1 Message Date
Jihoon Son a6790ff22a
More optimize CNF conversion of filters (#9634)
* More optimize CNF conversion of filters

* update license

* fix build

* checkstyle

* remove unnecessary code

* split helper

* license

* checkstyle

* add comments on cnf conversion
2020-04-08 21:31:17 -07:00
Abhishek Radhakrishnan 08851c0198
Preserve the null values for numeric type dimensions post-compaction. (#9622)
* Add selector null check to preserve null values as-is.

* Fix typo.

* add wrapping dimension selector test.

* Address review comments.

* nit: replace exception type.

* uh, float is indeed NOT a special case.
2020-04-08 18:56:06 -07:00
mcbrewster 6f3d403491
Use auto-form for add and edit lookups (#9587)
* use auto form

* jest -u

* fix unreachable statement

* complete the owl

* jest -u

* remove changes to query-view

* fix permissions

* add test, fix info

* add cool highlights

* fix formatting

* fix capitalization

* add optional placeholder

* add space
2020-04-08 16:34:59 -07:00
mcbrewster 2b2b9efcd7
add new text to lookup action dialog (#9643) 2020-04-08 11:30:47 -07:00
Clint Wylie 7bf2dfa3b1
fix flaky jetty test (#9633) 2020-04-08 10:07:06 -07:00
Maytas Monsereenusorn b95a1b9878
Fix NPE in RemoteTaskRunner event handler causes JVM shutdown (#9610)
* Fix NPE in RemoteTaskRunner event handler causes JVM shutdown

* address comments

* fix compile

* fix checkstyle

* fix lgtm

* fix merge

* fix test

* fix tests

* change scope

* address comments

* address comments
2020-04-07 14:53:51 -07:00
mcbrewster 6e50d29b4e
fix global filter input (#9567)
* fix global filter input

* remove clear

* close global filters after clicking apply

* add restFilter
2020-04-07 13:31:19 -07:00
Maytas Monsereenusorn 73a6baaeb6
change hadoop inputSource IT to use parallel batch ingestion (#9616) 2020-04-07 11:37:37 -07:00
Clint Wylie d267b1c414
check paths used for shuffle intermediary data manager get and delete (#9630)
* check paths used for shuffle intermediary data manager get and delete

* add test

* newline

* meh
2020-04-07 09:47:18 -07:00
Aleksei Chumagin 79522f3e25
Integration-tests: typo (#9624)
* QA-57: change $ to # as comment

* QA-57: fix haddop to hadoop
2020-04-06 17:40:05 -07:00
Jihoon Son 82ce60b5c1
Reuse transformer in stream indexing (#9625)
* Reuse transformer in stream indexing

* remove unused method

* memoize compiled pattern
2020-04-06 16:36:08 -07:00
Suneet Saldanha 7bf1ebb0b8
Add tests for valid and invalid datasource names (#9614)
* Add tests for valid and invalid datasource names

* code review

* clean up dependencies
2020-04-06 16:02:50 -07:00
Himanshu fc2897da1d
pac4j: be noop if a previous authenticator in chain has successfully authenticated (#9620) 2020-04-06 11:55:55 -07:00
Jihoon Son 40e84a171b
Eliminate common subfilters when converting it to a CNF (#9608) 2020-04-05 22:29:41 -07:00
bolkedebruin 2d99966933
Add Apache Ranger Authorization (#9579) 2020-04-04 18:02:24 +02:00
Clint Wylie 4d277dbf99
Fix double count ssl connection metrics (#9594)
* fix double counted jetty/numOpenConnections metric for ssl connections

* tests

* more better

* style
2020-04-03 23:29:23 -07:00
Chi Cao Minh b5419962f0
Suppress CVEs for jackson-mapper-asl:1.9.13 (#9604)
The jackson-mapper-asl:1.9.13 CVEs via curator-x-discovery are all
suppressed for now as fixing them requires updating the curator version.
2020-04-03 10:33:52 -07:00
Maytas Monsereenusorn 1852bf33ea
Add Integration Test for functionality of kinesis ingestion (#9576)
* kinesis IT

* Kinesis IT

* Kinesis IT

* Kinesis IT

* Kinesis IT

* Kinesis IT

* Kinesis IT

* Kinesis IT

* Kinesis IT

* Kinesis IT

* Kinesis IT

* Kinesis IT

* Kinesis IT

* Kinesis IT

* Kinesis IT

* fix kinesis timeout

* Kinesis IT

* Kinesis IT

* fix checkstyle

* Kinesis IT

* address comments

* fix checkstyle
2020-04-03 09:45:22 -07:00
Suneet Saldanha af3337dac8
DruidInputSource can add new dimensions during re-ingestion (#9590)
* WIP integration tests

* Add integration test for ingestion with transformSpec

* WIP almost working tests

* Add ignored tests

* checkstyle stuff

* remove newPage from index task ingestion spec

* more test cleanup

* still not quite working

* Actually disable the tests

* working tests

* fix codestyle

* dont use junit in integration tests

* actually fix the bug

* fix checkstyle

* bring index tests closer to reindex tests
2020-04-02 17:32:31 -07:00
Jonathan Wei dbaabdd247
Fix for [CVE-2020-1958]: Apache Druid LDAP injection vulnerability (#9600) 2020-04-01 14:52:01 -07:00
zachjsh e855c7fe1b
Allow Cloud Deep Storage configs without segment bucket or path specified (#9588)
* Allow Cloud SegmentKillers to be instantiated without segment bucket or path

This change fixes a recently introduced bug that causes ingestion
to fail if data is ingested from one of the supported cloud storages
(Azure, Google, S3) and the user is using another type of storage
for deep storage. In this case, all segment killer implementations
are instantiated. A recent change forced a dependency between
the supported cloud storage type SegmentKiller classes and the
deep storage configuration for that storage type being set, which
forced the deep storage bucket and prefix to be non-null. This caused
a NullPointerException to be thrown when instantiating the
SegmentKiller classes during ingestion.

To fix this issue, the respective deep storage segment configs for the
cloud storage types supported in Druid are now allowed to have nullable
bucket and prefix configurations.

* Allow Google deep storage bucket to be null
2020-04-01 11:57:32 -07:00
Jihoon Son 0da8ffc3ff
Bump up development version to 0.19.0-SNAPSHOT (#9586) 2020-03-30 16:24:04 -07:00
Himanshu 839379246a
remove commons-lang3 usage from DoubleMeanAggregatorFactoryTest (#9578) 2020-03-30 14:31:50 -07:00
Clint Wylie fa5da6693c
add lane enforcement for joinish queries (#9563)
* add lane enforcement for joinish queries

* oops

* style

* review stuffs
2020-03-30 11:58:16 -07:00
Chi Cao Minh c0195a19e4
Fix HDFS input source split (#9574)
Fixes an issue where splitting an HDFS input source for use in native
parallel batch ingestion would cause the subtasks to get a split with an
invalid HDFS path.
2020-03-28 15:45:57 -07:00
Stanislav Poryadnyi 9081b5f25c
fix MAX_INTERMEDIATE_SIZE for DoubleMeanHolder (#9568)
* fix MAX_INTERMEDIATE_SIZE for DoubleMeanHolder

* byte[] type handling in deserialize and finalizeComputation for DoubleMeanAggregatorFactory

* DoubleMeanAggregatorFactory tests: Max Intermediate Size, Deserialize, finalizeComputation

* moved byte[] check to first position

Co-authored-by: Stanislav <S.Poryadnyi@abcconsulting.ru>
2020-03-27 22:26:31 -07:00
Xavier Léauté b4ad3d0d88
fix nullhandling exceptions related to test ordering (#9570)
* fix nullhandling exceptions related to test ordering

Tests might get executed in different order depending on the maven
version and the test environment. This may lead to "NullHandling module
not initialized" errors for some tests where we do not initialize
null-handling explicitly.

* use InitializedNullHandlingTest
2020-03-27 09:46:31 -07:00
Clint Wylie 2c49f6d89a
error on value counter overflow instead of writing sad segments (#9559) 2020-03-26 16:54:48 -07:00
Suneet Saldanha e6e2836b0e
Instructions to run integration tests against quickstart (#9560)
* Instructions to run integration tests against quickstart

* Address review comments

* actually exclude the test group

* Revert "actually exclude the test group"

This reverts commit 66f366409e.

* update comment
2020-03-26 13:22:53 -07:00
Suneet Saldanha 55c08e0746
DruidSegmentReader should work if timestamp is specified as a dimension (#9530)
* DruidSegmentReader should work if timestamp is specified as a dimension

* Add integration tests

Tests for compaction and re-indexing a datasource with the timestamp column

* Instructions to run integration tests against quickstart

* address pr
2020-03-25 13:47:34 -07:00
Maytas Monsereenusorn 3f521943fc
S3 ingestion spec should not use the default credentials provider chain when environment value password provider is misconfigured. (#9552)
* fix s3 optional cred

* S3 ingestion spec uses the default credentials provider chain when environment value password provider is misconfigured.

* fix failing test
2020-03-24 15:09:02 -07:00
mcbrewster e1b201c279
Add view values to lookup actions menu (#9549)
* add test, add query

* jest -u

* add limit, explicitly get columns, remove map
2020-03-24 09:57:33 -07:00
Neil Volungis 0ac875a8b4
Update docker.md readme to note memory requirements (#9529)
* Update docker.md readme to note memory requirements

* Fix grammatical error

Co-Authored-By: Suneet Saldanha <44787917+suneet-s@users.noreply.github.com>
2020-03-24 03:33:29 -07:00
Maytas Monsereenusorn e97695d9da
fix Hadoop ingestion fails due to error 'JavaScript is disabled' on certain config (#9553)
* fix Hadoop ingestion fails due to error 'JavaScript is disabled', if determine partition hadoop job is run

* add test

* fix checkstyle

* address comments

* address comments
2020-03-23 23:09:21 -07:00
JaeGeun 57018adf23
change backtick() and fix broken links (#9550) 2020-03-23 20:57:03 -07:00
Clint Wylie 2bc29543e5
modify QueryCapacityExceededException to provide better messaging (#9547)
* modify QueryCapacityExceededException to provide better messaging

* style
2020-03-23 20:05:11 -07:00
Clint Wylie bf85ea19b2
roaring bitmaps by default (#9548)
* it is finally time

* fix it

* more docs

* fix doc
2020-03-23 18:15:57 -07:00
Himanshu 5604ac7963
druid extension for OpenID Connect auth using pac4j lib (#8992)
* druid pac4j security extension for OpenID Connect OAuth 2.0 authentication

* update version in druid-pac4j pom

* introducing unauthorized resource filter

* authenticated but authorized /unified-webconsole.html

* use httpReq.getRequestURI() for matching callback path

* add documentation

* minor doc addition

* license file updates

* make dependency analyze succeed

* fix doc build

* hopefully fixes doc build

* hopefully fixes license check build

* yet another try on fixing license build

* revert unintentional changes to website folder

* update version to 0.18.0-SNAPSHOT

* check session and its expiry on each request

* add crypto service

* code for encrypting the cookie

* update doc with cookiePassphrase

* update license yaml

* make sessionstore in Pac4jFilter private non static

* make Pac4jFilter fields final

* okta: use sha256 for hmac

* remove incubating

* add UTs for crypto util and session store impl

* use standard charsets

* add license header

* remove unused file

* add org.objenesis.objenesis to license.yaml

* a bit of nit changes in CryptoService and embedding EncryptionResult for clarity

* rename alg to cipherAlgName

* take cipher alg name, mode and padding as input

* add java doc for CryptoService and make it more understandable

* another UT for CryptoService

* cache pac4j Config

* use generics clearly in Pac4jSessionStore

* update cookiePassphrase doc to mention PasswordProvider

* mark stuff Nullable where appropriate in Pac4jSessionStore

* update doc to mention jdbc

* add error log on reaching callback resource

* javadoc for Pac4jCallbackResource

* introduce NOOP_HTTP_ACTION_ADAPTER

* add correct module name in license file

* correct extensions folder name in licenses.yaml

* replace druid-kubernetes-extensions to druid-pac4j

* cache SecureRandom instance

* rename UnauthorizedResourceFilter to AuthenticationOnlyResourceFilter
2020-03-23 18:15:45 -07:00
Vadim Ogievetsky cdf4a26904
clean up spec before reopening in data loader (#9536) 2020-03-23 16:57:51 -07:00
Clint Wylie d8833316c4
fix broken links (#9537)
* fix broken links

* missing /

* adjustment
2020-03-22 17:41:18 -07:00
Gian Merlino 54c9325256
SQL support for joins on subqueries. (#9545)
* SQL support for joins on subqueries.

Changes to SQL module:

- DruidJoinRule: Allow joins on subqueries (left/right are no longer
  required to be scans or mappings).
- DruidJoinRel: Add cost estimation code for joins on subqueries.
- DruidSemiJoinRule, DruidSemiJoinRel: Removed, since DruidJoinRule can
  handle this case now.
- DruidRel: Remove Nullable annotation from toDruidQuery, because
  it is no longer needed (it was used by DruidSemiJoinRel).
- Update Rules constants to reflect new rules available in our current
  version of Calcite. Some of these are useful for optimizing joins on
  subqueries.
- Rework cost estimation to be in terms of cost per row, and place all
  relevant constants in CostEstimates.

Other changes:

- RowBasedColumnSelectorFactory: Don't set hasMultipleValues. The lack
  of isComplete is enough to let callers know that columns might have
  multiple values, and explicitly setting it to true causes
  ExpressionSelectors to think it definitely has multiple values, and
  treat the inputs as arrays. This behavior interfered with some of the
  new tests that involved queries on lookups.
- QueryContexts: Add maxSubqueryRows parameter, and use it in druid-sql
  tests.

* Fixes for tests.

* Adjustments.
2020-03-22 16:43:55 -07:00
Maytas Monsereenusorn 5f127a1829
Add integration tests for HDFS (#9542)
* HDFS IT

* HDFS IT

* HDFS IT

* fix checkstyle
2020-03-20 15:46:08 -07:00
zachjsh 4870ad7b56
Azure deep storage does not work with datasource name containing non-ASCII chars (#9525)
* Azure deep storage does not work with datasource name containing non-ASCII chars

Fixed a bug where recording the segment file location fails when
using Azure Deep Storage if the datasource name contains any
non-ASCII characters.

* update jacoco thresholds

* resolve merge conflicts

* address review comments
2020-03-19 12:32:35 -07:00
Jihoon Son 1e667362eb
Do not use UnmodifiableList in auto compaction (#9535) 2020-03-19 11:43:33 -07:00
Clint Wylie 68013fbc64
fix issue where total limit was being applied even when not configured (#9534)
* fix issue where total limit was being applied even when not configured

* fix inspection

* add reserved lane name check to manual laning strategy
2020-03-18 18:05:59 -07:00
zachjsh 838735411f
Ability to Delete task logs and segments from Google Storage (#9519)
* Ability to Delete task logs and segments from Google Storage

* implement ability to delete all task logs, or all task logs
  written before a particular date, when written to Google storage

* implement ability to delete all segments from Google deep storage

* Address review comments
2020-03-18 18:00:43 -07:00
zachjsh b18dd2b7a9
Ability to Delete task logs and segments from Azure Storage (#9523)
* Ability to Delete task logs and segments from Azure Storage

* implement ability to delete all task logs, or all task logs
  written before a particular date, when written to Azure storage

* implement ability to delete all segments from Azure deep storage

* Address review comments
2020-03-18 17:59:17 -07:00
Vadim Ogievetsky 3b536eea7f
Web console: expose props for S3 (#9432)
* expose props for S3

* added env inputs

* add scary warning

* use .password

* put the warning front and center

* Update web-console/src/views/load-data-view/load-data-view.tsx

Co-Authored-By: Suneet Saldanha <44787917+suneet-s@users.noreply.github.com>

* let prettier rewrap the text

Co-authored-by: Suneet Saldanha <44787917+suneet-s@users.noreply.github.com>
2020-03-18 15:32:12 -07:00
Gian Merlino 1ef25a438f
Broker: Add ability to inline subqueries. (#9533)
* Broker: Add ability to inline subqueries.

The main changes:

- ClientQuerySegmentWalker: Add ability to inline queries.
- Query: Add "getSubQueryId" and "withSubQueryId" methods.
- QueryMetrics: Add "subQueryId" dimension.
- ServerConfig: Add new "maxSubqueryRows" parameter, which is used by
  ClientQuerySegmentWalker to limit how many rows can be inlined per
  query.
- IndexedTableJoinMatcher: Allow creating keys on top of unknown types,
  by assuming they are strings. This is useful because not all types are
  known for fields in query results.
- InlineDataSource: Store RowSignature rather than component parts. Add
  more zealous "equals" and "hashCode" methods to ease testing.
- Moved QuerySegmentWalker test code from CalciteTests and
  SpecificSegmentsQueryWalker in druid-sql to QueryStackTests in
  druid-server. Use this to spin up a new ClientQuerySegmentWalkerTest.

* Adjustments from CI.

* Fix integration test.
2020-03-18 15:06:45 -07:00
Maytas Monsereenusorn 4c620b8f1c
Adding s3, gcs, azure integration tests (#9501)
* exclude pulling s3 segments for tests that don't need them

* fix script

* fix script

* fix script

* add s3 test

* refactor sample data script

* add tests

* add tests

* add license header

* fix failing tests

* change bucket and path to config

* update integration test readme

* fix typo
2020-03-17 03:08:44 -07:00