26 Commits

Author SHA1 Message Date
Clint Wylie fb26a1093d
discover nested columns when using nested column indexer for schemaless ingestion (#13672)
* discover nested columns when using nested column indexer for schemaless
* move useNestedColumnIndexerForSchemaDiscovery from AppendableIndexSpec to DimensionsSpec
2023-01-18 12:57:28 -08:00
Paul Rogers fa493f1ebc
Convert from DRUID_INTEGRATION_TEST_INDEXER to USE_INDEXER (#13684)
The old ITs use DRUID_INTEGRATION_TEST_INDEXER. The new ones use the
USE_INDEXER env var passed in from the build environment.
2023-01-18 08:51:42 -08:00
Paul Rogers 22630b0aab
Much improved table functions (#13627)
Much improved table functions

* Revises properties, definitions in the catalog
* Adds a "table function" abstraction to model such functions
* Specific functions for HTTP, inline, local and S3.
* Extended SQL types in the catalog
* Restructure external table definitions to use table functions
* EXTEND syntax for Druid's extern table function
* Support for array-valued table function parameters
* Support for array-valued SQL query parameters
* Much new documentation
2023-01-17 08:41:57 -08:00
Paul Rogers ed623d626f
Support both Indexer and MiddleManager in ITs (#13660)
Support both indexer and MM in ITs

Support for the DRUID_INTEGRATION_TEST_INDEXER variable
Conditional client cluster configuration
Cleanup of OVERRIDE_ENV file handling
Enforce setting of test-specific env vars
Cleanup of unused bits
2023-01-14 14:34:06 -08:00
abhagraw 5ef689fc3f
Cloud deep storage tests in new IT framework (S3, GCS, Azure) (#13535)
* MSQ s3 deep storage tests

* Fix license check

* Getting config values from env variables

* Added s3TestUtils

* Merged AbstractITSQLBasedIngestionTest with AbstractITBatchIndexTest

* Fixing license issues

* Fixing checkstyle errors

* Fix spotbug errors

* Update s3util name in other files

* GCS and Azure deep storage tests

* Fix license and checkstyle errors

* Fix dependency error

* fix intellij check errors

* Copy credentials file in all containers

* Refactor and gcs file upload fix

* Fixing dependency check errors and codeQL warnings

* Fixing checkstyle errors

* Fixing intellij inspection errors

* Removing unrequired exceptions

* Addressing comments
2023-01-11 09:43:44 +05:30
Karan Kumar 56076d33fb
Worker retry for MSQ task (#13353)
* Initial commit.

* Fixing error message in retry exceeded exception

* Cleaning up some code

* Adding some test cases.

* Adding java docs.

* Finishing up state test cases.

* Adding some more java docs and fixing spot bugs, intellij inspections

* Fixing intellij inspections and added tests

* Documenting error codes

* Migrate current integration batch tests to equivalent MSQ tests (#13374)

* Migrate current integration batch tests to equivalent MSQ tests using new IT framework

* Fix build issues

* Trigger Build

* Adding more tests and addressing comments

* fixBuildIssues

* fix dependency issues

* Parameterized the test and addressed comments

* Addressing comments

* fixing checkstyle errors

* Addressing comments

* Adding ITTest which kills the worker abruptly

* Review comments phase one

* Adding doc changes

* Adjusting for single threaded execution.

* Adding Sequential Merge PR state handling

* Merge things

* Fixing checkstyle.

* Adding new context param for fault tolerance.
Adding stale task handling in sketchFetcher.
Adding UT's.

* Merge things

* Merge things

* Adding parameterized tests
Created separate module for faultToleranceTests

* Adding missed files

* Review comments and fixing tests.

* Documentation things.

* Fixing IT

* Controller impl fix.

* Fixing racy WorkerSketchFetcherTest.java exception handling.

Co-authored-by: abhagraw <99210446+abhagraw@users.noreply.github.com>
Co-authored-by: Karan Kumar <cryptoe@karans-mbp.lan>
2023-01-11 07:38:29 +05:30
imply-cheddar a8ecc48ffe
Validate response headers and fix exception logging (#13609)
* Validate response headers and fix exception logging

A class of QueryExceptions was throwing away its
causes, making it really hard to determine what
went wrong, specifically in the SQL planner.
Fix that and adjust tests to do more validation
of response headers as well.

We allow 404s and 307s to be returned even when authorization
has not been validated, but other responses get converted to 403.
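
The status rule described above can be sketched as follows. This is only an illustration of the stated behavior, not the code from the commit; the names `effective_status` and `authorization_checked` are hypothetical.

```python
# Hypothetical sketch of the rule in the commit message: when authorization
# was never validated, only 404 and 307 pass through; everything else is 403.
PASS_THROUGH_STATUSES = frozenset({307, 404})


def effective_status(status: int, authorization_checked: bool) -> int:
    """Return the status code that would actually be sent to the client."""
    if authorization_checked or status in PASS_THROUGH_STATUSES:
        return status
    return 403
```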
2023-01-05 14:15:15 -08:00
abhagraw 365474ff1d
New IT Framework - InputSource and InputFormat Tests (#13597)
* New IT Framework - InputSource and InputFormat Tests

* Fixing checkstyle errors

* Updating InputSource setup

* Updating queries to use druid DB

* Making metadata setup queries to be idempotent

* Restore intellij files
2023-01-04 10:40:05 +05:30
abhagraw f6f625ee08
MSQ Reindex IT (#13433)
* MSQ Reindex IT

* Fixing checkstyle errors

* Addressing comments

* Addressing comments
2022-12-01 12:13:23 +05:30
Adarsh Sanjeev 2f3b97194f
Fix hardcoded version in pom file (#13460)
2022-12-01 10:10:04 +05:30
Kashif Faraz 7cf761cee4
Prepare master branch for next release, 26.0.0 (#13401)
* Prepare master branch for next release, 26.0.0

* Use docker image for druid 24.0.1

* Fix version in druid-it-cases pom.xml
2022-11-22 15:31:01 +05:30
Adarsh Sanjeev 280a0f7158
Add sequential sketch merging to MSQ (#13205)
* Add sketch fetching framework

* Refactor code to support sequential merge

* Update worker sketch fetcher

* Refactor sketch fetcher

* Refactor sketch fetcher

* Add context parameter and threshold to trigger sequential merge

* Fix test

* Add integration test for non sequential merge

* Address review comments

* Address review comments

* Address review comments

* Resolve maxRetainedBytes

* Add new classes

* Renamed key statistics information class

* Rename fetchStatisticsSnapshotForTimeChunk function

* Address review comments

* Address review comments

* Update documentation and add comments

* Resolve build issues

* Resolve build issues

* Change worker APIs to async

* Address review comments

* Resolve build issues

* Add null time check

* Update integration tests

* Address review comments

* Add log messages and comments

* Resolve build issues

* Add unit tests

* Add unit tests

* Fix timing issue in tests
2022-11-22 09:56:32 +05:30
abhagraw 5172d76a67
Migrate current integration batch tests to equivalent MSQ tests (#13374)
* Migrate current integration batch tests to equivalent MSQ tests using new IT framework

* Fix build issues

* Trigger Build

* Adding more tests and addressing comments

* fixBuildIssues

* fix dependency issues

* Parameterized the test and addressed comments

* Addressing comments

* fixing checkstyle errors

* Addressing comments
2022-11-21 09:12:02 +05:30
Paul Rogers 81d005f267
Druid Catalog basics (#13165)
Druid catalog basics

Catalog object model for tables, columns
Druid metadata DB storage (as an extension)
REST API to update the catalog (as an extension)
Integration tests
Model only: no planner integration yet
2022-11-12 15:30:22 -08:00
Kashif Faraz fd7864ae33
Improve run time of coordinator duty MarkAsUnusedOvershadowedSegments (#13287)
In clusters with a large number of segments, the duty `MarkAsUnusedOvershadowedSegments`
can take a very long time to finish. This is because of the costly invocation of
`timeline.isOvershadowed` which is done for every used segment in every coordinator run.

Changes
- Use `DataSourceSnapshot.getOvershadowedSegments` to get all overshadowed segments
- Iterate over this set instead of all used segments to identify segments that can be marked as unused
- Mark segments as unused in the DB in batches rather than one at a time
- Refactor: Add class `SegmentTimeline` for ease of use and readability while using a
`VersionedIntervalTimeline` of segments.
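
The general shape of the change (iterate only over the overshadowed set, then update in batches) might look like the sketch below. It is a language-neutral illustration in Python, not the Druid implementation; `can_mark_unused`, `mark_batch`, and the batch size are hypothetical stand-ins.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def chunked(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `size` items."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def mark_overshadowed_as_unused(
    overshadowed: Iterable[str],             # precomputed overshadowed segment ids
    can_mark_unused: Callable[[str], bool],  # hypothetical eligibility check
    mark_batch: Callable[[List[str]], None], # hypothetical batched DB update
    batch_size: int = 100,                   # assumed value, for illustration only
) -> int:
    """Walk only the overshadowed set and mark eligible segments in batches."""
    candidates = (seg for seg in overshadowed if can_mark_unused(seg))
    marked = 0
    for batch in chunked(candidates, batch_size):
        mark_batch(batch)
        marked += len(batch)
    return marked
```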
2022-11-01 20:19:52 +05:30
Paul Rogers f4dcc52dac
Redesign QueryContext class (#13071)
We introduce two new configuration keys that refine the query context security model controlled by druid.auth.authorizeQueryContextParams. When that value is set to true, two other configuration options become available:

druid.auth.unsecuredContextKeys: The set of query context keys that do not require a security check. Use this for the "white-list" of keys to allow. All other keys go through the existing context key security checks.
druid.auth.securedContextKeys: The set of query context keys that do require a security check. Use this when you want to allow all but a specific set of keys: only these keys go through the existing context key security checks.
Both are set using JSON list format:

druid.auth.securedContextKeys=["secretKey1", "secretKey2"]
You generally set one value or the other. If both are set, unsecuredContextKeys acts as exceptions to securedContextKeys.

In addition, Druid defines two query context keys which always bypass checks because Druid uses them internally:

sqlQueryId
sqlStringifyArrays
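
One way to read the rules above is the small decision function below. It is a sketch of the described behavior only, not Druid's code: the function name is hypothetical, it assumes druid.auth.authorizeQueryContextParams is true, and it treats an option that is not configured as `None`.

```python
from typing import Collection, Optional

# Keys the commit message says always bypass checks.
INTERNAL_KEYS = frozenset({"sqlQueryId", "sqlStringifyArrays"})


def requires_security_check(
    key: str,
    unsecured_keys: Optional[Collection[str]] = None,  # druid.auth.unsecuredContextKeys
    secured_keys: Optional[Collection[str]] = None,    # druid.auth.securedContextKeys
) -> bool:
    """Decide whether a query context key goes through the security check."""
    if key in INTERNAL_KEYS:
        return False
    # Unsecured keys are exempt, and act as exceptions to secured_keys.
    if unsecured_keys and key in unsecured_keys:
        return False
    # With a secured list configured, only the listed keys are checked.
    if secured_keys is not None:
        return key in secured_keys
    # Otherwise every remaining key is checked.
    return True
```

For example, with only securedContextKeys set to ["secretKey1", "secretKey2"], a key such as "timeout" would skip the check while "secretKey1" would not.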
2022-10-15 11:02:11 +05:30
Laksh Singla 728745a1d3
Add IT for MSQ task engine using the new IT framework (#12992)
* first test, serde causing problems

* serde working

* insert and select check

* Add cluster annotations for MSQ test cases

* Add cluster config for MSQ

* Add MSQ config to the pom.xml

* cleanup unnecessary changes

* Remove model classes

* Comments, checkstyle, check queries from file

* fixup test case name

* build failure fix

* review changes

* build failure fix

* Trigger Build

* Log the mismatch in QueryResultsVerifier

* Trigger Build

* Change the signature of the results verifier

* review changes

* LGTM fix

* build, change pom

* Trigger Build

* Trigger Build

* trigger build with minimal pom changes

* guice fix in tests

* travis.yml
2022-09-22 16:09:47 +05:30
Vadim Ogievetsky b9edfe34a4
be consistent about referring to the web console by its name (#13118)
2022-09-19 15:02:17 -07:00
Adam Peck ee22663dd3
Add interpolation to JsonConfigurator (#13023)
* Add interpolation to JsonConfigurator

* Fix checkstyle

* Fix tests by removing commons-text override

* Add back commons-text without version

* Remove unused hadoopDir configs

* Move some stuff to hopefully pass coverage
2022-09-07 12:48:01 +05:30
Clint Wylie a3a377e570
more consistent expression error messages (#12995)
* more consistent expression error messages

* review stuff

* add NamedFunction for Function, ApplyFunction, and ExprMacro to share common stuff

* fixes

* add expression transform name to transformer failure, better parse_json error messaging
2022-09-06 23:21:38 -07:00
Abhishek Agarwal 7d332c6f6a
Suppress false CVEs (#13026)
* Suppress CVEs

* Add more suppressions
2022-09-06 11:46:56 +05:30
abhagraw f3c47cf68c
Building druid-it-tools and running for travis in it.sh (#12957)
* Building druid-it-tools and running for travis in it.sh

* Addressing comments

* Updating druid-it-image pom to point to correct it-tools

* Updating all it-tools references to druid-it-tools

* Adding dist back to it.sh travis

* Trigger Build

* Disabling batchIndex tests and commenting out user specific code

* Fixing checkstyle and intellij inspection errors

* Replacing tabs with spaces in it.sh

* Enabling old batch index tests with indexer
2022-08-30 12:48:07 +05:30
Abhishek Agarwal 618757352b
Bump up the version to 25.0.0 (#12975)
* Bump up the version to 25.0.0

* Fix the version in console
2022-08-29 11:27:38 +05:30
Adam Peck 21b73bde20
Update Curator to 5.3.0 (#12939)
* Update Curator to 5.3.0

* Update licenses.yaml

* Fix inspections + add tests.

* Fix checkstyle

* Another intellij inspection fix

* Update curator exclusions

* Cleanup new exhibitor references

* Remove unused dep and checkstyle fix
2022-08-26 18:23:40 -07:00
Paul Rogers cfed036091
Add the new integration test framework (#12368)
This commit is a first draft of the revised integration test framework which provides:
- A new directory, integration-tests-ex that holds the new integration test structure. (For now, the existing integration-tests is left unchanged.)
- Maven module druid-it-tools to hold code placed into the Docker image.
- Maven module druid-it-image to build the Druid-only test image from the tarball produced in distribution. (Dependencies live in their "official" image.)
- Maven module druid-it-cases that holds the revised tests and the framework itself. The framework includes file-based test configuration, test-specific clients, test initialization and updated versions of some of the common test support classes.

The integration test setup is primarily a huge mass of details. This approach refactors many of those details: from how the image is built and configured to how the Docker Compose scripts are structured to test configuration. An extensive set of "readme" files explains those details. Rather than repeat that material here, please consult those files for explanations.
2022-08-24 17:03:23 +05:30
Paul Rogers 4706a4c572
Docker build for the revised ITs (#12707)
* Docker build for the revised ITs

* Fix POM versions

* Update comments from review suggestions
2022-08-10 14:17:33 +05:30