druid

Commit Graph

Author	SHA1	Message	Date
Vishesh Garg	e43bb74c3a	Add MSQ Durable Storage Connector for Google Cloud Storage and change current Google Cloud Storage client library (#15398) The PR addresses 2 things: (1) Add MSQ durable storage connector for GCS. (2) Change GCS client library from the old Google API Client Library to the recommended Google Cloud Client Library. Ref: https://cloud.google.com/apis/docs/client-libraries-explained	2023-12-14 07:34:49 +05:30
Ankit Kothari	8735d023a1	Add experimental support for first/last for double/float/long #10702 (#14462) Add experimental support for doubleLast, doubleFirst, floatLast, floatFirst, longLast and longFirst.	2023-12-12 11:36:51 +05:30
Laksh Singla	5f86072456	Prepare master for Druid 29 (#15121 ) Prepare master for Druid 29	2023-10-11 10:33:45 +05:30
Tejaswini Bandlamudi	dec6a0aa14	Update google client apis to latest version (#14414 ) Currently Druid is using google apis client 1.26.0 version and google-oauth-client-1.26.0.jar in particular is bringing following CVEs CVE-2020-7692, CVE-2021-22573. Despite the CVEs being false positives, they're causing red security scans on Druid distribution. Hence updating the version to latest version with these CVE fixes.	2023-09-11 12:27:23 +05:30
Clint Wylie	5d1412949e	enable sql compatible null handling mode by default (#14792 ) * enable sql compatible null handling mode by default * fix bug with string first/last aggs when druid.generic.useDefaultValueForNull=false	2023-08-21 20:07:13 -07:00
Clint Wylie	6b14dde50e	deprecate config-magic in favor of json configuration stuff (#14695 ) * json config based processing and broker merge configs to deprecate config-magic	2023-08-16 18:23:57 -07:00
dependabot[bot]	e55fe67535	Bump apache.curator.version from 5.4.0 to 5.5.0 (#14843 ) * Bump apache.curator.version from 5.4.0 to 5.5.0 Bumps `apache.curator.version` from 5.4.0 to 5.5.0. Updates `org.apache.curator:curator-client` from 5.4.0 to 5.5.0 - [Commits](https://github.com/apache/curator/compare/apache-curator-5.4.0...apache-curator-5.5.0) Updates `org.apache.curator:curator-framework` from 5.4.0 to 5.5.0 - [Commits](https://github.com/apache/curator/compare/apache-curator-5.4.0...apache-curator-5.5.0) Updates `org.apache.curator:curator-recipes` from 5.4.0 to 5.5.0 - [Commits](https://github.com/apache/curator/compare/apache-curator-5.4.0...apache-curator-5.5.0) Updates `org.apache.curator:curator-x-discovery` from 5.4.0 to 5.5.0 - [Commits](https://github.com/apache/curator/compare/apache-curator-5.4.0...apache-curator-5.5.0) Updates `org.apache.curator:curator-test` from 5.4.0 to 5.5.0 - [Commits](https://github.com/apache/curator/compare/apache-curator-5.4.0...apache-curator-5.5.0) --- updated-dependencies: - dependency-name: org.apache.curator:curator-client dependency-type: direct:production update-type: version-update:semver-minor - dependency-name: org.apache.curator:curator-framework dependency-type: direct:production update-type: version-update:semver-minor - dependency-name: org.apache.curator:curator-recipes dependency-type: direct:production update-type: version-update:semver-minor - dependency-name: org.apache.curator:curator-x-discovery dependency-type: direct:production update-type: version-update:semver-minor - dependency-name: org.apache.curator:curator-test dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> * update licenses.yaml --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Xavier Léauté <xvrl@apache.org>	2023-08-16 07:36:58 -07:00
AmatyaAvadhanula	0412f40d36	Prepare master branch for next release, 28.0.0 (#14595 ) * Prepare master branch for next release, 28.0.0	2023-07-18 09:22:30 +05:30
Karan Kumar	89aee6caaa	Fixing an issue in sequential merge (#14574 ) * Fixing an issue in sequential merge where workers without any partial key statistics would get stuck because controller did not change the worker state. * Removing empty check * Adding IT for MSQ sequential bug fix.	2023-07-12 22:05:30 +05:30
Gian Merlino	3ff51487b7	Add ZooKeeper connection state alerts and metrics. (#14333 ) * Add ZooKeeper connection state alerts and metrics. - New metric "zk/connected" is an indicator showing 1 when connected, 0 when disconnected. - New metric "zk/disconnected/time" measures time spent disconnected. - New alert when Curator connection state enters LOST or SUSPENDED. * Use right GuardedBy. * Test fixes, coverage. * Adjustment. * Fix tests. * Fix ITs. * Improved injection. * Adjust metric name, add tests.	2023-07-12 09:34:28 -07:00
Jan Werner	95115d722a	CVE fixes - update of multiple dependencies. (#14519) Apache Druid brings multiple direct and transitive dependencies that are affected by a plethora of CVEs. This PR attempts to update all the dependencies that did not require code refactoring. This PR modifies pom files, license file and OWASP Dependency Check suppression file.	2023-07-07 20:27:30 +05:30
Abhishek Agarwal	f8f2fe8b7b	Skip tests based on files changed in the PR (#14445 ) Our CI system has a lot of tests. And much of this testing is really unnecessary for most of the PRs. This PR adds some checks so we can skip these expensive tests when we know they are not necessary.	2023-06-22 12:27:23 +05:30
Kashif Faraz	50461c3bd5	Enable smartSegmentLoading on the Coordinator (#13197) This commit does a complete revamp of the coordinator to address problem areas: - Stability: Fix several bugs, add capabilities to prioritize and cancel load queue items - Visibility: Add new metrics, improve logs, revamp `CoordinatorRunStats` - Configuration: Add dynamic config `smartSegmentLoading` to automatically set optimal values for all segment loading configs such as `maxSegmentsToMove`, `replicationThrottleLimit` and `maxSegmentsInNodeLoadingQueue`. Changed classes: - Add `StrategicSegmentAssigner` to make assignment decisions for load, replicate and move - Add `SegmentAction` to distinguish between load, replicate, drop and move operations - Add `SegmentReplicationStatus` to capture current state of replication of all used segments - Add `SegmentLoadingConfig` to contain recomputed dynamic config values - Simplify classes `LoadRule`, `BroadcastRule` - Simplify the `BalancerStrategy` and `CostBalancerStrategy` - Add several new methods to `ServerHolder` to track loaded and queued segments - Refactor `DruidCoordinator` Impact: - Enable `smartSegmentLoading` by default. With this enabled, none of the following dynamic configs need to be set: `maxSegmentsToMove`, `replicationThrottleLimit`, `maxSegmentsInNodeLoadingQueue`, `useRoundRobinSegmentAssignment`, `emitBalancingStats` and `replicantLifetime`. - Coordinator reports richer metrics and produces cleaner and more informative logs - Coordinator uses an unlimited load queue for all servers, and makes better assignment decisions	2023-06-19 14:27:35 +05:30
Tejaswini Bandlamudi	8e4f003f02	Fix flaky Revised ITs failures on GHA runners (#14348 ) * Fix read timed out failures and remove containers before test * remove containers before loading images * add labels to IT docker containers, download stable minio docker image release instead of latest	2023-06-05 18:58:54 +05:30
Paul Rogers	3c0983c8e9	Extend the IT framework to allow tests in extensions (#13877 ) The "new" IT framework provides a convenient way to package and run integration tests (ITs), but only for core modules. We have a use case to run an IT for a contrib extension: the proposed gRPC query extension. This PR provides the IT framework functionality to allow non-core ITs.	2023-05-15 20:29:51 +05:30
Gian Merlino	eeed5ed7e2	MSQ: Use the same result coercion routines as the regular SQL endpoint. (#14046 ) * MSQ: Use the same result coercion routines as the regular SQL endpoint. The main changes are to move NativeQueryMaker.coerce to SqlResults, and to formally make the list of sqlTypeNames from the MSQ results reports use SqlTypeNames. - Change the default to MSQ-compatible rather than MSQ-incompatible. The explicit marker function is now "notMsqCompatible()".	2023-04-15 06:56:23 +05:30
Clint Wylie	1aef72aa7e	Bump up the version in pom to 27.0.0 in preparation of release (#14051 )	2023-04-10 14:56:59 +05:30
Paul Rogers	030ed911d4	Temporarily revert extended table functions for Druid 26 (#14019 )	2023-04-05 21:09:33 -07:00
abhagraw	eb31207402	Using MinIO to run S3DeepStorage ITs (#13997) * Using MinIO to run S3DeepStorage ITs * Adding S3DeepStorageTest to github actions revised ITs	2023-03-30 12:15:53 -07:00
abhagraw	c7d864d3bc	Update container creation in AzureTestUtil.java (#13911 ) * 1. Handling deletion/creation of container created during the previously run test in AzureTestUtil.java. 2. Adding/updating log messages and comments in Azure and GCS deep storage tests.	2023-03-16 11:04:43 +05:30
Tejaswini Bandlamudi	7103cb4b9d	Removes FiniteFirehoseFactory and its implementations (#12852 ) The FiniteFirehoseFactory and InputRowParser classes were deprecated in 0.17.0 (#8823) in favor of InputSource & InputFormat. This PR removes the FiniteFirehoseFactory and all its implementations along with classes solely used by them like Fetcher (Used by PrefetchableTextFilesFirehoseFactory). Refactors classes including tests using FiniteFirehoseFactory to use InputSource instead. Removing InputRowParser may not be as trivial as many classes that aren't deprecated depends on it (with no alternatives), like EventReceiverFirehoseFactory. Hence FirehoseFactory, EventReceiverFirehoseFactory, and Firehose are marked deprecated.	2023-03-02 18:07:17 +05:30
Clint Wylie	1d8fff4096	sampler + type detection = bff (#13711 ) * sampler + type detection = bff * split logical and physical dimensions, tidy up	2023-02-28 04:14:30 -08:00
Tejaswini Bandlamudi	e2461c21c4	fix flaky BatchIndex IT failures. (#13855 )	2023-02-27 17:23:14 -08:00
Adarsh Sanjeev	aceeac91d4	Fix MSQ IT test (#13808 )	2023-02-22 08:14:46 -08:00
Paul Rogers	5dadbdf4d0	Generate the IT docker-compose.yaml files (#13669 ) Generate IT docker-compose.sh files Generates test-specific docker-compose.sh files using a simple Python template script.	2023-02-21 15:03:02 -08:00
Clint Wylie	08b5951cc5	merge druid-core, extendedset, and druid-hll into druid-processing to simplify everything (#13698 ) * merge druid-core, extendedset, and druid-hll into druid-processing to simplify everything * fix poms and license stuff * mockito is evil * allow reset of JvmUtils RuntimeInfo if tests used static injection to override	2023-02-17 14:27:41 -08:00
Paul Rogers	333196d207	Code cleanup & message improvements (#13778 ) * Misc cleanup edits Correct spacing Add type parameters Add toString() methods to formats so tests compare correctly IT doc revisions Error message edits Display UT query results when tests fail * Edit * Build fix * Build fixes	2023-02-15 15:22:54 +05:30
Tejaswini Bandlamudi	9ffaba9c7f	Fix MySQL drivers setup for Revised ITs (#13800 ) * download both mysql drivers and use org.mariadb.jdbc.Driver for now * use com.mysql.jdbc.Driver	2023-02-15 11:03:25 +05:30
Paul Rogers	842ee554de	Refinements to input-source specific table functions (#13780 ) Refinements to table functions Fixes various bugs Improves the structure of the table function classes Adds unit and integration tests	2023-02-13 16:21:27 -08:00
Paul Rogers	f28c06515b	Auto-detect docker-compose (#13754 )	2023-02-06 21:29:45 +05:30
Clint Wylie	fb26a1093d	discover nested columns when using nested column indexer for schemaless ingestion (#13672 ) * discover nested columns when using nested column indexer for schemaless * move useNestedColumnIndexerForSchemaDiscovery from AppendableIndexSpec to DimensionsSpec	2023-01-18 12:57:28 -08:00
Paul Rogers	fa493f1ebc	Convert from DRUID_INTEGRATION_TEST_INDEXER to USE_INDEXER (#13684 ) The old ITs use DRUID_INTEGRATION_TEST_INDEXER. The new ones use the USE_INDEXER env var passed in from the build environment.	2023-01-18 08:51:42 -08:00
Paul Rogers	22630b0aab	Much improved table functions (#13627 ) Much improved table functions * Revises properties, definitions in the catalog * Adds a "table function" abstraction to model such functions * Specific functions for HTTP, inline, local and S3. * Extended SQL types in the catalog * Restructure external table definitions to use table functions * EXTEND syntax for Druid's extern table function * Support for array-valued table function parameters * Support for array-valued SQL query parameters * Much new documentation	2023-01-17 08:41:57 -08:00
Paul Rogers	ed623d626f	Support both Indexer and MiddleManager in ITs (#13660 ) Support both indexer and MM in ITs Support for the DRUID_INTEGRATION_TEST_INDEXER variable Conditional client cluster configuration Cleanup of OVERRIDE_ENV file handling Enforce setting of test-specific env vars Cleanup of unused bits	2023-01-14 14:34:06 -08:00
abhagraw	5ef689fc3f	Cloud deep storage tests in new IT framework (S3, GCS, Azure) (#13535 ) * MSQ s3 deep storage tests * Fix license check * Getting config values from env variables * Added s3TestUtils * Merged AbstractITSQLBasedIngestionTest with AbstractITBatchIndexTest * Fixing license issues * Fixing checkstyle errors * Fix spotbug errors * Update s3util name in other files * GCS and Azure deep storage tests * Fix license and checkstyle errors * Fix dependency error * fix intellij check errors * Copy credentials file in all containers * Refactor and gcs file upload fix * Fixing dependency check errors and codeQL warnings * Fixing checkstyle errors * Fixing intellij inspection errors * Removing unrequired exceptions * Addressing comments	2023-01-11 09:43:44 +05:30
Karan Kumar	56076d33fb	Worker retry for MSQ task (#13353 ) * Initial commit. * Fixing error message in retry exceeded exception * Cleaning up some code * Adding some test cases. * Adding java docs. * Finishing up state test cases. * Adding some more java docs and fixing spot bugs, intellij inspections * Fixing intellij inspections and added tests * Documenting error codes * Migrate current integration batch tests to equivalent MSQ tests (#13374) * Migrate current integration batch tests to equivalent MSQ tests using new IT framework * Fix build issues * Trigger Build * Adding more tests and addressing comments * fixBuildIssues * fix dependency issues * Parameterized the test and addressed comments * Addressing comments * fixing checkstyle errors * Adressing comments * Adding ITTest which kills the worker abruptly * Review comments phase one * Adding doc changes * Adjusting for single threaded execution. * Adding Sequential Merge PR state handling * Merge things * Fixing checkstyle. * Adding new context param for fault tolerance. Adding stale task handling in sketchFetcher. Adding UT's. * Merge things * Merge things * Adding parameterized tests Created separate module for faultToleranceTests * Adding missed files * Review comments and fixing tests. * Documentation things. * Fixing IT * Controller impl fix. * Fixing racy WorkerSketchFetcherTest.java exception handling. Co-authored-by: abhagraw <99210446+abhagraw@users.noreply.github.com> Co-authored-by: Karan Kumar <cryptoe@karans-mbp.lan>	2023-01-11 07:38:29 +05:30
imply-cheddar	a8ecc48ffe	Validate response headers and fix exception logging (#13609 ) * Validate response headers and fix exception logging A class of QueryException were throwing away their causes making it really hard to determine what's going wrong when something goes wrong in the SQL planner specifically. Fix that and adjust tests to do more validation of response headers as well. We allow 404s and 307s to be returned even without authorization validated, but others get converted to 403	2023-01-05 14:15:15 -08:00
abhagraw	365474ff1d	New IT Framework - InputSource and InputFormat Tests (#13597 ) * New IT Framework - InputSource and InputFormat Tests * Fixing checkstyle errors * Updating InputSource setup * Updating queries to use druid DB * Making metadata setup queries to be idempotent * Restore intellij files	2023-01-04 10:40:05 +05:30
abhagraw	f6f625ee08	MSQ Reindex IT (#13433 ) * MSQ Reindex IT * Fixing checkstyle errors * Addressing comments * Addressing comments	2022-12-01 12:13:23 +05:30
Adarsh Sanjeev	2f3b97194f	Fix hardcoded version in pom file (#13460)	2022-12-01 10:10:04 +05:30
Kashif Faraz	7cf761cee4	Prepare master branch for next release, 26.0.0 (#13401 ) * Prepare master branch for next release, 26.0.0 * Use docker image for druid 24.0.1 * Fix version in druid-it-cases pom.xml	2022-11-22 15:31:01 +05:30
Adarsh Sanjeev	280a0f7158	Add sequential sketch merging to MSQ (#13205 ) * Add sketch fetching framework * Refactor code to support sequential merge * Update worker sketch fetcher * Refactor sketch fetcher * Refactor sketch fetcher * Add context parameter and threshold to trigger sequential merge * Fix test * Add integration test for non sequential merge * Address review comments * Address review comments * Address review comments * Resolve maxRetainedBytes * Add new classes * Renamed key statistics information class * Rename fetchStatisticsSnapshotForTimeChunk function * Address review comments * Address review comments * Update documentation and add comments * Resolve build issues * Resolve build issues * Change worker APIs to async * Address review comments * Resolve build issues * Add null time check * Update integration tests * Address review comments * Add log messages and comments * Resolve build issues * Add unit tests * Add unit tests * Fix timing issue in tests	2022-11-22 09:56:32 +05:30
abhagraw	5172d76a67	Migrate current integration batch tests to equivalent MSQ tests (#13374) * Migrate current integration batch tests to equivalent MSQ tests using new IT framework * Fix build issues * Trigger Build * Adding more tests and addressing comments * fixBuildIssues * fix dependency issues * Parameterized the test and addressed comments * Addressing comments * fixing checkstyle errors * Addressing comments	2022-11-21 09:12:02 +05:30
Paul Rogers	81d005f267	Druid Catalog basics (#13165 ) Druid catalog basics Catalog object model for tables, columns Druid metadata DB storage (as an extension) REST API to update the catalog (as an extension) Integration tests Model only: no planner integration yet	2022-11-12 15:30:22 -08:00
Kashif Faraz	fd7864ae33	Improve run time of coordinator duty MarkAsUnusedOvershadowedSegments (#13287) In clusters with a large number of segments, the duty `MarkAsUnusedOvershadowedSegments` can take a very long time to finish. This is because of the costly invocation of `timeline.isOvershadowed` which is done for every used segment in every coordinator run. Changes - Use `DataSourceSnapshot.getOvershadowedSegments` to get all overshadowed segments - Iterate over this set instead of all used segments to identify segments that can be marked as unused - Mark segments as unused in the DB in batches rather than one at a time - Refactor: Add class `SegmentTimeline` for ease of use and readability while using a `VersionedIntervalTimeline` of segments.	2022-11-01 20:19:52 +05:30
Laksh Singla	728745a1d3	Add IT for MSQ task engine using the new IT framework (#12992 ) * first test, serde causing problems * serde working * insert and select check * Add cluster annotations for MSQ test cases * Add cluster config for MSQ * Add MSQ config to the pom.xml * cleanup unnecessary changes * Remove model classes * Comments, checkstyle, check queries from file * fixup test case name * build failure fix * review changes * build failure fix * Trigger Build * Log the mismatch in QueryResultsVerifier * Trigger Build * Change the signature of the results verifier * review changes * LGTM fix * build, change pom * Trigger Build * Trigger Build * trigger build with minimal pom changes * guice fix in tests * travis.yml	2022-09-22 16:09:47 +05:30
Adam Peck	ee22663dd3	Add interpolation to JsonConfigurator (#13023 ) * Add interpolation to JsonConfigurator * Fix checkstyle * Fix tests by removing common-text override * Add back commons-text without version * Remove unused hadoopDir configs * Move some stuff to hopefully pass coverage	2022-09-07 12:48:01 +05:30
Abhishek Agarwal	7d332c6f6a	Suppress false CVEs (#13026 ) * Suppress CVEs * Add more suppressions	2022-09-06 11:46:56 +05:30
abhagraw	f3c47cf68c	Building druid-it-tools and running for travis in it.sh (#12957 ) * Building druid-it-tools and running for travis in it.sh * Addressing comments * Updating druid-it-image pom to point to correct it-tools * Updating all it-tools references to druid-it-tools * Adding dist back to it.sh travis * Trigger Build * Disabling batchIndex tests and commenting out user specific code * Fixing checkstyle and intellij inspection errors * Replacing tabs with spaces in it.sh * Enabling old batch index tests with indexer	2022-08-30 12:48:07 +05:30
Abhishek Agarwal	618757352b	Bump up the version to 25.0.0 (#12975 ) * Bump up the version to 25.0.0 * Fix the version in console	2022-08-29 11:27:38 +05:30

52 Commits