Commit Graph

1605 Commits

Author SHA1 Message Date
Tejaswini Bandlamudi a45b25fa1d
Removes support for Hadoop 2 (#14763)
Removing Hadoop 2 support as discussed in https://lists.apache.org/list?dev@druid.apache.org:lte=1M:hadoop
2023-08-09 17:47:52 +05:30
Tejaswini Bandlamudi 550a66d71e
Upgrade jackson-databind to 2.12.7 (#14770)
The current version of jackson-databind is flagged for vulnerabilities CVE-2020-28491 (Although cbor format is not used in druid), CVE-2020-36518 (Seems genuine as deeply nested json in can cause resource exhaustion). Updating the dependency to the latest version 2.12.7 to fix these vulnerabilities.
2023-08-09 12:22:16 +05:30
Tejaswini Bandlamudi d0403f00fd
upgrade org.mozilla:rhino (#14765) 2023-08-08 12:17:59 +05:30
Xavier Léauté c1c2435aee
upgrade core Apache Kafka dependencies to 3.5.1 (#14721)
Release notes: https://downloads.apache.org/kafka/3.5.1/RELEASE_NOTES.html
Announcement: https://lists.apache.org/thread/p7jyv3ys7b6jowcb6lys7821qcbcpb07

Release notes: https://downloads.apache.org/kafka/3.5.0/RELEASE_NOTES.html
Announcement: https://lists.apache.org/thread/s6x3zvkrv32v5y8yb6hh31h57spdbylk
2023-08-02 01:08:40 -07:00
Pranav 8a10b46dd8
Adding the PropertyNamingStrategies from jackson for fixing hadoop ingestion (#14671) 2023-08-01 20:02:43 +05:30
dependabot[bot] e99bab2fd3
Bump org.xerial.snappy:snappy-java from 1.1.10.1 to 1.1.10.3 (#14641)
Bumps [org.xerial.snappy:snappy-java](https://github.com/xerial/snappy-java) from 1.1.10.1 to 1.1.10.3.
- [Release notes](https://github.com/xerial/snappy-java/releases)
- [Commits](https://github.com/xerial/snappy-java/compare/v1.1.10.1...v1.1.10.3)

---
updated-dependencies:
- dependency-name: org.xerial.snappy:snappy-java
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-26 06:27:55 -07:00
AmatyaAvadhanula 0412f40d36
Prepare master branch for next release, 28.0.0 (#14595)
* Prepare master branch for next release, 28.0.0
2023-07-18 09:22:30 +05:30
Atul Mohan 03d6d395a0
Extension to read and ingest iceberg data files (#14329)
This adds a new contrib extension: druid-iceberg-extensions which can be used to ingest data stored in Apache Iceberg format. It adds a new input source of type iceberg that connects to a catalog and retrieves the data files associated with an iceberg table and provides these data file paths to either an S3 or HDFS input source depending on the warehouse location.

Two important dependencies associated with Apache Iceberg tables are:

Catalog : This extension supports reading from either a Hive Metastore catalog or a Local file-based catalog. Support for AWS Glue is not available yet.
Warehouse : This extension supports reading data files from either HDFS or S3. Adapters for other cloud object locations should be easy to add by extending the AbstractInputSourceAdapter.
2023-07-18 08:59:57 +05:30
Sam Rash 0dcb19f7e3
Add Continuous Profiling to Unit Tests (#14506)
Uses a custom continusou jfr profiler.

Modifies the github actions for tests to do profiling only in the case
of jdk17, as the profiler requires jdk17+ to use the JFR streaming API
plus a few other language features in the code.

Continuous Profiling service is provided to the Apache Druid project
free of charge by Imply and any committer can request free access to
the UI.
2023-07-12 17:50:38 -07:00
Gian Merlino 63ee69b4e8
Claim full support for Java 17. (#14384)
* Claim full support for Java 17.

No production code has changed, except the startup scripts.

Changes:

1) Allow Java 17 without DRUID_SKIP_JAVA_CHECK.

2) Include the full list of opens and exports on both Java 11 and 17.

3) Document that Java 17 is both supported and preferred.

4) Switch some tests from Java 11 to 17 to get better coverage on the
   preferred version.

* Doc update.

* Update errorprone.

* Update docker_build_containers.sh.

* Update errorprone in licenses.yaml.

* Add some more run-javas.

* Additional run-javas.

* Update errorprone.

* Suppress new errorprone error.

* Add exports and opens in ForkingTaskRunner for Java 11+.

Test, doc changes.

* Additional errorprone updates.

* Update for errorprone.

* Restore old fomatting in LdapCredentialsValidator.

* Copy bin/ too.

* Fix Java 15, 17 build line in docker_build_containers.sh.

* Update busybox image.

* One more java command.

* Fix interpolation.

* IT commandline refinements.

* Switch to busybox 1.34.1-glibc.

* POM adjustments, build and test one IT on 17.

* Additional debugging.

* Fix silly thing.

* Adjust command line.

* Add exports and opens one more place.

* Additional harmonization of strong encapsulation parameters.
2023-07-07 12:52:35 -07:00
Jan Werner 95115d722a
CVE fixes - update of multiple dependencies. (#14519)
Apache Druid brings multiple direct and transitive dependencies that are affected by plethora of CVEs.
This PR attempts to update all the dependencies that did not require code refactoring.
This PR modifies pom files, license file and OWASP Dependency Check suppression file.
2023-07-07 20:27:30 +05:30
Clint Wylie 277aaa5c57
remove druid.processing.columnCache.sizeBytes and CachingIndexed, combine string column implementations (#14500)
* combine string column implementations
changes:
* generic indexed, front-coded, and auto string columns now all share the same column and index supplier implementations
* remove CachingIndexed implementation, which I think is largely no longer needed by the switch of many things to directly using ByteBuffer, avoiding the cost of creating Strings
* remove ColumnConfig.columnCacheSizeBytes since CachingIndexed was the only user
2023-07-02 19:37:15 -07:00
Tejaswini Bandlamudi baa64e6d8a
update hadoop version to 3.3.6 (#14489) 2023-06-28 15:03:10 +05:30
Tejaswini Bandlamudi 72cf91fbc0
Upgrade Avro to latest version (#14440)
Upgraded Avro to 1.11.1
2023-06-24 14:51:30 +05:30
Clint Wylie 9b1779734b
fix website mvn build (#14458)
changes:
* fix website mvn build
* remove the i18n/en.json file add to gitignore
* add spellcheck to mvn test phase
2023-06-22 12:14:23 -07:00
Hardik Bajaj 1ea9158a50
Added new SysMonitorOshi v0 using Oshi library (#14359)
Added a new monitor SysMonitorOshi to replace SysMonitor. The new monitor has a wider support for different machine architectures including ARM instances. Please switch to SysMonitorOshi as SysMonitor is now deprecated and will be removed in future releases.
2023-06-20 20:57:58 +05:30
Alexander Saydakov f6169d437b
use the latest datasketches-java-4.1.0 (#14430)
Co-authored-by: AlexanderSaydakov <AlexanderSaydakov@users.noreply.github.com>
2023-06-14 16:03:56 -07:00
Alexander Saydakov 4131c0df13
use the latest datasketches-java-4.0.0 (#14334)
* use the latest datasketches-java-4.0.0

* updated versions of datasketches

* adjusted expectation

* fixed the expectations

---------

Co-authored-by: AlexanderSaydakov <AlexanderSaydakov@users.noreply.github.com>
2023-05-27 22:19:18 -07:00
Clint Wylie e833a4700d
suppress hadoop3 cve that seem not applicable to us (#14252) 2023-05-10 23:08:05 -07:00
Tejaswini Bandlamudi 774073b2e7
Update Hadoop3 as default build version (#14005)
Hadoop 2 often causes red security scans on Druid distribution because of the dependencies it brings. We want to move away from Hadoop 2 and provide Hadoop 3 distribution available. Switch druid to building with Hadoop 3 by default. Druid will still be compatible with Hadoop 2 and users can build hadoop-2 compatible distribution using hadoop2 profile.
2023-04-26 12:52:51 +05:30
Tejaswini Bandlamudi cb302e1bd1
Use apache-jar-resource-bundle:1.5 instead of 1.5-SNAPSHOT (#14054) 2023-04-10 18:55:39 +05:30
Clint Wylie 1aef72aa7e
Bump up the version in pom to 27.0.0 in preparation of release (#14051) 2023-04-10 14:56:59 +05:30
Sandeep ccdf30e399
Bump Joda-Time version for current DateTimeZone data (#13999) 2023-03-29 20:15:49 +05:30
frankgrimes97 2f98675285
Tuple sketch SQL support (#13887)
This PR is a follow-up to #13819 so that the Tuple sketch functionality can be used in SQL for both ingestion using Multi-Stage Queries (MSQ) and also for analytic queries against Tuple sketch columns.
2023-03-28 18:47:12 +05:30
Abhishek Agarwal 139a058ba7
Use sonatype maven central for plugin repositories (#13961)
* Change search order of maven repositories

* Update pom.xml
2023-03-23 15:35:47 +05:30
abhagraw c52d15d65d
Fixing security vulnerability check errors (#13956)
* Fixing security vulnerability check errors

* Updating javax.el to jakarta.el

* Adding cron job trigger on changes to suppressions file
2023-03-23 11:10:06 +05:30
Benedict Jin cee2dfd768
Upgrade ZK from 3.5.9 to 3.5.10 to avoid data inconsistency risk (#13715) 2023-03-15 19:21:09 +05:30
Paul Rogers 4493275d88
Use Maven central repo rather than Apache (#13921)
* Use Maven central repo rather than Apache

* Disable snapshots
2023-03-13 10:49:32 -07:00
Elliott Freis d93fdb2632
Bump CycloneDX module to address POM errors (#13878)
* Bump CycloneDX module to address POM errors

* Including web-console in the PR

---------

Co-authored-by: Elliott Freis <elliottfreis@Elliott-Freis.earth.dynamic.blacklight.net>
2023-03-03 15:39:15 +05:30
Clint Wylie 08b5951cc5
merge druid-core, extendedset, and druid-hll into druid-processing to simplify everything (#13698)
* merge druid-core, extendedset, and druid-hll into druid-processing to simplify everything
* fix poms and license stuff
* mockito is evil
* allow reset of JvmUtils RuntimeInfo if tests used static injection to override
2023-02-17 14:27:41 -08:00
Adarsh Sanjeev e8330e95f5
Update Apache Kafka dependencies to 3.4.0 (#13802)
Release notes:
- https://downloads.apache.org/kafka/3.4.0/RELEASE_NOTES.html
2023-02-15 15:15:13 +05:30
Xavier Léauté 698670c88e
update core Apache Kafka dependencies to 3.3.2 (#13717)
Release notes:
- https://downloads.apache.org/kafka/3.3.2/RELEASE_NOTES.html
2023-01-27 21:00:01 -08:00
Dongjoon Hyun 2503095296
Publish SBOM artifacts (#13648) 2023-01-10 16:08:10 +05:30
abhagraw 74a76c74b1
Updating dependency check version (#13649) 2023-01-10 14:43:19 +05:30
Kashif Faraz 78ae0b7533
Upgrade to netty 4.1.86.Final to address CVEs (#13604)
This commit addresses the following CVEs:
- CVE-2021-43797
- CVE-2022-41881
2022-12-23 01:44:01 +05:30
Peter Stöckli df55768535
Add CodeQL workflow (#13477)
* workflower: Add CodeQL workflow

* add modified CodeQL build config
2022-12-21 09:24:39 +05:30
Jason Koch 6c44dd8175
perf: core/TextReader for faster json ingestion (#13545)
* perf: provide a custom utf8 specific buffered line iterator (benchmark)

Benchmark                         Mode  Cnt     Score     Error  Units
JsonLineReaderBenchmark.baseline  avgt   15  3459.871 ± 106.175  us/op

* perf: provide a custom utf8 specific buffered line iterator

Benchmark                         Mode  Cnt     Score    Error  Units
JsonLineReaderBenchmark.baseline  avgt   15  3022.053 ± 51.286  us/op

* perf: provide a custom utf8 specific buffered line iterator (more tests)

* perf: provide a custom utf8 specific buffered line iterator (pr feedback)

Ensure field visibility is as limited as possible

Null check for buffer in constructor

* perf: provide a custom utf8 specific buffered line iterator (pr feedback)

Remove additional 'finished' variable.

* perf: provide a custom utf8 specific buffered line iterator (more tests and bugfix)
2022-12-19 23:12:37 -08:00
Kashif Faraz 7cf761cee4
Prepare master branch for next release, 26.0.0 (#13401)
* Prepare master branch for next release, 26.0.0

* Use docker image for druid 24.0.1

* Fix version in druid-it-cases pom.xml
2022-11-22 15:31:01 +05:30
Paul Rogers 81d005f267
Druid Catalog basics (#13165)
Druid catalog basics

Catalog object model for tables, columns
Druid metadata DB storage (as an extension)
REST API to update the catalog (as an extension)
Integration tests
Model only: no planner integration yet
2022-11-12 15:30:22 -08:00
Didip Kerabat c875f4bd04
Upgrade curator to 5.4.0 (#13302) 2022-11-03 11:26:19 -07:00
Dr. Sizzles e5ad24ff9f
Support for middle manager less druid, tasks launch as k8s jobs (#13156)
* Support for middle manager less druid, tasks launch as k8s jobs

* Fixing forking task runner test

* Test cleanup, dependency cleanup, intellij inspections cleanup

* Changes per PR review

Add configuration option to disable http/https proxy for the k8s client
Update the docs to provide more detail about sidecar support

* Removing un-needed log lines

* Small changes per PR review

* Upon task completion we callback to the overlord to update the status / locaiton, for slower k8s clusters, this reduces locking time significantly

* Merge conflict fix

* Fixing tests and docs

* update tiny-cluster.yaml 

changed `enableTaskLevelLogPush` to `encapsulatedTask`

* Apply suggestions from code review

Co-authored-by: Abhishek Agarwal <1477457+abhishekagarwal87@users.noreply.github.com>

* Minor changes per PR request

* Cleanup, adding test to AbstractTask

* Add comment in peon.sh

* Bumping code coverage

* More tests to make code coverage happy

* Doh a duplicate dependnecy

* Integration test setup is weird for k8s, will do this in a different PR

* Reverting back all integration test changes, will do in anotbher PR

* use StringUtils.base64 instead of Base64

* Jdk is nasty, if i compress in jdk 11 in jdk 17 the decompressed result is different

Co-authored-by: Rahul Gidwani <r_gidwani@apple.com>
Co-authored-by: Abhishek Agarwal <1477457+abhishekagarwal87@users.noreply.github.com>
2022-11-02 19:44:47 -07:00
chi-chi weng 72c16097ac
Fix Apache Commons Text CVE-2022-42889 (#13226)
* Fix Apache Commons Text  CVE-2022-42889

Fix Apache Commons Text  CVE-2022-42889

https://nvd.nist.gov/vuln/detail/CVE-2022-42889

* Update license

Co-authored-by: Frank Chen <frank.chen021@outlook.com>
2022-10-26 10:04:32 +08:00
Frank Chen d30cf8c308
Dependency cleanup (#13194)
* Clean up dependency in extensions

* Bump protobuf/aws.sdk

* Bump aws-sdk to 1.12.317

* Fix CI

* Fix CI

* Update license

* Update license
2022-10-10 20:34:38 +08:00
Xavier Léauté eff7edb603
update core Apache Kafka dependencies to 3.3.1 (#13176)
Announcement:
- https://blogs.apache.org/kafka/entry/what-rsquo-s-new-in

Release notes:
- https://archive.apache.org/dist/kafka/3.3.0/RELEASE_NOTES.html
- https://downloads.apache.org/kafka/3.3.1/RELEASE_NOTES.html
2022-10-04 12:52:16 -07:00
AmatyaAvadhanula acafd0d1e0
Upgrade kafka version to 3.2.3 to fix CVE (#13142)
Upgrade to 3.2.3 to fix CVE: https://nvd.nist.gov/vuln/detail/CVE-2022-34917
2022-09-28 10:47:09 +05:30
Gian Merlino 5733360dfd
Update Snappy to 1.1.8.4. (#13081)
* Update Snappy to 1.1.8.4.

Prior to this, because snappy-java wasn't included in dependencyManagement,
we actually shipped multiple different versions for different extensions,
ranging from 1.1.7.1 to 1.1.8.4. Now, we standardize on 1.1.8.4.

Among other things, this enables the tests to pass on M1 Macs.

* Update snappy-java versions in licenses.yaml.
2022-09-14 15:13:47 -07:00
Adam Peck ee22663dd3
Add interpolation to JsonConfigurator (#13023)
* Add interpolation to JsonConfigurator

* Fix checkstyle

* Fix tests by removing common-text override

* Add back commons-text without version

* Remove unused hadoopDir configs

* Move some stuff to hopefully pass coverage
2022-09-07 12:48:01 +05:30
senthilkv 3d9aef225d
compressed big decimal - module (#10705)
Compressed Big Decimal is an extension which provides support for 
Mutable big decimal value that can be used to accumulate values 
without losing precision or reallocating memory. This type helps in 
absolute precision arithmetic on large numbers in applications, 
where greater level of accuracy is required, such as financial 
applications, currency based transactions. This helps avoid rounding 
issues where in potentially large amount of money can be lost.

Accumulation requires that the two numbers have the same scale, 
but does not require that they are of the same size. If the value 
being accumulated has a larger underlying array than this value 
(the result), then the higher order bits are dropped, similar to what 
happens when adding a long to an int and storing the result in an 
int. A compressed big decimal that holds its data with an embedded 
array.

Compressed big decimal is an absolute number based complex type 
based on big decimal in Java. This supports all the functionalities 
supported by Java Big Decimal. Java Big Decimal is not mutable in 
order to avoid big garbage collection issues. Compressed big decimal 
is needed to mutate the value in the accumulator.
2022-09-06 00:06:57 -07:00
Gian Merlino 48ceab2153
Add Java 17 information to documentation. (#12990)
The docs say Java 17 support is experimental, and give tips on running
successfully with Java 17.

This patch also removes java.base/jdk.internal.perf and
jdk.management/com.sun.management.internal from the list of required
exports and opens, because they were formerly needed for JvmMonitor,
which was rewritten in #12481 to use MXBeans instead.
2022-08-30 12:32:49 -07:00
Gian Merlino 9eb20e5e7c
Remove dependency on jvm-attach. (#12989)
This dependency was no longer needed after #12481, but remained because
it was used for a (now useless) test. This patch removes the test and
the dependency.
2022-08-29 14:18:33 -07:00