Hadoop 2 often causes red security scans on Druid distribution because of the dependencies it brings. We want to move away from Hadoop 2 and provide Hadoop 3 distribution available. Switch druid to building with Hadoop 3 by default. Druid will still be compatible with Hadoop 2 and users can build hadoop-2 compatible distribution using hadoop2 profile.
* use new sampler features
* supprot kafka format
* update DQT, fix tests
* prefer non numeric formats
* fix input format step
* boost SQL data loader
* delete dimension in auto discover mode
* inline example specs
* feedback updates
* yeet the format into valueFormat when switching to kafka
* kafka format is now a toggle
* even better form layout
* rename
* merge druid-core, extendedset, and druid-hll into druid-processing to simplify everything
* fix poms and license stuff
* mockito is evil
* allow reset of JvmUtils RuntimeInfo if tests used static injection to override
* Support for middle manager less druid, tasks launch as k8s jobs
* Fixing forking task runner test
* Test cleanup, dependency cleanup, intellij inspections cleanup
* Changes per PR review
Add configuration option to disable http/https proxy for the k8s client
Update the docs to provide more detail about sidecar support
* Removing un-needed log lines
* Small changes per PR review
* Upon task completion we callback to the overlord to update the status / locaiton, for slower k8s clusters, this reduces locking time significantly
* Merge conflict fix
* Fixing tests and docs
* update tiny-cluster.yaml
changed `enableTaskLevelLogPush` to `encapsulatedTask`
* Apply suggestions from code review
Co-authored-by: Abhishek Agarwal <1477457+abhishekagarwal87@users.noreply.github.com>
* Minor changes per PR request
* Cleanup, adding test to AbstractTask
* Add comment in peon.sh
* Bumping code coverage
* More tests to make code coverage happy
* Doh a duplicate dependnecy
* Integration test setup is weird for k8s, will do this in a different PR
* Reverting back all integration test changes, will do in anotbher PR
* use StringUtils.base64 instead of Base64
* Jdk is nasty, if i compress in jdk 11 in jdk 17 the decompressed result is different
Co-authored-by: Rahul Gidwani <r_gidwani@apple.com>
Co-authored-by: Abhishek Agarwal <1477457+abhishekagarwal87@users.noreply.github.com>
* remove old query view
* update tests
* add filter
* fix test
* bump d3 things to latest versions
* rent too far into the future with d3
* make config dialogs load
* goodies
* update snapshots
* only compute duration when running or pending
* Update Snappy to 1.1.8.4.
Prior to this, because snappy-java wasn't included in dependencyManagement,
we actually shipped multiple different versions for different extensions,
ranging from 1.1.7.1 to 1.1.8.4. Now, we standardize on 1.1.8.4.
Among other things, this enables the tests to pass on M1 Macs.
* Update snappy-java versions in licenses.yaml.
* Add interpolation to JsonConfigurator
* Fix checkstyle
* Fix tests by removing common-text override
* Add back commons-text without version
* Remove unused hadoopDir configs
* Move some stuff to hopefully pass coverage
This dependency was no longer needed after #12481, but remained because
it was used for a (now useless) test. This patch removes the test and
the dependency.
* MSQ web console
* fix typo in comments
* remove useless conditional
* wrap SQL_DATA_TYPES
* fixes sus regex
* rewrite regex
* remove problematic regex
* fix UTs
* convert PARTITIONED / CLUSTERED BY to ORDER BY for preview
* fix log
* updated to use shuffle
* Web console: Use Ace.Completion directly (#1405)
* Use Ace.Completion directly
* Another Ace.Completion
* better comment
* fix column ordering in e2e test
* add nested data example also
Co-authored-by: John Gozde <john.gozde@imply.io>
* Improved Java 17 support and Java runtime docs.
1) Add a "Java runtime" doc page with information about supported
Java versions, garbage collection, and strong encapsulation..
2) Update asm and equalsverifier to versions that support Java 17.
3) Add additional "--add-opens" lines to surefire configuration, so
tests can pass successfully under Java 17.
4) Switch openjdk15 tests to openjdk17.
5) Update FrameFile to specifically mention Java runtime incompatibility
as the cause of not being able to use Memory.map.
6) Update SegmentLoadDropHandler to log an error for Errors too, not
just Exceptions. This is important because an IllegalAccessError is
encountered when the correct "--add-opens" line is not provided,
which would otherwise be silently ignored.
7) Update example configs to use druid.indexer.runner.javaOptsArray
instead of druid.indexer.runner.javaOpts. (The latter is deprecated.)
* Adjustments.
* Use run-java in more places.
* Add run-java.
* Update .gitignore.
* Exclude hadoop-client-api.
Brought in when building on Java 17.
* Swap one more usage of java.
* Fix the run-java script.
* Fix flag.
* Include link to Temurin.
* Spelling.
* Update examples/bin/run-java
Co-authored-by: Xavier Léauté <xl+github@xvrl.net>
Co-authored-by: Xavier Léauté <xl+github@xvrl.net>
* Adding zstandard compression library
* 1. Took @clintropolis's advice to have ZStandard decompressor use the byte array when the buffers are not direct.
2. Cleaned up checkstyle issues.
* Fixing zstandard version to latest stable version in pom's and updating license files
* Removing zstd from benchmarks and adding to processing (poms)
* fix the intellij inspection issue
* Removing the prefix v for the version in the license check for ztsd
* Fixing license checks
Co-authored-by: Rahul Gidwani <r_gidwani@apple.com>
* Add ipaddress library as dependency.
* IPv4 functions to use the inet.ipaddr package.
* Remove unused imports.
* Add new function.
* Minor rename.
* Add more unit tests.
* IPv4 address expr utils unit tests and address options.
* Adjust the IPv4Util functions.
* Move the UTs a bit around.
* Javadoc comments.
* Add license info for IPAddress.
* Fix groupId, artifact and version in license.yaml.
* Remove redundant subnet in messages - fixes UT.
* Remove unused commons-net dependency for /processing project.
* Make class and methods public so it can be accessed.
* Add initial version of benchmark
* Add subnetutils package for benchmarks.
* Auto generate ip addresses.
* Add more v4 address representations in setup to avoid bias.
* Use ThreadLocalRandom to avoid forbidden API usage.
* Adjust IPv4AddressBenchmark to adhere to codestyle rules.
* Update ipaddress library to latest 5.3.4
* Add ipaddress package dependency to benchmarks project.
* Update blueprint dependencies & LICENSES
* Switch to bp4 namespace; use bp-ns variable in overrides
* Add webpack alias for colors.scss
* Snapshots
* Update selectors in e2e tests
amazon-kinesis-client was not covered undered the apache license and required separate insertion in the kinesis extension.
This can now be avoided since it is covered, and including it within druid helps prevent incompatibilities.
Allows enabling of deaggregation out of the box by packaging amazon-kinesis-client (1.14.4) with druid for kinesis ingestion.