* Add avro dependency to parquet extension
If the parquet extension is loaded and an ingestionSpec uses the older format
specifying a 'parser' instead of using an 'inputFormat' the job fails
with the following error
java.lang.TypeNotPresentException: Type org.apache.avro.generic.GenericRecord not present
This change removes the exclusion of the avro package so that the missing
class can be found.
* Address review comments and add dependency version
* Support orc format for native batch ingestion
* fix pom and remove wrong comment
* fix unnecessary condition check
* use flatMap back to handle exception properly
* move exceptionThrowingIterator to intermediateRowParsingReader
* runtime
* add parquet support to native batch
* cleanup
* implement toJson for sampler support
* better binaryAsString test
* docs
* i hate spellcheck
* refactor toMap conversion so can be shared through flattenerMaker, default impls should be good enough for orc+avro, fixup for merge with latest
* add comment, fix some stuff
* adjustments
* fix accident
* tweaks
* Fix dependency analyze warnings
Update the maven dependency plugin to the latest version and fix all
warnings for unused declared and used undeclared dependencies in the
compile scope. Added new travis job to add the check to CI. Also fixed
some source code files to use the correct packages for their imports and
updated druid-forbidden-apis to prevent regressions.
* Address review comments
* Adjust scope for org.glassfish.jaxb:jaxb-runtime
* Fix dependencies for hdfs-storage
* Consolidate netty4 versions
* Fix dependency analyze warnings
Update the maven dependency plugin to the latest version and fix all
warnings for unused declared and used undeclared dependencies in the
compile scope. Added new travis job to add the check to CI. Also fixed
some source code files to use the correct packages for their imports.
* Fix licenses and dependencies
* Fix licenses and dependencies again
* Fix integration test dependency
* Address review comments
* Fix unit test dependencies
* Fix integration test dependency
* Fix integration test dependency again
* Fix integration test dependency third time
* Fix integration test dependency fourth time
* Fix compile error
* Fix assert package
Make static imports forbidden in tests and remove all occurrences to be
consistent with the non-test code.
Also, various changes to files affected by above:
- Reformat to adhere to druid style guide
- Fix various IntelliJ warnings
- Fix various SonarLint warnings (e.g., the expected/actual args to
Assert.assertEquals() were flipped)
* Bump Apache Avro to 1.9.0
Apache Avro 1.9.0 brings a lot of new features:
* Deprecate Joda-Time in favor of Java8 JSR310 and setting it as default
* Remove support for Hadoop 1.x
* Move from Jackson 1.x to 2.9
* Add ZStandard Codec
* Lots of updates on the dependencies to fix CVE's
* Remove Jackson classes from public API
* Apache Avro is built by default with Java 8
* Apache Avro is compiled and tested with Java 11 to guarantee compatibility
* Apache Avro MapReduce is compiled and tested with Hadoop 3
* Apache Avro is now leaner, multiple dependencies were removed: guava, paranamer, commons-codec, and commons-logging
* Introduce JMH Performance Testing Framework
* Add Snappy support for C++ DataFile
* and many, many more!
* Add exclusions for Jackson
* Remove Apache Pig from the tests
* Remove the Pig specific part
* Fix the Checkstyle issues
* Cleanup a bit
* Add an additional test
* Revert the abstract class
* Add checkstyle rules about imports and empty lines between members
* Add suppressions
* Update Eclipse import order
* Add empty line
* Fix StatsDEmitter
* move parquet-extensions from contrib to core, adds new hadoop parquet parser that does not convert to avro first and supports flattenSpec and int96 columns, add support for flattenSpec for parquet-avro conversion parser, much test with a bunch of files lifted from spark-sql
* fix avro flattener to support nullable primitives for auto discovery and now only supports primitive arrays instead of all arrays
* remove leftover print
* convert micro timestamp to millis
* checkstyle
* add ignore for .parquet and .parq to rat exclude
* fix legit test failure from avro flattern behavior change
* fix rebase
* add exclusions to pom to cut down on redundant jars
* refactor tests, add support for unwrapping lists for parquet-avro, review comments
* more comment
* fix oops
* tweak parquet-avro list handling
* more docs
* fix style
* grr styles