Commit Graph

100 Commits

Author SHA1 Message Date
Abhishek Agarwal 2fe053c5cb
Bump up the versions (#12480) 2022-04-27 14:28:20 +05:30
Ivan Vankovich 6a93872586
OpenTelemetry emitter extension (#12015)
* Add OpenTelemetry emitter extension

* Fix build

* Fix checkstyle

* Add used undeclared dependencies

* Ignore unused declared dependencies
2022-01-15 12:18:04 +08:00
Jihoon Son 4a74c5adcc
Use Druid's extension loading for integration test instead of maven (#12095)
* Use Druid's extension loading for integration test instead of maven

* fix maven command

* override config path

* load input format extensions and kafka by default; add prepopulated-data group

* all docker-composes are overridable

* fix s3 configs

* override config for all

* fix docker_compose_args

* fix security tests

* turn off debug logs for overlord api calls

* clean up stuff

* revert docker-compose.yml

* fix override config for query error test; fix circular dependency in docker compose

* add back some dependencies in docker compose

* new maven profile for integration test

* example file filter
2022-01-05 23:33:04 -08:00
Karan Kumar 90640bb316
Support for hadoop 3 via maven profiles (#11794)
Add support for hadoop 3 profiles . Most of the details are captured in #11791 .
We use a combination of maven profiles and resource filtering to achieve this. Hadoop2 is supported by default and a new maven profile with the name hadoop3 is created. This will allow the user to choose the profile which is best suited for the use case.
2021-10-30 22:46:24 +05:30
Đặng Minh Dũng 4baebb231b
add `prometheus-emitter` to distribution (#11812)
* add `prometheus-emitter` to distribution

Signed-off-by: Đặng Minh Dũng <dungdm93@live.com>

* add `druid-momentsketch` to distribution

Signed-off-by: Đặng Minh Dũng <dungdm93@live.com>
2021-10-25 21:16:17 -07:00
Clint Wylie fe1d8c206a
bump version to 0.23.0-SNAPSHOT (#11670) 2021-09-08 15:56:04 -07:00
dependabot[bot] c82ede9afc
Bump maven-antrun-plugin from 1.8 to 3.0.0 (#11336)
Bumps [maven-antrun-plugin](https://github.com/apache/maven-antrun-plugin) from 1.8 to 3.0.0.
- [Release notes](https://github.com/apache/maven-antrun-plugin/releases)
- [Commits](https://github.com/apache/maven-antrun-plugin/compare/maven-antrun-plugin-1.8...maven-antrun-plugin-3.0.0)

---
updated-dependencies:
- dependency-name: org.apache.maven.plugins:maven-antrun-plugin
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-06-08 08:28:11 -07:00
Jihoon Son 95065bdf1a
Bump dev version to 0.22.0-SNAPSHOT (#10759) 2021-01-15 13:16:23 -08:00
Himanshu c7b1212a43
AWS RDS token based password provider (#9518)
* refresh db pwd

* aws iam token password provider

* fix analyze-dependencies build

* fix doc build

* add  ut for BasicDataSourceExt

* more doc updates

* more  doc update

* moving aws  token password  provider to new extension

* remove duplicate changes

* make  all config inline

* extension docs

* refresh db  password  in SQL Firehose code path as well

* add ut

* fix build

* add new extension to distribution

* rds lib is not provided

* fix license build

* add version to license

* change parent version to 0.19.0-snapshot

* address review comments

* fix core/ code coverage

* Update server/src/main/java/org/apache/druid/metadata/BasicDataSourceExt.java

Co-authored-by: Clint Wylie <cjwylie@gmail.com>

* address review comments

* fix spellchecker

* remove inadvertant website file change

Co-authored-by: Clint Wylie <cjwylie@gmail.com>
2021-01-06 21:15:29 -08:00
Himanshu ac1882bf74
kubernetes based discovery druid extension to run Druid on K8S without Zookeeper (#10544)
* honor zk enablement config in more places in druid code

* kubernetes based discovery module

* fix spotbugs check

* fix intellij checks error

* fix doc link to kubernetes.md from extension

* make spellchecker happy

* update license.yaml

* fix dependency check errors

* update extension coverage

* UTs for BaseNodeRoleWatcher

* fix forbidden-api check

* update k8s module coverage ignores

* add Bouncy Castle License being same as MIT License for license checking purposes

* further update licenses.yaml

* label/annotation pre-existence assumption

* address review comment
2020-12-14 21:10:31 -08:00
Jonathan Wei 65c0d64676
Update version to 0.21.0-SNAPSHOT (#10450)
* [maven-release-plugin] prepare release druid-0.21.0

* [maven-release-plugin] prepare for next development iteration

* Update web-console versions
2020-10-03 16:08:34 -07:00
Clint Wylie c86e7ce30b
bump version to 0.20.0-SNAPSHOT (#10124) 2020-07-06 15:08:32 -07:00
frank chen 60c6bd5b4c
support Aliyun OSS service as deep storage (#9898)
* init commit, all tests passed

* fix format

Signed-off-by: frank chen <frank.chen021@outlook.com>

* data stored successfully

* modify config path

* add doc

* add aliyun-oss extension to project

* remove descriptor deletion code to avoid warning message output by aliyun client

* fix warnings reported by lgtm-com

* fix ci warnings

Signed-off-by: frank chen <frank.chen021@outlook.com>

* fix errors reported by intellj inspection check

Signed-off-by: frank chen <frank.chen021@outlook.com>

* fix doc spelling check

Signed-off-by: frank chen <frank.chen021@outlook.com>

* fix dependency warnings reported by ci

Signed-off-by: frank chen <frank.chen021@outlook.com>

* fix warnings reported by CI

Signed-off-by: frank chen <frank.chen021@outlook.com>

* add package configuration to support showing extension info

Signed-off-by: frank chen <frank.chen021@outlook.com>

* add IT test cases and fix bugs

Signed-off-by: frank chen <frank.chen021@outlook.com>

* 1. code review comments adopted
2. change schema from 'aliyun-oss' to 'oss'

Signed-off-by: frank chen <frank.chen021@outlook.com>

* add license info

Signed-off-by: frank chen <frank.chen021@outlook.com>

* fix doc

Signed-off-by: frank chen <frank.chen021@outlook.com>

* exclude execution of IT testcases of OSS extension from CI

Signed-off-by: frank chen <frank.chen021@outlook.com>

* put the extensions under contrib group and add to distribution

* fix names in test cases

* add unit test to cover OssInputSource

* fix names in test cases

* fix dependency problem reported by CI

Signed-off-by: frank chen <frank.chen021@outlook.com>
2020-07-01 22:20:53 -07:00
Francesco Nidito e7e41e3a36
Adding support for autoscaling in GCE (#8987)
* Adding support for autoscaling in GCE

* adding extra google deps also in gce pom

* fix link in doc

* remove unused deps

* adding terms to spelling file

* version in pom 0.17.0-incubating-SNAPSHOT --> 0.18.0-SNAPSHOT

* GCEXyz -> GceXyz in naming for consistency

* add preconditions

* add VisibleForTesting annotation

* typos in comments

* use StringUtils.format instead of String.format

* use custom exception instead of exit

* factorize interval time between retries

* making literal value a constant

* iter all network interfaces

* use provided on google (non api) deps

* adding missing dep

* removing unneded this and use Objects methods instead o 3-way if in hash and comparison

* adding import

* adding retries around getRunningInstances and adding limit for operation end waiting

* refactor GceEnvironmentConfig.hashCode

* 0.18.0-SNAPSHOT -> 0.19.0-SNAPSHOT

* removing unused config

* adding tests to hash and equals

* adding nullable to waitForOperationEnd

* adding testTerminate

* adding unit tests for createComputeService

* increasing retries in unrelated integration-test to prevent sporadic failure (hopefully)

* reverting queryResponseTemplate change

* adding comment for Compute.Builder.build() returning null
2020-04-28 03:13:39 -07:00
bolkedebruin 2d99966933
Add Apache Ranger Authorization (#9579) 2020-04-04 18:02:24 +02:00
Jihoon Son 0da8ffc3ff
Bump up development version to 0.19.0-SNAPSHOT (#9586) 2020-03-30 16:24:04 -07:00
Himanshu 5604ac7963
druid extension for OpenID Connect auth using pac4j lib (#8992)
* druid pac4j security extension for OpenID Connect OAuth 2.0 authentication

* update version in druid-pac4j pom

* introducing unauthorized resource filter

* authenticated but authorized /unified-webconsole.html

* use httpReq.getRequestURI() for matching callback path

* add documentation

* minor doc addition

* licesne file updates

* make dependency analyze succeed

* fix doc build

* hopefully fixes doc build

* hopefully fixes license check build

* yet another try on fixing license build

* revert unintentional changes to website folder

* update version to 0.18.0-SNAPSHOT

* check session and its expiry on each request

* add crypto service

* code for encrypting the cookie

* update doc with cookiePassphrase

* update license yaml

* make sessionstore in Pac4jFilter private non static

* make Pac4jFilter fields final

* okta: use sha256 for hmac

* remove incubating

* add UTs for crypto util and session store impl

* use standard charsets

* add license header

* remove unused file

* add org.objenesis.objenesis to license.yaml

* a bit of nit changes  in CryptoService  and embedding EncryptionResult for clarity

* rename alg  to cipherAlgName

* take cipher alg name, mode and padding as input

* add java doc  for CryptoService  and make it more understandable

* another  UT for CryptoService

* cache pac4j Config

* use generics clearly in Pac4jSessionStore

* update cookiePassphrase doc to mention PasswordProvider

* mark stuff Nullable where appropriate in Pac4jSessionStore

* update doc to mention jdbc

* add error log on reaching callback resource

* javadoc  for Pac4jCallbackResource

* introduce NOOP_HTTP_ACTION_ADAPTER

* add correct module name in license file

* correct extensions folder name in licenses.yaml

* replace druid-kubernetes-extensions to druid-pac4j

* cache SecureRandom instance

* rename UnauthorizedResourceFilter to AuthenticationOnlyResourceFilter
2020-03-23 18:15:45 -07:00
zachjsh d771b42ed1
Move Azure extension into Core (#9394)
* Move Azure extension into Core

Moving the azure extension into Core.

* * Fix build failure

* * Add The MIT License (MIT) to list of compatible licenses

* * Address review comments

* * change reference to contrib azure to core azure

* * Fix spelling mistakes.
2020-02-25 17:49:16 -08:00
Jonathan Wei 4e8368a5d9 Set version to 0.18.0-SNAPSHOT (#9109) 2020-01-02 17:55:10 -05:00
Jonathan Wei 8af41d7cd0 Update version to 0.18.0-incubating-SNAPSHOT (#9009) 2019-12-11 14:04:03 -08:00
Chi Cao Minh af74acaa85 Address security vulnerabilities CVSS >= 7 (#8980)
* Address security vulnerabilities CVSS >= 7

Update dependencies to address security vulnerabilities with CVSS scores
of 7 or higher. A new Travis CI job is added to prevent new
high/critical security vulnerabilities from being added.

Updated dependencies:
- api-util 1.0.0 -> 1.0.3
- jackson 2.9.10 -> 2.10.1
- kafka 2.1.0 -> 2.1.1
- libthrift 0.10.0 -> 0.13.0
- protobuf 3.2.0 -> 3.11.0

The following high/critical security vulnerabilities are currently
suppressed (so that the new Travis CI job can be added now) and are left
as future work to fix:
- hibernate-validator:5.2.5
- jackson-mapper-asl:1.9.13
- libthrift:0.6.1
- netty:3.10.6
- nimbus-jose-jwt:4.41.1

* Rename EDL1 license file

* Fix inspection errors
2019-12-05 14:34:35 -08:00
jon-wei dfbc066163 Revert "[maven-release-plugin] prepare release druid-0.16.1-incubating-rc1"
This reverts commit a0f21d9b07.
2019-11-27 23:22:43 -08:00
jon-wei 0402ff85b8 Revert "[maven-release-plugin] prepare for next development iteration"
This reverts commit 8ffa71e7e6.
2019-11-27 23:22:32 -08:00
jon-wei 8ffa71e7e6 [maven-release-plugin] prepare for next development iteration 2019-11-27 23:18:48 -08:00
jon-wei a0f21d9b07 [maven-release-plugin] prepare release druid-0.16.1-incubating-rc1 2019-11-27 23:18:37 -08:00
Nishant Bangarwa 8537fbeca7 Implementing dropwizard emitter for druid (#7363)
* Implementing dropwizard emitter for druid

making metric manager and alert emitters as optional

* Refactor and make things work

more improvements

improve docs

refactrings

* Fix teamcity inspections

* review comments

* more review comments

* add limit to max number of gauges

* update pom version

* fix pom

* review comments

* review comment

* review comments

* fix broken doc link

review comments

review comments

* review comments

* fix checkstyle

* more spell check fixes

* fix travis failures
2019-10-01 14:59:30 -07:00
Clint Wylie 984958122b
packaging script adjustments (#8436)
* set encoding for license and notice scripts, split generate-license.py into generate-binary-license.py and check-licenses.py, check-licenses when -Papache-release is used

* missing docs

* doc fix

* more doc fix

* remove comments

* good catch travis +1

* fix lgtm alerts
2019-08-29 23:27:43 -07:00
Clint Wylie c73a489335
bump master version to 0.17.0-incubating-SNAPSHOT (#8421) 2019-08-28 01:58:36 -07:00
Clint Wylie 010f70b371
autogenerate NOTICE.BINARY from NOTICE and licenses.yaml (#8306)
* migrate binary notice entries to live in licenses.yaml, use licenses.yaml and NOTICE to generate NOTICE.BINARY at distribution time

* +x

* move release scripts to distribution/bin, fixup notice script, trim dependencies for avro and kerberos in licenses.yaml

* add missing hdfs-storage dependencies

* revert to old syntax, fixes

* formatting

* update notices for recently updated dependencies
2019-08-21 12:46:27 -07:00
Jihoon Son 0a3538b569 Fix license check in travis and make it optional (#8049)
* Fix license check in travis and make it optional

* debug

* fix build

* too loud maven

* move MAVEN_OPTS to top and add comments

* adjust script

* remove mvn option from python script
2019-07-09 19:35:29 -07:00
Jihoon Son 12f12676e3
Binary license management system (#7998)
* Binary license management system

* add missing file

* add comment

* Address comments

* print missing licenses

* print druid module name

* Add missing licenses and update versions

* fix library versions and add missing ones. also fix pom.xml

* testing multi thread

* Parallel report generation

* fix build error

* install pyyaml and use old api

* install python3

* fix travis script

* python3.6

* pip

* setuptools

* python3-setuptools

* address comment

* error on not found reports or registered licenses

* removed licenses

* debug

* travis debug

* add missing licenses

* travis debug

* debug

* remove debug code

* test build script

* travis debug

* still debug

* add missing python lib

* debug

* debug

* fix travis

* fix travis

* debug travis

* flush print

* print something more to keep travis alive

* adjust print

* single threaded

* single threaded

* debug

* debug

* remove debug

* remove deprecated-2017Q4 from travis conf

* remove comments and duplicate sudo
2019-07-08 12:24:51 -07:00
Clint Wylie 42a7b8849a remove FirehoseV2 and realtime node extensions (#8020)
* remove firehosev2 and realtime node extensions

* revert intellij stuff

* rat exclusion
2019-07-04 15:40:22 -07:00
Khwunchai Jaengsawang fb56f8d53c Rename io.druid to org.druid in tdigestsketch extension (#7996)
* Rename io.druid to org.druid in tdigestsketch extension

* Fix typo

Signed-off-by: Khwunchai Jaengsawang <khwunchai.j@ku.th>

* Add packaging check for community extensions

Signed-off-by: Khwunchai Jaengsawang <khwunchai.j@ku.th>
2019-07-01 12:46:35 -07:00
Jihoon Son 7abfbb066a Bump up snapshot version to 0.16.0 (#7802) 2019-05-30 17:17:33 -07:00
awelsh93 6964ac23a2 Adding influxdb emitter as a contrib extension (#7717)
* Adding influxdb emitter as a contrib extension

* addressing code review comments
2019-05-23 11:11:48 -07:00
Clint Wylie 6a6c6d573d
Add plain text README.txt, use relative link from README.md to build.md (#7611)
* use relative link to build instructions from top level readme

* add textfile to readme

* formatting

* make README.BINARY plaintext, move LABELS.md to LABELS, README.txt to README

* exclude README.BINARY still

* remove jdk links/recommmendations

* add script to use DRUIDVERSION in textfile README instead of latest, add links to recommended jdk to build.md

* license

* better readme template, links to latest if does not detect an apache release version

* fix
2019-05-09 21:29:26 -07:00
Samarth Jain b542bb9f34 TDigest backed sketch aggregators (#7331)
* First set of changes for tDigest histogram

* Add license

* Address code review comments

* Add a doc page for new T-Digest sketch aggregators. Minor code cleanup and comments.

* Remove synchronization from BufferAggregators. Address code review comments

* Fix typo
2019-05-09 17:22:55 -07:00
Eyal Yurman f02251ab2d Contributing Moving-Average Query to open source. (#6430)
* Contributing Moving-Average Query to open source.

* Fix failing code inspections.

* See if explicit types will invoke the correct comparison function.

* Explicitly remove support for druid.generic.useDefaultValueForNull configuration parameter.

* Update styling and headers for complience.

* Refresh code with latest master changes:

* Remove NullDimensionSelector.
* Apply changes of RequestLogger.
* Apply changes of TimelineServerView.

* Small checkstyle fix.

* Checkstyle fixes.

* Fixing rat errors; Teamcity errors.

* Removing support theta sketches. Will be added back in this pr or a following once DI conflicts with datasketches are resolved.

* Implements some of the review fixes.

* Contributing Moving-Average Query to open source.

* Fix failing code inspections.

* See if explicit types will invoke the correct comparison function.

* Explicitly remove support for druid.generic.useDefaultValueForNull configuration parameter.

* Update styling and headers for complience.

* Refresh code with latest master changes:

* Remove NullDimensionSelector.
* Apply changes of RequestLogger.
* Apply changes of TimelineServerView.

* Small checkstyle fix.

* Checkstyle fixes.

* Fixing rat errors; Teamcity errors.

* Removing support theta sketches. Will be added back in this pr or a following once DI conflicts with datasketches are resolved.

* Implements some of the review fixes.

* More fixes for review.

* More fixes from review.

* MapBasedRow is Unmodifiable. Create new rows instead of modifying existing ones.

* Remove more changes related to datasketches support.

* Refactor BaseAverager startFrom field and add a comment.

* fakeEvents field: Refactor initialization and add comment.

* Rename parameters (tiny change).

* Fix variable name typo in test (JAN_4).

* Fix styling of non camelCase fields.

* Fix Preconditions.checkArgument for cycleSize.

* Add more documentation to RowBucketIterable and other classes.

* key/value comment on in MovingAverageIterable.

* Fix anonymous makeColumnValueSelector returning null.

* Replace IdentityYieldingAccumolator with Yielders.each().

* * internalNext() should return null instead of throwing exception.
* Remove unused variables/prarameters.

* Harden MovingAverageIterableTest (Switch anyOf to exact match).

* Change internalNext() from recursion to iteration; Simplify next() and hasNext().

* Remove unused imports.

* Address review comments.

* Rename fakeEvents to emptyEvents.

* Remove redundant parameter key from computeMovingAverage.

* Check yielder as well in RowBucketIterable#hasNext()

* Fix javadoc.
2019-04-26 17:07:48 -07:00
Clint Wylie 89bb43f382 'core' ORC extension (#7138)
* orc extension reworked to use apache orc map-reduce lib, moved to core extensions, support for flattenSpec, tests, docs

* change binary handling to be compatible with avro and parquet, Rows.objectToStrings now converts byte[] to base64, change date handling

* better docs and tests

* fix it

* formatting

* doc fix

* fix it

* exclude redundant dependencies

* use latest orc-mapreduce, add hadoop jobProperties recommendations to docs

* doc fix

* review stuff and fix binaryAsString

* cache for root level fields

* more better
2019-04-09 09:03:26 -07:00
Charles Allen eeb3dbe79d Move GCP to a core extension (#6953)
* Move GCP to a core extension

* Don't provide druid-core >.<

* Keep AWS and GCP modules separate

* Move AWSModule to its own module

* Add aws ec2 extension and more modules in more places

* Fix bad imports

* Fix test jackson module

* Include AWS and GCP core in server

* Add simple empty method comment

* Update version to 15

* One more 0.13.0-->0.15.0 change

* Fix multi-binding problem

* Grep for s3-extensions and update docs

* Update extensions.md
2019-03-27 09:00:43 -07:00
Jonathan Wei 5486c2abf8
Update LICENSE and NOTICE files (#7026)
* Update LICENSE and NOTICE files

* Update react-table version
2019-03-04 18:45:22 -08:00
Jonathan Wei fafbc4a80e
Set version to 0.15.0-incubating-SNAPSHOT (#7014) 2019-02-07 14:02:52 -08:00
Jonathan Wei 8bc5eaa908
Set version to 0.14.0-incubating-SNAPSHOT (#7003) 2019-02-04 19:36:20 -08:00
Joshua Sun 7c7997e8a1 Add Kinesis Indexing Service to core Druid (#6431)
* created seekablestream classes

* created seekablestreamsupervisor class

* first attempt to integrate kafa indexing service to use SeekableStream

* seekablestream bug fixes

* kafkarecordsupplier

* integrated kafka indexing service with seekablestream

* implemented resume/suspend and refactored some package names

* moved kinesis indexing service into core druid extensions

* merged some changes from kafka supervisor race condition

* integrated kinesis-indexing-service with seekablestream

* unite tests for kinesis-indexing-service

* various bug fixes for kinesis-indexing-service

* refactored kinesisindexingtask

* finished up more kinesis unit tests

* more bug fixes for kinesis-indexing-service

* finsihed refactoring kinesis unit tests

* removed KinesisParititons and KafkaPartitions to use SeekableStreamPartitions

* kinesis-indexing-service code cleanup and docs

* merge #6291

merge #6337

merge #6383

* added more docs and reordered methods

* fixd kinesis tests after merging master and added docs in seekablestream

* fix various things from pr comment

* improve recordsupplier and add unit tests

* migrated to aws-java-sdk-kinesis

* merge changes from master

* fix pom files and forbiddenapi checks

* checkpoint JavaType bug fix

* fix pom and stuff

* disable checkpointing in kinesis

* fix kinesis sequence number null in closed shard

* merge changes from master

* fixes for kinesis tasks

* capitalized <partitionType, sequenceType>

* removed abstract class loggers

* conform to guava api restrictions

* add docker for travis other modules test

* address comments

* improve RecordSupplier to supply records in batch

* fix strict compile issue

* add test scope for localstack dependency

* kinesis indexing task refactoring

* comments

* github comments

* minor fix

* removed unneeded readme

* fix deserialization bug

* fix various bugs

* KinesisRecordSupplier unable to catch up to earliest position in stream bug fix

* minor changes to kinesis

* implement deaggregate for kinesis

* Merge remote-tracking branch 'upstream/master' into seekablestream

* fix kinesis offset discrepancy with kafka

* kinesis record supplier disable getPosition

* pr comments

* mock for kinesis tests and remove docker dependency for unit tests

* PR comments

* avg lag in kafkasupervisor #6587

* refacotred SequenceMetadata in taskRunners

* small fix

* more small fix

* recordsupplier resource leak

* revert .travis.yml formatting

* fix style

* kinesis docs

* doc part2

* more docs

* comments

* comments*2

* revert string replace changes

* comments

* teamcity

* comments part 1

* comments part 2

* comments part 3

* merge #6754

* fix injection binding

* comments

* KinesisRegion refactor

* comments part idk lol

* can't think of a commit msg anymore

* remove possiblyResetDataSourceMetadata() for IncrementalPublishingTaskRunner

* commmmmmmmmmments

* extra error handling in KinesisRecordSupplier getRecords

* comments

* quickfix

* typo

* oof
2018-12-21 12:49:24 -07:00
David Lim 3443f9a008 add missing contrib extensions to distribution packaging, fix influx package name, fix checkstyle plugin config to support Maven < 3.3.9 (#6717) 2018-12-10 17:56:32 -08:00
David Lim afb239b17a add missing license headers, in particular to MD files; clean up RAT … (#6563)
* add missing license headers, in particular to MD files; clean up RAT exclusions

* revert inadvertent doc changes

* docs

* cr changes

* fix modified druid-production.svg
2018-11-13 09:38:37 -08:00
Clint Wylie 1224d8b746 overhaul 'druid-parquet-extensions' module, promoting from 'contrib' to 'core' (#6360)
* move parquet-extensions from contrib to core, adds new hadoop parquet parser that does not convert to avro first and supports flattenSpec and int96 columns, add support for flattenSpec for parquet-avro conversion parser, much test with a bunch of files lifted from spark-sql

* fix avro flattener to support nullable primitives for auto discovery and now only supports primitive arrays instead of all arrays

* remove leftover print

* convert micro timestamp to millis

* checkstyle

* add ignore for .parquet and .parq to rat exclude

* fix legit test failure from avro flattern behavior change

* fix rebase

* add exclusions to pom to cut down on redundant jars

* refactor tests, add support for unwrapping lists for parquet-avro, review comments

* more comment

* fix oops

* tweak parquet-avro list handling

* more docs

* fix style

* grr styles
2018-11-05 21:33:42 -08:00
David Lim 822e564f54 include mysql-metadata-storage extension in distribution, but without… (#6497)
* include mysql-metadata-storage extension in distribution, but without the GPL-licensed connector library

* Install mysql connector package

* use symlinks to avoid versioning issues

* add documentation for fetching the mysql connector
2018-10-20 18:18:58 -07:00
David Lim e1a53fd17a fix distribution to not include contrib extensions by default, don't pull the entire AWS SDK bundle (#6494) 2018-10-19 13:50:05 -07:00
David Lim 73780536a6 Apache-ize POM (#6482)
* Apache-ize POM

* put revision information into MANIFEST.MF for binary release

* remove nightly profile

* fix flaky travis by overriding maven-remote-resources-plugin execution from the parent POM
2018-10-17 18:37:01 -07:00