Commit Graph

9182 Commits

Author SHA1 Message Date
Clint Wylie 1224d8b746 overhaul 'druid-parquet-extensions' module, promoting from 'contrib' to 'core' (#6360)
* move parquet-extensions from contrib to core, adds new hadoop parquet parser that does not convert to avro first and supports flattenSpec and int96 columns, add support for flattenSpec for parquet-avro conversion parser, much test with a bunch of files lifted from spark-sql

* fix avro flattener to support nullable primitives for auto discovery and now only supports primitive arrays instead of all arrays

* remove leftover print

* convert micro timestamp to millis

* checkstyle

* add ignore for .parquet and .parq to rat exclude

* fix legit test failure from avro flattern behavior change

* fix rebase

* add exclusions to pom to cut down on redundant jars

* refactor tests, add support for unwrapping lists for parquet-avro, review comments

* more comment

* fix oops

* tweak parquet-avro list handling

* more docs

* fix style

* grr styles
2018-11-05 21:33:42 -08:00
Roman Leventov a2a1a1c2c9 Hide NullDimensionSelector from public (#6480) 2018-11-02 04:38:21 -07:00
David Lim 23ad3d214c fixup docs to download from Apache mirror, fixup tarball name and path, change references from quickstart/* to quickstart/tutorial/* (#6570) 2018-11-01 21:47:29 -07:00
Caroline1000 26d992840c correct default tier name (#6568) 2018-11-01 17:51:13 -07:00
QiuMM ddd15a6907 correct default value for maxTotalRows (#6566) 2018-11-01 16:53:15 -07:00
Jihoon Son a92c2a197b
Move supervisor APIs to api-reference (#6555)
* Move supervisor APIs to api-reference

* fix kafka-specific docs

* add ingestion stats report
2018-11-01 13:10:05 -07:00
QiuMM 7b34662462 Period load/drop/broadcast rules should include the future by default (#6414)
* Period load/drop/broadcast rules should include the future by default

* address comments

* adjust coordinator console and tweak docs

* address comments

* fix travis-ci
2018-11-01 09:43:34 -07:00
Jihoon Son d2a533c7c7 Add doc for missing balancerComputeThreads configuration (#6561)
* Add doc for missing balancerComputeThreads configuration

* remove duplicate
2018-10-31 18:43:12 -07:00
Roman Leventov 2cdce2e2a6
Add RequestLogEventBuilderFactory (#6477)
This PR allows to control the fields in `RequestLogEvent`, emitted in `EmittingRequestLogger`. In our case, we want to get rid of the `intervals` fields of the query objects that are a part of `DefaultRequestLogEvent`. They are enormous (thousands of segments) and not useful.

Related to #5522, FYI @a2l007.
2018-10-31 22:24:37 +01:00
Gian Merlino d5e9e5686e Set new keys for integration-tests. (#6554) 2018-10-31 09:01:42 -07:00
taiii b1159174b7 Update mysql.md (#6545) 2018-10-30 14:01:32 -07:00
QiuMM 676f5e6d7f Prohibit some guava collection APIs and use JDK collection APIs directly (#6511)
* Prohibit some guava collection APIs and use JDK APIs directly

* reset files that changed by accident

* sort codestyle/druid-forbidden-apis.txt alphabetically
2018-10-29 13:02:43 +01:00
Samarth Jain 0a90b3d51a Remove unused code (#6504)
* Remove unused code

* Remove usage of list in setDimensions and setAggregatorSpecs

* Fix formatting to adhere to 120 character guideline
2018-10-26 11:31:10 -07:00
Jonathan Wei 8382764900 Remove unused bin/init script, conf-quickstart reference (#6520) 2018-10-26 11:30:01 -07:00
Joshua Sun f7753ef1e2 fix KafkaSupervisor stats report error (#6508)
* fix kafkasupervisor stats 500

* added unit test

* throw error if group already exists
2018-10-25 15:45:54 -07:00
Clint Wylie ee1fc93f97 fix exception in Supervisor.start causing overlord unable to become leader (#6516)
* fix exception thrown by Supervisor.start causing overlord unable to become leader

* fix style
2018-10-25 15:44:04 -07:00
Michael Trelinski aef1b39762 Update init (#6514)
Fix bin/init to source from proper directory.
2018-10-25 13:40:23 -07:00
Clint Wylie e1057ad47a Fix NPE in TaskLockbox that prevents overlord leadership (#6512)
* fix NPE that prevents overlord from assuming leadership if extension that provides indexing task type is not loaded

* heh
2018-10-25 13:06:11 -07:00
Jonathan Wei b2d9b6f23d Allow custom TLS cert checks (#6432)
* Allow custom TLS cert checks

* PR comment

* Checkstyle, PR comment
2018-10-24 16:31:52 -07:00
QiuMM 601183b4c7 Add period drop before rule (#6415)
* Add period drop before rule

* add license header

* support period drop before rule in coordinator console

* address comments
2018-10-24 12:44:30 -07:00
Alexander Saydakov ec9d1827a0 updated to use the latest sketches-core-0.12.0 (#6381) 2018-10-23 11:20:19 -07:00
Roman Leventov 84ac18dc1b
Catch some incorrect method parameter or call argument formatting patterns with checkstyle (#6461)
* Catch some incorrect method parameter or call argument formatting patterns with checkstyle

* Fix DiscoveryModule

* Inline parameters_and_arguments.txt

* Fix a bug in PolyBind

* Fix formatting
2018-10-23 07:17:38 -03:00
Faxian Zhao c5bf4e7503 update insert pending segments logic to synchronous (#6336)
* 1. Mysql default transaction isolation is REPEATABLE_READ, treat it as READ_COMMITTED will reduce insert id conflict.
2. Add an index to 'dataSource used end' is work well for the most of scenarios(get recently segments), and it will speed up sync add pending segments in DB.
3. 'select and insert' is not need within transaction.

* Use TaskLockbox.doInCriticalSection instead of synchronized syntax to speed up insert pending segments.

* fix typo for NullPointerException
2018-10-22 19:48:20 -07:00
Samarth Jain 359576a80b Implement force push down for nested group by query (#5471)
* Force nested query push down

* Code review changes
2018-10-22 13:43:47 -07:00
elloooooo 1e82b6291e Remove consumer.listTopics() method in case when too many topics in kafka causes the FullGC in Overlord (#6455)
* remove consumer.listTopics() method

* add consumerLock and exception handling for consumer.partitionFor() and remove some useless checks

* add check in case consumer.partitionsFor() returns null

* fix CI failure

* fix failed UT

* Revert "fix CI failure"

This reverts commit f839d09e1e.

* revert unless commit and re-commit the useful part to fix failed UT
2018-10-22 10:46:31 -07:00
Clint Wylie e83cc22996 use a sha512 hash of bloom filter for cache key instead of filter bytes (#6488)
* use a sha512 hash of bloom filter for cache key instead of filter bytes

* make serde private, BloomDimFilter.toString and BloomDimFilter.equals use hash instead of bloomKFilter which has no tostring or equals of its own

* keep and use HashCode object instead of converting to bytes up front

* uneeded imports oops

* tweaks from review

* refactor dupe code

* refactor
2018-10-22 07:57:21 -07:00
David Lim 822e564f54 include mysql-metadata-storage extension in distribution, but without… (#6497)
* include mysql-metadata-storage extension in distribution, but without the GPL-licensed connector library

* Install mysql connector package

* use symlinks to avoid versioning issues

* add documentation for fetching the mysql connector
2018-10-20 18:18:58 -07:00
Jonathan Wei 9851921f43 Fix integration test service logging (#6479) 2018-10-20 14:55:14 -07:00
Joshua Sun bf90b2b183 fix shallow clone git-commit-id plugin unable to find commits until tag issue (#6495) 2018-10-20 14:45:09 -07:00
David Lim e1a53fd17a fix distribution to not include contrib extensions by default, don't pull the entire AWS SDK bundle (#6494) 2018-10-19 13:50:05 -07:00
QiuMM f5f4171a45 QueryCountStatsMonitor: emit query/count (#6473)
Let `QueryCountStatsMonitor` emit `query/count`, then I can monitor QPS of my services, or I have to count it by myself.
2018-10-19 10:15:02 -03:00
Joshua Sun b662fe84c5 fix TaskRunnerUtils String formatting issue (#6492)
* fix TaskRunnerUtils String formatting issue

* additional fixes
2018-10-18 19:16:46 -06:00
David Lim 73780536a6 Apache-ize POM (#6482)
* Apache-ize POM

* put revision information into MANIFEST.MF for binary release

* remove nightly profile

* fix flaky travis by overriding maven-remote-resources-plugin execution from the parent POM
2018-10-17 18:37:01 -07:00
patelh f0b977ea7f Upgrade lz4-java to 1.5.0 (#6478) 2018-10-17 18:34:20 -07:00
Clint Wylie c034ed8c43 fix a couple of missed changes that should been in #6443 (#6487) 2018-10-17 16:22:17 -07:00
David Lim f8710c8e20 add DISCLAIMER file and modify NOTICE file for ASF (#6475)
* add DISCLAIMER file and modify NOTICE file for ASF

* add line breaks
2018-10-17 08:36:13 -07:00
Jonathan Wei b9d01f5770 More portable expired cert gen for IT (#6481) 2018-10-17 08:24:15 -07:00
patelh c780aacc03 Add ability to specify dbcp properties file (#6419)
* Add ability to specify dbcp properties file

* Address PR comments, use mock config, remove setter

* Add documentation

* APRC, updated docs with example file contents

* APRC, add @Nullable, @VisibileForTesting, update doc

* APRC, remove error log, use props directly as jackson binding

* Remove unused files
2018-10-16 12:27:19 -07:00
Roman Leventov 789c9a1dc7 Prohibit using Object|Long|Float|DoubleColumnSelector in instanceof statements (#6470)
* Prohibit using Object|Long|Float|DoubleColumnSelector in instanceof statements

* Doc fixes
2018-10-15 15:41:43 -07:00
QiuMM 85a89e2703 make druid node bind address configurable (#6464)
* make druid node bind address configurable

* fix tests

* fix travis-ci
2018-10-15 14:19:40 -07:00
Gian Merlino f537c0069a SQL: Support for selecting multi-value dimensions. (#6462)
* SQL: Support for selecting multi-value dimensions.

Fixes #4637. Doesn't completely address everything mentioned in #4638,
but at least fixes one issue on the way there.

* Fix null cases in tests.
2018-10-15 14:01:21 -07:00
Roman Leventov b57d5df074
Emit emitter/buffers/allocated/delta and emitter/buffers/failed/delta (#6425)
@gianm gauge `emitter/buffers/failed` and `emitter/buffers/totalAllocated` metrics proved to be very inconvenient in practice: they are not additive across a fleet of nodes, are randomly reset when a JVM restarts (that may be completely independent from metrics emitting issues). Regarding [problem identification](https://github.com/apache/incubator-druid/pull/4973#issuecomment-338119125), `emitter/buffers/dropped` serves this well.

Currently in this PR I added `/delta` vs. `/total` metrics, but if there are no objections I would remove `/total` and so align `emitter/buffers/allocated` and `/failed` with `/dropped` and `/events/emitted` which are already delta-only.
2018-10-15 17:30:29 -03:00
QiuMM 6c71ee5ed5 fix type mismatch caused by #6377 (#6466) 2018-10-15 17:34:18 +09:00
Roman Leventov aa121da25f Use NodeType enum instead of Strings (#6377)
* Use NodeType enum instead of Strings

* Make NodeType constants uppercase

* Fix CommonCacheNotifier and NodeType/ServerType comments

* Reconsidering comment

* Fix import

* Add a comment to CommonCacheNotifier.NODE_TYPES
2018-10-14 20:49:38 -07:00
robertervin 95ab1ea737 Fix Empty InDimFilter Failure (#6330)
* fix empty InDimFilter failure (#6101)

* Add test case for empty values input

* Add documentation for empty values in InDimFilter
2018-10-14 20:43:16 -07:00
Clint Wylie 84598fba3b combine druid-api, druid-common, java-util into druid-core (#6443)
* combine druid-api, druid-common, java-util

* spacing
2018-10-14 20:37:37 -07:00
Diego Costa d5ccd582b5 Reducing the usage of Jmh Invocation level on setup (#6459)
* Reducing the usage of invocation level setup

* Removing LoadStatusBenchmark as it has no use in the codebase
2018-10-14 20:37:11 -07:00
Roman Leventov e3397ba00f Enforce Druid's exception class use (#6456) 2018-10-13 16:35:14 -07:00
Surekha e908fd6db7 Add check for nullable numRows (#6460)
* Add check for nullable numRows

* Make numRows long instead of Long type

* Add check for numRows in unit test
* small refactoring

* Modify test

PR comment from https://github.com/apache/incubator-druid/pull/6094#pullrequestreview-163937783

* Add a test for serverSegments table

* update tests
2018-10-13 15:08:42 -07:00
dongyifeng b06ac54a5e add PrefixFilteredDimensionSpec for multi-value dimensions (#6307)
* add PrefixFilteredDimensionSpec for multi-value dimensions

* add docs for PrefixFilteredDimensionSpec

* remove unnecessary null handling

* add null check to the result of NullHandling
2018-10-12 17:51:09 -07:00