8947 Commits

Author SHA1 Message Date
Jihoon Son 0395d554e1 Properly reset total size of segmentsToCompact in NewestSegmentFirstIterator (#6622)
* Properly reset total size of segmentsToCompact in NewestSegmentFirstIterator

* add test
2018-11-15 01:00:51 -08:00
Roman Leventov 0b70c36eb0 Fix bugs in ExprEval (#6617) 2018-11-14 15:20:52 -08:00
Niketh Sabbineni 2ebdce20b1 Fix smile query documentation (#6620) 2018-11-14 08:51:02 +08:00
Jihoon Son cdae2fe7b5 Deprecate IntervalChunkingQueryRunner (#6591)
* Deprecate IntervalChunkingQueryRunner

* add doc

* deprecate metric

* fix doc
2018-11-14 06:33:27 +08:00
Gian Merlino 80173b5d29 SQL: Set INFORMATION_SCHEMA catalog name to "druid". (#6595)
* SQL: Set INFORMATION_SCHEMA catalog name to "druid".

Some third party tools ignore catalogs with empty names. So using
the name "druid" for the catalog makes integration easier.

* Update tests.
2018-11-14 06:32:40 +08:00
Gian Merlino ab518781bb SQL: Support AVG on system tables. (#6601) 2018-11-14 06:31:33 +08:00
Gian Merlino 154b6fbcef SQL: Add "POSITION" function. (#6596)
Also add a "fromIndex" argument to the strpos expression function. There
are some -1 and +1 adjustment terms due to the fact that the strpos
expression behaves like Java indexOf (0-indexed), but the POSITION SQL
function is 1-indexed.
2018-11-13 13:39:00 -08:00
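
The -1 and +1 adjustment terms described in the commit above can be illustrated with a minimal sketch. This is not the Druid implementation: the function names `strpos` and `position` below are stand-ins written for illustration, and Python's `str.find` is used in place of Java's `indexOf` (both are 0-indexed and return -1 when nothing is found, for a non-empty search string).

```python
def strpos(haystack: str, needle: str, from_index: int = 0) -> int:
    """0-indexed search that returns -1 when the needle is absent.

    Illustrative stand-in for an indexOf-style expression function.
    Negative start offsets are clamped to 0, as Java's indexOf does.
    """
    return haystack.find(needle, max(from_index, 0))


def position(needle: str, haystack: str, from_index: int = 1) -> int:
    """1-indexed search that returns 0 when the needle is absent.

    The -1 converts the 1-indexed start offset into a 0-indexed one,
    and the +1 converts the 0-indexed result back into a 1-indexed one.
    A "not found" result of -1 therefore becomes 0.
    """
    return strpos(haystack, needle, from_index - 1) + 1
```

With this arrangement a miss needs no special case, because the same +1 that shifts a hit into 1-indexed form also maps -1 to 0.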
QiuMM f2b73f9df1 fix cannot resolve param at OverlordResource#getTasks (#6593) 2018-11-13 09:47:11 -08:00
Jihoon Son 7b262b7123 Remove unnecessary path param from auto compaction api (#6594)
* Remove unnecessary path param from auto compaction api

* fix ci
2018-11-13 09:46:13 -08:00
David Lim afb239b17a add missing license headers, in particular to MD files; clean up RAT … (#6563)
* add missing license headers, in particular to MD files; clean up RAT exclusions

* revert inadvertent doc changes

* docs

* cr changes

* fix modified druid-production.svg
2018-11-13 09:38:37 -08:00
Gian Merlino 52f6bdc1eb Optimization for expressions that hit a single long column. (#6599)
* Optimization for expressions that hit a single long column.

There was previously a single-long-input optimization that applied only
to the time column. These have been combined together. Also adds
type-specific value caching to ExprEval, which allowed simplifying
the SingleLongInputCachingExpressionColumnValueSelector code.

* Add more benchmarks.

* Don't use LRU cache for __time.

* Simplify a bit.

* Let the cache grow.
2018-11-13 09:36:32 -08:00
Clint Wylie e326086604 fix kafka indexing task not processing through end offsets on publish, fixes #6602 (#6603) 2018-11-12 14:27:32 -08:00
Clint Wylie c2f020eacc fix druid-bloom-filter thread-safety (#6584)
* use BloomFilter instead of BloomKFilter since the latter's test method is not threadsafe

* fix formatting

* style and forbidden api

* remove redundant hive notice entry

* add todo with note to delete copied implementation and link to related hive jira

* better fix for masks than ThreadLocal
2018-11-09 10:55:17 -08:00
Roman Leventov 54351a5c75 Fix various bugs; Enable more IntelliJ inspections and update error-prone (#6490)
* Fix various bugs; Enable more IntelliJ inspections and update error-prone

* Fix NPE

* Fix inspections

* Remove unused imports
2018-11-06 14:38:08 -08:00
Surekha bcb754d066 Use current coordinator leader instead of cached one (#6551) (#6552)
* Use current coordinator leader instead of cached one (#6551)

Check the response status and throw exception if not OK

* Modify tests

* PR comment

* Add the correct check for status of BytesAccumulatingResponseHandler

* Move the status check into JsonParserIterator so sql query outputs meaningful message on failure

* Fix tests
2018-11-06 13:09:51 -08:00
Clint Wylie 1224d8b746 overhaul 'druid-parquet-extensions' module, promoting from 'contrib' to 'core' (#6360)
* move parquet-extensions from contrib to core; add a new hadoop parquet parser that does not convert to avro first and supports flattenSpec and int96 columns; add flattenSpec support to the parquet-avro conversion parser; add many tests using files lifted from spark-sql

* fix avro flattener to support nullable primitives for auto discovery; it now supports only primitive arrays instead of all arrays

* remove leftover print

* convert micro timestamp to millis

* checkstyle

* add ignore for .parquet and .parq to rat exclude

* fix legit test failure from avro flattener behavior change

* fix rebase

* add exclusions to pom to cut down on redundant jars

* refactor tests, add support for unwrapping lists for parquet-avro, review comments

* more comment

* fix oops

* tweak parquet-avro list handling

* more docs

* fix style

* grr styles
2018-11-05 21:33:42 -08:00
Roman Leventov a2a1a1c2c9 Hide NullDimensionSelector from public (#6480) 2018-11-02 04:38:21 -07:00
David Lim 23ad3d214c fixup docs to download from Apache mirror, fixup tarball name and path, change references from quickstart/* to quickstart/tutorial/* (#6570) 2018-11-01 21:47:29 -07:00
Caroline1000 26d992840c correct default tier name (#6568) 2018-11-01 17:51:13 -07:00
QiuMM ddd15a6907 correct default value for maxTotalRows (#6566) 2018-11-01 16:53:15 -07:00
Jihoon Son a92c2a197b Move supervisor APIs to api-reference (#6555)
* Move supervisor APIs to api-reference

* fix kafka-specific docs

* add ingestion stats report
2018-11-01 13:10:05 -07:00
QiuMM 7b34662462 Period load/drop/broadcast rules should include the future by default (#6414)
* Period load/drop/broadcast rules should include the future by default

* address comments

* adjust coordinator console and tweak docs

* address comments

* fix travis-ci
2018-11-01 09:43:34 -07:00
Jihoon Son d2a533c7c7 Add doc for missing balancerComputeThreads configuration (#6561)
* Add doc for missing balancerComputeThreads configuration

* remove duplicate
2018-10-31 18:43:12 -07:00
Roman Leventov 2cdce2e2a6 Add RequestLogEventBuilderFactory (#6477)
This PR makes it possible to control the fields in `RequestLogEvent`, emitted in `EmittingRequestLogger`. In our case, we want to get rid of the `intervals` fields of the query objects that are a part of `DefaultRequestLogEvent`. They are enormous (thousands of segments) and not useful.

Related to #5522, FYI @a2l007.
2018-10-31 22:24:37 +01:00
Gian Merlino d5e9e5686e Set new keys for integration-tests. (#6554) 2018-10-31 09:01:42 -07:00
taiii b1159174b7 Update mysql.md (#6545) 2018-10-30 14:01:32 -07:00
QiuMM 676f5e6d7f Prohibit some guava collection APIs and use JDK collection APIs directly (#6511)
* Prohibit some guava collection APIs and use JDK APIs directly

* reset files that changed by accident

* sort codestyle/druid-forbidden-apis.txt alphabetically
2018-10-29 13:02:43 +01:00
Samarth Jain 0a90b3d51a Remove unused code (#6504)
* Remove unused code

* Remove usage of list in setDimensions and setAggregatorSpecs

* Fix formatting to adhere to 120 character guideline
2018-10-26 11:31:10 -07:00
Jonathan Wei 8382764900 Remove unused bin/init script, conf-quickstart reference (#6520) 2018-10-26 11:30:01 -07:00
Joshua Sun f7753ef1e2 fix KafkaSupervisor stats report error (#6508)
* fix kafkasupervisor stats 500

* added unit test

* throw error if group already exists
2018-10-25 15:45:54 -07:00
Clint Wylie ee1fc93f97 fix exception in Supervisor.start causing overlord unable to become leader (#6516)
* fix exception thrown by Supervisor.start causing overlord unable to become leader

* fix style
2018-10-25 15:44:04 -07:00
Michael Trelinski aef1b39762 Update init (#6514)
Fix bin/init to source from proper directory.
2018-10-25 13:40:23 -07:00
Clint Wylie e1057ad47a Fix NPE in TaskLockbox that prevents overlord leadership (#6512)
* fix NPE that prevents overlord from assuming leadership if extension that provides indexing task type is not loaded

* heh
2018-10-25 13:06:11 -07:00
Jonathan Wei b2d9b6f23d Allow custom TLS cert checks (#6432)
* Allow custom TLS cert checks

* PR comment

* Checkstyle, PR comment
2018-10-24 16:31:52 -07:00
QiuMM 601183b4c7 Add period drop before rule (#6415)
* Add period drop before rule

* add license header

* support period drop before rule in coordinator console

* address comments
2018-10-24 12:44:30 -07:00
Alexander Saydakov ec9d1827a0 updated to use the latest sketches-core-0.12.0 (#6381) 2018-10-23 11:20:19 -07:00
Roman Leventov 84ac18dc1b Catch some incorrect method parameter or call argument formatting patterns with checkstyle (#6461)
* Catch some incorrect method parameter or call argument formatting patterns with checkstyle

* Fix DiscoveryModule

* Inline parameters_and_arguments.txt

* Fix a bug in PolyBind

* Fix formatting
2018-10-23 07:17:38 -03:00
Faxian Zhao c5bf4e7503 update insert pending segments logic to synchronous (#6336)
* 1. MySQL's default transaction isolation is REPEATABLE_READ; treating it as READ_COMMITTED reduces insert id conflicts.
2. Adding an index on 'dataSource used end' works well for most scenarios (fetching recent segments) and speeds up the synchronous insertion of pending segments in the DB.
3. 'select and insert' does not need to run within a transaction.

* Use TaskLockbox.doInCriticalSection instead of synchronized syntax to speed up insert pending segments.

* fix typo for NullPointerException
2018-10-22 19:48:20 -07:00
Samarth Jain 359576a80b Implement force push down for nested group by query (#5471)
* Force nested query push down

* Code review changes
2018-10-22 13:43:47 -07:00
elloooooo 1e82b6291e Remove consumer.listTopics() call, since too many topics in Kafka can cause a full GC in the Overlord (#6455)
* remove consumer.listTopics() method

* add consumerLock and exception handling for consumer.partitionsFor() and remove some useless checks

* add check in case consumer.partitionsFor() returns null

* fix CI failure

* fix failed UT

* Revert "fix CI failure"

This reverts commit f839d09e1e.

* revert useless commit and re-commit the useful part to fix failed UT
2018-10-22 10:46:31 -07:00
Clint Wylie e83cc22996 use a sha512 hash of bloom filter for cache key instead of filter bytes (#6488)
* use a sha512 hash of bloom filter for cache key instead of filter bytes

* make serde private; BloomDimFilter.toString and BloomDimFilter.equals use the hash instead of bloomKFilter, which has no toString or equals of its own

* keep and use HashCode object instead of converting to bytes up front

* unneeded imports oops

* tweaks from review

* refactor dupe code

* refactor
2018-10-22 07:57:21 -07:00
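
The idea behind the commit above, keying a cache on a SHA-512 hash of the bloom filter rather than on the filter's raw bytes, can be sketched as follows. This is a hedged illustration only: the function name `bloom_cache_key`, the key layout, and the `0xff` separator byte are assumptions made for the example and are not taken from the Druid code.

```python
import hashlib


def bloom_cache_key(dimension: str, filter_bytes: bytes) -> bytes:
    """Build a fixed-size cache key for a (dimension, bloom filter) pair.

    Hypothetical layout: dimension name, a separator byte, then the
    SHA-512 digest of the serialized filter. The digest is always
    64 bytes, so the key stays small however large the filter is.
    """
    digest = hashlib.sha512(filter_bytes).digest()
    return dimension.encode("utf-8") + b"\xff" + digest


# A 1 MiB serialized filter still yields a short key.
large_filter = bytes(1024 * 1024)
key = bloom_cache_key("user_id", large_filter)
print(len(key))
```

Equal filters hash to equal keys, so cache lookups behave as before, while the key no longer grows with the size of the filter.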
David Lim 822e564f54 include mysql-metadata-storage extension in distribution, but without… (#6497)
* include mysql-metadata-storage extension in distribution, but without the GPL-licensed connector library

* Install mysql connector package

* use symlinks to avoid versioning issues

* add documentation for fetching the mysql connector
2018-10-20 18:18:58 -07:00
Jonathan Wei 9851921f43 Fix integration test service logging (#6479) 2018-10-20 14:55:14 -07:00
Joshua Sun bf90b2b183 fix shallow clone git-commit-id plugin unable to find commits until tag issue (#6495) 2018-10-20 14:45:09 -07:00
David Lim e1a53fd17a fix distribution to not include contrib extensions by default, don't pull the entire AWS SDK bundle (#6494) 2018-10-19 13:50:05 -07:00
QiuMM f5f4171a45 QueryCountStatsMonitor: emit query/count (#6473)
Let `QueryCountStatsMonitor` emit `query/count` so that I can monitor the QPS of my services; otherwise I have to count it myself.
2018-10-19 10:15:02 -03:00
Joshua Sun b662fe84c5 fix TaskRunnerUtils String formatting issue (#6492)
* fix TaskRunnerUtils String formatting issue

* additional fixes
2018-10-18 19:16:46 -06:00
David Lim 73780536a6 Apache-ize POM (#6482)
* Apache-ize POM

* put revision information into MANIFEST.MF for binary release

* remove nightly profile

* fix flaky travis by overriding maven-remote-resources-plugin execution from the parent POM
2018-10-17 18:37:01 -07:00
patelh f0b977ea7f Upgrade lz4-java to 1.5.0 (#6478) 2018-10-17 18:34:20 -07:00
Clint Wylie c034ed8c43 fix a couple of missed changes that should have been in #6443 (#6487) 2018-10-17 16:22:17 -07:00