Commit Graph

334 Commits

Author SHA1 Message Date
Chi Cao Minh 6b02991464
Match GREATEST/LEAST function behavior to other DBs (#9488)
* Match GREATEST/LEAST function behavior

Change the behavior of the GREATEST / LEAST functions to be similar to
how it is implemented in other databases (as functions instead of
aggregators). The GREATEST/LEAST functions are not in the SQL standard,
but users will expect behavior similar to what other databases provide.

* Match postgres behavior & handle more SQL types

* Fix imports
2020-03-12 15:10:11 -07:00
Himanshu 75a5591448
remove old unused zookeeper dependent lookups code (#9480)
* remove old unused zookeeper dependent lookups code

* make  intellij inspector happy
2020-03-10 12:12:48 -07:00
Clint Wylie 8b9fe6f584
query laning and load shedding (#9407)
* prototype

* merge QueryScheduler and QueryManager

* everything in its right place

* adjustments

* docs

* fixes

* doc fixes

* use resilience4j instead of semaphore

* more tests

* simplify

* checkstyle

* spelling

* oops heh

* remove unused

* simplify

* concurrency tests

* add SqlResource tests, refactor error response

* add json config tests

* use LongAdder instead of AtomicLong

* remove test only stuffs from scheduler

* javadocs, etc

* style

* partial review stuffs

* adjust

* review stuffs

* more javadoc

* error response documentation

* spelling

* preserve user specified lane for NoSchedulingStrategy

* more test, why not

* doc adjustment

* style

* missed review for make a thing a constant

* fixes and tests

* fix test

* Update docs/configuration/index.md

Co-Authored-By: sthetland <steve.hetland@imply.io>

* doc update

Co-authored-by: sthetland <steve.hetland@imply.io>
2020-03-10 02:57:16 -07:00
Jihoon Son 75e2051195
Convert array_contains() and array_overlaps() into native filters if possible (#9487)
* Convert array_contains() and array_overlaps() into native filters if
possible

* make spotbugs happy and fix null results when null compatible
2020-03-09 22:50:38 -07:00
Gian Merlino c9faf3e148
Add SQL GROUPING SETS support. (#9122)
* Add SQL GROUPING SETS support.

Built on top of the subtotalsSpec feature in the groupBy query. This also involves
two changes to subtotalsSpec:

- Alter behavior so limitSpec is applied after subtotalsSpec, rather than applied to
  each grouping set. This is more in line with SQL standard behavior. I think it is okay
  to make this change, since the old behavior was not documented, so users should
  hopefully not be depending on it.
- Fix a bug where virtual columns were included in the subtotal queries, but they
  should not have been.

Also fixes two bugs in query equality checking:

- BaseQuery: Use getDuration() instead of "duration" in equals and hashCode, since the
  latter is lazily initialized and might be null in one query but not the other.
- GroupByQuery: Include subtotalsSpec in equals and hashCode.

* Fix bugs.

* Fix tests.

* PR updates.

* Grouping class hygiene.
2020-02-26 08:52:39 -08:00
Clint Wylie b408a6d774
sql support for dynamic parameters (#6974)
* sql support for dynamic parameters

* fixup

* javadocs

* fixup from merge

* formatting

* fixes

* fix it

* doc fix

* remove druid fallback self-join parameterized test

* unused imports

* ignore test for now

* fix imports

* fixup

* fix merge

* merge fixup

* fix test that cannot vectorize

* fixup and more better

* dependency thingo

* fix docs

* tweaks

* fix docs

* spelling

* unused imports after merge

* review stuffs

* add comment

* add ignore text

* review stuffs
2020-02-19 13:09:20 -08:00
Maytas Monsereenusorn c30579e47b
ANY Aggregator should not skip null values implementation (#9317)
* ANY Aggregator should not skip null values implementation

* add tests

* add more tests

* Update documentation

* add more tests

* address review comments

* optimize StringAnyBufferAggregator

* fix failing tests

* address pr comments
2020-02-12 14:01:41 -08:00
Aditya 868fdeb384
GREATEST/LEAST post-aggregators in SQL (#8719)
* implement shell for greatest sql aggregator with hardcoded long values

* implement functional long greatest aggregator for direct access columns

* implement greatest & least sql aggregators for long & double types using abstract base class

* add javadocs, unit tests & handling for floats for greatest/least postaggregations

* minor checkstyle fix

* improve naming for the test cases

* make inner class static

* remove blank lines to retest travis build

* change trivial text to rerun travis build

* implement suggested updates for greatest/least sql aggs & fix checkstyle issues

* fix stale comments in greatest/least sql aggs abstract base

* Update sql.md

* improve sql function definitions for greatest/least sql aggs

* add more tests for greatest/least sql aggs

* add tests to cover invalid greatest/least sql expressions

* rename & reorder greatest least sql tests
2020-02-04 17:08:53 -08:00
Clint Wylie 8011211a0c first/last aggregators and nulls (#9161)
* null handling for numeric first/last aggregators, refactor to not extend nullable numeric agg since they are complex typed aggs

* initially null or not based on config

* review stuff, make string first/last consistent with null handling of numeric columns, more tests

* docs

* handle nil selectors, revert to primitive first/last types so groupby v1 works...
2020-01-20 11:51:54 -08:00
Suneet Saldanha 180c622e0f Minor doc updates (#9217)
* update string first last aggs

* update kafka ingestion specs in docs

* remove unnecessary parser spec
2020-01-20 11:34:37 -08:00
Gian Merlino d21054f7c5
Remove the deprecated interval-chunking stuff. (#9216)
* Remove the deprecated interval-chunking stuff.

See https://github.com/apache/druid/pull/6591, https://github.com/apache/druid/pull/4004#issuecomment-284171911 for details.

* Remove unused import.

* Remove chunkInterval too.
2020-01-19 17:14:23 -08:00
Maytas Monsereenusorn 42359c93dd Implement ANY aggregator (#9187)
* Implement ANY aggregator

* Add copyright headers

* Add unit tests

* fix BufferAggregator

* Fix bug in BufferAggregator

* hook up the SQL command

* add check for buffer aggregator

* Address comment

* address comments

* add docs

* Address comments

* add more tests for numeric columns that have null values when run in sql compatible null mode

* fix checkstyle errors

* fix failing tests

* fix failing tests
2020-01-16 14:40:32 -08:00
Jonathan Wei aa539177ec De-incubation cleanup in code, docs, packaging (#9108)
* De-incubation cleanup in code, docs, packaging

* remove unused docs script
2020-01-03 12:33:19 -05:00
Clint Wylie 6881535b48
docs - clarify cache parameters (#9020) 2019-12-13 16:53:45 -08:00
Clint Wylie d67c3c7aed document SQL compatible null handling mode (#8894)
* document SQL compatible null handling mode

* adjustments

* fix docs

* review changes
2019-11-20 06:52:20 -08:00
Jonathan Wei 75ea0d592a Add more datasketches doubles sketch SQL functions (#8843)
* Add more datasketches doubles sketch SQL postaggs

* style and lgtm
2019-11-08 18:05:06 -08:00
Gian Merlino 0e8c3f74d0 SQL: EARLIEST, LATEST aggregators. (#8815)
* SQL: EARLIEST, LATEST aggregators.

I chose these names instead of FIRST, LAST because those are already
reserved functions in Calcite that mean something different. I think
these are also better names anyway.

* Finalify.

* SQL updates.

* Adjust aggregator calls.

* Validations, test updates.

* Review docs.
2019-11-08 16:29:25 -08:00
Clint Wylie 7aafcf8bca parallel broker merges on fork join pool (#8578)
* sketch of broker parallel merges done in small batches on fork join pool

* fix non-terminating sequences, auto compute parallelism

* adjust benches

* adjust benchmarks

* now hella more faster, fixed dumb

* fix

* remove comments

* log.info for debug

* javadoc

* safer block for sequence to yielder conversion

* refactor LifecycleForkJoinPool into LifecycleForkJoinPoolProvider which wraps a ForkJoinPool

* smooth yield rate adjustment, more logs to help tune

* cleanup, less logs

* error handling, bug fixes, on by default, more parallel, more tests

* remove unused var

* comments

* timeboundary mergeFn

* simplify, more javadoc

* formatting

* pushdown config

* use nanos consistently, move logs back to debug level, bit more javadoc

* static terminal result batch

* javadoc for nullability of createMergeFn

* cleanup

* oops

* fix race, add docs

* spelling, remove todo, add unhandled exception log

* cleanup, revert unintended change

* another unintended change

* review stuff

* add ParallelMergeCombiningSequenceBenchmark, fixes

* hyper-threading is the enemy

* fix initial start delay, lol

* parallelism computer now balances partition sizes to partition counts using sqrt of sequence count instead of sequence count by 2

* fix those important style issues with the benchmarks code

* lazy sequence creation for benchmarks

* more benchmark comments

* stable sequence generation time

* update defaults to use 100ms target time, 4096 batch size, 16384 initial yield, also update user docs

* add jmh thread based benchmarks, cleanup some stuff

* oops

* style

* add spread to jmh thread benchmark start range, more comments to benchmarks parameters and purpose

* retool benchmark to allow modeling more typical heterogenous heavy workloads

* spelling

* fix

* refactor benchmarks

* formatting

* docs

* add maxThreadStartDelay parameter to threaded benchmark

* why does catch need to be on its own line but else doesnt
2019-11-07 11:58:46 -08:00
Clint Wylie 3ff5e02237 remove select query (#8739)
* remove select query

* thanks teamcity

* oops

* oops

* add back a SelectQuery class that throws RuntimeExceptions linking to docs

* adjust text

* update docs per review

* deprecated
2019-10-30 19:29:56 -07:00
Gian Merlino 7605c23354 Remove Tranquility configs and certain doc references. (#8793)
Since it hasn't received updates or community interest in a while, it makes sense
to de-emphasize it in the distribution and most documentation (outside of simple
mentions of its existence).
2019-10-30 16:30:16 -07:00
Vadim Ogievetsky cc3650ee3b fix doc headers (#8729) 2019-10-24 11:17:39 -07:00
Jihoon Son f5b9bf5525 Cluster-wide configuration for query vectorization (#8657)
* Cluster-wide configuration for query vectorization

* add doc

* fix build

* fix doc

* rename to QueryConfig and add javadoc

* fix checkstyle

* fix variable names
2019-10-23 21:44:28 +08:00
Abhishek Radhakrishnan 42cfe679f1 Update query result timestamp to match query intervals. (#8717) 2019-10-22 14:39:47 -07:00
Surekha 98f59ddd7e Add `sys.supervisors` table to system tables (#8547)
* Add supervisors table to SystemSchema

* Add docs

* fix checkstyle

* fix test

* fix CI

* Add comments

* Fix javadoc teamcity error

* comments

* fix links in docs

* fix links

* rename fullStatus query param to system and remove it from docs
2019-10-18 15:16:42 -07:00
Jonathan Wei d88075237a
Add initial SQL support for non-expression sketch postaggs (#8487)
* Add initial SQL support for non-expression sketch postaggs

* Checkstyle, spotbugs

* checkstyle

* imports

* Update SQL docs

* Checkstyle

* Fix theta sketch operator docs

* PR comments

* Checkstyle fixes

* Add missing entries for HLL sketch module

* PR comments, add round param to HLL estimate operator, fix optional HLL param
2019-10-18 14:59:44 -07:00
Clint Wylie 8bda3afea4 fix spelling errors triggered by another doc PR (#8653) 2019-10-08 23:43:58 -07:00
Himanshu c078ed40fd
groupBy query: optional limit push down to segment scan (#8426)
* groupBy query: optional limit push down to segment scan

* make segment level limit push down configurable

* fix teamcity errors

* fix segment limit pushdown flag handling on query level config override

* use equals for comparator check

* fix sql and null handling

* fix unused imports

* handle null offset in NullableValueGroupByColumnSelectorStrategy for buffer comparator similar to RowBasedGrouperHelper.NullableRowBasedKeySerdeHelper
2019-10-08 15:35:07 -07:00
Himanshu 9f1f5e115c
doubleMean aggregator to be used at query time (#8459)
* doubleMean aggregator for computing mean

* make docs

* build fixes

* address review comment: handle null args
2019-09-26 08:04:33 -07:00
Vadim Ogievetsky 52f3f2c229 fix docs version interpolation (#8568) 2019-09-22 17:38:55 -07:00
Chi Cao Minh 7dcbaca658 Spellcheck docs (#8548)
* Spellcheck docs

Fix spelling mistakes in docs and add CI job for running spellcheck on
docs.

* Add missing license header
2019-09-17 12:47:30 -07:00
Clint Wylie 3baf31e9a8 add documentation for group by array based result format (#8416) 2019-08-28 08:30:31 -07:00
Jihoon Son 95fa609615 Fix wrong partitionsSpec type names in the document (#8297)
* Fix wrong type names for partitionsSpec

* add unit tests; add json properties for backward compatibility

* beautify conf names

* remove maxRowsPerSegment from hashed partitionsSpec

* fix doc build
2019-08-23 13:44:58 -07:00
Surekha cf2a2dd917
Add group_id to the sys.tasks table (#8304)
* Add group_id to overlord tasks API and sys.tasks table

* adjust test

* modify docs

* Make groupId nullable

* fix integration test

* fix toString

* Remove groupId from TaskInfo

* Modify docs and tests

* modify TaskMonitorTest
2019-08-22 15:28:23 -07:00
Gian Merlino d007477742
Docusaurus build framework + ingestion doc refresh. (#8311)
* Docusaurus build framework + ingestion doc refresh.

* stick to npm instead of yarn

* fix typos

* restore some _bin

* Adjustments.

* detect and fix redirect anchors

* update anchor lint

* Web-console: remove specific column filters (#8343)

* add clear filter

* update tool kit

* remove usless check

* auto run

* add %

* Fix resource leak (#8337)

* Fix resource leak

* Patch comments

* Enable Spotbugs NP_NONNULL_RETURN_VIOLATION (#8234)

* Fixes from PR review.

* Fix more anchors.

* Preamble nix.

* Fix more anchors, headers

* clean up placeholder page

* add to website lint to travis config

* better broken link checking

* travis fix

* Fixed more broken links

* better redirects

* unfancy catch

* fix LGTM error

* link fixes

* fix md issues

* Addl fixes
2019-08-20 21:48:59 -07:00