Commit Graph

113 Commits

Author SHA1 Message Date
Yi Yuan f1e52ab356
add doc (#11531)
Co-authored-by: yuanyi <yuanyi@freewheel.tv>
2021-08-03 12:20:29 +08:00
Victoria Lim 949484728f
docs fix for doubleMean description (#11513)
* fix for doubleMean description

* include quantile aggregator description from Suneet

* update hyperlink to quantiles aggregator
2021-07-30 12:39:44 -07:00
Kashif Faraz 8a4e27f51d
Select broker based on query context parameter `brokerService` (#11495)
This change allows the selection of a specific broker service (or broker tier) by the Router.

The newly added ManualTieredBrokerSelectorStrategy works as follows:

Check for the parameter brokerService in the query context. If this is a valid broker service, use it.
Check if the field defaultManualBrokerService has been set in the strategy. If this is a valid broker service, use it.
Move on to the next strategy
2021-07-27 20:56:05 +05:30
Peter Marshall 973e5bf7d0
Docs - HLL lgK tip and slight layout change (#11482)
* HLL lgK and a tip

Knowledge transfer from https://the-asf.slack.com/archives/CJ8D1JTB8/p1600699967024200.  Attempted to make a connection between the SQL HLL function and the HLL underneath without getting too complicated.  Also added a note about using K over 16 being pretty much pointless.

* Corrected spelling

* Create datasketches-hll.md

Put roll-up back to rollup

* Update docs/development/extensions-core/datasketches-hll.md

Co-authored-by: Abhishek Agarwal <1477457+abhishekagarwal87@users.noreply.github.com>

Co-authored-by: Abhishek Agarwal <1477457+abhishekagarwal87@users.noreply.github.com>
2021-07-26 12:28:53 -07:00
sthetland a366753ba5
Consolidate multi-value dimension doc and highlight configurability (#11428)
* Clarify options for multi-value dims
* Add first example
2021-07-15 10:19:10 -07:00
Clint Wylie 17efa6f556
add single input string expression dimension vector selector and better expression planning (#11213)
* add single input string expression dimension vector selector and better expression planning

* better

* fixes

* oops

* rework how vector processor factories choose string processors, fix to be less aggressive about vectorizing

* oops

* javadocs, renaming

* more javadocs

* benchmarks

* use string expression vector processor with vector size 1 instead of expr.eval

* better logging

* javadocs, surprising number of the the

* more

* simplify
2021-07-06 11:20:49 -07:00
frank chen 906a704c55
Eliminate ambiguities of KB/MB/GB in the doc (#11333)
* GB ---> GiB

* suppress spelling check

* MB --> MiB, KB --> KiB

* Use IEC binary prefix

* Add reference link

* Fix doc style
2021-06-30 13:42:45 -07:00
Clint Wylie df9b57aa1a
bitwise aggregators, better null handling options for expression agg (#11280)
* bitwise aggregators, better nulls for expression agg

* correct behavior

* rework deserialize, better names

* fix json, share mask
2021-06-25 16:51:16 -07:00
Clint Wylie bfbd7ec432
fix a bugs related to SQL type inference return type nullability (#11327)
* fix a bunch of type inference nullability bugs

* fixes

* style

* fix test

* fix concat
2021-06-15 12:26:59 -07:00
Charles Smith a1ed3a407d
clarify bySegment is native only (#11331) 2021-06-11 13:48:17 -07:00
Clint Wylie 3649c608d2
array handling improvements (#11233)
* fix jdbc array handling, split handling for some array and multi value operator, split and add more tests

* formatting
2021-05-13 18:50:32 -07:00
Clint Wylie 691d7a1d54
SQL timeseries no longer skip empty buckets with all granularity (#11188)
* SQL timeseries no longer skip empty buckets with all granularity

* add comment, fix tests

* the ol switcheroo

* revert unintended change

* docs and more tests

* style

* make checkstyle happy

* docs fixes and more tests

* add docs, tests for array_agg

* fixes

* oops

* doc stuffs

* fix compile, match doc style
2021-05-10 10:13:37 -07:00
benkrug 49c8307b72
Update datasource.md (#10864)
* Update datasource.md

Change "table" to "datasource" in join discussion: This means that all datasources
other than the leftmost "base" table must fit in memory.

According to docs on datasources, "datasource" is the more general term, and a table is a kind of datasource.  In the context here, then, "datasource" is applicable.

* left-hand table -> left-hand datasource

Co-authored-by: Charles Smith <38529548+techdocsmith@users.noreply.github.com>

Co-authored-by: sthetland <steve.hetland@imply.io>
Co-authored-by: Charles Smith <38529548+techdocsmith@users.noreply.github.com>
2021-05-07 01:14:45 -07:00
Lasse Krogh Mammen 9be2a5cdc2
Add documentation re alphabetical sorted of MV dimensions (#10695) 2021-05-07 01:12:32 -07:00
Clint Wylie 554f1ffeee
ARRAY_AGG sql aggregator function (#11157)
* ARRAY_AGG sql aggregator function

* add javadoc

* spelling

* review stuff, return null instead of empty when nil input

* review stuff

* Update sql.md

* use type inference for finalize, refactor some things
2021-05-03 22:17:10 -07:00
imply-jbalik 6f7701e742
fixed array syntax (#11191) 2021-05-03 21:38:16 -07:00
Gian Merlino cb7c6ac314
Doc updates for union datasources. (#11103)
The main one is updating datasources.md to talk about SQL. (It still said
that table unions are not supported in SQL.) Also, this doc update adds
some clarifying details on limitations.
2021-04-14 18:18:14 -07:00
sthetland dd4c5f2a17
Update using-caching.md (#11069) 2021-04-08 16:48:26 -05:00
sthetland fb6751fa45
Fix old broken link (#11048)
* link check fixes

* updated link target

* Update aggregations.md

* spelling error
2021-04-07 20:40:50 -07:00
Cameron Teasdale 786207995e
add minimal documentation for expression filters (#11045)
* add minimal documentation for expression filters

* Update docs/querying/filters.md

Co-authored-by: Clint Wylie <cjwylie@gmail.com>

* Update docs/querying/filters.md

Co-authored-by: sthetland <steve.hetland@imply.io>

* Update docs/querying/filters.md

Co-authored-by: Alejandro Lujan <andanthor@gmail.com>

* Update docs/querying/filters.md

Co-authored-by: Alejandro Lujan <andanthor@gmail.com>

Co-authored-by: Clint Wylie <cjwylie@gmail.com>
Co-authored-by: sthetland <steve.hetland@imply.io>
Co-authored-by: Alejandro Lujan <andanthor@gmail.com>
2021-04-07 16:58:28 -07:00
Abhishek Agarwal 0df0bff44b
Enable multiple distinct aggregators in same query (#11014)
* Enable multiple distinct count

* Add more tests

* fix sql test

* docs fix

* Address nits
2021-04-07 00:52:19 -07:00
Lasse Krogh Mammen 782a1d4e6c
Add Calcite Avatica protobuf handler (#10543) 2021-03-31 12:46:25 -07:00
benkrug 7f96ca8f5e
Update topnquery.md (#10944)
minor edits of the English, no meanings changed (imo)
2021-03-09 15:19:02 -08:00
Abhishek Agarwal c66951a59e
Add flag in SQL to disable left base filter optimization for joins (#10947)
* Add flag to disable left base filter

* code coverage

* Draft

* Review comments

* code coverage

* add docs

* Add old tests
2021-03-09 13:07:34 -08:00
Charles Smith 0f81ce32a0
refactor query caching docs (#10848)
* refactor query caching

* Update docs/querying/using-caching.md

Co-authored-by: sthetland <steve.hetland@imply.io>

* Update docs/querying/using-caching.md

Co-authored-by: sthetland <steve.hetland@imply.io>

* Update docs/querying/using-caching.md

Co-authored-by: sthetland <steve.hetland@imply.io>

* Update docs/querying/using-caching.md

Co-authored-by: sthetland <steve.hetland@imply.io>

* Update docs/querying/using-caching.md

Co-authored-by: sthetland <steve.hetland@imply.io>

* Update docs/querying/using-caching.md

Co-authored-by: sthetland <steve.hetland@imply.io>

* Update docs/querying/using-caching.md

Co-authored-by: sthetland <steve.hetland@imply.io>

* Update docs/querying/using-caching.md

Co-authored-by: sthetland <steve.hetland@imply.io>

* add description for context link

* accept suggestions

* reword, rework some awkward language

* incorporate feedback, fix errors

* add back perf considerations

* Apply suggestions from code review

applying @suneet-s 's changes

Co-authored-by: Suneet Saldanha <suneet@apache.org>

* Update caching.md

fix link

Co-authored-by: sthetland <steve.hetland@imply.io>
Co-authored-by: Suneet Saldanha <suneet@apache.org>
2021-03-08 22:25:48 -08:00
Atul Mohan be2ac8d6ce
Document type inference issues with dynamic params in SQL (#10801)
* Clarify docs

* Apply suggestions from code review

Co-authored-by: Charles Smith <38529548+techdocsmith@users.noreply.github.com>

Co-authored-by: Charles Smith <38529548+techdocsmith@users.noreply.github.com>
2021-03-04 03:48:11 -08:00
Clint Wylie cbbef80c7f
add SQL operators for bitwise expressions (#10823)
* add SQL operators for bitwise expressions

* more test

* fix spelling

* more tests
2021-02-18 20:56:33 -08:00
Jihoon Son ac41e41232
Update doc for query errors and add unit tests for JsonParserIterator (#10833)
* Update doc for query errors and add unit tests for JsonParserIterator

* static constructor for convenience

* rename method
2021-02-05 02:55:32 -08:00
Makdon f9fc1892d1
Typo: missing comma in json (#10711) 2021-01-06 13:49:50 -08:00
Clint Wylie da0eabaa01
integration test for coordinator and overlord leadership client (#10680)
* integration test for coordinator and overlord leadership, added sys.servers is_leader column

* docs

* remove not needed

* fix comments

* fix compile heh

* oof

* revert unintended

* fix tests, split out docker-compose file selection from starting cluster, use docker-compose down to stop cluster

* fixes

* style

* dang

* heh

* scripts are hard

* fix spelling

* fix thing that must not matter since was already wrong ip, log when test fails

* needs more heap

* fix merge

* less aggro
2020-12-17 22:50:12 -08:00
sthetland 6ae8059c09
cleaning up and fixing links (#10528)
* cleaning up and fixing links

* reverting local link

* Update indexer.md

* link checking

* Fixing one more stale link for PostgreSQL
2020-12-17 13:37:43 -08:00
Abhishek Agarwal 4ea1ab8531
Fix links in the grouping function doc (#10654) 2020-12-09 14:56:32 +08:00
Abhishek Agarwal 26d74b3580
Add grouping_id function (#10518)
* First draft of grouping_id function

* Add more tests and documentation

* Add calcite tests

* Fix travis failures

* bit of a change

* Add documentation

* Fix typos

* typo fix
2020-12-07 11:46:29 -08:00
frank chen 24f1e35b5d
fix desc of 'required' for granularity property (#10616) 2020-12-01 18:29:51 -08:00
sthetland ba915b7f56
Security overview documentation (#10339)
* initial file

* initial file

* security overview added

* ldap added

* spacing adjustments

* nits

* security graphics and doc review

* Update docs/operations/security-overview.md

Co-authored-by: Jonathan Wei <jon-wei@users.noreply.github.com>

* Update docs/operations/security-user-auth.md

Co-authored-by: Jonathan Wei <jon-wei@users.noreply.github.com>

* Update docs/operations/security-overview.md

Co-authored-by: Jonathan Wei <jon-wei@users.noreply.github.com>

* Update docs/operations/security-overview.md

Co-authored-by: Jonathan Wei <jon-wei@users.noreply.github.com>

* updates frm review

* review comments

* finish up review and light edits

* broken links

* spell check

Co-authored-by: Jonathan Wei <jon-wei@users.noreply.github.com>
2020-11-19 15:24:58 -08:00
michaelschiff 2f4d6da33f
Updates segment metadata query documentation (#10589)
* updates segment metadata query documentation to be clearer about cardinality estimation

* typo in documentation
2020-11-20 00:08:27 +05:30
Atul Mohan 21e3c4b39c
Add missing docs for timeout exceptions (#10554)
* Add missing docs for timeout exceptions

* Add info on auth failures
2020-11-13 08:45:40 -06:00
Gian Merlino 3436297354
Clarify how ORDER BY works with UNION ALL (#10561)
Hopefully a bit clearer.
2020-11-05 20:12:03 -08:00
Abhishek Agarwal 04546b65ec
Additional documentation for query caching (#10503)
* Add documentation for when caching is unsupported

* Minor changes

* Minor doc fix

* Review comments

* Add more details

* Fix spelling check

* Fix doc for union query

* Trailing dot
2020-10-20 13:49:13 -07:00
Maytas Monsereenusorn 3538abd5d0
Make sure all fields in sys.segments are JSON-serialized (#10481)
* fix JSON format

* Change all columns in sys segments to be JSON

* Change all columns in sys segments to be JSON

* add tests

* fix failing tests

* fix failing tests
2020-10-14 13:49:46 -07:00
Clint Wylie 1d6cb624f4
add vectorizeVirtualColumns query context parameter (#10432)
* add vectorizeVirtualColumns query context parameter

* oops

* spelling

* default to false, more docs

* fix test

* fix spelling
2020-09-28 18:48:34 -07:00
Jihoon Son 0cc9eb4903
Store hash partition function in dataSegment and allow segment pruning only when hash partition function is provided (#10288)
* Store hash partition function in dataSegment and allow segment pruning only when hash partition function is provided

* query context

* fix tests; add more test

* javadoc

* docs and more tests

* remove default and hadoop tests

* consistent name and fix javadoc

* spelling and field name

* default function for partitionsSpec

* other comments

* address comments

* fix tests and spelling

* test

* doc
2020-09-24 16:32:56 -07:00
Clint Wylie dad69481f0
add light weight version of /druid/coordinator/v1/lookups/nodeStatus (#10422)
* add light weight version /druid/coordinator/v1/lookups/nodeStatus

* review stuffs
2020-09-24 14:36:53 +08:00
Maytas Monsereenusorn 72f1b55f56
Add last_compaction_state to sys.segments table (#10413)
* Add is_compacted to sys.segments table

* change is_compacted to last_compaction_state

* fix tests

* fix tests

* address comments
2020-09-23 15:29:36 -07:00
sthetland ae247b6e63
Document change in results of groupBy queries with subtotalsSpec (#10405)
* subtotalsSpec results with null values

Document the format change in results of a groupBy query with a subtotalsSpec. This update applies to 0.18 and later.

* Review catches
2020-09-19 10:51:23 -07:00
Suneet Saldanha f71ba6f2c2
Vectorized ANY aggregators (#10338)
* WIP vectorized ANY aggregators

* tests

* fix aggs

* cleanup

* code review + tests

* docs

* use NilVectorSelector when needed

* fix spellcheck

* dont instantiate vectors

* cleanup
2020-09-14 19:44:58 -07:00
Abhishek Agarwal f5e2645bbb
Support SearchQueryDimFilter in sql via new methods (#10350)
* Support SearchQueryDimFilter in sql via new methods

* Contains is a reserved word

* revert unnecessary change

* Fix toDruidExpression method

* rename methods

* java docs

* Add native functions

* revert change in dockerfile

* remove changes from dockerfile

* More tests

* travis fix

* Handle null values better
2020-09-14 09:57:54 -07:00
Abhishek Agarwal a5c46dc84b
Add vectorization for druid-histogram extension (#10304)
* First draft

* Remove redundant code from FixedBucketsHistogramAggregator classes

* Add test cases for new classes

* Fix tests in sql compatible mode

* Typo fix

* Fix comment

* Add spelling

* Vectorize only for supported types

* Rename internal aggregator files

* Fix tests
2020-09-09 13:56:33 -07:00
Gian Merlino 5cd7610fb6
SQL support for union datasources. (#10324)
* SQL support for union datasources.

Exposed via the "UNION ALL" operator. This means that there are now two
different implementations of UNION ALL: one at the top level of a query
that works by concatenating subquery results, and one at the table level
that works by creating a UnionDataSource.

The SQL documentation is updated to discuss these two use cases and how
they behave.

Future work could unify these by building support for a native datasource
that represents the union of multiple subqueries. (Today, UnionDataSource
can only represent the union of tables, not subqueries.)

* Fixes.

* Error message for sanity check.

* Additional test fixes.

* Add some error messages.
2020-08-28 07:57:06 -07:00
Gian Merlino 21703d81ac
Fix handling of 'join' on top of 'union' datasources. (#10318)
* Fix handling of 'join' on top of 'union' datasources.

The problem is that unions are typically rewritten into a series of
individual queries on the underlying tables, but this isn't done when
the union is wrapped in a join.

The main changes are in UnionQueryRunner:

1) Replace an instanceof UnionQueryRunner check with DataSourceAnalysis.
2) Replace a "query.withDataSource" call with a new function, "Queries.withBaseDataSource".

Together, these enable UnionQueryRunner to "see through" a join.

* Tests.

* Adjust heap sizes for integration tests.

* Different approach, more tests.

* Tweak.

* Styling.
2020-08-26 14:23:54 -07:00