Commit Graph

233 Commits

Author SHA1 Message Date
Andrés Gómez e270362767 Add stringLast and stringFirst aggregators extension (#5789)
* Add lastString and firstString aggregators extension

* Remove duplicated class

* Move first-last-string doc page to extensions-contrib

* Fix ObjectStrategy compare method

* Fix doc bad aggregatos type name

* Create FoldingAggregatorFactory classes to fix SegmentMetadataQuery

* Add getMaxStringBytes() method to support JSON serialization

* Fix null pointer exception at segment creation phase when the string value is null

* Control the valueSelector object class on BufferAggregators

* Perform all improvements

* Add java doc on SerializablePairLongStringSerde

* Refactor ObjectStraty compare method

* Remove unused ;

* Add aggregateCombiner unit tests. Rename BufferAggregators unit tests

* Remove unused imports

* Add license header

* Add class name to java doc class serde

* Throw exception if value is unsupported class type

* Move first-last-string extension into druid core

* Update druid core docs

* Fix null pointer exception when pair->string is null

* Add null control unit tests

* Remove unused imports

* Add first/last string folding aggregator on AggregatorsModule to support segment metadata query

* Change SerializablePairLongString to extend SerializablePair

* Change vars from public to private

* Convert vars to primitive type

* Clarify compare comment

* Change IllegalStateException to ISE

* Remove TODO comments

* Control possible null pointer exception

* Add @Nullable annotation

* Remove empty line

* Remove unused parameter type

* Improve AggregatorCombiner javadocs

* Add filterNullValues option at StringLast and StringFirst aggregators

* Add filterNullValues option at agg documentation

* Fix checkstyle

* Update header license

* Fix StringFirstAggregatorFactory.VALUE_COMPARATOR

* Fix StringFirstAggregatorCombiner

* Fix if condition at StringFirstAggregateCombiner

* Remove filterNullValues from string first/last aggregators

* Add isReset flag in FirstAggregatorCombiner

* Change Arrays.asList to Collections.singletonList
2018-08-01 10:52:54 -07:00
Jonathan Wei efab3b0160 Add concat and textcat SQL functions (#6005) 2018-07-20 11:21:04 -07:00
Gian Merlino cd8ea3da8d
SQL: Add server-wide default time zone config. (#5993)
* SQL: Add server-wide default time zone config.

* Switch API.
2018-07-18 13:12:40 -07:00
scrawfor bf2a31a5bc Add new 'true' filter which always returns true. (#5711)
* Add new 'true' filter which always returns true.

* Add support for bitmap index.

* Adds documentation.

* Removes No-op Filter
2018-06-28 11:52:45 -07:00
Caroline1000 c73e3ea4f5 Provide examples to havingSpec filters (#5774)
* expand examples

* expand examples for filtered havingSpecs

* expand other having examples

* remove blank code block

* add better AND/OR/NOT examples

* fix indentation
2018-05-14 13:43:42 -07:00
Jakub Kukul e2431ae161 Update defaultHadoopCoordinates in documentation. (#5720)
* Update defaultHadoopCoordinates in documentation.

To match changes applied in #5382.

* Remove a parameter with defaults from example configuration file.

If it has reasonable defaults, then why would it be in an example config file?

Also, it is yet another place that has been forgotten to be updated and will be forgotten in the future.

Also, if someone is running different hadoop version, then there's much more work to be done than just changing this property, so why give users false hopes?

* Fix typo in documentation.
2018-04-30 20:49:14 -07:00
Jihoon Son 86746f82d8 Use mergeBuffer instead of processingBuffer in parallelCombiner (#5634)
* Use mergeBuffer instead of processingBuffer in parallelCombiner

* Fix test

* address comments

* fix test

* Fix test

* Update comment

* address comments

* fix build

* Fix test failure
2018-04-27 18:14:37 -07:00
Gian Merlino f81855d607
Add unauthorized errorCode to query docs. (#5691) 2018-04-26 13:06:25 -07:00
scrawfor 15f4ab2b31 Expose noop filter to users (#5597) 2018-04-18 07:57:07 -07:00
Gian Merlino fbf3fc178e Timeseries: Add "grandTotal" option. (#5640)
* Timeseries: Add "grandTotal" option.

* Modify whitespace.

* Checkstyle workaround.
2018-04-16 18:22:19 -07:00
Caroline1000 afa75e04b7 change header in overlord console; minor querydoc change (#5625)
* change header in overlord console; minor querydoc change

* remove change to overlord console

* address Gian comments
2018-04-11 12:57:22 -07:00
Arup Malakar 0c4598c1fe Fix typo in avatica java client code documenation (#5553) 2018-03-29 16:36:40 -05:00
Dyana Rose db508cf3ca [docs] fix invalid example json (#5547)
https://github.com/druid-io/druid/issues/5546
2018-03-28 13:53:38 -07:00
Clint Wylie 50e0e7f97d Correct lookup documentation (#5537)
fixes #5536
2018-03-26 17:01:02 -07:00
Atul Mohan ec17a44e09 Add result level caching to Brokers (#5028)
* Add result level caching to Brokers

* Minor doc changes

* Simplify sequences

*  Move etag execution

* Modify cacheLimit criteria

* Fix incorrect etag computation

* Fix docs

* Add separate query runner for result level caching

* Update docs

* Add post aggregated results to result level cache

* Fix indents

* Check byte size for exceeding cache limit

* Fix indents

* Fix indents

* Add flag for result caching

* Remove logs

* Make cache object generation synchronous

* Avoid saving intermediate cache results to list

* Fix changes that handle etag based response

* Release bytestream after use

*  Address PR comments

*  Discard resultcache stream after use

* Fix docs

* Address comments

* Add comment about fluent workflow issue
2018-03-23 19:11:52 -07:00
Gian Merlino 0851f2206c
Expanded documentation for DataSketches aggregators. (#5513)
Originally written by @AlexanderSaydakov in druid-io/druid-io.github.io#448.
I also added redirects and updated links to point to the new
datasketches-extension.html landing page for the extension, rather than to
the old page about theta sketches.
2018-03-21 18:19:27 -07:00
Gian Merlino 7416d1d02d Add "joda" option to timeFormat extractionFn. (#5448) 2018-03-02 19:59:26 -08:00
Gian Merlino f3796bc81b SQL: Lower default JDBC frame size. (#5409)
The previous default of 100,000 was a bit excessive and could easily
lead to OOM errors on "select *" style queries.
2018-02-21 10:00:48 -08:00
Stephanie Rivera 77bb2f9c9f Update post-aggregations.md (#5237)
* Update post-aggregations.md

I think this is more clear. I am not sure how multiplying by 100 is involved in averaging...

* Update post-aggregations.md

adding additional aggregator

* Update post-aggregations.md
2018-02-06 16:26:39 -08:00
Gian Merlino ed47a1e1a9
Lookups: Inherit "injective" from registered lookups, improve docs. (#5316)
Code changes:
- In the lookup-based extractionFns, inherit injective property from
  the lookup itself if not specified.

Doc changes:
- Add a "Query execution" section to the lookups doc explaining how
  injective lookups and their optimizations work.
- Remove scary warnings against using registeredLookup extractionFns.
  They are necessary and important since they work with filters and
  function cascades -- two things that the dimension specs do not do.
  They deserve to be first class citizens.
- Move the "registeredLookup" fn above the "lookup" fn. It's probably
  more commonly used, so the docs read better this way.
2018-02-01 18:30:19 -08:00
Gian Merlino 53e3c7d1b2 SQL: Add additional unsupported features to the docs. (#5290) 2018-01-24 11:27:47 -08:00
Jihoon Son 972b4d189a Fix topN doc (#5240) 2018-01-09 20:10:13 -08:00
Chuanlei Ni 368d03146b assign granularity.all to SelectQuery by default (#5091) 2017-11-21 17:10:19 -08:00
Daniel 22c49b0d33 docs: fix broken link to broker configuration (#5105) 2017-11-21 13:32:00 +09:00
Gian Merlino 486159ba8c SQL: Add TIMESTAMPADD. (#5079) 2017-11-16 12:00:34 -08:00
Gian Merlino 4fd4444b42 SQL: Add "array" result format, and document result formats. (#5032)
* SQL: Add "array" result format, and document result formats.

* Code style.
2017-11-13 20:24:06 -08:00
Himanshu bbb678efd7 fix lookups endpoint collisions (#5058)
* fix lookups endpoint collissions

* fix errors
2017-11-09 17:39:53 -08:00
Roman Leventov a8dc056c09
Add retries for coordinator fetch and lookup start in LookupReferencesManager (#5029)
* Add retries for coordinator fetch and lookup start in LookupReferencesManager

* Fix LookupConfigTest

* Address comments

* Address more comments

* And address more comments

* Address comms

* Recognize 'not found' lookups in LookupReferencesManager.tryGetLookupListFromCoordinator(), by @egor-ryashin
2017-11-09 02:30:36 -03:00
Jonathan Wei 6840eabd87
Add Router connection balancers for Avatica queries (#4983)
* Add Router connection balancers for Avatica queries

* PR comments

* Adjust test bounds

* PR comments

* Add doc comments

* PR comments

* PR comment

* Checkstyle fix
2017-11-01 14:01:13 -07:00
Gian Merlino d5e83f9d50 Fix docs for MOD. (#4971) 2017-10-18 16:43:28 -07:00
Jihoon Son 52d7f74226 Add streaming aggregation as the last step of ConcurrentGrouper if data are spilled (#4704)
* Add steaming grouper

* Fix doc

* Use a single dictionary while combining

* Revert GroupByBenchmark

* Removed unused code

* More cleanup

* Remove unused config

* Fix some typos and bugs

* Refactor Groupers.mergeIterators()

* Add comments for combining tree

* Refactor buildCombineTree

* Refactor iterator

* Add ParallelCombiner

* Add ParallelCombinerTest

* Handle InterruptedException

* use AbstractPrioritizedCallable

* Address comments

* [maven-release-plugin] prepare release druid-0.11.0-sg

* [maven-release-plugin] prepare for next development iteration

* Address comments

* Revert "[maven-release-plugin] prepare for next development iteration"

This reverts commit 5c6b31e488.

* Revert "[maven-release-plugin] prepare release druid-0.11.0-sg"

This reverts commit 0f5c3a8b82.

* Fix build failure

* Change list to array

* rename sortableIds

* Address comments

* change to foreach loop

* Fix comment

* Revert keyEquals()

* Remove loop

* Address comments

* Fix build fail

* Address comments

* Remove unused imports

* Fix method name

* Split intermediate and leaf combine degrees

* Add comments to StreamingMergeSortedGrouper

* Add more comments and fix overflow

* Address comments

* ConcurrentGrouperTest cleanup

* add thread number configuration for parallel combining

* improve doc

* address comments

* fix build
2017-10-17 23:24:08 -07:00
Gian Merlino f51f346e36 SQL: Fix POWER doc, add test. (#4953) 2017-10-13 14:38:15 -07:00
Gian Merlino 5cfc7f9ef7 Fix formatting of SQL TRIM docs. (#4951) 2017-10-13 14:38:06 -07:00
Atul Mohan c07678b143 Synchronization of lookups during startup of druid processes (#4758)
* Changes for lookup synchronization

* Refactor of Lookup classes

* Minor refactors and doc update

* Change coordinator instance to be retrieved by DruidLeaderClient

* Wait before thread shutdown

* Make disablelookups flag true by default

* Update docs

* Rename flag

* Move executorservice shutdown to finally block

* Update LookupConfig

* Refactoring and doc changes

* Remove lookup config constructor

* Revert Lookupconfig constructor changes

* Add tests to LookupConfig

* Make executorservice local

* Update LRM

* Move ListeningScheduledExecutorService to ExecutorCompletionService

* Move exception to outer block

* Remove check to see future is done

* Remove unnecessary assignment

* Add logging
2017-10-12 21:22:24 -05:00
Gian Merlino b20e3038b6 SQL: Upgrade to Calcite 1.14.0, some refactoring of internals. (#4889)
* SQL: Upgrade to Calcite 1.14.0, some refactoring of internals.

This brings benefits:
- Ability to do GROUP BY and ORDER BY with ordinals.
- Ability to support IN filters beyond 19 elements (fixes #4203).

Some refactoring of druid-sql internals:
- Builtin aggregators and operators are implemented as SqlAggregators
  and SqlOperatorConversions rather being special cases. This simplifies
  the Expressions and GroupByRules code, which were becoming complex.
- SqlAggregator implementations are no longer responsible for filtering.

Added new functions:
- Expressions: strpos.
- SQL: TRUNCATE, TRUNC, LENGTH, CHAR_LENGTH, STRLEN, STRPOS, SUBSTR,
  and DATE_TRUNC.

* Add missing @Override annotation.

* Adjustments for forbidden APIs.

* Adjustments for forbidden APIs.

* Disable GROUP BY alias.

* Doc reword.
2017-10-10 12:44:05 -07:00
Gian Merlino 4e1d0f49d8 Docs: Fix link to broker configuration. (#4934) 2017-10-10 11:18:46 -07:00
Gian Merlino 2ce8123bdb Move scan-query from a contrib extension into core. (#4751)
* Move scan-query from a contrib extension into core.

Based on a proposal at: https://groups.google.com/d/topic/druid-development/ME_OatUDnbk/discussion

This patch also adds support for virtual columns to the Scan query,
and updates Druid SQL to use Scan instead of Select.

This patch also makes some behavioral changes to handling of the __time
column. In particular, it is now is returned as "__time" rather than
"timestamp"; it is no longer included if you do not specifically ask for
it in your "columns"; and it is returned as a long rather than a string.

Users can revert time handling to the legacy extension behavior by
setting "legacy" : true in their queries, or setting the property
druid.query.scan.legacy = true. This is meant to provide a migration
path for users that were formerly using the contrib extension.

* Adjustments from review.

* Add back Select query.

* Adjust SQL docs.

* Restore SelectQuery link.
2017-09-13 09:51:24 -07:00
Gian Merlino 4909c48b0c SQL: Full TRIM support. (#4750)
* SQL: Full TRIM support.

- Support trimming arbitrary characters
- Support BOTH, LEADING, and TRAILING

* Remove unused import.

* Fix tests, add RTRIM / LTRIM.

* Remove unused imports.

* BTRIM and docs.

* Replace for with foreach.
2017-09-12 11:49:08 -07:00
Gian Merlino 9078925cab Docs for finalizingFieldAccess post-aggregator. (#4737) 2017-08-31 11:45:49 -07:00
Gian Merlino daf3c5f927 Add "round" option to cardinality and hyperUnique aggregators. (#4720)
* Add "round" option to cardinality and hyperUnique aggregators.

Also turn it on by default in SQL, to make math on distinct counts
work more as expected.

* Fix some compile errors.

* Fix test.

* Formatting.
2017-08-28 14:52:11 -07:00
Jihoon Son f3f2cd35e1 Array-based aggregation for groupBy query (#4576)
* Array-based aggregation

* Fix handling missing grouping key

* Handle invalid offset

* Fix compilation

* Add cardinality check

* Fix cardinality check

* Address comments

* Address comments

* Address comments

* Address comments

* Cleanup GroupByQueryEngineV2.process

* Change to Byte.SIZE

* Add flatMap
2017-08-03 20:04:54 +03:00
Gian Merlino d4ef0f6d94 Improved SQL support for floats and doubles. (#4598)
* Improved SQL support for floats and doubles.

- Use Druid FLOAT for SQL FLOAT, and Druid DOUBLE for SQL DOUBLE, REAL,
  and DECIMAL.
- Use float* aggregators when appropriate.
- Add tests involving both float and double columns.
- Adjust documentation accordingly.

* CR comments.

* Fix braces.
2017-07-25 13:54:44 -07:00
Slim 71e7a4c054 Adding double colums supports (#4491)
* add double columns support

* Fix numbers and expected results in UTs

* adding float aggregators

* fix IT expected test results

* fix comments

* more fixes

* fix comp

* fix test

* refactor double and float aggregator factories

* fix

* fix UTs

* fix comments

* clean unused code

* fix more comments

* undo unnecessary changes

* fix null issue

* refactor TopNColumnSelectorStrategyFactory

* fix docs

* refactor NumericTopNColumnSelectorStrategy

* fix return

* fix comments

* handle the null case in DimesionIndexer

* more null fixing

* cosmetic changes
2017-07-20 10:14:14 +03:00
Gian Merlino 16817e408d SQL + Expressions = Best friends forever. (#4360)
* SQL + Expressions = Best friends forever.

- Use expressions as a projection layer for anything that can't be
  expressed using traditional Druid extractionFns. Sometimes they're
  embedded directly (like "expression" filters, builtin aggregators,
  or "expression" post-aggregators). Sometimes they're referenced
  through virtual columns (like dimensionSpecs, which can't innately
  reference functions of more than one column without the virtual
  column layer).
- Add many new functions and operators, taking advantage of the
  expression capability (see the querying/sql.md doc).
- Improve consistency of constant reduction and of casting by
  using Druid expressions for this instead of Calcite's RexExecutor.

* Fix casting bug, and other code review comments.

* Fix docs.
2017-07-07 08:48:26 -07:00
jeffhartley 3e7f7720a1 update aggregations.md re: rollup (#4455)
noted that rollup could be on or off
2017-06-23 14:28:59 -07:00
Jonathan Wei 3b70995bb3 Configurable row limit for JDBC frames (#4417) 2017-06-16 17:07:40 -07:00
Amar Ramachandran fc80df339e Fix incorrect name (#4386) 2017-06-09 13:32:17 -04:00
kaijianding 551a89bd67 serialize DateTime As Long to improve json serde performance (#4038) 2017-06-06 10:08:51 -07:00
Jonathan Wei b90c28e861 Support limit push down for GroupBy (#3873)
* Support limit push down for GroupBy V2

* Use orderBy spec ordering when applying limit push down

* PR Comments

* Remove unused var

* Checkstyle fixes

* Fix test

* Add comment on non-final variables, fix checkstyle

* Address PR comments

* PR comments

* Remove unnecessary buffer reset

* Fix missing @JsonProperty annotation
2017-06-02 15:39:04 -07:00
fanjieqi 2e933e1413 fix a bug in select-query.md which the property_form lack of the『granularity』 (#4327)
There result would be {"error"=>"Unknown exception",
"errorMessage"=>nil, "errorClass"=>"java.lang.NullPointerException",
"host"=>nil} when the json lack of 『granularity』.
2017-05-30 17:04:39 -07:00