* Add lastString and firstString aggregators extension
* Remove duplicated class
* Move first-last-string doc page to extensions-contrib
* Fix ObjectStrategy compare method
* Fix doc bad aggregatos type name
* Create FoldingAggregatorFactory classes to fix SegmentMetadataQuery
* Add getMaxStringBytes() method to support JSON serialization
* Fix null pointer exception at segment creation phase when the string value is null
* Control the valueSelector object class on BufferAggregators
* Perform all improvements
* Add java doc on SerializablePairLongStringSerde
* Refactor ObjectStraty compare method
* Remove unused ;
* Add aggregateCombiner unit tests. Rename BufferAggregators unit tests
* Remove unused imports
* Add license header
* Add class name to java doc class serde
* Throw exception if value is unsupported class type
* Move first-last-string extension into druid core
* Update druid core docs
* Fix null pointer exception when pair->string is null
* Add null control unit tests
* Remove unused imports
* Add first/last string folding aggregator on AggregatorsModule to support segment metadata query
* Change SerializablePairLongString to extend SerializablePair
* Change vars from public to private
* Convert vars to primitive type
* Clarify compare comment
* Change IllegalStateException to ISE
* Remove TODO comments
* Control possible null pointer exception
* Add @Nullable annotation
* Remove empty line
* Remove unused parameter type
* Improve AggregatorCombiner javadocs
* Add filterNullValues option at StringLast and StringFirst aggregators
* Add filterNullValues option at agg documentation
* Fix checkstyle
* Update header license
* Fix StringFirstAggregatorFactory.VALUE_COMPARATOR
* Fix StringFirstAggregatorCombiner
* Fix if condition at StringFirstAggregateCombiner
* Remove filterNullValues from string first/last aggregators
* Add isReset flag in FirstAggregatorCombiner
* Change Arrays.asList to Collections.singletonList
* Update defaultHadoopCoordinates in documentation.
To match changes applied in #5382.
* Remove a parameter with defaults from example configuration file.
If it has reasonable defaults, then why would it be in an example config file?
Also, it is yet another place that has been forgotten to be updated and will be forgotten in the future.
Also, if someone is running different hadoop version, then there's much more work to be done than just changing this property, so why give users false hopes?
* Fix typo in documentation.
* Use mergeBuffer instead of processingBuffer in parallelCombiner
* Fix test
* address comments
* fix test
* Fix test
* Update comment
* address comments
* fix build
* Fix test failure
Originally written by @AlexanderSaydakov in druid-io/druid-io.github.io#448.
I also added redirects and updated links to point to the new
datasketches-extension.html landing page for the extension, rather than to
the old page about theta sketches.
* Update post-aggregations.md
I think this is more clear. I am not sure how multiplying by 100 is involved in averaging...
* Update post-aggregations.md
adding additional aggregator
* Update post-aggregations.md
Code changes:
- In the lookup-based extractionFns, inherit injective property from
the lookup itself if not specified.
Doc changes:
- Add a "Query execution" section to the lookups doc explaining how
injective lookups and their optimizations work.
- Remove scary warnings against using registeredLookup extractionFns.
They are necessary and important since they work with filters and
function cascades -- two things that the dimension specs do not do.
They deserve to be first class citizens.
- Move the "registeredLookup" fn above the "lookup" fn. It's probably
more commonly used, so the docs read better this way.
* Add retries for coordinator fetch and lookup start in LookupReferencesManager
* Fix LookupConfigTest
* Address comments
* Address more comments
* And address more comments
* Address comms
* Recognize 'not found' lookups in LookupReferencesManager.tryGetLookupListFromCoordinator(), by @egor-ryashin
* Changes for lookup synchronization
* Refactor of Lookup classes
* Minor refactors and doc update
* Change coordinator instance to be retrieved by DruidLeaderClient
* Wait before thread shutdown
* Make disablelookups flag true by default
* Update docs
* Rename flag
* Move executorservice shutdown to finally block
* Update LookupConfig
* Refactoring and doc changes
* Remove lookup config constructor
* Revert Lookupconfig constructor changes
* Add tests to LookupConfig
* Make executorservice local
* Update LRM
* Move ListeningScheduledExecutorService to ExecutorCompletionService
* Move exception to outer block
* Remove check to see future is done
* Remove unnecessary assignment
* Add logging
* SQL: Upgrade to Calcite 1.14.0, some refactoring of internals.
This brings benefits:
- Ability to do GROUP BY and ORDER BY with ordinals.
- Ability to support IN filters beyond 19 elements (fixes#4203).
Some refactoring of druid-sql internals:
- Builtin aggregators and operators are implemented as SqlAggregators
and SqlOperatorConversions rather being special cases. This simplifies
the Expressions and GroupByRules code, which were becoming complex.
- SqlAggregator implementations are no longer responsible for filtering.
Added new functions:
- Expressions: strpos.
- SQL: TRUNCATE, TRUNC, LENGTH, CHAR_LENGTH, STRLEN, STRPOS, SUBSTR,
and DATE_TRUNC.
* Add missing @Override annotation.
* Adjustments for forbidden APIs.
* Adjustments for forbidden APIs.
* Disable GROUP BY alias.
* Doc reword.
* Move scan-query from a contrib extension into core.
Based on a proposal at: https://groups.google.com/d/topic/druid-development/ME_OatUDnbk/discussion
This patch also adds support for virtual columns to the Scan query,
and updates Druid SQL to use Scan instead of Select.
This patch also makes some behavioral changes to handling of the __time
column. In particular, it is now is returned as "__time" rather than
"timestamp"; it is no longer included if you do not specifically ask for
it in your "columns"; and it is returned as a long rather than a string.
Users can revert time handling to the legacy extension behavior by
setting "legacy" : true in their queries, or setting the property
druid.query.scan.legacy = true. This is meant to provide a migration
path for users that were formerly using the contrib extension.
* Adjustments from review.
* Add back Select query.
* Adjust SQL docs.
* Restore SelectQuery link.
* SQL: Full TRIM support.
- Support trimming arbitrary characters
- Support BOTH, LEADING, and TRAILING
* Remove unused import.
* Fix tests, add RTRIM / LTRIM.
* Remove unused imports.
* BTRIM and docs.
* Replace for with foreach.
* Add "round" option to cardinality and hyperUnique aggregators.
Also turn it on by default in SQL, to make math on distinct counts
work more as expected.
* Fix some compile errors.
* Fix test.
* Formatting.
* Improved SQL support for floats and doubles.
- Use Druid FLOAT for SQL FLOAT, and Druid DOUBLE for SQL DOUBLE, REAL,
and DECIMAL.
- Use float* aggregators when appropriate.
- Add tests involving both float and double columns.
- Adjust documentation accordingly.
* CR comments.
* Fix braces.
* SQL + Expressions = Best friends forever.
- Use expressions as a projection layer for anything that can't be
expressed using traditional Druid extractionFns. Sometimes they're
embedded directly (like "expression" filters, builtin aggregators,
or "expression" post-aggregators). Sometimes they're referenced
through virtual columns (like dimensionSpecs, which can't innately
reference functions of more than one column without the virtual
column layer).
- Add many new functions and operators, taking advantage of the
expression capability (see the querying/sql.md doc).
- Improve consistency of constant reduction and of casting by
using Druid expressions for this instead of Calcite's RexExecutor.
* Fix casting bug, and other code review comments.
* Fix docs.
There result would be {"error"=>"Unknown exception",
"errorMessage"=>nil, "errorClass"=>"java.lang.NullPointerException",
"host"=>nil} when the json lack of 『granularity』.