This is in preparation for eventually retiring the flag `useMaxMemoryEstimates`,
after which the footprint of a value in the dimension dictionary will always be
estimated using the `estimateSizeOfValue()` method.
* fix json_value sql planning with decimal type, fix vectorized expression math null value handling in default mode
changes:
* json_value 'returning' decimal will now plan to native double typed query instead of ending up with default string typing, allowing decimal vector math expressions to work with this type
* vector math expressions now zero out 'null' values even in 'default' mode (druid.generic.useDefaultValueForNull=true) to prevent downstream consumers that do not check the null vector from producing incorrect results
* more better
* test and why not vectorize
* more test, more fix
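
The null-zeroing change above can be pictured with a minimal sketch. This is not Druid's vector processor API: the class name, the method signature, and the use of plain `boolean[]` null vectors are assumptions made for illustration only.

```java
// Illustrative only: a hypothetical stand-in, not Druid's vector expression classes.
final class NullZeroingVectorSketch {
    // Adds two double vectors row by row. A row is null if either input is null.
    // Null rows are written as 0.0 so that a consumer which ignores outNulls
    // still reads a well-defined value instead of leftover data.
    static void add(double[] left, boolean[] leftNulls,
                    double[] right, boolean[] rightNulls,
                    double[] out, boolean[] outNulls) {
        for (int i = 0; i < out.length; i++) {
            outNulls[i] = leftNulls[i] || rightNulls[i];
            out[i] = outNulls[i] ? 0.0 : left[i] + right[i];
        }
    }
}
```

The point of the sketch is only that the output slot for a null row is explicitly zeroed rather than left holding whatever the arithmetic produced.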
This adds min/max functions for CompressedBigDecimal. It exposes these
functions via sql (BIG_MAX, BIG_MIN--see the SqlAggFunction
implementations).
It also includes various bug fixes and cleanup to the original
CompressedBigDecimal code, including the AggregatorFactories. Null
handling was improved in several places.
Additional test cases were added for both new and existing code
including a base test case for AggregationFactories. Other tests common
across sum, min, and max may also be refactored to share the various
cases in the future.
1) Better support for Java 9+ in RuntimeInfo. This means that in many cases,
an actual validation can be done.
2) Clearer log message in cases where an actual validation cannot be done.
* process: update PR template to include release notes
* Update .github/pull_request_template.md [ci skip]
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update .github/pull_request_template.md
Co-authored-by: Clint Wylie <cjwylie@gmail.com>
* incorporate feedback from paul
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
Co-authored-by: Clint Wylie <cjwylie@gmail.com>
* remove old query view
* update tests
* add filter
* fix test
* bump d3 things to latest versions
* went too far into the future with d3
* make config dialogs load
* goodies
* update snapshots
* only compute duration when running or pending
Druid currently uses ZooKeeper-dependent options as the defaults.
This commit updates the following to use HTTP as the default instead.
- task runner. `druid.indexer.runner.type=remote -> httpRemote`
- load queue peon. `druid.coordinator.loadqueuepeon.type=curator -> http`
- server inventory view. `druid.serverview.type=curator -> http`
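
For clusters that want to keep the previous ZooKeeper-based behavior, the old values listed above could presumably be pinned explicitly. The fragment below is a sketch using only the property names and values from this message; which `runtime.properties` file each line belongs in depends on the deployment and is not specified here.

```properties
# Restore the pre-change, ZooKeeper-based defaults (values taken from the list above).
druid.indexer.runner.type=remote
druid.coordinator.loadqueuepeon.type=curator
druid.serverview.type=curator
```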
* fix doc search
* upgrade website node to 16
* change website travis script
* move spellcheck notification
* explicit path to npm bin
* cd to the correct place
* add a note to the documentation about pre-built HLLSketches
Druid actually supports ingesting a pre-generated sketch column by using
the HLLSketchMerge aggregator. However, this functionality was
previously not made clear in the documentation.
* copyedit from the King's English to American English
* add suggested style changes
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
This adds a sql function, "BIG_SUM", that uses
CompressedBigDecimal to do a sum. Other misc changes:
1. handle NumberFormatExceptions when parsing a string (by default the value
is set to 0; the agg factory can be configured to be strict and throw on error)
2. format pom file (whitespace) + add dependency
3. scaleUp -> scale and always require scale as a parameter
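
The lenient versus strict parsing described in item 1 can be sketched as follows. The class and method names and the `strict` parameter are hypothetical; the actual aggregator factory option may be named and wired differently.

```java
import java.math.BigDecimal;

// Illustrative only: a hypothetical helper, not the CompressedBigDecimal parsing code.
final class LenientDecimalParseSketch {
    // Parses text as a decimal. In lenient mode an unparseable string becomes 0;
    // in strict mode the NumberFormatException is rethrown to the caller.
    static BigDecimal parse(String text, boolean strict) {
        try {
            return new BigDecimal(text);
        } catch (NumberFormatException e) {
            if (strict) {
                throw e;
            }
            return BigDecimal.ZERO;
        }
    }
}
```

Under this sketch, `parse("abc", false)` yields zero while `parse("abc", true)` throws, which mirrors the default and strict behaviors named above.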
Optimizes the compareTo() function in
CompressedBigDecimal. It directly compares the int[] arrays rather than
creating BigDecimal objects and using their compareTo().
It handles unequal-sized CBDs, but does require
the scales to match.
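
A minimal sketch of comparing two word arrays of different lengths without allocating BigDecimal objects is shown below. It assumes a little-endian two's-complement layout (index 0 is the least significant word) and equal scales; the class name and that layout are assumptions for illustration, not a copy of the CompressedBigDecimal implementation.

```java
// Illustrative only: assumes non-empty, little-endian two's-complement int[] magnitudes
// with matching scales. Not the actual CompressedBigDecimal code.
final class IntArrayCompareSketch {
    static int compare(int[] lhs, int[] rhs) {
        // The sign-extension word is all ones for negative values, zero otherwise.
        int lhsSign = lhs[lhs.length - 1] < 0 ? -1 : 0;
        int rhsSign = rhs[rhs.length - 1] < 0 ? -1 : 0;
        if (lhsSign != rhsSign) {
            // A negative value is always smaller than a non-negative one.
            return lhsSign < rhsSign ? -1 : 1;
        }
        // Same sign: walk from the most significant word down, padding the
        // shorter array with its sign word, and compare words as unsigned.
        int n = Math.max(lhs.length, rhs.length);
        for (int i = n - 1; i >= 0; i--) {
            int l = i < lhs.length ? lhs[i] : lhsSign;
            int r = i < rhs.length ? rhs[i] : rhsSign;
            if (l != r) {
                return Integer.compareUnsigned(l, r);
            }
        }
        return 0;
    }
}
```

Padding the shorter array with its sign word is what lets unequal-sized inputs be compared word by word in this sketch.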
* update log4j example
* fix some style issues
* Update docs/configuration/logging.md
Co-authored-by: Frank Chen <frankchen@apache.org>
Fixes #12822
The framework added here makes it easy to write tests that verify the behaviour and interactions
of the following entities under various conditions:
- `DruidCoordinator`
- `HttpLoadQueuePeon`, `LoadQueueTaskMaster`
- coordinator duties: `BalanceSegments`, `RunRules`, `UnloadUnusedSegments`, etc.
- datasource retention rules: `LoadRule`, `DropRule`
Changes:
Add the following main classes:
- `CoordinatorSimulation` and related interfaces to dictate behaviour of simulation
- `CoordinatorSimulationBuilder` to build a simulation.
- `BlockingExecutorService` to keep submitted tasks in queue and execute them
only when explicitly invoked.
Add tests:
- `CoordinatorSimulationBaseTest`, `SegmentLoadingTest`, `SegmentBalancingTest`
- `SegmentLoadingNegativeTest` to contain tests which assert the existing erroneous behaviour
of segment loading. Once the behaviour is fixed, these tests will be moved to the regular
`SegmentLoadingTest`.
Please refer to the README.md in `org.apache.druid.server.coordinator.simulate` for more details.
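
The idea behind `BlockingExecutorService` (queue submitted tasks and run them only when the test explicitly asks) can be sketched roughly as below. This is a simplified, hypothetical executor written for illustration; the class name and methods are assumptions and do not reflect the real class's API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.Executor;

// Illustrative only: a hypothetical manual executor, not the real BlockingExecutorService.
final class ManualExecutorSketch implements Executor {
    private final Queue<Runnable> pending = new ArrayDeque<>();

    // Submitting a task only enqueues it; nothing runs yet.
    @Override
    public synchronized void execute(Runnable task) {
        pending.add(task);
    }

    synchronized int pendingCount() {
        return pending.size();
    }

    // Runs the tasks queued at the time of the call on the caller's thread and
    // returns how many ran. Tasks submitted while running wait for the next call.
    int runPending() {
        List<Runnable> batch;
        synchronized (this) {
            batch = new ArrayList<>(pending);
            pending.clear();
        }
        for (Runnable task : batch) {
            task.run();
        }
        return batch.size();
    }
}
```

Running tasks on the caller's thread at a chosen moment is what makes a simulation step deterministic in this kind of design.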
* Clarified the behaviour of COUNT(DISTINCT column) on multi-value columns
* Update docs/querying/sql-aggregations.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
Co-authored-by: Vadim Ogievetsky <vadimon@gmail.com>