Commit Graph

12032 Commits

Author SHA1 Message Date
sr 54a2eb7dcc
Compressed Big Decimal Cleanup and Extension (#13048)
1. remove unnecessary generic type from CompressedBigDecimal
2. support Number input types
3. support aggregator reading supported input types directly (uningested
   data)
4. fix scaling bug in buffer aggregator
2022-09-13 19:14:31 -07:00
Frank Chen fd6c05eee8
Avoid ClassCastException when getting values from `QueryContext` (#13022)
* Use safe conversion methods

* Rename method

* Add getContextAsBoolean

* Update test case

* Remove generic from getContextValue

* Update catch-handler

* Add test

* Resolve comments

* Replace 'getContextXXX' to 'getQueryContext().getAsXXXX'
2022-09-13 18:00:09 +08:00
Vadim Ogievetsky 08d6aca528
Web console: better detection for arrays containing objects (#13077)
* better detection for arrays containing objects

* include boolean also
2022-09-12 18:50:29 -07:00
Gian Merlino 77925cdcdd
Expressions: fixes for round-trips of floating point literals, Long.MIN_VALUE literals, Shuffle.visitAll. (#13037)
* SQL: Fix round-trips of floating point literals.

When writing RexLiterals into Druid expressions, we now write non-integer
numeric literals in such a way that ensures they are parsed as doubles
on the other end.

* Updates from code review, and some additional stuff inspired by the
investigation.

- Remove unnecessary formatting code from DruidExpression.doubleLiteral:
  it handles things just fine with its default behavior.

- Fix a problem where expression literals could not represent Long.MIN_VALUE.
  Now, integer literals start life off as BigIntegerExpr instead of LongExpr,
  and are converted to LongExpr during flattening. This is necessary because,
  in order to avoid ambiguity between unary minus and negative literals, our
  grammar does not actually have true negative literals. Negative numbers must
  be represented as unary minus next to a positive literal.

- Fix a bug  introduced in #12230 where shuttle.visitAll(args) delegated
  to shuttle.visit(arg) instead of arg.visit(shuttle). The latter does
  a recursive visitation, which is the intended behavior.

* Style fixes.

* Move regexp to the right place.
2022-09-12 17:06:20 -07:00
Gian Merlino c00ad28ecc
Cleaner JSON for various input sources and formats. (#13064)
* Cleaner JSON for various input sources and formats.

Add JsonInclude to various properties, to avoid population of default
values in serialized JSON.

Also fixes a bug in OrcInputFormat: it was not writing binaryAsString,
so the property would be lost on serde.

* Additonal test cases.
2022-09-12 10:29:31 -07:00
Paul Rogers 80b97ac24d
Create a copy of the shared JDBC context (#13049) 2022-09-12 10:27:56 -07:00
Frank Chen eff7c64228
export com.sun.management.internal (#13068) 2022-09-12 09:03:22 -07:00
imply-cheddar 5ba0075c0c
Expose HTTP Response headers from SqlResource (#13052)
* Expose HTTP Response headers from SqlResource

This change makes the SqlResource expose HTTP response
headers in the same way that the QueryResource exposes them.

Fundamentally, the change is to pipe the QueryResponse
object all the way through to the Resource so that it can
populate response headers.  There is also some code
cleanup around DI, as there was a superfluous FactoryFactory
class muddying things up.
2022-09-12 01:40:06 -07:00
Frank Chen f60ec8e7ca
Enable msq for docker by default (#13069) 2022-09-11 21:00:32 +05:30
Benedict Jin 4bde50e683
Bump the version of Druid docker image from 0.16.0-incubating to latest (#13058) 2022-09-10 14:06:00 +05:30
Vadim Ogievetsky 4fc43670e5
adjust docs and images (#13067) 2022-09-10 14:05:19 +05:30
Vadim Ogievetsky d978afc5b7
fix number of expected functions (#13050) 2022-09-09 13:42:01 -07:00
Gian Merlino e29e7a8434
Add ARRAY_QUANTILE function. (#13061)
* Add ARRAY_QUANTILE function.

Expected usage is like: ARRAY_QUANTILE(ARRAY_AGG(x), 0.9).

* Fix test.
2022-09-09 11:29:20 -07:00
Vadim Ogievetsky 5cc5f7b60c
quote columns, datasources in auto complete if needed (#13060) 2022-09-09 11:22:40 -07:00
DENNIS dced61645f
prometheus-emitter supports sending metrics to pushgateway regularly … (#13034)
* prometheus-emitter supports sending metrics to pushgateway regularly and continuously

* spell check fix

* Optimization variable name and related documents

* Update docs/development/extensions-contrib/prometheus.md

OK, it looks more conspicuous

Co-authored-by: Frank Chen <frankchen@apache.org>

* Update doc

* Update docs/development/extensions-contrib/prometheus.md

Co-authored-by: Frank Chen <frankchen@apache.org>

* When PrometheusEmitter is closed, close the scheduler

* Ensure that registeredMetrics is thread safe.

* Local variable name optimization

* Remove unnecessary white space characters

Co-authored-by: Frank Chen <frankchen@apache.org>
2022-09-09 20:46:14 +08:00
sachidananda007 48c99054d0
Update tutorial-kafka.md (#13056)
* Update tutorial-kafka.md

Added missing command to the doc for zookeeper before starting kafka

* Update docs/tutorials/tutorial-kafka.md

Co-authored-by: Frank Chen <frankchen@apache.org>
2022-09-09 10:06:19 +08:00
Clint Wylie 6438f4198d
improve nested column serializer (#13051)
changes:
* long and double value columns are now written directly, at the same time as writing out the 'intermediary' dictionaryid column with unsorted ids
* remove reverse value lookup from GlobalDictionaryIdLookup since it is no longer needed
2022-09-08 18:32:53 -07:00
Frank Chen d57557d51d
Improve doc and configuration of prometheus emitter (#13028)
* Improve doc and validation

* Add configuration for peon tasks

* Update doc

* Update test case

* Fix typo

* Update docs/development/extensions-contrib/prometheus.md

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>

* Update docs/development/extensions-contrib/prometheus.md

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
2022-09-09 02:20:34 +08:00
Lucas Capistrant 99fd22c79b
fix bug in /status/properties filtering (#13045)
* fix bug in /status/properties filtering

* Refactor tests to use jackson for parsing druid.server.hiddenProperties instead of hacky string modifications

* make javadoc more descriptive using example

* add in a sanity assertion that raw properties keyset size is greater than filtered properties keyset size
2022-09-07 17:45:28 -07:00
Kashif Faraz 7e20d70242
Fix web-console message in MSQ data loader (#12996)
* Fix typo in web-console message

* Prettify the changes
2022-09-07 13:34:10 -07:00
Vadim Ogievetsky 92789cfc4a
default to no compare (#13041) 2022-09-07 08:28:28 -07:00
Gian Merlino f00f1f754d
MSQ extension: Fix over-capacity write in ScanQueryFrameProcessor. (#13036)
* MSQ extension: Fix over-capacity write in ScanQueryFrameProcessor.

Frame processors are meant to write only one output frame per cycle.
The ScanQueryFrameProcessor would write two when reading from a channel
if the input frame cursor cycled and then the output frame filled up
while reading from the next frame.

This patch fixes the bug, and adds a test. It also makes some adjustments
to the processor code in order to make it easier to test.

* Add license header.
2022-09-07 19:32:21 +05:30
Rohan Garg 2f156b3610
Disallow timeseries queries with ETERNITY interval and non-ALL granularity (#12944) 2022-09-07 16:45:08 +05:30
Rohan Garg 7aa8d7f987
Add query/time metric for SQL queries from router (#12867)
* Add query/time metric for SQL queries from router

* Fix query cancel bug when user has overriden native query-id in a SQL query
2022-09-07 13:54:46 +05:30
Adam Peck ee22663dd3
Add interpolation to JsonConfigurator (#13023)
* Add interpolation to JsonConfigurator

* Fix checkstyle

* Fix tests by removing common-text override

* Add back commons-text without version

* Remove unused hadoopDir configs

* Move some stuff to hopefully pass coverage
2022-09-07 12:48:01 +05:30
Clint Wylie a3a377e570
more consistent expression error messages (#12995)
* more consistent expression error messages

* review stuff

* add NamedFunction for Function, ApplyFunction, and ExprMacro to share common stuff

* fixes

* add expression transform name to transformer failure, better parse_json error messaging
2022-09-06 23:21:38 -07:00
sr ed26e2d634
Improve String Last/First Storage Efficiency (#12879)
-Add classes for writing cell values in LZ4 block compressed format.
Payloads are indexed by element number for efficient random lookup
-update SerializablePairLongStringComplexMetricSerde to use block
compression
-SerializablePairLongStringComplexMetricSerde also uses delta encoding
of the Long by doing 2-pass encoding: buffers first to find min/max
numbers and delta-encodes as integers if possible

Entry points for doing block-compressed storage of byte[] payloads
are the CellWriter and CellReader class. See
SerializablePairLongStringComplexMetricSerde for how these are used
along with how to do full column-based storage (delta encoding here)
which includes 2-pass encoding to compute a column header
2022-09-06 20:00:54 -07:00
Jill Osborne 1f69140623
Nested columns documentation (#12946)
Co-authored-by: Clint Wylie <cjwylie@gmail.com>
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
Co-authored-by: brian.le <brian.le@imply.io>
2022-09-06 14:42:18 -07:00
Vadim Ogievetsky 897689c03b
remove mentions of DruidQueryRel from docs (#13033)
* remove mentions of DruidQueryRel

* Update docs/querying/sql-translation.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

* Update docs/querying/sql-translation.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
2022-09-06 13:37:27 -07:00
Vadim Ogievetsky 2a039e7e6a
Add CTA and fix typo (#13009)
* Add CTA and fix typo

* resolve hostname better
2022-09-06 11:16:50 -07:00
Vadim Ogievetsky 2cf449386f
Web console: upgrade the console to use node 16 (#13017)
* upgrade the console to use node 16

* run npm audit fix
2022-09-06 11:15:23 -07:00
317brian d4233ef2a1
msq: add multi-stage-query docs (#12983)
* msq: add multi-stage-query docs

* add screenshots

add back theta sketches tutoria

change filename

fix filename

fix link

fix headings

* fixes

* fixes

* fix spelling issues and update spell file

* address feedback from karan

* add missing guardrail to known issues

* update blurb

* fix typo

* remove durable storage info

* update titles

* Restore en.json

* Update query view

* address comments from vad

* Update docs/multi-stage-query/msq-known-issues.md

finish sentence

* add apache license to docs

* add apache license to docs

Co-authored-by: Katya Macedo <katya.macedo@imply.io>
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
2022-09-06 23:06:09 +05:30
Didip Kerabat 66545a0f3d
Fix compiler error: The project was not built since its build path is incomplete. Cannot find the class file for org.slf4j.Logger. Fix the build path then try building this project (#13029)
Co-authored-by: Didip Kerabat <didip@apple.com>
2022-09-06 20:49:41 +05:30
senthilkv 3d9aef225d
compressed big decimal - module (#10705)
Compressed Big Decimal is an extension which provides support for 
Mutable big decimal value that can be used to accumulate values 
without losing precision or reallocating memory. This type helps in 
absolute precision arithmetic on large numbers in applications, 
where greater level of accuracy is required, such as financial 
applications, currency based transactions. This helps avoid rounding 
issues where in potentially large amount of money can be lost.

Accumulation requires that the two numbers have the same scale, 
but does not require that they are of the same size. If the value 
being accumulated has a larger underlying array than this value 
(the result), then the higher order bits are dropped, similar to what 
happens when adding a long to an int and storing the result in an 
int. A compressed big decimal that holds its data with an embedded 
array.

Compressed big decimal is an absolute number based complex type 
based on big decimal in Java. This supports all the functionalities 
supported by Java Big Decimal. Java Big Decimal is not mutable in 
order to avoid big garbage collection issues. Compressed big decimal 
is needed to mutate the value in the accumulator.
2022-09-06 00:06:57 -07:00
Abhishek Agarwal 7d332c6f6a
Suppress false CVEs (#13026)
* Suppress CVEs

* Add more suppressions
2022-09-06 11:46:56 +05:30
zemin 6805a7f9c2
Ease of hidding sensitive properties from /status/proper… (#12950)
* apache#12063 Ease of hidding sensitive properties from /status/properties endpoint

* apache#12063 Ease of hidding sensitive properties from /status/properties endpoint

* apache#12063 Ease of hidding sensitive properties from /status/properties endpoint

using one property for hiding properties, updated the index.md to document hiddenProperties

* apache#12063 Ease of hidding sensitive properties from /status/properties endpoint

Added java docs

* apache#12063 Ease of hidding sensitive properties from /status/properties endpoint

Add "password", "key", "token", "pwd" as default druid.server.hiddenProperties

fixed typo and removed redundant space

Co-authored-by: zemin <zemin.piao@adyen.com>
2022-09-02 08:51:25 -05:00
Vadim Ogievetsky 0ae515bd3c
Web console: don't crash if cookies are totally disabled (#13013)
* fix local storage detection

* fix numeric input dialog
2022-09-01 16:10:23 -07:00
Gian Merlino 85d2a6d879
Improve range partitioning docs. (#13016)
Two improvements:

- Use a realistic targetRowsPerSegment, so if people copy and paste
  the example from the docs, it will generate reasonable segments.
- Spell "countryName" correctly.
2022-09-01 15:21:30 -07:00
Benedict Jin d73a011f70
Bump the version of Maven in the Dockerfile (#11994) 2022-08-31 22:54:24 +08:00
Vadim Ogievetsky 5e850c6ea3
Make console e2e tests run in band so as to not hog task slots (#13004)
* increase e2e timeline

* get rid of pull deps

* increase post index task timeoout

* boost msq e2e timeout

* run in band
2022-08-30 21:55:53 -07:00
Vadim Ogievetsky 054688528f
don't show transform actions on * queries (#13005) 2022-08-30 21:54:18 -07:00
Gian Merlino 48ceab2153
Add Java 17 information to documentation. (#12990)
The docs say Java 17 support is experimental, and give tips on running
successfully with Java 17.

This patch also removes java.base/jdk.internal.perf and
jdk.management/com.sun.management.internal from the list of required
exports and opens, because they were formerly needed for JvmMonitor,
which was rewritten in #12481 to use MXBeans instead.
2022-08-30 12:32:49 -07:00
Gian Merlino 2450b96ac8
FrameFile: Java 17 compatibility. (#12987)
* FrameFile: Java 17 compatibility.

DataSketches Memory.map is not Java 17 compatible, and from discussions
with the team, is challenging to make compatible with 17 while also
retaining compatibility with 8 and 11. So, in this patch, we switch away
from Memory.map and instead use the builtin JDK mmap functionality. Since
it only supports maps up to Integer.MAX_VALUE, we also implement windowing
in FrameFile, such that we can still handle large files.

Other changes:

1) Add two new "map" functions to FileUtils, which we use in this patch.
2) Add a footer checksum to the FrameFile format. Individual frames
   already have checksums, but the footer was missing one.

* Changes for static analysis.

* wip

* Fixes.
2022-08-30 11:13:47 -07:00
abhagraw f3c47cf68c
Building druid-it-tools and running for travis in it.sh (#12957)
* Building druid-it-tools and running for travis in it.sh

* Addressing comments

* Updating druid-it-image pom to point to correct it-tools

* Updating all it-tools references to druid-it-tools

* Adding dist back to it.sh travis

* Trigger Build

* Disabling batchIndex tests and commenting out user specific code

* Fixing checkstyle and intellij inspection errors

* Replacing tabs with spaces in it.sh

* Enabling old batch index tests with indexer
2022-08-30 12:48:07 +05:30
Gian Merlino 414176fb97
Fix accounting of bytesAdded in ReadableByteChunksFrameChannel. (#12988)
* Fix accounting of bytesAdded in ReadableByteChunksFrameChannel.

Could cause WorkerInputChannelFactory to get into an infinite loop when
reading the footer of a frame file.

* Additional tests.
2022-08-29 18:25:28 -07:00
Gian Merlino 9eb20e5e7c
Remove dependency on jvm-attach. (#12989)
This dependency was no longer needed after #12481, but remained because
it was used for a (now useless) test. This patch removes the test and
the dependency.
2022-08-29 14:18:33 -07:00
Gian Merlino 0460d8a502
Adjust SQL "cannot plan" error message. (#12903)
Two changes:

1) Restore the text of the SQL query. It was removed in #12897, but
   then it was later pointed out that the text is helpful for end
   users querying Druid through tools that do not show the SQL queries
   that they are making.

2) Adjust wording slightly, from "Cannot build plan for query" to
   "Query not supported". This will be clearer to most users. Generally
   the reason we get these errors is due to unsupported SQL constructs.
2022-08-29 18:33:00 +05:30
wiquan 9fd8ab3c75
[Issue 10331] Fresh Docker install results in "Uh... I have no servers. Not assigning anything... (#12963)
* add env var DRUID_SINGLE_NODE_CONF

* no message

* typo fix

Co-authored-by: Wil Quan <wiquan@appdynamics.com>
2022-08-29 17:43:53 +05:30
Abhishek Agarwal 618757352b
Bump up the version to 25.0.0 (#12975)
* Bump up the version to 25.0.0

* Fix the version in console
2022-08-29 11:27:38 +05:30
Kashif Faraz 9843355ddd
Throw parse exception for multi-valued numeric dims (#12953)
During ingestion, if a row containing multiple values for a numeric dimension is encountered,
the whole ingestion task fails. Ideally, this should just be registered as a parse exception.

Changes:
- Remove `instanceof List` check from `LongDimensionIndexer`, `FloatDimensionIndexer` and `DoubleDimensionIndexer`.

Any invalid type, including list, throws a parse exception in `DimensionHandlerUtils.convertObjectToXXX`
methods. `ParseException` is already handled in `OnHeapIncrementalIndex` and does not fail the entire task.
2022-08-29 10:33:48 +05:30