12249 Commits

Author SHA1 Message Date
Ellen Shen da30c8070a
kafka consumer: custom serializer can't be configured after its instantiation (#12960) (#13097)
* allow kafka custom serializer to be configured

  * add unit tests

Co-authored-by: ellen shen <ellenshen@apple.com>
2022-09-17 20:42:21 +08:00
Gian Merlino d4967c38f8
Various documentation updates. (#13107)
* Various documentation updates.

1) Split out "data management" from "ingestion". Break it into thematic pages.

2) Move "SQL-based ingestion" into the Ingestion category. Adjust content so
   all conceptual content is in concepts.md and all syntax content is in reference.md.
   Shorten the known issues page to the most interesting ones.

3) Add SQL-based ingestion to the ingestion method comparison page. Remove the
   index task, since index_parallel is just as good when maxNumConcurrentSubTasks: 1.

4) Rename various mentions of "Druid console" to "web console".

5) Add additional information to ingestion/partitioning.md.

6) Remove a mention of Tranquility.

7) Remove a note about upgrading to Druid 0.10.1.

8) Remove no-longer-relevant task types from ingestion/tasks.md.

9) Move ingestion/native-batch-firehose.md to the hidden section. It was previously deprecated.

10) Move ingestion/native-batch-simple-task.md to the hidden section. It is still linked in some
    places, but it isn't very useful compared to index_parallel, so it shouldn't take up space
    in the sidebar.

11) Make all br tags self-closing.

12) Certain other cosmetic changes.

13) Update to node-sass 7.

* make travis use node12 for docs

Co-authored-by: Vadim Ogievetsky <vadim@ogievetsky.com>
2022-09-16 21:58:11 -07:00
Vadim Ogievetsky c62a822121
support kafka lookups (#13098) 2022-09-16 15:25:25 -07:00
AmatyaAvadhanula 9b53b0184f
Allocate numCorePartitions using only used segments (#13070)
* Allocate numCorePartitions using only used segments

* Add corePartition checks in existing test

* Separate committedMaxId and overallMaxId

* Fix bug: replace overall with committed
2022-09-16 19:16:36 +05:30
Vadim Ogievetsky 2493eb17bf
Doc fixes around msq (#13090)
* remove things that do not apply

* fix more things

* pin node to a working version

* fix

* fixes

* known issues tidy up

* revert auto formatting changes

* remove management-uis page which is 100% lies

* don't mention the Coordinator console (that no longer exists)

* goodies

* fix typo
2022-09-16 02:15:26 -07:00
Clint Wylie 5ece870634
split up NestedDataColumnSerializer into separate files (#13096)
* split up NestedDataColumnSerializer into separate files

* fix it
2022-09-16 01:28:47 -07:00
Katya Macedo 2218c8d23c
Documentation: Update spatial indexing example (#12555)
* fix spatial indexing example

* Update docs/development/geo.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

* Update docs/development/geo.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

* Update text and example

* Format JSON example

* Update docs/development/geo.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

* Update docs/development/geo.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

* Update docs/development/geo.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

* Update docs/development/geo.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

* Update docs/development/geo.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

* Update docs/development/geo.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

* Update docs/development/geo.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

* Update docs/development/geo.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

* Accept review suggestions

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
Co-authored-by: Frank Chen <frankchen@apache.org>
2022-09-16 10:32:19 +08:00
Peter Marshall 68262e43f8
Docs – README.md update around documentation contributions (#12850)
* Update README.md

Expansion on the process and where everything is.

* Update README.md

Switcheroo and a typo fix.

* Update README.md

Header link update to take to the H2.

* Update README.md

Reverted docs link after feedback

* Update README.md

Amended language.

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update README.md

PR term update

Co-authored-by: Charles Smith <techdocsmith@gmail.com>
2022-09-16 10:31:06 +08:00
Frank Chen b8dd822f32
Some improvements about Docker (#13059) 2022-09-16 09:25:52 +08:00
Vadim Ogievetsky 078b50ebe1
link to error docs (#13094) 2022-09-15 15:06:08 -07:00
Peter Marshall c32bf0df65
Docs - README.md community channels removal + link (#12843)
* README.md community channels

Removed explicit links to project channels in favour of a link direct to the Community page on druid.apache.org.

Updated nav to match remaining headings in the README.

* Update README.md

Reintroduced the old section and amended the nav bar to point back to the community section.

* Incorporated suggested wording from @paul-rogers with some stylistic blahness
* Updated Slack phraseology to be closer to the Google User Groups header wording and called out specific channels
* Added new wording re: events and articles with link to the repo to contribute them
2022-09-15 20:52:46 +08:00
Atul Mohan c153c2a712
Initialize NullValueHandlingConfig for failed tests (#13078)
* Initialize null handling

* Refactor nullhandlingconfig init
2022-09-15 20:47:10 +08:00
AmatyaAvadhanula 1311e85f65
Faster fix for dangling tasks upon supervisor termination (#13072)
This commit fixes issues with delayed supervisor termination during certain transient states.
Tasks can be created during supervisor termination and left behind since the cleanup may
not consider these newly added tasks.

#12178 added a lock for the entire process of task creation to prevent such dangling tasks.
But it also introduced a deadlock scenario as follows:
- An invocation of `runInternal` is in progress.
- A `stop` request comes, acquires `stateChangeLock` and submits a `ShutdownNotice`
- `runInternal` keeps waiting to acquire the `stateChangeLock`
- `ShutdownNotice` remains stuck in the notice queue because `runInternal` is still running
- After some timeout, the supervisor goes through a forced termination

Fix:
 * `SeekableStreamSupervisor.runInternal` - do not try to acquire lock if supervisor is already stopping
 * `SupervisorStateManager.maybeSetState` - do not allow transitions from STOPPING state
2022-09-15 15:31:14 +05:30
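The fix described in #13072 above can be illustrated with a minimal sketch. This is not Druid's code: `SupervisorSketch`, its `State` enum, and the method bodies are hypothetical stand-ins, and only the names `runInternal`, `stop`, `maybeSetState`, and `stateChangeLock` come from the commit message. The sketch shows the two ideas in the fix: skip the lock entirely when a stop is already in progress, and refuse state transitions out of STOPPING.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a stream supervisor; not Druid's actual class.
class SupervisorSketch {
    enum State { RUNNING, STOPPING }

    private final ReentrantLock stateChangeLock = new ReentrantLock();
    private volatile State state = State.RUNNING;
    private int tasksCreated = 0;

    // Refuse any transition once the supervisor is STOPPING.
    boolean maybeSetState(State next) {
        if (state == State.STOPPING) {
            return false;
        }
        state = next;
        return true;
    }

    // Returns true if a task was created, false if the run was skipped.
    boolean runInternal() {
        // Check before locking: a stopping supervisor must not wait on the lock.
        if (state == State.STOPPING) {
            return false;
        }
        stateChangeLock.lock();
        try {
            // Re-check under the lock in case stop() won the race.
            if (state == State.STOPPING) {
                return false;
            }
            tasksCreated++;
            return true;
        } finally {
            stateChangeLock.unlock();
        }
    }

    // Marks the supervisor as stopping while holding the lock.
    void stop() {
        stateChangeLock.lock();
        try {
            state = State.STOPPING;
        } finally {
            stateChangeLock.unlock();
        }
    }

    int tasksCreated() {
        return tasksCreated;
    }
}
```

Once `stop()` has run, a later `runInternal()` returns without creating a task and without blocking on `stateChangeLock`, which is the behavior the commit message describes.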
Frank Chen aa9b0900d4
Move web-console dependency declaration from druid-server to druid-distribution (#12501)
* Move web-console dependency from druid-server to distribution

* Add a test to check if the web-console is correctly integrated

* exclude web-console from 'other integration tests'

* Revert "exclude web-console from 'other integration tests'"

This reverts commit 8d72225544.

* Revert "Add a test to check if the web-console is correctly integrated"

This reverts commit d6ac8f3087.
2022-09-15 10:39:30 +08:00
Gian Merlino 5733360dfd
Update Snappy to 1.1.8.4. (#13081)
* Update Snappy to 1.1.8.4.

Prior to this, because snappy-java wasn't included in dependencyManagement,
we actually shipped multiple different versions for different extensions,
ranging from 1.1.7.1 to 1.1.8.4. Now, we standardize on 1.1.8.4.

Among other things, this enables the tests to pass on M1 Macs.

* Update snappy-java versions in licenses.yaml.
2022-09-14 15:13:47 -07:00
Clint Wylie f4ec50bf7a
fix JsonParserIteratorTest (#13083) 2022-09-13 20:49:57 -07:00
Atul Mohan a8fd3a9077
Provide service specific log4j overrides in containerized deployments (#13020)
* Provide service specific log4j overrides

* Clarify comments

* Add docs
2022-09-14 11:47:11 +08:00
sr 54a2eb7dcc
Compressed Big Decimal Cleanup and Extension (#13048)
1. remove unnecessary generic type from CompressedBigDecimal
2. support Number input types
3. support aggregator reading supported input types directly (uningested
   data)
4. fix scaling bug in buffer aggregator
2022-09-13 19:14:31 -07:00
Frank Chen fd6c05eee8
Avoid ClassCastException when getting values from `QueryContext` (#13022)
* Use safe conversion methods

* Rename method

* Add getContextAsBoolean

* Update test case

* Remove generic from getContextValue

* Update catch-handler

* Add test

* Resolve comments

* Replace 'getContextXXX' with 'getQueryContext().getAsXXXX'
2022-09-13 18:00:09 +08:00
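As an illustration of the "safe conversion" idea in #13022 above, here is a minimal sketch of reading typed values from a context map without blind casts. The class `ContextValueSketch` and its method signatures are assumptions made for this example and are not Druid's `QueryContext` API. The point is that a value deserialized from JSON may arrive as a `String`, a `Boolean`, or some `Number` subtype, so a direct cast such as `(Long) value` can throw `ClassCastException`.

```java
import java.util.Map;

// Hypothetical helper; the names and signatures are illustrative only.
class ContextValueSketch {
    // Accepts Boolean or String values; falls back to the default when absent.
    static boolean getAsBoolean(Map<String, Object> context, String key, boolean defaultValue) {
        Object value = context.get(key);
        if (value == null) {
            return defaultValue;
        }
        if (value instanceof Boolean) {
            return (Boolean) value;
        }
        if (value instanceof String) {
            return Boolean.parseBoolean((String) value);
        }
        throw new IllegalArgumentException("Expected boolean for key [" + key + "]");
    }

    // Accepts any Number (Integer, Long, Double, ...) or a numeric String.
    static long getAsLong(Map<String, Object> context, String key, long defaultValue) {
        Object value = context.get(key);
        if (value == null) {
            return defaultValue;
        }
        if (value instanceof Number) {
            return ((Number) value).longValue();
        }
        if (value instanceof String) {
            return Long.parseLong((String) value);
        }
        throw new IllegalArgumentException("Expected long for key [" + key + "]");
    }
}
```

With this shape, a context entry stored as the `Integer` 5 or the `String` "5" both read back as the long 5, where a plain `(Long)` cast would fail for either.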
Vadim Ogievetsky 08d6aca528
Web console: better detection for arrays containing objects (#13077)
* better detection for arrays containing objects

* include boolean also
2022-09-12 18:50:29 -07:00
Gian Merlino 77925cdcdd
Expressions: fixes for round-trips of floating point literals, Long.MIN_VALUE literals, Shuffle.visitAll. (#13037)
* SQL: Fix round-trips of floating point literals.

When writing RexLiterals into Druid expressions, we now write non-integer
numeric literals in such a way that ensures they are parsed as doubles
on the other end.

* Updates from code review, and some additional stuff inspired by the
investigation.

- Remove unnecessary formatting code from DruidExpression.doubleLiteral:
  it handles things just fine with its default behavior.

- Fix a problem where expression literals could not represent Long.MIN_VALUE.
  Now, integer literals start life off as BigIntegerExpr instead of LongExpr,
  and are converted to LongExpr during flattening. This is necessary because,
  in order to avoid ambiguity between unary minus and negative literals, our
  grammar does not actually have true negative literals. Negative numbers must
  be represented as unary minus next to a positive literal.

- Fix a bug introduced in #12230 where shuttle.visitAll(args) delegated
  to shuttle.visit(arg) instead of arg.visit(shuttle). The latter does
  a recursive visitation, which is the intended behavior.

* Style fixes.

* Move regexp to the right place.
2022-09-12 17:06:20 -07:00
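The Long.MIN_VALUE point in #13037 above can be shown with plain Java arithmetic. The sketch below is not Druid's expression code: `MinLongLiteralSketch` and its methods are hypothetical names. It shows why a grammar that only has positive literals plus unary minus needs a wider intermediate type: the magnitude 9223372036854775808 does not fit in a `long`, but its negation does.

```java
import java.math.BigInteger;

// Hypothetical illustration; not Druid's BigIntegerExpr or LongExpr.
class MinLongLiteralSketch {
    // True when the unsigned decimal literal can be parsed directly as a long.
    static boolean fitsInLong(String positiveLiteral) {
        try {
            Long.parseLong(positiveLiteral);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Parses the literal as a BigInteger, applies unary minus, then narrows.
    // longValueExact throws ArithmeticException if the result is out of range.
    static long negateLiteral(String positiveLiteral) {
        BigInteger magnitude = new BigInteger(positiveLiteral);
        return magnitude.negate().longValueExact();
    }
}
```

Here `fitsInLong("9223372036854775808")` is false, while `negateLiteral("9223372036854775808")` returns `Long.MIN_VALUE`, so deferring the narrowing until after the unary minus is applied lets the value be represented.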
Gian Merlino c00ad28ecc
Cleaner JSON for various input sources and formats. (#13064)
* Cleaner JSON for various input sources and formats.

Add JsonInclude to various properties, to avoid population of default
values in serialized JSON.

Also fixes a bug in OrcInputFormat: it was not writing binaryAsString,
so the property would be lost on serde.

* Additional test cases.
2022-09-12 10:29:31 -07:00
Paul Rogers 80b97ac24d
Create a copy of the shared JDBC context (#13049) 2022-09-12 10:27:56 -07:00
Frank Chen eff7c64228
export com.sun.management.internal (#13068) 2022-09-12 09:03:22 -07:00
imply-cheddar 5ba0075c0c
Expose HTTP Response headers from SqlResource (#13052)
* Expose HTTP Response headers from SqlResource

This change makes the SqlResource expose HTTP response
headers in the same way that the QueryResource exposes them.

Fundamentally, the change is to pipe the QueryResponse
object all the way through to the Resource so that it can
populate response headers.  There is also some code
cleanup around DI, as there was a superfluous FactoryFactory
class muddying things up.
2022-09-12 01:40:06 -07:00
Frank Chen f60ec8e7ca
Enable msq for docker by default (#13069) 2022-09-11 21:00:32 +05:30
Benedict Jin 4bde50e683
Bump the version of Druid docker image from 0.16.0-incubating to latest (#13058) 2022-09-10 14:06:00 +05:30
Vadim Ogievetsky 4fc43670e5
adjust docs and images (#13067) 2022-09-10 14:05:19 +05:30
Vadim Ogievetsky d978afc5b7
fix number of expected functions (#13050) 2022-09-09 13:42:01 -07:00
Gian Merlino e29e7a8434
Add ARRAY_QUANTILE function. (#13061)
* Add ARRAY_QUANTILE function.

Expected usage is like: ARRAY_QUANTILE(ARRAY_AGG(x), 0.9).

* Fix test.
2022-09-09 11:29:20 -07:00
Vadim Ogievetsky 5cc5f7b60c
quote columns, datasources in auto complete if needed (#13060) 2022-09-09 11:22:40 -07:00
DENNIS dced61645f
prometheus-emitter supports sending metrics to pushgateway regularly … (#13034)
* prometheus-emitter supports sending metrics to pushgateway regularly and continuously

* spell check fix

* Optimization variable name and related documents

* Update docs/development/extensions-contrib/prometheus.md

OK, it looks more conspicuous

Co-authored-by: Frank Chen <frankchen@apache.org>

* Update doc

* Update docs/development/extensions-contrib/prometheus.md

Co-authored-by: Frank Chen <frankchen@apache.org>

* When PrometheusEmitter is closed, close the scheduler

* Ensure that registeredMetrics is thread safe.

* Local variable name optimization

* Remove unnecessary white space characters

Co-authored-by: Frank Chen <frankchen@apache.org>
2022-09-09 20:46:14 +08:00
sachidananda007 48c99054d0
Update tutorial-kafka.md (#13056)
* Update tutorial-kafka.md

Added missing command to the doc for zookeeper before starting kafka

* Update docs/tutorials/tutorial-kafka.md

Co-authored-by: Frank Chen <frankchen@apache.org>
2022-09-09 10:06:19 +08:00
Clint Wylie 6438f4198d
improve nested column serializer (#13051)
changes:
* long and double value columns are now written directly, at the same time as writing out the 'intermediary' dictionaryid column with unsorted ids
* remove reverse value lookup from GlobalDictionaryIdLookup since it is no longer needed
2022-09-08 18:32:53 -07:00
Frank Chen d57557d51d
Improve doc and configuration of prometheus emitter (#13028)
* Improve doc and validation

* Add configuration for peon tasks

* Update doc

* Update test case

* Fix typo

* Update docs/development/extensions-contrib/prometheus.md

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>

* Update docs/development/extensions-contrib/prometheus.md

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
2022-09-09 02:20:34 +08:00
Lucas Capistrant 99fd22c79b
fix bug in /status/properties filtering (#13045)
* fix bug in /status/properties filtering

* Refactor tests to use jackson for parsing druid.server.hiddenProperties instead of hacky string modifications

* make javadoc more descriptive using example

* add in a sanity assertion that raw properties keyset size is greater than filtered properties keyset size
2022-09-07 17:45:28 -07:00
Kashif Faraz 7e20d70242
Fix web-console message in MSQ data loader (#12996)
* Fix typo in web-console message

* Prettify the changes
2022-09-07 13:34:10 -07:00
Vadim Ogievetsky 92789cfc4a
default to no compare (#13041) 2022-09-07 08:28:28 -07:00
Gian Merlino f00f1f754d
MSQ extension: Fix over-capacity write in ScanQueryFrameProcessor. (#13036)
* MSQ extension: Fix over-capacity write in ScanQueryFrameProcessor.

Frame processors are meant to write only one output frame per cycle.
The ScanQueryFrameProcessor would write two when reading from a channel
if the input frame cursor cycled and then the output frame filled up
while reading from the next frame.

This patch fixes the bug, and adds a test. It also makes some adjustments
to the processor code in order to make it easier to test.

* Add license header.
2022-09-07 19:32:21 +05:30
Rohan Garg 2f156b3610
Disallow timeseries queries with ETERNITY interval and non-ALL granularity (#12944) 2022-09-07 16:45:08 +05:30
Rohan Garg 7aa8d7f987
Add query/time metric for SQL queries from router (#12867)
* Add query/time metric for SQL queries from router

* Fix query cancel bug when user has overridden native query-id in a SQL query
2022-09-07 13:54:46 +05:30
Adam Peck ee22663dd3
Add interpolation to JsonConfigurator (#13023)
* Add interpolation to JsonConfigurator

* Fix checkstyle

* Fix tests by removing commons-text override

* Add back commons-text without version

* Remove unused hadoopDir configs

* Move some stuff to hopefully pass coverage
2022-09-07 12:48:01 +05:30
Clint Wylie a3a377e570
more consistent expression error messages (#12995)
* more consistent expression error messages

* review stuff

* add NamedFunction for Function, ApplyFunction, and ExprMacro to share common stuff

* fixes

* add expression transform name to transformer failure, better parse_json error messaging
2022-09-06 23:21:38 -07:00
sr ed26e2d634
Improve String Last/First Storage Efficiency (#12879)
- Add classes for writing cell values in LZ4 block compressed format.
  Payloads are indexed by element number for efficient random lookup.
- Update SerializablePairLongStringComplexMetricSerde to use block
  compression.
- SerializablePairLongStringComplexMetricSerde also uses delta encoding
  of the Long by doing 2-pass encoding: buffers first to find min/max
  numbers and delta-encodes as integers if possible.

Entry points for doing block-compressed storage of byte[] payloads
are the CellWriter and CellReader class. See
SerializablePairLongStringComplexMetricSerde for how these are used
along with how to do full column-based storage (delta encoding here)
which includes 2-pass encoding to compute a column header
2022-09-06 20:00:54 -07:00
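The 2-pass delta encoding mentioned in #12879 above can be sketched as follows. This is an assumption-laden illustration, not the logic of SerializablePairLongStringComplexMetricSerde: the class `DeltaEncodingSketch`, the nested `Encoded` type, and the choice to return `null` when deltas do not fit in an int are all invented for this example. The first pass finds the minimum and maximum, and the second pass stores each value as an int offset from the minimum when the range allows it.

```java
// Hypothetical sketch of 2-pass delta encoding; not Druid's implementation.
class DeltaEncodingSketch {
    // Holds the column minimum (the "header") and per-row int deltas.
    static final class Encoded {
        final long min;
        final int[] deltas;

        Encoded(long min, int[] deltas) {
            this.min = min;
            this.deltas = deltas;
        }
    }

    // Returns null when max - min does not fit in an int.
    static Encoded encode(long[] values) {
        if (values.length == 0) {
            return new Encoded(0L, new int[0]);
        }
        // Pass 1: scan the buffered values for min and max.
        long min = values[0];
        long max = values[0];
        for (long v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        long range;
        try {
            range = Math.subtractExact(max, min);
        } catch (ArithmeticException e) {
            return null; // range overflows a long, so it cannot fit in an int
        }
        if (range > Integer.MAX_VALUE) {
            return null;
        }
        // Pass 2: store each value as an int offset from the minimum.
        int[] deltas = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            deltas[i] = (int) (values[i] - min);
        }
        return new Encoded(min, deltas);
    }

    // Reverses encode by adding the minimum back to each delta.
    static long[] decode(Encoded encoded) {
        long[] values = new long[encoded.deltas.length];
        for (int i = 0; i < values.length; i++) {
            values[i] = encoded.min + encoded.deltas[i];
        }
        return values;
    }
}
```

For timestamps that cluster within a narrow window, each row then costs 4 bytes instead of 8 before any block compression is applied; a caller would fall back to storing full longs when `encode` returns `null`.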
Jill Osborne 1f69140623
Nested columns documentation (#12946)
Co-authored-by: Clint Wylie <cjwylie@gmail.com>
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
Co-authored-by: brian.le <brian.le@imply.io>
2022-09-06 14:42:18 -07:00
Vadim Ogievetsky 897689c03b
remove mentions of DruidQueryRel from docs (#13033)
* remove mentions of DruidQueryRel

* Update docs/querying/sql-translation.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

* Update docs/querying/sql-translation.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
2022-09-06 13:37:27 -07:00
Vadim Ogievetsky 2a039e7e6a
Add CTA and fix typo (#13009)
* Add CTA and fix typo

* resolve hostname better
2022-09-06 11:16:50 -07:00
Vadim Ogievetsky 2cf449386f
Web console: upgrade the console to use node 16 (#13017)
* upgrade the console to use node 16

* run npm audit fix
2022-09-06 11:15:23 -07:00
317brian d4233ef2a1
msq: add multi-stage-query docs (#12983)
* msq: add multi-stage-query docs

* add screenshots

add back theta sketches tutorial

change filename

fix filename

fix link

fix headings

* fixes

* fixes

* fix spelling issues and update spell file

* address feedback from karan

* add missing guardrail to known issues

* update blurb

* fix typo

* remove durable storage info

* update titles

* Restore en.json

* Update query view

* address comments from vad

* Update docs/multi-stage-query/msq-known-issues.md

finish sentence

* add apache license to docs

* add apache license to docs

Co-authored-by: Katya Macedo <katya.macedo@imply.io>
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
2022-09-06 23:06:09 +05:30
Didip Kerabat 66545a0f3d
Fix compiler error: The project was not built since its build path is incomplete. Cannot find the class file for org.slf4j.Logger. Fix the build path then try building this project (#13029)
Co-authored-by: Didip Kerabat <didip@apple.com>
2022-09-06 20:49:41 +05:30