Commit Graph

9008 Commits

Author SHA1 Message Date
Justin Borromeo 01b25ed112 Added time ordering to the scan benchmark 2019-02-04 14:36:18 -08:00
Justin Borromeo 12e51a2721 Added TimestampComparator tests 2019-02-04 12:02:13 -08:00
Justin Borromeo ad731a362b Change benchmark 2019-02-04 10:55:56 -08:00
Justin Borromeo 989bd2d50e Merge branch '6088-Create-Scan-Benchmark' into 6088-Time-Ordering-On-Scans-V2 2019-02-04 10:46:38 -08:00
Justin Borromeo 7b58471394 Licensing stuff 2019-02-02 03:48:18 -08:00
Justin Borromeo 79e8319383 Move ScanResultValue timestamp comparator to a separate class for testing 2019-02-01 18:22:58 -08:00
Justin Borromeo 7a6080f636 Stuff for time-ordered scan query 2019-02-01 18:00:58 -08:00
Justin Borromeo 26930f8d20 It runs. 2019-02-01 16:38:49 -08:00
Justin Borromeo dd4ec1ac9c Need to form queries 2019-02-01 15:12:17 -08:00
Justin Borromeo 6430ef8e1b lol (#6985) 2019-02-01 14:21:13 -08:00
Justin Borromeo dba6e492a0 Merge branch 'master' into 6088-Create-Scan-Benchmark 2019-02-01 14:13:39 -08:00
Justin Borromeo 10e57d5f9e Moved Scan Builder to Druids class and started on Scan Benchmark setup 2019-02-01 14:04:13 -08:00
Jihoon Son 7d4cc28730 Fix node path for building the unified console (#6981) 2019-02-01 13:57:17 -08:00
Clint Wylie 7a5827e12e bloom filter sql aggregator (#6950)
* adds sql aggregator for bloom filter, adds complex value serde for sql results

* fix tests

* checkstyle

* fix copy-paste
2019-02-01 13:54:46 -08:00
lxqfy e45f9ea5e9 Update metrics.md (#6976) 2019-02-01 13:40:44 -08:00
jorbay-au 852fe86ea2 Remove repeated word in indexing-service.md (#6983) 2019-02-01 13:38:22 -08:00
Clint Wylie 5c0fbbda1b use System.err and System.out to print exit messages on CliPeon (#6975)
* use System.err and System.out to print exit messages on CliPeon

* more

* not necessarily a stopping error...
2019-02-01 18:54:14 +08:00
Roman Leventov f7df5fedcc Add several missing inspectRuntimeShape() calls (#6893)
* Add several missing inspectRuntimeShape() calls

* Add lgK to runtime shapes
2019-01-31 20:04:26 -08:00
Gian Merlino 4e426327bb Some adjustments to config examples. (#6973)
* Some adjustments to config examples.

- Add ExitOnOutOfMemoryError to jvm.config examples. It was added a
pretty long time ago (8u92) and is helpful since it prevents zombie
processes from hanging around. (OOMEs tend to bork things)
- Disable Broker caching and enable it on Historicals in example
configs. This config tends to scale better since it enables the
Historicals to merge results rather than sending everything by-segment
to the Broker. Also switch to "caffeine" cache from "local".
- Increase concurrency a bit for Broker example config.
- Enable SQL in the example config, a baby step towards making SQL
more of a thing. (It's still off by default in the code.)
- Reduce memory use a bit for the quickstart configs.
- Add example Router configs, in case someone wants to use that. One
reason might be to get the fancy new console (#6923).

* Add example Router configs.

* Fix up router example properties.

* Add router to quickstart supervise conf.
2019-01-31 17:59:39 -08:00
Vadim Ogievetsky 7f1b19bfb1 Adding a Unified web console. (#6923)
* Adding new web console.

* fixed css

* fix form height

* fix typo

* do import custom react-table css

* added repo field so npm does not complain

* ask travis for node 10

* move indexing-service/src/main/resources/indexer_static into web-console

* fix resource names and paths

* add licenses

* fix exclude file

* add licenses to misc files and tidy up

* remove rebase marker

* fix link

* updated env variable name

* tidy up licenses and surface errors

* cleanup

* remove unused code, fix missing await

* TeamCity does not like the name aux

* add more links to tasks view

* rm pages

* update gitignore

* update readme to be accurate

* make clean script

* removed old console dependancy

* update Jetty routes

* add a comment for welcome files for coordinator

* do not show inital notifaction for now

* renamed overlord console back to console.html

* fix coordinator console

* rename coordinator-console.html to index.html
2019-01-31 17:26:41 -08:00
Furkan KAMACI 185a7d4fc5 Updated definition and added link for Zookeeper connection string. (#6961)
* Updated definition and added link for Zookeeper connection string.

* Conflicts are merged.
2019-01-31 10:14:42 -08:00
Gian Merlino 54735a5ad1 Kafka indexing: Remove experimental notice. (#6970) 2019-01-31 09:54:22 -08:00
Surekha 4c211ab2b4 update sys table docs (#6955)
* update sys table docs

* Capitalize SQL
2019-01-31 08:51:39 -08:00
Jihoon Son e56c598cc1 Fall back to the old coordinator API for checking segment handoff if new one is not supported (#6966) 2019-01-31 08:50:46 -08:00
David Glasser 9eaf8f5304 google-storage: retry GoogleTaskLogs inserts (#6918)
This is an extension of PR #5750 by @drcrallen which added retry to a variety of
GCS operations, but not to GoogleTaskLogs, which we have found to
occasionally fail in our cluster.

Also fixes a typo in a variable name and removes an unused private method
parameter.

Fixes #6912.
2019-01-31 01:21:35 -08:00
Furkan KAMACI 30ec608038 Fix mixed up segment ids at SelectBinaryFnTest.java (#6946) 2019-01-30 20:04:16 -08:00
Jonathan Wei 82137874ea Add master/data/query server concepts to docs/packaging (#6916)
* Add master/data/query server concepts to docs/packaging

* PR comments

* TOC and markdown fix

* Update image legend

* PR comment

* More PR comments
2019-01-30 19:41:07 -08:00
Jihoon Son d4fbbb8deb Support protocol configuration for S3 (#6954)
* Support protocol configuration for S3

* Add doc
2019-01-30 19:32:00 -08:00
Gian Merlino edee576a7a Add doc for druid.storage.useS3aSchema. (#6964) 2019-01-30 10:26:37 -08:00
Clint Wylie de810286cd fix bug with expression virtual column selectors backed by a single long column (#6957)
* fix issue with SingleLongInputCachingExpressionColumnValueSelector when sql compatible null handling enabled

* add test with doubles to show same behavior for floats/doubles that lack the optimization of longs

* simplify

* fix import
2019-01-30 10:13:07 -05:00
Jihoon Son c23c5ef4ef Print files with unapproved licenses in travis (#6947) 2019-01-29 11:35:22 -08:00
Egor Riashin 2803fda8b7 Added an allocation rate metric #6604 (#6710)
Addressing #6604
2019-01-29 20:16:35 +07:00
Clint Wylie a6d81c0d16 Adds bloom filter aggregator to 'druid-bloom-filters' extension (#6397)
* blooming aggs

* partially address review

* fix docs

* minor test refactor after rebase

* use copied bloomkfilter

* add ByteBuffer methods to BloomKFilter to allow agg to use in place, simplify some things, more tests

* add methods to BloomKFilter to get number of set bits, use in comparator, fixes

* more docs

* fix

* fix style

* simplify bloomfilter bytebuffer merge, change methods to allow passing buffer offsets

* oof, more fixes

* more sane docs example

* fix it

* do the right thing in the right place

* formatting

* fix

* avoid conflict

* typo fixes, faster comparator, docs for comparator behavior

* unused imports

* use buffer comparator instead of deserializing

* striped readwrite lock for buffer agg, null handling comparator, other review changes

* style fixes

* style

* remove sync for now

* oops

* consistency

* inspect runtime shape of selector instead of selector plus, static comparator, add inner exception on serde exception

* CardinalityBufferAggregator inspect selectors instead of selectorPluses

* fix style

* refactor away from using ColumnSelectorPlus and ColumnSelectorStrategyFactory to instead use specialized aggregators for each supported column type, other review comments

* adjustment

* fix teamcity error?

* rename nil aggs to empty, change empty agg constructor signature, add comments

* use stringutils base64 stuff to be chill with master

* add aggregate combiner, comment
2019-01-29 20:05:17 +07:00
Gian Merlino ac4c7e21a2 Enhancements to dsql. (#6929)
- CLI history, basic autocomplete through deadline.
- Include timeout in query context.
- Group CLI options into... groups.
2019-01-28 17:02:43 -08:00
Justin Borromeo 8d70ba69cf Fix broken link on select query doc page (#6933)
* Fixed broken link

* Typo fix
2019-01-28 17:02:21 -08:00
Gian Merlino ba33bdc497 Add exclusions to limit doubling up on jars. (#6927) 2019-01-28 11:06:30 -08:00
Clint Wylie af3cbc3687 add bloom filter druid expression (#6904)
* add "bloom_filter_test" druid expression to support bloom filters in ExpressionVirtualColumn and ExpressionDimFilter and sql expressions

* more docs

* use java.util.Base64, doc fixes
2019-01-28 08:41:45 -05:00
Benedict Jin 2b73644340 * Use `@SuppressWarnings("GuardedBy")` instead of `noinspection FieldAccessNotGuarded` comment (#6903)
* Remove `@GuardedBy("connectionLock")` from `connectionLock` itself

* Add FieldAccessNotGuarded into inspection profile and set the level to ERROR
2019-01-27 12:42:45 -08:00
Navin Kumar ae4dba7785 Fix Configuration options (#6884)
Change `druid.metadata.postgres.*` to `druid.metadata.postgres.ssl.*`
2019-01-27 12:35:27 -08:00
Gian Merlino 7c5a06bb85
More docs on data modeling. (#6899)
* More docs on data modeling.

* Try to fix formatting.

* Fix indentation.

* More details and adjustments after feedback.
2019-01-27 11:33:21 -08:00
Janek Lasocki-Biczysko 89f2475369 Move ingest/kafka/* metrics into a separate section on the metrics docs (#6895)
The `ingest/kafka/*` metrics were grouped together with metrics relevant
to RealtimeMetricsMonitor, whereas they should be in their own section.
2019-01-28 00:11:53 +08:00
Benedict Jin 72a571fbf7 For performance reasons, use `java.util.Base64` instead of Base64 in Apache Commons Codec and Guava (#6913)
* * Add few methods about base64 into StringUtils

* Use `java.util.Base64` instead of others

* Add org.apache.commons.codec.binary.Base64 & com.google.common.io.BaseEncoding into druid-forbidden-apis

* Rename encodeBase64String & decodeBase64String

* Update druid-forbidden-apis
2019-01-25 17:32:29 -08:00
Ankit Kothari 8492d94f59 Kill Hadoop MR task on kill of Hadoop ingestion task (#6828)
* KillTask from overlord UI now makes sure that it terminates the underlying MR job, thus saving unnecessary compute

Run in jobby is now split into 2
 1. submitAndGetHadoopJobId followed by 2. run
  submitAndGetHadoopJobId is responsible for submitting the job and returning the jobId as a string, run monitors this job for completion

JobHelper writes this jobId in the path provided by HadoopIndexTask which in turn is provided by the ForkingTaskRunner

HadoopIndexTask reads this path when kill task is clicked to get hte jobId and fire the kill command via the yarn api. This is taken care in the stopGracefully method which is called in SingleTaskBackgroundRunner. Have enabled `canRestore` method to return `true` for HadoopIndexTask in order for the stopGracefully method to be called

Hadoop*Job files have been changed to incorporate the changes to jobby

* Addressing PR comments

* Addressing PR comments - Fix taskDir

* Addressing PR comments - For changing the contract of Task.stopGracefully()
`SingleTaskBackgroundRunner` calls stopGracefully in stop() and then checks for canRestore condition to return the status of the task

* Addressing PR comments
 1. Formatting
 2. Removing `submitAndGetHadoopJobId` from `Jobby` and calling writeJobIdToFile in the job itself

* Addressing PR comments
 1. POM change. Moving hadoop dependency to indexing-hadoop

* Addressing PR comments
 1. stopGracefully now accepts TaskConfig as a param
     Handling isRestoreOnRestart in stopGracefully for `AppenderatorDriverRealtimeIndexTask, RealtimeIndexTask, SeekableStreamIndexTask`
     Changing tests to make TaskConfig param isRestoreOnRestart to true
2019-01-25 15:43:06 -08:00
Himanshu Pandey e1033bb412 Issue#6892- Replaced Math.random() with ThreadLocalRandom.current().nextDouble() (#6914)
*  Replacing Math.random() with ThreadLocalRandom.current().nextDouble()

* Added java.lang.Math#random() in forbidden-apis.txt

* Minor change in the message - druid-forbidden-apis.txt
2019-01-25 19:49:20 +08:00
Clint Wylie 66f64cd8bd fix long/float/double dimension filtering for columns with nulls (#6906)
* fix long,float, double dimension filtering when sql compatible null handling is enabled and the column has null values

* revert unintended change

* fix tests
2019-01-23 22:36:52 -08:00
Jihoon Son 3b020fd81b Improve doc for auto compaction (#6782)
* Improve doc for auto compaction

* address comments

* address comments

* address comments
2019-01-23 16:21:45 -08:00
Clint Wylie ffded61f5e
fix build (#6897) 2019-01-21 17:18:14 -08:00
Roman Leventov 8eae26fd4e Introduce SegmentId class (#6370)
* Introduce SegmentId class

* tmp

* Fix SelectQueryRunnerTest

* Fix indentation

* Fixes

* Remove Comparators.inverse() tests

* Refinements

* Fix tests

* Fix more tests

* Remove duplicate DataSegmentTest, fixes #6064

* SegmentDescriptor doc

* Fix SQLMetadataStorageUpdaterJobHandler

* Fix DataSegment deserialization for ignoring id

* Add comments

* More comments

* Address more comments

* Fix compilation

* Restore segment2 in SystemSchemaTest according to a comment

* Fix style

* fix testServerSegmentsTable

* Fix compilation

* Add comments about why SegmentId and SegmentIdWithShardSpec are separate classes

* Fix SystemSchemaTest

* Fix style

* Compare SegmentDescriptor with SegmentId in Javadoc and comments rather than with DataSegment

* Remove a link, see https://youtrack.jetbrains.com/issue/IDEA-205164

* Fix compilation
2019-01-21 11:11:10 -08:00
Clint Wylie 8ba33b2505 add 'init' lifecycle stage for finer control over startup and shutdown (#6864)
* add Lifecycle.Stage.INIT, put log shutter downer in init stage, tests, rad startup banner

* log cleanup

* log changes

* add task-master lifecycle to module lifecycle to gracefully stop task-master stuff

* fix it the right way

* remove announce spam

* unused import

* one more log

* updated comments

* wrap leadership lifecycle stop to prevent exceptions from wrecking rest of task master stop

* add precondition check
2019-01-21 09:01:36 -08:00
Justin Borromeo 86e171a234 Doc change and commands tested command on v5 and v8 (#6886) 2019-01-18 15:13:11 -08:00