Commit Graph

9182 Commits

Author SHA1 Message Date
Justin Borromeo 8a6bb1127c Fix docs and flipped boolean in ScanQueryLimitRowIterator 2019-03-25 17:17:41 -07:00
Justin Borromeo 35692680fc Fix bug messing up count of rows 2019-03-25 16:15:49 -07:00
Justin Borromeo 219af478c8 Fix bug in numRowsScanned 2019-03-25 15:57:55 -07:00
Justin Borromeo da4fc66403 Check type of segment spec before using for time ordering 2019-03-25 15:19:45 -07:00
Justin Borromeo b822fc73df Revert "Merge branch '6088-Time-Ordering-On-Scans-N-Way-Merge' of github.com:justinborromeo/incubator-druid into 6088-Time-Ordering-On-Scans-N-Way-Merge"
This reverts commit 57033f36df, reversing
changes made to 8f01d8dd16.
2019-03-25 13:19:02 -07:00
Justin Borromeo 57033f36df Merge branch '6088-Time-Ordering-On-Scans-N-Way-Merge' of github.com:justinborromeo/incubator-druid into 6088-Time-Ordering-On-Scans-N-Way-Merge 2019-03-25 13:13:52 -07:00
Justin Borromeo 8f01d8dd16 Revert "Fixed failing tests -> allow usage of all types of segment spec"
This reverts commit ec470288c7.
2019-03-25 13:13:32 -07:00
Justin Borromeo ec470288c7 Fixed failing tests -> allow usage of all types of segment spec 2019-03-25 13:12:58 -07:00
Justin Borromeo 86d9730fc9 Fixed failing tests -> allow usage of all types of segment spec 2019-03-25 11:01:35 -07:00
Justin Borromeo 8b3b6b51ed Nit comment 2019-03-22 16:01:56 -07:00
Justin Borromeo a87d02127c Fix checkstyle and test 2019-03-22 15:54:42 -07:00
Justin Borromeo 62dcedacde More comments 2019-03-22 15:30:41 -07:00
Justin Borromeo 1b46b58aec Added a bit of docs 2019-03-22 15:19:52 -07:00
Justin Borromeo 49472162b7 Rename segment limit -> segment partitions limit 2019-03-22 10:27:41 -07:00
Justin Borromeo 43d490cc3a Optimized n-way merge strategy 2019-03-21 13:16:58 -07:00
Justin Borromeo 42f5246b8d Smarter limiting for pQueue method 2019-03-20 18:25:31 -07:00
Justin Borromeo 4823dab895 Finish rename 2019-03-20 16:05:53 -07:00
Justin Borromeo 2528a56142 Renaming 2019-03-18 14:00:50 -07:00
Justin Borromeo 7bfa77d3c1 Merge branch 'Update-Query-Interrupted-Exception' into 6088-Time-Ordering-On-Scans-N-Way-Merge 2019-03-12 16:57:45 -07:00
Justin Borromeo 7e49d47391 Added error message for UOE 2019-03-12 16:51:25 -07:00
Justin Borromeo a032c46ee0 Updated error message 2019-03-12 16:47:17 -07:00
Justin Borromeo 57b5682654 Fixed tests 2019-03-12 12:44:02 -07:00
Justin Borromeo 45e95bb1f4 Optimization 2019-03-12 11:09:08 -07:00
Venkatraman P 3118160387 Adding a tutorial in doc for using Kerberized Hadoop as deep storage. (#6863)
* Adding a tutorial in doc for using Kerberized Hadoop as deep storage.

* Update tutorial-kerberos-hadoop.md

* Update tutorial-kerberos-hadoop.md

* Update tutorial-kerberos-hadoop.md

* Update tutorial-kerberos-hadoop.md

* Update tutorial-kerberos-hadoop.md

* Update tutorial-kerberos-hadoop.md

* Update tutorial-kerberos-hadoop.md

* Update tutorial-kerberos-hadoop.md

* Update tutorial-kerberos-hadoop.md

* Update tutorial-kerberos-hadoop.md

* Update tutorial-kerberos-hadoop.md

* Update tutorial-kerberos-hadoop.md

* Update tutorial-kerberos-hadoop.md

* Update tutorial-kerberos-hadoop.md

Fixed - to ~ in Apache License section.

* Update tutorial-kerberos-hadoop.md

* Update tutorial-kerberos-hadoop.md
2019-03-11 11:39:15 -07:00
Clint Wylie d7ba19d477 sql, filters, and virtual columns (#6902)
* refactor sql planning to re-use expression virtual columns when possible when constructing a DruidQuery, allowing virtual columns to be defined in filter expressions, and making resulting native druid queries more concise. also minor refactor of built-in sql aggregators to maximize code re-use

* fix it

* fix it in the right place

* fixup for base64 stuff

* fixup tests

* fix merge conflict on import order

* fixup

* fix imports

* fix tests

* review comments

* refactor

* re-arrange

* better javadoc

* fixup merge

* fixup tests

* fix accidental changes
2019-03-11 11:37:58 -07:00
Jonathan Wei e1d8c17746 Add commit ID milestone helper script (#7100)
* Add commit ID milestone helper script

* Filter on merged/closed in API call
2019-03-11 11:36:07 -07:00
Gian Merlino 4290e5ae7a Cache selectors in QueryableIndexColumnSelectorFactory. (#7216)
For selectors with internal caches (like SingleScanTimeDimensionSelector,
SingleLongInputCachingExpressionColumnValueSelector, etc) we can get a perf
boost and memory usage decrease by sharing selectors.
2019-03-11 11:33:01 -07:00
Samarth Jain 8804bd0dc1 Remove unnecessary check for contains() in LoadRule (#7073)
See https://github.com/apache/incubator-druid/issues/7072
2019-03-11 13:52:46 -03:00
Jonathan Wei 94463b5778 Add missing redirects and fix broken links (#7213)
* Add missing redirects

* Fix zookeeper redirect

* Fix broken links
2019-03-09 15:16:23 -08:00
Clint Wylie 5cc171419c move jetty module to Lifecycle.Stage.LAST to allow graceful shutdown to work with lookups and stuff, put http-client on lifecycle modules lifecycle (#7215) 2019-03-09 15:14:09 -08:00
Jihoon Son 9bebf113ba Fix race in historical when loading segments in parallel (#7203)
* Fix race in historical when loading segments in parallel

* revert unnecessary change

* remove synchronized

* add reference counting locking

* fix build

* fix comment
2019-03-08 17:54:05 -08:00
jorbay-au 62f0de9b89 Remove outdated instruction for rule updates (#7205) 2019-03-08 16:42:08 -08:00
Surekha 6991735f73 Fix and add sys IT tests to travis script (#7208)
* Add sys IT tests to travis script

* minor fixes

* Modify the test queries

* modify query
2019-03-08 16:40:59 -08:00
Clint Wylie a44df6522c rename maintenance mode to decommission (#7154)
* rename maintenance mode to decommission

* review changes

* missed one

* fix straggler, add doc about decommissioning stalling if no active servers

* fix missed typo, docs

* refine docs

* doc changes, replace generals

* add explicit comment to mention suppressed stats for balanceTier

* rename decommissioningVelocity to decommissioningMaxSegmentsToMovePercent and update docs

* fix precondition check

* decommissioningMaxPercentOfMaxSegmentsToMove

* fix test

* fix test

* fixes
2019-03-08 16:33:51 -08:00
David Glasser de55905a5f integration-tests: make ITParallelIndexTest still work in parallel (#7211)
* integration-tests: make ITParallelIndexTest still work in parallel

Follow-up to #7181, which made the default behavior for index_parallel tasks
non-parallel.

* Validate that parallel index subtasks were run
2019-03-08 16:17:52 -08:00
Justin Borromeo cce917ab84 Checkstyle fix 2019-03-08 14:11:07 -08:00
Justin Borromeo 73f4038068 Applied Jon's recommended changes 2019-03-07 18:40:00 -08:00
Justin Borromeo fb966def83 Sorry, checkstyle 2019-03-07 11:03:01 -08:00
Charles Allen 3ed250787d Densify swapped hll buffer (#6865)
* Densify swapped hll buffer

* Make test loop limit pre-increment

* Reformat

* Fix test comments
2019-03-06 14:50:04 -08:00
Justin Borromeo 6dc53b311c Improved test and appeased TeamCity 2019-03-06 10:34:13 -08:00
Jihoon Son e48a9c138e Reduce default max # of subTasks to 1 for native parallel task (#7181)
* Reduce # of max subTasks to 2

* fix typo and add more doc

* add more doc and link

* change default and add warning

* fix doc

* add test

* fix it test
2019-03-05 22:06:36 -08:00
Jonathan Wei 9183e32876 Add more approximate algorithm docs (#7195) 2019-03-05 16:44:02 -08:00
Roman Leventov 37cbad79b1 Adjust issue templates (#7188)
* Adjust issue templates

* typo

* bug -> problem
2019-03-05 16:06:40 -08:00
Xue Yu 65118277a3 Support sin, cos, and other trigonometric functions in SQL (#7182)
* Support trigonometric functions in SQL

* Address feedback
2019-03-04 19:18:22 -08:00
Jonathan Wei 5486c2abf8 Update LICENSE and NOTICE files (#7026)
* Update LICENSE and NOTICE files

* Update react-table version
2019-03-04 18:45:22 -08:00
Justin Borromeo 35c96d3557 Checkstyle fix 2019-03-04 16:00:44 -08:00
Justin Borromeo 2d1978d571 Merge branch 'master' into 6088-Time-Ordering-On-Scans-N-Way-Merge 2019-03-04 15:24:49 -08:00
Clint Wylie 3398d3982f fix intellij UnusedInspectionsScope.xml (#7158) 2019-03-04 14:56:41 -08:00
Roman Leventov 10c9f6d708 Fix and document concurrency of EventReceiverFirehose and TimedShutoffFirehose; Refine concurrency specification of Firehose (#7038)
#### `EventReceiverFirehoseFactory`
Fixed several concurrency bugs in `EventReceiverFirehoseFactory`:
 - Race condition over putting an entry into `producerSequences` in `checkProducerSequence()`.
 - `Stopwatch` used to measure time across threads, but it's a non-thread-safe class.
 - Use `System.nanoTime()` instead of `System.currentTimeMillis()` because the latter is [not suitable](https://stackoverflow.com/a/351571/648955) for measuring time intervals.
 - `close()` was not synchronized but could be called from multiple threads concurrently.

Removed unnecessary `readLock` (protecting `hasMore()` and `nextRow()` which are always called from a single thread). Removed unnecessary `volatile` modifiers.

Documented threading model and concurrent control flow of `EventReceiverFirehose` instances.

**Important:** please read the updated Javadoc for `EventReceiverFirehose.addAll()`. It allows events from different requests (batches) to be interleaved in the buffer. Is this OK?

#### `TimedShutoffFirehoseFactory`
- Fixed a race condition that was possible because `close()` was not properly synchronized.

Documented threading model and concurrent control flow of `TimedShutoffFirehose` instances.

#### `Firehose`

Refined the concurrency contract of `Firehose` based on the `EventReceiverFirehose` implementation. Importantly, it now states that `close()` doesn't affect `hasMore()` and `nextRow()` and could be called concurrently with them. In other words, it specifies that `close()` is for the "row supply" side rather than the "row consume" side. However, I didn't check that other `Firehose` implementations adhere to this contract.

<hr>

This issue is the result of reviewing `EventReceiverFirehose` and `TimedShutoffFirehose` using [this checklist](https://medium.com/@leventov/code-review-checklist-java-concurrency-49398c326154).
2019-03-04 18:50:03 -03:00
David Glasser 7bf1ee4dc0 ITIndexerTest: validate new data source after reindex (#7171)
Previously, the test validated that the data source that we ingested from still
had the same query responses that it did before the second ingestion. This is
less useful than validating queries against the newly created data source.

The new queries file differs from the old one in that its maxTime is earlier due
to the interval selected by the reindex, and in that it does not query for the
dropped metric "count".
2019-03-04 11:05:40 -08:00