8174 Commits

Author SHA1 Message Date
Jonathan Wei 9d91ffd039 Use ubuntu image with pre-installed java 8 for IT docker (#4970)
* Use OpenJDK8 instead of Oracle for IT docker

* install wget

* Use different docker image

* Revert "Use different docker image"

This reverts commit 5786c03cb4.

* Revert "install wget"

This reverts commit 8d1d5ec681.

* Revert "Use OpenJDK8 instead of Oracle for IT docker"

This reverts commit 55ea163bb5.

* Use prebuilt java8 image

* Add comment on docker image
2017-10-18 09:33:27 -05:00
Jihoon Son 52d7f74226 Add streaming aggregation as the last step of ConcurrentGrouper if data are spilled (#4704)
* Add streaming grouper

* Fix doc

* Use a single dictionary while combining

* Revert GroupByBenchmark

* Removed unused code

* More cleanup

* Remove unused config

* Fix some typos and bugs

* Refactor Groupers.mergeIterators()

* Add comments for combining tree

* Refactor buildCombineTree

* Refactor iterator

* Add ParallelCombiner

* Add ParallelCombinerTest

* Handle InterruptedException

* use AbstractPrioritizedCallable

* Address comments

* [maven-release-plugin] prepare release druid-0.11.0-sg

* [maven-release-plugin] prepare for next development iteration

* Address comments

* Revert "[maven-release-plugin] prepare for next development iteration"

This reverts commit 5c6b31e488.

* Revert "[maven-release-plugin] prepare release druid-0.11.0-sg"

This reverts commit 0f5c3a8b82.

* Fix build failure

* Change list to array

* rename sortableIds

* Address comments

* change to foreach loop

* Fix comment

* Revert keyEquals()

* Remove loop

* Address comments

* Fix build fail

* Address comments

* Remove unused imports

* Fix method name

* Split intermediate and leaf combine degrees

* Add comments to StreamingMergeSortedGrouper

* Add more comments and fix overflow

* Address comments

* ConcurrentGrouperTest cleanup

* add thread number configuration for parallel combining

* improve doc

* address comments

* fix build
2017-10-17 23:24:08 -07:00
Gian Merlino 43051829f2 Regression test for #4208. (#4968) 2017-10-17 15:54:00 -05:00
Slim af2bc5f814 Make float default representation for DoubleSum/Min/Max aggregators (#4944)
* Introduce System wide property to select how to store double.
Set the default to store as float

Change-Id: Id85cca04ed0e7ecbce78624168c586dcc2adafaa

* fix tests

Change-Id: Ib42db724b8a8f032d204b58c366caaeabdd0d939

* Change the property name

Change-Id: I3ed69f79fc56e3735bc8f3a097f52a9f932b4734

* add tests and make default distribution store doubles as 64bits

Change-Id: I237b07829117ac61e247a6124423b03992f550f2

* adding mvn argument to parallel-test profile

Change-Id: Iae5d1328f901c4876b133894fa37e0d9a4162b05

* move property name and helper function to io.druid.segment.column.Column

Change-Id: I62ea903d332515de2b7ca45c02587a1b015cb065

* fix docs and clean style

Change-Id: I726abb8f52d25dc9dc62ad98814c5feda5e4d065

* fix docs

Change-Id: If10f4cf1e51a58285a301af4107ea17fe5e09b6d
2017-10-16 17:17:22 -07:00
Himanshu a7e802c9d4 greater-than/less-than/equal-to havingSpec to call AggregatorFactory.finalizeComputation(..) (#4883)
* greater-than/less-than/equal-to havingSpec to call AggregatorFactory.finalizeComputation(..)

* fix the unit test and expect having to work on hyperUnique agg

* test fix

* fix style errors
2017-10-16 12:02:30 -07:00
Roman Leventov dc7cb117a1 Refactor ColumnSelectorFactory; Rely on ColumnValueSelector's polymorphism (#4886)
* Refactor ColumnSelectorFactory; Rely on ColumnValueSelector's polymorphism

* Fix MapVirtualColumn.makeColumnValueSelector()

* Minor fixes

* Fix IndexGeneratorCombinerTest

* DimensionSelector to return zeros when treated as numeric ColumnValueSelector

* Fix IncrementalIndexTest

* Fix IncrementalIndex.makeColumnSelectorFactory()

* Optimize MapBasedRow.getMetric()

* Fix VarianceAggregatorTest

* Simplify IncrementalIndex.makeColumnSelectorFactory()

* Address comments

* More comments

* Test
2017-10-13 21:44:17 -05:00
Jihoon Son 8d9902831e Refactoring PrefetchableTextFilesFirehoseFactory (#4836)
* Refactoring prefetchable firehose

* Fix to read cache when prefetch is disabled

* More tests

* Cleanup codes

* Add Fetcher

* Fix test failure

* Count file size

* Fix test

* rename generic parameter

* address comments

* address comments

* reuse buffer

* move Execs to java-util

* use execs

* Fix build
2017-10-13 21:39:28 -05:00
Gian Merlino f51f346e36 SQL: Fix POWER doc, add test. (#4953) 2017-10-13 14:38:15 -07:00
Gian Merlino 5cfc7f9ef7 Fix formatting of SQL TRIM docs. (#4951) 2017-10-13 14:38:06 -07:00
Jihoon Son 675c6c00dd Add checkstyle and intellij rule to prohibit unnecessary qualifiers in interfaces (#4958)
* add checkstyle and intellij rule

* fix tc fail
2017-10-13 07:56:19 -07:00
Atul Mohan c07678b143 Synchronization of lookups during startup of druid processes (#4758)
* Changes for lookup synchronization

* Refactor of Lookup classes

* Minor refactors and doc update

* Change coordinator instance to be retrieved by DruidLeaderClient

* Wait before thread shutdown

* Make disablelookups flag true by default

* Update docs

* Rename flag

* Move executorservice shutdown to finally block

* Update LookupConfig

* Refactoring and doc changes

* Remove lookup config constructor

* Revert Lookupconfig constructor changes

* Add tests to LookupConfig

* Make executorservice local

* Update LRM

* Move ListeningScheduledExecutorService to ExecutorCompletionService

* Move exception to outer block

* Remove check to see future is done

* Remove unnecessary assignment

* Add logging
2017-10-12 21:22:24 -05:00
Gian Merlino 928b083a7a JSON: Fix incorrect translation of null to "null". (#4939) 2017-10-12 15:53:40 -07:00
Gian Merlino 57a4038379 SQL: Fix CASE-filtered aggregations with GROUP BY. (#4943) 2017-10-12 15:40:43 -07:00
Gian Merlino 32f36beaae QueryableIndexStorageAdapter: Lift column cache to Cursor sequence. (#4950)
* QueryableIndexStorageAdapter: Lift column cache to Cursor sequence.

This is where it was before #4710, when it was moved to the individual
Cursors, leading to higher than expected memory usage. It could be
extreme for finer query granularities like "second".

* Comment.
2017-10-12 16:44:33 -05:00
Jihoon Son d95915f8d2 Implement get methods for PrefetchableFirehose (#4948) 2017-10-12 16:14:45 +09:00
Jihoon Son dfa9cdc982 Prioritized locking (#4550)
* Implementation of prioritized locking

* Fix build failure

* Fix tc fail

* Fix typos

* Fix IndexTaskTest

* Addressed comments

* Fix test

* Fix spacing

* Fix build error

* Fix build error

* Add lock status

* Cleanup suspicious method

* Add nullables

* add doInCriticalSection to TaskLockBox and revert return type of task actions

* fix build

* refactor CriticalAction

* make replaceLock transactional

* fix formatting

* fix javadoc

* fix build
2017-10-11 23:16:31 -07:00
Roman Leventov 7a9940d624 Add /readiness to HistoricalResource (#4916)
* Add /loadStatusCode to HistoricalResource

* Address comments

* Fixes
2017-10-11 20:35:52 -07:00
Jihoon Son 56fb11ce0b Lazy initialization for JavaScript functions (#4871)
* Lazy initialization of JavaScript functions

* Fix test failure

* Fix thread-safety and postpone js conf check

* Fix test fail

* Fix test

* Fix KafkaIndexTaskTest

* Move config check
2017-10-10 21:52:42 -07:00
hzy001 31c80024b6 Add missed links of commands (#4873)
Signed-off-by: Hao Ziyu <haoziyu@qiyi.com>
2017-10-11 10:13:44 +09:00
Jonathan Wei 18635a19b3 Remove unused limitFn in GroupByQuery (#4935)
* Remove unused limitFn in GroupByQuery

* Remove unused limitFn creation logic
2017-10-10 15:56:30 -07:00
Roman Leventov e725ff4146 1-based counts in ZkCoordinator (#4917) 2017-10-10 13:00:51 -07:00
Gian Merlino b20e3038b6 SQL: Upgrade to Calcite 1.14.0, some refactoring of internals. (#4889)
* SQL: Upgrade to Calcite 1.14.0, some refactoring of internals.

This brings benefits:
- Ability to do GROUP BY and ORDER BY with ordinals.
- Ability to support IN filters beyond 19 elements (fixes #4203).

Some refactoring of druid-sql internals:
- Builtin aggregators and operators are implemented as SqlAggregators
  and SqlOperatorConversions rather than being special cases. This simplifies
  the Expressions and GroupByRules code, which were becoming complex.
- SqlAggregator implementations are no longer responsible for filtering.

Added new functions:
- Expressions: strpos.
- SQL: TRUNCATE, TRUNC, LENGTH, CHAR_LENGTH, STRLEN, STRPOS, SUBSTR,
  and DATE_TRUNC.

* Add missing @Override annotation.

* Adjustments for forbidden APIs.

* Adjustments for forbidden APIs.

* Disable GROUP BY alias.

* Doc reword.
2017-10-10 12:44:05 -07:00
Gian Merlino 4e1d0f49d8 Docs: Fix link to broker configuration. (#4934) 2017-10-10 11:18:46 -07:00
Kevin Conaway 1bc4b71a34 Reduce Chance of Duplicates in EventReceiverFireHose (#4903)
* Add ability to optionally specify a sequence identifier to reduce the possibility of duplicate events entering the event receiver firehose

* Add ability to optionally specify a sequence identifier to reduce the possibility of duplicate events entering the event receiver firehose

* Add a hard coded limit to the maximum number of possible producer IDs to prevent a malicious (or uninformed) client from overflowing the heap
2017-10-10 11:17:17 -07:00
chunghochen 0614b92df1 adding new post aggregators for test statistics to druid-stats extension (#4532)
* adding new post aggregators of test stats to druid-stats extension

* changes to address code review comments

* fix checkstyle violations using druid_intellij_formatting.xml after merge upstream/master

* add @Override annotation per CI log

* make changes per review comments/discussions

* remove some blocks per review comments
2017-10-09 23:43:27 -07:00
Parag Jain 7cc18226cd add more tls configs to enable/disable specific cipher suites and protocols (#4902)
* add more tls configs to enable/disable specific cipher suites and protocols

* fix doc, allow empty list
2017-10-09 13:53:12 -07:00
Gian Merlino 797b54d283 DruidLeaderClient: Throw IOException on retryable errors. (#4913)
* DruidLeaderClient: Throw IOException on retryable errors.

Fixes #4911.

* Adjustments.
2017-10-06 15:12:09 -05:00
Parag Jain 535c034c06 assume scheme to be http if not present (#4912) 2017-10-06 14:50:48 -05:00
Himanshu 0e856ee806 add configs to enable fast request failure on broker and historical (#4540)
* add configs to enable fast request failure on broker

* address review comments

* fix styling error

* fix style error

* have enableRequestLimit config instead of having user specify max limit

* add comment

* fix style error

* add UT for LimitRequestsFilter

* address review comments

* fix test

* make LimitRequestsFilterTest more robust

* fix JettyQosTest
2017-10-06 14:45:13 -05:00
Parag Jain ef67915d9c prevent unnecessary exception (#4905) 2017-10-06 13:35:45 -05:00
Guillaume Balaine 35944d24ae Fix JdbcCacheGenerator, null values shouldn't be allowed (#4881)
* Fix JdbcCacheGenerator, null values shouldn't be allowed

* Add a test case for null values
2017-10-06 09:31:48 -07:00
praveev 4ff12e4394 Hadoop indexing: Fix NPE when intervals not provided (#4686)
* Fix #4647

* NPE protect bucketInterval as well

* Add test to verify timezone as well

* Also handle case when intervals are already present

* Fix checkstyle error

* Use factory method instead for Datetime

* Use Intervals factory method
2017-10-05 22:46:07 -07:00
Akash Dwivedi 716a5ec1a8 Add identity to DefaultSearchQueryMetrics and DefaultSelectQueryMetrics. (#4906) 2017-10-04 20:28:23 -05:00
Akash Dwivedi 2ee32399ff granularity method in QueryMetrics. (#4570)
* granularity method in QueryMetrics.

PR to emit granularity dimension for timeseries, search, groupBy,
select and topN queries.

* QueryMetricsFactory classes for search and select queries.

* Empty implementation for Granularity() method.

* Review comment changes.

* Remove unused import.

* empty query() method.

* checkstyle fix.

* Import fix.
2017-10-04 09:42:52 -07:00
Jonathan Wei 07aa405a6f Fix PreResponseAuthorizationCheckFilter HTTP error masking (#4900)
* Fix PreResponseAuthorizationCheckFilter HTTP error masking

* Add remote addr and host to missing auth check log message
2017-10-03 16:58:57 -05:00
Alexander Saydakov bba96f59f8 added missing synchronized keyword (#4894)
* added missing synchronized keyword

* added missing synchronized keyword
2017-10-03 12:16:54 -05:00
Gian Merlino c19cd23e94 RTR: Demote chatty log message. (#4895)
"No worker selection strategy set." would get logged any time tryAssignTask runs
in the default configuration, which is often. It also doesn't provide much value.
2017-10-03 08:16:32 -07:00
Jihoon Son e6eabac385 Implement replaceInput and add tpch dataset (#4848) 2017-10-03 08:00:59 -07:00
Roman Leventov 3f1009aaa1 Make Overlord auto-scaling and provisioning extensible (#4730)
* Make AutoScaler, ProvisioningStrategy and BaseWorkerBehaviorConfig extension points; More logging in PendingTaskBasedWorkerProvisioningStrategy

* Address comments and fix a bug

* Extract method

* debug logging

* Rename BaseWorkerBehaviorConfig to WorkerBehaviorConfig and WorkerBehaviorConfig to DefaultWorkerBehaviorConfig

* Fixes
2017-10-02 20:12:23 -05:00
QiuMM 6f91d9ca1e change WorkerSelectStrategy's defaultImpl from FillCapacityWorkerSelectStrategy to EqualDistributionWorkerSelectStrategy (#4777) 2017-10-02 16:52:41 -07:00
Jonathan Wei 5e60ccade1 Add context map to AuthenticationResult (#4870) 2017-10-02 17:08:14 -05:00
Jonathan Wei 5fbec5b435 Fix limit push down comparator bug (#4868) 2017-10-02 11:44:23 -07:00
Jonathan Wei 9deab26d8b Fix auth check in InventoryViewUtils (#4869) 2017-10-02 11:38:45 -07:00
Niketh Sabbineni 3e9391433d Coord resource throws NPE when segments are requested (#4759) 2017-10-02 10:13:27 -07:00
Jihoon Son ee7eaccbab Better logging for SegmentAllocateAction (#4884)
* Better logging for SegmentAllocateAction

* Split methods
2017-10-02 09:29:21 -07:00
Gian Merlino 1f2074c247 Bump versions in master to 0.11.1-SNAPSHOT. (#4878)
* Bump versions in master to 0.11.1-SNAPSHOT.

* Missed a few.
2017-09-28 17:09:51 -05:00
Goh Wei Xiang 26fd2b3a8e Priority on loading for primary replica (#4757)
* Priority on loading for primary replica

* Simplicity fixes

* Fix on skipping drop for quick return.

* change to debug logging for no replicants.

* Fix on filter logic

* swapping if-else

* Fix on wrong "hasTier" logic

* Refactoring of LoadRule

* Rename createPredicate to createLoadQueueSizeLimitingPredicate

* Rename getHolderList to getFilteredHolders

* remove varargs

* extract out currentReplicantsInTier

* rename holders to holdersInTier

* don't do temporary removal of tier.

* rename primaryTier to tierToSkip

* change LinkedList to ArrayList

* Change MinMaxPriorityQueue in DruidCluster to TreeSet.

* Adding some comments.

* Modify log messages in light of predicates.

* Add in-method comments

* Don't create new Object2IntOpenHashMap for each run() call.

* Cache result from strategy call in the primary assignment to be reused during the same run.

* Spelling mistake

* Cleaning up javadoc.

* refactor out loading in progress check.

* Removed redundant comment.

* Removed forbidden API

* Correct non-forbidden API.

* Precision in variable type for NavigableSet.

* Obsolete comment.

* Clarity in method call and moving retrieval of ServerHolder into method call.

* Comment on mutability of CoordinatorStats.

* Added auxiliary fixture for dropping.
2017-09-28 13:02:05 -07:00
Gian Merlino a19f22b5bb Add identity to query metrics, logs. (#4862)
* Add identity to query metrics, logs.

Also fix a bug where unauthorized requests would not emit any logs or metrics,
and instead would log a "Tried to emit logs and metrics twice" warning.

Also rename QueryResource's "getServer" to "cancelQuery", because that's what
it does.

* Do not emit identity by default.
2017-09-28 11:45:23 -07:00
Gian Merlino fbd4cd633b SQL: Delay query translation until the end of planning. (#4846)
* SQL: Delay query translation until the end of planning.

This fixes a bug in which input rels to nested queries could get swapped
out by the optimizer, leading to incorrect nested query planning.

This also, I hope, makes the query translation code easier to understand. At
least for me, the PartialDruidQuery -> DruidQuery -> Query chain is easier
to understand than the previous-existing rule spaghetti.

* Make test more consistent.

* Fix test.
2017-09-28 11:43:20 -07:00
Himanshu f69c9280c4 remove ServerConfig from DruidNode as all information needs to be present in DruidNode serialized form (#4858)
* remove ServerConfig from DruidNode as all information needs to be present in DruidNode serialized form

* sanitize output of /druid/coordinator/v1/cluster endpoint
2017-09-28 10:40:59 -05:00