8829 Commits

Author SHA1 Message Date
Clint Wylie ac194cc082 Coordinator fix exception caused by additional logging (#5988)
* fix explosion in curator load queue peon caused by additional logging, as well as annoying chatty log

* remove log message
2018-07-11 16:13:32 -07:00
Caroline1000 153eb26262 fix link to query-context in broker config doc (#5995) 2018-07-11 15:57:08 -07:00
Gian Merlino 04ea3c9f8c Update license headers. (#5976)
* Update license headers.

For compliance with http://www.apache.org/legal/src-headers.html.

* More license adjustments.

* Fix mistakenly edited package line.
2018-07-11 09:55:18 -07:00
Gian Merlino 948e73da77 Extend various test timeouts. (#5978)
False failures on Travis due to spurious timeout (in turn due to noisy
neighbors) is a bigger problem than legitimate failures taking too long
to time out. So it makes sense to extend timeouts.
2018-07-10 13:02:14 -07:00
Benedict Jin b3021ec802 Fix bug in SegmentAnalyzer.analyzeComplexColumn() #5939 (#5954) 2018-07-09 15:36:16 -07:00
Surekha 441c9819d9 Support limit for timeseries query (#5894) (#5931)
* Support limit for timeseries query (#5894)

* Fix tests

* Address PR comments

* Try to fix teamcity inspection checks

* Remove unused method from VirtualColumns

* Remove unused import statement
2018-07-09 08:58:42 -07:00
Jihoon Son d1d9358274 Increase timeout for BlockingPoolTest (#5959) 2018-07-06 16:34:53 -07:00
Caroline1000 b3976050ad add definition of balancerComputeThreads (#5865) 2018-07-05 09:54:36 -07:00
Caroline1000 ee4a5aafb0 add config values for GCS deep storage (#5875)
* add config values for GCS deep storage

* fix config values for GCS deep storage
2018-07-05 09:53:41 -07:00
Dylan Wylie 10642ef9ca Fix filtered request logging docs (#5924)
- Setting druid.request.logging.delegate has no effect. 
- The provider is injected based on a type parameter & this looks to be scoped to delegate for filtered loggers
2018-07-05 09:51:10 -07:00
Gian Merlino 24c20b4734 Forbid slashes in datasource names. (#5937)
They are bad because datasources are used as paths on filesystems,
and slashes invariably make things get stored improperly.
2018-07-05 09:49:16 -07:00
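The rule introduced by 24c20b4734 can be sketched as a small validation step. This is an illustration in Python, not the patch itself (Druid is written in Java); the function name `validate_datasource_name` is hypothetical. It rejects any name containing a slash, because such a name would be interpreted as nested directories once it is used as a filesystem path.

```python
def validate_datasource_name(name: str) -> str:
    """Return the name unchanged, or raise ValueError if it contains '/'.

    Hypothetical helper for illustration only; not Druid's actual code.
    """
    if "/" in name:
        # A slash would split the name into several path components
        # when the datasource name is used as a directory name.
        raise ValueError(f"dataSource name must not contain '/': {name!r}")
    return name
```

For example, `validate_datasource_name("wikipedia")` returns the name, while `validate_datasource_name("a/b")` raises `ValueError`.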
Clint Wylie aa4987b871 change default compaction task target size from 800MB to 400MB to fall within range of what docs recommend for segment sizing (#5930) 2018-07-05 00:12:31 -07:00
Surekha 9bece8ce1e Prevent KafkaSupervisor NPE in generateSequenceName (#5900) (#5902)
* Prevent KafkaSupervisor NPE in checkPendingCompletionTasks (#5900)

* throw IAE in generateSequenceName if groupId not found in taskGroups
* add null check in checkPendingCompletionTasks

* Add warn log in checkPendingCompletionTasks

* Address PR comments

Replace warn with error log

* Address PR comments

* change signature of generateSequenceName to take a TaskGroup object instead of int

* Address comments

* Remove unnecessary method from KafkaSupervisorTest
2018-07-04 23:45:42 -07:00
Jihoon Son 4cd14e8158 Proper handling of the exceptions from auto persisting in AppenderatorImpl.add() (#5932) 2018-07-04 23:42:41 -07:00
Clint Wylie 39371b0ff8 More coordinator logging to help give context to load queue peon log messages (#5929)
* more coordinator logging to help give context to load queue peon log messages

* fix style

* more chill load queue peon log messages
2018-07-04 23:40:25 -07:00
Clint Wylie 0a472d3fa0 coordinator slight optimize load rule to skip drop if numToDrop is 0 (#5928) 2018-07-03 17:56:11 -07:00
Clint Wylie d5a3871864 Coordinator fix balance to try to move max segments instead of up to max segments (#5927)
* fix move to try to move max segments instead of "up to" max segments

* fix

* fix oops
2018-07-03 17:06:38 -07:00
Jihoon Son 1ccabab98e Fix the broken Appenderator contract in KafkaIndexTask (#5905)
* Fix broken Appenderator contract in KafkaIndexTask

* fix build

* add publishFuture

* reuse sequenceToUse if possible
2018-07-03 13:31:29 -07:00
mhshimul 867f6a9e2b Fix SQL Server select query in createInactiveStatusesSinceQuery() method. (#5901)
* Fix SQL Server select query in createInactiveStatusesSinceQuery() method.

SQL Server does not support LIMIT N in select queries. Instead it has TOP N to limit the number of query results.
And TOP N is already added in the select statement as per maxNumStatuses value.

* Add parentheses for TOP in SELECT statement as SQL Servers no longer support TOP without parentheses.
2018-07-03 23:16:47 +05:30
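The dialect difference described in 867f6a9e2b can be sketched as follows. This is a minimal Python illustration, not the query from `createInactiveStatusesSinceQuery()`; the function name `limited_select`, the dialect strings, and the table name used in the example are hypothetical. It shows a row limit expressed as `TOP (N)` for SQL Server and as a trailing `LIMIT N` for other dialects.

```python
def limited_select(table: str, limit: int, dialect: str) -> str:
    """Build a SELECT that returns at most `limit` rows.

    Hypothetical helper for illustration; the table name is assumed to be
    a trusted identifier, not user input.
    """
    n = int(limit)
    if n <= 0:
        raise ValueError("limit must be positive")
    if dialect == "sqlserver":
        # SQL Server has no LIMIT clause; the row cap goes right after SELECT.
        return f"SELECT TOP ({n}) * FROM {table}"
    # Other dialects in this sketch take the cap as a trailing LIMIT clause.
    return f"SELECT * FROM {table} LIMIT {n}"
```

With a hypothetical table `tasks`, `limited_select("tasks", 10, "sqlserver")` yields `SELECT TOP (10) * FROM tasks`, and any other dialect string yields `SELECT * FROM tasks LIMIT 10`.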
Jihoon Son b6c957b0d2 Allow reordered segment allocation in kafka indexing service (#5805)
* Allow reordered segment allocation in kafka indexing service

* address comments

* fix a bug
2018-07-02 15:09:12 -07:00
Jihoon Son b76a056c14 Fix ConcurrentModificationException in IncrementalPublishingKafkaIndexTaskRunner (#5907)
* Fix ConcurrentModificationException in IncrementalPublishingKafkaIndexTaskRunner

* fix lock and add comments
2018-06-30 17:20:41 -07:00
Surekha 933b25416c Handle task deserialization failure in the tasks api (#5911)
If task payload fails to deserialize json to Java, make the task null and handle null task in OverlordResource
2018-06-29 11:57:48 -07:00
Jihoon Son 10a01d6846 [SQL] Fix missing postAggregations for Timeseries and TopN (#5912)
* [SQL] Fix missing postAggregations for Timeseries and TopN

* fix build

* fix test
2018-06-29 10:36:55 -07:00
Jonathan Wei f3e1520360 Fix merge for TrueDimFilter (#5916)
* Fix merge for TrueDimFilter

* remove unused cache ID
2018-06-28 14:46:47 -07:00
scrawfor bf2a31a5bc Add new 'true' filter which always returns true. (#5711)
* Add new 'true' filter which always returns true.

* Add support for bitmap index.

* Adds documentation.

* Removes No-op Filter
2018-06-28 11:52:45 -07:00
zhangxinyu d857345b7d add method getRequiredColumns for DimFilter (#5872)
* add method getRequiredColumns for DimFilter

* deal with the NullPointerException when DimFilter is null
2018-06-27 15:45:46 -07:00
Surekha 0f429298cf Fix Kafka Indexing task pause forever if no events in taskDuration (#5656) (#5899)
* Fix Kafka Indexing task pause forever (#5656)

* Fix NullPointerException in overlord if taskGroups does not contain the groupId
* If the endOffset is same as startOffset, still let the task resume instead of returning
   endOffsets early which causes the tasks to pause forever and ultimately fail on timeout

* Address PR comment

*Remove the null check and do not return null from generateSequenceName
2018-06-25 19:29:36 -07:00
陈春斌 7649742943 Use ReentrantReadWriteLock in DimensionDictionary (#5883) 2018-06-25 12:35:26 -07:00
Gian Merlino a28314349c Fix spelling of "propagate" in various places. (#5896)
One of these is a configuration parameter (introduced in #5429),
but it's never been in a release, so I think it's ok to rename it.
2018-06-25 09:18:08 -07:00
George Paraskevas 4b111929ec Fix typo lage->large , improve warning message (#5890) 2018-06-22 17:33:02 -07:00
Jihoon Son 8c5ded0fad Splitting KafkaIndexTask for better code maintenance (#5854)
* Refactoring KafkaIndexTask for better code maintenance

* fix bug

* fix test

* add annotation

* fix checkstyle

* remove SetEndOffsetsResult
2018-06-22 13:00:03 -07:00
Clint Wylie 1a7adabf57 Coordinator segment balancer max load queue fix (#5888)
* Coordinator segment balancer will now respect "maxSegmentsInNodeLoadingQueue" config

* allow moves from full load queues

* better variable names
2018-06-20 23:04:41 -07:00
Niketh Sabbineni 0982472c90 Use historical node instead of realtime for querying (#4764)
* Use historical node instead of realtime for querying

* Incorporated code review comments

* Incorporate code review comments

* Remove artifact comment

* Consider non-historical nodes as realtime
2018-06-20 22:53:56 -07:00
Jonathan Wei 0eae89170e Make DruidPlanner constructor public again (#5891) 2018-06-20 11:10:50 -07:00
Surekha 8619adb5b9 Improve task retrieval APIs on Overlord (#5801)
* Add the new tasks api in overlordResource

It takes 4 optional query params
* state (pending/running/waiting/complete)
* dataSource
* interval (applies to completed tasks)
* maxCompletedTasks (applies to completed tasks)

If all params are null, the api returns all the tasks

* Add the state to each task returned by tasks endpoint

* divide active tasks into waiting, pending or running
* Add more unit tests

* Add UNKNOWN state to TaskState

* Fix the authorization calls

* WIP: PR comments

Added new class to capture task info for caching
Other refactoring

* Refactoring : move TaskStatus class to druid-api

so it can be accessed within server
And other related classes like TaskState and TaskStatusPlus are in api

* Remove unused class and apis accessing it

* Add a separate cache for recently completed tasks

This is to mainly capture the task type from payload

* Ignore a test

* Add a RuntimeTaskState to encompass all states a task can be in

* Revert "Add a RuntimeTaskState to encompass all states a task can be in"

This reverts commit 2a527a0731.

* Fix wrong api call

* Fix and unignore tests

* Remove waiting,pending state from TaskState

* Add RunnerTaskState

* Missed the annotation runnerStatusCode

* Fix the creationTime

* Fix the createdTime and queueInsertionTime for running/active tasks
* Clean up tests

* Add javadocs

* Potentially fix the teamcity build

* Address PR comments

*Get rid of TaskInfoBuilder
*Make TaskInfoMapper static nested class
*Other changes

* fix import in MaterializedViewSupervisor after merge

* Address PR comments on

* Replace global cache with local map
* combine multiple queries into one
* Removed unused code

* Fix unit tests

Fix a bug in securedTaskStatusPlus

* Remove getRecentlyFinishedTaskStatuses method

Change TaskInfoMapper signature to add generic type

* Address PR comments

* Passed datasource as argument to be used in sql query
* Other minor fixes

* Address PR comments

*Some minor changes, rename method, spacing changes

* Add early auth check if datasource is not null

* Fix test case

* Add max limit to getRecentlyFinishedTaskInfo in HeapMemoryTaskStorage
* Add TaskLocation to Anytask object

* Address PR comments

* Fix a bug in test case causing ClassCastException
2018-06-19 11:34:59 -07:00
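The first bullet of 8619adb5b9 describes a tasks API with four optional query parameters. A client-side sketch of how such a request URL could be assembled is shown below, in Python. The parameter names are copied from the commit message; the function name `tasks_url` and the base URL in the example are assumptions, and the exact endpoint path should be taken from the Overlord API documentation rather than from this sketch.

```python
from urllib.parse import urlencode


def tasks_url(base, state=None, data_source=None, interval=None,
              max_completed_tasks=None):
    """Build a tasks query URL, omitting every parameter that is None.

    Hypothetical helper; parameter names follow the commit message.
    """
    params = {
        "state": state,
        "dataSource": data_source,
        "interval": interval,
        "maxCompletedTasks": max_completed_tasks,
    }
    # Only parameters the caller actually set are sent.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    # With no parameters at all, the commit says the API returns all tasks.
    return f"{base}?{query}" if query else base
```

For instance, with an assumed base of `http://overlord.example/tasks`, passing `state="running"` and `max_completed_tasks=5` produces `http://overlord.example/tasks?state=running&maxCompletedTasks=5`.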
Gian Merlino 6d0dd2fd0f CalciteQueryTest: Add more subquery tests. (#5880)
None of them actually work right now, but this is useful to help document,
via tests, what works and what doesn't.
2018-06-18 11:54:29 -07:00
Charles Allen 8dc4aca25f Add cgroup memory monitor (#5866)
* Add cgroup memory monitor

* Port of https://github.com/metamx/java-util/pull/67

* Fix copyright

* Don't use `String.format`
2018-06-18 10:03:44 -07:00
varaga b4b1b2a020 Provisioning support for ZooKeeper Authorization (#5701)
Review comments implemented
2018-06-15 14:02:01 -07:00
Dylan Wylie 8c6651022d Update jsonpath dependency (#5794)
* Update JSONPath Library

Re: #5792

- Add a unit test containing a JSONPath conditional
- Update the JSONPath library and no longer exclude the json-smart dependency.
- I believe the original reason for excluding this has been fixed: https://github.com/json-path/JsonPath/pull/315

* Add test

* Fix test
2018-06-15 13:50:48 -07:00
Dylan Wylie 1f700bb880 Suppress JsonPath exceptions in AvroFlattener (#5793)
Re: #5791

- Make the AvroFlattenerMaker consistent with the JSONFlattenerMaker
2018-06-14 17:38:15 -07:00
Jonathan Wei dc67b77ec2 Immediately send 401 on basic HTTP authentication failure (#5856)
* Immediately send 401 on basic HTTP authentication failure

* Add unit tests
2018-06-14 10:23:10 -07:00
Joseph Glanville 1032387d78 Snappy decompression support (#5864)
Support decompression of files using Google's Snappy algorithm.

This only supports files compressed with the Snappy framing format
described here: https://github.com/google/snappy/blob/master/framing_format.txt
2018-06-14 10:55:42 +01:00
Gian Merlino e0eb7048f6 Remove evil.zip test file. (#5879)
Removes an evil.zip file added by #5850, since it's not necessary.
The tests in that patch create their own evil files.
2018-06-13 16:02:18 -07:00
Nishant Bangarwa 1c031784cb Align long Aggregator implementation with Double and Float (#5861)
Add LongMin/Max aggregator combiners
Extract common code from LongSum/Min/MaxAggregatorFactories in
SimpleLongAggregatorFactory
2018-06-14 01:56:41 +04:00
Jonathan Wei 24efbb054c Fix inefficient available segment cache population in SQLMetadataSegmentManager (#5878) 2018-06-12 18:53:30 -07:00
Jonathan Wei bc9da54e12 Fix Zip Slip vulnerability (#5850)
* Fix evil zip exploit

* PR comment, checkstyle

* PR comments

* Add link to vulnerability report

* Fix test
2018-06-12 10:03:08 -07:00
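Zip Slip, the issue addressed by bc9da54e12, arises when an archive entry name such as `../evil.txt` is joined to the extraction directory without checking where the result lands. The usual defence is sketched below in Python as a general illustration; it is not the code from the patch, and the function name `resolve_entry` is hypothetical. The idea is to resolve the target path and refuse it unless it stays inside the extraction root.

```python
import os


def resolve_entry(dest_dir: str, entry_name: str) -> str:
    """Return the absolute path for an archive entry, or raise ValueError
    if the entry would be written outside dest_dir.

    Hypothetical helper for illustration only.
    """
    root = os.path.realpath(dest_dir)
    # Joining and then normalising collapses any '..' components.
    target = os.path.realpath(os.path.join(root, entry_name))
    # The entry is safe only if the root is a path prefix of the target.
    if os.path.commonpath([root, target]) != root:
        raise ValueError(f"entry escapes extraction directory: {entry_name!r}")
    return target
```

An extraction loop would call `resolve_entry` for every entry before opening the output file, so that a malicious name is rejected before anything is written.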
Jihoon Son 2feec44a55 Fix mismatch in revoked task locks between memory and metastore after sync from storage (#5858)
* Fix mismatched revoked task locks after sync from storage

* fix build

* fix log

* fix lock release
2018-06-12 10:25:34 -04:00
Gian Merlino 0ae4aba4e2 HdfsDataSegmentPusher: Close tmpIndexFile before copying it. (#5873)
It seems that copy-before-close works OK on HDFS, but it doesn't work
on all filesystems. In particular, we observed this not working properly
with Google Cloud Storage. And anyway, it's better hygiene to close files
before attempting to copy them somewhere else.
2018-06-12 08:58:48 +01:00
Jihoon Son fe4d678aac Support projection after sorting in SQL (#5788)
* Add sort project

* add more test

* address comments
2018-06-11 11:33:47 -04:00
zhangxinyu e43e5ebbcd Materialized view implementation (#5556)
* implement materialized view

* modify code according to jihoonson's comments

* modify code according to jihoonson's comments - 2

* add documentation about materialized view

* use new HadoopTuningConfig in pr 5583

* add minDataLag and fix optimizer bug

* correct value of DEFAULT_MIN_DATA_LAG_MS

* modify code according to jihoonson's comments - 3

* use the boolean expression instead of if-else
2018-06-09 12:24:54 -07:00