Commit Graph

9182 Commits

Author SHA1 Message Date
Clint Wylie ea4f8544fb revert lambda conversion to fix occasional jvm error (#5591) 2018-04-06 14:18:55 -07:00
Gian Merlino 5ab17668c0 CompressionUtils: Add support for decompressing xz, bz2, zip. (#5586)
Also switch various firehoses to the new method.

Fixes #5585.
2018-04-06 08:06:45 -07:00
Charles Allen b86ed99d9a
Deprecate spark2 profile in pom.xml (#5581)
Deprecated due to https://github.com/druid-io/druid/pull/5382
2018-04-06 05:37:16 -07:00
Alexander T c228eed500 Update sql.md (#5519)
Example code is wrong. 'Statement' has to be created from the Connection Object
2018-04-06 00:21:18 -07:00
Jihoon Son 298ed1755d Fix indexTask to respect forceExtendableShardSpecs (#5509)
* Fix indexTask to respect forceExtendableShardSpecs

* add comments
2018-04-05 23:54:59 -07:00
Jihoon Son 723857699c Add doc for automatic pendingSegments (#5565)
* Add missing doc for automatic pendingSegments

* address comments
2018-04-05 23:53:43 -07:00
Dylan Wylie ddd23a11e6 Fix taskDuration docs for KafkaIndexingService (#5572)
* With incremental handoff the changed line is no longer true.
2018-04-05 23:52:58 -07:00
Senthil Kumar L S 371c672828 Adding feature thetaSketchConstant to do some set operation in PostAgg (#5551)
* Adding feature thetaSketchConstant to do some set operation in PostAggregator

* Updated review comments for PR #5551 - Adding thetaSketchConstant

* Fixed CI build issue

* Updated review comments 2 for PR #5551 - Adding thetaSketchConstant
2018-04-05 22:56:59 -07:00
Niketh Sabbineni 270fd1ea15 Allow getDomain to return disjointed intervals (#5570)
* Allow getDomain to return disjointed intervals

* Indentation issues
2018-04-05 22:12:30 -07:00
Jonathan Wei 969342cd28
More error reporting and stats for ingestion tasks (#5418)
* Add more indexing task status and error reporting

* PR comments, add support in AppenderatorDriverRealtimeIndexTask

* Use TaskReport instead of metrics/context

* Fix tests

* Use TaskReport uploads

* Refactor fire department metrics retrieval

* Refactor input row serde in hadoop task

* Refactor hadoop task loader names

* Truncate error message in TaskStatus, add errorMsg to task report

* PR comments
2018-04-05 21:38:57 -07:00
Jonathan Wei 818091ec60
Add overlord unsecured paths to coordinator when using combined service (#5579)
* Add overlord unsecured paths to coordinator when using combined service

* PR comment
2018-04-05 14:16:06 -07:00
Jihoon Son 7239f56131 Fix NPE in RemoteTaskRunner when some tasks in ZooKeeper but not in Overlord (#5511)
* Fix NPE in RemoteTaskRunner when some tasks in ZooKeeper but not in Overlord

* revert unnecessary change
2018-04-03 21:15:58 -07:00
Niketh Sabbineni f0a94f5035 Remove unused config (#5564)
* Remove unused config

* Fix failing tests
2018-04-03 13:23:46 -07:00
Clint Wylie f31dba6c5b Coordinator drop segment selection through cost balancer (#5529)
* drop selection through cost balancer

* use collections.emptyIterator

* add test to ensure does not drop from server with larger loading queue with cost balancer

* javadocs and comments to clear things up

* random drop for completeness
2018-04-03 11:22:51 -07:00
Clint Wylie a81ae99021 add 'stopped' check and handling to HttpLoadQueuePeon load and drop segment methods (#5555)
* add stopped check and handling to HttpLoadQueuePeon load and drop segment methods

* fix unrelated timeout :(

* revert unintended change

* PR feedback: change logging

* fix dumb
2018-04-03 11:21:52 -07:00
Jonathan Wei 723f7ac550
Add support for task reports, upload reports to deep storage (#5524)
* Add support for task reports, upload reports to deep storage

* PR comments

* Better name for method

* Fix report file upload

* Use TaskReportFileWriter

* Checkstyle

* More PR comments
2018-04-02 12:10:56 -07:00
Clint Wylie 6feac204e3 Coordinator primary segment assignment fix (#5532)
* fix issue where assign primary assigns segments to all historical servers in cluster

* fix test

* add test to ensure primary assignment will not assign to another server while loading is in progress
2018-04-02 09:40:20 -07:00
Jihoon Son 05547e29b2
Fix SQLMetadataSegmentManager to allow succesive start and stop (#5554)
* Fix SQLMetadataSegmentManager to allow succesive start and stop

* address comment

* add synchronization
2018-03-30 12:43:19 -07:00
Arup Malakar 0c4598c1fe Fix typo in avatica java client code documenation (#5553) 2018-03-29 16:36:40 -05:00
Clint Wylie 30fc4d3ba0 Coordinator balancer move then drop fix (#5528)
* #5521 part 1

* formatting

* oops

* less magic tests
2018-03-29 10:30:12 -07:00
Kirill Kozlov 8878a7ff94 Replace guava Charsets with native java StandardCharsets (#5545) 2018-03-28 21:00:08 -07:00
Clint Wylie 81be1b3966 this will fix it (#5549) 2018-03-28 18:58:39 -07:00
Niketh Sabbineni 912adcc284 ArrayAggregation: Use long to avoid overflow (#5544)
* ArrayAggregation: Use long to avoid overflow

* Add Tests
2018-03-28 16:37:53 -07:00
Jihoon Son 024e0a9cca Respect forceHashAggregation in queryContext (#5533)
* Respect forceHashAggregation in queryContext

* address comment
2018-03-28 14:15:38 -07:00
Dyana Rose db508cf3ca [docs] fix invalid example json (#5547)
https://github.com/druid-io/druid/issues/5546
2018-03-28 13:53:38 -07:00
Clint Wylie 50e0e7f97d Correct lookup documentation (#5537)
fixes #5536
2018-03-26 17:01:02 -07:00
Nathan Hartwell ea30c05355 Adding ParserSpec for Influx Line Protocol (#5440)
* Adding ParserSpec for Influx Line Protocol

* Addressing PR feedback

- Remove extraneous TODO
- Better handling of parse errors (e.g. invalid timestamp)
- Handle sub-millisecond timestamps

* Adding documentation for Influx parser

* Fixing docs
2018-03-26 14:28:46 -07:00
Atul Mohan ec17a44e09 Add result level caching to Brokers (#5028)
* Add result level caching to Brokers

* Minor doc changes

* Simplify sequences

*  Move etag execution

* Modify cacheLimit criteria

* Fix incorrect etag computation

* Fix docs

* Add separate query runner for result level caching

* Update docs

* Add post aggregated results to result level cache

* Fix indents

* Check byte size for exceeding cache limit

* Fix indents

* Fix indents

* Add flag for result caching

* Remove logs

* Make cache object generation synchronous

* Avoid saving intermediate cache results to list

* Fix changes that handle etag based response

* Release bytestream after use

*  Address PR comments

*  Discard resultcache stream after use

* Fix docs

* Address comments

* Add comment about fluent workflow issue
2018-03-23 19:11:52 -07:00
Charles Allen ef21ce5a64
Add graceful shutdown timeout for Jetty (#5429)
* Add graceful shutdown timeout

* Handle interruptedException

* Incorporate code review comments

* Address code review comments

* Poll for activeConnections to be zero

* Use statistics handler to get active requests

* Use native jetty shutdown gracefully

* Move log line back to where it was

* Add unannounce wait time

* Make the default retain prior behavior

* Update docs with new config defaults

* Make duration handling on jetty shutdown more consistent

* StatisticsHandler is a wrapper

* Move jetty lifecycle error logging to error
2018-03-23 09:38:17 -07:00
Gian Merlino 0851f2206c
Expanded documentation for DataSketches aggregators. (#5513)
Originally written by @AlexanderSaydakov in druid-io/druid-io.github.io#448.
I also added redirects and updated links to point to the new
datasketches-extension.html landing page for the extension, rather than to
the old page about theta sketches.
2018-03-21 18:19:27 -07:00
Jihoon Son 1ad898bde2
Use the official aws-sdk instead of jet3t (#5382)
* Use the official aws-sdk instead of jet3t

* fix compile and serde tests

* address comments and fix test

* add http version string

* remove redundant dependencies, fix potential NPE, and fix test

* resolve TODOs

* fix build

* downgrade jackson version to 2.6.7

* fix test

* resolve the last TODO

* support proxy and endpoint configurations

* fix build

* remove debugging log

* downgrade hadoop version to 2.8.3

* fix tests

* remove unused log

* fix it test

* revert KerberosAuthenticator change

* change hadoop-aws scope to provided in hdfs-storage

* address comments

* address comments
2018-03-21 15:36:54 -07:00
Clint Wylie 885b975c95 fix LongsColumnWithNulls and FloatsColumnWithNulls to override isNull in order to actually use nullValueBitmap (#5510) 2018-03-20 16:04:08 -07:00
Charles Allen 58f110f7f8 Future-proof some Guava usage (#5414)
* Future-proof some Guava usage

* Use a java-util EmptyIterator instead of Guava's
* Change some of the guava future handling to do manual async
transforms. Guava changes transform into transformAsync by deprecating
transform in ONLY Guava 19. Then its gone in 20

* Use `Collections.emptyIterator()`

* Pretty formatting

* Make listenable future transforms a thing in default druid

* Format fix

* Add forbidden guava apis

* Make the ListenableFutrues.transformAsync have comments

* Undo intellij bad pattern matching in comments

* Futrues --> Futures

* Add empty iterators forbidding

* Fix extra `A`

* Correct method signature

* Address review comments

* Finish Gian review comments

* Proper syntax from https://github.com/policeman-tools/forbidden-apis/wiki/SignaturesSyntax
2018-03-20 08:59:33 -07:00
Slim 17c71a2a60
Make Doubles aggregators use 64bits by default (#5478)
* use 64-bit float representation for double based aggregator

Change-Id: Ia4f442037052add178f6ac68138c9d52f96c6e09

* review comments

Change-Id: I5a588f7364f236bf22f2b138e9d743bfb27c67fe
2018-03-19 19:13:04 -07:00
Jonathan Wei b22455b924
Fix supervisor tombstone auth handling (#5504) 2018-03-19 12:55:47 -07:00
Roman Leventov 693e3575f9
Remove unused code and exception declarations (#5461)
* Remove unused code and exception declarations

* Address comments

* Remove redundant Exception declarations

* Make FirehoseFactoryV2.connect() to throw IOException again
2018-03-16 22:11:12 +01:00
Samarth Jain afa25202a3 Segment filtering should be done by looking at the inner most query o… (#5496)
* Segment filtering should be done by looking at the inner most query of a nested query

* Fixing checkstyle errors

* Addressing code review comments
2018-03-16 14:05:14 -07:00
Jonathan Wei 30e6bdedf3 Authorize supervisor history instead of current active supervisors for supervisor history API (#5501) 2018-03-16 12:29:17 -07:00
Gian Merlino a08efe4683
Fix round robining in router. (#5500)
* Fix round robining in router.

Say that ten times fast.

For query endpoints, AsyncQueryForwardingServlet called hostFinder.getDefaultServer()
to set a default server, followed by hostFinder.getServer(inputQuery) to override it
with query-specific routing. Since hostFinder is round-robin, this skips a server.
When there are only two servers, one server is _always_ skipped and the router sends
all queries to the same broker.

* Adjust spacing.
2018-03-15 18:45:59 -07:00
Gian Merlino 16b81fcd53
SegmentMetadataQuery: Fix default interval handling. (#5489)
* SegmentMetadataQuery: Fix default interval handling.

PR #4131 introduced a new copy builder for segmentMetadata that did
not retain the value of usingDefaultInterval. This led to it being
dropped and the default-interval handling not working as expected.
Instead of using the default 1 week history when intervals are not
provided, the segmentMetadata query would query _all_ segments,
incurring an unexpected performance hit.

This patch fixes the bug and adds a test for the copy builder.

* Intervals
2018-03-15 10:05:46 -07:00
bolkedebruin 7d1163b0d9 Optimize chunkedCopy for sequential writes (#5477)
NativeIO.chunkedCopy fsyncs its writebuffer directly and
requires an O_DIRECT RandomAccessFile. By allowing the
kernel to start writing while filling the buffer the writes
will be more constant. In addition the O_DIRECT flag is not
required anymore and this will work faster in case fadvise
is not supported on some system.

This is based on Linus' post here:
http://lkml.iu.edu/hypermail/linux/kernel/1005.2/01845.html
2018-03-14 15:15:41 -07:00
Gian Merlino e096a8d6c5 Emitter: Clarify contract of "emit". (#5486)
* Emitter: Clarify contract of "emit".

* New wording
2018-03-14 22:07:08 +01:00
Gian Merlino fdd55538e1 SQL: Remove unused escalator, authConfig from various classes. (#5483)
DruidPlanner.plan is responsible for checking authorization, so these objects
weren't needed in as many places as they were injected.
2018-03-14 13:28:51 -07:00
Niketh Sabbineni 40cc2c8740 Query should not fail because emitter fails or throws Exception (#5484) 2018-03-13 19:57:05 -07:00
Jihoon Son 9b2a25bd84
Refactor supervisorReport to be type-safe (#5479)
* refactor supervisorReport

* use primitives
2018-03-13 09:28:44 -07:00
Christoph Hösler 34f655599d Let MySQLConnector accept all UTF charsets and recommend utf8mb4 (#5411)
* Let MySQLConnector accept all UTF charsets and recommend utf8mb4

* Fix regex and remove newline in log statement
2018-03-13 01:16:10 -07:00
Niraja Mishra 96cebfc222 As part of this feature, implemented a new endpoint to get running tasks by datasources (#5260)
and added datasource information as part of existing endpoint /druid/indexer/v1/runningTasks.

Added junit test cases for the newly implemented API and fixed existing junit test cases.

Fixed review comments - added new method getCreatedDateTimeAndDataSource into TaskStorageQueryAdapter class
and formatted changed files
2018-03-12 23:48:11 -07:00
Himanshu e968811583 HttpServerInventoryView: fixed startup wait time and more informative logging (#5336) 2018-03-12 22:13:51 -07:00
Roman Leventov 6b158abe3f Enforce optimal IndexedInts iteration (#5456)
* Enforce optimal IndexedInts iteration

* Fix remaining suboptimal usages
2018-03-09 09:42:40 -08:00
Clint Wylie d159a4fa01 better error messaging when parseSpec is missing timestampSpec or dimensionSpec (#5439) 2018-03-08 07:57:13 -08:00