* Adding feature thetaSketchConstant to do some set operation in PostAggregator
* Updated review comments for PR #5551 - Adding thetaSketchConstant
* Fixed CI build issue
* Updated review comments 2 for PR #5551 - Adding thetaSketchConstant
* Add more indexing task status and error reporting
* PR comments, add support in AppenderatorDriverRealtimeIndexTask
* Use TaskReport instead of metrics/context
* Fix tests
* Use TaskReport uploads
* Refactor fire department metrics retrieval
* Refactor input row serde in hadoop task
* Refactor hadoop task loader names
* Truncate error message in TaskStatus, add errorMsg to task report
* PR comments
* drop selection through cost balancer
* use collections.emptyIterator
* add test to ensure does not drop from server with larger loading queue with cost balancer
* javadocs and comments to clear things up
* random drop for completeness
* Add support for task reports, upload reports to deep storage
* PR comments
* Better name for method
* Fix report file upload
* Use TaskReportFileWriter
* Checkstyle
* More PR comments
* fix issue where assign primary assigns segments to all historical servers in cluster
* fix test
* add test to ensure primary assignment will not assign to another server while loading is in progress
* Add graceful shutdown timeout
* Handle interruptedException
* Incorporate code review comments
* Address code review comments
* Poll for activeConnections to be zero
* Use statistics handler to get active requests
* Use native jetty shutdown gracefully
* Move log line back to where it was
* Add unannounce wait time
* Make the default retain prior behavior
* Update docs with new config defaults
* Make duration handling on jetty shutdown more consistent
* StatisticsHandler is a wrapper
* Move jetty lifecycle error logging to error
Originally written by @AlexanderSaydakov in druid-io/druid-io.github.io#448.
I also added redirects and updated links to point to the new
datasketches-extension.html landing page for the extension, rather than to
the old page about theta sketches.
* Use the official aws-sdk instead of jet3t
* fix compile and serde tests
* address comments and fix test
* add http version string
* remove redundant dependencies, fix potential NPE, and fix test
* resolve TODOs
* fix build
* downgrade jackson version to 2.6.7
* fix test
* resolve the last TODO
* support proxy and endpoint configurations
* fix build
* remove debugging log
* downgrade hadoop version to 2.8.3
* fix tests
* remove unused log
* fix it test
* revert KerberosAuthenticator change
* change hadoop-aws scope to provided in hdfs-storage
* address comments
* address comments
* Future-proof some Guava usage
* Use a java-util EmptyIterator instead of Guava's
* Change some of the guava future handling to do manual async
transforms. Guava changes transform into transformAsync by deprecating
transform in ONLY Guava 19. Then its gone in 20
* Use `Collections.emptyIterator()`
* Pretty formatting
* Make listenable future transforms a thing in default druid
* Format fix
* Add forbidden guava apis
* Make the ListenableFutrues.transformAsync have comments
* Undo intellij bad pattern matching in comments
* Futrues --> Futures
* Add empty iterators forbidding
* Fix extra `A`
* Correct method signature
* Address review comments
* Finish Gian review comments
* Proper syntax from https://github.com/policeman-tools/forbidden-apis/wiki/SignaturesSyntax
* Fix round robining in router.
Say that ten times fast.
For query endpoints, AsyncQueryForwardingServlet called hostFinder.getDefaultServer()
to set a default server, followed by hostFinder.getServer(inputQuery) to override it
with query-specific routing. Since hostFinder is round-robin, this skips a server.
When there are only two servers, one server is _always_ skipped and the router sends
all queries to the same broker.
* Adjust spacing.
* SegmentMetadataQuery: Fix default interval handling.
PR #4131 introduced a new copy builder for segmentMetadata that did
not retain the value of usingDefaultInterval. This led to it being
dropped and the default-interval handling not working as expected.
Instead of using the default 1 week history when intervals are not
provided, the segmentMetadata query would query _all_ segments,
incurring an unexpected performance hit.
This patch fixes the bug and adds a test for the copy builder.
* Intervals
NativeIO.chunkedCopy fsyncs its writebuffer directly and
requires an O_DIRECT RandomAccessFile. By allowing the
kernel to start writing while filling the buffer the writes
will be more constant. In addition the O_DIRECT flag is not
required anymore and this will work faster in case fadvise
is not supported on some system.
This is based on Linus' post here:
http://lkml.iu.edu/hypermail/linux/kernel/1005.2/01845.html
and added datasource information as part of existing endpoint /druid/indexer/v1/runningTasks.
Added junit test cases for the newly implemented API and fixed existing junit test cases.
Fixed review comments - added new method getCreatedDateTimeAndDataSource into TaskStorageQueryAdapter class
and formatted changed files