Commit Graph

8407 Commits

Author SHA1 Message Date
Clint Wylie 50e0e7f97d Correct lookup documentation (#5537)
fixes #5536
2018-03-26 17:01:02 -07:00
Nathan Hartwell ea30c05355 Adding ParserSpec for Influx Line Protocol (#5440)
* Adding ParserSpec for Influx Line Protocol

* Addressing PR feedback

- Remove extraneous TODO
- Better handling of parse errors (e.g. invalid timestamp)
- Handle sub-millisecond timestamps

* Adding documentation for Influx parser

* Fixing docs
2018-03-26 14:28:46 -07:00
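A note on the "Handle sub-millisecond timestamps" item in the Influx parser commit above: Influx Line Protocol timestamps default to nanosecond precision, so a millisecond-based consumer has to scale them down. The sketch below shows one way to do that conversion. The class and method names are hypothetical and are not taken from the actual parser; truncation (rather than rounding) is also an assumption.

```java
import java.util.concurrent.TimeUnit;

class InfluxTimestampSketch {
    // Hypothetical helper: converts a nanosecond epoch string to epoch millis,
    // truncating the sub-millisecond part. Not the actual Druid parser code.
    static long toEpochMillis(String nanosField) {
        try {
            long nanos = Long.parseLong(nanosField.trim());
            return TimeUnit.NANOSECONDS.toMillis(nanos);
        } catch (NumberFormatException e) {
            // Surface a parse error instead of silently producing a bad timestamp.
            throw new IllegalArgumentException("Invalid timestamp: " + nanosField, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toEpochMillis("1522100926123456789")); // prints 1522100926123
    }
}
```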
Atul Mohan ec17a44e09 Add result level caching to Brokers (#5028)
* Add result level caching to Brokers

* Minor doc changes

* Simplify sequences

* Move etag execution

* Modify cacheLimit criteria

* Fix incorrect etag computation

* Fix docs

* Add separate query runner for result level caching

* Update docs

* Add post aggregated results to result level cache

* Fix indents

* Check byte size for exceeding cache limit

* Fix indents

* Fix indents

* Add flag for result caching

* Remove logs

* Make cache object generation synchronous

* Avoid saving intermediate cache results to list

* Fix changes that handle etag based response

* Release bytestream after use

* Address PR comments

* Discard resultcache stream after use

* Fix docs

* Address comments

* Add comment about fluent workflow issue
2018-03-23 19:11:52 -07:00
Charles Allen ef21ce5a64 Add graceful shutdown timeout for Jetty (#5429)
* Add graceful shutdown timeout

* Handle interruptedException

* Incorporate code review comments

* Address code review comments

* Poll for activeConnections to be zero

* Use statistics handler to get active requests

* Use native jetty shutdown gracefully

* Move log line back to where it was

* Add unannounce wait time

* Make the default retain prior behavior

* Update docs with new config defaults

* Make duration handling on jetty shutdown more consistent

* StatisticsHandler is a wrapper

* Move jetty lifecycle error logging to error
2018-03-23 09:38:17 -07:00
Gian Merlino 0851f2206c Expanded documentation for DataSketches aggregators. (#5513)
Originally written by @AlexanderSaydakov in druid-io/druid-io.github.io#448.
I also added redirects and updated links to point to the new
datasketches-extension.html landing page for the extension, rather than to
the old page about theta sketches.
2018-03-21 18:19:27 -07:00
Jihoon Son 1ad898bde2 Use the official aws-sdk instead of jets3t (#5382)
* Use the official aws-sdk instead of jets3t

* fix compile and serde tests

* address comments and fix test

* add http version string

* remove redundant dependencies, fix potential NPE, and fix test

* resolve TODOs

* fix build

* downgrade jackson version to 2.6.7

* fix test

* resolve the last TODO

* support proxy and endpoint configurations

* fix build

* remove debugging log

* downgrade hadoop version to 2.8.3

* fix tests

* remove unused log

* fix it test

* revert KerberosAuthenticator change

* change hadoop-aws scope to provided in hdfs-storage

* address comments

* address comments
2018-03-21 15:36:54 -07:00
Clint Wylie 885b975c95 fix LongsColumnWithNulls and FloatsColumnWithNulls to override isNull in order to actually use nullValueBitmap (#5510) 2018-03-20 16:04:08 -07:00
Charles Allen 58f110f7f8 Future-proof some Guava usage (#5414)
* Future-proof some Guava usage

* Use a java-util EmptyIterator instead of Guava's
* Change some of the Guava future handling to do manual async
transforms. Guava changes transform into transformAsync by deprecating
transform in ONLY Guava 19. Then it's gone in 20.

* Use `Collections.emptyIterator()`

* Pretty formatting

* Make listenable future transforms a thing in default druid

* Format fix

* Add forbidden guava apis

* Make the ListenableFutrues.transformAsync have comments

* Undo intellij bad pattern matching in comments

* Futrues --> Futures

* Add empty iterators forbidding

* Fix extra `A`

* Correct method signature

* Address review comments

* Finish Gian review comments

* Proper syntax from https://github.com/policeman-tools/forbidden-apis/wiki/SignaturesSyntax
2018-03-20 08:59:33 -07:00
Slim 17c71a2a60 Make Doubles aggregators use 64bits by default (#5478)
* use 64-bit float representation for double based aggregator

Change-Id: Ia4f442037052add178f6ac68138c9d52f96c6e09

* review comments

Change-Id: I5a588f7364f236bf22f2b138e9d743bfb27c67fe
2018-03-19 19:13:04 -07:00
Jonathan Wei b22455b924 Fix supervisor tombstone auth handling (#5504) 2018-03-19 12:55:47 -07:00
Roman Leventov 693e3575f9 Remove unused code and exception declarations (#5461)
* Remove unused code and exception declarations

* Address comments

* Remove redundant Exception declarations

* Make FirehoseFactoryV2.connect() throw IOException again
2018-03-16 22:11:12 +01:00
Samarth Jain afa25202a3 Segment filtering should be done by looking at the inner most query o… (#5496)
* Segment filtering should be done by looking at the inner most query of a nested query

* Fixing checkstyle errors

* Addressing code review comments
2018-03-16 14:05:14 -07:00
Jonathan Wei 30e6bdedf3 Authorize supervisor history instead of current active supervisors for supervisor history API (#5501) 2018-03-16 12:29:17 -07:00
Gian Merlino a08efe4683 Fix round robining in router. (#5500)
* Fix round robining in router.

Say that ten times fast.

For query endpoints, AsyncQueryForwardingServlet called hostFinder.getDefaultServer()
to set a default server, followed by hostFinder.getServer(inputQuery) to override it
with query-specific routing. Since hostFinder is round-robin, this skips a server.
When there are only two servers, one server is _always_ skipped and the router sends
all queries to the same broker.

* Adjust spacing.
2018-03-15 18:45:59 -07:00
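The skip described in the router fix above can be reproduced with a small model. This is a sketch under stated assumptions: `RoundRobinFinder` and the `route…` methods are hypothetical stand-ins, not Druid's hostFinder or AsyncQueryForwardingServlet. It only shows why picking twice per query from a shared round-robin cursor pins every query to one of two servers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RoundRobinSkipSketch {
    // Hypothetical stand-in for a round-robin host finder; not Druid's actual class.
    static final class RoundRobinFinder {
        private final List<String> servers;
        private int cursor = 0;

        RoundRobinFinder(List<String> servers) {
            this.servers = servers;
        }

        // Each call advances the cursor, even if the caller discards the result.
        String next() {
            String server = servers.get(cursor);
            cursor = (cursor + 1) % servers.size();
            return server;
        }
    }

    // Pattern described in the commit: pick a default, then pick again to override it.
    static List<String> routeWithDoublePick(List<String> servers, int queries) {
        RoundRobinFinder finder = new RoundRobinFinder(servers);
        List<String> chosen = new ArrayList<>();
        for (int i = 0; i < queries; i++) {
            String server = finder.next(); // default server, about to be overwritten
            server = finder.next();        // query-specific pick skips ahead by one
            chosen.add(server);
        }
        return chosen;
    }

    // One pick per query, so the rotation visits every server.
    static List<String> routeWithSinglePick(List<String> servers, int queries) {
        RoundRobinFinder finder = new RoundRobinFinder(servers);
        List<String> chosen = new ArrayList<>();
        for (int i = 0; i < queries; i++) {
            chosen.add(finder.next());
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<String> brokers = Arrays.asList("broker-a", "broker-b");
        System.out.println(routeWithDoublePick(brokers, 4)); // [broker-b, broker-b, broker-b, broker-b]
        System.out.println(routeWithSinglePick(brokers, 4)); // [broker-a, broker-b, broker-a, broker-b]
    }
}
```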
Gian Merlino 16b81fcd53 SegmentMetadataQuery: Fix default interval handling. (#5489)
* SegmentMetadataQuery: Fix default interval handling.

PR #4131 introduced a new copy builder for segmentMetadata that did
not retain the value of usingDefaultInterval. This led to it being
dropped and the default-interval handling not working as expected.
Instead of using the default 1 week history when intervals are not
provided, the segmentMetadata query would query _all_ segments,
incurring an unexpected performance hit.

This patch fixes the bug and adds a test for the copy builder.

* Intervals
2018-03-15 10:05:46 -07:00
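The class of bug described in the SegmentMetadataQuery fix above (a copy builder that does not carry over one field) can be sketched in a few lines. Everything here is hypothetical and heavily simplified: `Query`, `Builder`, `lossyCopy` and `fullCopy` are illustrative names, not Druid's classes, and the interval is modeled as a plain string.

```java
class CopyBuilderSketch {
    // Hypothetical, simplified query object; not Druid's SegmentMetadataQuery.
    static final class Query {
        final String interval;
        final boolean usingDefaultInterval;

        Query(String interval, boolean usingDefaultInterval) {
            this.interval = interval;
            this.usingDefaultInterval = usingDefaultInterval;
        }
    }

    static final class Builder {
        private String interval;
        private boolean usingDefaultInterval; // defaults to false if never set

        // Copies only the interval: the flag silently falls back to false.
        static Builder lossyCopy(Query q) {
            Builder b = new Builder();
            b.interval = q.interval;
            return b;
        }

        // Copies every field, so the rebuilt query behaves like the original.
        static Builder fullCopy(Query q) {
            Builder b = lossyCopy(q);
            b.usingDefaultInterval = q.usingDefaultInterval;
            return b;
        }

        Query build() {
            return new Query(interval, usingDefaultInterval);
        }
    }

    public static void main(String[] args) {
        Query original = new Query("P1W", true);
        System.out.println(Builder.lossyCopy(original).build().usingDefaultInterval); // false
        System.out.println(Builder.fullCopy(original).build().usingDefaultInterval);  // true
    }
}
```

A round-trip test that builds a copy and compares every field against the original is one way to catch a dropped field like this.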
bolkedebruin 7d1163b0d9 Optimize chunkedCopy for sequential writes (#5477)
NativeIO.chunkedCopy fsyncs its write buffer directly and
requires an O_DIRECT RandomAccessFile. By allowing the
kernel to start writing while filling the buffer, the writes
will be more constant. In addition, the O_DIRECT flag is no
longer required, and this will work faster on systems where
fadvise is not supported.

This is based on Linus' post here:
http://lkml.iu.edu/hypermail/linux/kernel/1005.2/01845.html
2018-03-14 15:15:41 -07:00
Gian Merlino e096a8d6c5 Emitter: Clarify contract of "emit". (#5486)
* Emitter: Clarify contract of "emit".

* New wording
2018-03-14 22:07:08 +01:00
Gian Merlino fdd55538e1 SQL: Remove unused escalator, authConfig from various classes. (#5483)
DruidPlanner.plan is responsible for checking authorization, so these objects
weren't needed in as many places as they were injected.
2018-03-14 13:28:51 -07:00
Niketh Sabbineni 40cc2c8740 Query should not fail because emitter fails or throws Exception (#5484) 2018-03-13 19:57:05 -07:00
Jihoon Son 9b2a25bd84 Refactor supervisorReport to be type-safe (#5479)
* refactor supervisorReport

* use primitives
2018-03-13 09:28:44 -07:00
Christoph Hösler 34f655599d Let MySQLConnector accept all UTF charsets and recommend utf8mb4 (#5411)
* Let MySQLConnector accept all UTF charsets and recommend utf8mb4

* Fix regex and remove newline in log statement
2018-03-13 01:16:10 -07:00
Niraja Mishra 96cebfc222 Implemented a new endpoint to get running tasks by datasource (#5260)
Also added datasource information to the existing endpoint /druid/indexer/v1/runningTasks.

Added JUnit test cases for the newly implemented API and fixed existing JUnit test cases.

Addressed review comments: added a new method getCreatedDateTimeAndDataSource to the TaskStorageQueryAdapter class
and formatted the changed files.
2018-03-12 23:48:11 -07:00
Himanshu e968811583 HttpServerInventoryView: fixed startup wait time and more informative logging (#5336) 2018-03-12 22:13:51 -07:00
Roman Leventov 6b158abe3f Enforce optimal IndexedInts iteration (#5456)
* Enforce optimal IndexedInts iteration

* Fix remaining suboptimal usages
2018-03-09 09:42:40 -08:00
Clint Wylie d159a4fa01 better error messaging when parseSpec is missing timestampSpec or dimensionSpec (#5439) 2018-03-08 07:57:13 -08:00
bolkedebruin 8f07a39af7 Skip OS cache on Linux when pulling segments (#5421)
Druid relies on the Linux page cache to keep segments in memory.
However, when loading segments from deep storage or rebalancing, the page
cache can get poisoned by segments that should not be in memory yet.
This can significantly slow down Druid when rebalancing happens,
as data that might not be queried often is suddenly in the page cache.

This PR implements the same logic as in Apache Cassandra and Apache
BookKeeper.

Closes #4746
2018-03-08 07:54:21 -08:00
Himanshu 8fae0edc95 allow arbitrary aggregators for reindexing with hadoop (#5294) 2018-03-07 17:13:56 -08:00
Hongze Zhang b084075279 Add http/https proxy options to PullDependencies.java (#5450) 2018-03-07 15:05:43 -08:00
Slim 593e87637d Inline some backward incompatible Hadoop 3.0 method (#5396)
* Inline some backward incompatible hadoop 3.0 method

Change-Id: I49aeff5412d5cdea95e30feb031b2c036d036e9a

* fix build issue

Change-Id: I0a42fdb83ce970d6a2d3d45f150556e45442a0ac
2018-03-07 07:58:18 -08:00
Gian Merlino 0f03ab0c74 SQL: Fix precision of TIMESTAMP types. (#5464)
Druid stores timestamps down to the millisecond, so we should use
precision = 3. Setting this wrong sometimes caused milliseconds
to be ignored in timestamp literals.

Fixes #5337.
2018-03-05 18:56:52 -08:00
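To illustrate the precision point in the TIMESTAMP commit above: precision here counts fractional-second digits, so 3 keeps milliseconds and 0 drops them. The sketch below uses plain `java.time` rather than Calcite, and `withPrecision` is a hypothetical helper; it shows the effect of the setting and is not the code path that was fixed.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

class TimestampPrecisionSketch {
    // Hypothetical helper: keeps `precision` fractional-second digits
    // (0 = whole seconds, 3 = milliseconds). Other values are out of scope here.
    static Instant withPrecision(Instant t, int precision) {
        switch (precision) {
            case 0:
                return t.truncatedTo(ChronoUnit.SECONDS);
            case 3:
                return t.truncatedTo(ChronoUnit.MILLIS);
            default:
                throw new IllegalArgumentException("Unsupported precision: " + precision);
        }
    }

    public static void main(String[] args) {
        Instant literal = Instant.parse("2018-03-05T18:56:52.123Z");
        System.out.println(withPrecision(literal, 0)); // 2018-03-05T18:56:52Z
        System.out.println(withPrecision(literal, 3)); // 2018-03-05T18:56:52.123Z
    }
}
```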
Gian Merlino ff0de21fc5 SQL: Fix assumption that AND, OR have two arguments. (#5470)
Calcite can deliver an AND or OR operator with > 2 arguments.
Fixes #5468.
2018-03-05 18:56:35 -08:00
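The AND/OR commit above comes down to treating these operators as n-ary rather than binary. A minimal sketch, assuming a hypothetical `Call` node with an operator name and an operand list (this is not Calcite's RexCall API): iterating over all operands handles two, three, or more arguments the same way.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class NaryBooleanSketch {
    // Hypothetical expression node: an operator applied to any number of operands.
    // Operands are either String leaves or nested Call nodes.
    static final class Call {
        final String op;
        final List<?> operands;

        Call(String op, List<?> operands) {
            this.op = op;
            this.operands = operands;
        }
    }

    // Renders every operand instead of reading only operands.get(0) and operands.get(1).
    static String render(Object node) {
        if (node instanceof String) {
            return (String) node;
        }
        Call call = (Call) node;
        return call.operands.stream()
            .map(NaryBooleanSketch::render)
            .collect(Collectors.joining(" " + call.op + " ", "(", ")"));
    }

    public static void main(String[] args) {
        Call and = new Call("AND", Arrays.asList("a", "b", "c"));
        System.out.println(render(and)); // (a AND b AND c)
        System.out.println(render(new Call("OR", Arrays.asList("x", and)))); // (x OR (a AND b AND c))
    }
}
```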
Jihoon Son c9b12e7813 Fix JSON serde for taskStatusPlus (#5469)
* Fix JSON serde for taskStatusPlus

* add newline

* fix to statusCode

* fix forbidden api check

* remove debugging code
2018-03-05 18:49:59 -08:00
Alexander Korablev 8a51800693 fix PortFinder issue #5466 (#5467) 2018-03-05 16:58:49 -08:00
Niraja Mishra ba3dbf2a42 Fixed NPE when dimension is null or empty. https://github.com/druid-io/druid/issues/3007 (#5299) 2018-03-05 16:27:35 -08:00
Gian Merlino 7416d1d02d Add "joda" option to timeFormat extractionFn. (#5448) 2018-03-02 19:59:26 -08:00
Clint Wylie f948066710 KafkaIndexTask remove branch with unreachable code (#5434) 2018-03-02 17:26:12 -08:00
Jonathan Wei b63f1c0e45 Fix authorization check in supervisor history API (#5460) 2018-03-02 14:03:07 -08:00
Kevin Conaway 969a12f6ca #5425 Refactor to use map.get() when asserting the existence of published segments (#5426) 2018-03-02 19:42:39 +01:00
Jonathan Wei cf5f74b013 Fix GroupBy limit push down descending sorting on numeric columns (#5453) 2018-03-01 18:43:45 -08:00
Nishant Bangarwa e0d456b1ba Uniformly set Calcite systemProperties for All Unit tests (#5451)
Fixes test failures reported in
https://github.com/druid-io/druid/issues/4909

The issue is that if some test skips setting up Calcite system properties
with proper encoding and loads Calcite classes that use that property,
all subsequent tests in the same JVM fail.

To reproduce the issue, run ExpressionsTest and CalciteQueryTest from an IDE
in this order.

A better fix would be to not use system properties in Calcite, but this
will work for now.

All new Calcite unit tests that are added need to inherit
CalciteTestBase.
2018-03-01 12:56:32 -08:00
Vinesh Chemmala Paul fb493ae13a Add repository url (#5437)
Snapshot build fails to find dependent artifacts.
Add repository URL to resolve dependencies.
2018-03-01 07:38:24 -08:00
Jihoon Son 16e08c9adb add task priority for kafka indexing (#5444) 2018-02-28 22:29:23 -08:00
Gian Merlino e4eaee3806 Support for disabling bitmap indexes. (#5402)
* Support for disabling bitmap indexes.

Can save space for columns where bitmap indexes are pointless (like
free-form text).

* Remove import.

* Fix CompactionTaskTest.

* Update for review comments.

* Review comments, tests.

* Fix test.
2018-02-28 19:19:56 -08:00
Alexander Korablev 6a3a5350b8 Make memcached protocol and locator configurable. (#5438)
* Make memcached protocol and locator configurable.

* Style fix.

* Style fix.

* Style fix.
2018-02-28 17:16:43 -08:00
Jonathan Wei c23b723510 Skip normal authentication for JDBC requests in Router (#5435)
* Skip normal authentication for JDBC requests in Router

* Add integration test

* PR comments
2018-02-28 12:25:32 -08:00
Gian Merlino 27bf619d5d Test flattenSpecs where the referred-to object is missing. (#5318)
It worked, but it's still good to have a test.
2018-02-27 15:10:30 -08:00
Niraja Mishra 0f009a41e1 Fixed PeriodGranularity for Asia pacific timezones (#5410) 2018-02-27 10:39:50 -08:00
Niketh Sabbineni ac5034e241 Improve cache cost to handle heterogeneous historicals (#5416) 2018-02-23 13:17:31 -08:00
Jonathan Wei e9977ce4ef Automatically adjust com.metamx.metrics Monitor class references (#5412)
* Automatically adjust com.metamx.metrics monitor class references

* Log warning for old class names
2018-02-22 12:03:07 -08:00
David Lim 5c56d01daa pass configuration from context into JobConf for determining DatasourceInputFormat splits (#5408) 2018-02-21 10:01:31 -08:00