7889 Commits

Author SHA1 Message Date
David Lim 13ecf90923 Report Kafka lag information in supervisor status report (#4314)
* refactor lag reporting and report lag at status endpoint

* refactor offset reporting logic to fetch offsets periodically vs. at request time

* remove JavaCompatUtils

* code review changes

* code review changes
2017-06-05 13:26:25 -07:00
Slim a2584d214a Delegate creation of segmentPath/LoadSpec to DataSegmentPushers and add S3a support (#4116)
* Adding s3a schema and s3a implementation to hdfs storage module.

* use 2.7.3

* use segment pusher to make loadspec

* move getStorageDir and makeLoad spec under DataSegmentPusher

* fix uts

* fix comment part1

* move to hadoop 2.8

* inject deep storage properties

* set version to 2.7.3

* fix build issue about static class

* fix comments

* fix default hadoop default coordinate

* fix create filesystem

* downgrade aws sdk

* bump the version
2017-06-04 00:55:09 -06:00
Jihoon Son ba816063cb Wait until all data sources are ready for querying in ITUnionQueryTest (#4362) 2017-06-03 17:02:12 -07:00
Roman Leventov ebabe14fbe Rename ExtractionNamespaceCacheFactory to CachePopulator (the last part of #3667) (#4303)
* Renamed ExtractionNamespaceCacheFactory to CachePopulator, and related classes

* Rename CachePopulator to CacheGenerator
2017-06-03 10:09:44 +09:00
Jonathan Wei b90c28e861 Support limit push down for GroupBy (#3873)
* Support limit push down for GroupBy V2

* Use orderBy spec ordering when applying limit push down

* PR Comments

* Remove unused var

* Checkstyle fixes

* Fix test

* Add comment on non-final variables, fix checkstyle

* Address PR comments

* PR comments

* Remove unnecessary buffer reset

* Fix missing @JsonProperty annotation
2017-06-02 15:39:04 -07:00
Jonathan Wei 6daddf97c5 More documentation on expected interval format for coordinator endpoints (#4361) 2017-06-02 15:21:44 -07:00
praveev 290ed3ab9d Make DateTime timezone aware (#4343)
* Make DateTime timezone aware

* Change unit tests to make DateTime timezone aware for PeriodGranularity
2017-06-02 12:45:52 -07:00
Jihoon Son da32e1ae53 Reducing testing time for KafkaIndexTaskTest and KafkaSupervisorTest (#4352) 2017-06-03 00:53:07 +09:00
Jihoon Son f876246af7 Rename FiniteAppenderatorDriver to AppenderatorDriver (#4356) 2017-06-03 00:48:44 +09:00
kaijianding 0efd18247b explicitly unmap hydrant files when abandonSegment to recycle mmap memory (#4341)
* fix TestKafkaExtractionCluster fail due to port already used

* explicitly unmap hydrant files when abandonSegment to recycle mmap memory

* address the comments

* apply to AppenderatorImpl
2017-06-01 18:15:30 -05:00
Jihoon Son 1150bf7a2c Refactoring Appenderator Driver (#4292)
* Refactoring Appenderator

1) Added publishExecutor and handoffExecutor for background publishing and handing segments off
2) Change add() to not move segments out in it

* Address comments

1) Remove publishTimeout for KafkaIndexTask
2) Simplifying registerHandoff()
3) Add incremental handoff test

* Remove unused variable

* Add persist() to Appenderator and more tests for AppenderatorDriver

* Remove unused imports

* Fix strict build

* Address comments
2017-06-02 07:09:11 +09:00
Roman Leventov 50e72c6aea Fix bugs (core) (#4339)
* Fix bugs

* Add test for GoogleDataSegmentPusher.buildPath()

* Exclude extension changes

* Address comments

* Brace
2017-06-02 06:47:59 +09:00
Roman Leventov 05185a610f Add Druid inspection profile (#4348) 2017-06-02 06:29:00 +09:00
Jihoon Son 523b5ec03d Run integration tests on travis (#4344) 2017-05-31 18:27:34 -07:00
Roman Leventov 78179ef74d Inject QueryMetrics factories via PolyBind (#4336) 2017-05-31 09:07:03 -07:00
fanjieqi 2e933e1413 fix a bug in select-query.md where the property_form lacks the『granularity』 (#4327)
The result would be {"error"=>"Unknown exception",
"errorMessage"=>nil, "errorClass"=>"java.lang.NullPointerException",
"host"=>nil} when the JSON lacks 『granularity』.
2017-05-30 17:04:39 -07:00
Roman Leventov 9625993c9a Fix bugs in Google extensions and RocketMQ extension (#4340) 2017-05-30 14:25:35 -07:00
Kenji Noguchi 3400f601db Protobuf extension (#4039)
* move ProtoBufInputRowParser from processing module to protobuf extensions

* Ported PR #3509

* add DynamicMessage

* fix local test stuff that slipped in

* add license header

* removed redundant type name

* removed commented code

* fix code style

* rename ProtoBuf -> Protobuf

* pom.xml: shade protobuf classes, handle .desc resource file as binary file

* clean up error messages

* pick first message type from descriptor if not specified

* fix protoMessageType null check. add test case

* move protobuf-extension from contrib to core

* document: add new configuration keys, and descriptions

* update document. add examples

* move protobuf-extension from contrib to core (2nd try)

* touch

* include protobuf extensions in the distribution

* fix whitespace

* include protobuf example in the distribution

* example: create new pb obj every time

* document: use properly quoted json

* fix whitespace

* bump parent version to 0.10.1-SNAPSHOT

* ignore Override check

* touch
2017-05-30 13:11:58 -07:00
Jihoon Son 7889891bd3 Fix integration tests (#4337)
* Fix integration tests

1) Use the same version of kafka
2) Change ServiceEmitter from LazySingleton to ManageLifecycle

* Revert unnecessary change
2017-05-28 08:48:39 -07:00
Kamal Gurala dcb07d6958 Option to configure default analysis types in SegmentMetadataQuery (#4259)
* Option to configure default analysis types

* Updated Docs and renamed

* Added serde tests and Null handling

* Fixed Documentation

* Updated implementation

* Updated implementation

* Updated implementation

* Added usingDefaultIntervals in Builder

* Updated implementation

* Updated implementation and added failing test

* filterSegments implementation updated

* Updated implementation

* Padding

* Add missing Override

* Updated implementation

* Fixed a naming bug

* Fixed bug

* Removed comment
2017-05-26 12:12:39 -07:00
dgolitsyn 515fabce96 Server selector improvement (#4315)
* Do not re-create prioritized servers on each call in server selector and extend TierSelectorStrategy interface with a method to pick multiple elements at once

* Fix compilation
2017-05-26 11:02:09 -05:00
Gian Merlino 1eaa7887bd Fix integer overflow in BufferGrouper. (#4333)
Would have led to out of bounds buffer access with large buffers.
Also added tests using large buffers.
2017-05-25 23:30:20 -07:00
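The commit above does not show the arithmetic involved, so the following is only a generic sketch of how an `int` offset computation can overflow with large buffers, not the actual BufferGrouper code. The class `OffsetOverflow`, its methods, and the bucket numbers are hypothetical and chosen for illustration.

```java
// Generic illustration (hypothetical names, not Druid code): computing a byte
// offset as bucket * bucketSize in int arithmetic wraps around once the
// product exceeds Integer.MAX_VALUE.
class OffsetOverflow {
    // Overflow-prone: the multiplication is done in 32-bit arithmetic.
    static int offsetInt(int bucket, int bucketSize) {
        return bucket * bucketSize;
    }

    // Widening one operand to long first keeps the full product.
    static long offsetLong(int bucket, int bucketSize) {
        return (long) bucket * bucketSize;
    }

    // Math.multiplyExact throws ArithmeticException instead of wrapping.
    static int offsetChecked(int bucket, int bucketSize) {
        return Math.multiplyExact(bucket, bucketSize);
    }

    public static void main(String[] args) {
        int bucket = 50_000_000;
        int bucketSize = 64;
        System.out.println(offsetInt(bucket, bucketSize));  // negative: wrapped around
        System.out.println(offsetLong(bucket, bucketSize)); // 3200000000
    }
}
```

A wrapped, negative offset is what would turn into an out-of-bounds buffer access of the kind the commit message describes.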
zwang180 2c55a935f8 Delete a duplicate "Bucket Extraction Function" section at the bottom of "Querying"-"DimensionSpec" page (#4331) 2017-05-25 14:16:00 -07:00
Jihoon Son 11b7b1bea6 Add support for HttpFirehose (#4297)
* Add support for HttpFirehose

* Fix document

* Add documents
2017-05-25 16:13:04 -05:00
chaoqiang 5fc4abcf71 fix equalDistribution worker select strategy (#4318)
* fix equalDistribution worker select strategy

* replace anonymous Comparator

* keep previous version sorting comment

* fix code style

* update comment

* move JsonProperty
2017-05-25 13:30:42 +09:00
Gian Merlino fe42db98ac URIExtractionNamespace: Avoid problems due to canonicalization of lookup fields. (#4307)
Disables canonicalization for simpleJson, where we expect field names to be unique
anyway. Keeps canonicalization enabled for customJson, but avoids sharing the
table with the global ObjectMapper.
2017-05-24 17:41:04 -07:00
Goh Wei Xiang b77fab8a30 Replace usages of CountingMap with Object2LongMap (#4320)
* Replaces use of CountingMap with Object2LongMap from fastutil.

* Remove CountingMap classes and minor fixes

* Added additional test cases for DatasourceInputFormat.

* Added additional test cases for CoordinatorStats.

* Not materializing segment list.

* Put in this fix because it is failing the test on its expected behavior.

* Added missing header.
2017-05-24 17:40:32 -07:00
Jihoon Son b578adacae Improve concurrency of SegmentManager (#4298)
* Improve concurrency of SegmentManager

* Fix SegmentManager and add more tests

* Add more tests

* Add null check to TimelineEntry

* Remove empty data source and check null in getTimeline()

* Add a comment for returning null in compute()

* Make SegmentManager LazySingleton
2017-05-24 04:41:24 +09:00
Roman Leventov f97c49ba0e Don't use QueryMetrics from multiple threads in DirectDruidClient (fixes #4308) (#4309)
* Don't use QueryMetrics from multiple threads in DirectDruidClient

* reponseMetrics
2017-05-23 10:07:27 -07:00
Gian Merlino 2bd4c0930f Fix "quarter" granularity serialization. (#4316) 2017-05-23 10:06:17 -07:00
Gian Merlino 9283807ad7 GroupByQuery: Fix type-spanning comparisons. (#4317)
Jackson deserializes integers sometimes as int and sometimes as long,
depending on how big they are. This leads to ClassCastException
when comparing deserialized values as part of groupBy merging on the
broker.
2017-05-23 10:06:04 -07:00
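The failure mode described in the commit above can be reproduced without Jackson by boxing one value as `Integer` and another as `Long`, which is what the message says deserialization produces for small and large integers. This is a minimal sketch under that assumption; the class `TypeSpanningCompare` and both methods are hypothetical, and `numericCompare` is one possible workaround, not necessarily the fix applied in the commit.

```java
// Hypothetical illustration of a type-spanning comparison, not Druid code.
class TypeSpanningCompare {
    // Naive comparison through raw Comparable: Integer.compareTo ends up
    // receiving a Long and fails with ClassCastException at runtime.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static int naiveCompare(Object a, Object b) {
        return ((Comparable) a).compareTo(b);
    }

    // Possible workaround: widen both values to long before comparing.
    // Assumes both arguments are integral Number instances.
    static int numericCompare(Object a, Object b) {
        return Long.compare(((Number) a).longValue(), ((Number) b).longValue());
    }

    public static void main(String[] args) {
        Object small = Integer.valueOf(5);        // stands in for a value deserialized as int
        Object large = Long.valueOf(5_000_000_000L); // stands in for a value deserialized as long
        try {
            naiveCompare(small, large);
        } catch (ClassCastException e) {
            System.out.println("naive comparison failed with ClassCastException");
        }
        System.out.println(numericCompare(small, large) < 0); // true
    }
}
```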
李成露(StefanLee) 22977780aa Doc (#4217)
* Fixed (#4216)

Modify the default value of  `druid.server.http.numThreads`  to `Math.max(10, (Runtime.getRuntime().availableProcessors() * 17) / 16 + 2) + 30`

* Fixed(#4216)

Modify the default value of  `druid.server.http.numThreads` to `max(10, (Number of cores * 17) / 16 + 2) + 30`

* Fixed(#4216)

Modify the default value of  `druid.server.http.numThreads`  to `max(10, (Number of cores * 17) / 16 + 2) + 30`
2017-05-23 17:04:52 +09:00
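The default for `druid.server.http.numThreads` quoted in the commit above can be evaluated for a given core count with a minimal sketch like the one below. Only the expression comes from the commit message; the class `NumThreadsDefault` and the method `defaultNumThreads` are hypothetical names, and integer division is assumed, as in the Java form of the expression.

```java
// Hypothetical helper that evaluates the documented default:
// max(10, (cores * 17) / 16 + 2) + 30, using integer division.
class NumThreadsDefault {
    static int defaultNumThreads(int cores) {
        return Math.max(10, (cores * 17) / 16 + 2) + 30;
    }

    public static void main(String[] args) {
        System.out.println(defaultNumThreads(1));  // 40
        System.out.println(defaultNumThreads(8));  // 40
        System.out.println(defaultNumThreads(16)); // 49
        System.out.println(defaultNumThreads(32)); // 66
        // The value for the current machine:
        System.out.println(defaultNumThreads(Runtime.getRuntime().availableProcessors()));
    }
}
```

With this expression the floor of 10 dominates up to 8 cores, so small machines all get 40 threads.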
Jonathan Wei d49e53e6c2 Timeout and maxScatterGatherBytes handling for queries run by Druid SQL (#4305)
* Timeout and maxScatterGatherBytes handling for queries run by Druid SQL

* Address PR comments

* Fix contexts in CalciteQueryTest

* Fix contexts in QuantileSqlAggregatorTest
2017-05-23 16:57:51 +09:00
Gian Merlino 22e5f52d00 Workaround for non-thread-safe use of CardinalityAggregator. (#4304) 2017-05-23 10:33:03 +09:00
Jonathan Wei e043bf88ec Add a ServerType for peons (#4295)
* Add a ServerType for peons

* Add toString() method, toString() test, unsupported type check

* Use ServerType enum in DruidServer and DruidServerMetadata
2017-05-22 17:24:59 -05:00
Roman Leventov 8ec3a29af0 Don't pass QueryMetrics down in concurrent and async QueryRunners (fixes #4279) (#4288)
* Don't pass QueryMetrics down in concurrent and async QueryRunners

* Rename QueryPlus.threadSafe() to withoutThreadUnsafeState(); Update QueryPlus.withQueryMetrics() Javadocs; Fix generics in MetricsEmittingQueryRunner and CpuTimeMetricQueryRunner; Make DefaultQueryMetrics to fail fast on modifications from concurrent threads
2017-05-22 13:42:09 -05:00
Jihoon Son 000b0ffed7 Increase the max heap size for strict compilation (#4306) 2017-05-21 03:42:44 +09:00
Gian Merlino adeecc0e72 Add /isLeader call to overlord and coordinator. (#4282)
This is useful for putting them behind load balancers or proxies, as it lets
the load balancer know which server is currently active through an http health
check.

Also makes the method naming a little more consistent between coordinator and
overlord code.
2017-05-18 20:46:13 -05:00
Roman Leventov 7479cbde68 Make CacheScheduler a singleton (#4293) 2017-05-18 15:46:02 -07:00
Benedict Jin cdd521fb23 Update outdated RLE paper and improve some code refactoring (#4286)
* Update outdated RLE paper and improve some code refactoring

* Roll back CONCISE's abbreviation
2017-05-18 12:26:24 -07:00
Gian Merlino 8ca7f9410e SQL: Add test for concurrent JDBC queries. (#4290) 2017-05-18 12:25:15 -07:00
Jihoon Son 5c0a7ad2f8 Make realtimes available for loading segments (#4148)
* Add ServerType

* Add realtimes to DruidCluster

* fix test fails

* Add SegmentManager

* Fix equals and hashCode of ServerHolder

* Address comments and add more tests

* Address comments
2017-05-18 10:03:39 -05:00
Jihoon Son 733dfc9b30 Add PrefetchableTextFilesFirehoseFactory for cloud storage types (#4193)
* Add PrefetcheableTextFilesFirehoseFactory

* fix comment

* exception handling

* Fix wrong json property

* Remove ReplayableFirehoseFactory and fix misspelling

* Defer object initialization

* Add a temporaryDirectory parameter to FirehoseFactory.connect()

* fix when cache and fetch are disabled

* Address comments

* Add more test

* Increase timeout for test

* Add wrapObjectStream

* Move methods to Firehose from PrefetchableFirehoseFactory

* Cleanup comment

* add directory listing to s3 firehose

* Rename a variable

* Addressing comments

* Update document

* Support disabling prefetch

* Fix race condition

* Add fetchLock

* Remove ReplayableFirehoseFactoryTest

* Fix compilation error

* Fix test failure

* Address comments

* Add default implementation for new method
2017-05-18 15:37:18 +09:00
Maksim Logvinenko d45dad2b44 Remove boxing/unboxing in indexer (#4269)
* Remove boxing/unboxing in indexer

* Fix rowIndex visibility

* Cleanup
2017-05-17 19:13:53 -05:00
Himanshu daa8ef8658 Optional long-polling based segment announcement via HTTP instead of Zookeeper (#3902)
* Optional long-polling based segment announcement via HTTP instead of Zookeeper

* address review comments

* make endpoint /druid-internal/v1 instead of /druid/internal so that jetty qos filters can be configured easily when needed

* update segment callback initialization to be called only after first segment list fetch has succeeded from all servers

* address review comments

* remove size check not required anymore as only segment servers announce themselves and not all peon processes

* announce segment server on historical only after cached segments are loaded

* fix checkstyle errors
2017-05-17 16:31:58 -05:00
Himanshu 0e056863e4 fix timeout check bug in DirectDruidClient (#4287) 2017-05-17 13:47:32 -07:00
Roman Leventov d9f423f55d Make QueryMetrics factories configurable (#4268)
* Ensure QueryMetrics factories accept Json ObjectMapper; Make QueryMetrics factories configurable

* Update QueryMetrics Javadocs

* Add javadocs to QueryMetrics factories

* Move queryMetricsFactory defaults to getter methods of config classes
2017-05-17 08:41:59 -07:00
Gian Merlino ddc2e68998 Remove cache keys from HavingSpecs. (#4280)
* Remove cache keys from HavingSpecs.

They weren't used, since they aren't part of the groupBy cache key.
Also, it's good that they weren't used, since many of them had
value truncation bugs.

* Fix imports.

* Fix test.
2017-05-16 22:13:02 -07:00
Gian Merlino 22f20f2207 IngestSegmentFirehoseTest: Add more tests for reindexing. (#4285)
* IngestSegmentFirehoseTest: Add more tests for reindexing.

* Nix unused imports.
2017-05-16 22:12:26 -07:00
Gian Merlino 51872fd310 Log max memory on startup too, in case Xmx and Xms are different. (#4283) 2017-05-16 20:06:34 -05:00