Commit Graph

8231 Commits

Author SHA1 Message Date
Gian Merlino 5ff8c52f16 SQL: Fix race with metadata caching. (#4674)
If DruidSchema started too long after the BrokerServerView, its
initialization callback would never get called, and it would sit
there not knowing about any tables.

This moves the registration of the callback into the constructor,
where it belongs.
2017-08-10 18:27:10 -07:00
Roman Leventov bf28d0775b Remove QueryRunner.run(Query, responseContext) and related legacy methods (#4482)
* Remove QueryRunner.run(Query, responseContext) and related legacy methods

* Remove local var
2017-08-11 09:12:38 +09:00
Yuewen Wang c821bc9a5a Implement "earlyMessageRejectionPeriod" config discussed in issue #4599 (#4607)
* Implement "earlyMessageRejectionPeriod" config discussed in issue #4599
    * implement the logics of this param
    * Added doc of this config
    * Added unit tests of it

* Update KafkaSupervisor.java

ameliorate comment

* fix format

* fix bug when rebasing
2017-08-11 09:12:08 +09:00
Jihoon Son 65c1d6c797 Add IntGrouper to avoid unnecessary boxing/unboxing in array-based aggregation (#4668)
* Add IntGrouper

* Fix build

* Address comments

* Add a benchmark query
2017-08-10 07:41:39 -07:00
solimant de9ba97d54 Move equals() from Float[Sum|Min|Max]AggregatorFactory to SimpleFloat... (#4675)
Addresses #4671
2017-08-10 07:22:22 -07:00
Gian Merlino 7c89e12ca9 Replace Guava Enum.getIfPresent with builtin version. (#4659)
* Replace Guava Enum.getIfPresent with builtin version.

This is useful for running in Hadoop environments that use Guava 11. Some
code is also simplified.

* Code review
2017-08-09 17:20:00 -07:00
Jihoon Son fe3421032b Parallel sort for ConcurrentGrouper (#4660)
* Multi-thread sort

* Address comments
2017-08-09 16:24:36 -07:00
Peter Cunningham ede7cf9eef Added support for where clauses to JDBC lookups. (#4643)
* Added support for where clauses to filter lookup values on ingestion.

Added a filter field to the JDBC lookups that is used to generate a
where clause so that only rows matching the filter value will be
brought into Druid. Example being filter="SOMECOLUMN=1"

* Required changes based on code review.

* Required changes based on code review.

* Added support for where clauses to filter lookup values on ingestion.

Added a filter field to the JDBC lookups that is used to generate a
where clause so that only rows matching the filter value will be
brought into Druid. Example being filter="SOMECOLUMN=1"

* Updates based on code review, mainly formatting and small refactor of
the buildLookupQuery method.

* Fixed broken buildLookupQuery method

* Removed empty line.

* Updates per review comments
2017-08-09 10:47:46 -07:00
Goh Wei Xiang 42569e65e2 Minor fix in ExpressionSelectors to avoid potential NPE. (#4669) 2017-08-09 10:13:31 -07:00
Roman Leventov 7454fd86a0 Polymorphic numeric getters for ColumnValueSelector (#4623)
* Add methods getFloat(), getDouble() and getLong() to ColumnValueSelector

* Fix copy-paste mistake in docs

* Spelling
2017-08-08 18:38:06 -07:00
Jihoon Son d5606bc558 Passing lockTimeout as a parameter for TaskLockbox.lock() (#4549)
* Passing lockTimeout as a parameter for TaskLockbox.lock()

* Remove TIME_UNIT

* Fix tc fail

* Add taskLockTimeout to TaskContext

* Add caution
2017-08-08 18:21:07 -07:00
Roman Leventov f5d4171459 Prohibit for loops which could be foreach with IntelliJ (#4653)
* Replace for with foreach

* Replace for with for-each in GroupByQueryEngineV2

* Remove io.druid.collections.IntList
2017-08-08 18:05:33 -07:00
QiuMM f18cc5df97 Redis cache extension (#4615)
* Redis cache extension

* Fix some trival and optimize code

* Add Override annotation in RedisCacheTest
2017-08-08 10:11:45 -07:00
dgolitsyn 4dd1e2b59e Do not remove segments from currentlyMovingSegments in DruidBalancer if move is impossible or not needed (#4472)
* Do not remove segment that should not be moved from currentlyMovingSegments (segments are removed by callbacks or not inserted)

* Replace putIfAbsent with computeIfAbsent in DruidBalancer

* Refactoring
2017-08-08 17:22:59 +03:00
Charles Allen bbe7fb8c46 Better logging for S3DataSegmentPuller `getVersion` (#4657)
* Eventual consistency of S3 means a `404` can be thrown. It helps to know the URI that was attempted.
2017-08-08 16:21:22 +03:00
Roman Leventov 59a2507268 Fix spacing in KafkaEmitterConfig (#4663) 2017-08-08 15:12:01 +03:00
Yuewen Wang f8dcb05fd1 Fix a NPE using kafka emitter extension (#4608)
* Fix a NPE using kafka emitter extension

* fix format

* Add @Nullable annotation on kafkaProducerConfig
2017-08-08 12:21:24 +09:00
Roman Leventov 486b7a2347 TaskMaster deadlock fix (#4548)
* Stop RemoteTaskRunner's cleanupExec using TaskMaster's lifecycle, not global injected lifecycle

* Prohibit starting Lifecycle twice; Make Lifecycle to reject addMaybeStartHandler() attempts in the process of stopping rather than entering deadlock

* Fix Lifecycle.addMaybeStartHandler()

* Remove RemoteTaskRunnerFactoryTest

* Add docs

* Language

* Address comments

* Fix RemoteTaskRunnerTestUtils
2017-08-07 14:28:43 -07:00
Roman Leventov aa7e4ae5e4 Enforce correct spacing with Checkstyle (#4651) 2017-08-05 10:18:25 -07:00
Jonathan Wei aa8d75004c More informative QueryInterruptedException toString() (#4642)
* More informative QueryInterruptedException toString()

* Use StringUtils.format
2017-08-04 16:00:20 -07:00
Jonathan Wei 9650d80f3e Don't use limit push down with having spec (#4630)
* Don't use limit push down with having spec

* Throw exception when forcing limit push down with having

* Tests for having and limit push down

* Fix pool sizes in unit test
2017-08-04 15:13:29 -07:00
Roman Leventov 7a005088d9 Add HistoricalCursor.getReadableOffset() to access unwrapped offset in selectors (#4633)
* Add HistoricalCursor.getReadableOffset() to access unwrapped offset in selectors, when the 'main' offset if FilteredOffset (fixes #4628)

* Stack overflow test
2017-08-04 12:51:48 -07:00
tkyaw 8c8759da03 Append instead of create log file so that it is possible to logrotate. (#4644) 2017-08-03 14:29:15 -07:00
David Lim dd0b84e766 Fix bugs in RTR related to blacklisting, change default worker strategy (#4619)
* fix bugs in RTR related to blacklisting, change default worker strategy to equalDistribution

* code review and additional changes

* fix errorprone

* code review changes
2017-08-03 10:34:45 -07:00
Jihoon Son f3f2cd35e1 Array-based aggregation for groupBy query (#4576)
* Array-based aggregation

* Fix handling missing grouping key

* Handle invalid offset

* Fix compilation

* Add cardinality check

* Fix cardinality check

* Address comments

* Address comments

* Address comments

* Address comments

* Cleanup GroupByQueryEngineV2.process

* Change to Byte.SIZE

* Add flatMap
2017-08-03 20:04:54 +03:00
Himanshu 163b0edd79 Router misc fixes (#4517)
* make BrokerQueryResource instantiation singleton

* fix druid.router.http.* handling so that they are actually used and introduce numRequestsQueued for jetty http client at router

* address comments

* address review comment
2017-08-03 16:37:57 +09:00
Himanshu 6d60ef67ce misc http segment discovery fixes (#4618)
* Use ConcurrentHashMap to store segment servers or else getInventory() would need to clone the values list

* introduce unstableTimeout for segment servers

* address review comment

* add HttpServerInventoryViewConfigTest
2017-08-02 14:11:26 -07:00
Charles Allen 729e44d767 Update server-metrics to 0.5.2 (#4624) 2017-08-01 14:45:03 -07:00
Niketh Sabbineni cbb9f7d214 Log used flag (#4614) 2017-07-31 16:58:36 -05:00
Charles Allen 8921538251 Make AWSCredentialsConfig use PasswordProvider for the string matter (#4613)
* Make AWSCredentialsConfig use PasswordProvider for the string matter
* Fixes https://github.com/druid-io/druid/issues/3911

* Add unit tests
2017-07-29 15:48:49 -07:00
David Lim f4ba7a68ab fix missing column in caching documentation (#4612) 2017-07-28 13:53:04 -07:00
Niketh Sabbineni da43f68e95 NPE thrown when empty/null is passes to TimeDimExtractionFn (#4601)
* NPE thrown when empty/null is passes to TimeDimExtractionFn

* Add @Nullable where ever applicable

* Add @Nullable to SearchQuerySpec.apply()

* Remove unused
2017-07-26 21:02:08 -05:00
Roman Leventov 5929066dfb Add NamespaceLookupExtractorFactory.toString() (#4606) 2017-07-26 12:02:07 -07:00
Egor Riashin 2005c5532f ArchiveTask list-unused query optimization (#4600)
* ArchiveTask list unused query optimization

* ArchiveTask list unused query optimization
2017-07-26 20:51:20 +03:00
Roman Leventov 684cfbf889 Upgrade to server-metrics 0.5.0 (#4480)
* Upgrade to server-metrics 0.4.3

* Upgrade to 0.5.0

* Add CpuAcctDeltaMonitor description to docs
2017-07-26 08:56:00 -07:00
Akash Dwivedi c372d2ecc1 Default implementation for getDouble(). (#4595)
* Default implementation for getDouble().

* use getFloat for default implementation.

* addressed comment.

* new line.
2017-07-25 19:06:27 -05:00
Gian Merlino d4ef0f6d94 Improved SQL support for floats and doubles. (#4598)
* Improved SQL support for floats and doubles.

- Use Druid FLOAT for SQL FLOAT, and Druid DOUBLE for SQL DOUBLE, REAL,
  and DECIMAL.
- Use float* aggregators when appropriate.
- Add tests involving both float and double columns.
- Adjust documentation accordingly.

* CR comments.

* Fix braces.
2017-07-25 13:54:44 -07:00
Himanshu ae6780f62a rolling upgrade order change to bring coordinator and overlord together (#4281)
* rolling upgrade order change to bring coordinator and overlord together

* mentioned merged Coordinator-Overlord in upgrade order doc

* revert autoscaling doc change

* auto scaling doc fix
2017-07-25 12:54:12 -05:00
Yuusaku Taniguchi 525b5f2723 [Bugfix] return null for the null list in OrcStruct (#4590) (#4590) 2017-07-25 10:31:31 -07:00
Gian Merlino 3d6f409fc8 Fix groupBy on double dimensions. (#4596)
* Fix groupBy on double dimensions.

* Fix tests.

* Fix tests.

* Fix Scan tests.
2017-07-24 23:18:06 -07:00
Gian Merlino 5048ab3e96 Add metrics to the native queries underpinning SQL. (#4561)
* Add metrics to the native queries underpinning SQL.

This is done by factoring out the metrics and request log emitting
code from QueryResource into a new QueryLifecycle class. That class
is used by both QueryResource and the SQL DruidSchema and QueryMaker.

Also fixes a couple of bugs in QueryResource:

- RequestLogLine start time was set to `TimeUnit.NANOSECONDS.toMillis(startNs)`,
  which is incorrect since absolute nanos cannot be converted to millis.
- DruidMetrics.makeRequestMetrics was called with null `query` on
  unparseable queries, which led to spurious "Unable to log query"
  errors.

Partial fix for #4047.

* Code style

* Remove unused imports.

* Fix tests.

* Remove unused import.
2017-07-24 21:26:27 -07:00
Gian Merlino 8a4185897e Add filter tests for both floats and doubles. (#4597) 2017-07-24 17:02:02 -07:00
Gian Merlino a9c875d746 IndexTask: Use shared groupId when "appendToExisting" is on. (#4582)
This allows the tasks to run concurrently. Additionally, rework
the partition-determining code in a couple ways:

- Use a task-id based sequenceName so concurrently running append
  tasks do not clobber each others' segments.
- Make the list of shardSpecs empty when rollup is non-guaranteed, and
  let allocators handle the creation of incremental shardSpecs.
2017-07-24 09:57:23 -07:00
Atul Mohan 4bd0f174ba Changes for deduplication (#4581) 2017-07-24 11:12:23 -05:00
Roman Leventov 7408a7c4ed Refactor CachingClusteredClient.run() (#4489)
* Refactor CachingClusteredClient

* Comments

* Refactoring

* Readability fixes
2017-07-23 23:10:36 +09:00
Yuewen Wang b154ff0d4a [Bugfix] null string in map/reduce jvm opts (#4588)
fix a bug when the cluster don't configure mapreduce.map.java.opts or mapreduce.map.java.opts
2017-07-22 06:22:30 -07:00
Roman Leventov c0beb78ffd Enforce brace formatting with Checkstyle (#4564) 2017-07-21 10:26:59 -05:00
Gian Merlino 38b03f56b4 IndexTask: Raise default maxTotalRows. (#4579)
150k is very low, given that it should only be limited by disk space
rather than JVM heap.
2017-07-20 13:55:07 -07:00
Gian Merlino 2be7068f6e Fixes and improvements to SQL metadata caching. (#4551)
* Fixes and improvements to SQL metadata caching.

Also adds support for MultipleSpecificSegmentSpec to CachingClusteredClient.

SQL changes:
- Cache metadata on a per-segment level, in addition to per-dataSource, so
  we don't need to re-query all segments whenever a single new one appears.
  This should lower the load placed on the cluster by metadata queries.
- Fix race condition in DruidSchema that can cause us to miss metadata. It was
  possible to notice new segments, then issue a query, and have that query
  not actually hit those segments, and not notice that it didn't hit those segments.
  Then, the metadata from those segments would be ignored.
- Fix assumption in DruidSchema that all segments are immutable. Now, mutable
  segments are periodically re-queried.
- Fix inappropriate re-use of SchemaPlus. Now we create one for each planning
  cycle, rather than sharing one. It caches table objects, which we want to
  avoid, since it can cause stale metadata. We do the caching in DruidSchema
  so we don't need the SchemaPlus caching.

Server changes:
- Add a TimelineCallback to TimelineServerView, for callers that want to get updates
  when the timeline has been modified.
- Change CachingClusteredClient from a QueryRunner to a QuerySegmentWalker. This
  allows it to accept queries that are segment-descriptor-based rather than
  intervals-based. In particular it will now support MultipleSpecificSegmentSpec.

* Fix DruidSchema, and unused imports.

* Remove unused import.

* Fix SqlBenchmark.
2017-07-20 10:14:15 -07:00
Slim 71e7a4c054 Adding double colums supports (#4491)
* add double columns support

* Fix numbers and expected results in UTs

* adding float aggregators

* fix IT expected test results

* fix comments

* more fixes

* fix comp

* fix test

* refactor double and float aggregator factories

* fix

* fix UTs

* fix comments

* clean unused code

* fix more comments

* undo unnecessary changes

* fix null issue

* refactor TopNColumnSelectorStrategyFactory

* fix docs

* refactor NumericTopNColumnSelectorStrategy

* fix return

* fix comments

* handle the null case in DimesionIndexer

* more null fixing

* cosmetic changes
2017-07-20 10:14:14 +03:00