Commit Graph

7815 Commits

Author SHA1 Message Date
Himanshu de081c711b RealtimeIndexTask to support alertTimeout in context ()
* RealtimeIndexTask to support alertTimeout in context and raise alert if task process exists after the timeout

* move alertTimeout config to tuningConfig and document
2017-03-24 12:48:12 -07:00
Gian Merlino b4289c0004 Remove "granularity" from IngestSegmentFirehose. ()
It wasn't doing anything useful (the sequences were being concatted, and
cursor.getTime() wasn't being called) and it defaulted to Granularities.NONE.
Changing it to Granularities.ALL gave me a 700x+ performance boost on a
small dataset I was reindexing (2m27s to 365ms). Most of that was from avoiding
making a lot of unnecessary column selectors.
2017-03-24 10:28:54 -07:00
Benedict Jin 23f77ebd20 Explain Avro's unnecessary EOFException () ()
* Explain Avro's unnecessary EOFException ()

* add jira link into log message
2017-03-24 10:45:45 -05:00
Erik Dubbelboer 2cbc4764f8 Comparing dimensions to each other in a filter ()
Comparing dimensions to each other using a select filter
2017-03-23 18:23:46 -07:00
Roman Leventov 4b5ae31207 QueryMetrics: abstraction layer of query metrics emitting (part of ) ()
* QueryMetrics: abstraction layer of query metrics emitting

* Minor fixes

* QueryMetrics.emit() for bulk emit and improve Javadoc

* Fixes

* Fix

* Javadoc fixes

* Typo

* Use DefaultObjectMapper

* Add tests

* Address PR comments

* Remove QueryMetrics.userDimensions(); Rename QueryMetric.register() to report()

* Dedicated TopNQueryMetricsFactory, GroupByQueryMetricsFactory and TimeseriesQueryMetricsFactory

* Typo

* More elaborate Javadoc of QueryMetrics

* Formatting

* Replace QueryMetric enum with lambdas

* Add comments and VisibleForTesting annotations
2017-03-23 17:23:59 -07:00
Himanshu c9fc7d1709 fix failure message to mention version.bin instead of the "index.drd not exists" message () 2017-03-23 14:21:19 -07:00
Gian Merlino 4b9f975f50 Rename SketchAggregationWithSimpleDataTest. ()
Tests that don't end in "Test" won't get run automatically by Maven.
2017-03-23 14:20:50 -07:00
Jonathan Wei 79f1a1d7f0 Allow float parameters for Bound/Selector/In filters on long columns ()
* Allow float parameters for long filters

* Use BigDecimal intermediate form for string->long conversions

* PR comments

* PR comments
2017-03-23 14:18:05 -07:00
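
The "BigDecimal intermediate form" mentioned in the commit above can be illustrated with a minimal sketch. It is not the code from the commit: the class name `StringToLongSketch` and the helper `toExactLong` are hypothetical, and the choice to reject values with a fractional part is an assumption made for illustration. The sketch only shows why a decimal intermediate helps: `Long.parseLong("1.0")` throws, while `BigDecimal` parses the string and can then report whether it denotes a whole number.

```java
import java.math.BigDecimal;

// Hypothetical helper, not part of Druid: converts a filter parameter string
// to a long by going through BigDecimal first.
final class StringToLongSketch {
    // Returns the long value if the string denotes a whole number that fits
    // in a long; returns null for fractional, out-of-range, or non-numeric input.
    static Long toExactLong(String s) {
        try {
            // longValueExact() throws ArithmeticException when the value has a
            // nonzero fractional part or does not fit in a long.
            return new BigDecimal(s).longValueExact();
        } catch (NumberFormatException | ArithmeticException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(toExactLong("1.0")); // whole number written as a float
        System.out.println(toExactLong("1.5")); // fractional, so no exact long exists
    }
}
```

How a filter should treat a parameter such as "1.5" on a long column (match nothing, or round for bound comparisons) is a separate design decision that this sketch does not attempt to settle.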
Gian Merlino 81d6b49d69 Downgrade Curator. ()
Reverts , fixes , unfixes , . Better the devil you
know than the devil you don't, I always say.

See also https://issues.apache.org/jira/browse/CURATOR-394.
2017-03-23 13:44:00 -07:00
Akash Dwivedi ff7f90b02d relocate method in BufferAggregator. ()
* relocate method in BufferAggregator.

* Unused import.

* Detailed javadoc.

* using Int2ObjectMap.

* batch relocate.

* Revert batch relocate.

* Unused import.

* code comments.

* code comment.
2017-03-23 13:07:59 -07:00
David Lim f68ba4128f Exclude pagingIdentifiers that don't apply to a datasource ()
* exclude pagingIdentifiers that don't apply to a datasource to support union datasources

* code review changes

* code review changes
2017-03-22 12:32:27 -07:00
Gian Merlino 1f48198607 Fix some query cache key collisions. ()
The query caches generally store dimensions and aggregators positionally, so
appendCacheablesIgnoringOrder could lead to incorrect results being pulled
from the cache.
2017-03-22 11:08:48 -07:00
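
The collision described in the commit above can be sketched in a few lines. This is an illustration under stated assumptions, not Druid's cache-key code: the class `CacheKeyOrderSketch` and both key functions are hypothetical, and a comma-joined string stands in for the real byte-array key. The point is only that a key built while ignoring order is identical for two queries whose positionally stored results differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration, not Druid code.
final class CacheKeyOrderSketch {
    // Order-insensitive key: the names are sorted before joining, so
    // [country, city] and [city, country] produce the same key.
    static String keyIgnoringOrder(List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        Collections.sort(sorted);
        return String.join(",", sorted);
    }

    // Order-preserving key: different orderings produce different keys.
    // (A real key would also need to escape the delimiter.)
    static String keyPreservingOrder(List<String> names) {
        return String.join(",", names);
    }

    public static void main(String[] args) {
        List<String> q1 = Arrays.asList("country", "city");
        List<String> q2 = Arrays.asList("city", "country");
        // Same key for both queries: a cached row stored positionally for q1
        // would be read back with its columns swapped for q2.
        System.out.println(keyIgnoringOrder(q1).equals(keyIgnoringOrder(q2)));
        System.out.println(keyPreservingOrder(q1).equals(keyPreservingOrder(q2)));
    }
}
```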
Gian Merlino 77b6213222 Remove unused Filters.getLongValueMatcher method. () 2017-03-21 13:46:07 -06:00
Gian Merlino 4f7f3e31cb CONTRIBUTING update for the github squash button. ()
Some changes to the contributing guidelines to make pull requests
easier to review.
2017-03-21 10:06:11 -07:00
Karol Woźniak 8510a52e02 scan-query: Use long as limit. ()
* scan-query: Use long instead of int as limit type

* Use MAX_INSTANT queryTimeout, if timeout == 0
2017-03-20 14:19:35 -07:00
Gian Merlino 64248d31b6 SQL: Groundwork for views. ()
* SQL: Groundwork for views.

They are not actually exposed to users at this point, but enough is there
to have some test cases in CalciteQueryTest.

* Remove unused imports.

* Fix injection problem.
2017-03-20 11:53:11 -07:00
Gian Merlino ad477cb454 Fix topNs with extractionFns but no aggregators. ()
The result sets were empty because of an aggs.length > 0 check. I'm not
sure if it was there for any good reason, but there didn't seem to be one.
2017-03-20 11:31:30 -07:00
Zhihui Jiao 6febcd9f24 Fix IngestSegmentFirehoseFactory () 2017-03-17 16:57:25 -06:00
Roman Leventov 84fe91ba0b Monomorphic processing of TopN queries with 1 and 2 aggregators (key part of ) ()
* Monomorphic processing: add HotLoopCallee, CalledFromHotLoop, RuntimeShapeInspector, SpecializationService. Specialize topN queries with 1 or 2 aggregators. Add Cursor.advanceUninterruptibly() and isDoneOrInterrupted() for exception-free query processing.

* Use Execs.singleThreaded()

* RuntimeShapeInspector to support nullable fields

* Make CalledFromHotLoop annotation Inherited

* Remove unnecessary conversion of array of ColumnSelectorPluses to list and back to array in CardinalityAggregatorFactory

* Close InputStream in SpecializationService

* Formatting

* Test specialized PooledTopNScanners

* Set flags in PooledTopNAlgorithm directly

* Fix tests, dependent on CountAggregatorFactory toString() form

* Fix

* Revert CountAggregatorFactory changes

* Implement inspectRuntimeShape() for LongWrappingDimensionSelector and FloatWrappingDimensionSelector

* Remove duplicate RoaringBitmap dependency in the extendedset pom.xml

* Fix

* Treat ByteBuffers specially in StringRuntimeShape

* Doc fix

* Annotate BufferAggregator.init() with CalledFromHotLoop

* Make triggerSpecializationIterationsThreshold an int

* Remove SpecializationService.PerPrototypeClassState.of()

* Add comments

* Limit the number of specializations that SpecializationService could make

* Add default implementation for BufferAggregator.inspectRuntimeShape(), for compatibility with extensions

* Use more efficient ConcurrentMap's idioms in SpecializationService
2017-03-17 14:44:36 -05:00
Gian Merlino 3ec1877887 Fix BucketExtractionFn on objects that are strings. () 2017-03-16 22:59:11 -07:00
Gian Merlino 403fbae7b1 SQL: Better error handling for HTTP API. ()
* SQL: Better error handling for HTTP API.

* Fix test.
2017-03-15 14:18:00 -04:00
Gian Merlino db15d494ca Update docs for query filter HavingSpecs. () 2017-03-15 13:59:09 -04:00
Gian Merlino 9cd666282c Update Curator to 2.12.0. ()
Fixes , .
2017-03-15 09:38:31 -07:00
hzy001 c4f44c0590 Update the docs ()
Signed-off-by: Hao Ziyu <haoziyu@qiyi.com>
2017-03-15 10:32:29 -04:00
Charles Allen 805d85afda Allow compilation as Java8 source and target ()
* Allow compilation as Java8 source and target for everything except API

* Remove conditions in tests which assume that we may run with Java 7

* Update easymock to 3.4

* Make Animal Sniffer check Java 1.8 usage; remove redundant druid-caffeine-cache configuration

* Use try-with-resources in LargeColumnSupportedComplexColumnSerializerTest.testSanity()

* Remove java7 special for druid-api
2017-03-14 22:23:47 -06:00
Gian Merlino e5c0dab12c groupBy v2: Better error message when resources are exhausted. ()
* groupBy v2: Better error message when resources are exhausted.

Fixes .

* Fix tests.
2017-03-15 00:37:49 +05:30
Gian Merlino 3216134f8c SQL: Make row extractions extensible and add one for lookups. ()
This is a reopening of , since that PR was merged to master prematurely
and accidentally.
2017-03-13 21:56:16 -07:00
Gian Merlino bad250fe6d SQL: Support for coercing to DECIMAL. ()
Useful for running queries that involve math of ints and floats, which
Calcite types as decimal.
2017-03-13 16:29:23 -07:00
Jihoon Son dfe4bda7fd add doc () 2017-03-10 12:49:20 -08:00
Gian Merlino cab2e2f5d5 Add docs about filtering and indexes on numeric columns. () 2017-03-10 12:48:59 -08:00
Nishant Bangarwa adbe89e7d6 Fix race in KafkaIndexTaskTest ()
task.pause(0) can return early before the task is actually paused.
Exception for failure -
java.lang.AssertionError: expected:<PAUSED> but was:<READING>
	at org.junit.Assert.fail(Assert.java:88)
	at org.junit.Assert.failNotEquals(Assert.java:743)
	at org.junit.Assert.assertEquals(Assert.java:118)
	at org.junit.Assert.assertEquals(Assert.java:144)
	at io.druid.indexing.kafka.KafkaIndexTaskTest.testRunWithOffsetOutOfRangeExceptionAndPause(KafkaIndexTaskTest.java:1229)

To reproduce, add Thread.sleep(10000) at the beginning of the
KafkaIndexTask.possiblypause method.
2017-03-09 07:34:46 -08:00
Gian Merlino a5170666b6 groupBy v2: Always merge queries. ()
This fixes  because it means the timestamp will always be included for outermost
queries. Historicals receiving queries from older brokers will think they're
outermost (because CTX_KEY_OUTERMOST isn't set to "false"), so they'll include a
timestamp, so the older brokers will be OK.
2017-03-08 12:47:46 -06:00
Parag Jain c155d9a5e9 increase kill timeout () 2017-03-08 09:00:34 -08:00
Gian Merlino 960769c583 SQL: Fix example INFORMATION_SCHEMA query. () 2017-03-06 16:07:47 -08:00
Gian Merlino 4ca5270e88 Ignore chunkPeriod for groupBy v2, fix chunkPeriod for irregular periods. ()
* Ignore chunkPeriod for groupBy v2, fix chunkPeriod for irregular periods.

Includes two fixes:
- groupBy v2 now ignores chunkPeriod, since it wouldn't have helped anyway (its mergeResults
returns a lazy sequence) and it generates incorrect results.
- Fix chunkPeriod handling for periods of irregular length, like "P1M" or "P1Y".

Also includes doc and test fixes:
- groupBy v1 was no longer being tested by GroupByQueryRunnerTest since , now it
  is once again.
- chunkPeriod documentation was misleading due to its checkered past. Updated it to
  be more accurate.

* Remove unused import.

* Restore buffer size.
2017-03-06 12:27:02 -06:00
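
The "periods of irregular length" point in the commit above can be made concrete with a short sketch. It uses `java.time` from the standard library rather than the Joda-Time types Druid used at the time, purely to stay self-contained; the class `IrregularPeriodSketch` and the helper `daysCovered` are hypothetical. It shows that a period such as "P1M" or "P1Y" has no fixed length, so any chunking logic that treats it as a constant number of milliseconds will drift.

```java
import java.time.LocalDate;
import java.time.Period;
import java.time.temporal.ChronoUnit;

// Hypothetical illustration, not Druid code.
final class IrregularPeriodSketch {
    // Number of calendar days covered by applying `period` once from `start`.
    static long daysCovered(LocalDate start, Period period) {
        return ChronoUnit.DAYS.between(start, start.plus(period));
    }

    public static void main(String[] args) {
        Period oneMonth = Period.parse("P1M");
        Period oneYear = Period.parse("P1Y");
        // The same period covers a different number of days depending on
        // where it starts.
        System.out.println(daysCovered(LocalDate.of(2017, 1, 1), oneMonth));
        System.out.println(daysCovered(LocalDate.of(2017, 2, 1), oneMonth));
        System.out.println(daysCovered(LocalDate.of(2016, 1, 1), oneYear));
        System.out.println(daysCovered(LocalDate.of(2017, 1, 1), oneYear));
    }
}
```

Stepping chunk boundaries by adding the period to the previous boundary, rather than by adding a precomputed duration, keeps the boundaries aligned to calendar months and years.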
Gian Merlino 7b9e6c29cd Fix float, long dimension indexer object selectors. ()
Their "convertUnsortedEncodedKeyComponentToActualArrayOrList" methods didn't respect the contract,
which says they should return single values (not array/list) if there is only a single value
to return. This affects the behavior of ObjectColumnSelectors on realtime segments.
2017-03-06 10:01:30 -08:00
kaijianding 19ac1c7c2c Add SameIntervalMergeTask for easier usage of MergeTask ()
* Add SameIntervalMergeTask for easier usage of MergeTask

* fix a bug and add unit test

* remove same_interval_merge_sub from Task.java and remove other unneeded code
2017-03-06 11:21:32 -06:00
Akash Dwivedi bebf9f34c7 HdfsDataSegmentPusher bug fix ()
* Fix for HdfsDataSegmentPusher.

* Add missing loadspec in actual descriptor file. Tests to check actual content of descriptor file.
2017-03-06 00:53:44 -08:00
Gian Merlino df623ebfe3 Fix a couple bugs due to calling Period.getMillis(). () 2017-03-05 18:44:20 +05:30
Gian Merlino 337f3870d8 Fix TimeFormatExtractionFn getCacheKey when tz, locale are not provided. ()
* Fix TimeFormatExtractionFn getCacheKey when tz, locale are not provided.

* Remove unused import.

* Use defaults in cache key.
2017-03-04 17:41:59 -08:00
Gian Merlino af5a4cce3c SQL: Clarify approximate distinct count behavior. () 2017-03-03 13:42:30 -08:00
praveev 67d0ae3271 Let toDateTime call fall through for Duration Granularity ()
* Let toDateTime call fall through for Duration Granularity

Added test for the same.

* Add duration granularity test to GroupByQueryRunnerTest
2017-03-03 13:27:22 -06:00
Himanshu e7e3c2dc5a support singleThreaded flag for groupBy-v2 as well () 2017-03-03 23:43:06 +05:30
Gian Merlino 4a56d7d8a0 SQL: Ability to generate exact distinct count queries. () 2017-03-03 23:40:36 +05:30
Gian Merlino 3e8dbd59f8 Fix groupBy docs to reflect that 'v2' is default. () 2017-03-02 15:13:39 -08:00
Roman Leventov 81a5f9851f TmpFileIOPeons to create files under the merging output directory, instead of java.io.tmpdir ()
* In IndexMerger and IndexMergerV9, create temporary files under the output directory/tmpPeonFiles, instead of java.io.tmpdir

* Use FileUtils.forceMkdir() across the codebase and remove some unused code

* Fix test

* Fix PullDependencies.run()

* Unused import
2017-03-02 14:05:12 -08:00
Roman Leventov ea1f5b7954 LifecycleLock for better synchronization in lifecycled objects ()
* Introduce LifecycleLock

* Add LifecycleLockTest

* Rename LifecycleLock.release() to exitStart()

* Rewrite LifecycleLock using AbstractQueuedSynchronizer for more safety, added tests

* Add LifecycleLock.exitStop() and reset()

* Add LifecycleLock.awaitStarted(timeout)

* Braces

* Fix
2017-03-02 12:22:57 -08:00
Gian Merlino e63eefd7ff Revert "SQL: Make row extractions extensible and add one for lookups. ()"
The PR was merged to master accidentally.

This reverts commit 23927a3c96.
2017-03-01 17:06:12 -08:00
Jonathan Wei 5fb1638534 Add default configuration for select query 'fromNext' parameter ()
* Add default configuration for select query 'fromNext' parameter

* PR comments

* Fix PagingSpec config injection

* Injection fix for test
2017-03-01 17:05:35 -08:00
Gian Merlino 23927a3c96 SQL: Make row extractions extensible and add one for lookups. ()
* SQL: Make row extractions extensible and add one for lookups.

* Fix QuantileSqlAggregatorTest.
2017-03-01 17:03:43 -08:00