990 Commits

Author SHA1 Message Date
nishantmonu51 4dc0fdba8a consider mapped size in limit calculation & review comments 2014-12-03 23:47:30 +05:30
Charles Allen 529e7e0272 Merge pull request #927 from metamx/speedup-smile-bytes
Improve Smile serde performance by writing binary data as is
2014-12-03 10:02:08 -08:00
Charles Allen 0f5d5840da Merge pull request #924 from metamx/update-joda-time
Update Joda-Time and fix min/max instant overflow
2014-12-03 09:15:39 -08:00
nishantmonu51 da8bd7836b Introduce buffer size 2014-12-03 16:28:22 +05:30
Xavier Léauté 5fece517fa write byte data as is in smile 2014-12-03 00:01:01 -08:00
Xavier Léauté 18f50097a9 upgrade LZ4 to operate directly on ByteBuffers 2014-12-02 23:53:56 -08:00
fjy bc173d14fc a whole bunch of cleanup and fixes 2014-12-02 17:32:05 -08:00
Xavier Léauté a79389a9e5 update joda-time and fix min/max instant 2014-12-02 17:27:22 -08:00
nishantmonu51 b65933ffb8 make tests parameterised 2014-12-02 23:55:29 +05:30
nishantmonu51 6dc69c2f30 code cleanups & formatting 2014-12-02 22:44:33 +05:30
nishantmonu51 eac776f1a7 tests passing with on heap incremental index 2014-12-02 22:29:28 +05:30
Xavier Léauté 4eee7e69b9 fix cardinality aggregator caching 2014-11-26 15:00:37 -08:00
xvrl 5bc1be5ba0 Merge pull request #850 from metamx/druid-0.7.x-compressionstrategy
Compression strategy changes
2014-11-25 12:58:39 -08:00
Charles Allen c6043afa32 Removed empty function from CompressionStrategyTest 2014-11-25 12:57:06 -08:00
Charles Allen 6943db5251 Changed branching logic for LZFCompressor to return null only on error, and avoid checking in most circumstances 2014-11-25 12:53:11 -08:00
Charles Allen 9f945c2216 Removed lz4Fast from CompressedObjectStrategy for compression since it is not currently used 2014-11-24 16:11:03 -08:00
Charles Allen 70e3108282 Multiple speed improvements revolving around topN with HLL
Change serializer / deserializer for HyperLogLog
* Changed DirectDruidClient's InputStream handling. Is now ~10% faster for data heavy queries, and has lower variance in execution speed.
* Changed HLL Collector's toByteStream() method to be better optimized for small values. Is notably faster for small result quantities which fall into the sparse HLL bucket codepath.
  * No change for dense HLL which just uses a direct bytestream of the underlying byte data.

TopNNumericResultBuilder semi-aggressive loop unrolling for metricVals

Benchmark for HLL for sparse packing (small HLL bucket population):
HyperLogLogSerdeBenchmarkTest.benchmarkToByteBuffer[0]: [measured 100000 out of 100100 rounds, threads: 1 (sequential)]
 round: 0.00 [+- 0.00], round.block: 0.00 [+- 0.00], round.gc: 0.00 [+- 0.00], GC.calls: 216, GC.time: 0.42, time.total: 15.96, time.warmup: 0.22, time.bench: 15.74
HyperLogLogSerdeBenchmarkTest.benchmarkToByteBuffer[1]: [measured 100000 out of 100100 rounds, threads: 1 (sequential)]
 round: 0.00 [+- 0.00], round.block: 0.00 [+- 0.00], round.gc: 0.00 [+- 0.00], GC.calls: 217, GC.time: 0.45, time.total: 13.87, time.warmup: 0.02, time.bench: 13.85
HyperLogLogSerdeBenchmarkTest.benchmarkToByteBuffer[2]: [measured 100000 out of 100100 rounds, threads: 1 (sequential)]
 round: 0.00 [+- 0.00], round.block: 0.00 [+- 0.00], round.gc: 0.00 [+- 0.00], GC.calls: 55, GC.time: 0.16, time.total: 4.13, time.warmup: 0.00, time.bench: 4.12
HyperLogLogSerdeBenchmarkTest.benchmarkToByteBuffer[3]: [measured 100000 out of 100100 rounds, threads: 1 (sequential)]
 round: 0.00 [+- 0.00], round.block: 0.00 [+- 0.00], round.gc: 0.00 [+- 0.00], GC.calls: 55, GC.time: 0.16, time.total: 4.30, time.warmup: 0.00, time.bench: 4.30
HyperLogLogSerdeBenchmarkTest.benchmarkToByteBuffer[4]: [measured 100000 out of 100100 rounds, threads: 1 (sequential)]
 round: 0.00 [+- 0.00], round.block: 0.00 [+- 0.00], round.gc: 0.00 [+- 0.00], GC.calls: 8, GC.time: 0.03, time.total: 1.10, time.warmup: 0.00, time.bench: 1.09
HyperLogLogSerdeBenchmarkTest.benchmarkToByteBuffer[5]: [measured 100000 out of 100100 rounds, threads: 1 (sequential)]
 round: 0.00 [+- 0.00], round.block: 0.00 [+- 0.00], round.gc: 0.00 [+- 0.00], GC.calls: 8, GC.time: 0.03, time.total: 0.72, time.warmup: 0.00, time.bench: 0.72
HyperLogLogSerdeBenchmarkTest.benchmarkToByteBuffer[6]: [measured 100000 out of 100100 rounds, threads: 1 (sequential)]
 round: 0.00 [+- 0.00], round.block: 0.00 [+- 0.00], round.gc: 0.00 [+- 0.00], GC.calls: 1, GC.time: 0.00, time.total: 0.60, time.warmup: 0.00, time.bench: 0.60
HyperLogLogSerdeBenchmarkTest.benchmarkToByteBuffer[7]: [measured 100000 out of 100100 rounds, threads: 1 (sequential)]
 round: 0.00 [+- 0.00], round.block: 0.00 [+- 0.00], round.gc: 0.00 [+- 0.00], GC.calls: 2, GC.time: 0.01, time.total: 0.26, time.warmup: 0.00, time.bench: 0.25

Updates to HyperLogLogCollector toByteBuffer() based on code review

Removed changes from DirectDruidClient from this branch and put it in another branch.

Changed HyperLogLogCollector to have protected getters and setters

Remove unused ByteOrder from HyperLogLogCollector

Copyright header on HyperLogLogSerdeBenchmarkTest

Now with less ass!

Reformat in TopNNumericResultsBuilder. No code change

Removed unused import in HyperLogLogCollector

Replace AppendableByteArrayInputStream in DirectDruidClient
* Replace with SequenceInputStream fueled by an enumeration of ChannelBufferInputStream which directly wrap the response context ChannelBuffer

Modify TopNQueryQueryToolChest to use Arrays instead of Lists

Revert accidental changes to DirectDruidClient

They should be in another merge request:
https://github.com/metamx/druid/pull/893

Fixes from code review
* Extracting names from AggregatorFactory classes now done with TopNQueryQueryToolChest.extractFactoryName
* Renamed variable in TopNNumericResultBuilder
2014-11-24 16:02:00 -08:00
fjy 13cae41f6c Merge branch 'master' into refactor-examples 2014-11-24 11:00:26 -08:00
xvrl 9ced097abd Merge pull request #895 from metamx/fix-interval-retry
A set of fixes to retry the query for missing segments in the timeline
2014-11-24 10:23:02 -08:00
fjy c88aff6205 remove unneeded constant 2014-11-24 10:20:02 -08:00
fjy 9da66291e1 change naming to common config 2014-11-21 15:36:42 -08:00
Charles Allen 8f80d9e189 Update CompressedObjectStrategy to try-with-resources but print log error if error while closing 2014-11-21 11:35:11 -08:00
Charles Allen fc9a54ea48 Fix CompressedObjectStrategy LZFCompressor to ignore error on close of ResourceHolder 2014-11-21 10:49:43 -08:00
Charles Allen f8ce68565b Modified CompressedObjectStrategy to use 0xFF for Uncompressed 2014-11-21 10:33:53 -08:00
Charles Allen aa49e56ed6 Merge remote-tracking branch 'origin/master' into druid-0.7.x-compressionstrategy 2014-11-21 10:29:40 -08:00
fjy ef62bccdec ignore benchmark 2014-11-20 16:52:19 -08:00
nishantmonu51 e3260aa177 Filtered Aggregator fixes + enhancements
- fix NPE on IncrementIndex
- refactor code to support AND, OR filter
- tests for AND & OR filter
- handling for missing column / null values
2014-11-20 15:17:18 -08:00
fjy 47f5c1bd0a fix retry interval is stupid 2014-11-20 12:50:56 -08:00
fjy 3d9d989a9f A set of fixes to retry the query for missing intervals in the timeline 2014-11-20 12:04:37 -08:00
nishantmonu51 0ab34f86da Revert "fix filtered Aggregator"
This reverts commit 6fd37ce023.
2014-11-20 10:17:01 +05:30
nishantmonu51 6fd37ce023 fix filtered Aggregator
fix filtered Aggregator
remove unused name parameter for filtered aggregator
add tests
2014-11-20 09:29:26 +05:30
fjy a49e673122 put back another missing test 2014-11-19 16:55:20 -08:00
fjy 14668846aa add back some tests 2014-11-19 14:35:26 -08:00
fjy fdeab0c6af make Druid case sensitive 2014-11-19 14:27:31 -08:00
Fangjin Yang 590d31799e Merge pull request #876 from metamx/remove-backwards-compatible
Remove backwards compatible
2014-11-19 14:33:14 -07:00
Charles Allen 18f44beee9 CompressedObjectStrategy improvements
* Added more unit tests
* Now properly uses safe / fast decompressor for LZ4
* Now chooses fastest lz4 instance instead of only looking at Java implementations
* Encapsulate ResourceHolder in try-with-resources to make sure they close correctly
2014-11-19 11:10:59 -08:00
Charles Allen ccc757dc64 Merge remote-tracking branch 'origin/master' into druid-0.7.x-compressionstrategy 2014-11-19 09:39:35 -08:00
Charles Allen 1bbc8fcbe5 Allow Smile to fall back to text
* Modify SmileFactory to set the delegate to text option.
  * This option only occurs when a Reader type object is passed in to the deserialization stuff
  * This is needed by the X-Druid-Response-Context header return value, which is JSON
2014-11-18 15:16:14 -08:00
Charles Allen 42517f5d37 Merge pull request #884 from metamx/optimize-topN-pruning
optimise pruning of aggs
2014-11-18 14:19:30 -08:00
xvrl a96eaeb036 Merge pull request #882 from metamx/now_with_OPEN_SOURCE
Added src jar build to maven poms and re-formatted to conform to style guidelines.
2014-11-18 13:00:04 -08:00
nishantmonu51 6023d602e6 optimise pruning of aggs
optimise pruning of aggregators for topN
2014-11-19 00:17:25 +05:30
Charles Allen dc66e1708e Added src jar build to maven poms and re-formatted to conform to style guidelines. 2014-11-18 09:05:30 -08:00
Xavier Léauté d914afe1cd make defaultVersion configurable for non-jar testing 2014-11-17 13:54:32 -08:00
nishantmonu51 0c2d06475d merge from master 2014-11-17 19:19:18 +05:30
nishantmonu51 cbffe3c648 merge from master and resolve conflicts 2014-11-17 18:07:08 +05:30
fjy d5c4282766 fix broken ut 2014-11-14 13:45:42 -08:00
xvrl e1e171ad20 Merge pull request #865 from metamx/fix-retry-qr
Fix a couple of bugs with retry query runner after testing it locally
2014-11-14 13:33:51 -08:00
fjy df1ad95075 remove useless config 2014-11-14 13:32:19 -08:00
fjy d641d41f9e address another cr 2014-11-14 13:29:59 -08:00
fjy 7736c3fc27 address cr 2014-11-14 13:28:32 -08:00
Fangjin Yang 6ee8029462 Merge pull request #866 from metamx/mutableBitmapBenchmark
Add benchmarking for bitmaps
2014-11-14 14:16:21 -07:00
xvrl a4fc64ca3f Merge pull request #856 from metamx/druid-845
Fix query by segment
2014-11-14 13:10:54 -08:00
Charles Allen 4b7ab23289 Remove getIntervalString from BySegmentResultValue 2014-11-14 13:03:48 -08:00
fjy bbc079b880 fix retry to actually return correct sequences 2014-11-14 12:10:04 -08:00
Charles Allen 648759e9f6 Add deserialization benchmark to BitmapCreationBenchmark 2014-11-13 13:43:14 -08:00
Charles Allen 483b2c7be0 Add copyright notice to BitmapCreationBenchmark 2014-11-13 12:55:02 -08:00
Charles Allen 228fb0cf40 Add benchmarking for bitmaps
Here are the results on my laptop:

BitmapCreationBenchmark.testRandomAddition[0]: [measured 10 out of 20 rounds, threads: 1 (sequential)]
 round: 0.49 [+- 0.07], round.block: 0.00 [+- 0.00], round.gc: 0.00 [+- 0.00], GC.calls: 16, GC.time: 0.01, time.total: 9.91, time.warmup: 5.06, time.bench: 4.86
BitmapCreationBenchmark.testLinearAdditionDescending[0]: [measured 1000 out of 1010 rounds, threads: 1 (sequential)]
 round: 0.01 [+- 0.00], round.block: 0.00 [+- 0.00], round.gc: 0.00 [+- 0.00], GC.calls: 74, GC.time: 0.03, time.total: 5.82, time.warmup: 0.06, time.bench: 5.76
BitmapCreationBenchmark.testToImmutableByteArray[0]: [measured 1000 out of 1010 rounds, threads: 1 (sequential)]
 round: 0.00 [+- 0.00], round.block: 0.00 [+- 0.00], round.gc: 0.00 [+- 0.00], GC.calls: 0, GC.time: 0.00, time.total: 1.80, time.warmup: 0.02, time.bench: 1.78
BitmapCreationBenchmark.testRandomAddition[1]: [measured 10 out of 20 rounds, threads: 1 (sequential)]
 round: 0.00 [+- 0.00], round.block: 0.00 [+- 0.00], round.gc: 0.00 [+- 0.00], GC.calls: 0, GC.time: 0.00, time.total: 0.12, time.warmup: 0.08, time.bench: 0.04
BitmapCreationBenchmark.testLinearAdditionDescending[1]: [measured 1000 out of 1010 rounds, threads: 1 (sequential)]
 round: 0.00 [+- 0.00], round.block: 0.00 [+- 0.00], round.gc: 0.00 [+- 0.00], GC.calls: 10, GC.time: 0.01, time.total: 4.26, time.warmup: 0.04, time.bench: 4.22
BitmapCreationBenchmark.testToImmutableByteArray[1]: [measured 1000 out of 1010 rounds, threads: 1 (sequential)]
 round: 0.01 [+- 0.00], round.block: 0.00 [+- 0.00], round.gc: 0.00 [+- 0.00], GC.calls: 27, GC.time: 0.01, time.total: 5.11, time.warmup: 0.05, time.bench: 5.06
BitmapCreationBenchmark.testLinearAddition[0]: [measured 1000 out of 1010 rounds, threads: 1 (sequential)]
 round: 0.00 [+- 0.00], round.block: 0.00 [+- 0.00], round.gc: 0.00 [+- 0.00], GC.calls: 0, GC.time: 0.00, time.total: 3.48, time.warmup: 0.04, time.bench: 3.45
BitmapCreationBenchmark.testLinearAddition[1]: [measured 1000 out of 1010 rounds, threads: 1 (sequential)]
 round: 0.00 [+- 0.00], round.block: 0.00 [+- 0.00], round.gc: 0.00 [+- 0.00], GC.calls: 9, GC.time: 0.00, time.total: 2.95, time.warmup: 0.03, time.bench: 2.92
2014-11-13 12:47:23,995 INFO [main] io.druid.segment.data.BitmapCreationBenchmark - Entry [0] is io.druid.segment.data.ConciseBitmapSerdeFactory
2014-11-13 12:47:23,995 INFO [main] io.druid.segment.data.BitmapCreationBenchmark - Entry [1] is io.druid.segment.data.RoaringBitmapSerdeFactory
2014-11-13 12:48:04 -08:00
fjy 6cdd6a6af7 change default settings for retry 2014-11-13 12:43:30 -08:00
fjy 2b0ed30344 add test 2014-11-13 12:38:35 -08:00
fjy 0d6816a037 Fix a couple of bugs with retry query runner after testing it locally 2014-11-13 11:53:29 -08:00
Fangjin Yang 4a3c0fc5c4 Merge pull request #848 from metamx/druid-0.7.x-fastTopN-rebase
TopN performance improvements
2014-11-13 11:56:20 -07:00
Charles Allen 34c3464bc9 Added more explanatory comments in PooledTopNAlgorithm 2014-11-13 10:47:56 -08:00
Charles Allen 9a9238a801 Merge remote-tracking branch 'origin/master' into druid-845 2014-11-13 10:04:56 -08:00
Xavier Léauté 19a37f773f test for groupBy order 2014-11-12 22:48:16 -08:00
Charles Allen 31fed7d329 Fix query by segment
* Changed topN queries to use joda Interval instead of string values
* topN by segment now implements BySegmentResultValue<Result<TopNResultValue>> instead of BySegmentResultValue<TopNResultValue>
* Added a unit test which failed under the prior implementation.
2014-11-12 21:20:59 -08:00
Xavier Léauté 4ac1aaf90e Merge remote-tracking branch 'origin/master' into druid-0.7.x
Conflicts:
	processing/src/main/java/io/druid/segment/QueryableIndexStorageAdapter.java
2014-11-12 14:08:51 -08:00
Fangjin Yang 4b4f1c7d38 Merge pull request #846 from vikramakrishnan/cacheKeyFix
Include origin when creating the cacheKey for period Grans
2014-11-12 11:15:21 -07:00
Fangjin Yang 7ef19009db Merge pull request #834 from metamx/optimize-timestampchecking
skip timestamp checking if not required, remove duplicate code
2014-11-12 09:45:34 -07:00
Vikram Ramakrishnan 8151d14786 Include origin when creating the cacheKey for period Grans 2014-11-12 21:36:46 +05:30
Charles Allen 581e6830d5 Merge pull request #840 from metamx/powers-of-2-buffers
make buffer size a power of 2 and optimize buffer lookup
2014-11-11 19:54:00 -08:00
Xavier Léauté 60e98c35d7 workaround for annotations requiring class literals 2014-11-11 18:48:49 -08:00
Xavier Léauté b580269f6d Distinguish between default and legacy bitmaps 2014-11-11 18:35:45 -08:00
fjy bc5c56e441 fix default impl 2014-11-11 18:00:46 -08:00
fjy 63ca2375a3 remove dead code and cleanup some defaults 2014-11-11 17:57:24 -08:00
fjy 5629307548 address a few more comments on roaring 2014-11-11 17:50:03 -08:00
Charles Allen a89b539b4f Merge pull request #823 from metamx/roaring
Configurable bitmap indexes: roaring and concise
2014-11-11 17:26:38 -08:00
Xavier Léauté 007e57f876 make buffer size a power of 2 and optimize 2014-11-11 16:24:37 -08:00
fjy 1cc162727b address code review 2014-11-11 14:05:37 -08:00
Xavier Léauté 3f5449d40a loop unrolling provides no benefit for timeseries 2014-11-11 10:58:37 -08:00
Xavier Léauté e817db8b6c unroll timeseries aggregations + naming 2014-11-11 10:09:37 -08:00
fjy e6b7b03b5b fix conversion 2014-11-10 17:13:28 -08:00
fjy eb457c280e revert change 2014-11-10 17:00:53 -08:00
fjy 336c73bdc2 cleanup dead code 2014-11-10 16:53:13 -08:00
fjy df886fac1b fix 2014-11-10 16:49:27 -08:00
fjy d68bc3bdea cleanup unused imports 2014-11-10 16:15:28 -08:00
Charles Allen 92e71be864 Change TopNQueryRunnerBenchmark to use a ByteBuffer as per OffheapBufferPool 2014-11-10 15:40:54 -08:00
Charles Allen a093f3728a Reformat on TopNNumericResultBuilder 2014-11-10 15:26:09 -08:00
Charles Allen 2b0f4534bf Modify formatting in TopNQueryRunnerBenchmark 2014-11-10 15:17:26 -08:00
Charles Allen fc78f215c4 Aggressive dimValue unrolling in PooledTopNAlgorithm 2014-11-10 15:14:45 -08:00
fjy 6188315293 Merge branch 'master' into druid-0.7.x
Conflicts:
	processing/src/test/java/io/druid/query/search/SearchQueryRunnerTest.java
2014-11-10 14:52:10 -08:00
Xavier Léauté 49e878cf1a unroll multi-value dimensions 2014-11-10 14:21:56 -08:00
fjy df9be030db remove more legacy code 2014-11-10 14:09:00 -08:00
Charles Allen dbb6401fbe Aggressive unrolling in PooledTopNAlgorithm 2014-11-10 14:04:37 -08:00
fjy 72c355e8ae remove crazy handler code 2014-11-10 13:58:23 -08:00
fjy fc34858e95 fix things for real-time ingestion 2014-11-10 13:48:21 -08:00
fjy c5cc826998 make things actually work with roaring 2014-11-10 13:42:26 -08:00
fjy 358b2add17 make things actually work with roaring 2014-11-10 13:42:06 -08:00
Charles Allen 9ac8589143 Merge remote-tracking branch 'origin/druid-0.7.x-fastTopN-rebase' into druid-0.7.x-fastTopN-rebase 2014-11-10 12:55:08 -08:00
Xavier Léauté 70468400bf stupid mistake 2014-11-10 12:54:48 -08:00
Charles Allen bfc9d9f283 Merge remote-tracking branch 'origin/druid-0.7.x-fastTopN-rebase' into druid-0.7.x-fastTopN-rebase 2014-11-10 12:52:36 -08:00