1042 Commits

Author SHA1 Message Date
Xavier Léauté 4eff269536 Merge pull request #1079 from druid-io/cleanup-deps
Remove non friendly dependencies from Druid
2015-02-03 11:56:41 -08:00
fjy 3e5d338c8e Remove non friendly dependencies from Druid 2015-02-03 11:36:08 -08:00
Fangjin Yang 71b4c5fa86 Merge pull request #1076 from metamx/remove-threadlocals
remove thread-locals in GenericIndexed in favor of wrapped objects
2015-02-02 20:02:33 -08:00
Xavier Léauté cb2e300eba remove thread-locals in GenericIndexed in favor of wrapped objects to reduce GC pressure 2015-02-02 15:59:30 -08:00
Eric Tschetter 42eba986ce Towards consistent null handling
This commit also includes
1) the addition of a context parameter on timeseries queries that allows it to ignore empty buckets instead of generating results for them
2) A cleanup of an unused method on an interface
2015-02-02 12:53:07 -08:00
Fangjin Yang 92e616de11 Merge pull request #1077 from metamx/remove-unused-imports
remove unused imports
2015-02-02 10:45:27 -08:00
nishantmonu51 ba932bb1f2 remove unused imports 2015-02-02 21:53:39 +05:30
fjy d05032b98a towards a community led druid 2015-01-31 20:57:36 -08:00
Xavier Léauté f24a89a22a fix NPE for topN over missing hyperUniques column 2015-01-27 16:12:41 -08:00
Charles Allen 226dd91a31 Add a hash map for storing groupBy partition index
* Improves groupBy performance by approx 15%
2015-01-26 08:42:02 -08:00
fjy 1f94de22c6 [maven-release-plugin] prepare for next development iteration 2015-01-20 14:23:55 -08:00
fjy 17476edc31 [maven-release-plugin] prepare release druid-0.7.0-rc1 2015-01-20 14:23:51 -08:00
Charles Allen 3d27747f7e Upgrade to log4j2
Default behavior is as before.
Added documentation for how to enable synchronous logging for select chatty classes:
* io.druid.client.ServerInventoryView
* io.druid.client.BatchServerInventoryView
* io.druid.curator.inventory.CuratorInventoryManager
* com.metamx.http.client.pool.ChannelResourceFactory
2015-01-20 12:35:18 -08:00
Fangjin Yang 91a79dbf95 Merge pull request #1031 from metamx/ingestmetadata-query
DataSourceMetadata query
2015-01-19 21:55:35 -08:00
Charles Allen 7bb038756c Account for very slow writer threads in IncrementalIndexTest 2015-01-17 13:02:59 -08:00
Fangjin Yang b4041c13e5 Merge pull request #1029 from metamx/fixChainedExecutionQueryRunnerTest
Address spurious test failures
2015-01-16 13:08:32 -08:00
Xavier Léauté 3b3aad78cb Merge pull request #1027 from metamx/concurrentOnHeapIncrementalIndexFix
Fix concurrency issues in OnheapIncrementalIndex
2015-01-16 12:54:42 -08:00
Charles Allen 197af967ef Fix concurrency issues in OnheapIncrementalIndex
* Was encountering weird errors when fast writes were coming in while queries were happening.
* Added unit tests which tend to cause concurrency query problems
2015-01-16 12:01:46 -08:00
Charles Allen ebafa2a786 Fix spurious test failures in ChainedExecutionQueryRunnerTest 2015-01-15 16:49:16 -08:00
Fangjin Yang 5bfcc43377 Merge pull request #1008 from metamx/stringConversionJavaUtilUpdate
Update all String conversions to and from byte[] to use the java-util StringUtils functions
2015-01-15 13:50:27 -08:00
nishantmonu51 c7452b75f6 Merge branch 'master' into ingestmetadata-query 2015-01-15 18:00:31 +05:30
Xavier Léauté d5f4182de4 global test timeouts + fix test race condition 2015-01-07 23:36:57 -08:00
Fangjin Yang 852e863425 Merge pull request #981 from druid-io/strictModuleTyping
Use Module instead of generic Object in Guice related items
2015-01-05 12:43:20 -08:00
Charles Allen b1b5c9099e Update all String conversions to and from byte[] to use the java-util StringUtils functions
* Speedup of GroupBy with javaScript filters by ~10%
* Requires https://github.com/metamx/java-util/pull/15
2015-01-05 11:22:32 -08:00
Xavier Léauté 3fc6cf918d add test for large chunks 2015-01-02 14:31:22 -08:00
Xavier Léauté f2f9cbeca8 throw error rather than returning garbage results 2015-01-02 14:29:21 -08:00
Xavier Léauté 071943a367 fix LZF compression with buffers exceeding LZF chunk size 2015-01-02 11:39:50 -08:00
Xavier Léauté f2439899e7 fix bitmap factory serde 2014-12-23 15:07:32 -08:00
Xavier Léauté 27a3169312 increase test timeouts 2014-12-19 17:09:43 -08:00
Charles Allen 971afab36f Lengthen CompressionStrategyTest::testKnownSizeConcurrency() to have 2m timeout on its test to account for shared Jenkins build lag 2014-12-19 12:53:20 -08:00
Charles Allen 7c8d4a7433 Use Module instead of generic Object in Guice related items 2014-12-19 10:54:06 -08:00
Fangjin Yang be507b8cb4 Merge pull request #943 from mrijke/partialdimextractfn-nullpointer
Fix NullPointerException in PartialDimExtractionFn
2014-12-16 12:29:27 -07:00
nishantmonu51 80e4b68ee7 review comments 2014-12-16 21:16:48 +05:30
Fangjin Yang b3fe91bb50 Merge pull request #830 from metamx/union-merge-on-historical
Union merge on historical
2014-12-15 13:36:47 -07:00
fjy 3cb7999eb9 i hate hadoop dependencies 2014-12-15 09:52:46 -08:00
nishantmonu51 a0d3579a92 add docs + fix tests 2014-12-11 17:58:01 +05:30
nishantmonu51 7ad03087c0 Merge branch 'master' into ingestmetadata-query 2014-12-11 16:54:38 +05:30
nishantmonu51 32b4f55b8a review comments refactoring 2014-12-11 16:33:14 +05:30
nishantmonu51 3763357f6e Ingest metadata query implementation 2014-12-10 19:44:00 +05:30
Fangjin Yang d6d3ec6846 Merge pull request #948 from metamx/ingestion-docs
Redocumenting ingestion
2014-12-09 15:30:03 -07:00
fjy 9596c11f42 address cr 2014-12-09 14:19:18 -08:00
nishantmonu51 1a1b0e6f23 merge from master and review comments 2014-12-09 13:16:45 +05:30
xvrl 1392e2731f Merge pull request #936 from metamx/cachingRunnerImprovements
General Caching Query Runners cleanup (40% query time reduction for HLL)
2014-12-08 14:07:52 -08:00
Charles Allen 7b65f0635d General Caching Query Runners cleanup
* Add type strictness to CachingClusteredClient.
* Add background caching to CachingClusteredClient. Gives between 0% and 5% query speed increase.
* Add @BackgroundCaching annotation for injected ExecutorService items
* Add `numBackgroundThreads` configuration option to CacheConfig (default 0 aka same thread legacy behavior)
* Add unit tests for CacheConfig
* Add an abstract caching query runner class; currently it does nothing except make the two caching queries distinct.
* Add caching to CachingQueryRunner. Gives up to a WHOPPING 40% reduction in query time on HLL queries
* Updated docs with more info on cache settings.
2014-12-08 13:29:32 -08:00
Maarten Rijke 90670a9c7e Fix NullPointerException in PartialDimExtractionFn by explicitly checking for dimValue == null, attempt 2 2014-12-08 22:26:35 +01:00
Maarten Rijke bd9bbf396c Fix NullPointerException in PartialDimExtractionFn by explicitly checking for dimValue == null 2014-12-08 20:11:58 +01:00
Xavier Léauté ad23e49777 use fixed-size mapdb cache to avoid heap growing uncontrollably 2014-12-05 15:34:50 -08:00
Xavier Léauté 7cd45a6e1f IncrementalIndex throws exception if limit exceeded
- For now uses a hardcoded ratio of aggregator to timeanddim buffer sizes
- canAppendRow is a workaround for realtime index since the
Firehose currently does not have a way of rolling back the last event in
case of error
- canAppendRow needs a fudge factor; there is a race between checking
if we can add a row and actually adding a row, because of the way MapDB
reports its size.
2014-12-04 14:38:16 -08:00
Xavier Léauté c7dbe6116c write byte data as is in smile 2014-12-04 10:57:56 -08:00
Xavier Léauté c21a82a697 upgrade LZ4 to operate directly on ByteBuffers 2014-12-04 10:57:56 -08:00
Xavier Léauté 0c521e0a77 update joda-time and fix min/max instant 2014-12-04 10:57:56 -08:00
nishantmonu51 269a51964e fix size calculation 2014-12-04 17:22:24 +05:30
nishantmonu51 4dc0fdba8a consider mapped size in limit calculation & review comments 2014-12-03 23:47:30 +05:30
Charles Allen 529e7e0272 Merge pull request #927 from metamx/speedup-smile-bytes
Improve Smile serde performance by writing binary data as is
2014-12-03 10:02:08 -08:00
Charles Allen 0f5d5840da Merge pull request #924 from metamx/update-joda-time
Update Joda-Time and fix min/max instant overflow
2014-12-03 09:15:39 -08:00
nishantmonu51 da8bd7836b Introduce buffer size 2014-12-03 16:28:22 +05:30
Xavier Léauté 5fece517fa write byte data as is in smile 2014-12-03 00:01:01 -08:00
Xavier Léauté 18f50097a9 upgrade LZ4 to operate directly on ByteBuffers 2014-12-02 23:53:56 -08:00
fjy bc173d14fc a whole bunch of cleanup and fixes 2014-12-02 17:32:05 -08:00
Xavier Léauté a79389a9e5 update joda-time and fix min/max instant 2014-12-02 17:27:22 -08:00
nishantmonu51 b65933ffb8 make tests parameterised 2014-12-02 23:55:29 +05:30
nishantmonu51 6dc69c2f30 code cleanups & formatting 2014-12-02 22:44:33 +05:30
nishantmonu51 eac776f1a7 tests passing with on heap incremental index 2014-12-02 22:29:28 +05:30
Xavier Léauté 4eee7e69b9 fix cardinality aggregator caching 2014-11-26 15:00:37 -08:00
xvrl 5bc1be5ba0 Merge pull request #850 from metamx/druid-0.7.x-compressionstrategy
Compression strategy changes
2014-11-25 12:58:39 -08:00
Charles Allen c6043afa32 Removed empty function from CompressionStrategyTest 2014-11-25 12:57:06 -08:00
Charles Allen 6943db5251 Changed branching logic for LZFCompressor to return null only on error, and avoid checking in most circumstances 2014-11-25 12:53:11 -08:00
Charles Allen 9f945c2216 Removed lz4Fast from CompressedObjectStrategy for compression since it is not currently used 2014-11-24 16:11:03 -08:00
Charles Allen 70e3108282 Multiple speed improvements revolving around topN with HLL
Change serializer / deserializer for HyperLogLog
* Changed DirectDruidClient's InputStream handling. Is now ~10% faster for data heavy queries, and has lower variance in execution speed.
* Changed HLL Collector's toByteStream() method to be better optimized for small values. Is notably faster for small result quantities which fall into the sparse HLL bucket codepath.
  * No change for dense HLL which just uses a direct bytestream of the underlying byte data.

TopNNumericResultBuilder semi-aggressive loop unrolling for metricVals

Benchmark for HLL for sparse packing (small HLL bucket population):
HyperLogLogSerdeBenchmarkTest.benchmarkToByteBuffer[0]: [measured 100000 out of 100100 rounds, threads: 1 (sequential)]
 round: 0.00 [+- 0.00], round.block: 0.00 [+- 0.00], round.gc: 0.00 [+- 0.00], GC.calls: 216, GC.time: 0.42, time.total: 15.96, time.warmup: 0.22, time.bench: 15.74
HyperLogLogSerdeBenchmarkTest.benchmarkToByteBuffer[1]: [measured 100000 out of 100100 rounds, threads: 1 (sequential)]
 round: 0.00 [+- 0.00], round.block: 0.00 [+- 0.00], round.gc: 0.00 [+- 0.00], GC.calls: 217, GC.time: 0.45, time.total: 13.87, time.warmup: 0.02, time.bench: 13.85
HyperLogLogSerdeBenchmarkTest.benchmarkToByteBuffer[2]: [measured 100000 out of 100100 rounds, threads: 1 (sequential)]
 round: 0.00 [+- 0.00], round.block: 0.00 [+- 0.00], round.gc: 0.00 [+- 0.00], GC.calls: 55, GC.time: 0.16, time.total: 4.13, time.warmup: 0.00, time.bench: 4.12
HyperLogLogSerdeBenchmarkTest.benchmarkToByteBuffer[3]: [measured 100000 out of 100100 rounds, threads: 1 (sequential)]
 round: 0.00 [+- 0.00], round.block: 0.00 [+- 0.00], round.gc: 0.00 [+- 0.00], GC.calls: 55, GC.time: 0.16, time.total: 4.30, time.warmup: 0.00, time.bench: 4.30
HyperLogLogSerdeBenchmarkTest.benchmarkToByteBuffer[4]: [measured 100000 out of 100100 rounds, threads: 1 (sequential)]
 round: 0.00 [+- 0.00], round.block: 0.00 [+- 0.00], round.gc: 0.00 [+- 0.00], GC.calls: 8, GC.time: 0.03, time.total: 1.10, time.warmup: 0.00, time.bench: 1.09
HyperLogLogSerdeBenchmarkTest.benchmarkToByteBuffer[5]: [measured 100000 out of 100100 rounds, threads: 1 (sequential)]
 round: 0.00 [+- 0.00], round.block: 0.00 [+- 0.00], round.gc: 0.00 [+- 0.00], GC.calls: 8, GC.time: 0.03, time.total: 0.72, time.warmup: 0.00, time.bench: 0.72
HyperLogLogSerdeBenchmarkTest.benchmarkToByteBuffer[6]: [measured 100000 out of 100100 rounds, threads: 1 (sequential)]
 round: 0.00 [+- 0.00], round.block: 0.00 [+- 0.00], round.gc: 0.00 [+- 0.00], GC.calls: 1, GC.time: 0.00, time.total: 0.60, time.warmup: 0.00, time.bench: 0.60
HyperLogLogSerdeBenchmarkTest.benchmarkToByteBuffer[7]: [measured 100000 out of 100100 rounds, threads: 1 (sequential)]
 round: 0.00 [+- 0.00], round.block: 0.00 [+- 0.00], round.gc: 0.00 [+- 0.00], GC.calls: 2, GC.time: 0.01, time.total: 0.26, time.warmup: 0.00, time.bench: 0.25

Updates to HyperLogLogCollector toByteBuffer() based on code review

Removed changes from DirectDruidClient from this branch and put it in another branch.

Changed HyperLogLogCollector to have protected getters and setters

Remove unused ByteOrder from HyperLogLogCollector

Copyright header on HyperLogLogSerdeBenchmarkTest

Now with less ass!

Reformat in TopNNumericResultsBuilder. No code change

Removed unused import in HyperLogLogCollector

Replace AppendableByteArrayInputStream in DirectDruidClient
* Replace with SequenceInputStream fueled by an enumeration of ChannelBufferInputStream which directly wrap the response context ChannelBuffer

Modify TopNQueryQueryToolChest to use Arrays instead of Lists

Revert accidental changes to DirectDruidClient

They should be in another merge request:
https://github.com/metamx/druid/pull/893

Fixes from code review
* Extracting names from AggregatorFactory classes now done with TopNQueryQueryToolChest.extractFactoryName
* Renamed variable in TopNNumericResultBuilder
2014-11-24 16:02:00 -08:00
fjy 13cae41f6c Merge branch 'master' into refactor-examples 2014-11-24 11:00:26 -08:00
xvrl 9ced097abd Merge pull request #895 from metamx/fix-interval-retry
A set of fixes to retry the query for missing segments in the timeline
2014-11-24 10:23:02 -08:00
fjy c88aff6205 remove unneeded constant 2014-11-24 10:20:02 -08:00
fjy 9da66291e1 change naming to common config 2014-11-21 15:36:42 -08:00
Charles Allen 8f80d9e189 Update CompressedObjectStrategy to try-with-resources but print log error if error while closing 2014-11-21 11:35:11 -08:00
Charles Allen fc9a54ea48 Fix CompressedObjectStrategy LZFCompressor to ignore error on close of ResourceHolder 2014-11-21 10:49:43 -08:00
Charles Allen f8ce68565b Modified CompressedObjectStrategy to use 0xFF for Uncompressed 2014-11-21 10:33:53 -08:00
Charles Allen aa49e56ed6 Merge remote-tracking branch 'origin/master' into druid-0.7.x-compressionstrategy 2014-11-21 10:29:40 -08:00
fjy ef62bccdec ignore benchmark 2014-11-20 16:52:19 -08:00
nishantmonu51 e3260aa177 Filtered Aggregator fixes + enhancements
- fix NPE on IncrementIndex
- refactor code to support AND, OR filter
- tests for AND & OR filter
- handling for missing column / null values
2014-11-20 15:17:18 -08:00
fjy 47f5c1bd0a fix retry interval is stupid 2014-11-20 12:50:56 -08:00
fjy 3d9d989a9f A set of fixes to retry the query for missing intervals in the timeline 2014-11-20 12:04:37 -08:00
nishantmonu51 0ab34f86da Revert "fix filtered Aggregator"
This reverts commit 6fd37ce023.
2014-11-20 10:17:01 +05:30
nishantmonu51 6fd37ce023 fix filtered Aggregator
fix filtered Aggregator
remove unused name parameter for filtered aggregator
add tests
2014-11-20 09:29:26 +05:30
fjy a49e673122 put back another missing test 2014-11-19 16:55:20 -08:00
fjy 14668846aa add back some tests 2014-11-19 14:35:26 -08:00
fjy fdeab0c6af make Druid case sensitive 2014-11-19 14:27:31 -08:00
Fangjin Yang 590d31799e Merge pull request #876 from metamx/remove-backwards-compatible
Remove backwards compatible
2014-11-19 14:33:14 -07:00
Charles Allen 18f44beee9 CompressedObjectStrategy improvements
* Added more unit tests
* Now properly uses safe / fast decompressor for LZ4
* Now chooses fastest lz4 instance instead of only looking at Java implementations
* Encapsulate ResourceHolder in try-with-resources to make sure they close correctly
2014-11-19 11:10:59 -08:00
Charles Allen ccc757dc64 Merge remote-tracking branch 'origin/master' into druid-0.7.x-compressionstrategy 2014-11-19 09:39:35 -08:00
Charles Allen 1bbc8fcbe5 Allow Smile to fall back to text
* Modify SmileFactory to set the delegate to text option.
  * This option only occurs when a Reader type object is passed in to the deserialization stuff
  * This is needed by the X-Druid-Response-Context header return value, which is JSON
2014-11-18 15:16:14 -08:00
Charles Allen 42517f5d37 Merge pull request #884 from metamx/optimize-topN-pruning
optimise pruning of aggs
2014-11-18 14:19:30 -08:00
xvrl a96eaeb036 Merge pull request #882 from metamx/now_with_OPEN_SOURCE
Added src jar build to maven poms and re-formatted to conform to style guidelines.
2014-11-18 13:00:04 -08:00
nishantmonu51 6023d602e6 optimise pruning of aggs
optimise pruning of aggregators for topN
2014-11-19 00:17:25 +05:30
Charles Allen dc66e1708e Added src jar build to maven poms and re-formatted to conform to style guidelines. 2014-11-18 09:05:30 -08:00
Xavier Léauté d914afe1cd make defaultVersion configurable for non-jar testing 2014-11-17 13:54:32 -08:00
nishantmonu51 0c2d06475d merge from master 2014-11-17 19:19:18 +05:30
nishantmonu51 cbffe3c648 merge from master and resolve conflicts 2014-11-17 18:07:08 +05:30
fjy d5c4282766 fix broken ut 2014-11-14 13:45:42 -08:00
xvrl e1e171ad20 Merge pull request #865 from metamx/fix-retry-qr
Fix a couple of bugs with retry query runner after testing it locally
2014-11-14 13:33:51 -08:00
fjy df1ad95075 remove useless config 2014-11-14 13:32:19 -08:00