Commit Graph

1218 Commits

Author SHA1 Message Date
Charles Allen 8d3cdd8572 Don't check for sortedness if we already know GenericIndexedWriter isn't sorted 2015-09-11 16:32:09 -07:00
Charles Allen d6849805ea Add some basic latching to concurrency testing in IncrementalIndexTest 2015-09-10 10:06:51 -07:00
Himanshu Gupta 5da58e48e0 use Rule based TemporaryFolder for cleanup of temp directory/files 2015-09-09 11:10:33 -05:00
Himanshu Gupta 44911039c5 update indexing in the helper to use multiple persists and final merge to
catch further issues in aggregator implementations
2015-09-09 11:10:33 -05:00
Charles Allen fcf5cae81d Add CPU time to metrics for segment scanning. 2015-09-08 13:34:19 -07:00
cheddar 4f61b42f40 Merge pull request #1578 from b-slim/fix_extraction_filter_2
Fix UT and documentation to the extraction filter
2015-09-01 10:46:20 -07:00
Himanshu 04ff6cd355 Merge pull request #1685 from gianm/close-loudly
Close output streams and channels loudly when creating segments.
2015-08-28 23:32:22 -05:00
Gian Merlino 940e1aa3eb Replace funky imports with standard ones.
1) Lots of Guava imports were not coming from the actual Guava
2) junit.framework.Assert should be org.junit.Assert
2015-08-28 18:02:05 -07:00
Gian Merlino 7d6fa2ba50 Close output streams and channels loudly when creating segments. 2015-08-28 17:14:03 -07:00
Himanshu Gupta 2e0dd1d792 adding UTs and addressing review comments to
firehoseV2 addition to Realtime[Manager|Plumber],
essential segment metadata persist support,
kafka-simple-consumer-firehose extension patch
2015-08-27 20:50:46 -05:00
lvjq 2237a8cf0f kafka 8 simple consumer firehose 2015-08-27 20:50:46 -05:00
Charles Allen c1388a1685 Merge pull request #1632 from Hailei/fix-subquery-innerquery-demension
Inner Query should build on sub query
2015-08-27 10:25:38 -07:00
Gian Merlino 2a866f49df Downgrade Jackson to 2.4.6. 2015-08-26 18:25:55 -07:00
Charles Allen 24aa762c79 Add test for #1632 2015-08-25 20:50:30 -07:00
Xavier Léauté 51f6a9a2c9 update jackson to 2.6.1 2015-08-25 16:07:01 -07:00
Himanshu Gupta c57c07f28a add ability for client code to provide InputStream of input data in addition to File
It would be needed when input data file does not reside in the same jar
but you could still use getResourceAsStream() to read the data inside a file
2015-08-20 00:54:58 -05:00
Xavier Léauté 3b2e41e42a update for next release 2015-08-18 17:16:46 -07:00
Slim Bouguerra 7549f02578 support the case where the filter value is null 2015-08-17 15:09:37 -05:00
zhanghailei 234a958817 Inner Query should build on sub query 2015-08-17 18:18:26 +08:00
Charles Allen db19d2d547 Revert "Update to guice 4.0" 2015-08-14 09:26:07 -07:00
Charles Allen be89105621 Merge pull request #1602 from metamx/more-code-cleanup
Some perf Improvements in Broker
2015-08-11 13:51:49 -07:00
Xavier Léauté fbdb841928 Merge pull request #1603 from metamx/optimize-lexicographic-topN
Optimizations for LexicographicTopNs
2015-08-11 13:35:34 -07:00
Nishant b8d8a8da9e Optimisations for LexicographicTopNs
initial review for perf optimizations for lexicographic TopNs

fix compilation

create map with proper size

review comment

review comment

review comments
2015-08-12 00:37:48 +05:30
Charles Allen 7e61216287 Update to guice 4.0
- Mark a lot of `@Provides` methods as final since guice 4.0 disallows overriding them
2015-08-10 13:57:18 -07:00
Slim Bouguerra f0bc362981 clean up code that is not needed anymore 2015-08-07 12:38:41 -05:00
Slim Bouguerra 64d638a386 optimize makeMatcher 2015-08-06 17:04:36 -05:00
Nishant 1a46c4c71c avoid creating mergeSequence when not required 2015-08-06 14:25:13 +05:30
Slim Bouguerra 83de5a4716 addressing reviewers comments 2015-08-03 09:03:28 -05:00
Slim Bouguerra dda0790a60 Fix extractionFilter by implementing make matcher
Fix getBitmapIndex to consider the case where dim is null
Unit test for extractionFn with empty result and null_column
UT for TopN queries with extraction filter
refactor extraction filter makeMatcher for realtime segments and clean up code in b/processing/src/test/java/io/druid/query/groupby/GroupByQueryRunnerTest.java
fix to make sure that empty strings are converted to null
2015-08-03 09:02:17 -05:00
Himanshu Gupta d11d9b6c45 don't waste memory storing all lines from input
CharSource.readLines() reads all lines from the input into an in-memory list.
Since only an iterator is needed here, this waste is easily avoided.
2015-07-20 21:59:38 -05:00
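The idea behind d11d9b6c45 can be sketched as follows. This is a minimal illustration using only the JDK, not the code from the commit: the class name `LazyLines` and the helper `countLines` are hypothetical. It holds one line in memory at a time instead of materializing the whole input as a list.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical helper (not Druid code): iterates over lines lazily,
// buffering only the next line rather than the whole input.
final class LazyLines implements Iterator<String> {
    private final BufferedReader reader;
    private String nextLine;

    LazyLines(BufferedReader reader) {
        this.reader = reader;
        advance();
    }

    // Read one line ahead; null marks end of input.
    private void advance() {
        try {
            nextLine = reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public boolean hasNext() {
        return nextLine != null;
    }

    @Override
    public String next() {
        if (nextLine == null) {
            throw new NoSuchElementException();
        }
        String line = nextLine;
        advance();
        return line;
    }

    // Example consumer: counts lines without ever holding more than one.
    static int countLines(String text) {
        Iterator<String> it = new LazyLines(new BufferedReader(new StringReader(text)));
        int count = 0;
        while (it.hasNext()) {
            it.next();
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countLines("a\nb\nc"));
    }
}
```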
Fangjin Yang 0481c8ca26 Merge pull request #1406 from zhaown/fix-breaking-while-exceeding-max-intermediate-rows
Fix breaking while exceeding max intermediate rows.
2015-07-20 13:41:22 -07:00
Himanshu Gupta f7a92db332 generic byte[] serde for InputRow 2015-07-20 12:01:53 -05:00
Himanshu Gupta 0439e8ec23 adding serde methods for intermediate aggregation object to ComplexMetricSerde
This provides the alternative to using ComplexMetricSerde.getObjectStrategy()
and using the serde methods from ObjectStrategy as that usage pattern is deprecated.
2015-07-20 12:01:53 -05:00
zhaown 524b05f073 Fix breaking while exceeding max intermediate rows. 2015-07-19 10:41:53 +08:00
Fangjin Yang e21195f987 Merge pull request #1469 from guobingkun/table_config
Inconsistent property names for "druid.metadata.storage.tables.xxx"
2015-07-17 07:43:19 -07:00
Himanshu 19af3bc9bc Merge pull request #1535 from metamx/alphanum-docs-tests
Update alphanumeric sort docs + more tests / examples
2015-07-16 22:09:41 -05:00
Xavier Léauté 2c464ad936 correct reference in docs + more tests / examples 2015-07-16 19:50:05 -07:00
Xavier Léauté 9616c10b1d remove import static 2015-07-16 17:46:21 -07:00
Xavier Léauté c1308203b8 Merge pull request #1532 from metamx/fixTopNDimExtractionDoubleApply
Fix TopN dimension extractions being applied twice
2015-07-16 13:39:02 -07:00
Xavier Léauté 3a0793aaf9 Merge pull request #1533 from metamx/extraCheckGroupByDimExtraction
Add more unit tests for group by
2015-07-15 21:09:00 -07:00
Charles Allen 7d0b77c261 Add more unit tests for group by 2015-07-15 20:15:21 -07:00
Xavier Léauté a15a2c4047 fix histogram aggregator cache key 2015-07-15 17:33:36 -07:00
Charles Allen 9092c665b7 Fix TopN dimension extractions being applied twice 2015-07-15 16:58:15 -07:00
Charles Allen 456ad9ffba Merge pull request #1529 from metamx/update-versions
increment version
2015-07-15 13:25:31 -07:00
Xavier Léauté 4cfb00bc8a increment version 2015-07-15 13:09:05 -07:00
Charles Allen 5eadd395e2 Move lots of executor service creation to Execs 2015-07-14 15:38:49 -07:00
Nishant 184b12bee8 fix groupBy caching to work with renamed aggregators
Issue - when storing results in the cache we store the event map, which
contains aggregator names mapped to values. When someone then fires the same
query after renaming aggregators, the cache key is the same but the event
contains metric values mapped to the older names, which leads to wrong
results.
Fix - modify the cache to store only the list of values, not the raw
event.

review comments + fix dimension renaming

review comment
2015-07-09 11:48:26 +05:30
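The fix described in 184b12bee8 can be sketched roughly as below. This is an illustration of the idea only, not Druid's cache code: the class `PositionalCacheSketch` and both method names are hypothetical. Values are cached by position, and names are re-attached from the current query when reading, so a renamed aggregator still gets the right value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch (not Druid code): cache values positionally so the
// cached entry does not depend on aggregator names.
final class PositionalCacheSketch {

    // Writing: keep only the values, in the order of the query's aggregators.
    static List<Object> toCacheValue(Map<String, Object> event, List<String> aggNames) {
        List<Object> values = new ArrayList<>();
        for (String name : aggNames) {
            values.add(event.get(name));
        }
        return values;
    }

    // Reading: pair cached values with the names used by the current query.
    static Map<String, Object> fromCacheValue(List<Object> cached, List<String> aggNames) {
        Map<String, Object> event = new LinkedHashMap<>();
        for (int i = 0; i < aggNames.size(); i++) {
            event.put(aggNames.get(i), cached.get(i));
        }
        return event;
    }

    public static void main(String[] args) {
        Map<String, Object> original = new HashMap<>();
        original.put("rows", 10L);

        // First query names the aggregator "rows"; a later one renames it "total".
        List<Object> cached = toCacheValue(original, Arrays.asList("rows"));
        Map<String, Object> rebuilt = fromCacheValue(cached, Arrays.asList("total"));
        System.out.println(rebuilt);
    }
}
```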
Xavier Léauté 9789417612 ModuleList is already part of Initialization 2015-07-01 11:37:40 -07:00
Xavier Léauté 2c463ae435 Merge pull request #1489 from metamx/moveTestPackages
Move some test packages
2015-07-01 11:18:09 -07:00
Charles Allen 5e19a615f1 Add comments to DimExtractionTopNAlgorithm 2015-07-01 10:32:45 -07:00
Charles Allen 7a2a8a3d6e Move extraction tests to more reasonable package 2015-07-01 10:30:50 -07:00
Bingkun Guo 4a0ae7d8d5 Fix inconsistent druid property names for "druid.metadata.storage.tables.xxx" between document and code 2015-06-29 10:12:30 -05:00
Xavier Léauté 28fa1642b9 add node time metrics to DirectDruidClient 2015-06-26 17:57:44 -07:00
Xavier Léauté 36b4453789 Merge pull request #1455 from druid-io/fix-protobuf
Fix protobuf impl and docs
2015-06-22 23:15:40 -07:00
nishant f9cdb0ad61 test for #1120
Make the changes described in #1120 to add test for the issue described
there.
2015-06-21 23:34:21 +05:30
fjy 9c74993559 fix protobuf impl and docs 2015-06-20 21:59:38 -07:00
Xavier Léauté 0a5bb909a2 [maven-release-plugin] prepare for next development iteration 2015-06-18 17:35:19 -07:00
Xavier Léauté 59c6b2b279 [maven-release-plugin] prepare release druid-0.8.0-rc1 2015-06-18 17:35:14 -07:00
Charles Allen 6230ac90ae Use IndexMerger for conversion 2015-06-10 11:34:58 -07:00
Xavier Léauté 395ba79f8b Merge pull request #1403 from metamx/mergerMakerTests
Improvements around resource handling in IndexMerger / IndexIO / QueryableIndex
2015-06-04 15:59:10 -07:00
Charles Allen ed8eb5c991 Improvements around resource handling in IndexMerger / IndexIO / QueryableIndex
* Fix resource leak in `io.druid.segment.IndexIO.DefaultIndexIOHandler#validateTwoSegments(java.io.File, java.io.File)`
* Un-deprecate `close()` in `QueryableIndex` and make it inherit `Closeable`
* Fix resource leaks in various unit tests
* Add `CloserRule` for closing out resources
2015-06-04 14:18:27 -07:00
Himanshu 50ad0e6474 Merge pull request #1412 from pjain1/alphaNumericTopN_NPE_fix
NPE fix for TopN query with alphaNumericTopN metric spec
2015-06-04 09:49:31 -05:00
Parag Jain a7b09e857c NPE fix for alphaNumericTopN when previous stop is not specified 2015-06-04 09:30:31 -05:00
Xavier Léauté 35e2fde18e Merge pull request #1386 from himanshug/aggregation_testing1
General class for testing any Aggregation Implementation
2015-06-03 23:43:36 -07:00
Xavier Léauté 92d7316ed8 Merge pull request #1414 from metamx/timeout2TIMEOUT
Replace "timeout" with QueryContextKeys.TIMEOUT
2015-06-02 17:11:09 -07:00
Charles Allen 1c4d42bc15 Replace "timeout" with QueryContextKeys.TIMEOUT 2015-06-02 14:49:21 -07:00
Charles Allen f48db09e35 Add optimizations for ExtractionFn by enabling MANY_TO_ONE vs ONE_TO_ONE codepaths
* Also adds LookupExtractionFn and MapLookupExtractor which takes in an explicit mapping of renames
* Add injective to javascript extraction fn
2015-06-02 12:22:56 -07:00
Himanshu Gupta 215c1ab01e UTs for hyperUnique aggregation 2015-06-01 12:52:40 -05:00
Himanshu Gupta 160d5fe6b7 a general class for testing any [complex] aggregation implementation 2015-06-01 12:52:40 -05:00
Charles Allen 55292bba13 Add more IndexMergerTests 2015-05-28 18:18:20 -07:00
Charles Allen 1ebe622c7d Add check in GroupByQuery for null DimensionSpec in dimension list 2015-05-28 14:55:34 -07:00
Xavier Léauté f9c624c7db Merge pull request #1361 from mrijke/groupby-limithavingorder-unittest
GroupBy Query with Having/Limit/Orderingspec inconsistencies (UnitTest)
2015-05-27 14:49:18 -07:00
Xavier Léauté 1a3f04f0ed Merge pull request #1354 from metamx/multi-valued-dimension-compression
Enabling compression for multiValued dimension
2015-05-26 23:43:53 -07:00
Charles Allen fd64c24e43 Fix roaring extraction filter on empty values 2015-05-26 13:54:18 -07:00
nishant 81415282aa Enabling compression for multiValued dimension
Add test and refactoring

Add benchmark tests
2015-05-27 00:09:14 +05:30
Charles Allen e97d22a10a Fix Extraction Filter cast problems for empty results 2015-05-22 15:20:11 -07:00
Charles Allen e1399b7ce4 Add unit test to show breaking Dimension Extraction Filter 2015-05-22 15:02:11 -07:00
Xavier Léauté 75c092ccb1 Merge pull request #1375 from metamx/MetricManipulatorFnInstances
Modify MetricManipulatorFns to use instanced classes
2015-05-22 15:56:47 -04:00
Charles Allen 042653ebcb Modify MetricManipulatorFns to use instanced classes 2015-05-22 12:38:38 -07:00
Himanshu Gupta 723df735e9 force eagerness of processing of SegmentMetadata queries on the processing executor by converting the Sequence into List 2015-05-22 13:46:26 -05:00
Himanshu Gupta 5852b64852 adding UT for SegmentMetadata bySegment query which catches following regression caused by commit 55ebf0cfdf
it fails when we issue the SegmentMetadataQuery by setting {"bySegment" : true} in context with exception -
java.lang.ClassCastException: io.druid.query.Result cannot be cast to io.druid.query.metadata.metadata.SegmentAnalysis
at io.druid.query.metadata.SegmentMetadataQueryQueryToolChest$4.compare(SegmentMetadataQueryQueryToolChest.java:222) ~[druid-processing-0.7.3-SNAPSHOT.jar:0.7.3-SNAPSHOT]
at com.google.common.collect.NullsFirstOrdering.compare(NullsFirstOrdering.java:44) ~[guava-16.0.1.jar:?]
at com.metamx.common.guava.MergeIterator$1.compare(MergeIterator.java:46) ~[java-util-0.27.0.jar:?]
at com.metamx.common.guava.MergeIterator$1.compare(MergeIterator.java:42) ~[java-util-0.27.0.jar:?]
at java.util.PriorityQueue.siftUpUsingComparator(PriorityQueue.java:649) ~[?:1.7.0_80]
2015-05-22 13:45:54 -05:00
Himanshu Gupta da0cc32bc8 Revert commit 55ebf0cfdf
which caused following regression
it fails when we issue the SegmentMetadataQuery by setting {"bySegment" : true} in context with exception -
java.lang.ClassCastException: io.druid.query.Result cannot be cast to io.druid.query.metadata.metadata.SegmentAnalysis
at io.druid.query.metadata.SegmentMetadataQueryQueryToolChest$4.compare(SegmentMetadataQueryQueryToolChest.java:222) ~[druid-processing-0.7.3-SNAPSHOT.jar:0.7.3-SNAPSHOT]
at com.google.common.collect.NullsFirstOrdering.compare(NullsFirstOrdering.java:44) ~[guava-16.0.1.jar:?]
at com.metamx.common.guava.MergeIterator$1.compare(MergeIterator.java:46) ~[java-util-0.27.0.jar:?]
at com.metamx.common.guava.MergeIterator$1.compare(MergeIterator.java:42) ~[java-util-0.27.0.jar:?]
at java.util.PriorityQueue.siftUpUsingComparator(PriorityQueue.java:649) ~[?:1.7.0_80]
2015-05-22 13:39:34 -05:00
Maarten Rijke 82da479464 Fix for GroupBy with Having+Limit+Orderspec
* Inverted function arguments to compose postProcFn for GroupBy queries
  with havingspec + limitspec.
* Replaced query.getLimitSpec() with null in GroupByQueryToolChest's
  mergeGroupByResults
* Added unittest to verify functionality
2015-05-19 18:35:48 +02:00
Himanshu Gupta 2fd3e9e8e5 return size = 0 in ColumnAnalysis if it's unknown
that is, if the complex agg did not implement inputSizeFn(), so
that the segment metadata query shows at least some information.
also, instead of COMPLEX, return the type of data stored.
2015-05-15 20:11:56 -05:00
Xavier Léauté 3c3db7229c Merge pull request #1355 from himanshug/long_max_min_aggregators
Long max/min aggregators
2015-05-13 12:08:11 -07:00
Himanshu Gupta cebb550796 additional UTs for [DoubleMax/DoubleMin] aggregation 2015-05-13 09:25:41 -05:00
Himanshu Gupta d0ec945129 adding aliases doubleMax and doubleMin for max and min respectively
renamed all [Max/Min]*.java to [DoubleMax/DoubleMin]*.java and created [Max/Min]AggregatorFactory.java, which can be removed when we don't need the min/max aggregator type backward compatibility
2015-05-13 09:25:41 -05:00
Himanshu Gupta 2de38f7d29 UTs for long[Max/Min] aggregation 2015-05-13 09:25:22 -05:00
Himanshu Gupta 00436f93e2 long max/min aggregators implementation 2015-05-13 09:25:22 -05:00
fjy 7a6acf5c1b update pom to 0.8 2015-05-11 19:41:58 -06:00
Xavier Léauté 33265d63e1 Merge pull request #1262 from metamx/fix-null-dimension
fix handling of dimension having only null values
2015-05-06 13:51:26 -07:00
nishant 34be1e96fa fix NPE
review comments

Add test

fix test for java8
2015-05-05 23:11:13 +05:30
Neo 8f8400e24e fix handling of dimension having only null values
fixes #1211

fix value matcher

more improvements

more fixes for partial null column

fix handling of dimension having only null values

fixes #1211

fix value matcher

more improvements

more fixes for partial null column

review comment

IndexMaker speedups
* About 15% speedup

Conflicts:
	processing/src/main/java/io/druid/segment/IndexMaker.java

fix handling of dimension having only null values

fixes #1211

fix value matcher

more improvements

more fixes for partial null column

fix handling of dimension having only null values

fixes #1211

fix value matcher

more improvements

more fixes for partial null column

review comment

review comments

review comment

fix failing tests

review comment

fix compilation
2015-05-04 22:07:45 +05:30
nishant 50158357ff fixes #1330
fixes #1330,
Avoid creating Period instance as creating a Period from Long.MAX_VALUE
throws arithmetic exception.
After this query metric will emit duration in seconds instead of
minutes.
2015-05-04 20:34:28 +05:30
Xavier Léauté 721505c017 Merge pull request #1208 from druid-io/rework-metrics
Schemaless metrics + additional metrics for things we care about
2015-04-27 15:04:54 -07:00
fjy 963e5765bf Schemaless metrics + additional metrics for things we care about 2015-04-27 13:39:40 -07:00
Charles Allen 27016c0289 Fix IndexIO segment validator to account for timestamp mismatches. 2015-04-27 12:42:16 -07:00
Charles Allen 633fdb029e Add option to ConvertSegmentTask to skip validation
* Validation is enabled by default
2015-04-27 08:37:55 -07:00
Charles Allen 303727e6a9 IndexMaker speedups
* About 15% speedup

Conflicts:
	processing/src/main/java/io/druid/segment/IndexMaker.java
2015-04-23 13:19:21 -07:00
Charles Allen f2300430d1 Cleanup some code in index creation.
* Add some unit tests
* Add io.druid.segment.IndexMerger.reprocess for quick re-indexing of data
* Add dim-value validation to validation checker (instead of ONLY index #)
* General code refactoring to make things a little easier to read
2015-04-23 12:41:42 -07:00