Jonathan Wei
decefb7477
Add time interval dim filter and retention analysis example ( #3315 )
...
* Add time interval dim filter and retention analysis example
* Use closed-open matching for intervals, update cache key generation
* Fix time filtering tests for interval boundary change
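The closed-open matching mentioned above can be illustrated with a minimal sketch. The class and method names here are hypothetical and are not taken from the patch; the sketch only shows the boundary rule, where the start of an interval is inclusive and the end is exclusive.

```java
import java.util.List;

// Hypothetical helper, not the filter added by this commit.
final class ClosedOpenIntervalSketch {
    // True when startMillis <= timestamp < endMillis (closed start, open end).
    static boolean matches(long timestamp, long startMillis, long endMillis) {
        return timestamp >= startMillis && timestamp < endMillis;
    }

    // A timestamp passes if any {start, end} pair in the list contains it.
    static boolean matchesAny(long timestamp, List<long[]> intervals) {
        for (long[] interval : intervals) {
            if (matches(timestamp, interval[0], interval[1])) {
                return true;
            }
        }
        return false;
    }
}
```

With this rule, adjacent intervals such as [100, 200) and [200, 300) never both match the same timestamp.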
2016-08-05 07:25:04 -07:00
Navis Ryu
5b3f0ccb1f
Support variance and standard deviation ( #2525 )
...
* Support variance and standard deviation
* addressed comments
2016-08-04 17:32:58 -07:00
Gian Merlino
9437a7a313
HLL: Avoid some allocations when possible. ( #3314 )
...
- HLLC.fold avoids duplicating the other buffer by saving and restoring its position.
- HLLC.makeCollector(buffer) no longer duplicates incoming BBs.
- Updated call sites where appropriate to duplicate BBs passed to HLLC.
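As a rough illustration of the save-and-restore idea (not the HLLC code itself), the sketch below reads from a caller's ByteBuffer without calling duplicate(). The class name and the summing logic are assumptions made for the example; only the java.nio.ByteBuffer calls are real.

```java
import java.nio.ByteBuffer;

// Hypothetical example; the real fold logic is not reproduced here.
final class BufferPositionSketch {
    // Sums the remaining bytes of `other` and leaves its position unchanged.
    static int sumRemaining(ByteBuffer other) {
        final int savedPosition = other.position();
        try {
            int sum = 0;
            while (other.hasRemaining()) {
                sum += other.get(); // relative get advances the position
            }
            return sum;
        } finally {
            // Restore the position instead of allocating a duplicate view.
            other.position(savedPosition);
        }
    }
}
```

The trade-off is that the buffer is briefly mutated, so this pattern presumably only suits buffers that are not read concurrently by another thread.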
2016-08-03 18:08:52 -07:00
Gian Merlino
a4b95af839
Fix grouper closing in GroupByMergingQueryRunnerV2. ( #3316 )
...
The grouperHolder should be closed on failure, not the grouper.
2016-08-02 21:02:30 -07:00
Gian Merlino
0299ac73b8
Fix FilteredAggregators at ingestion time and in groupBy v2 nested queries. ( #3312 )
...
The common theme between the two is they both create "fake" DimensionSelectors
that work on top of Rows. They both do it because there isn't really any
dictionary for the underlying Rows, they're just a stream of data. The fix for
both is to allow a DimensionSelector to tell callers that it has no dictionary
by returning CARDINALITY_UNKNOWN from getValueCardinality. The callers, in
turn, can avoid using it in ways that assume it has a dictionary.
Fixes #3311 .
2016-08-02 17:39:40 -07:00
Gian Merlino
ae3e0015b6
Fix ClassCastException in nested v2 groupBys with timeouts. ( #3310 )
...
Add tests for the CCE and for a bunch of other groupBy stuff.
Also avoids setting the interrupted flag when InterruptedExceptions
happen, since this might interfere with resource closing, no other
query does it, and it is probably pointless anyway since the thread
is likely to be a jetty thread that we don't actually want to set
an interrupt flag on.
Also fixes toString on OrderByColumnSpec.
2016-08-02 16:02:44 -06:00
kaijianding
50d52a24fc
ability to not rollup at index time, make pre aggregation an option ( #3020 )
...
* ability to not rollup at index time, make pre aggregation an option
* rename getRowIndexForRollup to getPriorIndex
* fix doc misspelling
* test query using no-rollup indexes
* fix benchmark fail due to jmh bug
2016-08-02 11:13:05 -07:00
Jonathan Wei
0bdaaa224b
Use Long.compare for NumericComparator when possible ( #3309 )
2016-08-01 20:36:56 -07:00
Dave Li
bc20658239
groupBy nested query using v2 strategy ( #3269 )
...
* changed v2 nested query strategy
* add test for #3239
* update for new ValueMatcher interface and add benchmarks
* enable time filtering
* address PR comments
* add failing test for outer filter aggregator
* add helper class for sharing code
* update nested groupby doc
* move temporary storage instantiation
* address PR comment
* address PR comment 2
2016-08-01 18:30:39 -07:00
Jonathan Wei
a6105cbb86
Add numeric StringComparator ( #3270 )
...
* Add numeric StringComparator
* Only use direct long comparison for numeric ordering in BoundFilter, add time filtering benchmark query
* Address PR comments, add multithreaded BoundDimFilter test
* Add comment on strlen tie handling
* Add timeseries interval filter benchmark
* Adjust docs
* Use jackson for StringComparator, address PR comments
* Add new TopNMetricSpec and SearchSortSpec with tests (WIP)
* More TopNMetricSpec and SearchSortSpec tests
* Fix NewSearchSortSpec serde
* Update docs for new DimensionTopNMetricSpec
* Delete NumericDimensionTopNMetricSpec
* Delete old SearchSortSpec
* Rename NewSearchSortSpec to SearchSortSpec
* Add TopN numeric comparator benchmark, address PR comments
* Refactor OrderByColumnSpec
* Add null checks to NumericComparator and String->BigDecimal conversion function
* Add more OrderByColumnSpec serde tests
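A minimal sketch of numeric string ordering, assuming a direct long comparison when both values parse as longs and a BigDecimal fallback otherwise. The class name, the null ordering, and the handling of non-numeric strings are assumptions for illustration and are not taken from the patch.

```java
import java.math.BigDecimal;
import java.util.Comparator;

// Hypothetical sketch; not the StringComparator added by this commit.
final class NumericStringComparatorSketch implements Comparator<String> {
    @Override
    public int compare(String a, String b) {
        // Fast path: both strings are plain longs, so no BigDecimal is allocated.
        Long la = parseLongOrNull(a);
        Long lb = parseLongOrNull(b);
        if (la != null && lb != null) {
            return Long.compare(la, lb);
        }
        BigDecimal da = parseDecimalOrNull(a);
        BigDecimal db = parseDecimalOrNull(b);
        if (da != null && db != null) {
            return da.compareTo(db);
        }
        // Assumption: null and non-numeric strings sort before numbers.
        if (da != null) {
            return 1;
        }
        if (db != null) {
            return -1;
        }
        // Neither side is numeric: nulls first, then plain string order.
        if (a == null) {
            return b == null ? 0 : -1;
        }
        if (b == null) {
            return 1;
        }
        return a.compareTo(b);
    }

    private static Long parseLongOrNull(String s) {
        if (s == null) {
            return null;
        }
        try {
            return Long.parseLong(s);
        } catch (NumberFormatException e) {
            return null;
        }
    }

    private static BigDecimal parseDecimalOrNull(String s) {
        if (s == null) {
            return null;
        }
        try {
            return new BigDecimal(s);
        } catch (NumberFormatException e) {
            return null;
        }
    }
}
```

Under this ordering "9" sorts before "10", whereas plain lexicographic comparison puts "10" first.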
2016-07-29 15:44:16 -07:00
Navis Ryu
884017d981
"all" type search query spec ( #3300 )
...
* "all" type search query spec
* addressed comments
* added unit test
2016-07-28 18:16:15 -07:00
Gian Merlino
2553997200
Associate groupBy v2 resources with the Sequence lifecycle. ( #3296 )
...
This fixes a potential issue where groupBy resources could be allocated to
create a Sequence, but then the Sequence is never used, and thus the resources
are never freed.
Also simplifies how groupBy handles config overrides (this made the new
unit test easier to write).
2016-07-27 18:44:19 -07:00
Gian Merlino
9b5523add3
Reference counting, better error handling for resources in groupBy v2. ( #3268 )
...
Refcounting prevents releasing the merge buffer, or closing the concurrent
grouper, before the processing threads have all finished. The better
error handling prevents an avalanche of per-runner exceptions when grouping
resources are exhausted, by grouping those all up into a single merged
exception.
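The refcounting idea can be sketched as follows. This is a generic illustration with hypothetical names, not the classes introduced by the patch: the resource is released only when the last holder lets go, and it cannot be retained again once it has been released.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a reference-counted resource.
final class RefCountedResourceSketch {
    // Starts at 1: the creator holds the first reference.
    private final AtomicInteger refs = new AtomicInteger(1);
    private final Runnable releaser;

    RefCountedResourceSketch(Runnable releaser) {
        this.releaser = releaser;
    }

    // Takes another reference; returns false if the resource is already released.
    boolean retain() {
        while (true) {
            int n = refs.get();
            if (n <= 0) {
                return false;
            }
            if (refs.compareAndSet(n, n + 1)) {
                return true;
            }
        }
    }

    // Drops one reference; the releaser runs exactly once, on the last release.
    void release() {
        while (true) {
            int n = refs.get();
            if (n <= 0) {
                throw new IllegalStateException("released more times than retained");
            }
            if (refs.compareAndSet(n, n - 1)) {
                if (n == 1) {
                    releaser.run();
                }
                return;
            }
        }
    }
}
```

Each processing thread would call retain() before using the resource and release() in a finally block, so an early release by the owner cannot free the resource while a thread is still working.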
2016-07-27 01:59:02 +05:30
Erik Dubbelboer
76fabcfdb2
Fix #2782 , Unit test failed for DruidProcessingConfigTest.testDeserialization ( #3231 )
...
On systems with only one processor this test fails.
2016-07-25 15:51:09 -07:00
kaijianding
3dc2974894
Add timestampSpec to metadata.drd and SegmentMetadataQuery ( #3227 )
...
* save TimestampSpec in metadata.drd
* add timestampSpec info in SegmentMetadataQuery
2016-07-25 15:45:30 -07:00
Jonathan Wei
a42ccb6d19
Support filtering on long columns (including __time) ( #3180 )
...
* Support filtering on __time column
* Rename DruidPredicate
* Add docs for ValueMatcherFactory, add comment on getColumnCapabilities
* Combine ValueMatcherFactory predicate methods to accept DruidCompositePredicate
* Address PR comments (support filter on all long columns)
* Use predicate factory instead of composite predicate
* Address PR comments
* Lazily initialize long handling in selector/in filter
* Move long value parsing from InFilter to InDimFilter, make long value parsing thread-safe
* Add multithreaded selector/in filter test
* Fix non-final lock object in SelectorDimFilter
2016-07-20 17:08:49 -07:00
Gian Merlino
06624c40c0
Share query handling between Appenderator and RealtimePlumber. ( #3248 )
...
Fixes inconsistent metric handling between the two implementations. Formerly,
RealtimePlumber only emitted query/segmentAndCache/time and query/wait and
Appenderator only emitted query/partial/time and query/wait (all per sink).
Now they both do the same thing:
- query/segmentAndCache/time, query/segment/time are the time spent per sink.
- query/cpu/time is the CPU time spent per query.
- query/wait/time is the executor waiting time per sink.
These generally match historical metrics, except segmentAndCache & segment
mean the same thing here, because one Sink may be partially cached and
partially uncached and we aren't splitting that out.
2016-07-19 22:15:13 -05:00
Nishant
7995818220
Increase test timeout to prevent failing on slow machines ( #3224 )
...
The test was constantly timing out on one of the slow build machines;
increasing the timeout fixed it.
Running io.druid.granularity.QueryGranularityTest
Tests run: 33, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 11.776
sec - in io.druid.granularity.QueryGranularityTest
2016-07-17 18:44:48 -07:00
Gian Merlino
6cd1f5375b
Better harmonized dimensions for query metrics. ( #3245 )
...
All query metrics now start with toolChest.makeMetricBuilder, and all of
*those* now start with DruidMetrics.makePartialQueryTimeMetric. Also, "id"
moved to common code, since all query metrics added it anyway.
In particular this will add query-type specific dimensions like "threshold"
and "numDimensions" to servlet-originated metrics like query/time.
2016-07-14 11:55:51 -07:00
Gian Merlino
ea03906fcf
Configurable compressRunOnSerialization for Roaring bitmaps. ( #3228 )
...
Defaults to true, which is a change in behavior (this used to be false and unconfigurable).
2016-07-08 10:24:19 +05:30
Gian Merlino
fdc7e88a7d
Allow queries with no aggregators. ( #3216 )
...
This is actually reasonable for a groupBy or lexicographic topN that is
being used to do a "COUNT DISTINCT" kind of query. No aggregators are
needed for that query, and including a dummy aggregator wastes 8 bytes
per row.
It's kind of silly for timeseries, but why not.
2016-07-06 20:38:54 +05:30
Jonathan Wei
f3a3662133
Fix compile error in SearchBinaryFnTest ( #3201 )
2016-06-29 09:44:45 -05:00
jaehong choi
efbcbf5315
Support alphanumeric sort in search query ( #2593 )
...
* support alphanumeric sort in search query
* address a comment about handling equals() and hashCode()
* address comments
* add UT for string comparators
* address a comment about space indentations.
2016-06-28 15:06:18 -07:00
Hyukjin Kwon
45f553fc28
Replace the deprecated usage of NoneShardSpec ( #3166 )
2016-06-25 10:27:25 -07:00
Gian Merlino
4cc39b2ee7
Alternative groupBy strategy. ( #2998 )
...
This patch introduces a GroupByStrategy concept and two strategies: "v1"
is the current groupBy strategy and "v2" is a new one. It also introduces
a merge buffers concept in DruidProcessingModule, to try to better
manage memory used for merging.
Both of these are described in more detail in #2987 .
There are two goals of this patch:
1. Make it possible for historical/realtime nodes to return larger groupBy
result sets, faster, with better memory management.
2. Make it possible for brokers to merge streams when there are no order-by
columns, avoiding materialization.
This patch does not do anything to help with memory management on the broker
when there are order-by columns or when there are nested queries. That could
potentially be done in a future patch.
2016-06-24 18:06:09 -07:00
Dave Li
8a08398977
Add segment pruning based on secondary partition dimension ( #2982 )
...
* add get dimension rangeset to filters
* add get domain to ShardSpec and added chunk filter in caching clustered client
* add null check and modified not filter, started with unit test
* add filter test with caching
* refactor and some comments
* extract filtershard to helper function
* fixup
* minor changes
* update javadoc
2016-06-24 14:52:19 -07:00
michaelschiff
66d8ad36d7
adds new coordinator metrics 'segment/unavailable/count' and ( #3176 )
...
'segment/underReplicated/count' (#3173 )
2016-06-23 14:53:15 -07:00
Gian Merlino
da660bb592
DumpSegment tool. ( #3182 )
...
Fixes #2723 .
2016-06-23 14:37:50 -07:00
Gian Merlino
a437fb150b
Fix SegmentMetadataQuery when queryGranularity is requested but not present. ( #3181 )
2016-06-23 14:30:50 -07:00
Jonathan Wei
24860a1391
Two-stage filtering ( #3018 )
...
* Two-stage filtering
* PR comment
2016-06-22 16:08:21 -07:00
Nishant
f46ad9a4cb
support Union Segment metadata queries ( #3132 )
...
* support Union Segment metadata queries
fix 3128
* remove extraneous sys out
2016-06-21 10:30:50 -07:00
Dave Li
12be1c0a4b
Add bucket extraction function ( #3033 )
...
* add bucket extraction function
* add doc and header
* updated doc and test
2016-06-17 09:24:27 -07:00
Gian Merlino
ebf890fe79
Update master version to 0.9.2-SNAPSHOT. ( #3133 )
2016-06-13 13:10:38 -07:00
Nishant
0d427923c0
fix caching for search results ( #3119 )
...
* fix caching for search results
properly read count when reading from cache.
* fix NPE during merging search count and add test
* Update cache key to invalidate prev results
2016-06-09 17:49:47 -07:00
Gian Merlino
5998de7d5b
Fix lenient merging of conflicting aggregators. ( #3113 )
...
This should have marked the conflicting aggregator as null, but instead it
threw an NPE for the entire query.
2016-06-08 15:56:48 -07:00
Jonathan Wei
37c8a8f186
Speed up filter tests with adapter cache ( #3103 )
2016-06-08 07:41:10 -07:00
Gian Merlino
54139c6815
Fix NPE in registeredLookup extractionFn when "optimize" is not provided. ( #3064 )
2016-06-03 12:58:17 -05:00
Gian Merlino
6171e078c8
Improve NPE message in LookupDimensionSpec when lookup does not exist. ( #3065 )
...
The message used to be empty, which made things hard to debug.
2016-06-02 19:59:12 -07:00
John Wang
e662efa79f
segment interface refactor for proposal 2965 ( #2990 )
2016-05-26 20:36:41 -07:00
Kurt Young
b5bd406597
fix #2991 : race condition in OnheapIncrementalIndex#addToFacts ( #3002 )
...
* fix #2991 : race condition in OnheapIncrementalIndex#addToFacts
* add missing header
* handle parseExceptions when doing first agg
2016-05-25 19:05:46 -07:00
Jonathan Wei
b72c54c4f8
Add benchmark data generator, basic ingestion/persist/merge/query benchmarks ( #2875 )
2016-05-25 16:39:37 -07:00
Dave Li
dcabd4b1ee
Add lookup optimization for InDimFilter ( #2938 )
...
* Add lookup optimization for InDimFilter
* tests for in filter with lookup extraction fn
* refactor
* refactor2 and modified filter test
* make optimizeLookup private
2016-05-19 16:29:16 -07:00
Charles Allen
15ccf451f9
Move QueryGranularity static fields to QueryGranularities ( #2980 )
...
* Move QueryGranularity static fields to QueryGranularityUtil
* Fixes #2979
* Add test showing #2979
* change name to QueryGranularities
2016-05-17 16:23:48 -07:00
Charles Allen
fb01db4db7
[QTL] Allows RegisteredLookupExtractionFn to find its lookups lazily ( #2971 )
...
* Allows RegisteredLookupExtractionFn to find its lookups lazily
* Use raw variables instead of AtomicReference
* Make sure to use volatile
* Remove extra local variable.
* Move from BAOS to ByteBuffer
2016-05-17 11:29:39 -07:00
Himanshu
d3e9c47a5f
use correct ObjectMapper in Index[IO/Merger] in AggregationTestHelper and minor fix in theta sketch SketchMergeAggregatorFactory.getMergingFactory(..) ( #2943 )
2016-05-13 10:06:31 +05:30
Himanshu
d821144738
at historicals GpBy query mergeResults does not need merging as results are already merged by GroupByQueryRunnerFactory.mergeRunners(..) ( #2962 )
2016-05-12 17:41:24 -07:00
Gian Merlino
01bebf432a
GroupByQuery: Multi-value dimension tests. ( #2959 )
2016-05-12 11:31:50 -07:00
Charles Allen
a31348450f
Add toString for LookupConfig ( #2935 )
...
* Helps with operations and getting where the snapshot dir is
2016-05-09 18:20:00 -07:00
Dave Li
79a54283d4
Optimize filter for timeseries, search, and select queries ( #2931 )
...
* Optimize filter for timeseries, search, and select queries
* exception at failed toolchest type check
* took out query type check
* java7 error fix and test improvement
2016-05-09 11:04:06 -07:00
Slim
8b570ab130
make it clear what LookupExtractorFactory start/stop methods return ( #2925 )
2016-05-05 10:38:40 -07:00
David Lim
b489f63698
Supervisor for KafkaIndexTask ( #2656 )
...
* supervisor for kafka indexing tasks
* cr changes
2016-05-04 23:13:13 -07:00
Himanshu
8e2742b7e8
adding QueryGranularity to segment metadata and optionally expose same from segmentMetadata query ( #2873 )
2016-05-03 11:31:10 -07:00
Gian Merlino
40e595c7a0
Remove types from TimeAndDims, they aren't needed. ( #2865 )
2016-05-03 13:10:25 -05:00
binlijin
841be5c61f
periodically emit metric segment/scan/pending ( #2854 )
2016-05-02 22:38:13 -07:00
Navis Ryu
2729fea84d
Fix parsing fail of segment id with datasource containing underscore ( #2797 )
...
* Fix parsing fail of segment id with underscored datasource (Fix for #2786 )
* addressed comment
* renamed and moved code into api. added log4 dependency for tests
* addressed comments
* fixed test fails
2016-05-02 22:37:28 -07:00
Gian Merlino
90ce03c66f
Fix integer overflow in SegmentMetadataQuery numRows. ( #2890 )
2016-04-27 14:37:04 -07:00
Gian Merlino
6dc7688a29
TimeAndDims equals/hashCode implementation. ( #2870 )
...
Adapted from #2692 , thanks @navis for original implementation.
2016-04-22 08:45:20 +08:00
Himanshu
3cfd9c64c9
make singleThreaded groupBy query config overridable at query time ( #2828 )
...
* make isSingleThreaded groupBy query processing overridable at query time
* refactor code in GroupByMergedQueryRunner to make processing of single threaded and parallel merging of runners consistent
2016-04-21 17:12:58 -07:00
Slim
984a518c9f
Merge pull request #2734 from b-slim/LookupIntrospection2
...
[QTL][Lookup] adding introspection endpoint
2016-04-21 12:15:57 -05:00
Gian Merlino
c74391e54c
JavaScript: Ability to disable. ( #2853 )
...
Fixes #2852 .
2016-04-21 09:43:15 -05:00
Gian Merlino
7d3e55717d
Reduce cost of various toFilter calls. ( #2860 )
...
These happen once per segment and so it's better if they don't do
as much work.
2016-04-21 04:28:46 +08:00
Gian Merlino
59460b17cc
Add Filters.matchPredicate helper, use it where appropriate. ( #2851 )
...
This approach simplifies code and is generally faster, due to skipping
unnecessary dictionary lookups (see #2850 ).
2016-04-19 15:54:32 -07:00
Xavier Léauté
b2745befb7
remove obsolete comment ( #2858 )
2016-04-19 13:06:58 -07:00
Jisoo Kim
7b65ca7889
refactor ClientQuerySegmentWalker ( #2837 )
...
* refactor ClientQuerySegmentWalker
* add header to FluentQueryRunnerBuilder
* refactor QueryRunnerTestHelper
2016-04-18 14:00:47 -07:00
Gian Merlino
7c0b1dde3a
DimensionPredicateFilter: Skip unnecessary dictionary lookup. ( #2850 )
2016-04-18 12:38:25 -07:00
Jonathan Wei
b534f7203c
Fix performance regression from #2753 in IndexMerger ( #2841 )
2016-04-14 21:39:41 -07:00
Jonathan Wei
a26134575b
Fix NPE in TopNLexicographicResultBuilder.addEntry() ( #2835 )
2016-04-13 17:27:16 -07:00
Fangjin Yang
abd951df1a
Document how to use roaring bitmaps ( #2824 )
...
* Document how to use roaring bitmaps
This fixes #2408 .
While not all indexSpec properties are explained, it does explain how roaring bitmaps can be turned on.
* fix
* fix
* fix
* fix
2016-04-12 19:28:02 -07:00
michaelschiff
db35dd7508
fix issue #2744 . Check for null before combining metrics ( #2774 )
2016-04-12 14:46:31 -07:00
Nishant
1bf1dd03a0
Merge pull request #2812 from mrijke/fix-missing-equals-hashcode-filters
...
Add missing equals/hashcode to JS, Regex and SearchQuery DimFilters
2016-04-12 12:00:23 +05:30
Charles Allen
21e406613c
Merge pull request #2809 from metamx/fix2694
...
Fix test for snapshot taker to better check for lookup persist failure
2016-04-11 14:52:47 -07:00
Maarten Rijke
de68d6b7c4
Add missing equals/hashcode to JS, Regex and SearchQuery DimFilters
...
This commit adds missing equals() and hashcode() methods to
the JavascriptDimFilter, RegexDimFilter and the SearchQueryDimFilter.
2016-04-11 12:16:24 +02:00
Nishant
bbb326decf
Merge pull request #2799 from b-slim/fix_snapshot
...
MapLookupFactory need to be Ser/Desr ready.
2016-04-07 13:22:34 +05:30
Slim Bouguerra
bf1eafc4e1
remove all the mock lookupFactory
2016-04-06 15:37:52 -05:00
Slim Bouguerra
59eb2490a0
MapLookupFactory need to be Ser/Desr.
2016-04-06 15:02:18 -05:00
Charles Allen
f915a59138
Merge pull request #2691 from metamx/lookupExtrFn
...
Add ExtractionFn to LookupExtractor bridge
2016-04-06 09:13:08 -07:00
jon-wei
051fd6c0eb
Remove extra println from InFilter
2016-04-05 14:55:49 -07:00
Fangjin Yang
289bb6f885
Merge pull request #2690 from jon-wei/filter_support
...
Allow filters to use extraction functions
2016-04-05 15:40:15 -06:00
jon-wei
0e481d6f93
Allow filters to use extraction functions
2016-04-05 13:24:56 -07:00
Gian Merlino
e060a9f283
Additional ExtractionFn null-handling adjustments.
...
Followup to comments on #2771 .
2016-04-01 18:35:26 -07:00
Fangjin Yang
18b9ea62cf
Merge pull request #2771 from gianm/extractionfn-stuff
...
Various ExtractionFn null handling fixes.
2016-04-01 16:35:46 -07:00
Gian Merlino
23d66e5ff9
Merge pull request #2765 from navis/invalid-encode-nullstring
...
Null string is encoded as "null" in incremental index
2016-04-01 14:43:40 -07:00
Gian Merlino
b6e4d8b2c1
Various ExtractionFn null handling fixes.
...
- JavaScriptExtractionFn shouldn't pass empty strings to its JS functions
- Upper/LowerExtractionFn properly handles null Objects (DimExtractionFn's implementation works here)
- MatchingDimExtractionFn properly returns nulls rather than empties
- RegexDimExtractionFn properly attempts matching on nulls and empties
- SearchQuerySpecDimExtractionFn properly returns nulls when passed empties
2016-04-01 14:34:47 -07:00
Fangjin Yang
eea7a47870
Merge pull request #2576 from navis/paging-from-next
...
Add option for select query to get next page without modifying returned paging identifiers
2016-04-01 13:50:36 -07:00
Fangjin Yang
4eb5a2c4f1
Merge pull request #2715 from navis/stringformat-null-handling
...
stringFormat extractionFn should be able to return null on null values (Fix for #2706 )
2016-04-01 13:45:28 -07:00
Gian Merlino
23364a47fd
BaseFilterTest: Test optimized filters too.
2016-04-01 12:44:59 -07:00
navis.ryu
077522a46f
stringFormat extractionFn should be able to return null on null values (Fix for #2706 )
2016-04-01 13:40:56 +09:00
navis.ryu
f0e55f5d31
Null string is encoded as "null" in incremental index
2016-04-01 09:47:15 +09:00
navis.ryu
29bb00535b
Add option for select query to get next page without modifying returned paging identifiers
2016-04-01 09:03:03 +09:00
Gian Merlino
5f9240fcbc
Merge pull request #2577 from navis/native-in-filter
...
Implement native in filter
2016-03-30 20:02:54 -07:00
Fangjin Yang
3d68da94fe
Merge pull request #2661 from navis/utf8-estimated-length
...
Utility method for length estimation of utf8
2016-03-30 19:56:14 -07:00
navis.ryu
108535fd07
Implement native in filter (Fix for #2577 )
2016-03-31 10:10:57 +09:00
navis.ryu
e0cfd9ee19
Utility method for length estimation of utf8
2016-03-31 10:07:00 +09:00
jon-wei
5503bf1b38
Remove unnecessary type check in TimeAndDimsComp
2016-03-30 17:54:15 -07:00
Fangjin Yang
95733a362f
Merge pull request #2753 from gianm/null-filtering-multi-value-columns
...
More consistent empty-set filtering behavior on multi-value columns.
2016-03-29 18:52:25 -07:00
Charles Allen
95d42cfd9e
Merge pull request #2758 from pjain1/fix_npe_in_filter
...
handle null values in In Filter
2016-03-29 17:53:02 -07:00
Gian Merlino
1853f36e9f
More consistent empty-set filtering behavior on multi-value columns.
...
The behavior is now that filters on "null" will match rows with no
values. The behavior in the past was inconsistent; sometimes these
filters would match and sometimes they wouldn't.
Adds tests for this behavior to SelectorFilterTest and
BoundFilterTest, for query-level filters and filtered aggregates.
Fixes #2750 .
2016-03-29 15:32:13 -07:00
Parag Jain
d892918a3d
handle null values in In Filter
2016-03-29 17:03:26 -05:00
Fangjin Yang
e023df2b92
Merge pull request #2754 from gianm/i-dont-get-it
...
Remove error suppression code from IncrementalIndexAdapter.
2016-03-28 19:29:53 -07:00
Gian Merlino
c7ff0d698e
Remove error suppression code from IncrementalIndexAdapter.
2016-03-28 18:40:27 -07:00
fjy
c418a55638
cleanup distinct count agg
2016-03-28 17:29:41 -07:00
Fangjin Yang
9cb197adec
Merge pull request #2722 from himanshug/fix_hadoop_jar_upload
...
config to explicitly specify classpath for hadoop container during hadoop ingestion
2016-03-28 14:49:03 -07:00
Charles Allen
4a98c4fbac
Fix LookupExtractionFn equals and hashCode
2016-03-28 13:14:43 -07:00
Charles Allen
0ee861d0da
Add ExtractionFn to LookupExtractor bridge
2016-03-28 13:14:43 -07:00
Fangjin Yang
7fe277e6da
Merge pull request #2727 from gianm/optimize-bound-filter
...
BoundFilter optimizations, and related interface changes.
2016-03-26 18:59:05 -07:00
Fangjin Yang
0dae28b6af
Merge pull request #2729 from jon-wei/fix_hyperunique_comparator
...
Fix HyperUniquesAggregatorFactory comparator
2016-03-26 15:39:35 -07:00
Gian Merlino
2970b49adc
BoundFilter optimizations, and related interface changes.
...
BoundFilter:
- For lexicographic bounds, use bitmapIndex.getIndex to find the start and end points,
then union all bitmaps between those points.
- For alphanumeric bounds, iterate through dimValues, and union all bitmaps for values
matching the predicate.
- Change behavior for nulls: it used to be that the BoundFilter would never match nulls,
now it matches nulls if "" is allowed by the lower limit and not excluded by the
upper limit.
Interface changes:
- BitmapIndex: add `int getIndex(value)` to make it possible to get the index for a
value without retrieving the bitmap.
- BitmapIndex: remove `ImmutableBitmap getBitmap(value)`, change callers to `getBitmap(getIndex(value))`.
- BitmapIndexSelector: allow retrieving the underlying BitmapIndex through getBitmapIndex.
- Clarified contract of indexOf in Indexed, GenericIndexed.
Also added tests for SelectorFilter, NotFilter, and BoundFilter.
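The lexicographic case can be sketched roughly as below: locate the start and end positions in a sorted dictionary, then union the bitmaps between them. This is only an illustration under stated assumptions. It uses java.util.BitSet and Arrays.binarySearch in place of the BitmapIndex and ImmutableBitmap types named above, treats both bounds as inclusive, and all class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.BitSet;

// Hypothetical sketch; not the BoundFilter implementation.
final class LexicographicBoundSketch {
    // dictionary: sorted distinct values; bitmaps[i]: rows that contain dictionary[i].
    // Returns the rows whose value v satisfies lower <= v <= upper.
    static BitSet rowsInRange(String[] dictionary, BitSet[] bitmaps, String lower, String upper) {
        int lowerIdx = Arrays.binarySearch(dictionary, lower);
        int upperIdx = Arrays.binarySearch(dictionary, upper);
        // A negative result encodes -(insertionPoint) - 1.
        int start = lowerIdx >= 0 ? lowerIdx : -(lowerIdx + 1);
        int end = upperIdx >= 0 ? upperIdx + 1 : -(upperIdx + 1);

        BitSet result = new BitSet();
        for (int i = start; i < end; i++) {
            result.or(bitmaps[i]); // union every bitmap between the two points
        }
        return result;
    }
}
```

Only the two binary searches touch the dictionary values; everything between the start and end points is unioned without any per-value string comparison.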
2016-03-25 14:11:48 -07:00
jon-wei
9afaa2b94a
Fix HyperUniquesAggregatorFactory comparator
2016-03-25 12:36:42 -07:00
Gian Merlino
4ac9e03161
Fix predicate-based ValueMatcher behavior for IncrementalIndex on missing columns.
...
Missing columns should be treated the same as columns containing 100% nulls.
2016-03-25 10:23:59 -07:00
Himanshu Gupta
e78a469fb7
UTs for ExtensionsConfig
2016-03-25 10:51:28 -05:00
Himanshu Gupta
004b00bb96
config to explicitly specify classpath for hadoop container during hadoop ingestion
2016-03-25 10:51:28 -05:00
Nishant
0b03c9405f
Merge pull request #2614 from sirpkt/calendric_gran
...
Support week, month, quarter, and year in query granularity
2016-03-24 16:21:01 -07:00
Himanshu
56343c6cdc
Merge pull request #2704 from navis/simple-optimize
...
optimize single elemented and/or filter
2016-03-24 16:13:48 -05:00
Gian Merlino
713062053c
Filters: Add filter.toFilter method, use that instead of the instanceof chain in Filters.
...
I believe that the instanceof chain in Filters exists because in the past, Filter
and DimFilter were in different packages (DimFilter was in druid-client and Filter
was in druid-processing). And since druid-client didn't depend on druid-processing,
DimFilter couldn't have a toFilter method. But now it can.
2016-03-23 17:03:49 -07:00
Gian Merlino
dd86198902
All Filters should work with FilteredAggregators.
...
This removes Filter.makeMatcher(ColumnSelectorFactory) and adds a
ValueMatcherFactory implementation to FilteredAggregatorFactory so it can
take advantage of existing makeMatcher(ValueMatcherFactory) implementations.
This patch also removes the Bound-based method from ValueMatcherFactory. Its
only user was the SpatialFilter, which could use the Predicate-based method.
Fixes #2604 .
2016-03-23 12:24:01 -07:00
binlijin
57d78d3293
clean tmp file when index merge fail
2016-03-23 10:55:12 +08:00
navis.ryu
91f6be4884
optimize single elemented and/or filter
2016-03-23 09:29:15 +09:00
Gian Merlino
ff25325f3b
Improved docs for multi-value dimensions.
...
- Add central doc for multi-value dimensions, with some content from other docs.
- Link to multi-value dimension doc from topN and groupBy docs.
- Fixes a broken link from dimensionspecs.md, which was presciently already
linking to this nonexistent doc.
- Resolve inconsistent naming in docs & code (sometimes "multi-valued", sometimes
"multi-value") in favor of "multi-value".
2016-03-22 14:40:55 -07:00
jon-wei
a59c9ee1b1
Support use of DimensionSchema class in DimensionsSpec
2016-03-21 13:12:04 -07:00
Keuntae Park
7f29f2ac3b
support week, month, quarter, year in query granularity
2016-03-21 17:41:53 +09:00
Charles Allen
5da9a280b6
Query Time Lookup - Dynamic Configuration
2016-03-18 09:45:05 -07:00
Gian Merlino
738dcd8cd9
Update version to 0.9.1-SNAPSHOT.
...
Fixes #2462
2016-03-17 10:34:20 -07:00
Slim
cf342d8d3c
Merge pull request #2517 from b-slim/adding_lookup_snapshot_utility
...
[QTL][Lookup] lookup module with the snapshot utility
2016-03-17 11:39:47 -05:00
Slim Bouguerra
0c86b29ef0
lookup module with the snapshot utility
2016-03-17 09:20:41 -05:00
Charles Allen
2ac8a22173
Merge pull request #2579 from metamx/closerIsCloser
...
Make CloserRule use guava's Closer
2016-03-14 17:18:19 -07:00
Charles Allen
a64979463f
Make CloserRule use guava's Closer
2016-03-14 15:01:24 -07:00
Fangjin Yang
06813b510a
Merge pull request #2571 from himanshug/gp_by_avoid_sort
...
avoid sort while doing groupBy merging when possible
2016-03-14 14:46:51 -07:00
Fangjin Yang
dbdbacaa18
Merge pull request #2260 from navis/cardinality-for-searchquery
...
Support cardinality for search query
2016-03-14 13:24:40 -07:00
Slim
8cc3582e70
Merge pull request #2644 from metamx/optimize-timeboundary
...
optimize timeboundary for min or max bound
2016-03-13 13:16:24 -05:00
navis.ryu
be341bf4e3
Support cardinality for search query (Fix for #2260 )
2016-03-12 09:51:01 +09:00
Xavier Léauté
6f0d6ef0e9
optimize timeboundary for min or max bound
2016-03-11 14:11:47 -08:00
Gian Merlino
8a11161b20
Plumbers: Move plumber.add out of try/catch for ParseException.
...
The incremental indexes handle that now so it's not necessary.
Also, add debug logging and more detailed exceptions to the incremental
indexes for the case where there are parse exceptions during aggregation.
2016-03-10 16:39:26 -08:00
Himanshu Gupta
dc0214bddb
while GroupBy merging use unsorted facts in IncrementalIndex wherever possible
2016-03-10 16:11:48 -06:00
Himanshu Gupta
02dfd5cd80
update IncrementalIndex to support unsorted facts map that can be used in groupBy merging to improve performance
2016-03-10 16:11:48 -06:00
Xavier Léauté
90d7409e1a
Merge pull request #2611 from himanshug/gp_by_max_limit
...
only allow lowering maxResults and maxIntermediateRows from groupBy query context
2016-03-10 13:44:13 -08:00
Gian Merlino
a2b1652787
Clarify parser docs.
...
- Clarify what parseSpecs are used for.
- Avro, Protobuf should use timeAndDims parseSpecs.
- Hadoop jobs should use hadoopyString string parsers.
2016-03-10 08:45:04 -08:00
Fangjin Yang
68cffe1d91
Merge pull request #2615 from gianm/timeseries-skipEmptyBuckets-cache
...
Fix caching of skipEmptyBuckets for TimeseriesQuery.
2016-03-09 18:45:59 -08:00
Gian Merlino
708bc674fa
Make specifying query context booleans more consistent.
...
Before, some needed to be strings and some needed to be real booleans. Now
they can all be either one.
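One way to accept either form is sketched below. The helper name and the error handling are assumptions for illustration, not the code from this commit.

```java
import java.util.Map;

// Hypothetical helper for reading a boolean flag from a query context map.
final class ContextBooleanSketch {
    static boolean getBoolean(Map<String, Object> context, String key, boolean defaultValue) {
        Object value = context == null ? null : context.get(key);
        if (value == null) {
            return defaultValue;
        }
        if (value instanceof Boolean) {
            return (Boolean) value; // real JSON boolean
        }
        if (value instanceof String) {
            return Boolean.parseBoolean((String) value); // "true" / "false" as a string
        }
        throw new IllegalArgumentException(
            "Expected boolean or string for [" + key + "], got " + value.getClass().getName());
    }
}
```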
2016-03-08 19:38:26 -08:00
Gian Merlino
40dad6dff4
Fix caching of skipEmptyBuckets for TimeseriesQuery.
2016-03-08 19:22:12 -08:00
Himanshu Gupta
ca5de3f583
only allow lowering maxResults and maxIntermediateRows from groupBy query context
2016-03-08 15:03:59 -06:00
Himanshu Gupta
099acb4966
allow groupBy max[Intermediate]Rows limit be overridable by context
2016-03-07 15:22:41 -06:00
Himanshu Gupta
c544ebf25e
reintroducing the safety check removed in commit-1d602be so that dim value ids are less than cardinality
2016-03-03 23:34:23 -06:00
Bingkun Guo
4a58462fc7
update querySegmentSpec when passing query to getQueryRunner
...
After finding the FireChief for a specific partition, Druid needs to find the specific queryRunner for each segment being queried by passing the query to the FireChief. Currently Druid passes the original query, which contains all the segments that need to be queried, so fireChief.getQueryRunner(query) may return more than one queryRunner because query.getIntervals() is not specific to a single segment.
In this patch, for each segment being queried, Druid will update the query with its corresponding SpecificSegmentSpec.
2016-03-02 16:44:56 -06:00
Nishant
31b502773a
Merge pull request #2480 from navis/pagingfail-over-segments
...
Select query cannot span to next segment with paging
2016-03-01 11:42:41 +05:30
Fangjin Yang
e5c25725c0
Merge pull request #2562 from himanshug/fix_2556
...
with nested GpBy query outer query results need to be further merged
2016-02-29 12:17:33 -08:00
Himanshu Gupta
0722ced413
with GpBy query outer query results need to be further merged
2016-02-29 10:16:25 -06:00
navis.ryu
b1ff920831
Lazily initialize predicate for bound filter
2016-02-29 15:35:52 +09:00
navis.ryu
5f1e60324a
Added more complex test case with versioned segments
2016-02-29 14:48:24 +09:00
navis.ryu
2686bfa394
Select query cannot span to next segment with paging
2016-02-29 00:01:46 +09:00
Fangjin Yang
29d29ba98d
Merge pull request #2263 from jon-wei/flex_dims3
...
Allow IncrementalIndex to store Long/Float dimensions
2016-02-25 17:23:02 -08:00
jon-wei
c17ce02467
Allow IncrementalIndex to store Long/Float dimensions
2016-02-24 13:51:57 -08:00
jon-wei
fd3782522c
Rename 'replaceMissingValues...' parameters in RegexExtractionFn
2016-02-24 13:12:56 -08:00
Nishant
fb7eae34ed
Merge pull request #2249 from metamx/workerExpanded
...
Use Worker instead of ZkWorker whenever possible
2016-02-24 13:23:22 +05:30
Charles Allen
ac13a5942a
Use Worker instead of ZkWorker whenever possible
...
* Moves last run task state information to Worker
* Makes WorkerTaskRunner a TaskRunner which has interfaces to help with getting information about a Worker
2016-02-23 15:02:03 -08:00
Gian Merlino
3534483433
Better handling of ParseExceptions.
...
Two changes:
- Allow IncrementalIndex to suppress ParseExceptions on "aggregate".
- Add "reportParseExceptions" option to realtime tuning configs. By default this is "false".
Behavior of the counters should now be:
- processed: Number of rows indexed, including rows where some fields could be parsed and some could not.
- thrownAway: Number of rows thrown away due to rejection policy.
- unparseable: Number of rows thrown away due to being completely unparseable (no fields salvageable at all).
If "reportParseExceptions" is true then "unparseable" will always be zero (because a parse error would
cause an exception to be thrown). In addition, "processed" will only include fully parseable rows
(because even partial parse failures will cause exceptions to be thrown).
Fixes #2510 .
2016-02-23 10:11:43 -08:00
Fangjin Yang
3bdd757024
Merge pull request #1773 from b-slim/log_details
...
Adding downstream source when throwing QueryInterruptedException
2016-02-22 10:16:07 -08:00
Slim Bouguerra
77925cc061
adding downstream source of QueryInterruptedException
2016-02-20 13:05:14 -06:00
Fangjin Yang
8ee81947cd
Merge pull request #2494 from himanshug/fix_timeseries
...
do not drop post-aggs in TimeseriesQueryToolChest.makePreComputeManipulatorFn
2016-02-20 10:37:32 -08:00
Gian Merlino
d25c46cb9f
Add comparator to HyperUniquesFinalizingPostAggregator.
...
This makes it possible to do groupBys with clauses like "HAVING uniques > 10".
Beforehand you couldn't do it with either an aggregator (because it returns
an HLLV1 which the havingSpec can't understand) or a finalized postaggregator
(because it didn't have a comparator).
Now you can at least do it with a finalizing postaggregator. Trying it with
the aggregator alone still doesn't work.
Added some topN and groupBy tests verifying the comparator, and added an
@Ignore test that should pass if havingSpecs are made to work on the aggregator
directly.
2016-02-19 08:36:08 -08:00
Himanshu Gupta
11b0117422
do not drop post-aggs in timeseries query tool chest makePreComputeManipulatorFn like other query types
2016-02-17 20:51:35 -06:00
Jaehong Choi
32b9d57b23
handle a failing UT in GroupByQueryRunnerTest after merging into the master
2016-02-16 16:56:57 +09:00
Jaehong Choi
b25bca85bc
Merge branch 'master' of https://github.com/druid-io/druid into support-alphanumeric-dimensional-sort-in-gropu-by
2016-02-16 16:42:05 +09:00
Jaehong Choi
e89afc901b
delete System.out.println() in test code
2016-02-16 15:26:37 +09:00
Navis Ryu
cd315627c9
Merge pull request #2393 from CHOIJAEHONG1/support-alphanumeric-dimensional-sort-in-gropu-by
...
support alphanumeric sorting for dimensional columns in groupby (#2393 )
2016-02-16 14:11:30 +09:00
Slim
16092eb5e2
Merge pull request #2464 from gianm/print-properties
...
Make startup properties logging optional.
2016-02-14 15:11:35 -06:00
Gian Merlino
e0c049c0b0
Make startup properties logging optional.
...
Off by default, but enabled in the example config files. See also #2452 .
2016-02-12 14:12:16 -08:00
Himanshu Gupta
da5fcd0124
before facts get it, indexAndOffsets should already know about it
2016-02-12 13:32:06 -06:00
Jonathan Wei
d63eec65a1
Merge pull request #2208 from navis/metadataquery-minmax
...
Support min/max values for metadata query
2016-02-11 17:28:07 -08:00
Jonathan Wei
e1b022eac9
Merge pull request #2349 from navis/dimensionspec-for-selectquery
...
Support dimension spec for select query
2016-02-11 16:38:16 -08:00
navis.ryu
dd2375477a
Support min/max values for metadata query ( #2208 )
2016-02-12 09:35:58 +09:00
Gian Merlino
2d037ef05e
Merge pull request #2453 from DreamLab/fix/topn_sorting_anomaly
...
Fix for unstable behavior of HyperLogLog comparator
2016-02-11 16:05:34 -08:00
navis.ryu
4d63196535
Support dimension spec for select query
2016-02-12 08:54:28 +09:00
Himanshu
47d48e1e67
Merge pull request #2452 from gianm/print-properties
...
PropertiesModule: Print properties, processors, totalMemory on startup.
2016-02-11 16:49:34 -06:00
turu
f277a54a5c
removed unsafe heuristics from hll compareTo and provided unit test for regression
2016-02-11 23:46:24 +01:00
Slim
368988d187
Merge pull request #2291 from druid-io/lookupManager
...
Promoting LookupExtractor state and LookupExtractorFactory to be a first class druid state object.
2016-02-11 16:07:27 -06:00
Gian Merlino
29f7758e74
PropertiesModule: Print properties, processors, totalMemory on startup.
2016-02-11 13:51:08 -08:00
Slim Bouguerra
4e119b7a24
Adding lookup ref manager and lookup dimension spec impl
2016-02-11 12:11:51 -06:00
Jaehong Choi
2f2e2ff5b9
support alphanumeric sorting for dimensional columns in groupby
2016-02-11 17:31:28 +09:00
Keuntae Park
05a144e39a
fix crash with filtered aggregator at ingestion time
...
- only for selector filter because extraction filter is not supported as
cardinality is not fixed at ingestion time
2016-02-11 11:25:33 +09:00
Fangjin Yang
b1673ee90e
Merge pull request #2409 from gianm/smq-merged-thing
...
SegmentMetadataQuery: Retain segment id when merging, if possible.
2016-02-08 15:43:39 -08:00
Fangjin Yang
c9c20bb7f3
Merge pull request #2395 from metamx/fixExtractionDimFilterNullTest
...
Actually check cache key null checking in ExtractionDimFilterTest
2016-02-08 14:10:52 -08:00
Gian Merlino
bd9c04244f
SegmentMetadataQuery: Retain segment id when merging, if possible.
...
This is helpful on realtime nodes, where two analyses from two different hydrants
are merged together but they are actually from the same segment.
2016-02-08 13:07:02 -08:00
Himanshu Gupta
9fe1b28ee5
provide configuration to enable usage of Off heap merging for groupBy query
2016-02-05 14:18:06 -06:00
Himanshu Gupta
b40c342cd1
make Global stupid pool cache size configurable
2016-02-05 14:18:06 -06:00
Himanshu Gupta
72a1e730a2
OffheapIncrementalIndex updates to do the aggregation merging off-heap
2016-02-05 14:17:05 -06:00
Himanshu Gupta
907dd77483
OffheapIncrementalIndex a copy/paste of OnheapIncrementalIndex
2016-02-05 14:02:31 -06:00
Charles Allen
aac5f9b2c9
Actually check cache key null checking in ExtractionDimFilterTest
2016-02-04 09:44:13 -08:00
fjy
1aa363cea7
new quickstart
2016-02-04 09:37:38 -08:00
Fangjin Yang
da77591129
Merge pull request #2392 from metamx/fix2391
...
Allow ExtractionDimFilter value to be null
2016-02-03 17:47:14 -08:00
Charles Allen
d4f00096ff
Allow ExtractionDimFilter value to be null
...
* Fixes #2391
2016-02-03 15:51:47 -08:00
Himanshu Gupta
6e7d90cf56
UTs for DefaultLimitSpec
2016-02-03 15:59:12 -06:00
Himanshu Gupta
29e0d7f971
lazily create comparators for row columns when needed
2016-02-03 13:38:20 -06:00
navis.ryu
1d602be0f9
Replace string[] with int[] for dimensions
2016-02-03 15:03:22 +09:00
binlijin
a5ef30ff84
optimize topN in a particular situation
2016-02-02 14:20:09 +08:00
Himanshu
93c50d8538
Merge pull request #2094 from navis/simplify-index-merge
...
Simplifying dimension merging
2016-01-29 11:23:14 -06:00
navis.ryu
55a888ea2f
time-descending result of select queries
2016-01-29 10:06:05 +09:00
navis.ryu
dd774ef4dd
one-pass merging of dictionary & index
2016-01-29 10:03:53 +09:00
Himanshu
edd7ce58aa
Merge pull request #2348 from AlexanderSaydakov/fix-aggregator-test-helper
...
fixed createIndex
2016-01-28 16:01:36 -06:00
saydakov
e0860661b1
fixed createIndex
2016-01-28 13:20:50 -08:00
Nishant
99017f4518
Merge pull request #2326 from navis/use-reverse-iterator
...
use reverse-iterator if possible
2016-01-28 19:48:38 +05:30
Nishant
3880f54b87
Merge pull request #2332 from himanshug/configurable_partial
...
make populateUncoveredIntervals a configuration in query context
2016-01-28 10:34:35 +05:30
navis.ryu
7324ece8f9
use reverse-iterator if possible
2016-01-28 09:04:55 +09:00
Xavier Léauté
5a3642bb93
Merge pull request #2247 from metamx/pedanticBuild
...
Enable strict building in travis
2016-01-27 10:27:03 -08:00
Xavier Léauté
2e5004095a
Merge pull request #2341 from gianm/smq-test
...
SegmentMetadataQuery: Fix merging of ColumnAnalysis errors.
2016-01-27 09:37:06 -08:00
Charles Allen
508734c8b0
Long constant reformatting in tests `l` --> `L`
2016-01-27 08:59:19 -08:00
Gian Merlino
b1e6c01762
Make LookupExtractor abstract methods public, they have to work across classloaders.
2016-01-26 23:08:03 -08:00
Gian Merlino
795343f7ef
SegmentMetadataQuery: Fix merging of ColumnAnalysis errors.
...
Also add tests for:
- ColumnAnalysis folding
- Mixed mmap/incremental merging
2016-01-26 17:16:26 -08:00
Himanshu Gupta
3719b6e3c8
make populateUncoveredIntervals a configuration in query context
2016-01-26 15:13:45 -06:00
Himanshu
3844658fb5
Merge pull request #2323 from druid-io/update-druidapi
...
Update druid-api to 0.3.16
2016-01-26 13:02:10 -06:00
Himanshu Gupta
09d3678667
adding single threaded indexing and querying test for IncrementalIndex
2016-01-23 00:17:14 -06:00
Charles Allen
0000b9fc62
Remove sorting in ProtoBufInputRowParserTest
...
Due to processing/src/test/java/io/druid/data/input/ProtoBufInputRowParserTest.java
2016-01-22 16:02:25 -08:00
Himanshu Gupta
2f7f5119cf
older segments might not have field bitmapSerdeFactory for dimension columns and we must use appropriate default
2016-01-22 13:28:25 -06:00
binlijin
1d1f4d996d
Merge pull request #2111 from binlijin/optimize-create-inverted-indexes
...
optimize create inverted indexes
2016-01-22 11:36:27 +08:00
binlijin
55f7dd4629
optimize create inverted indexes
2016-01-22 10:40:09 +08:00
Gian Merlino
d416279c14
SegmentMetadataQuery support for returning aggregators.
2016-01-21 17:27:25 -08:00
Fangjin Yang
5a9cd89059
Merge pull request #2305 from gianm/segment-metadata-query-multivalues
...
Add StorageAdapter#getColumnTypeName, and various SegmentMetadataQuery adjustments
2016-01-21 17:22:34 -08:00
Gian Merlino
e5913be90e
Merge pull request #2257 from tubemogul/index-merge-bug
...
Adds support for empty merge metrics. fixes #2256
2016-01-21 16:38:00 -08:00
Gian Merlino
87c8046c6c
Add StorageAdapter#getColumnTypeName, and various SegmentMetadataQuery adjustments.
...
SegmentMetadataQuery stuff:
- Simplify implementation of SegmentAnalyzer.
- Fix type names for realtime complex columns; this used to try to merge a nice type
name (like "hyperUnique") from mmapped segments with the word "COMPLEX" from incremental
index segments, leading to a merge failure. Now it always uses the nice name.
- Add hasMultipleValues to ColumnAnalysis.
- Add tests for both mmapped and incremental index segments.
- Update docs to include errorMessage.
2016-01-21 15:50:33 -08:00
Fangjin Yang
3f998117a6
Merge pull request #2306 from jon-wei/inherit2
...
More specific null/empty str handling in IndexMerger
2016-01-21 14:36:09 -08:00
Michael Schiff
1e44445f06
Adds support for empty merge metrics. fixes #2256
2016-01-21 13:21:37 -08:00
jon-wei
459a236067
More specific null/empty str handling in IndexMerger
2016-01-21 12:24:38 -08:00
Slim
201539260c
Merge pull request #2076 from b-slim/issue_2010_upper_lower_extractionFN
...
adding lower and upper extraction fn
2016-01-21 09:58:07 -06:00
Slim Bouguerra
78feb3a13e
adding lower and upper extraction fn
2016-01-21 08:59:05 -06:00
Gian Merlino
5a932d28c1
Merge pull request #2288 from tubemogul/index-merge-bug2
...
Null check in IncrementalIndexAdapter.getDimValueLookup()
2016-01-20 17:07:15 -08:00
Nishant
59ea186af7
fix reference counting for segments
2016-01-20 17:24:21 +05:30
Michael Schiff
50ceec78a2
null check in IncrementalIndexAdapter.getDimValueLookup()
2016-01-19 23:19:28 -08:00
jon-wei
bc1e9b27c8
Consolidate IndexMergerTest and IndexMergerV9Test
2016-01-19 16:28:35 -08:00
jon-wei
747343e621
Preserve dimension order across indexes during ingestion
2016-01-19 13:34:11 -08:00
Fangjin Yang
0c31f007fc
Merge pull request #1728 from himanshug/aggregators_in_segment_metadata
...
Store AggregatorFactory[] in segment metadata
2016-01-19 12:55:49 -08:00
Himanshu Gupta
a99aef29a1
adding aggregators to segment metadata
2016-01-19 14:23:39 -06:00
Himanshu Gupta
52eb0f04a7
adding a new method getMergingFactory(..) to AggregatorFactory
2016-01-18 22:03:46 -06:00
Himanshu Gupta
77fc86c015
making AggregatorFactory abstract class
2016-01-18 22:03:46 -06:00
Himanshu Gupta
164b0aad7a
removing Map<String,Object> segmentMetadata from methods in Index[Maker/Merger] and using Metadata class
...
instead of a Map to store segment metadata
2016-01-18 22:03:46 -06:00
zhxiaog
3459a202ce
fixed #1873 , add ability to express CONCAT as an extractionFn
2016-01-18 15:03:17 -08:00
Keuntae Park
238dd3be3c
support cascade execution of extraction filters in extraction dimension spec
2016-01-18 11:10:19 +09:00
Fangjin Yang
f6a1a4ae20
Merge pull request #2138 from KurtYoung/feature-build-v9
...
build v9 directly
2016-01-16 13:35:46 -06:00
Kurt Young
82ff98c2bf
add config for build v9 directly and update docs
2016-01-16 11:26:34 +08:00
Kurt Young
1f2168fae5
add IndexMergerV9
...
add unit tests for IndexMergerV9 and fix some bugs
add more unit tests and fix bugs
handle null values and add more tests
minor changes & use LoggingProgressIndicator in IndexGeneratorReducer
make some static class public from IndexMerger
minor changes and add some comments
changes for comments
2016-01-16 11:25:28 +08:00
Kurt Young
bb50d2a2b2
add some streaming writers
2016-01-16 11:25:26 +08:00
Fangjin Yang
e0932ba1c2
Merge pull request #2267 from himanshug/fix_topn_multi_val_filter
...
Remap id's returned in XXXFilteredDimensionSpec.getRow() as per reduced cardinality
2016-01-14 17:06:54 -08:00
Fangjin Yang
7704699b40
Merge pull request #2265 from navis/strlen-dimension-ignored
...
Strlen sort spec ignores dimension
2016-01-14 17:06:33 -08:00
Himanshu Gupta
ae6a111444
fix XXXFilteredDimensionSpec to remap the dictionary encodings as per new cardinality
2016-01-13 22:25:02 -06:00
binlijin
a3140b2548
fix topN filtering on multi-valued dimension bug
2016-01-13 22:25:02 -06:00
navis.ryu
ea9fabdf2f
Strlen sort spec ignores dimension
2016-01-14 11:05:44 +09:00
Fangjin Yang
4c014c1574
Merge pull request #2228 from metamx/incremental-index-mem2
...
Improve heap usage for IncrementalIndex
2016-01-13 14:48:03 -08:00
navis.ryu
18479bb757
time-descending result of timeseries queries
2016-01-13 12:23:01 +09:00
Fangjin Yang
d7ad93debc
Merge pull request #2221 from binlijin/topN_minTopNThreshold
...
Allow changing minTopNThreshold per topN query
2016-01-12 16:22:20 -08:00
Nishant
4863e2ca4f
cache metric selectors instead of creating new ones for every metric in each row
...
clear selectors on close.
Add comments about thread safety.
2016-01-13 00:45:23 +05:30
Nishant
dfe6abb721
Merge pull request #2250 from himanshug/agg_test_helper_fix
...
remove redundant registering of json modules in AggregationTestHelper
2016-01-12 11:42:00 +05:30
navis.ryu
976ebc45c0
Simplify information in IncrementalIndex
2016-01-12 10:18:11 +09:00
Himanshu Gupta
b973604bf8
remove redundant registering of json modules in AggregationTestHelper
2016-01-11 19:03:22 -06:00
Xavier Léauté
46a7f2660d
fix casing to be consistent with other classes
2016-01-08 10:19:06 -08:00
Fangjin Yang
d0b10c29d7
Merge pull request #2197 from metamx/clearIncIndexClose
...
Make OnHeapIncrementalIndex clean maps on close()
2016-01-07 15:43:47 -08:00
Gian Merlino
4ecd901a1a
Merge pull request #2219 from himanshug/identity_extraction_fn_singleton
...
make IdentityExtractionFn singleton
2016-01-07 10:08:28 -08:00
Fangjin Yang
aaea95ed1b
Merge pull request #2207 from himanshug/theta_sketch_select_query
...
fix bug for thetaSketch metric not working with select queries
2016-01-07 09:46:09 -08:00
binlijin
010c6e959c
add test
2016-01-07 18:01:46 +08:00
binlijin
a6bfcc5bfd
Allow changing minTopNThreshold per topN query
2016-01-07 14:51:00 +08:00
Fangjin Yang
4cc81d3eff
Merge pull request #2096 from b-slim/add_use_case_unapply
...
Add use case unapply
2016-01-06 21:58:12 -08:00
Himanshu Gupta
217079d0c7
make IdentityExtractionFn singleton
2016-01-06 22:29:07 -06:00
Himanshu
902f51433d
Merge pull request #2125 from mangeshpardeshiyahoo/master
...
Add extraction function support for Dimension Selector
2016-01-06 14:22:26 -06:00
Mangesh Pardeshi
75ee952197
Add extraction function support for dimension Selector
2016-01-06 13:47:07 -06:00
Slim Bouguerra
032d3bf6e6
Optimization of extraction filter by reversing the lookup
2016-01-06 11:16:11 -06:00
Himanshu Gupta
3f048f0b15
adding support to execute Select queries in AggregationTestHelper so that Select query based UTs can be written for complex aggregator implementations
2016-01-05 21:54:55 -06:00
Charles Allen
91fc32749b
Make OnHeapIncrementalIndex clean maps on close()
2016-01-04 11:18:16 -08:00
Himanshu Gupta
b47d807738
Add support for filtering at DimensionSpec level so that multivalued dimensions can be filtered correctly
...
also adding UTs for multi-valued dimensions
2015-12-30 17:59:47 -06:00
Himanshu Gupta
fa5c3bb014
adding decorate(DimensionSelector) to DimensionSpec to enable support for arbitrary filtering/transformations to returned dimension values
2015-12-30 15:06:24 -06:00
Nishant
b68265399c
Merge pull request #2168 from druid-io/remove-indexmaker
...
Remove IndexMaker
2015-12-30 12:24:29 +05:30
Fangjin Yang
e14ad74088
Merge pull request #1936 from b-slim/between_range_with_predicat
...
adding Upper/Lower Bound Filter
2015-12-29 10:11:22 -08:00
fjy
faf421726b
remove IndexMaker
2015-12-28 14:19:02 -08:00
Gian Merlino
83f4130b5f
SegmentMetadataQuery merging fixes.
...
- Fix merging when the INTERVALS analysisType is disabled, and add a test.
- Remove transformFn from CombiningSequence, use MappingSequence instead. transformFn did
not work for "accumulate" anyway, which made the tests wrong (the intervals should have
been condensed, but were not).
- Add analysisTypes to the Druids segmentMetadataQuery builder to make testing simpler.
2015-12-22 07:57:10 -08:00
Robin
dded4441d3
for completeness, add unit test for groupby/having with unrecognized type
2015-12-21 12:06:56 -06:00
Himanshu Gupta
e1631967e3
adding comments to explain merge failure in segmentMetadata query
2015-12-19 11:39:24 -06:00
Himanshu Gupta
7ecad1be24
Fix and UT for testing segment analysis merge
2015-12-19 00:24:02 -06:00
Fangjin Yang
7019d3c421
Merge pull request #2107 from jon-wei/fix_smq
...
More efficient SegmentMetadataQuery
2015-12-18 16:40:47 -08:00
Fangjin Yang
14229ba0f2
Merge pull request #1922 from metamx/jsonIgnoresFinalFields
...
Change DefaultObjectMapper to NOT overwrite final fields unless explicitly asked to
2015-12-18 15:38:32 -08:00
Fangjin Yang
71f554bf80
Merge pull request #2101 from himanshug/fix_extraction_dim_filter_cache_key
...
add extractionFn bytes to cache key in ExtractionDimFilter
2015-12-18 12:05:43 -08:00
Fangjin Yang
9e6874cc7e
Merge pull request #2084 from binlijin/master
...
minor optimization of IndexMerger's MMappedIndexRowIterable
2015-12-18 11:42:55 -08:00
Bingkun
cc21a5fac7
Merge pull request #1999 from himanshug/remove_min_max_aggs
...
remove min/max aggregator factory
2015-12-18 13:38:52 -06:00
jon-wei
356b07c6c3
More efficient SegmentMetadataQuery
2015-12-17 12:46:23 -08:00
Jonathan Wei
f8cf84f466
Merge pull request #1995 from himanshug/num_rows_seg_metadata_query
...
add numRows to segment metadata query response
2015-12-17 12:23:46 -08:00
Himanshu Gupta
82ea348003
add extractionFn bytes to cache key in ExtractionDimFilter
2015-12-16 14:00:38 -06:00
Himanshu
628643d80e
Merge pull request #2091 from rasahner/noDefaultForGroupbyHaving
...
take away default for groupBy/having
2015-12-16 01:07:40 -06:00
sahner
3441cf3110
take away default for groupBy/having
2015-12-15 10:32:45 -06:00
Fangjin Yang
e7f06cf61c
Merge pull request #2075 from jon-wei/regex_extract
...
Configurable value replacement on match failure for RegexExtractionFn
2015-12-14 19:10:50 -08:00
jon-wei
c88f75df7c
Configurable value replacement on match failure for RegexExtractionFn
2015-12-14 17:57:41 -08:00
binlijin
362bea1090
minor optimization of IndexMerger's MMappedIndexRowIterable
2015-12-11 15:04:46 +08:00
Xavier Léauté
d531e69d1a
Merge pull request #2079 from binlijin/master
...
reduce byte array copying to a minimum to optimize VSizeIndexedWriter
2015-12-10 21:30:09 -08:00
Slim Bouguerra
77afdf25e3
adding Bound Filter
2015-12-10 08:47:21 -06:00
Slim Bouguerra
ee1a39801a
adding bulk lookup and reverse lookup
2015-12-10 08:29:41 -06:00
binlijin
0eafbd55b2
reduce byte array copying to a minimum to optimize VSizeIndexedWriter
2015-12-10 16:34:39 +08:00
Fangjin Yang
f4ba13a1ac
Merge pull request #2029 from b-slim/add_reverse_fn
...
Adding reverse lookup function to LookupExtractor.
2015-12-09 12:50:13 -08:00
Xavier Léauté
9015a68c03
Merge pull request #2002 from navis/DRUID-2001
...
fixed #2001 GenericIndexed.fromIterable compares all values even when it's not sorted
2015-12-09 08:56:49 -08:00
Slim Bouguerra
85f339b687
introduction and implementation of reverse lookup function unApply.
2015-12-09 10:02:57 -06:00
Nishant
6c23d8edb4
Merge pull request #2043 from mangeshpardeshiyahoo/master
...
Add dimension selector support for groupby/having filters
2015-12-08 12:08:53 +05:30
Mangesh Pardeshi
d7ce120929
Add dimension selector support for groupby/having queries
2015-12-08 01:51:11 +00:00
Himanshu Gupta
431469e9c1
remove min/max aggregator factory which are replaced by double[min/max] aggregator factories
2015-12-05 22:36:49 -06:00
Himanshu Gupta
62ba9ade37
unifying license header in all java files
2015-12-05 22:16:23 -06:00
Gian Merlino
d21a640695
Merge pull request #2034 from b-slim/fix_cache_key
...
Fix getCacheKey for DimFilters
2015-12-04 09:13:06 -08:00
Slim Bouguerra
fb4ff3cf54
fix getCacheKey
2015-12-04 08:07:08 -06:00
Charles Allen
9d02f47201
Update IncrementalIndexTest copyright notice
2015-12-03 18:03:08 -08:00