Fangjin Yang
9cb197adec
Merge pull request #2722 from himanshug/fix_hadoop_jar_upload
...
config to explicitly specify classpath for hadoop container during hadoop ingestion
2016-03-28 14:49:03 -07:00
Charles Allen
4a98c4fbac
Fix LookupExtractionFn equals and hashCode
2016-03-28 13:14:43 -07:00
Charles Allen
0ee861d0da
Add ExtractionFn to LookupExtractor bridge
2016-03-28 13:14:43 -07:00
Fangjin Yang
7fe277e6da
Merge pull request #2727 from gianm/optimize-bound-filter
...
BoundFilter optimizations, and related interface changes.
2016-03-26 18:59:05 -07:00
Fangjin Yang
0dae28b6af
Merge pull request #2729 from jon-wei/fix_hyperunique_comparator
...
Fix HyperUniquesAggregatorFactory comparator
2016-03-26 15:39:35 -07:00
Gian Merlino
2970b49adc
BoundFilter optimizations, and related interface changes.
...
BoundFilter:
- For lexicographic bounds, use bitmapIndex.getIndex to find the start and end points,
then union all bitmaps between those points.
- For alphanumeric bounds, iterate through dimValues, and union all bitmaps for values
matching the predicate.
- Change behavior for nulls: it used to be that the BoundFilter would never match nulls,
now it matches nulls if "" is allowed by the lower limit and not excluded by the
upper limit.
Interface changes:
- BitmapIndex: add `int getIndex(value)` to make it possible to get the index for a
value without retrieving the bitmap.
- BitmapIndex: remove `ImmutableBitmap getBitmap(value)`, change callers to `getBitmap(getIndex(value))`.
- BitmapIndexSelector: allow retrieving the underlying BitmapIndex through getBitmapIndex.
- Clarified contract of indexOf in Indexed, GenericIndexed.
Also added tests for SelectorFilter, NotFilter, and BoundFilter.
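A minimal sketch of the lexicographic case described above, not the actual Druid code. It assumes a sorted value dictionary with one bitmap per value, uses java.util.BitSet as a stand-in for ImmutableBitmap, and assumes getIndex follows the Arrays.binarySearch convention (a negative result encodes the insertion point). The class name, matchRange, and bits are hypothetical, and the null handling mentioned above is left out.

```java
import java.util.Arrays;
import java.util.BitSet;

// Hypothetical stand-in for a bitmap index: a sorted dictionary of values
// plus one bitmap of row ids per dictionary entry.
class LexicographicBoundSketch {
    final String[] sortedValues;
    final BitSet[] bitmaps;

    LexicographicBoundSketch(String[] sortedValues, BitSet[] bitmaps) {
        this.sortedValues = sortedValues;
        this.bitmaps = bitmaps;
    }

    // Assumed contract: index of the value if present,
    // otherwise -(insertionPoint + 1), as in Arrays.binarySearch.
    int getIndex(String value) {
        return Arrays.binarySearch(sortedValues, value);
    }

    // Find the start and end positions in the dictionary, then union
    // every bitmap between them. A null bound means "unbounded".
    BitSet matchRange(String lower, boolean lowerStrict, String upper, boolean upperStrict) {
        int start = 0; // inclusive
        if (lower != null) {
            int found = getIndex(lower);
            if (found >= 0) {
                start = lowerStrict ? found + 1 : found;
            } else {
                start = -(found + 1);
            }
        }
        int end = sortedValues.length; // exclusive
        if (upper != null) {
            int found = getIndex(upper);
            if (found >= 0) {
                end = upperStrict ? found : found + 1;
            } else {
                end = -(found + 1);
            }
        }
        BitSet result = new BitSet();
        for (int i = start; i < end; i++) {
            result.or(bitmaps[i]);
        }
        return result;
    }

    // Small helper for building a bitmap from row ids.
    static BitSet bits(int... rows) {
        BitSet b = new BitSet();
        for (int r : rows) {
            b.set(r);
        }
        return b;
    }
}
```

The point of the sketch is that only two dictionary lookups are needed, after which no value in the range has to be compared against the bounds again.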
2016-03-25 14:11:48 -07:00
jon-wei
9afaa2b94a
Fix HyperUniquesAggregatorFactory comparator
2016-03-25 12:36:42 -07:00
Gian Merlino
4ac9e03161
Fix predicate-based ValueMatcher behavior for IncrementalIndex on missing columns.
...
Missing columns should be treated the same as columns containing 100% nulls.
2016-03-25 10:23:59 -07:00
Himanshu Gupta
e78a469fb7
UTs for ExtensionsConfig
2016-03-25 10:51:28 -05:00
Himanshu Gupta
004b00bb96
config to explicitly specify classpath for hadoop container during hadoop ingestion
2016-03-25 10:51:28 -05:00
Nishant
0b03c9405f
Merge pull request #2614 from sirpkt/calendric_gran
...
Support week, month, quarter, and year in query granularity
2016-03-24 16:21:01 -07:00
Himanshu
56343c6cdc
Merge pull request #2704 from navis/simple-optimize
...
optimize single elemented and/or filter
2016-03-24 16:13:48 -05:00
Gian Merlino
713062053c
Filters: Add filter.toFilter method, use that instead of the instanceof chain in Filters.
...
I believe that the instanceof chain in Filters exists because in the past, Filter
and DimFilter were in different packages (DimFilter was in druid-client and Filter
was in druid-processing). And since druid-client didn't depend on druid-processing,
DimFilter couldn't have a toFilter method. But now it can.
2016-03-23 17:03:49 -07:00
Gian Merlino
dd86198902
All Filters should work with FilteredAggregators.
...
This removes Filter.makeMatcher(ColumnSelectorFactory) and adds a
ValueMatcherFactory implementation to FilteredAggregatorFactory so it can
take advantage of existing makeMatcher(ValueMatcherFactory) implementations.
This patch also removes the Bound-based method from ValueMatcherFactory. Its
only user was the SpatialFilter, which could use the Predicate-based method.
Fixes #2604.
2016-03-23 12:24:01 -07:00
binlijin
57d78d3293
clean tmp file when index merge fail
2016-03-23 10:55:12 +08:00
navis.ryu
91f6be4884
optimize single elemented and/or filter
2016-03-23 09:29:15 +09:00
Gian Merlino
ff25325f3b
Improved docs for multi-value dimensions.
...
- Add central doc for multi-value dimensions, with some content from other docs.
- Link to multi-value dimension doc from topN and groupBy docs.
- Fixes a broken link from dimensionspecs.md, which was presciently already
linking to this nonexistent doc.
- Resolve inconsistent naming in docs & code (sometimes "multi-valued", sometimes
"multi-value") in favor of "multi-value".
2016-03-22 14:40:55 -07:00
jon-wei
a59c9ee1b1
Support use of DimensionSchema class in DimensionsSpec
2016-03-21 13:12:04 -07:00
Keuntae Park
7f29f2ac3b
support week, month, quarter, year in query granularity
2016-03-21 17:41:53 +09:00
Charles Allen
5da9a280b6
Query Time Lookup - Dynamic Configuration
2016-03-18 09:45:05 -07:00
Gian Merlino
738dcd8cd9
Update version to 0.9.1-SNAPSHOT.
...
Fixes #2462
2016-03-17 10:34:20 -07:00
Slim
cf342d8d3c
Merge pull request #2517 from b-slim/adding_lookup_snapshot_utility
...
[QTL][Lookup] lookup module with the snapshot utility
2016-03-17 11:39:47 -05:00
Slim Bouguerra
0c86b29ef0
lookup module with the snapshot utility
2016-03-17 09:20:41 -05:00
Charles Allen
2ac8a22173
Merge pull request #2579 from metamx/closerIsCloser
...
Make CloserRule use guava's Closer
2016-03-14 17:18:19 -07:00
Charles Allen
a64979463f
Make CloserRule use guava's Closer
2016-03-14 15:01:24 -07:00
Fangjin Yang
06813b510a
Merge pull request #2571 from himanshug/gp_by_avoid_sort
...
avoid sort while doing groupBy merging when possible
2016-03-14 14:46:51 -07:00
Fangjin Yang
dbdbacaa18
Merge pull request #2260 from navis/cardinality-for-searchquery
...
Support cardinality for search query
2016-03-14 13:24:40 -07:00
Slim
8cc3582e70
Merge pull request #2644 from metamx/optimize-timeboundary
...
optimize timeboundary for min or max bound
2016-03-13 13:16:24 -05:00
navis.ryu
be341bf4e3
Support cardinality for search query (Fix for #2260 )
2016-03-12 09:51:01 +09:00
Xavier Léauté
6f0d6ef0e9
optimize timeboundary for min or max bound
2016-03-11 14:11:47 -08:00
Gian Merlino
8a11161b20
Plumbers: Move plumber.add out of try/catch for ParseException.
...
The incremental indexes handle that now so it's not necessary.
Also, add debug logging and more detailed exceptions to the incremental
indexes for the case where there are parse exceptions during aggregation.
2016-03-10 16:39:26 -08:00
Himanshu Gupta
dc0214bddb
while GroupBy merging use unsorted facts in IncrementalIndex wherever possible
2016-03-10 16:11:48 -06:00
Himanshu Gupta
02dfd5cd80
update IncrementalIndex to support unsorted facts map that can be used in groupBy merging to improve performance
2016-03-10 16:11:48 -06:00
Xavier Léauté
90d7409e1a
Merge pull request #2611 from himanshug/gp_by_max_limit
...
only allow lowering maxResults and maxIntermediateRows from groupBy query context
2016-03-10 13:44:13 -08:00
Gian Merlino
a2b1652787
Clarify parser docs.
...
- Clarify what parseSpecs are used for.
- Avro, Protobuf should use timeAndDims parseSpecs.
- Hadoop jobs should use hadoopyString string parsers.
2016-03-10 08:45:04 -08:00
Fangjin Yang
68cffe1d91
Merge pull request #2615 from gianm/timeseries-skipEmptyBuckets-cache
...
Fix caching of skipEmptyBuckets for TimeseriesQuery.
2016-03-09 18:45:59 -08:00
Gian Merlino
708bc674fa
Make specifying query context booleans more consistent.
...
Before, some needed to be strings and some needed to be real booleans. Now
they can all be either one.
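One way to accept both forms can be sketched as follows. This is an illustration only, not the code from this commit: the class name, the parseBoolean helper, and the context keys used in the usage note are hypothetical.

```java
import java.util.Map;

class ContextBooleanSketch {
    // Reads a boolean flag from a query context map, accepting either a
    // real Boolean or a String such as "true" / "false".
    static boolean parseBoolean(Map<String, Object> context, String key, boolean defaultValue) {
        Object value = context == null ? null : context.get(key);
        if (value == null) {
            return defaultValue;
        }
        if (value instanceof Boolean) {
            return (Boolean) value;
        }
        if (value instanceof String) {
            return Boolean.parseBoolean((String) value);
        }
        throw new IllegalArgumentException(
            "Expected boolean or string for context key [" + key + "], got " + value.getClass().getName());
    }
}
```

With this helper, a context of {"flagA": "false", "flagB": true} yields false for flagA and true for flagB, and a missing key falls back to the supplied default.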
2016-03-08 19:38:26 -08:00
Gian Merlino
40dad6dff4
Fix caching of skipEmptyBuckets for TimeseriesQuery.
2016-03-08 19:22:12 -08:00
Himanshu Gupta
ca5de3f583
only allow lowering maxResults and maxIntermediateRows from groupBy query context
2016-03-08 15:03:59 -06:00
Himanshu Gupta
099acb4966
allow groupBy max[Intermediate]Rows limit be overridable by context
2016-03-07 15:22:41 -06:00
Himanshu Gupta
c544ebf25e
reintroducing the safety check removed in commit-1d602be so that dim value ids are less than cardinality
2016-03-03 23:34:23 -06:00
Bingkun Guo
4a58462fc7
update querySegmentSpec when passing query to getQueryRunner
...
After finding the FireChief for a specific partition, Druid needs to find the queryRunner for each segment being queried by passing the query to the FireChief. Currently Druid passes the original query, which covers all the segments that need to be queried, so fireChief.getQueryRunner(query) may return more than one queryRunner because query.getIntervals() is not specific to a single segment.
In this patch, for each segment being queried, Druid will update the query with its corresponding SpecificSegmentSpec.
2016-03-02 16:44:56 -06:00
Nishant
31b502773a
Merge pull request #2480 from navis/pagingfail-over-segments
...
Select query cannot span to next segment with paging
2016-03-01 11:42:41 +05:30
Fangjin Yang
e5c25725c0
Merge pull request #2562 from himanshug/fix_2556
...
with nested GpBy query outer query results need to be further merged
2016-02-29 12:17:33 -08:00
Himanshu Gupta
0722ced413
with GpBy query outer query results need to be further merged
2016-02-29 10:16:25 -06:00
navis.ryu
b1ff920831
Lazily initialize predicate for bound filter
2016-02-29 15:35:52 +09:00
navis.ryu
5f1e60324a
Added more complex test case with versioned segments
2016-02-29 14:48:24 +09:00
navis.ryu
2686bfa394
Select query cannot span to next segment with paging
2016-02-29 00:01:46 +09:00
Fangjin Yang
29d29ba98d
Merge pull request #2263 from jon-wei/flex_dims3
...
Allow IncrementalIndex to store Long/Float dimensions
2016-02-25 17:23:02 -08:00
jon-wei
c17ce02467
Allow IncrementalIndex to store Long/Float dimensions
2016-02-24 13:51:57 -08:00
jon-wei
fd3782522c
Rename 'replaceMissingValues...' parameters in RegexExtractionFn
2016-02-24 13:12:56 -08:00
Nishant
fb7eae34ed
Merge pull request #2249 from metamx/workerExpanded
...
Use Worker instead of ZkWorker whenever possible
2016-02-24 13:23:22 +05:30
Charles Allen
ac13a5942a
Use Worker instead of ZkWorker whenever possible
...
* Moves last run task state information to Worker
* Makes WorkerTaskRunner a TaskRunner which has interfaces to help with getting information about a Worker
2016-02-23 15:02:03 -08:00
Gian Merlino
3534483433
Better handling of ParseExceptions.
...
Two changes:
- Allow IncrementalIndex to suppress ParseExceptions on "aggregate".
- Add "reportParseExceptions" option to realtime tuning configs. By default this is "false".
Behavior of the counters should now be:
- processed: Number of rows indexed, including rows where some fields could be parsed and some could not.
- thrownAway: Number of rows thrown away due to rejection policy.
- unparseable: Number of rows thrown away due to being completely unparseable (no fields salvageable at all).
If "reportParseExceptions" is true then "unparseable" will always be zero (because a parse error would
cause an exception to be thrown). In addition, "processed" will only include fully parseable rows
(because even partial parse failures will cause exceptions to be thrown).
Fixes #2510 .
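The counter behavior described above can be sketched roughly as below. This is a simplified model under stated assumptions, not the realtime indexing code: the Outcome enum, the class name, and the record method are hypothetical, and IllegalStateException stands in for Druid's ParseException.

```java
class ParseCounterSketch {
    // Hypothetical classification of what happened to one input row.
    enum Outcome { FULLY_PARSED, PARTIALLY_PARSED, UNPARSEABLE, REJECTED }

    final boolean reportParseExceptions;
    long processed;
    long thrownAway;
    long unparseable;

    ParseCounterSketch(boolean reportParseExceptions) {
        this.reportParseExceptions = reportParseExceptions;
    }

    void record(Outcome outcome) {
        switch (outcome) {
            case REJECTED:
                // Dropped by the rejection policy.
                thrownAway++;
                break;
            case UNPARSEABLE:
                // No fields salvageable: counted, unless errors are reported.
                if (reportParseExceptions) {
                    throw new IllegalStateException("unparseable row");
                }
                unparseable++;
                break;
            case PARTIALLY_PARSED:
                // Some fields parsed: still indexed, unless errors are reported.
                if (reportParseExceptions) {
                    throw new IllegalStateException("partially unparseable row");
                }
                processed++;
                break;
            case FULLY_PARSED:
                processed++;
                break;
        }
    }
}
```

In this model, with reportParseExceptions set to true the unparseable counter can never increase and processed only counts fully parseable rows, which matches the description above.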
2016-02-23 10:11:43 -08:00
Fangjin Yang
3bdd757024
Merge pull request #1773 from b-slim/log_details
...
Adding downstream source when throwing QueryInterruptedException
2016-02-22 10:16:07 -08:00
Slim Bouguerra
77925cc061
adding downstream source of QueryInterruptedException
2016-02-20 13:05:14 -06:00
Fangjin Yang
8ee81947cd
Merge pull request #2494 from himanshug/fix_timeseries
...
do not drop post-aggs in TimeseriesQueryToolChest.makePreComputeManipulatorFn
2016-02-20 10:37:32 -08:00
Gian Merlino
d25c46cb9f
Add comparator to HyperUniquesFinalizingPostAggregator.
...
This makes it possible to do groupBys with clauses like "HAVING uniques > 10".
Beforehand you couldn't do it with either an aggregator (because it returns
an HLLV1 which the havingSpec can't understand) or a finalized postaggregator
(because it didn't have a comparator).
Now you can at least do it with a finalizing postaggregator. Trying it with
the aggregator alone still doesn't work.
Added some topN and groupBy tests verifying the comparator, and added an
@Ignore test that should pass if havingSpecs are made to work on the aggregator
directly.
2016-02-19 08:36:08 -08:00
Himanshu Gupta
11b0117422
do not drop post-aggs in timeseries query tool chest makePreComputeManipulatorFn like other query types
2016-02-17 20:51:35 -06:00
Jaehong Choi
32b9d57b23
handle a failing UT in GroupByQueryRunnerTest after merging into the master
2016-02-16 16:56:57 +09:00
Jaehong Choi
b25bca85bc
Merge branch 'master' of https://github.com/druid-io/druid into support-alphanumeric-dimensional-sort-in-gropu-by
2016-02-16 16:42:05 +09:00
Jaehong Choi
e89afc901b
delete System.out.println() in test code
2016-02-16 15:26:37 +09:00
Navis Ryu
cd315627c9
Merge pull request #2393 from CHOIJAEHONG1/support-alphanumeric-dimensional-sort-in-gropu-by
...
support alphanumeric sorting for dimensional columns in groupby (#2393 )
2016-02-16 14:11:30 +09:00
Slim
16092eb5e2
Merge pull request #2464 from gianm/print-properties
...
Make startup properties logging optional.
2016-02-14 15:11:35 -06:00
Gian Merlino
e0c049c0b0
Make startup properties logging optional.
...
Off by default, but enabled in the example config files. See also #2452.
2016-02-12 14:12:16 -08:00
Himanshu Gupta
da5fcd0124
before facts get it, indexAndOffsets should already know about it
2016-02-12 13:32:06 -06:00
Jonathan Wei
d63eec65a1
Merge pull request #2208 from navis/metadataquery-minmax
...
Support min/max values for metadata query
2016-02-11 17:28:07 -08:00
Jonathan Wei
e1b022eac9
Merge pull request #2349 from navis/dimensionspec-for-selectquery
...
Support dimension spec for select query
2016-02-11 16:38:16 -08:00
navis.ryu
dd2375477a
Support min/max values for metadata query ( #2208 )
2016-02-12 09:35:58 +09:00
Gian Merlino
2d037ef05e
Merge pull request #2453 from DreamLab/fix/topn_sorting_anomaly
...
Fix for unstable behavior of HyperLogLog comparator
2016-02-11 16:05:34 -08:00
navis.ryu
4d63196535
Support dimension spec for select query
2016-02-12 08:54:28 +09:00
Himanshu
47d48e1e67
Merge pull request #2452 from gianm/print-properties
...
PropertiesModule: Print properties, processors, totalMemory on startup.
2016-02-11 16:49:34 -06:00
turu
f277a54a5c
removed unsafe heuristics from hll compareTo and provided unit test for regression
2016-02-11 23:46:24 +01:00
Slim
368988d187
Merge pull request #2291 from druid-io/lookupManager
...
Promoting LookupExtractor state and LookupExtractorFactory to be a first class druid state object.
2016-02-11 16:07:27 -06:00
Gian Merlino
29f7758e74
PropertiesModule: Print properties, processors, totalMemory on startup.
2016-02-11 13:51:08 -08:00
Slim Bouguerra
4e119b7a24
Adding lookup ref manager and lookup dimension spec impl
2016-02-11 12:11:51 -06:00
Jaehong Choi
2f2e2ff5b9
support alphanumeric sorting for dimensional columns in groupby
2016-02-11 17:31:28 +09:00
Keuntae Park
05a144e39a
fix crash with filtered aggregator at ingestion time
...
- only for selector filter because extraction filter is not supported as
cardinality is not fixed at ingestion time
2016-02-11 11:25:33 +09:00
Fangjin Yang
b1673ee90e
Merge pull request #2409 from gianm/smq-merged-thing
...
SegmentMetadataQuery: Retain segment id when merging, if possible.
2016-02-08 15:43:39 -08:00
Fangjin Yang
c9c20bb7f3
Merge pull request #2395 from metamx/fixExtractionDimFilterNullTest
...
Actually check cache key null checking in ExtractionDimFilterTest
2016-02-08 14:10:52 -08:00
Gian Merlino
bd9c04244f
SegmentMetadataQuery: Retain segment id when merging, if possible.
...
This is helpful on realtime nodes, where two analyses from two different hydrants
are merged together but they are actually from the same segment.
2016-02-08 13:07:02 -08:00
Himanshu Gupta
9fe1b28ee5
provide configuration to enable usage of Off heap merging for groupBy query
2016-02-05 14:18:06 -06:00
Himanshu Gupta
b40c342cd1
make Global stupid pool cache size configurable
2016-02-05 14:18:06 -06:00
Himanshu Gupta
72a1e730a2
OffheapIncrementalIndex updates to do the aggregation merging off-heap
2016-02-05 14:17:05 -06:00
Himanshu Gupta
907dd77483
OffheapIncrementalIndex a copy/paste of OnheapIncrementalIndex
2016-02-05 14:02:31 -06:00
Charles Allen
aac5f9b2c9
Actually check cache key null checking in ExtractionDimFilterTest
2016-02-04 09:44:13 -08:00
fjy
1aa363cea7
new quickstart
2016-02-04 09:37:38 -08:00
Fangjin Yang
da77591129
Merge pull request #2392 from metamx/fix2391
...
Allow ExtractionDimFilter value to be null
2016-02-03 17:47:14 -08:00
Charles Allen
d4f00096ff
Allow ExtractionDimFilter value to be null
...
* Fixes #2391
2016-02-03 15:51:47 -08:00
Himanshu Gupta
6e7d90cf56
UTs for DefaultLimitSpec
2016-02-03 15:59:12 -06:00
Himanshu Gupta
29e0d7f971
lazily create comparators for row columns when needed
2016-02-03 13:38:20 -06:00
navis.ryu
1d602be0f9
Replace string[] with int[] for dimensions
2016-02-03 15:03:22 +09:00
binlijin
a5ef30ff84
optimize topn on particular situation
2016-02-02 14:20:09 +08:00
Himanshu
93c50d8538
Merge pull request #2094 from navis/simplify-index-merge
...
Simplifying dimension merging
2016-01-29 11:23:14 -06:00
navis.ryu
55a888ea2f
time-descending result of select queries
2016-01-29 10:06:05 +09:00
navis.ryu
dd774ef4dd
one-pass merging of dictionary & index
2016-01-29 10:03:53 +09:00
Himanshu
edd7ce58aa
Merge pull request #2348 from AlexanderSaydakov/fix-aggregator-test-helper
...
fixed createIndex
2016-01-28 16:01:36 -06:00
saydakov
e0860661b1
fixed createIndex
2016-01-28 13:20:50 -08:00
Nishant
99017f4518
Merge pull request #2326 from navis/use-reverse-iterator
...
use reverse-iterator if possible
2016-01-28 19:48:38 +05:30
Nishant
3880f54b87
Merge pull request #2332 from himanshug/configurable_partial
...
make populateUncoveredIntervals a configuration in query context
2016-01-28 10:34:35 +05:30