1636 Commits

Author SHA1 Message Date
Gian Merlino
4ac9e03161 Fix predicate-based ValueMatcher behavior for IncrementalIndex on missing columns.
Missing columns should be treated the same as columns containing 100% nulls.
2016-03-25 10:23:59 -07:00
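A minimal sketch of the intent behind this fix, using simplified stand-ins rather than Druid's actual ValueMatcher and ValueMatcherFactory interfaces: when the referenced column does not exist, the matcher behaves as if every row held a null value, so the result is a constant determined by whether the predicate accepts null.

```java
import java.util.function.Predicate;

// Stand-in for Druid's ValueMatcher: reports whether the current row matches.
interface ValueMatcher {
    boolean matches();
}

class MissingColumnMatchers {
    // A missing column is treated exactly like a column that is 100% nulls:
    // the match result is constant and equals predicate.test(null).
    static ValueMatcher forMissingColumn(Predicate<String> predicate) {
        final boolean matchesNull = predicate.test(null);
        return () -> matchesNull;
    }
}
```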
Himanshu Gupta
e78a469fb7 UTs for ExtensionsConfig 2016-03-25 10:51:28 -05:00
Himanshu Gupta
004b00bb96 config to explicitly specify classpath for hadoop container during hadoop ingestion 2016-03-25 10:51:28 -05:00
Nishant
0b03c9405f Merge pull request #2614 from sirpkt/calendric_gran
Support week, month, quarter, and year in query granularity
2016-03-24 16:21:01 -07:00
Himanshu
56343c6cdc Merge pull request #2704 from navis/simple-optimize
optimize single-element and/or filters
2016-03-24 16:13:48 -05:00
Gian Merlino
713062053c Filters: Add filter.toFilter method, use that instead of the instanceof chain in Filters.
I believe that the instanceof chain in Filters exists because in the past, Filter
and DimFilter were in different packages (DimFilter was in druid-client and Filter
was in druid-processing). And since druid-client didn't depend on druid-processing,
DimFilter couldn't have a toFilter method. But now it can.
2016-03-23 17:03:49 -07:00
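A hedged sketch of the shape of this change; the classes below are simplified stand-ins, not Druid's exact types. Instead of a central instanceof chain mapping each DimFilter subtype to its Filter counterpart, each DimFilter converts itself.

```java
// Query-engine side filter.
interface Filter {}

// Client-side filter spec: now knows how to produce its engine counterpart.
interface DimFilter {
    Filter toFilter();
}

class SelectorDimFilter implements DimFilter {
    private final String dimension;
    private final String value;

    SelectorDimFilter(String dimension, String value) {
        this.dimension = dimension;
        this.value = value;
    }

    @Override
    public Filter toFilter() {
        // Each concrete DimFilter returns its engine counterpart directly,
        // so callers no longer need to enumerate every filter type.
        return new SelectorFilter(dimension, value);
    }
}

class SelectorFilter implements Filter {
    SelectorFilter(String dimension, String value) {}
}
```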
Gian Merlino
dd86198902 All Filters should work with FilteredAggregators.
This removes Filter.makeMatcher(ColumnSelectorFactory) and adds a
ValueMatcherFactory implementation to FilteredAggregatorFactory so it can
take advantage of existing makeMatcher(ValueMatcherFactory) implementations.

This patch also removes the Bound-based method from ValueMatcherFactory. Its
only user was the SpatialFilter, which could use the Predicate-based method.

Fixes #2604.
2016-03-23 12:24:01 -07:00
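A rough sketch of the interfaces named above, heavily simplified and not Druid's real signatures: because every Filter exposes makeMatcher(ValueMatcherFactory), a FilteredAggregatorFactory only needs to supply a ValueMatcherFactory implementation to work with any filter.

```java
import java.util.function.Predicate;

// Simplified stand-ins for the interfaces named in the commit message.
interface ValueMatcher { boolean matches(); }

interface ValueMatcherFactory {
    // Predicate-based factory method; the Bound-based variant was removed because
    // SpatialFilter, its only user, can be expressed with a Predicate instead.
    ValueMatcher makeValueMatcher(String dimension, Predicate<String> predicate);
}

interface Filter {
    // Every Filter supports this entry point, so every Filter can back a FilteredAggregator.
    ValueMatcher makeMatcher(ValueMatcherFactory factory);
}
```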
binlijin
57d78d3293 clean tmp file when index merge fails 2016-03-23 10:55:12 +08:00
navis.ryu
91f6be4884 optimize single-element and/or filters 2016-03-23 09:29:15 +09:00
Gian Merlino
ff25325f3b Improved docs for multi-value dimensions.
- Add central doc for multi-value dimensions, with some content from other docs.
- Link to multi-value dimension doc from topN and groupBy docs.
- Fixes a broken link from dimensionspecs.md, which was presciently already
  linking to this nonexistent doc.
- Resolve inconsistent naming in docs & code (sometimes "multi-valued", sometimes
  "multi-value") in favor of "multi-value".
2016-03-22 14:40:55 -07:00
jon-wei
a59c9ee1b1 Support use of DimensionSchema class in DimensionsSpec 2016-03-21 13:12:04 -07:00
Keuntae Park
7f29f2ac3b support week, month, quarter, and year in query granularity 2016-03-21 17:41:53 +09:00
Charles Allen
5da9a280b6 Query Time Lookup - Dynamic Configuration 2016-03-18 09:45:05 -07:00
Gian Merlino
738dcd8cd9 Update version to 0.9.1-SNAPSHOT.
Fixes #2462
2016-03-17 10:34:20 -07:00
Slim
cf342d8d3c Merge pull request #2517 from b-slim/adding_lookup_snapshot_utility
[QTL][Lookup] lookup module with the snapshot utility
2016-03-17 11:39:47 -05:00
Slim Bouguerra
0c86b29ef0 lookup module with the snapshot utility 2016-03-17 09:20:41 -05:00
Charles Allen
2ac8a22173 Merge pull request #2579 from metamx/closerIsCloser
Make CloserRule use guava's Closer
2016-03-14 17:18:19 -07:00
Charles Allen
a64979463f Make CloserRule use guava's Closer 2016-03-14 15:01:24 -07:00
Fangjin Yang
06813b510a Merge pull request #2571 from himanshug/gp_by_avoid_sort
avoid sorting during groupBy merging when possible
2016-03-14 14:46:51 -07:00
Fangjin Yang
dbdbacaa18 Merge pull request #2260 from navis/cardinality-for-searchquery
Support cardinality for search query
2016-03-14 13:24:40 -07:00
Slim
8cc3582e70 Merge pull request #2644 from metamx/optimize-timeboundary
optimize timeboundary for min or max bound
2016-03-13 13:16:24 -05:00
navis.ryu
be341bf4e3 Support cardinality for search query (Fix for #2260) 2016-03-12 09:51:01 +09:00
Xavier Léauté
6f0d6ef0e9 optimize timeboundary for min or max bound 2016-03-11 14:11:47 -08:00
Gian Merlino
8a11161b20 Plumbers: Move plumber.add out of try/catch for ParseException.
The incremental indexes handle that now so it's not necessary.

Also, add debug logging and more detailed exceptions to the incremental
indexes for the case where there are parse exceptions during aggregation.
2016-03-10 16:39:26 -08:00
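An illustrative, self-contained sketch (toy types, not Druid's plumber or firehose code) of where the try/catch now ends: only row parsing is guarded, while the add step sits outside it so aggregation-time problems surface with their own detail.

```java
import java.util.Iterator;
import java.util.List;

// Toy stand-ins, only to illustrate where the try/catch now ends.
class ParseException extends RuntimeException {}
class Row {}

class IngestLoopSketch {
    static void ingest(Iterator<String> firehose, List<Row> index) {
        while (firehose.hasNext()) {
            Row row;
            try {
                row = parse(firehose.next());   // parse errors are caught here...
            } catch (ParseException e) {
                continue;                       // ...and the unparseable row is skipped
            }
            // The "plumber.add" step is no longer inside the try/catch;
            // aggregation-time errors propagate with their own detail.
            index.add(row);
        }
    }

    static Row parse(String line) {
        if (line.isEmpty()) {
            throw new ParseException();
        }
        return new Row();
    }
}
```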
Himanshu Gupta
dc0214bddb use unsorted facts in IncrementalIndex during GroupBy merging wherever possible 2016-03-10 16:11:48 -06:00
Himanshu Gupta
02dfd5cd80 update IncrementalIndex to support unsorted facts map that can be used in groupBy merging to improve performance 2016-03-10 16:11:48 -06:00
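A sketch of the idea behind these two commits, assuming nothing about IncrementalIndex's real internals: when the facts only feed a groupBy merge, row order does not matter, so an unsorted concurrent map can replace the sorted one and avoid comparison overhead.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Illustrative only: choose the facts map based on whether sorted iteration is needed.
class FactsHolderSketch {
    static <K extends Comparable<K>, V> Map<K, V> makeFactsMap(boolean sortFacts) {
        return sortFacts
               ? new ConcurrentSkipListMap<K, V>()  // needed when rows must be iterated in order (e.g. persisting)
               : new ConcurrentHashMap<K, V>();     // cheaper inserts when order is irrelevant (groupBy merging)
    }
}
```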
Xavier Léauté
90d7409e1a Merge pull request #2611 from himanshug/gp_by_max_limit
only allow lowering maxResults and maxIntermediateRows from groupBy query context
2016-03-10 13:44:13 -08:00
Gian Merlino
a2b1652787 Clarify parser docs.
- Clarify what parseSpecs are used for.
- Avro, Protobuf should use timeAndDims parseSpecs.
- Hadoop jobs should use hadoopyString string parsers.
2016-03-10 08:45:04 -08:00
Fangjin Yang
68cffe1d91 Merge pull request #2615 from gianm/timeseries-skipEmptyBuckets-cache
Fix caching of skipEmptyBuckets for TimeseriesQuery.
2016-03-09 18:45:59 -08:00
Gian Merlino
708bc674fa Make specifying query context booleans more consistent.
Before, some needed to be strings and some needed to be real booleans. Now
they can all be either one.
2016-03-08 19:38:26 -08:00
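A minimal sketch of the behavior described in this commit; the helper below is illustrative, not Druid's actual context-parsing code. A context flag may arrive as either a JSON boolean or its string form and both are accepted.

```java
// Illustrative helper: accept either a real Boolean or a String like "true"/"false".
class ContextBooleans {
    static boolean parseBoolean(Object value, boolean defaultValue) {
        if (value == null) {
            return defaultValue;
        }
        if (value instanceof Boolean) {
            return (Boolean) value;
        }
        if (value instanceof String) {
            return Boolean.parseBoolean((String) value);
        }
        throw new IllegalArgumentException("Expected boolean or string, got: " + value.getClass());
    }
}
```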
Gian Merlino
40dad6dff4 Fix caching of skipEmptyBuckets for TimeseriesQuery. 2016-03-08 19:22:12 -08:00
Himanshu Gupta
ca5de3f583 only allow lowering maxResults and maxIntermediateRows from groupBy query context 2016-03-08 15:03:59 -06:00
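A sketch of the "only allow lowering" rule from this commit; the method and parameter names are illustrative. A context value can tighten the configured limit but never relax it.

```java
// Illustrative: the effective limit is the configured maximum, optionally lowered by the query context.
class GroupByLimits {
    static int effectiveLimit(int configuredMax, Integer contextMax) {
        return contextMax == null ? configuredMax : Math.min(configuredMax, contextMax);
    }
}
```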
Himanshu Gupta
099acb4966 allow the groupBy max[Intermediate]Rows limit to be overridden by context 2016-03-07 15:22:41 -06:00
Himanshu Gupta
c544ebf25e reintroducing the safety check removed in commit-1d602be so that dim value ids are less than cardinality 2016-03-03 23:34:23 -06:00
Bingkun Guo
4a58462fc7 update querySegmentSpec when passing query to getQueryRunner
After finding the FireChief for a specific partition, Druid needs to find the specific queryRunner for each segment being queried by passing the query to the FireChief. Currently Druid passes the original query, which covers all the segments that need to be queried, so fireChief.getQueryRunner(query) may return more than one queryRunner because query.getIntervals() is not specific to a single segment.

In this patch, for each segment being queried, Druid will update the query with its corresponding SpecificSegmentSpec.
2016-03-02 16:44:56 -06:00
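A simplified sketch of the approach described above, with stand-in types rather than Druid's Query, SegmentDescriptor, and SpecificSegmentSpec classes: the original query is re-scoped once per segment so each FireChief lookup resolves to exactly one queryRunner.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in types; in Druid the re-scoping would use a SpecificSegmentSpec.
class SegmentDescriptor {}

class Query {
    Query withQuerySegmentSpec(SegmentDescriptor segment) {
        // Stand-in: a real implementation would return a copy of this query
        // whose segment spec names only the given segment.
        return new Query();
    }
}

class PerSegmentQueries {
    static List<Query> scopePerSegment(Query original, List<SegmentDescriptor> segments) {
        List<Query> scoped = new ArrayList<>();
        for (SegmentDescriptor segment : segments) {
            scoped.add(original.withQuerySegmentSpec(segment));  // one re-scoped query (and queryRunner) per segment
        }
        return scoped;
    }
}
```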
Nishant
31b502773a Merge pull request #2480 from navis/pagingfail-over-segments
Select query cannot span to the next segment with paging
2016-03-01 11:42:41 +05:30
Fangjin Yang
e5c25725c0 Merge pull request #2562 from himanshug/fix_2556
with nested GroupBy queries, outer query results need to be further merged
2016-02-29 12:17:33 -08:00
Himanshu Gupta
0722ced413 with nested GroupBy queries, outer query results need to be further merged 2016-02-29 10:16:25 -06:00
navis.ryu
b1ff920831 Lazily initialize predicate for bound filter 2016-02-29 15:35:52 +09:00
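An illustrative lazy-initialization pattern along the lines of this commit (not the actual BoundFilter code): the potentially expensive predicate is built on first use and then cached for reuse.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

// Illustrative holder: builds the predicate lazily, with double-checked locking for thread safety.
class LazyPredicateHolder<T> {
    private final Supplier<Predicate<T>> factory;
    private volatile Predicate<T> predicate;

    LazyPredicateHolder(Supplier<Predicate<T>> factory) {
        this.factory = factory;
    }

    Predicate<T> get() {
        Predicate<T> p = predicate;
        if (p == null) {
            synchronized (this) {
                p = predicate;
                if (p == null) {
                    predicate = p = factory.get();  // built only on first use
                }
            }
        }
        return p;
    }
}
```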
navis.ryu
5f1e60324a Added more complex test case with versioned segments 2016-02-29 14:48:24 +09:00
navis.ryu
2686bfa394 Select query cannot span to the next segment with paging 2016-02-29 00:01:46 +09:00
Fangjin Yang
29d29ba98d Merge pull request #2263 from jon-wei/flex_dims3
Allow IncrementalIndex to store Long/Float dimensions
2016-02-25 17:23:02 -08:00
jon-wei
c17ce02467 Allow IncrementalIndex to store Long/Float dimensions 2016-02-24 13:51:57 -08:00
jon-wei
fd3782522c Rename 'replaceMissingValues...' parameters in RegexExtractionFn 2016-02-24 13:12:56 -08:00
Nishant
fb7eae34ed Merge pull request #2249 from metamx/workerExpanded
Use Worker instead of ZkWorker whenever possible
2016-02-24 13:23:22 +05:30
Charles Allen
ac13a5942a Use Worker instead of ZkWorker whenever possible
* Moves last run task state information to Worker
* Makes WorkerTaskRunner a TaskRunner which has interfaces to help with getting information about a Worker
2016-02-23 15:02:03 -08:00
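A hedged sketch of the direction described above; the fields and method names are illustrative, not the real Druid interfaces. Callers depend on plain Worker data exposed through a WorkerTaskRunner view instead of reaching for ZooKeeper-backed ZkWorker objects.

```java
import java.util.Collection;

// Illustrative Worker: plain metadata, no ZooKeeper dependency.
class Worker {
    private final String host;
    private final int capacity;

    Worker(String host, int capacity) {
        this.host = host;
        this.capacity = capacity;
    }

    String getHost() { return host; }
    int getCapacity() { return capacity; }
}

interface TaskRunner {}

// Illustrative WorkerTaskRunner: a TaskRunner that can describe its workers
// without exposing ZooKeeper details to callers.
interface WorkerTaskRunner extends TaskRunner {
    Collection<Worker> getWorkers();
}
```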
Gian Merlino
3534483433 Better handling of ParseExceptions.
Two changes:
- Allow IncrementalIndex to suppress ParseExceptions on "aggregate".
- Add "reportParseExceptions" option to realtime tuning configs. By default this is "false".

Behavior of the counters should now be:

- processed: Number of rows indexed, including rows where some fields could be parsed and some could not.
- thrownAway: Number of rows thrown away due to rejection policy.
- unparseable: Number of rows thrown away due to being completely unparseable (no fields salvageable at all).

If "reportParseExceptions" is true then "unparseable" will always be zero (because a parse error would
cause an exception to be thrown). In addition, "processed" will only include fully parseable rows
(because even partial parse failures will cause exceptions to be thrown).

Fixes #2510.
2016-02-23 10:11:43 -08:00
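A sketch of the counter semantics described above; the class and enum below are illustrative, not Druid's actual indexing code. With reportParseExceptions=false, partial rows are still indexed and fully unparseable rows are counted; with it set to true, both cases throw instead.

```java
// Illustrative counters matching the semantics listed in the commit message.
class IngestCounters {
    long processed;    // rows indexed, including rows where only some fields parsed
    long thrownAway;   // rows rejected by the rejection policy
    long unparseable;  // rows with no salvageable fields at all

    enum RowOutcome { PARSED, PARTIALLY_PARSED, UNPARSEABLE, REJECTED_BY_POLICY }

    void record(RowOutcome outcome, boolean reportParseExceptions) {
        switch (outcome) {
            case UNPARSEABLE:
                if (reportParseExceptions) {
                    throw new RuntimeException("parse error");   // reported instead of counted
                }
                unparseable++;
                break;
            case PARTIALLY_PARSED:
                if (reportParseExceptions) {
                    throw new RuntimeException("parse error");   // strict mode rejects partial rows too
                }
                processed++;                                     // lenient mode still indexes the row
                break;
            case REJECTED_BY_POLICY:
                thrownAway++;
                break;
            case PARSED:
                processed++;
                break;
        }
    }
}
```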
Fangjin Yang
3bdd757024 Merge pull request #1773 from b-slim/log_details
Adding downstream source when throwing QueryInterruptedException
2016-02-22 10:16:07 -08:00
Slim Bouguerra
77925cc061 adding downstream source of QueryInterruptedException 2016-02-20 13:05:14 -06:00
Fangjin Yang
8ee81947cd Merge pull request #2494 from himanshug/fix_timeseries
do not drop post-aggs in TimeseriesQueryToolChest.makePreComputeManipulatorFn
2016-02-20 10:37:32 -08:00