Commit Graph

1420 Commits

Author SHA1 Message Date
Charles Allen 2ac8a22173 Merge pull request #2579 from metamx/closerIsCloser
Make CloserRule use guava's Closer
2016-03-14 17:18:19 -07:00
Charles Allen a64979463f Make CloserRule use guava's Closer 2016-03-14 15:01:24 -07:00
Fangjin Yang 06813b510a Merge pull request #2571 from himanshug/gp_by_avoid_sort
avoid sort while doing groupBy merging when possible
2016-03-14 14:46:51 -07:00
Fangjin Yang dbdbacaa18 Merge pull request #2260 from navis/cardinality-for-searchquery
Support cardinality for search query
2016-03-14 13:24:40 -07:00
Slim 8cc3582e70 Merge pull request #2644 from metamx/optimize-timeboundary
optimize timeboundary for min or max bound
2016-03-13 13:16:24 -05:00
navis.ryu be341bf4e3 Support cardinality for search query (Fix for #2260) 2016-03-12 09:51:01 +09:00
Xavier Léauté 6f0d6ef0e9 optimize timeboundary for min or max bound 2016-03-11 14:11:47 -08:00
Gian Merlino 8a11161b20 Plumbers: Move plumber.add out of try/catch for ParseException.
The incremental indexes handle that now so it's not necessary.

Also, add debug logging and more detailed exceptions to the incremental
indexes for the case where there are parse exceptions during aggregation.
2016-03-10 16:39:26 -08:00
Himanshu Gupta dc0214bddb while GroupBy merging use unsorted facts in IncrementalIndex wherever possible 2016-03-10 16:11:48 -06:00
Himanshu Gupta 02dfd5cd80 update IncrementalIndex to support unsorted facts map that can be used in groupBy merging to improve performance 2016-03-10 16:11:48 -06:00
Xavier Léauté 90d7409e1a Merge pull request #2611 from himanshug/gp_by_max_limit
only allow lowering maxResults and maxIntermediateRows from groupBy query context
2016-03-10 13:44:13 -08:00
Gian Merlino a2b1652787 Clarify parser docs.
- Clarify what parseSpecs are used for.
- Avro, Protobuf should use timeAndDims parseSpecs.
- Hadoop jobs should use hadoopyString string parsers.
2016-03-10 08:45:04 -08:00
Fangjin Yang 68cffe1d91 Merge pull request #2615 from gianm/timeseries-skipEmptyBuckets-cache
Fix caching of skipEmptyBuckets for TimeseriesQuery.
2016-03-09 18:45:59 -08:00
Gian Merlino 708bc674fa Make specifying query context booleans more consistent.
Before, some needed to be strings and some needed to be real booleans. Now
they can all be either one.
2016-03-08 19:38:26 -08:00
Gian Merlino 40dad6dff4 Fix caching of skipEmptyBuckets for TimeseriesQuery. 2016-03-08 19:22:12 -08:00
Himanshu Gupta ca5de3f583 only allow lowering maxResults and maxIntermediateRows from groupBy query context 2016-03-08 15:03:59 -06:00
Himanshu Gupta 099acb4966 allow groupBy max[Intermediate]Rows limit be overridable by context 2016-03-07 15:22:41 -06:00
Himanshu Gupta c544ebf25e reintroducing the safety check removed in commit-1d602be so that dim value ids are less than cardinality 2016-03-03 23:34:23 -06:00
Bingkun Guo 4a58462fc7 update querySegmentSpec when passing query to getQueryRunner
After finding the FireChief for a specific partition, Druid needs to find the specific queryRunner for each segment being queried by passing the query to the FireChief. Currently Druid passes the original query, which contains all the segments that need to be queried, so it's possible that fireChief.getQueryRunner(query) returns more than 1 queryRunner because query.getIntervals() is not specific to a single segment.

In this patch, for each segment being queried, Druid will update the query with its corresponding SpecificSegmentSpec.
2016-03-02 16:44:56 -06:00
Nishant 31b502773a Merge pull request #2480 from navis/pagingfail-over-segments
Select query cannot span to next segment with paging
2016-03-01 11:42:41 +05:30
Fangjin Yang e5c25725c0 Merge pull request #2562 from himanshug/fix_2556
with nested GpBy query outer query results need to be further merged
2016-02-29 12:17:33 -08:00
Himanshu Gupta 0722ced413 with GpBy query outer query results need to be further merged 2016-02-29 10:16:25 -06:00
navis.ryu b1ff920831 Lazily initialize predicate for bound filter 2016-02-29 15:35:52 +09:00
navis.ryu 5f1e60324a Added more complex test case with versioned segments 2016-02-29 14:48:24 +09:00
navis.ryu 2686bfa394 Select query cannot span to next segment with paging 2016-02-29 00:01:46 +09:00
Fangjin Yang 29d29ba98d Merge pull request #2263 from jon-wei/flex_dims3
Allow IncrementalIndex to store Long/Float dimensions
2016-02-25 17:23:02 -08:00
jon-wei c17ce02467 Allow IncrementalIndex to store Long/Float dimensions 2016-02-24 13:51:57 -08:00
jon-wei fd3782522c Rename 'replaceMissingValues...' parameters in RegexExtractionFn 2016-02-24 13:12:56 -08:00
Nishant fb7eae34ed Merge pull request #2249 from metamx/workerExpanded
Use Worker instead of ZkWorker whenever possible
2016-02-24 13:23:22 +05:30
Charles Allen ac13a5942a Use Worker instead of ZkWorker whenever possible
* Moves last run task state information to Worker
* Makes WorkerTaskRunner a TaskRunner which has interfaces to help with getting information about a Worker
2016-02-23 15:02:03 -08:00
Gian Merlino 3534483433 Better handling of ParseExceptions.
Two changes:
- Allow IncrementalIndex to suppress ParseExceptions on "aggregate".
- Add "reportParseExceptions" option to realtime tuning configs. By default this is "false".

Behavior of the counters should now be:

- processed: Number of rows indexed, including rows where some fields could be parsed and some could not.
- thrownAway: Number of rows thrown away due to rejection policy.
- unparseable: Number of rows thrown away due to being completely unparseable (no fields salvageable at all).

If "reportParseExceptions" is true then "unparseable" will always be zero (because a parse error would
cause an exception to be thrown). In addition, "processed" will only include fully parseable rows
(because even partial parse failures will cause exceptions to be thrown).

Fixes #2510.
2016-02-23 10:11:43 -08:00
Fangjin Yang 3bdd757024 Merge pull request #1773 from b-slim/log_details
Adding downstream source when throwing QueryInterruptedException
2016-02-22 10:16:07 -08:00
Slim Bouguerra 77925cc061 adding downstream source of QueryInterruptedException 2016-02-20 13:05:14 -06:00
Fangjin Yang 8ee81947cd Merge pull request #2494 from himanshug/fix_timeseries
do not drop post-aggs in TimeseriesQueryToolChest.makePreComputeManipulatorFn
2016-02-20 10:37:32 -08:00
Gian Merlino d25c46cb9f Add comparator to HyperUniquesFinalizingPostAggregator.
This makes it possible to do groupBys with clauses like "HAVING uniques > 10".
Beforehand you couldn't do it with either an aggregator (because it returns
an HLLV1 which the havingSpec can't understand) or a finalized postaggregator
(because it didn't have a comparator).

Now you can at least do it with a finalizing postaggregator. Trying it with
the aggregator alone still doesn't work.

Added some topN and groupBy tests verifying the comparator, and added an
@Ignore test that should pass if havingSpecs are made to work on the aggregator
directly.
2016-02-19 08:36:08 -08:00
Himanshu Gupta 11b0117422 do not drop post-aggs in timeseries query tool chest makePreComputeManipulatorFn like other query types 2016-02-17 20:51:35 -06:00
Jaehong Choi 32b9d57b23 handle a failing UT in GroupByQueryRunnerTest after merging into the master 2016-02-16 16:56:57 +09:00
Jaehong Choi b25bca85bc Merge branch 'master' of https://github.com/druid-io/druid into support-alphanumeric-dimensional-sort-in-gropu-by 2016-02-16 16:42:05 +09:00
Jaehong Choi e89afc901b delete System.out.println() in test code 2016-02-16 15:26:37 +09:00
Navis Ryu cd315627c9 Merge pull request #2393 from CHOIJAEHONG1/support-alphanumeric-dimensional-sort-in-gropu-by
support alphanumeric sorting for dimensional columns in groupby (#2393)
2016-02-16 14:11:30 +09:00
Slim 16092eb5e2 Merge pull request #2464 from gianm/print-properties
Make startup properties logging optional.
2016-02-14 15:11:35 -06:00
Gian Merlino e0c049c0b0 Make startup properties logging optional.
Off by default, but enabled in the example config files. See also #2452.
2016-02-12 14:12:16 -08:00
Himanshu Gupta da5fcd0124 before facts get it, indexAndOffsets should already know about it 2016-02-12 13:32:06 -06:00
Jonathan Wei d63eec65a1 Merge pull request #2208 from navis/metadataquery-minmax
Support min/max values for metadata query
2016-02-11 17:28:07 -08:00
Jonathan Wei e1b022eac9 Merge pull request #2349 from navis/dimensionspec-for-selectquery
Support dimension spec for select query
2016-02-11 16:38:16 -08:00
navis.ryu dd2375477a Support min/max values for metadata query (#2208) 2016-02-12 09:35:58 +09:00
Gian Merlino 2d037ef05e Merge pull request #2453 from DreamLab/fix/topn_sorting_anomaly
Fix for unstable behavior of HyperLogLog comparator
2016-02-11 16:05:34 -08:00
navis.ryu 4d63196535 Support dimension spec for select query 2016-02-12 08:54:28 +09:00
Himanshu 47d48e1e67 Merge pull request #2452 from gianm/print-properties
PropertiesModule: Print properties, processors, totalMemory on startup.
2016-02-11 16:49:34 -06:00
turu f277a54a5c removed unsafe heuristics from hll compareTo and provided unit test for regression 2016-02-11 23:46:24 +01:00