Commit Graph

123 Commits

Author SHA1 Message Date
Jonathan Wei a6105cbb86 Add numeric StringComparator (#3270)
* Add numeric StringComparator

* Only use direct long comparison for numeric ordering in BoundFilter, add time filtering benchmark query

* Address PR comments, add multithreaded BoundDimFilter test

* Add comment on strlen tie handling

* Add timeseries interval filter benchmark

* Adjust docs

* Use jackson for StringComparator, address PR comments

* Add new TopNMetricSpec and SearchSortSpec with tests (WIP)

* More TopNMetricSpec and SearchSortSpec tests

* Fix NewSearchSortSpec serde

* Update docs for new DimensionTopNMetricSpec

* Delete NumericDimensionTopNMetricSpec

* Delete old SearchSortSpec

* Rename NewSearchSortSpec to SearchSortSpec

* Add TopN numeric comparator benchmark, address PR comments

* Refactor OrderByColumnSpec

* Add null checks to NumericComparator and String->BigDecimal conversion function

* Add more OrderByColumnSpec serde tests
2016-07-29 15:44:16 -07:00
Gian Merlino 2553997200 Associate groupBy v2 resources with the Sequence lifecycle. (#3296)
This fixes a potential issue where groupBy resources could be allocated to
create a Sequence, but then the Sequence is never used, and thus the resources
are never freed.

Also simplifies how groupBy handles config overrides (this made the new
unit test easier to write).
2016-07-27 18:44:19 -07:00
kaijianding 3dc2974894 Add timestampSpec to metadata.drd and SegmentMetadataQuery (#3227)
* save TimestampSpec in metadata.drd

* add timestampSpec info in SegmentMetadataQuery
2016-07-25 15:45:30 -07:00
Jonathan Wei a42ccb6d19 Support filtering on long columns (including __time) (#3180)
* Support filtering on __time column

* Rename DruidPredicate

* Add docs for ValueMatcherFactory, add comment on getColumnCapabilities

* Combine ValueMatcherFactory predicate methods to accept DruidCompositePredicate

* Address PR comments (support filter on all long columns)

* Use predicate factory instead of composite predicate

* Address PR comments

* Lazily initialize long handling in selector/in filter

* Move long value parsing from InFilter to InDimFilter, make long value parsing thread-safe

* Add multithreaded selector/in filter test

* Fix non-final lock object in SelectorDimFilter
2016-07-20 17:08:49 -07:00
Gian Merlino 3ab4a4efbc Fix formatting in granularities doc. (#3229) 2016-07-08 09:29:58 -07:00
Gian Merlino fdc7e88a7d Allow queries with no aggregators. (#3216)
This is actually reasonable for a groupBy or lexicographic topNs that is
being used to do a "COUNT DISTINCT" kind of query. No aggregators are
needed for that query, and including a dummy aggregator wastes 8 bytes
per row.

It's kind of silly for timeseries, but why not.
2016-07-06 20:38:54 +05:30
jaehong choi efbcbf5315 Support alphanumeric sort in search query (#2593)
* support alphanumeric sort in search query

* address a comment about handling equals() and hashCode()

* address comments

* add Ut for string comparators

* address a comment about space indentations.
2016-06-28 15:06:18 -07:00
Gian Merlino 4cc39b2ee7 Alternative groupBy strategy. (#2998)
This patch introduces a GroupByStrategy concept and two strategies: "v1"
is the current groupBy strategy and "v2" is a new one. It also introduces
a merge buffers concept in DruidProcessingModule, to try to better
manage memory used for merging.

Both of these are described in more detail in #2987.

There are two goals of this patch:

1. Make it possible for historical/realtime nodes to return larger groupBy
   result sets, faster, with better memory management.
2. Make it possible for brokers to merge streams when there are no order-by
   columns, avoiding materialization.

This patch does not do anything to help with memory management on the broker
when there are order-by columns or when there are nested queries. That could
potentially be done in a future patch.
2016-06-24 18:06:09 -07:00
Dave Li 12be1c0a4b Add bucket extraction function (#3033)
* add bucket extraction function

* add doc and header

* updated doc and test
2016-06-17 09:24:27 -07:00
linbo.jin 8c76fe7b97 docs: change OR to AND inside query docs about multi-value dims (#3162)
* docs: replace OR by AND inside topnquery docs about multi value dimensions

* docs: replace OR by AND inside groupby docs about multi value dimensions
2016-06-17 08:54:18 -07:00
Kirill Kozlov 9f93be448e Fix logical operator in example (#3093) 2016-06-07 10:44:18 -07:00
Gian Merlino 99ee3f4dc3 Fixups, clarifications to lookup docs. (#3060) 2016-06-07 10:43:35 -07:00
Charles Allen fa41a6466a Cleanup the base lookup cluster wide config docs (#3061)
* Cleanup the base lookup cluster wide config docs

* Add better examples in lookups-cached-global.md

* Use actual valid stock lookups

* Fixed maps with :

* Add mix of lookups

* Better examples in extension

* Remove unneeded namespace requirement

* Add extra line space

* Add link to lookup tiers

* Renamed header
2016-06-07 10:42:41 -07:00
Charles Allen 8cac710546 Async lookups-cached-global by default (#3074)
* Async lookups-cached-global by default
* Also better lookup docs

* Fix test timeouts

* Fix timing of deserialized test

* Fix problem with 0 wait failing immediately
2016-06-03 15:58:10 -05:00
Gian Merlino 603fbbcc20 Fix docs for "contains" search spec. (#3066) 2016-06-02 19:03:40 -07:00
Vadim Ogievetsky 13c267bfee Added new line for site formatting (#3059) 2016-06-02 11:36:45 -07:00
Vadim Ogievetsky 767190d5db Clear up confusing wording (#3052)
There is no such thing as a "Java aggregator" in Druid from a user's point of view, there are just native aggregator that happen to be implemented in Java.
2016-06-01 15:41:50 -07:00
Charles Allen eaaad01de7 [QTL] Datasource as lookupTier (#2955)
* Datasource as lookup tier
* Adds an option to let indexing service tasks pull their lookup tier from the datasource they are working for.

* Fix bad docs for lookups lookupTier

* Add Datasource name holder

* Move task and datasource to be pulled from Task file

* Make LookupModule pull from bound dataSource

* Fix test

* Fix code style on imports

* Fix formatting

* Make naming better

* Address code comments about naming
2016-05-17 15:44:42 -07:00
Parag Jain e3ea842cd3 add available query granularity strings (#2960) 2016-05-12 18:49:31 -07:00
Himanshu 8e2742b7e8 adding QueryGranularity to segment metadata and optionally expose same from segmentMetadata query (#2873) 2016-05-03 11:31:10 -07:00
Slim 55785267e4 postAgg filedName must match name of AGG (#2874) 2016-04-22 11:11:54 -07:00
Himanshu 3cfd9c64c9 make singleThreaded groupBy query config overridable at query time (#2828)
* make isSingleThreaded groupBy query processing overridable at query time

* refactor code in GroupByMergedQueryRunner to make processing of single threaded and parallel merging of runners consistent
2016-04-21 17:12:58 -07:00
Slim 984a518c9f Merge pull request #2734 from b-slim/LookupIntrospection2
[QTL][Lookup] adding introspection endpoint
2016-04-21 12:15:57 -05:00
Gian Merlino e320d13385 Fix various broken links in the docs. (#2833) 2016-04-13 13:30:01 -07:00
Charles Allen 2b99f717e4 Move lookup config doc to proper location 2016-04-08 08:15:38 -07:00
Charles Allen f915a59138 Merge pull request #2691 from metamx/lookupExtrFn
Add ExtractionFn to LookupExtractor bridge
2016-04-06 09:13:08 -07:00
jon-wei 0e481d6f93 Allow filters to use extraction functions 2016-04-05 13:24:56 -07:00
Fangjin Yang eea7a47870 Merge pull request #2576 from navis/paging-from-next
Add option for select query to get next page without modifying returned paging identifiers
2016-04-01 13:50:36 -07:00
navis.ryu 077522a46f stringFormat extractionFn should be able to return null on null values (Fix for #2706) 2016-04-01 13:40:56 +09:00
navis.ryu 29bb00535b Add option for select query to get next page without modifying returned paging identifiers 2016-04-01 09:03:03 +09:00
Gian Merlino 1853f36e9f More consistent empty-set filtering behavior on multi-value columns.
The behavior is now that filters on "null" will match rows with no
values. The behavior in the past was inconsistent; sometimes these
filters would match and sometimes they wouldn't.

Adds tests for this behavior to SelectorFilterTest and
BoundFilterTest, for query-level filters and filtered aggregates.

Fixes #2750.
2016-03-29 15:32:13 -07:00
Charles Allen 4764e86409 Add docs for RegisteredDimensionExtractionFn 2016-03-28 13:27:49 -07:00
Charles Allen ab324e4ac0 Move lookup docs that are in druid-proper back into lookups.md 2016-03-25 10:46:50 -07:00
Fangjin Yang a5d5529749 Merge pull request #2711 from gianm/filtered-aggregator-impls
All Filters should work with FilteredAggregators.
2016-03-23 13:37:21 -07:00
Gian Merlino dd86198902 All Filters should work with FilteredAggregators.
This removes Filter.makeMatcher(ColumnSelectorFactory) and adds a
ValueMatcherFactory implementation to FilteredAggregatorFactory so it can
take advantage of existing makeMatcher(ValueMatcherFactory) implementations.

This patch also removes the Bound-based method from ValueMatcherFactory. Its
only user was the SpatialFilter, which could use the Predicate-based method.

Fixes #2604.
2016-03-23 12:24:01 -07:00
Gian Merlino 2dfd3877c0 Fix a bunch of broken links in the docs. 2016-03-23 10:21:28 -07:00
Fangjin Yang d1f8f2b2fd Merge pull request #2698 from druid-io/fix-ext-docs
refactor extensions into their own docs
2016-03-22 22:04:12 -07:00
fjy 943cbe6e76 refactor extensions into their own docs 2016-03-22 18:54:10 -07:00
Fangjin Yang 041350c31b Merge pull request #2701 from gianm/mvd-docs
Improved docs for multi-value dimensions.
2016-03-22 18:09:37 -07:00
Gian Merlino 451c0bc6d8 Merge pull request #2702 from pjain1/improve_docs
how to query in the querying section, correct default for select strategy, formatting
2016-03-22 16:40:35 -07:00
Parag Jain 39ecb9929d how to query, correct default for select strategy, formatting 2016-03-22 17:06:15 -05:00
Gian Merlino ff25325f3b Improved docs for multi-value dimensions.
- Add central doc for multi-value dimensions, with some content from other docs.
- Link to multi-value dimension doc from topN and groupBy docs.
- Fixes a broken link from dimensionspecs.md, which was presciently already
  linking to this nonexistent doc.
- Resolve inconsistent naming in docs & code (sometimes "multi-valued", sometimes
  "multi-value") in favor of "multi-value".
2016-03-22 14:40:55 -07:00
Nishant 11b8d1ed70 Merge pull request #2686 from gianm/fix-analysistypes-docs
Fix analysisTypes docs for SegmentMetadataQuery.
2016-03-18 16:15:38 -07:00
Gian Merlino 76ae30604e Fix analysisTypes docs for SegmentMetadataQuery. 2016-03-18 13:17:33 -07:00
Charles Allen 5da9a280b6 Query Time Lookup - Dynamic Configuration 2016-03-18 09:45:05 -07:00
Slim cf342d8d3c Merge pull request #2517 from b-slim/adding_lookup_snapshot_utility
[QTL][Lookup] lookup module with the snapshot utility
2016-03-17 11:39:47 -05:00
Slim Bouguerra 0c86b29ef0 lookup module with the snapshot utility 2016-03-17 09:20:41 -05:00
Fangjin Yang 8cea85816d Merge pull request #2668 from navis/fix-document-selectquery
Document for search query was not updated properly (Fix for #2662)
2016-03-15 20:34:27 -07:00
navis.ryu 71ee9e2aac Document for search query is not updated properly (Fix for #2662) 2016-03-16 09:22:26 +09:00
dclim 553b677971 caching doc fix 2016-03-15 17:09:33 -06:00