Commit Graph

1209 Commits

Author SHA1 Message Date
Gian Merlino fffa9c8265 Fix flattenSpec docs, "nested" should be "path". (#2924) 2016-05-05 08:59:41 -07:00
David Lim b489f63698 Supervisor for KafkaIndexTask (#2656)
* supervisor for kafka indexing tasks

* cr changes
2016-05-04 23:13:13 -07:00
Charles Allen 44e52acfc0 Link up metrics configuration to what they mean (#2921) 2016-05-04 10:30:02 -07:00
Himanshu 8e2742b7e8 adding QueryGranularity to segment metadata and optionally expose same from segmentMetadata query (#2873) 2016-05-03 11:31:10 -07:00
Navis Ryu 45a3a26ef7 Add more math functions (#2822)
* Add more math functions

* added function list
2016-05-03 10:55:13 -07:00
Gian Merlino e680665f1c Fix Avro parseSpec example, "type" should be "format". (#2918) 2016-05-03 09:22:43 -07:00
Himanshu 6c5bf91f9a publish metrics numJettyConns to see how number of active jetty connections change over time (#2839)
this can be compared with numer of active queries to see if requests are waiting in jetty queue
2016-05-02 14:08:25 -07:00
Charles Allen 6b957aa072 [QTL] Make URI Exctraction Namespace take more sane arguments (#2738)
* Make URI Exctraction Namespace take more sane arguments
* Fixes https://github.com/druid-io/druid/issues/2669

* Update docs

* Rename error message

* Undo overzealous deletion of docs

* Explain caching mechanism a bit more in docs
2016-05-02 12:54:34 -07:00
Charles Allen 54b717bdc3 [QTL] Move kafka-extraction-namespace to the Lookup framework. (#2800)
* Move kafka-extraction-namespace to the Lookup framework.

* Address comments

* Fix missing kafka introspection

* Fix tests to be less racy

* Make testing a bit more leniant

* Make tests even more forgiving

* Add comments to kafka lookup cache method

* Move startStopLock to just use started

* Make start() and stop() idempotent

* Forgot to update test after last change, test now accounts for idempotency

* Add extra idempotency on stop check

* Add more descriptive docs of behavior
2016-05-02 09:45:13 -07:00
michaelschiff 2203a812bc statsd-emitter (#2410) 2016-04-28 18:41:02 -07:00
David Lim 890bdb543d doc fixes (#2897) 2016-04-28 15:34:58 -07:00
Slim 58510d826b fix emit wait time (#2869) 2016-04-26 17:07:03 -07:00
Slim 55785267e4 postAgg filedName must match name of AGG (#2874) 2016-04-22 11:11:54 -07:00
binlijin 9151099e08 add document for druid.segmentCache.numBootstrapThreads (#2872) 2016-04-22 12:06:08 +08:00
Himanshu 3cfd9c64c9 make singleThreaded groupBy query config overridable at query time (#2828)
* make isSingleThreaded groupBy query processing overridable at query time

* refactor code in GroupByMergedQueryRunner to make processing of single threaded and parallel merging of runners consistent
2016-04-21 17:12:58 -07:00
Slim 984a518c9f Merge pull request #2734 from b-slim/LookupIntrospection2
[QTL][Lookup] adding introspection endpoint
2016-04-21 12:15:57 -05:00
Gian Merlino c74391e54c JavaScript: Ability to disable. (#2853)
Fixes #2852.
2016-04-21 09:43:15 -05:00
Nishant dbf63f738f Add ability to filter segments for specific dataSources on broker without creating tiers (#2848)
* Add back FilteredServerView removed in a32906c7fd to reduce memory usage using watched tiers.

* Add functionality to specify "druid.broker.segment.watchedDataSources"
2016-04-19 10:10:06 -07:00
Gaurav Kumar f5822faca3 Fixed wrong parseSpec in Avro Hadoop Parser (#2846)
`parseSpec` should contain `format` instead of `type`. It was wrongly defaulting to `tsv`
2016-04-16 11:34:54 -07:00
du00cs 639d0630b8 jackson conflict workaround in hadooop ingestio & parquet extension coordinate update (#2817) 2016-04-13 14:20:33 -07:00
Fangjin Yang 0c4a42bb6f change toc entry (#2834) 2016-04-13 13:45:07 -07:00
Gian Merlino e320d13385 Fix various broken links in the docs. (#2833) 2016-04-13 13:30:01 -07:00
Gian Merlino 725ee1401d Update tranquility version in the docs. (#2832) 2016-04-13 11:33:59 -07:00
Gian Merlino aa25cc1f68 Fix up Kafka tutorial (#2831)
1) Remove extraneous section
2) Remove -SNAPSHOT version
2016-04-13 11:33:45 -07:00
Fangjin Yang abd951df1a Document how to use roaring bitmaps (#2824)
* Document how to use roaring bitmaps

This fixes #2408.
While not all indexSpec properties are explained, it does explain how roaring bitmaps can be turned on.

* fix

* fix

* fix

* fix
2016-04-12 19:28:02 -07:00
Charles Allen ed5377465a add AirBnB Caravel to list of libraries (#2719) 2016-04-12 12:53:50 -07:00
Sébastien Launay 37d2ab623e Merge pull request #2815 from slaunay/documentation/hadoop-classpath-issue-fix-with-configuration
Doc for mapreduce.job.user.classpath.first=true
2016-04-12 10:51:51 -07:00
Nishant deb6ecf919 handle review comments for PR 2784
https://github.com/druid-io/druid/pull/2784#discussion_r59062021
2016-04-12 21:52:00 +05:30
Fangjin Yang bd6bd34cd8 Merge pull request #2090 from himanshug/math_exp
math expression support
2016-04-11 21:36:17 -07:00
Fangjin Yang 234125878a Merge pull request #2808 from metamx/moveLookupSaveStateConfigDocs
Move lookup config doc to proper location
2016-04-08 13:50:42 -06:00
Himanshu Gupta 308211cc18 math expression language with parser/lexer generated using ANTLR 2016-04-08 11:40:29 -05:00
Himanshu Gupta 36ccfbd20e math expression language with hand written parser/lexer 2016-04-08 11:40:29 -05:00
Charles Allen 2b99f717e4 Move lookup config doc to proper location 2016-04-08 08:15:38 -07:00
Nishant edd74f2b67 Allow Lite DataSegment Announcements
separate config for each skipping dimensions, metrics and loadSpec

Add test

fix test comment

Add docs
2016-04-07 18:24:12 +05:30
Charles Allen f915a59138 Merge pull request #2691 from metamx/lookupExtrFn
Add ExtractionFn to LookupExtractor bridge
2016-04-06 09:13:08 -07:00
jon-wei 0e481d6f93 Allow filters to use extraction functions 2016-04-05 13:24:56 -07:00
Fangjin Yang eea7a47870 Merge pull request #2576 from navis/paging-from-next
Add option for select query to get next page without modifying returned paging identifiers
2016-04-01 13:50:36 -07:00
Fangjin Yang 4eb5a2c4f1 Merge pull request #2715 from navis/stringformat-null-handling
stringFormat extractionFn should be able to return null on null values (Fix for #2706)
2016-04-01 13:45:28 -07:00
navis.ryu 077522a46f stringFormat extractionFn should be able to return null on null values (Fix for #2706) 2016-04-01 13:40:56 +09:00
navis.ryu 29bb00535b Add option for select query to get next page without modifying returned paging identifiers 2016-04-01 09:03:03 +09:00
fjy 14dbc431ef clean up for extensions docs 2016-03-30 17:14:58 -07:00
Fangjin Yang a8b28879f1 Merge pull request #2369 from du00cs/master
[Feature] Extension: Offline Ingestion with limited Parquet Support
2016-03-29 23:19:35 -07:00
Fangjin Yang 23a8830bc2 Merge pull request #2757 from druid-io/fix-conf
Update libraries.md
2016-03-29 21:32:01 -07:00
DuNinglin [杜宁林] 0f67ff7dfb reoganize code folder according to recent upstream folder changes, seperate it from avro code and take it into extensions-conrib. docs rewite too 2016-03-30 11:21:41 +08:00
Gian Merlino 1853f36e9f More consistent empty-set filtering behavior on multi-value columns.
The behavior is now that filters on "null" will match rows with no
values. The behavior in the past was inconsistent; sometimes these
filters would match and sometimes they wouldn't.

Adds tests for this behavior to SelectorFilterTest and
BoundFilterTest, for query-level filters and filtered aggregates.

Fixes #2750.
2016-03-29 15:32:13 -07:00
r4ruchir 4bff008d65 Update libraries.md
Adding embedded-druid information in helper libraries
2016-03-29 15:16:36 -07:00
Fangjin Yang 1e02eeab13 Merge pull request #2683 from metamx/default_retry
Better defaults for Retry policy for task actions
2016-03-29 08:02:59 -07:00
fjy c418a55638 cleanup distinct count agg 2016-03-28 17:29:41 -07:00
Fangjin Yang 62c1dc7a09 Merge pull request #2602 from binlijin/distinctcount
implement special distinctcount
2016-03-28 17:20:17 -07:00
Fangjin Yang 9cb197adec Merge pull request #2722 from himanshug/fix_hadoop_jar_upload
config to explicitly specify classpath for hadoop container during hadoop ingestion
2016-03-28 14:49:03 -07:00
Charles Allen 4764e86409 Add docs for RegisteredDimensionExtractionFn 2016-03-28 13:27:49 -07:00
Gian Merlino dbdfcd2443 Fix extension reference in Kafka namespaced lookup docs.
The reference to io.druid.extensions:kafka-extraction-namespace is wrong (should
be druid-kafka-extraction-namespace) and unnecessary (the extension id is written
at the top of the doc file).
2016-03-28 09:23:24 -07:00
Fangjin Yang a0216dcf7d Merge pull request #2735 from metamx/fixlookupDocs
Move lookup docs that are in druid-proper back into lookups.md
2016-03-26 15:38:48 -07:00
Charles Allen ab324e4ac0 Move lookup docs that are in druid-proper back into lookups.md 2016-03-25 10:46:50 -07:00
Gian Merlino 6d18382fb2 Fix broken link in datasketches-aggregators.md. 2016-03-25 09:32:40 -07:00
Himanshu Gupta e78a469fb7 UTs for ExtensionsConfig 2016-03-25 10:51:28 -05:00
Himanshu Gupta 004b00bb96 config to explicitly specify classpath for hadoop container during hadoop ingestion 2016-03-25 10:51:28 -05:00
Bingkun Guo 0fa04305a6 refine description for mergeBytesLimit 2016-03-24 13:17:24 -05:00
binlijin 2729efca71 implement special distinctcount 2016-03-24 11:11:11 +08:00
Robin 448e0127b9 dynamic config endpoint is at coordinator 2016-03-23 17:22:19 -05:00
Fangjin Yang a5d5529749 Merge pull request #2711 from gianm/filtered-aggregator-impls
All Filters should work with FilteredAggregators.
2016-03-23 13:37:21 -07:00
Gian Merlino dd86198902 All Filters should work with FilteredAggregators.
This removes Filter.makeMatcher(ColumnSelectorFactory) and adds a
ValueMatcherFactory implementation to FilteredAggregatorFactory so it can
take advantage of existing makeMatcher(ValueMatcherFactory) implementations.

This patch also removes the Bound-based method from ValueMatcherFactory. Its
only user was the SpatialFilter, which could use the Predicate-based method.

Fixes #2604.
2016-03-23 12:24:01 -07:00
Gian Merlino 2dfd3877c0 Fix a bunch of broken links in the docs. 2016-03-23 10:21:28 -07:00
Fangjin Yang d1f8f2b2fd Merge pull request #2698 from druid-io/fix-ext-docs
refactor extensions into their own docs
2016-03-22 22:04:12 -07:00
fjy 943cbe6e76 refactor extensions into their own docs 2016-03-22 18:54:10 -07:00
Fangjin Yang 041350c31b Merge pull request #2701 from gianm/mvd-docs
Improved docs for multi-value dimensions.
2016-03-22 18:09:37 -07:00
Gian Merlino 451c0bc6d8 Merge pull request #2702 from pjain1/improve_docs
how to query in the querying section, correct default for select strategy, formatting
2016-03-22 16:40:35 -07:00
Parag Jain 39ecb9929d how to query, correct default for select strategy, formatting 2016-03-22 17:06:15 -05:00
Gian Merlino ff25325f3b Improved docs for multi-value dimensions.
- Add central doc for multi-value dimensions, with some content from other docs.
- Link to multi-value dimension doc from topN and groupBy docs.
- Fixes a broken link from dimensionspecs.md, which was presciently already
  linking to this nonexistent doc.
- Resolve inconsistent naming in docs & code (sometimes "multi-valued", sometimes
  "multi-value") in favor of "multi-value".
2016-03-22 14:40:55 -07:00
Himanshu 3220b109ad Merge pull request #2570 from binlijin/single_dimension_partitioning
Single dimension hash-based partitioning
2016-03-22 11:51:06 -05:00
binlijin bce600f5d5 Single dimension hash-based partitioning 2016-03-22 13:15:33 +08:00
Nishant 11b8d1ed70 Merge pull request #2686 from gianm/fix-analysistypes-docs
Fix analysisTypes docs for SegmentMetadataQuery.
2016-03-18 16:15:38 -07:00
Gian Merlino 76ae30604e Fix analysisTypes docs for SegmentMetadataQuery. 2016-03-18 13:17:33 -07:00
Nishant ed8f39fcfe Better defaults for Retry policy for task actions
This PR changes the retry of task actions to be a bit more aggressive
by reducing the maxWait. Current defaults were 1 min to 10 mins, which
lead to a very delayed recovery in case there are any transient network
issues between the overlord and the peons.

doc changes.
2016-03-18 11:59:55 -07:00
Charles Allen 5da9a280b6 Query Time Lookup - Dynamic Configuration 2016-03-18 09:45:05 -07:00
Slim cf342d8d3c Merge pull request #2517 from b-slim/adding_lookup_snapshot_utility
[QTL][Lookup] lookup module with the snapshot utility
2016-03-17 11:39:47 -05:00
Slim Bouguerra 0c86b29ef0 lookup module with the snapshot utility 2016-03-17 09:20:41 -05:00
Fangjin Yang 8cea85816d Merge pull request #2668 from navis/fix-document-selectquery
Document for search query was not updated properly (Fix for #2662)
2016-03-15 20:34:27 -07:00
navis.ryu 71ee9e2aac Document for search query is not updated properly (Fix for #2662) 2016-03-16 09:22:26 +09:00
dclim 553b677971 caching doc fix 2016-03-15 17:09:33 -06:00
Gian Merlino a938f0853b Additional ports docs. 2016-03-14 19:11:18 -07:00
Jonathan Wei 5ec5ac92c6 Merge pull request #2382 from himanshug/broker_segment_tier_selection
at broker, if configured, only add segments from specific tiers to the timeline
2016-03-14 16:53:06 -07:00
Fangjin Yang a41a70d370 Merge pull request #2651 from gianm/ports-docs
Docs on default ports.
2016-03-14 14:15:52 -07:00
Fangjin Yang dbdbacaa18 Merge pull request #2260 from navis/cardinality-for-searchquery
Support cardinality for search query
2016-03-14 13:24:40 -07:00
Gian Merlino e51277b96c Docs on default ports. 2016-03-14 11:25:21 -07:00
rasahner 2861e854f0 Merge pull request #2540 from pjain1/remove_kill
Remove extra parameter from deleteDataSourceSpecificInterval endpoint and correct exception message for invalid interval
2016-03-14 11:16:23 -05:00
navis.ryu be341bf4e3 Support cardinality for search query (Fix for #2260) 2016-03-12 09:51:01 +09:00
Bingkun Guo 96c981cd0a fix broken link for Tasks 2016-03-11 11:36:34 -06:00
Xavier Léauté 90d7409e1a Merge pull request #2611 from himanshug/gp_by_max_limit
only allow lowering maxResults and maxIntermediateRows from groupBy query context
2016-03-10 13:44:13 -08:00
Charles Allen 7b1bfbf704 Add documentation to modules about what should be excluded. 2016-03-10 10:18:33 -08:00
Gian Merlino a2b1652787 Clarify parser docs.
- Clarify what parseSpecs are used for.
- Avro, Protobuf should use timeAndDims parseSpecs.
- Hadoop jobs should use hadoopyString string parsers.
2016-03-10 08:45:04 -08:00
fjy e3e932a4d4 refactor extensions into core and contrib 2016-03-08 17:12:09 -08:00
Himanshu Gupta ca5de3f583 only allow lowering maxResults and maxIntermediateRows from groupBy query context 2016-03-08 15:03:59 -06:00
Fangjin Yang 8e36e6fa43 Merge pull request #2610 from dclim/add-combineText-doc
add combineText property and cleanup batch ingestion doc
2016-03-08 12:54:16 -08:00
Fangjin Yang e7018f524f Merge pull request #2598 from himanshug/handoff_timeout
optional ability to configure handoff wait timeout on realtime tasks
2016-03-08 12:43:36 -08:00
dclim df29667a89 add combineText property and cleanup batch ingestion doc 2016-03-08 13:10:34 -07:00
Himanshu Gupta 099acb4966 allow groupBy max[Intermediate]Rows limit be overridable by context 2016-03-07 15:22:41 -06:00
Himanshu Gupta 0402636598 configurable handoffConditionTimeout in realtime tasks for segment handoff wait 2016-03-05 10:14:54 -06:00
Charles Allen 2ad134638d Merge pull request #2589 from b-slim/fix_real_time
Make realtime kafka firehose skip corrupt message
2016-03-04 12:14:23 -08:00
Slim Bouguerra 623e89aa54 skip corrupt message 2016-03-04 08:30:40 -06:00