Commit Graph

141 Commits

Author SHA1 Message Date
Gian Merlino e7d01b67b6 Move SQL configs to sql.md. (#3959)
This puts all the SQL stuff in one place. It also makes life easier by
pointing out that configs be made in either common.runtime.properties
or the broker runtime.properties.
2017-02-22 08:37:24 -08:00
Jonathan Wei bc33b68b51 Use GroupBy V2 as default (#3953)
* Use GroupBy V2 as default

* Remove unused line

* Change assert to exception propagation
2017-02-18 07:40:40 -08:00
Himanshu 9dfcf0763a disable javascript execution by default (#3818) 2017-02-13 15:11:18 -08:00
Himanshu 8cf7ad1e3a druid.coordinator.asOverlord.enabled flag at coordinator to make it an overlord too (#3711) 2017-02-13 15:03:59 -08:00
Jonathan Wei 182261f713 Allow configurable temp directory for query processing (#3893) 2017-02-02 10:22:28 -08:00
Niketh Sabbineni 2b8d3c102b Remove throttling on drop segments (#3736)
* Remove throttling on drop

* Throttle loadqueuepeon segment change requests to ZK

* Make initial delay configurable, add docs, shutdown gracefully

* Make loadqueuepeon repeat delay configurable
2017-01-20 10:02:19 -08:00
Gian Merlino d51f5e058d SQL: Ditch CalciteConnection layer and add DruidMeta, extension aggregators. (#3852)
* SQL: Ditch CalciteConnection layer and add DruidMeta, extension aggregators.

Switched from CalciteConnection to Planner, bringing benefits:

- CalciteConnection's JDBC interface no longer sits between the SQL server
  (HTTP/Avatica) and Druid's query layer. Instead, the SQL servers can use
  Druid Sequence objects directly, reducing overhead in the query return path.

- Implemented our own Planner-based Avatica Meta, letting us control
  connection timeouts and connection / statement limits. The previous
  CalciteConnection-based implementation didn't have any limits or timeouts.

- The Planner interface lets us override the operator table, opening up
  SQL language extensions. This patch includes two: APPROX_COUNT_DISTINCT
  in core, and a QUANTILE aggregator in the druid-histogram extension.

Also:

- Added INFORMATION_SCHEMA metadata schema.

- Added tests for Unicode literals and escapes.

* Verify statement is actually open before closing it.

* More detailed INFORMATION_SCHEMA docs.
2017-01-19 16:32:20 -08:00
Gian Merlino e86859b228 SQL support for nested groupBys. (#3806)
* SQL support for nested groupBys.

Allows, for example, doing exact count distinct by writing:

  SELECT COUNT(*) FROM (SELECT DISTINCT col FROM druid.foo)

Contrast with approximate count distinct, which is:

  SELECT COUNT(DISTINCT col) FROM druid.foo

* Add deeply-nested groupBy docs, tests, and maxQueryCount config.

* Extract magic constants into statics.

* Rework rules to put preconditions in the "matches" method.
2017-01-11 18:32:53 -08:00
Gian Merlino 76620615a1 Properly respect the enableAvatica and enableJsonOverHttp options. (#3834) 2017-01-11 14:43:34 -06:00
Vinh Tran dddeae813a Update caching.md typo (#3824)
* Update caching.md

Typo of Command vs Comma

* Update index.md

Fixing `Command` typo
2017-01-06 12:14:07 -08:00
Yuusaku Taniguchi 02519d5b64 Exhibitor Support (#3664)
* allow JsonConfigTesterBase to treat the fields of collections

* [Feature] Exhibitor Support (#3664)

This patch provides the integration of Druid & Netflix Exhibitor. Druid
currently use Apache Curator as ZooKeeper client. Curator can be
integrated with Exhibitor to achieve a live/updating list of the
ZooKeeper ensemble. This patch enables Druid to use this features.
2017-01-02 09:15:36 -08:00
Himanshu 4ca3b7f1e4 overlord helpers framework and tasklog auto cleanup (#3677)
* overlord helpers framework and tasklog auto cleanup

* review comment changes

* further review comments addressed
2016-12-21 15:18:55 -08:00
Nishant 35160e5595 Add metrics for Query Count statistics (#3470)
* Add metrics for Query Count statistics

This PR adds a new metrics monitor “QueryCountStatsMonitor” which emits
three new metrics -
1) query/success/count - number of successful queries
2) query/failed/count - number of failed queries
3) query/interrupted/count - number of interrupted/timedout queries

fix bindings

* make fields final

* fix imports

* AsyncQueryForwardingServlet implement QueryStatsProvider

* remove unused import
2016-12-19 09:47:58 -08:00
Gian Merlino dd63f54325 Built-in SQL. (#3682) 2016-12-16 17:15:59 -08:00
Nishant 8cfcb95fbc Add Filtered and Composing request loggers (#3469)
* Add Filtered and Composing request loggers

Add Filtered and Composite Request loggers
- enables users to filter request logs for slow queries.

fix test

* review comments

* review comment

* remove unused import
2016-12-16 11:18:32 -08:00
Gian Merlino 943982b7b0 Configurable HTTP compression. (#3759)
* Configurable HTTP compression.

* Call real-time nodes real-time processes in docs.
2016-12-07 17:40:39 -08:00
Himanshu 06d0ef9c6c allow and load extensions with absolute paths in druid.extensions.loadList (#3747) 2016-12-06 17:40:23 -08:00
Niketh Sabbineni d904c79081 Normalized Cost Balancer (#3632)
* Normalized Cost Balancer

* Adding documentation and renaming to use diskNormalizedCostBalancer

* Remove balancer from the strings

* Update docs and include random cost balancer

* Fix checkstyle issues
2016-12-05 17:18:20 -08:00
Niketh Sabbineni 2640d170c3 Blacklist workers if they fail for too many times (#3643)
* Blacklist workers if they fail for too many times

* Adding documentation

* Changing to timeout to period and updating docs

* 1. Add configurable maxPercentageBlacklistWorkers
2. Rename variable

* Change maxPercentageBlacklistWorkers to double

* Remove thread.sleep
2016-11-29 12:38:56 +05:30
Erik Dubbelboer 7d36f540e8 WIP: Add Google Storage support (#2458)
Also excludes the correct artifacts from #2741
2016-11-16 14:06:45 +05:30
Gian Merlino 7a2a4bc6de JavaScript: Disable now affects worker selection and router strategy too. (#3458) 2016-09-13 16:37:42 -07:00
Gian Merlino e0e28866ee JavaScript docs: Fix links and typos, add to TOC. (#3457) 2016-09-13 15:26:44 -07:00
Gian Merlino 76a24054e3 JavaScript docs, including docs for globals. (#3454) 2016-09-13 13:46:55 -07:00
Himanshu 03cfcf002b fix the race described in #3174 (#3205) 2016-08-10 11:29:50 -07:00
Nishant 8035c73409 Implement EnvironmentVariablePasswordProvider (#3329)
* Implement EnvironmentVariablePasswordProvider

* Review Comment : rename passwordKey to passwordVariable

* add docs

* improve doc layout

* review comment: rename property for variable
2016-08-10 05:33:51 +08:00
Navis Ryu 39351fb8d2 Mask properties from logging (#3332)
* Mask properties from logging

* mask "password" by default
2016-08-08 21:36:10 +05:30
Charles Allen d04af6aee4 Add `slf4j` requst logger (#3146)
* Add `slf4j` requst logger

* Address comments

* Fix conflicts with master

* Fix removed map value
2016-07-29 15:15:41 -07:00
David Lim 9a068e1ba6 fix broken link and use of pipes in table (#3290) 2016-07-26 15:46:51 -07:00
Himanshu 3f82108d15 optionally enable coordinator auto kill tasks on all dataSources via dynamic config (#3250) 2016-07-17 18:47:52 -07:00
Fangjin Yang 8eeae2e844 remove bad docs on setting up clusters (#3188) 2016-07-01 15:41:40 -05:00
Gian Merlino 4cc39b2ee7 Alternative groupBy strategy. (#2998)
This patch introduces a GroupByStrategy concept and two strategies: "v1"
is the current groupBy strategy and "v2" is a new one. It also introduces
a merge buffers concept in DruidProcessingModule, to try to better
manage memory used for merging.

Both of these are described in more detail in #2987.

There are two goals of this patch:

1. Make it possible for historical/realtime nodes to return larger groupBy
   result sets, faster, with better memory management.
2. Make it possible for brokers to merge streams when there are no order-by
   columns, avoiding materialization.

This patch does not do anything to help with memory management on the broker
when there are order-by columns or when there are nested queries. That could
potentially be done in a future patch.
2016-06-24 18:06:09 -07:00
Gian Merlino 2db5f49f35 Fix JavaScriptConfig. (#3062) 2016-06-02 23:59:00 -07:00
Parag Jain 44237e25d9 fix duration format and number format (#3057) 2016-06-02 10:09:21 -07:00
David Lim b489f63698 Supervisor for KafkaIndexTask (#2656)
* supervisor for kafka indexing tasks

* cr changes
2016-05-04 23:13:13 -07:00
Charles Allen 44e52acfc0 Link up metrics configuration to what they mean (#2921) 2016-05-04 10:30:02 -07:00
binlijin 9151099e08 add document for druid.segmentCache.numBootstrapThreads (#2872) 2016-04-22 12:06:08 +08:00
Himanshu 3cfd9c64c9 make singleThreaded groupBy query config overridable at query time (#2828)
* make isSingleThreaded groupBy query processing overridable at query time

* refactor code in GroupByMergedQueryRunner to make processing of single threaded and parallel merging of runners consistent
2016-04-21 17:12:58 -07:00
Gian Merlino c74391e54c JavaScript: Ability to disable. (#2853)
Fixes #2852.
2016-04-21 09:43:15 -05:00
Nishant dbf63f738f Add ability to filter segments for specific dataSources on broker without creating tiers (#2848)
* Add back FilteredServerView removed in a32906c7fd to reduce memory usage using watched tiers.

* Add functionality to specify "druid.broker.segment.watchedDataSources"
2016-04-19 10:10:06 -07:00
Nishant deb6ecf919 handle review comments for PR 2784
https://github.com/druid-io/druid/pull/2784#discussion_r59062021
2016-04-12 21:52:00 +05:30
Nishant edd74f2b67 Allow Lite DataSegment Announcements
separate config for each skipping dimensions, metrics and loadSpec

Add test

fix test comment

Add docs
2016-04-07 18:24:12 +05:30
Fangjin Yang 1e02eeab13 Merge pull request #2683 from metamx/default_retry
Better defaults for Retry policy for task actions
2016-03-29 08:02:59 -07:00
Fangjin Yang 9cb197adec Merge pull request #2722 from himanshug/fix_hadoop_jar_upload
config to explicitly specify classpath for hadoop container during hadoop ingestion
2016-03-28 14:49:03 -07:00
Himanshu Gupta e78a469fb7 UTs for ExtensionsConfig 2016-03-25 10:51:28 -05:00
Himanshu Gupta 004b00bb96 config to explicitly specify classpath for hadoop container during hadoop ingestion 2016-03-25 10:51:28 -05:00
Bingkun Guo 0fa04305a6 refine description for mergeBytesLimit 2016-03-24 13:17:24 -05:00
Robin 448e0127b9 dynamic config endpoint is at coordinator 2016-03-23 17:22:19 -05:00
Gian Merlino 451c0bc6d8 Merge pull request #2702 from pjain1/improve_docs
how to query in the querying section, correct default for select strategy, formatting
2016-03-22 16:40:35 -07:00
Parag Jain 39ecb9929d how to query, correct default for select strategy, formatting 2016-03-22 17:06:15 -05:00
Nishant ed8f39fcfe Better defaults for Retry policy for task actions
This PR changes the retry of task actions to be a bit more aggressive
by reducing the maxWait. Current defaults were 1 min to 10 mins, which
lead to a very delayed recovery in case there are any transient network
issues between the overlord and the peons.

doc changes.
2016-03-18 11:59:55 -07:00