Commit Graph

39 Commits

Author SHA1 Message Date
Himanshu a46d34daa2 HTTP based task/worker management. (#5104)
* just renaming of SegmentChangeRequestHistory etc

* additional change history refactoring changes

* WorkerTaskManager a replica of WorkerTaskMonitor

* HttpServerInventoryView refactoring to extract sync code and robustification

* Introducing HttpRemoteTaskRunner

* Additional Worker side updates
2018-01-04 19:19:35 -08:00
Roman Leventov a7a6a0487e Replace IOPeon with SegmentWriteOutMedium; Improve buffer compression (#4762)
* Replace IOPeon with OutputMedium; Improve compression

* Fix test

* Cleanup CompressionStrategy

* Javadocs

* Add OutputBytesTest

* Address comments

* Random access in OutputBytes and GenericIndexedWriter

* Fix bugs

* Fixes

* Test OutputBytes.readFully()

* Address comments

* Rename OutputMedium to SegmentWriteOutMedium and OutputBytes to WriteOutBytes

* Add comments to ByteBufferInputStream

* Remove unused declarations
2017-12-04 18:04:27 -08:00
Himanshu f69c9280c4 remove ServerConfig from DruidNode as all information needs to be present in DruidNode serialized form (#4858)
* remove ServerConfig from DruidNode as all information needs to be present in DruidNode serialized form

* sanitize output of /druid/coordinator/v1/cluster endpoint
2017-09-28 10:40:59 -05:00
Gian Merlino 33c0928bed Collapse worker select strategies, change default, add strong affinity. (#4534)
* Collapse worker select strategies, change default, add strong affinity.

- Change default worker select strategy to equalDistribution. It is
  more generally useful than fillCapacity.
- Collapse the *WithAffinity strategies into the regular ones. The
  *WithAffinity strategies are retained for backwards compatibility.
- Change WorkerSelectStrategy to return nullable instead of Optional.
- Fix a couple of errors in the docs.

* Fix test.

* Review adjustments.

* Remove unused imports.

* Switch to DateTimes.nowUtc.

* Simplify code.

* Fix tests (worker assignment started off on a different foot)
2017-09-04 14:40:55 -07:00
Akash Dwivedi b43720c46d Correction in indexing-service configuration doc. (#4700) 2017-08-22 23:21:34 -05:00
David Lim dd0b84e766 Fix bugs in RTR related to blacklisting, change default worker strategy (#4619)
* fix bugs in RTR related to blacklisting, change default worker strategy to equalDistribution

* code review and additional changes

* fix errorprone

* code review changes
2017-08-03 10:34:45 -07:00
Parag Jain 6e2f78f552 TLS support (#4270) 2017-07-06 17:40:12 -07:00
JackyWoo a0f2cf05d5 Add EqualDistributionWithAffinityWorkerSelectStrategy which balance w… (#3998)
* Add EqualDistributionWithAffinityWorkerSelectStrategy which balance work load within affinity workers.

* add docs to equalDistributionWithAffinity
2017-03-25 19:15:49 -07:00
Himanshu 9dfcf0763a disable javascript execution by default (#3818) 2017-02-13 15:11:18 -08:00
Himanshu 4ca3b7f1e4 overlord helpers framework and tasklog auto cleanup (#3677)
* overlord helpers framework and tasklog auto cleanup

* review comment changes

* further review comments addressed
2016-12-21 15:18:55 -08:00
Niketh Sabbineni 2640d170c3 Blacklist workers if they fail for too many times (#3643)
* Blacklist workers if they fail for too many times

* Adding documentation

* Changing to timeout to period and updating docs

* 1. Add configurable maxPercentageBlacklistWorkers
2. Rename variable

* Change maxPercentageBlacklistWorkers to double

* Remove thread.sleep
2016-11-29 12:38:56 +05:30
Erik Dubbelboer 7d36f540e8 WIP: Add Google Storage support (#2458)
Also excludes the correct artifacts from #2741
2016-11-16 14:06:45 +05:30
Gian Merlino e0e28866ee JavaScript docs: Fix links and typos, add to TOC. (#3457) 2016-09-13 15:26:44 -07:00
Gian Merlino 76a24054e3 JavaScript docs, including docs for globals. (#3454) 2016-09-13 13:46:55 -07:00
Himanshu 03cfcf002b fix the race described in #3174 (#3205) 2016-08-10 11:29:50 -07:00
Fangjin Yang 1e02eeab13 Merge pull request #2683 from metamx/default_retry
Better defaults for Retry policy for task actions
2016-03-29 08:02:59 -07:00
Gian Merlino 451c0bc6d8 Merge pull request #2702 from pjain1/improve_docs
how to query in the querying section, correct default for select strategy, formatting
2016-03-22 16:40:35 -07:00
Parag Jain 39ecb9929d how to query, correct default for select strategy, formatting 2016-03-22 17:06:15 -05:00
Nishant ed8f39fcfe Better defaults for Retry policy for task actions
This PR changes the retry of task actions to be a bit more aggressive
by reducing the maxWait. Current defaults were 1 min to 10 mins, which
lead to a very delayed recovery in case there are any transient network
issues between the overlord and the peons.

doc changes.
2016-03-18 11:59:55 -07:00
Björn Zettergren 2462c82c0e New defaults for maxRowsInMemory rowFlushBoundary
To bring consistency to docs and source this commit changes the default
values for maxRowsInMemory and rowFlushBoundary to 75000 after
discussion in PR https://github.com/druid-io/druid/pull/2457.

The previous default was 500000 and it's lower now on the grounds that
it's better for a default to be somewhat less efficient, and work,
than to reach for the stars and possibly result in
"OutOfMemoryError: java heap space" errors.
2016-03-01 13:50:28 +01:00
Charles Allen c6803c4364 Allow specifying peon javaOpts as an array 2016-02-26 13:24:35 -08:00
Himanshu Gupta bc156effe7 RTR has multiple threads for assignment of pending tasks now. 2016-02-26 09:27:03 -06:00
Gian Merlino 23c993c9e7 Add druid.indexer.server.maxChatRequests for QoS; deprecate separate ports.
- Add druid.indexer.server.maxChatRequests, which sets up a QoSFilter on the main Jetty server.
- Deprecate druid.indexer.runner.separateIngestionEndpoint
- Deprecate druid.indexer.server.chathandler.*
2016-02-19 13:36:09 -08:00
Charles Allen 3a6452c6d4 Make QuotableWhiteSpaceSplitter able to take json
* Fixes #2435
2016-02-10 16:42:14 -08:00
Charles Allen e18301d99c Make TaskConfig pull from java.io.tmpdir
* Also makes paths built off of java.nio.file.Paths instead of String.format
2016-01-04 10:17:08 -08:00
Gian Merlino bad270b6c4 druid.indexer.task.restoreTasksOnRestart configuration. 2015-12-22 10:59:15 -08:00
Steve M 2b5a010332 Change sample worker config spec with host:port instead of ip:port.
Also extend description of the 'affinity' property of the worker strategy
fillCapacityWithAffinity and fix a couple typos of middle manager (to
be more consistent throughout the page).

Add additional verbiage about appropriate middle manager host value.
2015-12-14 14:59:23 -08:00
Gian Merlino 501dcb43fa Some changes that make it possible to restart tasks on the same hardware.
This is done by killing and respawning the jvms rather than reconnecting to existing
jvms, for a couple reasons. One is that it lets you restore tasks after server reboots
too, and another is that it lets you upgrade all the software on a box at once by just
restarting everything.

The main changes are,

1) Add "canRestore" and "stopGracefully" methods to Tasks that say if a task can
   stop gracefully, and actually do a graceful stop. RealtimeIndexTask is the only
   one that currently implements this.

2) Add "stop" method to TaskRunners that attempts to do an orderly shutdown.
   ThreadPoolTaskRunner- call stopGracefully on restorable tasks, wait for exit
   ForkingTaskRunner- close output stream to restorable tasks, wait for exit
   RemoteTaskRunner- do nothing special, we actually don't want to shutdown

3) Add "restore" method to TaskRunners that attempts to bootstrap tasks from last run.
   Only ForkingTaskRunner does anything here. It maintains a "restore.json" file with
   a list of restorable tasks.

4) Have the CliPeon's ExecutorLifecycle lock the task base directory to avoid a restored
   task and a zombie old task from stomping on each other.
2015-11-23 11:22:08 -08:00
Charles Allen dbe201aeed Merge pull request #1929 from pjain1/jetty_threads
separate ingestion and query thread pool
2015-11-17 12:14:25 -08:00
Parag Jain 6c498b7d4a separate ingestion and query thread pool 2015-11-17 13:42:41 -06:00
Bartosz Ługowski 6e5d2c6745 Add count parameter to history endpoints. 2015-11-11 23:03:57 +01:00
Charles Allen 44a2b204df Add timeout to shutdown request to middle manager for indexing service 2015-10-27 13:56:03 -07:00
sahner 4801de62a2 make "announce" the chathandler default in realtime node,
remove doc references to chathandler type "announce" since it is the default now,
2015-07-27 12:14:28 -05:00
Zak Kristjanson 0bda7af52c Add more support for Azure Blob Store
Azure Blob Store support for Task Logs and a firehose for data ingestion
2015-07-17 15:38:21 -07:00
Fangjin Yang 726ed432a1 Merge pull request #1451 from rasahner/doc_minorFixes
minor documentation fixes in Tasks.md, index.md, indexing-service.md
2015-06-23 10:15:47 -07:00
sahner 4ba34fe43d minor documentation fixes in Tasks.md, index.md, indexing-service.md 2015-06-19 17:09:53 -05:00
nishant fb4052d577 JavaScript Worker Select Strategy
this PR adds a JavaScriptWorkerSelectStrategy which allows defining
arbitrary logic for selecting workers to run task using a JavaScript
function.

This gives users full control to implement complex worker selection
strategies based on task attributes.

more tests and a complex javascript config

fix for java8 modify for nashorn compatibility
2015-06-20 02:01:34 +05:30
nishant e9afec4a2b fix task status issues on zk outages
docs

review comments

fix test

review comments

Review comments

fix compilation

fix typo
2015-06-11 00:49:52 +05:30
Xavier Léauté d2346b6834 shorten links and file names
* remove redundant parts in file names
* delete unsupported "Druid-Personal-Demo-Cluster"
2015-05-29 20:55:42 -05:00