196 Commits

Author SHA1 Message Date
Gian Merlino
26991b5a2a Indexing service: Fix termination related log message 2013-12-20 12:05:42 -08:00
Gian Merlino
4d83837e88 RealtimeIndexTask: Clean up imports and comments 2013-12-20 11:37:16 -08:00
Gian Merlino
17ad4ee2f0 Fix RemoteTaskRunnerTest 2013-12-20 11:23:28 -08:00
Gian Merlino
e5b8546d19 Autoscaling fixes.
- Initial targetWorkerCount must be subject to pool size limits
- Use consistent workerSetupData for the entire autoscaling run
- Don't call terminate() when we have nothing to terminate
- Terminate obsolete workers even faster
2013-12-20 11:17:01 -08:00
fjy
3ec2766cd3 Merge pull request #339 from metamx/autoscaling
Autoscaling: Move target count independent of actual count.
2013-12-20 10:04:26 -08:00
Gian Merlino
6224577ed1 Autoscaling: Terminate obsolete workers faster 2013-12-20 10:01:32 -08:00
Gian Merlino
4a722c0a6d Autoscaling changes from code review.
- Log and return immediately when workerSetupData is null
- Allow provisioning more nodes while other nodes are still provisioning
- Add tests for bumping up the minimum version
2013-12-20 08:59:35 -08:00
Gian Merlino
0ee6136ea3 NoopTask: Fix things that should be static. Add simple factory method. 2013-12-20 08:56:17 -08:00
Gian Merlino
3dd9a25546 Fix import 2013-12-19 16:18:16 -08:00
Gian Merlino
0ff7f0e8e0 TaskActionToolbox: Combine adjacent ifs 2013-12-19 16:16:34 -08:00
Gian Merlino
f86342f7dc DbTaskStorage: Protect against invalid lock_payload 2013-12-19 16:16:20 -08:00
Gian Merlino
1f4b99634f Autoscaling: Move target count independent of actual count.
This should let us grow and shrink the worker pool in chunks when necessary
(like when a bunch of them go offline, or when there is a worker version
change).
2013-12-19 16:11:30 -08:00
Gian Merlino
846c3da4ab Empty task intervals, and empty lock intervals, aren't useful.
So prevent them from being created, through checks in AbstractFixedIntervalTask
and TaskLockbox.tryLock.
2013-12-19 13:21:41 -08:00
Gian Merlino
566a3a6112 Indexing service: Break up segment actions
Each one now operates on at most a collection of segments that comprise
a single partition. The main purpose of this change is to prevent audit log
payload sizes from getting out of control.
2013-12-19 13:10:40 -08:00
Gian Merlino
6fbe67eeea IndexerDBCoordinator: Work around SELECT -> INSERT races when adding segments 2013-12-19 13:10:40 -08:00
Gian Merlino
1ff855d744 Fix MoveTask serde and ArchiveTask id creation 2013-12-18 15:17:12 -08:00
Gian Merlino
58d1262edf Indexing console: Clarify "Complete" with "recently completed" 2013-12-17 08:16:49 -08:00
Xavier Léauté
e333776aca rename SegmentMoveAction to SegmentMetadataUpdateAction 2013-12-16 14:00:56 -08:00
Xavier Léauté
ac2ca0e46c separate move and archive tasks 2013-12-16 14:00:55 -08:00
Xavier Léauté
123bddd615 update for new interfaces 2013-12-16 13:59:16 -08:00
Xavier Léauté
4a291fdf30 better naming 2013-12-16 13:59:16 -08:00
Xavier Léauté
a417cd5df2 add archive task 2013-12-16 13:59:15 -08:00
fjy
87b83bceb1 fix task storage config serde and prepare for next release 2013-12-13 16:55:22 -08:00
fjy
01f9c1df31 fix broken task storage config and prepare for next release 2013-12-13 16:45:32 -08:00
Gian Merlino
600dc7546f Configurability of recency threshold 2013-12-13 16:02:54 -08:00
fjy
4a8140be81 better messaging to console again 2013-12-13 15:04:25 -08:00
fjy
52cdb20f10 add better messaging and error handling 2013-12-13 15:01:07 -08:00
Gian Merlino
e63c69dd57 TaskStorage: Return recently complete tasks in reverse chronological order 2013-12-13 12:27:45 -08:00
Gian Merlino
6c993d87bf Indexing service API and GUI improvements!
- New APIs: waitingTasks, completeTasks, task payload
- GUI for the above, and for task logs + status
2013-12-13 11:38:18 -08:00
Gian Merlino
f36a5b677c TaskLifecycleTest: Add test for noop task 2013-12-13 07:48:28 -08:00
Gian Merlino
3b053a66ff TaskLifecycleTest: Add test for never-ready task 2013-12-13 07:48:27 -08:00
Gian Merlino
863012c384 TaskQueue: Exception during isReady does not warrant an alert. 2013-12-13 07:48:27 -08:00
Gian Merlino
6227963af9 TaskQueue: Copy task list before management loop. 2013-12-13 07:48:27 -08:00
Gian Merlino
70c153592f CliPeon: Fix local mode 2013-12-12 14:22:57 -08:00
Gian Merlino
370e2f855a TaskSerdeTest: Fix IndexTask test by including an actual firehoseFactory 2013-12-12 13:58:44 -08:00
Gian Merlino
169f149cf9 TaskLifecycleTest: Fix broken setUp and broken assumptions. 2013-12-12 13:51:13 -08:00
Gian Merlino
ba757b1e5a IndexTask: Actually make and publish segments for the correct intervals. 2013-12-12 13:50:53 -08:00
Gian Merlino
be25d51a2c RemoteTaskRunner: Fix issues leading to failing tests 2013-12-12 13:49:49 -08:00
Gian Merlino
c60158a21a RemoteTaskRunner: Remove task from pendingTaskPayloads on shutdown if needed 2013-12-12 10:59:16 -08:00
Gian Merlino
0129ea99cf RemoteTaskRunner changes to make bootstrapping actually work.
- Workers are not added to zkWorkers until caches have been initialized.
- Worker status we haven't heard about will be added to runningTasks or
  completeTasks as appropriate. 
- TaskRunnerWorkItem now only needs a taskId, not the entire Task. This makes
  it possible to create them from TaskStatus objects, if that's all we have.
- Also remove some dead code.
2013-12-12 10:44:46 -08:00
Gian Merlino
d92b88718c OverlordResource: Fix comment 2013-12-12 08:46:24 -08:00
Gian Merlino
b6a52610bc IndexTask: Call plumber.startJob() 2013-12-12 08:46:10 -08:00
Gian Merlino
db9b515e71 IndexTask: Remove unnecessary args to determinePartitions. 2013-12-12 08:46:00 -08:00
Gian Merlino
f4a09d4ee3 TaskAction: Add JsonSubType for LockTryAcquireAction 2013-12-12 08:45:23 -08:00
Gian Merlino
b17dc6f744 Task interval, isReady hygiene 2013-12-11 22:42:20 -08:00
Gian Merlino
05e24bd85c RemoteTaskRunner: Fix typo 2013-12-11 22:38:04 -08:00
Gian Merlino
bed263efa5 VersionConverterTask: Less goofy import for Preconditions 2013-12-11 22:37:55 -08:00
Gian Merlino
53d90efe30 TaskQueueConfig: Copyright header 2013-12-11 22:37:40 -08:00
Gian Merlino
0adda97776 AbstractFixedIntervalTask: Copyright header 2013-12-11 22:37:28 -08:00
Gian Merlino
c4b8c8bc6f Rework indexing service internals to hopefully be more reliable.
The TaskQueue directly manages the TaskRunner. The main management loop runs
periodically and checks that the runner is doing reasonable things. If not, it
attempts to adjust the runner. The management loop also runs on-demand when a
task is added to keep task assignment relatively low latency. The TaskConsumer
is no longer necessary and so it no longer exists.

Task interval locks are handled differently. Instead of some tasks acquiring
locks at runtime and some tasks having implicit fixed lock intervals, all tasks
ask for locks explicitly. This occurs either in "isReady" (which runs on the
overlord) or in "run" (which runs on the peon).

Other changes:
- The TaskQueue is attached to the leader lifecycle, instead of global
- The TaskLockbox is able to sync itself from storage and is no longer
  bootstrapped by the TaskQueue.
- RemoteTaskRunner does not clean up zk paths until asked to. This will
  prevent deletion of statuses that have not yet been committed.
- Added retries on DbTaskStorage operations.
- Removed SpawnTasksAction (no more subtasks)
- Removed obsolete EventReceiverFirehose configs
- Removed obsolete OldOverlordResource
- Removed TaskStorageQueryAdapter methods related to subtasks
2013-12-11 15:05:16 -08:00