Commit Graph

483 Commits

Author SHA1 Message Date
nishantmonu51 01e8a713b6 unit tests passing with offheap-indexing 2014-06-05 17:42:53 +05:30
fjy 9f4cc5ca1f fix test 2014-06-04 16:29:20 -07:00
fjy 77ec4df797 update guava, java-util, and druid-api 2014-06-03 13:43:38 -07:00
Gian Merlino 8a6384862f Better task log errors. 2014-06-02 12:55:17 -07:00
Gian Merlino 48511e15cf Autoscaling: Fix case where target < current, but target is too low.
This can happen when workers are provisioned manually.
2014-06-02 12:36:48 -07:00
fjy 7ffe60ca60 clean up a bit of the logic in RTR failure with exceptions in announceTask 2014-06-02 10:49:28 -07:00
nishantmonu51 6f176cee85 fail task on exception
If the task announcement throws an exception, set the status to failed instead of
retrying it again and again.
2014-06-02 22:52:50 +05:30
fjy 4c13327297 more logging for determine hashed 2014-05-30 16:19:20 -07:00
Xavier Léauté 1a8eb25852 update realtime tasks to use FilteredServerView 2014-05-29 16:41:26 -07:00
fjy 7be93a770a make all firehoses work with tasks, add a lot more documentation about configuration 2014-05-28 16:33:59 -07:00
fjy 09ad32c5c5 fix race condition with merge and persist and sink adding
Conflicts:
	indexing-service/src/main/java/io/druid/indexing/common/task/IndexTask.java
	server/src/main/java/io/druid/segment/realtime/RealtimeManager.java
	server/src/main/java/io/druid/segment/realtime/plumber/RealtimePlumber.java
2014-05-16 15:21:09 -07:00
fjy d75cc7b9b8 fix more serde 2014-05-06 15:17:38 -07:00
fjy 1100d2f2a1 rename configs to make a bit more sense 2014-05-06 14:52:50 -07:00
fjy b6fb4245aa Merge branch 'master' into new-schema
Conflicts:
	indexing-hadoop/src/main/java/io/druid/indexer/HadoopDriverConfig.java
	indexing-hadoop/src/main/java/io/druid/indexer/HadoopDruidIndexerConfig.java
	indexing-hadoop/src/main/java/io/druid/indexer/HadoopDruidIndexerConfigBuilder.java
	pom.xml
	server/src/main/java/io/druid/segment/realtime/RealtimeManager.java
	server/src/main/java/io/druid/segment/realtime/firehose/EventReceiverFirehoseFactory.java
2014-05-06 14:32:51 -07:00
fjy 79e6d4eb56 Merge pull request #528 from metamx/union-query-source
Union query source
2014-05-06 14:13:33 -06:00
Gian Merlino bdf9e74a3b Allow config-based overriding of hadoop job properties. 2014-05-06 09:11:31 -07:00
nishantmonu51 728a606d32 Add support for union queries 2014-05-01 01:54:52 +05:30
fjy 171d20d52d Merge branch 'move-firehose' of github.com:metamx/druid into move-firehose 2014-04-25 14:13:19 -07:00
fjy eef034ca7e Merge branch 'master' of github.com:metamx/druid into move-firehose 2014-04-25 14:13:08 -07:00
fjy 76e0a48527 Merge branch 'master' into new-schema
Conflicts:
	indexing-hadoop/src/main/java/io/druid/indexer/DbUpdaterJob.java
	indexing-hadoop/src/test/java/io/druid/indexer/HadoopDruidIndexerConfigTest.java
	indexing-service/src/main/java/io/druid/indexing/common/task/HadoopIndexTask.java
	server/src/main/java/io/druid/segment/realtime/plumber/RealtimePlumber.java
	server/src/main/java/io/druid/segment/realtime/plumber/RealtimePlumberSchool.java
2014-04-25 14:03:28 -07:00
fjy 174ff66a88 Merge pull request #504 from metamx/shard-count-limit-fix
Fix shard count limit
2014-04-23 10:12:05 -06:00
nishantmonu51 6eb282b3a8 Fix shard count limit
Fix shard count limit at another place,
Use partitioned as a flag
2014-04-23 13:56:06 +05:30
Gian Merlino 8d650ae131 PullDependencies changes.
- Pull default Hadoop version by default.
- Allow additional coordinates on the command line.
2014-04-22 14:33:53 -07:00
fjy c00fb1d08e Merge pull request #445 from metamx/hadoop-version-update
Hadoop version update
2014-04-17 19:16:44 -06:00
fjy 9b06b6aa58 misc fixes for router 2014-04-15 14:23:54 -07:00
fjy 9bfc738032 I'm deeply saddened by another case sensitivity problem and want to resolve these once and for all 2014-04-14 16:20:31 -07:00
nishantmonu51 0c95c0b689 moved file 2014-04-09 14:25:31 +05:30
fjy 1843316db6 commonalize event receiver firehose 2014-04-03 20:46:36 -07:00
Xavier Léauté e0ff2aa0d6 make isPostgreSQL check URI instead of metadata 2014-03-27 12:13:10 -07:00
nishantmonu51 a24fc84d0d Merge branch 'master' into hadoop-version-update 2014-03-27 03:32:11 +05:30
nishantmonu51 a9a6682a0e Upgrade to Hadoop 2.3.0 2014-03-26 11:43:31 +05:30
fjy 4360c050eb fix broken ut 2014-03-25 11:13:52 -07:00
fjy 771aa2ae68 backwards compat 2014-03-25 09:23:48 -07:00
fjy 5ebb2d27e3 fix hadoop 2014-03-24 18:43:31 -07:00
fjy 2adcf07f5f Merge branch 'master' into new-schema
Conflicts:
	indexing-hadoop/src/main/java/io/druid/indexer/DetermineHashedPartitionsJob.java
	indexing-service/src/main/java/io/druid/indexing/common/task/RealtimeIndexTask.java
	indexing-service/src/test/java/io/druid/indexing/common/task/TaskSerdeTest.java
	processing/src/test/java/io/druid/segment/TestIndex.java
	server/src/main/java/io/druid/segment/realtime/RealtimeManager.java
	server/src/main/java/io/druid/segment/realtime/plumber/RealtimePlumberSchool.java
2014-03-17 10:59:31 -07:00
nishantmonu51 04f3d0a13a fix cli hadoop indexer
* run determine partitions job using CLI
2014-03-11 16:36:19 +05:30
fjy b5181a8a89 Merge pull request #422 from tucksaun/feat-postgresql-support
Fixed segments SQL queries for PostgreSQL compatibility
2014-03-07 13:42:57 -07:00
Yuval Oren 4b63645802 Merge branch 'master' into subquery
Conflicts:
	processing/src/main/java/io/druid/query/BaseQuery.java
	processing/src/main/java/io/druid/query/groupby/GroupByQueryQueryToolChest.java
	processing/src/test/java/io/druid/query/QueryRunnerTestHelper.java
	processing/src/test/java/io/druid/query/topn/TopNQueryRunnerTest.java
2014-03-06 13:33:59 -08:00
Tugdual Saunier ae38c92491 Fixed segments SQL queries for PostgreSQL compatibility 2014-03-06 19:14:32 +00:00
Gian Merlino 70db460f97 Blocking Executors and maxPendingPersists, oh my!
- Execs.newBlockingSingleThreaded can now accept capacity = 0.
- Changed default maxPendingPersists from 2 to 0.
- Fixed serde of maxPendingPersists in RealtimeIndexTasks.
2014-03-05 10:55:12 -08:00
fjy 5db00afb37 clean up and default values 2014-03-04 14:38:27 -08:00
fjy c4c4d80336 make local testing pass 2014-03-03 14:52:43 -08:00
fjy 46b9ac78e7 Merge branch 'master' into new-schema
Conflicts:
	indexing-hadoop/src/test/java/io/druid/indexer/HadoopDruidIndexerConfigTest.java
	pom.xml
	publications/whitepaper/druid.pdf
	publications/whitepaper/druid.tex
2014-03-03 14:48:15 -08:00
fjy bf2ddda897 unit tests passing after more refactoring 2014-02-27 15:21:09 -08:00
Yuval Oren 314e38a2c6 Fixed realtime index query serving failure due to incorrect treatment of DataSource 2014-02-26 22:36:27 -08:00
fjy d57974d58b Merge pull request #407 from metamx/restore-task
Restore task
2014-02-26 19:18:58 -07:00
Tugdual Saunier e40725d5f3 Added support for PostgreSQL on overlord nodes 2014-02-26 01:24:37 +01:00
Xavier Léauté 7dbafa5453 fix task id 2014-02-25 13:49:39 -08:00
Xavier Léauté e2defe8bf1 update copyright date 2014-02-25 13:44:15 -08:00
Xavier Léauté 2f61035585 add restore task 2014-02-25 13:41:40 -08:00
fjy 5d2367f0fd unit tests pass at this point 2014-02-20 15:52:12 -08:00
fjy 20cac8c506 not compiling yet but close 2014-02-19 15:54:27 -08:00
fjy 4b7c76762d unit tests passing at this point, finished rt port maybe 2014-02-18 15:14:38 -08:00
fjy 3979eb270c Revert "Revert "Merge branch 'determine-partitions-improvements'""
This reverts commit 189b3e2b9b.
2014-02-14 12:58:56 -08:00
fjy a8c4362d72 rejiggering druid api 2014-02-14 12:57:52 -08:00
fjy 189b3e2b9b Revert "Merge branch 'determine-partitions-improvements'"
This reverts commit 7ad228ceb5, reversing
changes made to 9c55e2b779.
2014-02-14 12:47:34 -08:00
nishantmonu51 7ad228ceb5 Merge branch 'determine-partitions-improvements'
Conflicts:
	pom.xml
2014-02-12 10:51:26 +05:30
Gian Merlino b9556e2e2e SimpleResourceManagementStrategy: Fix log/return 2014-02-07 10:02:45 -08:00
Gian Merlino 5ec634e498 SimpleResourceManagementStrategy: Scale up to minWorkerCount when increased 2014-02-06 13:20:09 -08:00
fjy 0f6af72ea4 Merge branch 'master' into new-schema 2014-02-06 12:46:13 -08:00
fjy af48273369 fix bug with dynamic configs in coordinator not working 2014-02-06 11:19:05 -08:00
nishantmonu51 bacc72415f correct locking and partitionsSpec 2014-02-05 03:17:47 +05:30
fjy 14d0e54327 first commit 2014-02-03 14:15:03 -08:00
Gian Merlino 994c7e3fa8 RemoteTaskActionClient: Retry on ChannelExceptions too 2014-02-03 08:16:51 -08:00
fjy 019be5c3b0 update jquery 2014-01-31 12:37:37 -08:00
nishantmonu51 97e5d68635 determine intervals working with determine partitions 2014-01-31 19:04:52 +05:30
fjy 0c789412bb add a workaround for jackson bug where jacksoninject fails when a null value is passed through json creator annotated constructor 2014-01-25 07:07:27 +08:00
fjy 2ff86984da fix broken ut 2014-01-21 10:47:45 -08:00
fjy ebc66df27d use terminateWithIds in terminate 2014-01-20 16:58:53 -08:00
fjy 1d81ad2946 remove unused class 2014-01-20 16:45:54 -08:00
fjy 1ecc9d0f98 fix the edge case where autoscaling tries to terminate node without ip 2014-01-20 16:44:19 -08:00
nishantmonu51 3fb72aff93 rename maxPendingPersistBatches to maxPendingPersists 2014-01-17 12:59:21 +05:30
nishantmonu51 fb819abd6f make maxPendingPersistPeriod configurable 2014-01-17 11:01:55 +05:30
Gian Merlino 1331f2ce56 TaskStorage.add() now throws TaskExistsException, and the servlet respects it
The servlet will throw 400 rather than 500 when a task already exists, to
signify that the request has no hope of ever working.
2014-01-13 15:42:05 -08:00
Gian Merlino a72c4429f7 RemoteTaskRunner: Fix NPE on cleanup due to missing withWorker 2014-01-13 15:41:04 -08:00
fjy f3476f40e1 fix typo 2014-01-10 18:08:33 -08:00
fjy f4e3f02c3b more exceptions 2014-01-10 18:06:42 -08:00
fjy 1ecc94cfb6 another attempt at index task 2014-01-10 17:56:22 -08:00
fjy f0b4d0c1e4 fix small bug with unusable dims 2014-01-10 14:59:09 -08:00
fjy fe50104053 fix the index task and more docs 2014-01-10 14:47:18 -08:00
nishantmonu51 da01c4a78a Add registration for backward compatibility 2014-01-10 02:02:06 +05:30
nishantmonu51 d28f9daccb Remove duplicate registration of service
If the serviceName does not contain ":" this leads to duplicate
registration of firehose with same name
2014-01-10 01:17:41 +05:30
Gian Merlino 9037141c00 IndexTask: Better logging at the end of each segment 2014-01-08 15:22:12 -08:00
Gian Merlino 2c53af4d66 ForkingTaskRunner: Upload task logs even when job fails 2014-01-08 14:46:18 -08:00
Gian Merlino 7f430d9fde RealtimeIndexTask: If a Throwable was thrown it is not a normalExit 2014-01-08 14:45:35 -08:00
Gian Merlino 83b4641e31 ForkingTaskRunnerConfig: Add java.io.tmpdir to allowedPrefixes 2014-01-07 16:12:24 -08:00
Gian Merlino bf158102c4 IndexTask: Print metrics even if finishJob fails 2014-01-07 07:17:19 -08:00
Gian Merlino 26991b5a2a Indexing service: Fix termination related log message 2013-12-20 12:05:42 -08:00
Gian Merlino 4d83837e88 RealtimeIndexTask: Clean up imports and comments 2013-12-20 11:37:16 -08:00
Gian Merlino 17ad4ee2f0 Fix RemoteTaskRunnerTest 2013-12-20 11:23:28 -08:00
Gian Merlino e5b8546d19 Autoscaling fixes.
- Initial targetWorkerCount must be subject to pool size limits
- Use consistent workerSetupData for the entire autoscaling run
- Don't call terminate() when we have nothing to terminate
- Terminate obsolete workers even faster
2013-12-20 11:17:01 -08:00
fjy 3ec2766cd3 Merge pull request #339 from metamx/autoscaling
Autoscaling: Move target count independent of actual count.
2013-12-20 10:04:26 -08:00
Gian Merlino 6224577ed1 Autoscaling: Terminate obsolete workers faster 2013-12-20 10:01:32 -08:00
Gian Merlino 4a722c0a6d Autoscaling changes from code review.
- Log and return immediately when workerSetupData is null
- Allow provisioning more nodes while other nodes are still provisioning
- Add tests for bumping up the minimum version
2013-12-20 08:59:35 -08:00
Gian Merlino 0ee6136ea3 NoopTask: Fix things that should be static. Add simple factory method. 2013-12-20 08:56:17 -08:00
Gian Merlino 3dd9a25546 Fix import 2013-12-19 16:18:16 -08:00
Gian Merlino 0ff7f0e8e0 TaskActionToolbox: Combine adjacent ifs 2013-12-19 16:16:34 -08:00
Gian Merlino f86342f7dc DbTaskStorage: Protect against invalid lock_payload 2013-12-19 16:16:20 -08:00
Gian Merlino 1f4b99634f Autoscaling: Move target count independent of actual count.
This should let us grow and shrink the worker pool in chunks when necessary
(like when a bunch of them go offline, or when there is a worker version
change).
2013-12-19 16:11:30 -08:00
Gian Merlino 846c3da4ab Empty task intervals, and empty lock intervals, aren't useful.
So prevent them from being created, through checks in AbstractFixedIntervalTask
and TaskLockbox.tryLock.
2013-12-19 13:21:41 -08:00
Gian Merlino 566a3a6112 Indexing service: Break up segment actions
Each one now operates on at most a collection of segments that comprise
a single partition. The main purpose of this change is to prevent audit log
payload sizes from getting out of control.
2013-12-19 13:10:40 -08:00
Gian Merlino 6fbe67eeea IndexerDBCoordinator: Work around SELECT -> INSERT races when adding segments 2013-12-19 13:10:40 -08:00
Gian Merlino 1ff855d744 Fix MoveTask serde and ArchiveTask id creation 2013-12-18 15:17:12 -08:00
Gian Merlino 58d1262edf Indexing console: Clarify "Complete" with "recently completed" 2013-12-17 08:16:49 -08:00
Xavier Léauté e333776aca rename SegmentMoveAction to SegmentMetadataUpdateAction 2013-12-16 14:00:56 -08:00
Xavier Léauté ac2ca0e46c separate move and archive tasks 2013-12-16 14:00:55 -08:00
Xavier Léauté 123bddd615 update for new interfaces 2013-12-16 13:59:16 -08:00
Xavier Léauté 4a291fdf30 better naming 2013-12-16 13:59:16 -08:00
Xavier Léauté a417cd5df2 add archive task 2013-12-16 13:59:15 -08:00
fjy 87b83bceb1 fix task storage config serde and prepare for next release 2013-12-13 16:55:22 -08:00
fjy 01f9c1df31 fix broken task storage config and prepare for next release 2013-12-13 16:45:32 -08:00
Gian Merlino 600dc7546f Configurability of recency threshold 2013-12-13 16:02:54 -08:00
fjy 4a8140be81 better messaging to console again 2013-12-13 15:04:25 -08:00
fjy 52cdb20f10 add better messaging and error handling 2013-12-13 15:01:07 -08:00
Gian Merlino e63c69dd57 TaskStorage: Return recently complete tasks in reverse chronological order 2013-12-13 12:27:45 -08:00
Gian Merlino 6c993d87bf Indexing service API and GUI improvements!
- New APIs: waitingTasks, completeTasks, task payload
- GUI for the above, and for task logs + status
2013-12-13 11:38:18 -08:00
Gian Merlino f36a5b677c TaskLifecycleTest: Add test for noop task 2013-12-13 07:48:28 -08:00
Gian Merlino 3b053a66ff TaskLifecycleTest: Add test for never-ready task 2013-12-13 07:48:27 -08:00
Gian Merlino 863012c384 TaskQueue: Exception during isReady does not warrant an alert. 2013-12-13 07:48:27 -08:00
Gian Merlino 6227963af9 TaskQueue: Copy task list before management loop. 2013-12-13 07:48:27 -08:00
Gian Merlino 70c153592f CliPeon: Fix local mode 2013-12-12 14:22:57 -08:00
Gian Merlino 370e2f855a TaskSerdeTest: Fix IndexTask test by including an actual firehoseFactory 2013-12-12 13:58:44 -08:00
Gian Merlino 169f149cf9 TaskLifecycleTest: Fix broken setUp and broken assumptions. 2013-12-12 13:51:13 -08:00
Gian Merlino ba757b1e5a IndexTask: Actually make and publish segments for the correct intervals. 2013-12-12 13:50:53 -08:00
Gian Merlino be25d51a2c RemoteTaskRunner: Fix issues leading to failing tests 2013-12-12 13:49:49 -08:00
Gian Merlino c60158a21a RemoteTaskRunner: Remove task from pendingTaskPayloads on shutdown if needed 2013-12-12 10:59:16 -08:00
Gian Merlino 0129ea99cf RemoteTaskRunner changes to make bootstrapping actually work.
- Workers are not added to zkWorkers until caches have been initialized.
- Worker status we haven't heard about will be added to runningTasks or
  completeTasks as appropriate. 
- TaskRunnerWorkItem now only needs a taskId, not the entire Task. This makes
  it possible to create them from TaskStatus objects, if that's all we have.
- Also remove some dead code.
2013-12-12 10:44:46 -08:00
Gian Merlino d92b88718c OverlordResource: Fix comment 2013-12-12 08:46:24 -08:00
Gian Merlino b6a52610bc IndexTask: Call plumber.startJob() 2013-12-12 08:46:10 -08:00
Gian Merlino db9b515e71 IndexTask: Remove unnecessary args to determinePartitions. 2013-12-12 08:46:00 -08:00
Gian Merlino f4a09d4ee3 TaskAction: Add JsonSubType for LockTryAcquireAction 2013-12-12 08:45:23 -08:00
Gian Merlino b17dc6f744 Task interval, isReady hygiene 2013-12-11 22:42:20 -08:00
Gian Merlino 05e24bd85c RemoteTaskRunner: Fix typo 2013-12-11 22:38:04 -08:00
Gian Merlino bed263efa5 VersionConverterTask: Less goofy import for Preconditions 2013-12-11 22:37:55 -08:00
Gian Merlino 53d90efe30 TaskQueueConfig: Copyright header 2013-12-11 22:37:40 -08:00
Gian Merlino 0adda97776 AbstractFixedIntervalTask: Copyright header 2013-12-11 22:37:28 -08:00
Gian Merlino c4b8c8bc6f Rework indexing service internals to hopefully be more reliable.
The TaskQueue directly manages the TaskRunner. The main management loop runs
periodically and checks that the runner is doing reasonable things. If not, it
attempts to adjust the runner. The management loop also runs on-demand when a
task is added to keep task assignment relatively low latency. The TaskConsumer
is no longer necessary and so it no longer exists.

Task interval locks are handled differently. Instead of some tasks acquiring
locks at runtime and some tasks having implicit fixed lock intervals, all tasks
ask for locks explicitly. This occurs either in "isReady" (which runs on the
overlord) or in "run" (which runs on the peon).

Other changes:
- The TaskQueue is attached to the leader lifecycle, instead of global
- The TaskLockbox is able to sync itself from storage and is no longer
  bootstrapped by the TaskQueue.
- RemoteTaskRunner does not clean up zk paths until asked to. This will
  prevent deletion of statuses that have not yet been committed.
- Added retries on DbTaskStorage operations.
- Removed SpawnTasksAction (no more subtasks)
- Removed obsolete EventReceiverFirehose configs
- Removed obsolete OldOverlordResource
- Removed TaskStorageQueryAdapter methods related to subtasks
2013-12-11 15:05:16 -08:00
fjy 96f679f31c clean up for merge 2013-12-10 17:51:13 -08:00
Gian Merlino f3cfd1d781 Introduce FileTaskLogs, and move TaskLogs module from server to indexing-service 2013-12-10 17:39:43 -08:00
Gian Merlino 47c1c8cab2 TaskStorage: Rename getRunningTasks -> getActiveTasks 2013-12-10 17:39:42 -08:00
fjy 303f6ff334 fix worker config setup problems 2013-12-09 18:25:29 -08:00
nishantmonu51 2186bd6cb7 Minor fixes and documentation changes 2013-12-09 19:07:48 +05:30
fjy 5d7173ac98 fix autoscaling termination duration bug 2013-11-15 19:21:25 -08:00
fjy 346cf0e04c fix out of order in urls for hadoop classpath 2013-11-14 18:18:39 -08:00
fjy 6b41681424 fix ordering of urls, ugh, need to write more tests 2013-11-14 15:06:07 -08:00
fjy 23ebca6d32 more hadoop dependency hell, getting the right urls to hadoop is hard 2013-11-14 14:29:42 -08:00
fjy c51eed060f actually use a class loader pull out dependencies from extension modules 2013-11-14 11:52:21 -08:00
fjy fcf0c6ce06 change the classpath ordering for batch processing and prepare for next release 2013-11-14 10:52:51 -08:00
fjy 64b93bf448 fix broken autoscaling and prepare for next release 2013-11-08 11:29:42 -08:00
fjy a049b42674 fix an issue with task tables not getting created automatically and prepare for next release 2013-11-07 18:01:35 -08:00
fjy 7f85a126ac fix broken event receiving firehose 2013-11-07 16:59:01 -08:00
fjy 621133d6f2 removing dead code 2013-11-07 16:53:34 -08:00
fjy 084c90aa19 cleanup to prepare for next release 2013-11-07 15:55:51 -08:00
fjy bad1a7e9f8 fix according to code review 2013-11-07 15:52:34 -08:00
fjy aeb411a3a3 fix according to code review and fix broken examples 2013-11-07 15:42:48 -08:00
fjy e2e10fae1f clean up code 2013-11-07 15:03:14 -08:00
fjy 913ff3a082 clean up code 2013-11-07 15:00:11 -08:00
fjy 6b573c76f1 more fixes 2013-11-07 14:40:45 -08:00
Gian Merlino 8660db93fc RemoteTaskRunner: Run taskComplete after a task times out 2013-11-04 10:56:26 -08:00
Gian Merlino 186bbd1cb6 WorkerTaskMonitor: Add log message 2013-11-04 10:25:16 -08:00
Gian Merlino 781673a8f8 ForkingTaskRunner: Fix pass-down of nodeType 2013-10-23 14:13:05 -07:00
Gian Merlino b68b3526e8 IndexingServiceFirehoseModule: Add header 2013-10-18 13:27:40 -07:00
Gian Merlino 2e9c46867f Fixes for indexing service.
- Create IndexingServiceFirehoseModule so firehoses can be loaded by all mains
- Fix implicit lock acquisition in AbstractTask
2013-10-18 11:14:33 -07:00
fjy 4862852b43 more docs about how to use different versions of hadoop in druid 2013-10-16 17:54:49 -07:00
fjy 6192602893 fix extensions config not getting picked up in hadoop index task 2013-10-16 16:52:23 -07:00
fjy a1c09df17f make the hadoop index task work again 2013-10-16 09:45:17 -07:00
fjy 9796a40b92 port docs over to 0.6 and a bunch of misc fixes 2013-10-11 18:38:53 -07:00
fjy 4e509d1d09 Merge branch 'master' into is-docs 2013-10-09 14:05:10 -07:00
cheddar c47fe202c7 Fix HadoopDruidIndexer to work with the new way of things
There are multiple and sundry changes in here.

First, "HadoopDruidIndexer" has been split into two pieces, (1) CliHadoop which pulls the hadoop version and builds up the right classpath with the proper hadoop version to run the indexer and (2) CliInternalHadoopIndexer which actually runs the indexer.

In order to work around a bunch of jets3t version conflicts with Hadoop and Druid, I needed to extract the S3 deep storage stuff into its own module.  I then also moved the HDFS stuff into its own module so that I could eliminate the dependency on Hadoop for druid-server.

In doing these changes, I wanted to make the extensions buildable with only the druid-api jar, so a few other things had to move out of Druid and into druid-api.  They are all API-level things, however, so they really belong in druid-api instead.

Lastly, I removed the druid-realtime module and put it all in druid-server.
2013-10-09 15:15:44 -05:00
fjy 4ec4b8e024 rewrite indexing service docs 2013-10-08 16:34:58 -07:00
fjy 703b674800 add availability zone info to autoscaling 2013-10-07 12:16:50 -07:00
fjy ac330f72bb first set of changes to standardize the naming convention we use in druid 2013-10-03 16:36:48 -07:00
fjy 17874eeb67 make the CliPeon actually able to run on its own 2013-10-02 15:55:10 -07:00
fjy bc8db7daa5 1) make chat handler resource work again
2) add more default configs
3) make examples work again
2013-10-02 14:22:39 -07:00
Gian Merlino 384dcda7e4 Chat handlers still don't work, but, they're closer maybe. 2013-10-01 17:45:53 -07:00
fjy 59f2d0711d Merge branch 'master' into local-index 2013-10-01 13:21:43 -07:00
fjy 30df53671e remove line iterator factory because it is not needed 2013-10-01 13:21:20 -07:00
Gian Merlino 62eda5020c ShardSpec: Remove isInChunk(Map<String, String>) 2013-10-01 12:50:08 -07:00
fjy 5d0d71250b fix chat handler resources not correctly registering themselves 2013-10-01 11:25:39 -07:00
fjy 53698a135a add interface to new firehose as per code review comments 2013-09-30 18:00:59 -07:00
fjy f55a5199b1 add a firehose module to remove so much copy and pasted code 2013-09-30 16:29:20 -07:00
fjy ed9e0cf9f6 add a local firehose for indexing local files 2013-09-30 16:03:26 -07:00
Gian Merlino dc5dab8747 Fixes for property conversion, firehose registration, and the indexing service 2013-09-27 17:09:59 -07:00
fjy 0b04325ee8 fix things up according to code review comments 2013-09-27 10:17:45 -07:00
fjy e404295c1f make indexing service work 2013-09-26 17:44:21 -07:00
fjy 8bc56daa66 fix things up according to code review comments 2013-09-26 11:35:45 -07:00
fjy 87259321b6 port hadoop druid indexer to new guice framework 2013-09-26 11:04:42 -07:00
fjy 15843c3978 refactor how server service discovery is done 2013-09-24 10:36:26 -07:00
fjy dc8a119787 fix broken unit tests as a result of the last merge 2013-09-23 12:56:01 -07:00
cheddar 5712b29c8c Fix issues with bindings and handling extensions
The way the Guice bindings were setup previously, each process only had bindings
for the things it cared about.  This became problematic when adding extension modules
that bound everything that they could possibly need expecting that the processes would
only instantiate what they actually do need.  Guice tries to fail-fast and verifies that all
 bindings exist before it does anything, which is a problem because the extensions bind
 some objects that don't necessarily have all of their dependencies bound in all processes.

The fix for this is to build a single Injector with all bindings in it and let each of the
 processes only load the things that they care about.  This also requires the use of
 Module overrides and other such interesting things, which are now done.

 In doing the fix, I also swapped out the way that the DataSegmentPusher/Puller stuff is bound, as well as made the Cassandra stuff fail if its settings are not provided.  This all of a sudden made all of the things require Cassandra's settings, so I migrated the Cassandra deep storage stuff into its own module.

 In doing these changes, I also discovered that some properties weren't properly converting for the ConvertProperties command (specifically, the properties related to data segment loading and pushing), so I fixed that.
2013-09-20 17:45:01 -05:00
fjy cabae7993d port over multi threaded realtime and also fix broken realtime nodes that can't start up 2013-09-16 16:03:47 -07:00
fjy f7c10e3594 rework tests in indexing service to be more unit testy 2013-09-12 16:37:58 -07:00
cheddar a2dcc45a8e 1) Remove SingleSegmentLoader and replace with OmniSegmentLoader 2013-09-12 11:47:03 -05:00
cheddar 6c9a107356 1) remove duplicate package initialization.initialization 2013-09-09 17:02:57 -05:00
cheddar 3c39f90c89 1) Move Firehose interface and dependencies to druid-api
2) Move DataSegment* interfaces and dependencies to druid-api
2013-08-31 16:43:28 -05:00
cheddar 5ab671050e No more com.metamx.druid, it is now all io.druid! 2013-08-30 19:42:12 -05:00
cheddar bd0756e360 More stuff moved, things still compiling and tests still passing. Yay! 2013-08-30 18:58:35 -05:00
cheddar 56e2b956d0 OMG!!! A lot of stuff has been moved. Modules have been created and destroyed, but everything is compiling and unit tests are passing, OMFG this is awesome.! 2013-08-30 18:21:04 -05:00
cheddar cb90ed05b0 Revert the previous commit. After going down this path, I realized that extracting things enough to allow Queries to be extended without depending on Druid proper was going to lead down a very nasty path. So, I've decided against that. Extending queries will require a tight dependency on Druid proper. 2013-08-29 16:45:03 -05:00
cheddar 2a46086e20 1) Didn't remove the io.druid files from client. Remove those and make sure things compile
2) Switch DefaultObjectMapper to CommonObjectMapper
3) Create new DefaultObjectMapper in client that has Query stuff registered on it by default
2013-08-29 15:25:36 -05:00
cheddar 9c30ced5ea 1) Move various "api" classes to io.druid packages and make sure things compile and stuff 2013-08-28 15:51:02 -05:00
cheddar ee1e73cfa1 1) Make it compile again after the merge 2013-08-27 14:36:01 -05:00
cheddar 8097450d8c Some things that didn't get committed with the merge for some reason!? 2013-08-27 14:29:03 -05:00
cheddar 5fa944dd26 Merge branch 'master' into guice
Conflicts:
	client/src/main/java/com/metamx/druid/coordination/BatchDataSegmentAnnouncer.java
	client/src/main/java/com/metamx/druid/curator/announcement/Announcer.java
	client/src/main/java/com/metamx/druid/query/filter/SelectorDimFilter.java
	client/src/main/java/com/metamx/druid/query/search/SearchQueryQueryToolChest.java
	indexing-service/src/main/java/com/metamx/druid/indexing/common/tasklogs/S3TaskLogs.java
	indexing-service/src/main/java/com/metamx/druid/indexing/coordinator/ForkingTaskRunner.java
	indexing-service/src/main/java/com/metamx/druid/indexing/coordinator/RemoteTaskRunner.java
	indexing-service/src/main/java/com/metamx/druid/indexing/worker/WorkerCuratorCoordinator.java
	indexing-service/src/test/java/com/metamx/druid/indexing/coordinator/RemoteTaskRunnerTest.java
	pom.xml
	server/src/main/java/com/metamx/druid/http/MasterMain.java
	server/src/main/java/com/metamx/druid/http/MasterServletModule.java
	server/src/main/java/com/metamx/druid/master/DruidMasterConfig.java
	server/src/test/java/com/metamx/druid/master/DruidMasterTest.java
	server/src/test/java/com/metamx/druid/query/group/GroupByQueryRunnerTest.java
2013-08-27 14:27:32 -05:00
cheddar 3617ac17fc 1) Eliminate ExecutorMain and have it run using the new Main! 2013-08-27 14:11:05 -05:00
cheddar 269997dc94 1) ExecutorNode is working, except for the running of the task. Need to adjust it to be able to run a task and then everything will be wonderful 2013-08-26 18:08:41 -05:00
cheddar 6636ef1ea8 Remove unused files again 2013-08-23 18:00:56 -05:00
cheddar 55dbda2046 1) Worker appears to be running! It's also now known as the MiddleManager 2013-08-23 17:59:48 -05:00
cheddar 613ebd54b5 1) Delete unused things 2013-08-23 14:32:14 -05:00
cheddar b897c2cb22 1) IndexCoordinator appears to work as the CliOverlord now, yay! 2013-08-23 14:11:34 -05:00
fjy d92ab8bb58 more logs for RTR 2013-08-21 21:47:59 -07:00
fjy 54f00479cc add explicit null check for moving tasks from pending to running 2013-08-21 13:02:35 -07:00
fjy 88661b26a0 bug fix for RTR removing workers race condition and partition chunks not being sorted by chunk number 2013-08-21 11:14:54 -07:00
fjy e283de6831 fix another bug with RTR to remove things correctly from running tasks 2013-08-20 19:34:30 -07:00
fjy d02be16245 fix RTR closing PCC too early 2013-08-20 19:25:16 -07:00
Gian Merlino 70ab225770 Add missing license headers 2013-08-20 17:50:10 -07:00
Gian Merlino 455645e723 Workers announce TaskAnnouncement rather than TaskStatus 2013-08-20 16:14:36 -07:00
Gian Merlino 9609314765 ForkingTaskRunner: Make TaskInfo into ForkingTaskRunnerWorkItem
This allows the API/GUI to return reasonable results when the primary
task runner is a ForkingTaskRunner.
2013-08-20 14:04:28 -07:00
Gian Merlino 4e8325f963 Better tests and error messages for TaskResource 2013-08-20 14:01:38 -07:00
Gian Merlino b102d67173 Fix getResourceManagementScheduler for non-autoscaling configs 2013-08-20 13:39:05 -07:00
Gian Merlino 25e330780c Simplify AbstractTask constructor 2013-08-20 13:38:52 -07:00
Gian Merlino d8493f8e26 RealtimeIndexTask: Fix "resource" serde 2013-08-20 13:02:52 -07:00
fjy 5c600c7012 change error msg to alert 2013-08-15 13:27:41 -07:00
fjy 4e7dac18b9 fix condition where status would be returned even if worker not running task 2013-08-15 13:12:57 -07:00
fjy 1fb6107a37 fix the case where RTR does not clean up a completed task on startup 2013-08-15 13:09:02 -07:00
Gian Merlino 8d7a4f4493 Retries for S3TaskLogs, S3DataSegmentPusher 2013-08-12 14:27:34 -07:00
fjy df883a9823 learn to type 2013-08-06 21:18:36 -07:00
fjy 795657aedf fix bug where workers with same capacity would not be unique 2013-08-06 21:04:13 -07:00
fjy ebf1ac47f0 Merge branch 'master' of github.com:metamx/druid 2013-08-06 15:38:25 -07:00
fjy 9d0e4a94f0 alert when task fails in RTR assign 2013-08-06 15:38:17 -07:00
cheddar 4a64ce37ed Finish the merging, wtf IntelliJ? 2013-08-06 13:34:24 -07:00
cheddar eee1efdcb5 Merge branch 'master' into guice
Conflicts:
	client/src/main/java/com/metamx/druid/client/DruidServerConfig.java
	indexing-service/src/main/java/com/metamx/druid/indexing/common/index/ChatHandlerProvider.java
	indexing-service/src/main/java/com/metamx/druid/indexing/coordinator/TaskMasterLifecycle.java
	indexing-service/src/main/java/com/metamx/druid/indexing/worker/executor/ExecutorNode.java
	indexing-service/src/test/java/com/metamx/druid/indexing/coordinator/TaskLifecycleTest.java
2013-08-06 13:33:31 -07:00
Gian Merlino a1904c9b3b ChatHandlerResource: Fix Guice type errors 2013-08-05 19:56:03 -07:00
fjy d1b2a5a4b3 fix indexer console serde of running tasks 2013-08-05 18:22:12 -07:00
fjy 479f0cefca fix bug with RTR not assigning tasks when a new worker is available 2013-08-05 17:57:59 -07:00
fjy 626cf14a6e fix bug where the curator config name was changed in one place but not another; make some info msgs into debug msgs; fix zkworker serialization 2013-08-05 16:02:26 -07:00
fjy 66c658305f Merge branch 'master' of github.com:metamx/druid 2013-08-05 14:44:09 -07:00
fjy 35f89d7232 make RTR idempotent to multiple run requests for same task, because higher level things in the indexing service require this behaviour 2013-08-05 14:44:01 -07:00
Gian Merlino efd34f3a8b TaskRunner: Fix comment 2013-08-05 14:20:31 -07:00
cheddar 2361e0112a Make it all compile again... 2013-08-02 10:14:46 -07:00
fjy c33f2f06ff fix logic of how to assign tasks to workers 2013-08-02 09:01:02 -07:00
fjy 584ccac833 move scanning of workers and tasks into RTR start, simplify bootstrap, make tests better 2013-08-01 17:50:05 -07:00
cheddar 9e78bb38f5 Merge branch 'master' into guice
Conflicts:
	client/src/main/java/com/metamx/druid/QueryableNode.java
	client/src/main/java/com/metamx/druid/client/ServerInventoryView.java
	client/src/main/java/com/metamx/druid/coordination/SingleDataSegmentAnnouncer.java
	client/src/main/java/com/metamx/druid/initialization/CuratorDiscoveryConfig.java
	client/src/main/java/com/metamx/druid/query/MetricsEmittingExecutorService.java
	indexing-hadoop/src/test/java/com/metamx/druid/indexer/HadoopDruidIndexerConfigTest.java
	indexing-service/src/main/java/com/metamx/druid/indexing/common/TaskToolbox.java
	indexing-service/src/main/java/com/metamx/druid/indexing/coordinator/http/IndexerCoordinatorNode.java
	indexing-service/src/main/java/com/metamx/druid/indexing/worker/executor/ExecutorNode.java
	indexing-service/src/main/java/com/metamx/druid/indexing/worker/http/WorkerNode.java
	pom.xml
	server/src/main/java/com/metamx/druid/coordination/ServerManager.java
	server/src/main/java/com/metamx/druid/coordination/ZkCoordinator.java
	server/src/main/java/com/metamx/druid/db/DatabaseRuleManager.java
	server/src/main/java/com/metamx/druid/db/DatabaseSegmentManager.java
	server/src/main/java/com/metamx/druid/http/ComputeNode.java
	server/src/main/java/com/metamx/druid/http/MasterMain.java
	server/src/main/java/com/metamx/druid/loading/SegmentLoaderConfig.java
	server/src/main/java/com/metamx/druid/loading/SingleSegmentLoader.java
	server/src/main/java/com/metamx/druid/master/DruidMaster.java
2013-08-01 16:42:47 -07:00
cheddar 019bb5d453 1) Another whole bunch of changes to annotate things and create Modules and bind stuff. But OMFG, the compute node actually appears to be working!
2) The compute node works with Guice
3) The compute node fires up with Guice and appears to work
4) Did I mention that the compute node, now called historical node, fires up with Guice and appears to work?
2013-08-01 15:28:08 -07:00
fjy a4edc2221d fix RTR comments 2013-07-31 15:28:52 -07:00
fjy 215d147a69 Merge branch 'worker-resource' of github.com:metamx/druid into worker-resource 2013-07-31 15:23:49 -07:00
fjy e2b5cd6067 Merge branch 'master' of github.com:metamx/druid into worker-resource
Conflicts:
	indexing-service/src/main/java/com/metamx/druid/indexing/worker/executor/ExecutorNode.java
2013-07-31 15:23:13 -07:00
Gian Merlino eaddce06d5 Call TaskRunner.bootstrap immediately after starting it 2013-07-30 15:26:11 -07:00
fjy 50836798fa toggle between compressed and non compressed service discovery 2013-07-29 15:40:45 -07:00
fjy ad65c8111d fix logs 2013-07-29 11:41:42 -07:00