druid

Commit Graph

Author	SHA1	Message	Date
Roman Leventov	7b56cec3b9	Fix resource leaks (#3702)	2016-11-18 21:21:36 +05:30
Gian Merlino	bcd20441be	Make buildV9Directly the default. (#3688)	2016-11-14 09:29:32 -08:00
Roman Leventov	fbbb55f867	Update emitter dependency to 0.4.0 and emit "version" dimension for all druid metrics (#3679) * Update emitter dependency to 0.4.0 and emit "version" dimension for all druid metrics, not only query metrics * Remove unused imports * Use empty string instead of "testing-version" as a version placeholder	2016-11-11 17:17:27 -06:00
Himanshu	b76b3f8d85	reset-cluster command to clean up druid state stored on metadata and deep storage (#3670)	2016-11-09 11:07:01 -06:00
Akash Dwivedi	4b3bd8bd63	Migrating java-util from Metamarkets. (#3585) * Migrating java-util from Metamarkets. * checkstyle and updated license on java-util files. * Removed unused imports from whole project. * cherry pick metamx/java-util@826021f. * Copyright changes on java-util pom, address review comments.	2016-10-21 14:57:07 -07:00
Parag Jain	1e79a1be82	fix useExplicitVersion (#3559)	2016-10-10 14:28:06 -05:00
Akash Dwivedi	078de4fcf9	Use explicit version from HadoopIngestionSpec. (#3554)	2016-10-07 13:59:14 -07:00
Parag Jain	e419407eba	handle supervisor spec metadata failures (#3456) close kafka consumer in case supervisor start fails	2016-10-04 10:15:28 -07:00
Gian Merlino	40f2fe7893	Bump versions to 0.9.3-SNAPSHOT (#3524)	2016-09-29 13:53:32 -07:00
David Lim	ca9114b41b	add supervisor reset API (#3484) * add supervisor reset API * CR doc changes and kill running tasks / clear offsets from supervisor	2016-09-22 17:51:06 -07:00
Gian Merlino	27bd5cb13a	Add forceExtendableShardSpecs option to Hadoop indexing, IndexTask. (#3473) Fixes #3241.	2016-09-21 13:40:04 -06:00
Gian Merlino	7a2a4bc6de	JavaScript: Disable now affects worker selection and router strategy too. (#3458)	2016-09-13 16:37:42 -07:00
Dave Li	c4e8440c22	Adds long compression methods (#3148) * add read * update deprecated guava calls * add write and vsizeserde * add benchmark * separate encoding and compression * add header and reformat * update doc * address PR comment * fix buffer order * generate benchmark files * separate encoding strategy and format * fix benchmark * modify supplier write to channel * add float NONE handling * address PR comment * address PR comment 2	2016-08-30 16:17:46 -07:00
Nishant	4c2b8d29d3	Make RTR assign pending tasks by insertion order (#3405)	2016-08-30 12:22:44 -07:00
Gian Merlino	2f46effc8e	FileTaskLogsTest: Throw unthrown exception. (#3352)	2016-08-11 09:40:28 -07:00
Himanshu	03cfcf002b	fix the race described in #3174 (#3205)	2016-08-10 11:29:50 -07:00
kaijianding	50d52a24fc	ability to not rollup at index time, make pre aggregation an option (#3020) * ability to not rollup at index time, make pre aggregation an option * rename getRowIndexForRollup to getPriorIndex * fix doc misspelling * test query using no-rollup indexes * fix benchmark fail due to jmh bug	2016-08-02 11:13:05 -07:00
David Lim	d5ed3f1347	change expected response from ACCEPTED to OK (#3280)	2016-07-23 19:48:30 -07:00
Gian Merlino	06624c40c0	Share query handling between Appenderator and RealtimePlumber. (#3248) Fixes inconsistent metric handling between the two implementations. Formerly, RealtimePlumber only emitted query/segmentAndCache/time and query/wait and Appenderator only emitted query/partial/time and query/wait (all per sink). Now they both do the same thing: - query/segmentAndCache/time, query/segment/time are the time spent per sink. - query/cpu/time is the CPU time spent per query. - query/wait/time is the executor waiting time per sink. These generally match historical metrics, except segmentAndCache & segment mean the same thing here, because one Sink may be partially cached and partially uncached and we aren't splitting that out.	2016-07-19 22:15:13 -05:00
Hyukjin Kwon	55e7a52475	Replace deprecated usage for StringInputRowParser and JSONParseSpec (#3215)	2016-07-14 09:19:17 -07:00
Gian Merlino	ea03906fcf	Configurable compressRunOnSerialization for Roaring bitmaps. (#3228) Defaults to true, which is a change in behavior (this used to be false and unconfigurable).	2016-07-08 10:24:19 +05:30
Xavier Léauté	485e381387	remove datasource from hadoop output path (#3196) fixes #2083, follow-up to #1702	2016-06-29 08:53:45 -07:00
Hyukjin Kwon	45f553fc28	Replace the deprecated usage of NoneShardSpec (#3166)	2016-06-25 10:27:25 -07:00
Charles Allen	6be18376c0	Make forking task runner have more informative thread names during the long-blocking part (#3172) * Make forking task runner have more informative thread names during the long-blocking part * Make string.format do the work	2016-06-24 08:56:01 -07:00
Gian Merlino	ebf890fe79	Update master version to 0.9.2-SNAPSHOT. (#3133)	2016-06-13 13:10:38 -07:00
David Lim	5a3db634ff	add synchronization to SupervisorManager (#3077)	2016-06-07 00:29:23 -06:00
David Lim	a2290a8f05	support seamless config changes (#3051)	2016-06-03 13:50:19 -07:00
Charles Allen	474286bbce	Make TaskMaster giant lock fair (#3050)	2016-06-02 12:10:40 -07:00
David Lim	3ef24c03b3	Validate X-Druid-Task-Id header in request/response and support retrying on outdated TaskLocation information, add KafkaIndexTaskClient unit tests (#3006) * validate X-Druid-Task-Id header in request and add header to response * modify KafkaIndexTaskClient to take a TaskLocationProvider as the TaskLocation may not remain constant	2016-05-25 22:05:18 -07:00
Charles Allen	15ccf451f9	Move QueryGranularity static fields to QueryGranularities (#2980) * Move QueryGranularity static fields to QueryGranularityUtil * Fixes #2979 * Add test showing #2979 * change name to QueryGranularities	2016-05-17 16:23:48 -07:00
Charles Allen	eaaad01de7	[QTL] Datasource as lookupTier (#2955) * Datasource as lookup tier * Adds an option to let indexing service tasks pull their lookup tier from the datasource they are working for. * Fix bad docs for lookups lookupTier * Add Datasource name holder * Move task and datasource to be pulled from Task file * Make LookupModule pull from bound dataSource * Fix test * Fix code style on imports * Fix formatting * Make naming better * Address code comments about naming	2016-05-17 15:44:42 -07:00
David Lim	b489f63698	Supervisor for KafkaIndexTask (#2656) * supervisor for kafka indexing tasks * cr changes	2016-05-04 23:13:13 -07:00
Gian Merlino	f8ddfb9a4b	Split SegmentInsertAction and SegmentTransactionalInsertAction for backwards compat. (#2922) Fixes #2912.	2016-05-04 13:54:34 -07:00
Himanshu	50065c8288	fix spurious failure of RTR concurrency test (#2915)	2016-05-04 10:30:20 -07:00
Charles Allen	3f71a4a302	Fix missing log arguments in PendingTaskBasedWorkerResourceManagementStrategy (#2898)	2016-04-28 18:15:41 -07:00
Parag Jain	0d745ee120	Basic authorization support in Druid (#2424) - Introduce `AuthorizationInfo` interface, specific implementations of which would be provided by extensions - If the `druid.auth.enabled` is set to `true` then the `isAuthorized` method of `AuthorizationInfo` will be called to perform authorization checks - `AuthorizationInfo` object will be created in the servlet filters of specific extension and will be passed as a request attribute with attribute name as `AuthConfig.DRUID_AUTH_TOKEN` - As per the scope of this PR, all resources that needs to be secured are divided into 3 types - `DATASOURCE`, `CONFIG` and `STATE`. For any type of resource, possible actions are - `READ` or `WRITE` - Specific ResourceFilters are used to perform auth checks for all endpoints that corresponds to a specific resource type. This prevents duplication of logic and need to inject HttpServletRequest inside each endpoint. For example - `DatasourceResourceFilter` is used for endpoints where the datasource information is present after "datasources" segment in the request Path such as `/druid/coordinator/v1/datasources/`, `/druid/coordinator/v1/metadata/datasources/`, `/druid/v2/datasources/` - `RulesResourceFilter` is used where the datasource information is present after "rules" segment in the request Path such as `/druid/coordinator/v1/rules/` - `TaskResourceFilter` is used for endpoints is used where the datasource information is present after "task" segment in the request Path such as `druid/indexer/v1/task` - `ConfigResourceFilter` is used for endpoints like `/druid/coordinator/v1/config`, `/druid/indexer/v1/worker`, `/druid/worker/v1` etc - `StateResourceFilter` is used for endpoints like `/druid/broker/v1/loadstatus`, `/druid/coordinator/v1/leader`, `/druid/coordinator/v1/loadqueue`, `/druid/coordinator/v1/rules` etc - For endpoints where a list of resources is returned like `/druid/coordinator/v1/datasources`, `/druid/indexer/v1/completeTasks` etc. the list is filtered to return only the resources to which the requested user has access. In these cases, `HttpServletRequest` instance needs to be injected in the endpoint method. Note - JAX-RS specification provides an interface called `SecurityContext`. However, we did not use this but provided our own interface `AuthorizationInfo` mainly because it provides more flexibility. For example, `SecurityContext` has a method called `isUserInRole(String role)` which would be used for auth checks and if used then the mapping of what roles can access what resource needs to be modeled inside Druid either using some convention or some other means which is not very flexible as Druid has dynamic resources like datasources. Fixes #2355 with PR #2424	2016-04-28 16:50:28 -07:00
Himanshu	9669e79df2	fix misleading error log due to race in RTR and concurrency test (#2878)	2016-04-28 10:28:00 -07:00
Nishant	c29cb7d711	add pending task based resource management strategy (#2086)	2016-04-27 10:40:53 -07:00
Nishant	bf5e5e7b75	fix #2886 (#2887) Fixes https://github.com/druid-io/druid/issues/2886	2016-04-27 08:29:41 -07:00
David Lim	7641f2628f	add control and status endpoints to KafkaIndexTask (#2730)	2016-04-21 15:34:59 -07:00
Nishant	dbf63f738f	Add ability to filter segments for specific dataSources on broker without creating tiers (#2848) * Add back FilteredServerView removed in `a32906c7fd` to reduce memory usage using watched tiers. * Add functionality to specify "druid.broker.segment.watchedDataSources"	2016-04-19 10:10:06 -07:00
Gian Merlino	08c784fbf6	KafkaIndexTask: Use a separate sequence per Kafka partition in order to make (#2844) segment creation deterministic. This means that each segment will contain data from just one Kafka partition. So, users will probably not want to have a super high number of Kafka partitions... Fixes #2703.	2016-04-18 22:29:52 -07:00
jon-wei	0e481d6f93	Allow filters to use extraction functions	2016-04-05 13:24:56 -07:00
Fangjin Yang	1e02eeab13	Merge pull request #2683 from metamx/default_retry Better defaults for Retry policy for task actions	2016-03-29 08:02:59 -07:00
Gian Merlino	195c9c5240	Overlord: Avoid a scary Jersey warning. Avoids the following message from being printed on Overlord startup: WARNING: Parameter 1 of type io.druid.indexing.common.actions.TaskActionHolder<T> from public <T> javax.ws.rs.core.Response io.druid.indexing.overlord.http.OverlordResource.doAction (io.druid.indexing.common.actions.TaskActionHolder<T>) is not resolvable to a concrete type	2016-03-28 19:08:56 -07:00
Fangjin Yang	c2284929dc	Merge pull request #2739 from gianm/fix-wtmtest-failure Fix handling of InterruptedException in WorkerTaskMonitor's mainLoop.	2016-03-28 14:52:10 -07:00
Gian Merlino	ee4bb96855	Fix handling of InterruptedException in WorkerTaskMonitor's mainLoop. I believe this will fix #2664.	2016-03-25 12:17:33 -07:00
Himanshu Gupta	004b00bb96	config to explicitly specify classpath for hadoop container during hadoop ingestion	2016-03-25 10:51:28 -05:00
Himanshu	00d7021291	Merge pull request #2607 from jon-wei/dim_schema Support use of DimensionSchema class in DimensionsSpec	2016-03-22 11:53:46 -05:00
Himanshu	3220b109ad	Merge pull request #2570 from binlijin/single_dimension_partitioning Single dimension hash-based partitioning	2016-03-22 11:51:06 -05:00
binlijin	bce600f5d5	Single dimension hash-based partitioning	2016-03-22 13:15:33 +08:00
jon-wei	a59c9ee1b1	Support use of DimensionSchema class in DimensionsSpec	2016-03-21 13:12:04 -07:00
Nishant	ed8f39fcfe	Better defaults for Retry policy for task actions This PR changes the retry of task actions to be a bit more aggressive by reducing the maxWait. Current defaults were 1 min to 10 mins, which lead to a very delayed recovery in case there are any transient network issues between the overlord and the peons. doc changes.	2016-03-18 11:59:55 -07:00
Charles Allen	c716af5b04	Merge pull request #2678 from metamx/fixImports Fix some google related imports	2016-03-17 11:53:16 -07:00
Charles Allen	a52c6d3bee	Fix some google related imports	2016-03-17 11:03:29 -07:00
Gian Merlino	738dcd8cd9	Update version to 0.9.1-SNAPSHOT. Fixes #2462	2016-03-17 10:34:20 -07:00
Nishant	9cceff2274	Use ImmutableWorkerInfo instead of ZKWorker review comments add test for equals and hashcode	2016-03-14 11:17:15 -07:00
Himanshu	d51a0a0cf4	Merge pull request #2220 from gianm/appenderator-kafka Appenderators, DataSource metadata, KafkaIndexTask	2016-03-14 13:14:36 -05:00
Nishant	cf7f6da392	Merge pull request #2634 from gianm/stopGracefully-avoid-interrupt ThreadPoolTaskRunner: Make graceful shutdown logs less scary.	2016-03-11 16:36:10 -08:00
Charles Allen	a3f0048ea4	Merge pull request #2631 from gianm/plumbers-rpe Better logging for ParseExceptions on index aggregation, and remove unnecessary exception handling.	2016-03-11 14:22:58 -08:00
Gian Merlino	79a95f7789	WorkerTaskMonitor: stop() waits for mainLoop to exit. Fixes #2637.	2016-03-11 11:40:13 -08:00
Gian Merlino	05397a9b4f	ThreadPoolTaskRunner: Make graceful shutdown logs less scary. - It's okay to suppress InterruptedException during graceful shutdown, as tasks may use it to accelerate their own shutdown. - It's okay to ignore return statuses during graceful shutdown (which may be FAILED!) because it actually doesn't matter what they are.	2016-03-11 07:49:29 -08:00
Gian Merlino	187569e702	DataSource metadata. Geared towards supporting transactional inserts of new segments. This involves an interface "DataSourceMetadata" that allows combining of partially specified metadata (useful for partitioned ingestion). DataSource metadata is stored in a new "dataSource" table.	2016-03-10 17:41:50 -08:00
Gian Merlino	3d2214377d	Appenderatoring. Appenderators are a way of getting more control over the ingestion process than a Plumber allows. The idea is that existing Plumbers could be implemented using Appenderators, but you could also implement things that Plumbers can't do. FiniteAppenderatorDrivers help simplify indexing a finite stream of data. Also: - Sink: Ability to consider itself "finished" vs "still writable". - Sink: Ability to return the number of rows contained within the sink.	2016-03-10 17:41:50 -08:00
Gian Merlino	08284fea62	Publish test-jar for indexing-service.	2016-03-10 16:50:37 -08:00
Gian Merlino	92c828f904	Make SegmentHandoffNotifier Closeable.	2016-03-10 16:50:37 -08:00
Gian Merlino	8a11161b20	Plumbers: Move plumber.add out of try/catch for ParseException. The incremental indexes handle that now so it's not necessary. Also, add debug logging and more detailed exceptions to the incremental indexes for the case where there are parse exceptions during aggregation.	2016-03-10 16:39:26 -08:00
Charles Allen	d299540efc	Make HadoopTask load hadoop dependency classes LAST for local isolated classrunner	2016-03-10 10:18:23 -08:00
Himanshu Gupta	0402636598	configurable handoffConditionTimeout in realtime tasks for segment handoff wait	2016-03-05 10:14:54 -06:00
Gian Merlino	e9c23bf376	OverlordResource: Use getZkWorkers on RemoteTaskRunner. Restores old behavior of this api, from before #2249 when getWorkers returned ZkWorkers.	2016-03-02 17:31:34 -08:00
Fangjin Yang	80d954578d	Merge pull request #2572 from gianm/fix-rit-taskresource Fix default TaskResource for RealtimeIndexTasks.	2016-03-02 10:20:27 -08:00
Gian Merlino	acd95d3e28	TaskLocation: Add toString method. Necessary because these objects are used in log messages.	2016-03-01 17:52:06 -08:00
Gian Merlino	a355bfb7a9	Fix default TaskResource for RealtimeIndexTasks. It was supposed to be the same as the task id, but it wasn't because "makeTaskId" has a random component.	2016-03-01 16:54:22 -08:00
Björn Zettergren	2462c82c0e	New defaults for maxRowsInMemory rowFlushBoundary To bring consistency to docs and source this commit changes the default values for maxRowsInMemory and rowFlushBoundary to 75000 after discussion in PR https://github.com/druid-io/druid/pull/2457. The previous default was 500000 and it's lower now on the grounds that it's better for a default to be somewhat less efficient, and work, than to reach for the stars and possibly result in "OutOfMemoryError: java heap space" errors.	2016-03-01 13:50:28 +01:00
Charles Allen	c6803c4364	Allow specifying peon javaOpts as an array	2016-02-26 13:24:35 -08:00
Himanshu Gupta	bc156effe7	RTR has multiple threads for assignment of pending tasks now.	2016-02-26 09:27:03 -06:00
Fangjin Yang	53a5f07c14	Merge pull request #2544 from metamx/fixMaxPort Limit PortFinder to 0xFFFF	2016-02-25 17:12:53 -08:00
Fangjin Yang	143e85eaa5	Merge pull request #2419 from gianm/task-hostports Plumb task peon host/ports back out to the overlord.	2016-02-25 17:11:53 -08:00
Charles Allen	3fa7a7ebfe	Limit PortFinder to 0xFFFF	2016-02-25 08:16:40 -08:00
Charles Allen	187b788089	UnRegister port in ForkingTaskRunner	2016-02-25 08:04:25 -08:00
Gian Merlino	cf0bc905fb	Plumb task peon host/ports back out to the overlord. - Add TaskLocation class - Add registerListener to TaskRunner - Add getLocation to TaskRunnerWorkItem - Implement location tracking in existing TaskRunners - Rework WorkerTaskMonitor to do management out of a single thread so it can handle status and location updates more simply.	2016-02-24 15:13:10 -08:00
Nishant	fb7eae34ed	Merge pull request #2249 from metamx/workerExpanded Use Worker instead of ZkWorker whenever possible	2016-02-24 13:23:22 +05:30
Charles Allen	ac13a5942a	Use Worker instead of ZkWorker whenever possible * Moves last run task state information to Worker * Makes WorkerTaskRunner a TaskRunner which has interfaces to help with getting information about a Worker	2016-02-23 15:02:03 -08:00
Gian Merlino	3534483433	Better handling of ParseExceptions. Two changes: - Allow IncrementalIndex to suppress ParseExceptions on "aggregate". - Add "reportParseExceptions" option to realtime tuning configs. By default this is "false". Behavior of the counters should now be: - processed: Number of rows indexed, including rows where some fields could be parsed and some could not. - thrownAway: Number of rows thrown away due to rejection policy. - unparseable: Number of rows thrown away due to being completely unparseable (no fields salvageable at all). If "reportParseExceptions" is true then "unparseable" will always be zero (because a parse error would cause an exception to be thrown). In addition, "processed" will only include fully parseable rows (because even partial parse failures will cause exceptions to be thrown). Fixes #2510.	2016-02-23 10:11:43 -08:00
Bingkun Guo	499288ff4b	Merge pull request #2509 from metamx/hadoopIsolatorTest Add hadoop classloader isolation tests for HadoopTask	2016-02-19 14:23:22 -06:00
Fangjin Yang	a3c29b91cc	Merge pull request #2505 from gianm/rt-exceptions Harmonize realtime indexing loop across the task and standalone nodes.	2016-02-19 11:23:14 -08:00
Charles Allen	9dff0e5dbd	Add hadoop classloader isolation tests for HadoopTask	2016-02-19 11:15:53 -08:00
Fangjin Yang	ddf913d626	Merge pull request #2508 from gianm/ftr-shutdown-logging ForkingTaskRunner: Better logging during orderly shutdown.	2016-02-19 10:02:24 -08:00
Gian Merlino	c0c6cf77fa	ForkingTaskRunner: Better logging during orderly shutdown.	2016-02-19 09:17:16 -08:00
Gian Merlino	243ac5399b	Harmonize realtime indexing loop across the task and standalone nodes. - Both now catch ParseExceptions on plumber.add (see https://groups.google.com/d/topic/druid-user/wmiRDvx2RvM/discussion) - Standalone now treats IndexSizeExceededException as fatal (previously only the task did)	2016-02-19 07:34:15 -08:00
Charles Allen	87752be740	Make HadoopTasks's classloader a single one	2016-02-18 20:58:09 -08:00
Andrés Gomez	07d714b1b5	Fixed equal distribution strategy when exist disable middleManager with same currCapacityUsed.	2016-02-17 08:40:42 +01:00
Himanshu	5779b32742	Merge pull request #2439 from metamx/fix2435 Make QuotableWhiteSpaceSplitter able to take JSON	2016-02-11 13:14:43 -06:00
Charles Allen	40ade32a1f	Fix dependencies. * Don't put druid***selfcontained.jar at the end of the hadoop isolated classpath Add `<scope>provided</scope>` to prevent repeated dependency inclusion in the extension directories	2016-02-11 07:30:14 -08:00
Charles Allen	3a6452c6d4	Make QuotableWhiteSpaceSplitter able to take json * Fixes #2435	2016-02-10 16:42:14 -08:00
Xavier Léauté	91f23583f5	Merge pull request #2436 from gianm/mm-less-suppressey Harmonize znode writing code in RTR and Worker.	2016-02-10 16:11:30 -08:00
Gian Merlino	fa92b77f5a	Harmonize znode writing code in RTR and Worker. - Throw most exceptions rather than suppressing them, which should help detect problems. Continue suppressing exceptions that make sense to suppress. - Handle payload length checks consistently, and improve error message. - Remove unused WorkerCuratorCoordinator.announceTaskAnnouncement method. - Max znode length should be int, not long. - Add tests.	2016-02-10 14:52:00 -08:00
Charles Allen	2bde8b1d68	Make hadoop classpath isolation more explicit * Fixes #2428	2016-02-10 12:09:17 -08:00
Charles Allen	a0728fa854	Allow ScalingStats to be null * Fixes #2378	2016-02-02 18:01:01 -08:00
Parag Jain	7853a9cc41	clean up TaskLifecycleTest	2016-01-31 11:19:20 -06:00
Gian Merlino	5fd4b79373	RealtimeIndexTask: Fix NPE caused by calling stopGracefully before a firehose had been connected.	2016-01-29 11:20:23 -08:00
Gian Merlino	c4fde52160	Fix 'graceful shutdown aborted' log message in ThreadPoolTaskRunner.	2016-01-29 11:07:17 -08:00
Nishant	dcb7830330	Merge pull request #984 from drcrallen/thread-priority-rebase Use thread priorities. (aka set `nice` values for background-like tasks)	2016-01-21 15:02:34 +05:30
Charles Allen	66e74b1a63	Minor field name change in RemoteTaskRunnerFactory to be more descriptive * Addresses https://github.com/druid-io/druid/pull/2309#discussion_r50335081	2016-01-20 17:43:20 -08:00
Charles Allen	3152d08844	Fix overlord scheduled executor injection * Fixes https://github.com/druid-io/druid/issues/2308	2016-01-20 14:16:14 -08:00
Charles Allen	2e1d6aaf3d	Use thread priorities. (aka set `nice` values for background-like tasks) * Defaults the thread priority to java.util.Thread.NORM_PRIORITY in io.druid.indexing.common.task.AbstractTask * Each exec service has its own Task Factory which is assigned a priority for spawned task. Therefore each priority class has a unique exec service * Added priority to tasks as taskPriority in the task context. <0 means low, 0 means take default, >0 means high. It is up to any particular implementation to determine how to handle these numbers * Add options to ForkingTaskRunner * Add "-XX:+UseThreadPriorities" default option * Add "-XX:ThreadPriorityPolicy=42" default option * AbstractTask - Removed unneeded @JsonIgnore on priority * Added priority to RealtimePlumber executors. All sub-executors (non query runners) get Thread.MIN_PRIORITY * Add persistThreadPriority and mergeThreadPriority to realtime tuning config	2016-01-20 14:00:31 -08:00
Nishant	ac6c90e657	Merge pull request #1953 from metamx/taskRunnerResourceManagement Move resource management to be the responsibility of the TaskRunner	2016-01-20 22:27:47 +05:30
Jonathan Wei	df2906a91c	Merge pull request #2290 from gianm/index-merger-v9-stuff Respect buildV9Directly in PlumberSchools, so it works on standalone realtime.	2016-01-19 13:04:00 -08:00
Fangjin Yang	0c31f007fc	Merge pull request #1728 from himanshug/aggregators_in_segment_metadata Store AggregatorFactory[] in segment metadata	2016-01-19 12:55:49 -08:00
Himanshu	fe841fd961	Merge pull request #2118 from guobingkun/fix_segment_loading Fix loading segment for historical	2016-01-19 14:25:48 -06:00
Himanshu Gupta	a99aef29a1	adding aggregators to segment metadata	2016-01-19 14:23:39 -06:00
Gian Merlino	1dcf22edb7	Respect buildV9Directly in PlumberSchools, so it works on standalone realtime nodes. Also parameterize some tests to run with/without buildV9Directly: - IndexGeneratorJobTest - RealtimeIndexTaskTest - RealtimePlumberSchoolTest	2016-01-19 12:15:06 -08:00
Bingkun Guo	c4ad50f92c	Fix loading segment for historical Historical will drop a segment that shouldn't be dropped in the following scenario: Historical node tried to load segmentA, but failed with SegmentLoadingException, then ZkCoordinator called removeSegment(segmentA, blah) to schedule a runnable that would drop segmentA by deleting its files. Now, before that runnable executed, another LOAD request was sent to this historical, this time historical actually succeeded on loading segmentA and announced it. But later on, the scheduled drop-of-segment runnable started executing and removed the segment files, while historical is still announcing segmentA.	2016-01-19 10:29:49 -06:00
Himanshu Gupta	164b0aad7a	removing Map<String,Object> segmentMetadata from methods in Index[Maker/Merger] and using Metadata class instead of a Map to store segment metadata	2016-01-18 22:03:46 -06:00
Kurt Young	82ff98c2bf	add config for build v9 directly and update docs	2016-01-16 11:26:34 +08:00
Charles Allen	976d4c965b	Move resource management to be the responsibility of the TaskRunner	2016-01-13 10:38:22 -08:00
Himanshu	82bdfbbbf1	Merge pull request #2155 from metamx/taskConfigTmpdir Make TaskConfig pull from java.io.tmpdir	2016-01-05 13:58:39 -06:00
Nishant	45f402f22f	increase timeout tune timeouts	2016-01-05 19:06:04 +05:30
Charles Allen	e18301d99c	Make TaskConfig pull from java.io.tmpdir * Also makes paths built off of java.nio.file.Paths instead of String.format	2016-01-04 10:17:08 -08:00
fjy	b5c094d951	Fixes #2180	2016-01-01 16:56:41 -08:00
Nishant	b68265399c	Merge pull request #2168 from druid-io/remove-indexmaker Remove IndexMaker	2015-12-30 12:24:29 +05:30
Nishant	df893dbaf8	Merge pull request #2141 from gianm/fix-restoring-realtime Fix some problems with restoring	2015-12-30 10:44:45 +05:30
Fangjin Yang	7ffa706655	Merge pull request #2152 from metamx/add-taskId Add taskId to realtimeMetrics	2015-12-29 10:33:40 -08:00
fjy	38b0f1fbc2	fix transient failures in unit tests	2015-12-28 20:03:30 -08:00
fjy	faf421726b	remove IndexMaker	2015-12-28 14:19:02 -08:00
Fangjin Yang	8cb52bddd8	Merge pull request #2140 from navis/fix-sporadic-testfail4 Fix sporadic fail of RemoteTaskRunnerTest#testWorkerRemoved	2015-12-27 14:55:49 -08:00
Fangjin Yang	9aa62e4631	Merge pull request #2154 from navis/fix-testfail-WorkerTaskMonitorTest Fix sporadic fail of WorkerTaskMonitorTest#testRunTask	2015-12-23 20:52:33 -08:00
navis.ryu	a8f6c6110d	Fix sporadic fail of WorkerTaskMonitorTest#testRunTask	2015-12-24 02:31:30 +09:00
navis.ryu	2c3c4a3f8f	Another try to fix xxServerViewTests	2015-12-24 02:13:40 +09:00
Nishant	978a3fd8ae	Add taskId to realtimeMetrics Add task Id to Realtime Metrics	2015-12-23 18:05:25 +05:30
Gian Merlino	32edd1538d	RealtimeIndexTask: Fix a couple of problems with restoring. - Shedding locks at startup is bad, we actually want to keep them. Stop doing that. - stopGracefully now interrupts the run thread if had started running finishJob. This avoids waiting for handoff unnecessarily.	2015-12-22 16:04:47 -08:00
Gian Merlino	f4ce2b9bc5	TaskLockbox: Consider active tasks active even if they have no locks.	2015-12-22 16:04:16 -08:00
Gian Merlino	bad270b6c4	druid.indexer.task.restoreTasksOnRestart configuration.	2015-12-22 10:59:15 -08:00
navis.ryu	8a179fc273	Fix sporadic fail of RemoteTaskRunnerTest#testWorkerRemoved	2015-12-22 14:33:37 +09:00
Himanshu Gupta	5e178499e8	trying to fix transient errors in testRealtimeIndexTask() by increasing overall timeout and unlimited wait for segment publish	2015-12-21 00:11:20 -06:00
Fangjin Yang	14229ba0f2	Merge pull request #1922 from metamx/jsonIgnoresFinalFields Change DefaultObjectMapper to NOT overwrite final fields unless explicitly asked to	2015-12-18 15:38:32 -08:00
Bingkun Guo	1e5aa2f3ac	fix getType() and Json serialization in ClientMergeQuery and add serde tests	2015-12-15 12:08:43 -06:00
Nishant	a32906c7fd	Remove FilteredServerView	2015-12-09 01:54:12 +05:30
Nishant	9491e8de3b	Remove ServerView from RealtimeIndexTasks and use coordinator http endpoint for handoffs - fixes #1970 - extracted out segment handoff callbacks in SegmentHandoffNotifier which is responsible for tracking segment handoffs and doing callbacks when handoff is complete. - Coordinator now maintains a view of segments in the cluster, this will affect the JVM heap requirements for the overlord for large clusters. realtime index task and nodes now use HTTP end points exposed by the coordinator to get serverView review comment fix realtime node guice injection review comments make test not rely on scheduled exec fix compilation fix import review comment introduce immutableSegmentLoadInfo fix json reading remove unnecessary logging	2015-12-09 01:54:09 +05:30
Himanshu Gupta	62ba9ade37	unifying license header in all java files	2015-12-05 22:16:23 -06:00
Gian Merlino	20544d409b	Merge pull request #1988 from himanshug/multi-interval-batch-delta support multiple intervals in dataSource inputSpec	2015-12-04 09:07:52 -08:00
Gian Merlino	020a5e7081	Merge pull request #2024 from metamx/fairBigTaskQueueLock Make the TaskQueue big lock fair	2015-12-03 19:32:53 -08:00
Himanshu Gupta	61aaa09012	support multiple intervals in dataSource input spec	2015-12-03 21:28:04 -06:00
Himanshu Gupta	86f0a36e83	support multiple intervals in SegmentListUsedAction	2015-12-03 21:28:04 -06:00
Himanshu Gupta	221fb95d07	add support for getting used segments for multiple interval in IndexerMetadataStorageCoordinator	2015-12-03 21:28:04 -06:00
Charles Allen	dbaaa6af92	Make the TaskQueue big lock fair	2015-12-01 19:13:07 -08:00
Nishant	1eb8211346	Add datasource and taskId to metrics emitted by peons This PR adds the datasource and taskId to the jvm and sys metrics emitted by the peons. fix spelling review comment review comment	2015-12-01 23:20:59 +05:30
Fangjin Yang	8e83d800d6	Merge pull request #1881 from gianm/restartable-tasks Restorable indexing tasks	2015-11-23 21:14:37 -08:00
Gian Merlino	501dcb43fa	Some changes that make it possible to restart tasks on the same hardware. This is done by killing and respawning the jvms rather than reconnecting to existing jvms, for a couple reasons. One is that it lets you restore tasks after server reboots too, and another is that it lets you upgrade all the software on a box at once by just restarting everything. The main changes are, 1) Add "canRestore" and "stopGracefully" methods to Tasks that say if a task can stop gracefully, and actually do a graceful stop. RealtimeIndexTask is the only one that currently implements this. 2) Add "stop" method to TaskRunners that attempts to do an orderly shutdown. ThreadPoolTaskRunner- call stopGracefully on restorable tasks, wait for exit ForkingTaskRunner- close output stream to restorable tasks, wait for exit RemoteTaskRunner- do nothing special, we actually don't want to shutdown 3) Add "restore" method to TaskRunners that attempts to bootstrap tasks from last run. Only ForkingTaskRunner does anything here. It maintains a "restore.json" file with a list of restorable tasks. 4) Have the CliPeon's ExecutorLifecycle lock the task base directory to avoid a restored task and a zombie old task from stomping on each other.	2015-11-23 11:22:08 -08:00
Gian Merlino	666d785787	Switch TaskActions from Optionals to nullable. Deserialization of Optionals does not work quite right- they come back as actual nulls, rather than absent Optionals. So these probably only ever worked for the local task action client.	2015-11-20 09:14:07 -08:00
Fangjin Yang	21c84b5ff7	Merge pull request #1896 from gianm/allocate-segment SegmentAllocateAction (fixes #1515)	2015-11-18 21:05:46 -08:00
Fangjin Yang	e52c156066	Merge pull request #1880 from gianm/rtr-adjust RTR: Ensure that there is only one cleanup task scheduled for a worker at once.	2015-11-18 15:12:55 -08:00
Charles Allen	8fcf2403e3	Merge pull request #1943 from metamx/realtime-caching Enable caching on intermediate realtime persists	2015-11-17 15:06:43 -08:00
Charles Allen	dbe201aeed	Merge pull request #1929 from pjain1/jetty_threads separate ingestion and query thread pool	2015-11-17 12:14:25 -08:00
Parag Jain	6c498b7d4a	separate ingestion and query thread pool	2015-11-17 13:42:41 -06:00
Xavier Léauté	d7eb2f717e	enable query caching on intermediate realtime persists	2015-11-17 10:58:00 -08:00
Charles Allen	46527a9610	Merge pull request #1954 from metamx/fix-stupid-aws-limit EC2 autoscaler: avoid hitting aws filter limits	2015-11-13 10:52:35 -08:00
Fangjin Yang	4f46d457f1	Merge pull request #1947 from noddi/feature/count-parameter-history-endpoints Add count parameter to history endpoints	2015-11-12 10:23:44 -08:00
Xavier Léauté	749ac12f88	EC2 autoscaler: avoid hitting aws filter limits	2015-11-11 20:28:06 -08:00
Fangjin Yang	465cbcf9a7	Merge pull request #1956 from metamx/remove-unused-imports Cleanup + remove unused imports	2015-11-11 17:36:47 -08:00
Gian Merlino	e4e5f0375b	SegmentAllocateAction (fixes #1515) This is a feature meant to allow realtime tasks to work without being told upfront what shardSpec they should use (so we can potentially publish a variable number of segments per interval). The idea is that there is a "pendingSegments" table in the metadata store that tracks allocated segments. Each one has a segment id (the same segment id we know and love) and is also part of a sequence. The sequences are an idea from @cheddar that offers a way of doing replication. If there are N tasks reading exactly the same data with exactly the same logic (think Kafka tasks reading a fixed range of offsets) then you can place them in the same sequence, and they will generate the same sequence of segments.	2015-11-11 16:54:35 -08:00
Bartosz Ługowski	6e5d2c6745	Add count parameter to history endpoints.	2015-11-11 23:03:57 +01:00
Xavier Léauté	fa6142e217	cleanup and remove unused imports	2015-11-11 12:25:21 -08:00
zhxiaog	c197a4cf32	fix #1918, add unit tests for RemoteTaskActionClient	2015-11-12 03:15:17 +08:00
Charles Allen	abae47850a	Add backwards compatability for PR #1922	2015-11-11 10:27:00 -08:00
Charles Allen	1df4baf489	Move Jackson Guice adapters into io.druid * Removes access to protected methods in com.fasterxml * Eliminates druid-common's use of foreign package com.fasterxml	2015-11-09 10:50:45 -08:00
Gian Merlino	fc55314d1c	ForkingTaskRunner: Log without buffering. In #933 the ForkingTaskRunner's logging was changed to buffered from unbuffered. This means that the last few KB of the logs are generally not visible while a task is running, which makes debugging running tasks difficult.	2015-11-07 15:16:53 -08:00
Charles Allen	929b981710	Change DefaultObjectMapper to NOT overwrite final fields unless explicitly asked to	2015-11-05 18:10:13 -08:00
Gian Merlino	cb409ee928	RemoteTaskActionClient: Fix statusCode check.	2015-11-05 10:03:49 -08:00
fjy	8f231fd3e3	cleanup druid codebase	2015-11-04 13:59:53 -08:00
Himanshu Gupta	84f7d8d264	making static final variables in HadoopDruidIndexerConfig upper case	2015-11-02 23:24:26 -06:00
Himanshu Gupta	8b67417ac8	make methods in Index[Merger,Maker,IO] non-static so that they can have appropriate ObjectMapper injected instead of creating one statically	2015-11-02 23:24:26 -06:00
Gian Merlino	16ae8866b8	Log and continue on failure to schedule cleanup for missing workers at startup.	2015-10-28 08:10:54 -07:00
Gian Merlino	513bc76252	RTR: Ensure that there is only one cleanup task scheduled for a worker at once. This is accomplished by making sure that scheduleTasksCleanupForWorker is only called from the PathChildrenCache event thread, having it cancel existing cleanup tasks when it adds a new one, and having tasks check on finish that the thing they are removing from the task list is actually themselves.	2015-10-27 21:16:58 -07:00
Fangjin Yang	ea2267e08c	Merge pull request #1868 from gianm/fix-announcements Historical and MiddleManager server announcements should not remove parents.	2015-10-27 14:50:05 -07:00
Gian Merlino	7df7370935	Merge pull request #1862 from metamx/indexingServiceMMGone Add timeout to shutdown request to middle manager for indexing service	2015-10-27 14:38:01 -07:00
Charles Allen	44a2b204df	Add timeout to shutdown request to middle manager for indexing service	2015-10-27 13:56:03 -07:00
Gian Merlino	4b92752deb	Historical and MiddleManager server announcements should not remove parents. Removing parent paths causes watchers of the "announcements" path to get stuck and stop seeing new updates.	2015-10-27 08:06:11 -07:00
Bingkun Guo	4914925d65	New extension loading mechanism 1) Remove maven client from downloading extensions at runtime. 2) Provide a way to load Druid extensions and hadoop dependencies through file system. 3) Refactor pull-deps so that it can download extensions into extension directories. 4) Add documents on how to use this new extension loading mechanism. 5) Change the way how Druid tarball is generated. Now all the extensions + hadoop-client 2.3.0 are packaged within the Druid tarball.	2015-10-21 14:22:36 -05:00
Himanshu	b7c68ec449	Merge pull request #1842 from metamx/DRUID-1841 Do not pass `druid.indexer.runner.javaOpts` to Peon as a property	2015-10-21 13:15:36 -05:00
Xavier Léauté	e4ac78e43d	bump next snapshot to 0.9.0	2015-10-20 13:46:13 -07:00
Charles Allen	532e1c9fd5	Do not pass `druid.indexer.runner.javaOpts` to Peon as a property * Still places `druid.indexer.runner.javaOpts` on the command line, but the Peon no longer tries to have the property `druid.indexer.runner.javaOpts` set * Fixes https://github.com/druid-io/druid/issues/1841	2015-10-20 09:24:01 -07:00
Xavier Léauté	4c2c7a2c37	update version to 0.8.3	2015-10-14 21:40:55 -07:00
Charles Allen	bf11723a52	Update usages of io.druid.client.selector.Server to build URL or URI directly instead of using String.format	2015-10-12 12:30:56 -07:00
Charles Allen	2d847ad654	Merge pull request #1730 from metamx/union-queries-fix fix #1727 - Union bySegment queries fix	2015-09-29 12:23:25 -07:00
Nishant	573aa96bd6	fix #1727 - Union bySegment queries fix Fixes #1727. revert to doing merging for results for union queries on broker. revert unrelated changes Add test for union query runner Add test remove unused imports fix imports fix renamed file fix test update docs.	2015-09-29 23:32:36 +05:30
Charles Allen	d2e400f063	Merge pull request #1740 from metamx/validate-locks fix #1715	2015-09-29 09:38:42 -07:00
Xavier Léauté	25bbc0b923	Merge pull request #1778 from gianm/redirect-fixes Redirect fixes	2015-09-25 09:54:48 -07:00
Gian Merlino	348172203f	OverlordRedirectInfo: Fix ability to detect that there is no leader.	2015-09-25 09:30:09 -07:00
Parag Jain	b630720164	fail task if finishjob throws any exception add realtime task failure test	2015-09-25 10:55:45 -05:00
Fangjin Yang	aa9d90355e	Merge pull request #1772 from gianm/fix-overlord-startup RemoteTaskRunner: Fix for starting an overlord before any workers ever existed.	2015-09-24 21:55:03 -07:00
Gian Merlino	63bf021077	RemoteTaskRunner: Fix for starting an overlord before any workers ever existed.	2015-09-24 21:15:36 -07:00
Himanshu Gupta	6e550d5346	update doc about aggregation field in merge task and a null check	2015-09-24 22:25:07 -05:00
Nishant	b638400acb	fix #1715 fixes #1715 - TaskLockBox has a set of active tasks - lock requests throws exception for if they are from a task not in active task set. - TaskQueue is responsible for updating the active task set on tasklockbox fix #1715 fixes #1715 - TaskLockBox has a set of active tasks - lock requests throws exception for if they are from a task not in active task set. - TaskQueue is responsible for updating the active task set on tasklockbox review comment remove duplicate line use ISE instead organise imports	2015-09-24 10:06:50 +05:30
Himanshu	61b0743943	Merge pull request #1748 from metamx/forkingJavaOptionsWithQuotes Allow ForkingTaskRunner javaOpts to have quoted arguments which contain spaces	2015-09-21 21:03:00 -05:00
Charles Allen	465035e531	Allow ForkingTaskRunner javaOpts to have quoted arguments which contain spaces	2015-09-21 17:32:27 -07:00
Fangjin Yang	e48f6dd660	Merge pull request #1736 from gianm/additional-ingest-segment-timeline-test IngestSegmentFirehostFactoryTimelineTest for overshadowing of the middle of a segment.	2015-09-17 14:42:29 -07:00
Gian Merlino	64e33b2bcb	IngestSegmentFirehostFactoryTimelineTest for overshadowing of the middle of a segment.	2015-09-16 10:17:43 -07:00
Himanshu Gupta	74f4572bd4	Lazily deserialize "parser" to InputRowParser in DataSchema so that user hadoop related InputRowParsers are created only when needed this allows overlord to accept a HadoopIndexTask with a hadoopy InputRowParser and not fail because hadoopy InputRowParser might need hadoop libraries	2015-09-16 10:58:13 -05:00
Charles Allen	f5ed6e885c	Merge pull request #1702 from himanshug/double_datasource_in_storage_dir do not have dataSource twice in path to segment storage on hdfs	2015-09-15 14:00:35 -07:00
Nishant	4681ff22ed	add task duration in response for completed tasks	2015-09-10 13:51:50 +05:30
Himanshu Gupta	fe0233adf2	removing unused imports from HadoopIndexTask	2015-09-09 11:12:01 -05:00
Nishant	47aac991ec	add null check for task context. make variable final	2015-09-04 22:19:01 +05:30
Fangjin Yang	75a582974b	Merge pull request #1639 from gianm/new-plumber New plumber	2015-09-03 18:52:57 -07:00
Gian Merlino	062a47fba4	Modify Plumbers in these ways, 1) Persist using Committer instead of Runnable. (Although the metadata object is ignored in this patch) 2) Remove the getSink method. 3) Plumbers are now responsible for time-based and hydrant-full-based periodic committing. (FireChief, RealtimeIndexTask, and IndexTask used to do this)	2015-09-03 11:13:06 -07:00
Nishant	726326abc3	Add Task Context and ability to override task specific properties override javaOpts fix compilation review comments Add Test for typecast review comments - remove unused method.	2015-09-03 23:36:32 +05:30
Gian Merlino	940e1aa3eb	Replace funky imports with standard ones. 1) Lots of Guava imports were not coming from the actual Guava 2) junit.framework.Assert should be org.junit.Assert	2015-08-28 18:02:05 -07:00
Gian Merlino	414a6fb477	Fix overlapping segments in IngestSegmentFirehose, DatasourceInputFormat. Fixes #1678. IngestSegmentFirehose (and its users) need to remember which windows of which segments should actually be read, based on a timeline.	2015-08-28 07:32:41 -07:00
Himanshu Gupta	2e0dd1d792	adding UTs and addressing review comments to firehoseV2 addition to Realtime[Manager|Plumber], essential segment metadata persist support, kafka-simple-consumer-firehose extension patch	2015-08-27 20:50:46 -05:00
lvjq	2237a8cf0f	kafka 8 simple consumer firehose	2015-08-27 20:50:46 -05:00
Nishant	b306739e9c	fix convert segment task 1) fix serde 2) fix wrong parameter being passed when creating subtask remove sysout	2015-08-27 11:34:41 +05:30
Charles Allen	e38cf54bc8	Migrate TestDerbyConnector to a JUnit @Rule	2015-08-26 21:47:40 -07:00
Xavier Léauté	fdb6a6651b	Merge pull request #1669 from metamx/upgrade-dependencies Upgrade dependencies	2015-08-25 21:30:22 -07:00
Xavier Léauté	5c19ffa98c	Merge pull request #1663 from gianm/segment-insert-constraints TaskActionToolbox: Remove allowOlderVersions, lift interval constraint	2015-08-25 18:11:46 -07:00
Xavier Léauté	51f6a9a2c9	update jackson to 2.6.1	2015-08-25 16:07:01 -07:00
Gian Merlino	33681525e3	TaskActionToolbox: Remove allowOlderVersions switch, lift interval constraint. allowOlderVersions has been stuck true for a while due to a bug (introduced in `566a3a61`), but I think it's actually OK this way. I think it's reasonable to expect tasks to choose versions in some way that makes sense, so long as they don't choose one larger than their taskLock version. This is still verified. The interval constraint was introduced to force tasks to break up their segment insert lists into manageable chunks. They are already doing this, and I think it's reasonable to expect them to do so without enforcement. Lifting these constraints paves the way for transactional insertion of segments that have varying versions and may be for varying intervals.	2015-08-25 14:17:38 -07:00
Paul Otto	2301b60365	Add ability to provide taskResource for IndexTask.	2015-08-24 17:38:31 -07:00
Xavier Léauté	3b2e41e42a	update for next release	2015-08-18 17:16:46 -07:00
Himanshu Gupta	15fa43dd43	changing DatasourcePathSpec, to get segment list, so that hadoop indexer uses overlord action to get list of segments and passes when running as an overlord task. and, uses metadata store directly when running as standalone hadoop indexer also, serialized list of segments is passed to DatasourcePathSpec so that hadoop classloader issues do not creep up	2015-08-16 14:07:35 -05:00
Himanshu Gupta	4d4aa8bfc6	refactor IngestSegmentFirehoseFactory so that IngestSegmentFirehose becomes reusable Conflicts: indexing-service/src/main/java/io/druid/indexing/firehose/IngestSegmentFirehoseFactory.java	2015-08-14 14:44:22 -05:00
Gian Merlino	bc0c7dd65d	Avoid the Hadoop objectMapper in the local IndexTask. Fixes #1545.	2015-08-11 10:40:53 -07:00
Charles Allen	1ddaa3fb33	Merge pull request #1592 from metamx/clean-test-files clean temporary files	2015-08-03 11:47:20 -07:00
Nishant	2679efee7a	clean temporary files	2015-08-03 23:32:58 +05:30
Fangjin Yang	6f65e6d3ef	Merge pull request #1547 from pjain1/improve_overlord_test add test to OverlordResourceTest	2015-07-28 07:35:48 -10:00
Parag Jain	2e1b617346	add more tests	2015-07-24 15:12:08 -05:00
Fangjin Yang	97242356b4	Merge pull request #1480 from guobingkun/kill_task_test Unit tests for KillTask and MetadataTaskStorage	2015-07-20 16:31:45 -07:00
Xavier Léauté	4cfb00bc8a	inrement version	2015-07-15 13:09:05 -07:00
Fangjin Yang	3f7ba58227	Merge pull request #1504 from metamx/fix-1447 fix for #1447	2015-07-14 08:50:08 -07:00
Himanshu	e2ddfb7a1a	Merge pull request #1511 from pjain1/remove_test remove flaky overlord test	2015-07-13 18:38:34 -05:00
Parag Jain	59dec89f6a	remove flaky overlord test	2015-07-13 15:32:12 -05:00
Himanshu	725086cc89	Merge pull request #1506 from gianm/realtime-plumber-nulls Consider null inputRows and parse errors as unparseable during realtime ingestion.	2015-07-13 10:12:12 -05:00
Gian Merlino	9068bcd062	Consider null inputRows and parse errors as unparseable during realtime ingestion. Also, harmonize exception handling between the RealtimeIndexTask and the RealtimeManager. Conditions other than null inputRows and parse errors bubble up in both.	2015-07-11 20:40:03 -07:00
Himanshu	cac722968e	Merge pull request #1503 from metamx/fix-leaking-zk-nodes Fix leaking Status Path nodes in ZK	2015-07-10 17:40:18 -05:00
Fangjin Yang	9f19e96658	Merge pull request #1477 from pjain1/overlord_test overlord and task master test	2015-07-10 14:27:14 -07:00
Parag Jain	55c4fe64f3	overlord and task master test	2015-07-10 16:17:45 -05:00
Nishant	5fe27fe4ad	fix for #1447 fixes #1447	2015-07-09 19:05:48 +05:30
Nishant	8d7a566bae	Fix leaking Status Path nodes in ZK - remove ZK status path nodes for workers after they are removed	2015-07-09 17:20:09 +05:30
Charles Allen	c0b60c0d2f	I'm not your mom, indexing-service/test... cleanup after yourself	2015-07-01 15:00:09 -07:00
Bingkun Guo	282a0f9760	Unit tests for KillTask and MetadataTaskStorage	2015-06-29 17:55:41 -05:00
Himanshu	b5b9ca1446	Merge pull request #1470 from pjain1/rtindex_test Realtime Index Task test	2015-06-29 16:51:35 -05:00
Parag Jain	284b80b09e	Realtime Index Task test	2015-06-29 09:52:41 -05:00
Himanshu	4a83a22f8c	Merge pull request #1445 from metamx/JSWorkerSelectStrategy JavaScript Worker Select Strategy	2015-06-22 17:19:13 -05:00
nishant	fb4052d577	JavaScript Worker Select Strategy this PR adds a JavaScriptWorkerSelectStrategy which allows defining arbitrary logic for selecting workers to run task using a JavaScript function. This gives users full control to implement complex worker selection strategies based on task attributes. more tests and a complex javascript config fix for java8 modify for nashorn compatibility	2015-06-20 02:01:34 +05:30
Xavier Léauté	0a5bb909a2	[maven-release-plugin] prepare for next development iteration	2015-06-18 17:35:19 -07:00
Xavier Léauté	59c6b2b279	[maven-release-plugin] prepare release druid-0.8.0-rc1	2015-06-18 17:35:14 -07:00
Charles Allen	acc0a3fbf7	Add jitter to the retries for RemoteTaskActionClient	2015-06-12 17:43:25 -07:00
nishant	e9afec4a2b	fix task status issues on zk outages docs review comments fix test review comments Review comments fix compilation fix typo	2015-06-11 00:49:52 +05:30
Xavier Léauté	78d468700b	Merge pull request #1388 from metamx/fix-1360 fix race described in 1360	2015-06-10 11:59:36 -07:00
Xavier Léauté	f6b336ac3e	Merge pull request #1432 from metamx/config-fix fix passing of config from IndexTuningConfig to RealtimeTuningConfig	2015-06-10 11:42:58 -07:00
nishant	963682d696	Add check for valid rowFlushBoundary configuration and fix tests	2015-06-10 21:38:34 +05:30
nishant	191b302f6a	fix passing of config from IndexTuningConfig to RealtimeTuningConfig - pass rowFlushboundary correctly instead of using default. - fixes indexTask failing with io.druid.segment.incremental.IndexSizeExceededException when rowFlushboundary is set higher than RealtimeTuningConfig.defaultMaxRowsInMemory rename test method	2015-06-10 21:07:25 +05:30
nishant	af9ea08041	fix race described in 1360 review comments review comments review comments no need to remove fix test review comments	2015-06-10 12:19:12 +05:30
Charles Allen	056cab93ed	Add Hadoop Converter Job and task * Fixes https://github.com/druid-io/druid/issues/1363 * Add extra utils in JobHelper based on PR feedback	2015-06-09 14:47:38 -07:00
Charles Allen	ef9b67cce3	Merge pull request #1422 from metamx/fix-ec2-public-ip fix public IP not working in EC2 autoscaling	2015-06-03 16:30:51 -07:00
Xavier Léauté	4ebdfea76f	fix public IP not working in EC2 autoscaling	2015-06-03 16:05:59 -07:00
Charles Allen	8289914f76	Make AbstractTask.makeId use AbstractTask.joinId * Also remove TaskUtil	2015-06-03 13:24:20 -07:00
Fangjin Yang	ac9057c00e	Merge pull request #1401 from metamx/ec2-public-ip flag to enable public IP in EC2 autoscaling	2015-05-28 20:21:32 -07:00
Xavier Léauté	d834a974ba	flag to enable public IP in EC2-VPC autoscaling	2015-05-28 18:14:12 -07:00
fjy	bb1145ef56	Make the index task use indexmerger and not indexmaker	2015-05-28 13:34:57 -07:00
Xavier Léauté	5ad5d7d18b	Merge pull request #1379 from flowbehappy/fix-hadoop-ha bug fix: hdfs task log and indexing task not work properly with Hadoop HA	2015-05-22 09:14:50 -04:00
flow	07659f30ab	bug fix: hdfs task log and indexing task not work properly with Hadoop HA	2015-05-21 20:49:42 +08:00
Charles Allen	29ba05c04f	Abstractify HadoopTask * Add `invokeForeignLoader` to commonize the way tasks are attempted to be launched in a foreign class loader * Add `buildClassLoader` to accomplish the common tasks for hadoop jobs when building a ClassLoader	2015-05-14 17:04:43 -07:00
fjy	7a6acf5c1b	update pom to 0.8	2015-05-11 19:41:58 -06:00
Gian Merlino	e69d82a2b4	Realtime: Delay firehose connection until job is started. Some firehoses (like the Kafka firehose) acquire input resources when they connect, so it helps to delay this until after plumber.startJob() runs.	2015-05-04 10:54:07 -07:00
Xavier Léauté	721505c017	Merge pull request #1208 from druid-io/rework-metrics Schemaless metrics + additional metrics for things we care about	2015-04-27 15:04:54 -07:00
fjy	963e5765bf	Schemaless metrics + additional metrics for things we care about	2015-04-27 13:39:40 -07:00
Charles Allen	633fdb029e	Add option to ConvertSegmentTask to skip validation * Validation is enabled by default	2015-04-27 08:37:55 -07:00
Charles Allen	29341f9837	Fix random unit test failure from NoopTask ID collision	2015-04-24 13:07:48 -07:00
Xavier Léauté	f73f14ab91	Merge pull request #1297 from metamx/versionConverterTaskUpdates Update VersionConverterTask for IndexSpec and allowing Forced updates	2015-04-20 16:44:35 -07:00
Charles Allen	7479ac9012	Update VersionConverterTask for IndexSepc and allowing Forced updates	2015-04-20 16:17:06 -07:00
fjy	d260515a43	update druid-api version	2015-04-17 14:58:35 -07:00
Xavier Léauté	ea5572d001	Merge pull request #1271 from metamx/strictErrorChecking Add stricter checking for potential coding errors	2015-04-15 15:21:41 -07:00
Charles Allen	abdeaa0746	Add stricter checking for potential coding errors Can use via `mvn clean compile test-compile -P strict'	2015-04-15 14:52:25 -07:00
Xavier Léauté	3a3046ccf3	add support for dimension compression - compression for single-value dimensions using CompressedVSizeIntsIndexedSupplier - makes dimension compression configurable via IndexSpec - IndexSpec also enables configuring bitmap and metric compression	2015-04-14 10:44:18 -07:00
fjy	195a3b8bb8	ignore rows with invalid interval	2015-04-06 16:08:40 -07:00
Fangjin Yang	208e307915	Merge pull request #1251 from metamx/uriSegmentLoaders Revert "Revert "Overhaul of SegmentPullers to add consistency and retries""	2015-03-30 17:43:51 -07:00
fjy	aea7f9d192	[maven-release-plugin] prepare for next development iteration	2015-03-30 16:35:24 -07:00
fjy	060d7aef03	[maven-release-plugin] prepare release druid-0.7.1	2015-03-30 16:35:20 -07:00
Charles Allen	1c6cbea89c	Revert "Revert "Overhaul of SegmentPullers to add consistency and retries"" This reverts commit `f904bc7858`.	2015-03-30 13:40:04 -07:00
Fangjin Yang	f904bc7858	Revert "Overhaul of SegmentPullers to add consistency and retries"	2015-03-30 13:15:50 -07:00
Charles Allen	6d407e8677	Add URI handling to SegmentPullers * Requires https://github.com/druid-io/druid-api/pull/37 * Requires https://github.com/metamx/java-util/pull/22 * Moves the puller logic to use a more standard workflow going through java-util helpers instead of re-writing the handlers for each impl * General workflow goes like this: 1) LoadSpec makes sure the correct Puller is called with the correct parameters. 2) The Puller sets up general information like how to make an InputStream, how to find a file name (for .gz files for example), and when to retry. 3) CompressionUtils does most of the heavy lifting when it can	2015-03-30 12:33:23 -07:00
msprunck	942c17a2aa	Remove timeline chunk count assumptions. * Replace with generic iterables	2015-03-24 22:40:49 +01:00
fjy	b389cfe404	[maven-release-plugin] prepare for next development iteration	2015-03-19 12:38:17 -07:00
fjy	60e7d543cc	[maven-release-plugin] prepare release druid-0.7.1-rc1	2015-03-19 12:38:13 -07:00
Xavier Léauté	9d6b728054	Merge pull request #1215 from metamx/log-audit-IP-Address Add remote ip address in audit log.	2015-03-17 13:59:31 -07:00
fjy	bfe10bd156	This fixes arbitrary gran spec breaking	2015-03-17 12:19:43 -07:00
nishantmonu51	f9821d242f	also log author ip address in audit log	2015-03-17 23:15:15 +05:30
Xavier Léauté	ddfafa0711	randomize task ID to fix spurious test failure	2015-03-12 18:08:48 -07:00
Fangjin Yang	a508c0955f	Merge pull request #1195 from himanshug/task_storage_config_fix correctly parse recentlyFinishedThreshold from config	2015-03-12 16:50:49 -07:00
nishantmonu51	3ec4a30ab5	initial commit review comments more refactoring and cleaning of redundant code add UT + docs + more refactoring fixes + review comments more cleanup end points to fetch history review comments remove unnecessary changes review comments rename header name review comments + add test for MetadataRulesManager review comments docs	2015-03-12 22:50:29 +05:30
Himanshu Gupta	23545fc01c	correctly parse recentlyFinishedThreshold from config	2015-03-12 09:46:57 -05:00
Xavier Léauté	d3f5bddc5c	Add ability to apply extraction functions to the time dimension - Moves DimExtractionFn under a more generic ExtractionFn interface to support extracting dimension values other than strings - pushes down extractionFn to the storage adapter from query engine - 'dimExtractionFn' parameter has been deprecated in favor of 'extractionFn' - adds a TimeFormatExtractionFn, allowing to project the '__time' dimension - JavascriptDimExtractionFn renamed to JavascriptExtractionFn, adding support for any dimension value types that map directly to Javascript - update documentation for time column extraction and related changes	2015-03-11 16:45:42 -07:00
Gian Merlino	b00c243786	Need a null check for iamProfile.	2015-03-10 17:52:15 -07:00
Gian Merlino	b810cdfe58	EC2AutoScaler: Allow setting "iamProfile".	2015-03-10 17:41:35 -07:00
Gian Merlino	d102a89760	Fix license on EC2AutoScalerSerdeTest.	2015-03-10 17:31:30 -07:00
Gian Merlino	9235b45063	EC2AutoScaler: Support for setting subnetId.	2015-03-10 11:29:56 -07:00
Xavier Léauté	113d204b10	break up archive task actions, which was missed in #566a3a6112	2015-03-04 13:19:52 -08:00
Himanshu Gupta	bd5cecdd44	UTs update for indexing service	2015-02-25 15:45:58 -08:00
Xavier Léauté	b167dcf82c	[maven-release-plugin] prepare for next development iteration	2015-02-23 14:28:06 -08:00
Xavier Léauté	e81ac2ba43	[maven-release-plugin] prepare release druid-0.7.0	2015-02-23 14:27:58 -08:00
Fangjin Yang	25db9abb7f	Merge pull request #1138 from metamx/better-default-hostname Better default hostname	2015-02-18 17:37:34 -08:00
Xavier Léauté	53d2b961c5	default to canonical hostname instead of localhost	2015-02-18 16:44:48 -08:00
Xavier Léauté	78df7f6165	Move Druid release artifacts to Sonatype - Switch to using Druid parent POM - Add required fields for Sonatype - Common plugin versions and settings have been moved to the parent pom - Cleanup artifacts and POMs for consistent formatting - Remove org.hyperic.sigar dependency and update docs to reflect necessary jars to add at runtime when sigar is needed	2015-02-13 14:26:31 -08:00
fjy	d29740ed9f	[maven-release-plugin] prepare for next development iteration	2015-02-12 16:16:00 -08:00
fjy	211fd15b7e	[maven-release-plugin] prepare release druid-0.7.0-rc3	2015-02-12 16:15:56 -08:00
fjy	708759e1e0	Update http-client to 1.0.0	2015-02-10 13:36:47 -08:00
Charles Allen	79a3e8f59f	Fix overriding base of IndexerZkConfig to be absolute instead of relative * Updated docs to clarify ZK config behavior * Added unit tests for this case	2015-02-04 13:04:06 -08:00
fjy	1f12c5b2f1	[maven-release-plugin] prepare for next development iteration	2015-02-03 12:06:49 -08:00
fjy	e82d431be7	[maven-release-plugin] prepare release druid-0.7.0-rc2	2015-02-03 12:06:41 -08:00
Fangjin Yang	92e616de11	Merge pull request #1077 from metamx/remove-unused-imports remove unused imports	2015-02-02 10:45:27 -08:00
nishantmonu51	ba932bb1f2	remove unused imports	2015-02-02 21:53:39 +05:30
fjy	d05032b98a	towards a community led druid	2015-01-31 20:57:36 -08:00
Xavier Léauté	a01a22dba1	Merge pull request #1074 from druid-io/overlord-leader Add an endpoint to return the overlord leader	2015-01-30 13:44:49 -08:00
Xavier Léauté	bd49528805	Merge pull request #1073 from druid-io/fix-statusPath Fix worker status path announcement with indexer zk config	2015-01-30 12:51:21 -08:00
fjy	649f285feb	Add an endpoint to return the overlord leader	2015-01-30 12:37:48 -08:00
fjy	bc1405bee0	fix worker status path announcement with indexer zk config	2015-01-30 12:26:08 -08:00
Xavier Léauté	2c2771b90e	Make dynamic worker selection actually work	2015-01-27 14:17:42 -08:00
nishantmonu51	0f3eac4705	fix dimension exclusion	2015-01-23 00:31:23 +05:30
fjy	1f94de22c6	[maven-release-plugin] prepare for next development iteration	2015-01-20 14:23:55 -08:00
fjy	17476edc31	[maven-release-plugin] prepare release druid-0.7.0-rc1	2015-01-20 14:23:51 -08:00
fjy	2d516fa591	Add a new equal distribution strategy for assigning tasks	2015-01-20 13:12:22 -08:00
Xavier Léauté	cd9635ff5e	Merge pull request #1034 from druid-io/minor-rename minor rename of things in hadoop ingestion config to match 0.6.x	2015-01-15 15:46:13 -08:00
fjy	ccddbf8747	minor rename of things in hadoop ingestion config to match 0.6.x	2015-01-15 14:04:55 -08:00
Fangjin Yang	5bfcc43377	Merge pull request #1008 from metamx/stringConversionJavaUtilUpdate Update all String conversions to and from byte[] to use the java-util StringUtils functions	2015-01-15 13:50:27 -08:00
Charles Allen	67757b6aea	Change IndexerZkConfig to use @JacksonInject instead of just straight @Inject * Updated IndexerZkConfig to use no setters, and take all arguments from constructor instead * Also added more unit tests	2015-01-08 11:11:17 -08:00
Charles Allen	f6fbb733b8	Added a few places where tests were using Object instead of Module	2015-01-05 13:47:25 -08:00
Charles Allen	b1b5c9099e	Update all String conversions to and from byte[] to use the java-util StringUtils functions * Speedup of GroupBy with javaScript filters by ~10% * Requires https://github.com/metamx/java-util/pull/15	2015-01-05 11:22:32 -08:00
Charles Allen	65286a24e0	Change zk configs to use Jackson injection instead of Skife * Also added generic config testing class JsonConfigTesterBase	2014-12-29 10:36:12 -08:00
Fangjin Yang	af1185b58c	Merge pull request #969 from metamx/fixRemoteLogViewing Remove try-with-resources for log stream in WokerResource	2014-12-15 16:26:02 -07:00
Charles Allen	54068e8b1d	Remove try-with-resources for log stream in WokerResource	2014-12-15 15:24:59 -08:00
fjy	ac407fb6ba	clean up defaults	2014-12-15 15:05:02 -08:00
fjy	e872952390	fix working path default bug	2014-12-15 14:51:58 -08:00
Fangjin Yang	b3fe91bb50	Merge pull request #830 from metamx/union-merge-on-historical Union merge on historical	2014-12-15 13:36:47 -07:00
Charles Allen	bed3e7e1d2	Merge pull request #966 from metamx/fix-tasklog-streaming fix task log streaming	2014-12-14 09:31:41 -08:00
Xavier Léauté	bd91a40491	fix task log streaming	2014-12-13 15:22:55 -08:00
Xavier Léauté	092dfe0309	fix IndexTaskTest tmp dir - Create local firehose files in a clean temp directory to avoid firehose reading other random temp files that start with 'druid'	2014-12-12 17:05:45 -08:00
fjy	123db3da4d	fix another broken ut	2014-12-09 15:47:28 -08:00
nishantmonu51	1a1b0e6f23	merge from master and review comments	2014-12-09 13:16:45 +05:30
Charles Allen	a0f9f9877e	Changed all "application/json" to MediaType.APPLICATION_JSON except for in druid.js	2014-12-08 14:21:49 -08:00
nishantmonu51	6e03a6245f	Merge branch 'master' into onheap-incremental-index	2014-12-05 10:40:28 +05:30
Xavier Léauté	7cd45a6e1f	IncrementalIndex throws exception if limit exceeded - For now uses a hardcoded ratio of aggregator to timeanddim buffer sizes - canAppendRow is a workaround for realtime index since the Firehose currently does not have a way of rolling back the last event in case of error - canAppendRow needs a fudge factor; there is a race between checking if we can add a row and actually adding a row, because of the way MapDB reports its size.	2014-12-04 14:38:16 -08:00
Charles Allen	18234a2f00	Fix confusing error message in HadoopIndexTask	2014-12-04 10:57:57 -08:00
Gian Merlino	20a7239ffd	Replace google-http-client imports with real guava imports.	2014-12-04 10:57:57 -08:00
Xavier Léauté	0c521e0a77	update joda-time and fix min/max instant	2014-12-04 10:57:56 -08:00
Fangjin Yang	27d4b2bdea	Merge pull request #934 from metamx/fix-hadoop-metadata-injection metadata update handler injection not needed for indexing service	2014-12-04 10:45:20 -07:00
Xavier Léauté	2e6c254937	metadata injection not needed for indexing service	2014-12-03 15:09:31 -08:00
Charles Allen	325a5c4abc	Update ForkingTaskRunner to remove @Deprecated Files method usage	2014-12-03 13:18:33 -08:00
xvrl	2681da4420	Merge pull request #929 from metamx/google-cleanup Replace google-http-client imports with real guava imports.	2014-12-03 11:50:19 -08:00
Charles Allen	b6f71d3fd6	Fix confusing error message in HadoopIndexTask	2014-12-03 11:11:53 -08:00
Gian Merlino	d388a8fe89	Replace google-http-client imports with real guava imports.	2014-12-03 10:52:57 -08:00

1548 Commits