1095 Commits

Author SHA1 Message Date
Fangjin Yang 75a582974b Merge pull request #1639 from gianm/new-plumber
New plumber
2015-09-03 18:52:57 -07:00
Gian Merlino 062a47fba4 Modify Plumbers in these ways,
1) Persist using Committer instead of Runnable. (Although the metadata object
   is ignored in this patch)

2) Remove the getSink method.

3) Plumbers are now responsible for time-based and hydrant-full-based periodic
   committing. (FireChief, RealtimeIndexTask, and IndexTask used to do this)
2015-09-03 11:13:06 -07:00
Nishant 726326abc3 Add Task Context and ability to override task specific properties
override javaOpts

fix compilation

review comments

Add Test for typecast

review comments - remove unused method.
2015-09-03 23:36:32 +05:30
Gian Merlino 940e1aa3eb Replace funky imports with standard ones.
1) Lots of Guava imports were not coming from the actual Guava
2) junit.framework.Assert should be org.junit.Assert
2015-08-28 18:02:05 -07:00
Gian Merlino 414a6fb477 Fix overlapping segments in IngestSegmentFirehose, DatasourceInputFormat.
Fixes #1678. IngestSegmentFirehose (and its users) need to remember which
windows of which segments should actually be read, based on a timeline.
2015-08-28 07:32:41 -07:00
Himanshu Gupta 2e0dd1d792 adding UTs and addressing review comments to
firehoseV2 addition to Realtime[Manager|Plumber],
essential segment metadata persist support,
kafka-simple-consumer-firehose extension patch
2015-08-27 20:50:46 -05:00
lvjq 2237a8cf0f kafka 8 simple consumer firehose 2015-08-27 20:50:46 -05:00
Nishant b306739e9c fix convert segment task
1) fix serde
2) fix wrong parameter being passed when creating subtask

remove sysout
2015-08-27 11:34:41 +05:30
Charles Allen e38cf54bc8 Migrate TestDerbyConnector to a JUnit @Rule 2015-08-26 21:47:40 -07:00
Xavier Léauté fdb6a6651b Merge pull request #1669 from metamx/upgrade-dependencies
Upgrade dependencies
2015-08-25 21:30:22 -07:00
Xavier Léauté 5c19ffa98c Merge pull request #1663 from gianm/segment-insert-constraints
TaskActionToolbox: Remove allowOlderVersions, lift interval constraint
2015-08-25 18:11:46 -07:00
Xavier Léauté 51f6a9a2c9 update jackson to 2.6.1 2015-08-25 16:07:01 -07:00
Gian Merlino 33681525e3 TaskActionToolbox: Remove allowOlderVersions switch, lift interval constraint.
allowOlderVersions has been stuck true for a while due to a bug (introduced in
566a3a61), but I think it's actually OK this way. I think it's reasonable to
expect tasks to choose versions in some way that makes sense, so long as they
don't choose one larger than their taskLock version. This is still verified.

The interval constraint was introduced to force tasks to break up their
segment insert lists into manageable chunks. They are already doing this, and
I think it's reasonable to expect them to do so without enforcement.

Lifting these constraints paves the way for transactional insertion of segments
that have varying versions and may be for varying intervals.
2015-08-25 14:17:38 -07:00
Paul Otto 2301b60365 Add ability to provide taskResource for IndexTask. 2015-08-24 17:38:31 -07:00
Xavier Léauté 3b2e41e42a update for next release 2015-08-18 17:16:46 -07:00
Himanshu Gupta 15fa43dd43 changing DatasourcePathSpec to get the segment list, so that the hadoop indexer uses an overlord action to get the list of segments and passes it when running as an overlord task, and uses the metadata store directly when running as a standalone hadoop indexer
also, the serialized list of segments is passed to DatasourcePathSpec so that hadoop classloader issues do not creep up
2015-08-16 14:07:35 -05:00
Himanshu Gupta 4d4aa8bfc6 refactor IngestSegmentFirehoseFactory so that IngestSegmentFirehose becomes reusable
Conflicts:
	indexing-service/src/main/java/io/druid/indexing/firehose/IngestSegmentFirehoseFactory.java
2015-08-14 14:44:22 -05:00
Gian Merlino bc0c7dd65d Avoid the Hadoop objectMapper in the local IndexTask. Fixes #1545. 2015-08-11 10:40:53 -07:00
Charles Allen 1ddaa3fb33 Merge pull request #1592 from metamx/clean-test-files
clean temporary files
2015-08-03 11:47:20 -07:00
Nishant 2679efee7a clean temporary files 2015-08-03 23:32:58 +05:30
Fangjin Yang 6f65e6d3ef Merge pull request #1547 from pjain1/improve_overlord_test
add test to OverlordResourceTest
2015-07-28 07:35:48 -10:00
Parag Jain 2e1b617346 add more tests 2015-07-24 15:12:08 -05:00
Fangjin Yang 97242356b4 Merge pull request #1480 from guobingkun/kill_task_test
Unit tests for KillTask and MetadataTaskStorage
2015-07-20 16:31:45 -07:00
Xavier Léauté 4cfb00bc8a increment version 2015-07-15 13:09:05 -07:00
Fangjin Yang 3f7ba58227 Merge pull request #1504 from metamx/fix-1447
fix for #1447
2015-07-14 08:50:08 -07:00
Himanshu e2ddfb7a1a Merge pull request #1511 from pjain1/remove_test
remove flaky overlord test
2015-07-13 18:38:34 -05:00
Parag Jain 59dec89f6a remove flaky overlord test 2015-07-13 15:32:12 -05:00
Himanshu 725086cc89 Merge pull request #1506 from gianm/realtime-plumber-nulls
Consider null inputRows and parse errors as unparseable during realtime ingestion.
2015-07-13 10:12:12 -05:00
Gian Merlino 9068bcd062 Consider null inputRows and parse errors as unparseable during realtime ingestion.
Also, harmonize exception handling between the RealtimeIndexTask and the RealtimeManager.
Conditions other than null inputRows and parse errors bubble up in both.
2015-07-11 20:40:03 -07:00
Himanshu cac722968e Merge pull request #1503 from metamx/fix-leaking-zk-nodes
Fix leaking Status Path nodes in ZK
2015-07-10 17:40:18 -05:00
Fangjin Yang 9f19e96658 Merge pull request #1477 from pjain1/overlord_test
overlord and task master test
2015-07-10 14:27:14 -07:00
Parag Jain 55c4fe64f3 overlord and task master test 2015-07-10 16:17:45 -05:00
Nishant 5fe27fe4ad fix for #1447
fixes #1447
2015-07-09 19:05:48 +05:30
Nishant 8d7a566bae Fix leaking Status Path nodes in ZK
- remove ZK status path nodes for workers after they are removed
2015-07-09 17:20:09 +05:30
Charles Allen c0b60c0d2f I'm not your mom, indexing-service/test... cleanup after yourself 2015-07-01 15:00:09 -07:00
Bingkun Guo 282a0f9760 Unit tests for KillTask and MetadataTaskStorage 2015-06-29 17:55:41 -05:00
Himanshu b5b9ca1446 Merge pull request #1470 from pjain1/rtindex_test
Realtime Index Task test
2015-06-29 16:51:35 -05:00
Parag Jain 284b80b09e Realtime Index Task test 2015-06-29 09:52:41 -05:00
Himanshu 4a83a22f8c Merge pull request #1445 from metamx/JSWorkerSelectStrategy
JavaScript Worker Select Strategy
2015-06-22 17:19:13 -05:00
nishant fb4052d577 JavaScript Worker Select Strategy
this PR adds a JavaScriptWorkerSelectStrategy which allows defining
arbitrary logic for selecting workers to run task using a JavaScript
function.

This gives users full control to implement complex worker selection
strategies based on task attributes.

more tests and a complex javascript config

fix for java8 modify for nashorn compatibility
2015-06-20 02:01:34 +05:30
Xavier Léauté 0a5bb909a2 [maven-release-plugin] prepare for next development iteration 2015-06-18 17:35:19 -07:00
Xavier Léauté 59c6b2b279 [maven-release-plugin] prepare release druid-0.8.0-rc1 2015-06-18 17:35:14 -07:00
Charles Allen acc0a3fbf7 Add jitter to the retries for RemoteTaskActionClient 2015-06-12 17:43:25 -07:00
nishant e9afec4a2b fix task status issues on zk outages
docs

review comments

fix test

review comments

Review comments

fix compilation

fix typo
2015-06-11 00:49:52 +05:30
Xavier Léauté 78d468700b Merge pull request #1388 from metamx/fix-1360
fix race described in 1360
2015-06-10 11:59:36 -07:00
Xavier Léauté f6b336ac3e Merge pull request #1432 from metamx/config-fix
fix passing of config from IndexTuningConfig to RealtimeTuningConfig
2015-06-10 11:42:58 -07:00
nishant 963682d696 Add check for valid rowFlushBoundary configuration and fix tests 2015-06-10 21:38:34 +05:30
nishant 191b302f6a fix passing of config from IndexTuningConfig to RealtimeTuningConfig
- pass rowFlushboundary correctly instead of using default.
- fixes indexTask failing with
io.druid.segment.incremental.IndexSizeExceededException when
rowFlushboundary is set higher than
RealtimeTuningConfig.defaultMaxRowsInMemory

rename test method
2015-06-10 21:07:25 +05:30
nishant af9ea08041 fix race described in 1360
review comments

review comments

review comments

no need to remove

fix test

review comments
2015-06-10 12:19:12 +05:30
Charles Allen 056cab93ed Add Hadoop Converter Job and task
* Fixes https://github.com/druid-io/druid/issues/1363
* Add extra utils in JobHelper based on PR feedback
2015-06-09 14:47:38 -07:00
Charles Allen ef9b67cce3 Merge pull request #1422 from metamx/fix-ec2-public-ip
fix public IP not working in EC2 autoscaling
2015-06-03 16:30:51 -07:00
Xavier Léauté 4ebdfea76f fix public IP not working in EC2 autoscaling 2015-06-03 16:05:59 -07:00
Charles Allen 8289914f76 Make AbstractTask.makeId use AbstractTask.joinId
* Also remove TaskUtil
2015-06-03 13:24:20 -07:00
Fangjin Yang ac9057c00e Merge pull request #1401 from metamx/ec2-public-ip
flag to enable public IP in EC2 autoscaling
2015-05-28 20:21:32 -07:00
Xavier Léauté d834a974ba flag to enable public IP in EC2-VPC autoscaling 2015-05-28 18:14:12 -07:00
fjy bb1145ef56 Make the index task use indexmerger and not indexmaker 2015-05-28 13:34:57 -07:00
Xavier Léauté 5ad5d7d18b Merge pull request #1379 from flowbehappy/fix-hadoop-ha
bug fix: hdfs task log and indexing task do not work properly with Hadoop HA
2015-05-22 09:14:50 -04:00
flow 07659f30ab bug fix: hdfs task log and indexing task do not work properly with Hadoop HA 2015-05-21 20:49:42 +08:00
Charles Allen 29ba05c04f Abstractify HadoopTask
* Add `invokeForeignLoader` to commonize the way tasks are attempted to be launched in a foreign class loader
* Add `buildClassLoader` to accomplish the common tasks for hadoop jobs when building a ClassLoader
2015-05-14 17:04:43 -07:00
fjy 7a6acf5c1b update pom to 0.8 2015-05-11 19:41:58 -06:00
Gian Merlino e69d82a2b4 Realtime: Delay firehose connection until job is started.
Some firehoses (like the Kafka firehose) acquire input resources when they
connect, so it helps to delay this until after plumber.startJob() runs.
2015-05-04 10:54:07 -07:00
Xavier Léauté 721505c017 Merge pull request #1208 from druid-io/rework-metrics
Schemaless metrics + additional metrics for things we care about
2015-04-27 15:04:54 -07:00
fjy 963e5765bf Schemaless metrics + additional metrics for things we care about 2015-04-27 13:39:40 -07:00
Charles Allen 633fdb029e Add option to ConvertSegmentTask to skip validation
* Validation is enabled by default
2015-04-27 08:37:55 -07:00
Charles Allen 29341f9837 Fix random unit test failure from NoopTask ID collision 2015-04-24 13:07:48 -07:00
Xavier Léauté f73f14ab91 Merge pull request #1297 from metamx/versionConverterTaskUpdates
Update VersionConverterTask for IndexSpec and allowing Forced updates
2015-04-20 16:44:35 -07:00
Charles Allen 7479ac9012 Update VersionConverterTask for IndexSpec and allowing Forced updates 2015-04-20 16:17:06 -07:00
fjy d260515a43 update druid-api version 2015-04-17 14:58:35 -07:00
Xavier Léauté ea5572d001 Merge pull request #1271 from metamx/strictErrorChecking
Add stricter checking for potential coding errors
2015-04-15 15:21:41 -07:00
Charles Allen abdeaa0746 Add stricter checking for potential coding errors
Can use via `mvn clean compile test-compile -P strict`
2015-04-15 14:52:25 -07:00
Xavier Léauté 3a3046ccf3 add support for dimension compression
- compression for single-value dimensions using CompressedVSizeIntsIndexedSupplier
- makes dimension compression configurable via IndexSpec
- IndexSpec also enables configuring bitmap and metric compression
2015-04-14 10:44:18 -07:00
fjy 195a3b8bb8 ignore rows with invalid interval 2015-04-06 16:08:40 -07:00
Fangjin Yang 208e307915 Merge pull request #1251 from metamx/uriSegmentLoaders
Revert "Revert "Overhaul of SegmentPullers to add consistency and retries""
2015-03-30 17:43:51 -07:00
fjy aea7f9d192 [maven-release-plugin] prepare for next development iteration 2015-03-30 16:35:24 -07:00
fjy 060d7aef03 [maven-release-plugin] prepare release druid-0.7.1 2015-03-30 16:35:20 -07:00
Charles Allen 1c6cbea89c Revert "Revert "Overhaul of SegmentPullers to add consistency and retries""
This reverts commit f904bc7858.
2015-03-30 13:40:04 -07:00
Fangjin Yang f904bc7858 Revert "Overhaul of SegmentPullers to add consistency and retries" 2015-03-30 13:15:50 -07:00
Charles Allen 6d407e8677 Add URI handling to SegmentPullers
* Requires https://github.com/druid-io/druid-api/pull/37
* Requires https://github.com/metamx/java-util/pull/22
* Moves the puller logic to use a more standard workflow going through java-util helpers instead of re-writing the handlers for each impl
  * General workflow goes like this: 1) LoadSpec makes sure the correct Puller is called with the correct parameters. 2) The Puller sets up general information like how to make an InputStream, how to find a file name (for .gz files for example), and when to retry. 3) CompressionUtils does most of the heavy lifting when it can
2015-03-30 12:33:23 -07:00
msprunck 942c17a2aa Remove timeline chunk count assumptions.
* Replace with generic iterables
2015-03-24 22:40:49 +01:00
fjy b389cfe404 [maven-release-plugin] prepare for next development iteration 2015-03-19 12:38:17 -07:00
fjy 60e7d543cc [maven-release-plugin] prepare release druid-0.7.1-rc1 2015-03-19 12:38:13 -07:00
Xavier Léauté 9d6b728054 Merge pull request #1215 from metamx/log-audit-IP-Address
Add remote ip address in audit log.
2015-03-17 13:59:31 -07:00
fjy bfe10bd156 This fixes arbitrary gran spec breaking 2015-03-17 12:19:43 -07:00
nishantmonu51 f9821d242f also log author ip address in audit log 2015-03-17 23:15:15 +05:30
Xavier Léauté ddfafa0711 randomize task ID to fix spurious test failure 2015-03-12 18:08:48 -07:00
Fangjin Yang a508c0955f Merge pull request #1195 from himanshug/task_storage_config_fix
correctly parse recentlyFinishedThreshold from config
2015-03-12 16:50:49 -07:00
nishantmonu51 3ec4a30ab5 initial commit
review comments

more refactoring and cleaning of redundant code

add UT + docs + more refactoring

fixes + review comments

more cleanup

end points to fetch history

review comments

remove unnecessary changes

review comments rename header name

review comments + add test for MetadataRulesManager

review comments docs
2015-03-12 22:50:29 +05:30
Himanshu Gupta 23545fc01c correctly parse recentlyFinishedThreshold from config 2015-03-12 09:46:57 -05:00
Xavier Léauté d3f5bddc5c Add ability to apply extraction functions to the time dimension
- Moves DimExtractionFn under a more generic ExtractionFn interface to
  support extracting dimension values other than strings
- pushes down extractionFn to the storage adapter from query engine
- 'dimExtractionFn' parameter has been deprecated in favor of 'extractionFn'
- adds a TimeFormatExtractionFn, allowing to project the '__time' dimension
- JavascriptDimExtractionFn renamed to JavascriptExtractionFn, adding
  support for any dimension value types that map directly to Javascript
- update documentation for time column extraction and related changes
2015-03-11 16:45:42 -07:00
Gian Merlino b00c243786 Need a null check for iamProfile. 2015-03-10 17:52:15 -07:00
Gian Merlino b810cdfe58 EC2AutoScaler: Allow setting "iamProfile". 2015-03-10 17:41:35 -07:00
Gian Merlino d102a89760 Fix license on EC2AutoScalerSerdeTest. 2015-03-10 17:31:30 -07:00
Gian Merlino 9235b45063 EC2AutoScaler: Support for setting subnetId. 2015-03-10 11:29:56 -07:00
Xavier Léauté 113d204b10 break up archive task actions, which was missed in #566a3a6112 2015-03-04 13:19:52 -08:00
Himanshu Gupta bd5cecdd44 UTs update for indexing service 2015-02-25 15:45:58 -08:00
Xavier Léauté b167dcf82c [maven-release-plugin] prepare for next development iteration 2015-02-23 14:28:06 -08:00
Xavier Léauté e81ac2ba43 [maven-release-plugin] prepare release druid-0.7.0 2015-02-23 14:27:58 -08:00
Fangjin Yang 25db9abb7f Merge pull request #1138 from metamx/better-default-hostname
Better default hostname
2015-02-18 17:37:34 -08:00
Xavier Léauté 53d2b961c5 default to canonical hostname instead of localhost 2015-02-18 16:44:48 -08:00
Xavier Léauté 78df7f6165 Move Druid release artifacts to Sonatype
- Switch to using Druid parent POM
- Add required fields for Sonatype
- Common plugin versions and settings have been moved to the parent pom
- Cleanup artifacts and POMs for consistent formatting
- Remove org.hyperic.sigar dependency and update docs to reflect necessary jars to add at runtime when sigar is needed
2015-02-13 14:26:31 -08:00