Commit Graph

1132 Commits

Author SHA1 Message Date
Paul Otto 2301b60365 Add ability to provide taskResource for IndexTask. 2015-08-24 17:38:31 -07:00
Xavier Léauté 3b2e41e42a update for next release 2015-08-18 17:16:46 -07:00
Himanshu Gupta 15fa43dd43 changing DatasourcePathSpec, to get segment list, so that hadoop indexer uses overlord action to get list of segments and passes when running as an overlord task. and, uses metadata store directly when running as standalone hadoop indexer
also, serialized list of segments is passed to DatasourcePathSpec so that hadoop classloader issues do not creep up
2015-08-16 14:07:35 -05:00
Himanshu Gupta 4d4aa8bfc6 refactor IngestSegmentFirehoseFactory so that IngestSegmentFirehose becomes reusable
Conflicts:
	indexing-service/src/main/java/io/druid/indexing/firehose/IngestSegmentFirehoseFactory.java
2015-08-14 14:44:22 -05:00
Gian Merlino bc0c7dd65d Avoid the Hadoop objectMapper in the local IndexTask. Fixes #1545. 2015-08-11 10:40:53 -07:00
Charles Allen 1ddaa3fb33 Merge pull request #1592 from metamx/clean-test-files
clean temporary files
2015-08-03 11:47:20 -07:00
Nishant 2679efee7a clean temporary files 2015-08-03 23:32:58 +05:30
Fangjin Yang 6f65e6d3ef Merge pull request #1547 from pjain1/improve_overlord_test
add test to OverlordResourceTest
2015-07-28 07:35:48 -10:00
Parag Jain 2e1b617346 add more tests 2015-07-24 15:12:08 -05:00
Fangjin Yang 97242356b4 Merge pull request #1480 from guobingkun/kill_task_test
Unit tests for KillTask and MetadataTaskStorage
2015-07-20 16:31:45 -07:00
Xavier Léauté 4cfb00bc8a inrement version 2015-07-15 13:09:05 -07:00
Fangjin Yang 3f7ba58227 Merge pull request #1504 from metamx/fix-1447
fix for #1447
2015-07-14 08:50:08 -07:00
Himanshu e2ddfb7a1a Merge pull request #1511 from pjain1/remove_test
remove flaky overlord test
2015-07-13 18:38:34 -05:00
Parag Jain 59dec89f6a remove flaky overlord test 2015-07-13 15:32:12 -05:00
Himanshu 725086cc89 Merge pull request #1506 from gianm/realtime-plumber-nulls
Consider null inputRows and parse errors as unparseable during realtime ingestion.
2015-07-13 10:12:12 -05:00
Gian Merlino 9068bcd062 Consider null inputRows and parse errors as unparseable during realtime ingestion.
Also, harmonize exception handling between the RealtimeIndexTask and the RealtimeManager.
Conditions other than null inputRows and parse errors bubble up in both.
2015-07-11 20:40:03 -07:00
Himanshu cac722968e Merge pull request #1503 from metamx/fix-leaking-zk-nodes
Fix leaking Status Path nodes in ZK
2015-07-10 17:40:18 -05:00
Fangjin Yang 9f19e96658 Merge pull request #1477 from pjain1/overlord_test
overlord and task master test
2015-07-10 14:27:14 -07:00
Parag Jain 55c4fe64f3 overlord and task master test 2015-07-10 16:17:45 -05:00
Nishant 5fe27fe4ad fix for #1447
fixes #1447
2015-07-09 19:05:48 +05:30
Nishant 8d7a566bae Fix leaking Status Path nodes in ZK
- remove  ZK status path nodes for workers after they are removed
2015-07-09 17:20:09 +05:30
Charles Allen c0b60c0d2f I'm not your mom, indexing-service/test... cleanup after yourself 2015-07-01 15:00:09 -07:00
Bingkun Guo 282a0f9760 Unit tests for KillTask and MetadataTaskStorage 2015-06-29 17:55:41 -05:00
Himanshu b5b9ca1446 Merge pull request #1470 from pjain1/rtindex_test
Realtime Index Task test
2015-06-29 16:51:35 -05:00
Parag Jain 284b80b09e Realtime Index Task test 2015-06-29 09:52:41 -05:00
Himanshu 4a83a22f8c Merge pull request #1445 from metamx/JSWorkerSelectStrategy
JavaScript Worker Select Strategy
2015-06-22 17:19:13 -05:00
nishant fb4052d577 JavaScript Worker Select Strategy
this PR adds a JavaScriptWorkerSelectStrategy which allows defining
arbitrary logic for selecting workers to run task using a JavaScript
function.

This gives users full control to implement complex worker selection
strategies based on task attributes.

more tests and a complex javascript config

fix for java8 modify for nashorn compatibility
2015-06-20 02:01:34 +05:30
Xavier Léauté 0a5bb909a2 [maven-release-plugin] prepare for next development iteration 2015-06-18 17:35:19 -07:00
Xavier Léauté 59c6b2b279 [maven-release-plugin] prepare release druid-0.8.0-rc1 2015-06-18 17:35:14 -07:00
Charles Allen acc0a3fbf7 Add jitter to the retries for RemoteTaskActionClient 2015-06-12 17:43:25 -07:00
nishant e9afec4a2b fix task status issues on zk outages
docs

review comments

fix test

review comments

Review comments

fix compilation

fix typo
2015-06-11 00:49:52 +05:30
Xavier Léauté 78d468700b Merge pull request #1388 from metamx/fix-1360
fix race described in 1360
2015-06-10 11:59:36 -07:00
Xavier Léauté f6b336ac3e Merge pull request #1432 from metamx/config-fix
fix passing of config from IndexTuningConfig to RealtimeTuningConfig
2015-06-10 11:42:58 -07:00
nishant 963682d696 Add check for valid rowFlushBoundary configuration and fix tests 2015-06-10 21:38:34 +05:30
nishant 191b302f6a fix passing of config from IndexTuningConfig to RealtimeTuningConfig
- pass rowFlushboundary correctly instead of using default.
- fixes indexTask failing with
io.druid.segment.incremental.IndexSizeExceededException when
rowFlushboundary is set higher than
RealtimeTuningConfig.defaultMaxRowsInMemory

rename test method
2015-06-10 21:07:25 +05:30
nishant af9ea08041 fix race described in 1360
review comments

review comments

review comments

no need to remove

fix test

review comments
2015-06-10 12:19:12 +05:30
Charles Allen 056cab93ed Add Hadoop Converter Job and task
* Fixes https://github.com/druid-io/druid/issues/1363
* Add extra utils in JobHelper based on PR feedback
2015-06-09 14:47:38 -07:00
Charles Allen ef9b67cce3 Merge pull request #1422 from metamx/fix-ec2-public-ip
fix public IP not working in EC2 autoscaling
2015-06-03 16:30:51 -07:00
Xavier Léauté 4ebdfea76f fix public IP not working in EC2 autoscaling 2015-06-03 16:05:59 -07:00
Charles Allen 8289914f76 Make AbstractTask.makeId use AbstractTask.joinId
* Also remove TaskUtil
2015-06-03 13:24:20 -07:00
Fangjin Yang ac9057c00e Merge pull request #1401 from metamx/ec2-public-ip
flag to enable public IP in EC2 autoscaling
2015-05-28 20:21:32 -07:00
Xavier Léauté d834a974ba flag to enable public IP in EC2-VPC autoscaling 2015-05-28 18:14:12 -07:00
fjy bb1145ef56 Make the index task use indexmerger and not indexmaker 2015-05-28 13:34:57 -07:00
Xavier Léauté 5ad5d7d18b Merge pull request #1379 from flowbehappy/fix-hadoop-ha
bug fix: hdfs task log and indexing task not work properly with Hadoop HA
2015-05-22 09:14:50 -04:00
flow 07659f30ab bug fix: hdfs task log and indexing task not work properly with Hadoop HA 2015-05-21 20:49:42 +08:00
Charles Allen 29ba05c04f Abstractify HadoopTask
* Add `invokeForeignLoader` to commonize the way tasks are attempted to be launched in a foreign class loader
* Add `buildClassLoader` to accomplish the common tasks for hadoop jobs when building a ClassLoader
2015-05-14 17:04:43 -07:00
fjy 7a6acf5c1b update pom to 0.8 2015-05-11 19:41:58 -06:00
Gian Merlino e69d82a2b4 Realtime: Delay firehose connection until job is started.
Some firehoses (like the Kafka firehose) acquire input resources when they
connect, so it helps to delay this until after plumber.startJob() runs.
2015-05-04 10:54:07 -07:00
Xavier Léauté 721505c017 Merge pull request #1208 from druid-io/rework-metrics
Schemaless metrics + additional metrics for things we care about
2015-04-27 15:04:54 -07:00
fjy 963e5765bf Schemaless metrics + additional metrics for things we care about 2015-04-27 13:39:40 -07:00
Charles Allen 633fdb029e Add option to ConvertSegmentTask to skip validation
* Validation is enabled by default
2015-04-27 08:37:55 -07:00
Charles Allen 29341f9837 Fix random unit test failure from NoopTask ID collision 2015-04-24 13:07:48 -07:00
Xavier Léauté f73f14ab91 Merge pull request #1297 from metamx/versionConverterTaskUpdates
Update VersionConverterTask for IndexSpec and allowing Forced updates
2015-04-20 16:44:35 -07:00
Charles Allen 7479ac9012 Update VersionConverterTask for IndexSepc and allowing Forced updates 2015-04-20 16:17:06 -07:00
fjy d260515a43 update druid-api version 2015-04-17 14:58:35 -07:00
Xavier Léauté ea5572d001 Merge pull request #1271 from metamx/strictErrorChecking
Add stricter checking for potential coding errors
2015-04-15 15:21:41 -07:00
Charles Allen abdeaa0746 Add stricter checking for potential coding errors
Can use via `mvn clean compile test-compile -P strict'
2015-04-15 14:52:25 -07:00
Xavier Léauté 3a3046ccf3 add support for dimension compression
- compression for single-value dimensions using CompressedVSizeIntsIndexedSupplier
- makes dimension compression configurable via IndexSpec
- IndexSpec also enables configuring bitmap and metric compression
2015-04-14 10:44:18 -07:00
fjy 195a3b8bb8 ignore rows with invalid interval 2015-04-06 16:08:40 -07:00
Fangjin Yang 208e307915 Merge pull request #1251 from metamx/uriSegmentLoaders
Revert "Revert "Overhaul of SegmentPullers to add consistency and retries""
2015-03-30 17:43:51 -07:00
fjy aea7f9d192 [maven-release-plugin] prepare for next development iteration 2015-03-30 16:35:24 -07:00
fjy 060d7aef03 [maven-release-plugin] prepare release druid-0.7.1 2015-03-30 16:35:20 -07:00
Charles Allen 1c6cbea89c Revert "Revert "Overhaul of SegmentPullers to add consistency and retries""
This reverts commit f904bc7858.
2015-03-30 13:40:04 -07:00
Fangjin Yang f904bc7858 Revert "Overhaul of SegmentPullers to add consistency and retries" 2015-03-30 13:15:50 -07:00
Charles Allen 6d407e8677 Add URI handling to SegmentPullers
* Requires https://github.com/druid-io/druid-api/pull/37
* Requires https://github.com/metamx/java-util/pull/22
* Moves the puller logic to use a more standard workflow going through java-util helpers instead of re-writing the handlers for each impl
  * General workflow goes like this: 1) LoadSpec makes sure the correct Puller is called with the correct parameters. 2) The Puller sets up general information like how to make an InputStream, how to find a file name (for .gz files for example), and when to retry. 3) CompressionUtils does most of the heavy lifting when it can
2015-03-30 12:33:23 -07:00
msprunck 942c17a2aa Remove timeline chunk count assumptions.
* Replace with generic iterables
2015-03-24 22:40:49 +01:00
fjy b389cfe404 [maven-release-plugin] prepare for next development iteration 2015-03-19 12:38:17 -07:00
fjy 60e7d543cc [maven-release-plugin] prepare release druid-0.7.1-rc1 2015-03-19 12:38:13 -07:00
Xavier Léauté 9d6b728054 Merge pull request #1215 from metamx/log-audit-IP-Address
Add remote ip address in audit log.
2015-03-17 13:59:31 -07:00
fjy bfe10bd156 This fixes arbitrary gran spec breaking 2015-03-17 12:19:43 -07:00
nishantmonu51 f9821d242f also log author ip address in audit log 2015-03-17 23:15:15 +05:30
Xavier Léauté ddfafa0711 randomize task ID to fix spurious test failure 2015-03-12 18:08:48 -07:00
Fangjin Yang a508c0955f Merge pull request #1195 from himanshug/task_storage_config_fix
correctly parse recentlyFinishedThreshold from config
2015-03-12 16:50:49 -07:00
nishantmonu51 3ec4a30ab5 initial commit
review comments

more refactoring and cleaning of redundant code

add UT + docs + more refactoring

fixes + review comments

more cleanup

end points to fetch history

review comments

remove unnecessary changes

review comments rename header name

review comments + add test for MetadataRulesManager

review comments docs
2015-03-12 22:50:29 +05:30
Himanshu Gupta 23545fc01c correctly parse recentlyFinishedThreshold from config 2015-03-12 09:46:57 -05:00
Xavier Léauté d3f5bddc5c Add ability to apply extraction functions to the time dimension
- Moves DimExtractionFn under a more generic ExtractionFn interface to
  support extracting dimension values other than strings
- pushes down extractionFn to the storage adapter from query engine
- 'dimExtractionFn' parameter has been deprecated in favor of 'extractionFn'
- adds a TimeFormatExtractionFn, allowing to project the '__time' dimension
- JavascriptDimExtractionFn renamed to JavascriptExtractionFn, adding
  support for any dimension value types that map directly to Javascript
- update documentation for time column extraction and related changes
2015-03-11 16:45:42 -07:00
Gian Merlino b00c243786 Need a null check for iamProfile. 2015-03-10 17:52:15 -07:00
Gian Merlino b810cdfe58 EC2AutoScaler: Allow setting "iamProfile". 2015-03-10 17:41:35 -07:00
Gian Merlino d102a89760 Fix license on EC2AutoScalerSerdeTest. 2015-03-10 17:31:30 -07:00
Gian Merlino 9235b45063 EC2AutoScaler: Support for setting subnetId. 2015-03-10 11:29:56 -07:00
Xavier Léauté 113d204b10 break up archive task actions, which was missed in #566a3a6112 2015-03-04 13:19:52 -08:00
Himanshu Gupta bd5cecdd44 UTs update for indexing service 2015-02-25 15:45:58 -08:00
Xavier Léauté b167dcf82c [maven-release-plugin] prepare for next development iteration 2015-02-23 14:28:06 -08:00
Xavier Léauté e81ac2ba43 [maven-release-plugin] prepare release druid-0.7.0 2015-02-23 14:27:58 -08:00
Fangjin Yang 25db9abb7f Merge pull request #1138 from metamx/better-default-hostname
Better default hostname
2015-02-18 17:37:34 -08:00
Xavier Léauté 53d2b961c5 default to canonical hostname instead of localhost 2015-02-18 16:44:48 -08:00
Xavier Léauté 78df7f6165 Move Druid release artifacts to Sonatype
- Switch to using Druid parent POM
- Add required fields for Sonatype
- Common plugin versions and settings have been moved to the parent pom
- Cleanup artifacts and POMs for consistent formatting
- Remove org.hyperic.sigar dependency and update docs to reflect necessary jars to add at runtime when sigar is needed
2015-02-13 14:26:31 -08:00
fjy d29740ed9f [maven-release-plugin] prepare for next development iteration 2015-02-12 16:16:00 -08:00
fjy 211fd15b7e [maven-release-plugin] prepare release druid-0.7.0-rc3 2015-02-12 16:15:56 -08:00
fjy 708759e1e0 Update http-client to 1.0.0 2015-02-10 13:36:47 -08:00
Charles Allen 79a3e8f59f Fix overriding base of IndexerZkConfig to be absolute instead of relative
* Updated docs to clarify ZK config behavior
* Added unit tests for this case
2015-02-04 13:04:06 -08:00
fjy 1f12c5b2f1 [maven-release-plugin] prepare for next development iteration 2015-02-03 12:06:49 -08:00
fjy e82d431be7 [maven-release-plugin] prepare release druid-0.7.0-rc2 2015-02-03 12:06:41 -08:00
Fangjin Yang 92e616de11 Merge pull request #1077 from metamx/remove-unused-imports
remove unused imports
2015-02-02 10:45:27 -08:00
nishantmonu51 ba932bb1f2 remove unused imports 2015-02-02 21:53:39 +05:30
fjy d05032b98a towards a community led druid 2015-01-31 20:57:36 -08:00
Xavier Léauté a01a22dba1 Merge pull request #1074 from druid-io/overlord-leader
Add an endpoint to return the overlord leader
2015-01-30 13:44:49 -08:00
Xavier Léauté bd49528805 Merge pull request #1073 from druid-io/fix-statusPath
Fix worker status path announcement with indexer zk config
2015-01-30 12:51:21 -08:00
fjy 649f285feb Add an endpoint to return the overlord leader 2015-01-30 12:37:48 -08:00
fjy bc1405bee0 fix worker status path announcement with indexer zk config 2015-01-30 12:26:08 -08:00
Xavier Léauté 2c2771b90e Make dynamic worker selection actually work 2015-01-27 14:17:42 -08:00
nishantmonu51 0f3eac4705 fix dimension exclusion 2015-01-23 00:31:23 +05:30
fjy 1f94de22c6 [maven-release-plugin] prepare for next development iteration 2015-01-20 14:23:55 -08:00
fjy 17476edc31 [maven-release-plugin] prepare release druid-0.7.0-rc1 2015-01-20 14:23:51 -08:00
fjy 2d516fa591 Add a new equal distribution strategy for assigning tasks 2015-01-20 13:12:22 -08:00
Xavier Léauté cd9635ff5e Merge pull request #1034 from druid-io/minor-rename
minor rename of things in hadoop ingestion config to match 0.6.x
2015-01-15 15:46:13 -08:00
fjy ccddbf8747 minor rename of things in hadoop ingestion config to match 0.6.x 2015-01-15 14:04:55 -08:00
Fangjin Yang 5bfcc43377 Merge pull request #1008 from metamx/stringConversionJavaUtilUpdate
Update all String conversions to and from byte[] to use the java-util StringUtils functions
2015-01-15 13:50:27 -08:00
Charles Allen 67757b6aea Change IndexerZkConfig to use @JacksonInject instead of just straight @Inject
* Updated IndexerZkConfig to use no setters, and take all arguments from constructor instead
* Also added more unit tests
2015-01-08 11:11:17 -08:00
Charles Allen f6fbb733b8 Added a few places where tests were using Object instead of Module 2015-01-05 13:47:25 -08:00
Charles Allen b1b5c9099e Update all String conversions to and from byte[] to use the java-util StringUtils functions
* Speedup of GroupBy with javaScript filters by ~10%
* Requires https://github.com/metamx/java-util/pull/15
2015-01-05 11:22:32 -08:00
Charles Allen 65286a24e0 Change zk configs to use Jackson injection instead of Skife
* Also added generic config testing class JsonConfigTesterBase
2014-12-29 10:36:12 -08:00
Fangjin Yang af1185b58c Merge pull request #969 from metamx/fixRemoteLogViewing
Remove try-with-resources for log stream in WokerResource
2014-12-15 16:26:02 -07:00
Charles Allen 54068e8b1d Remove try-with-resources for log stream in WokerResource 2014-12-15 15:24:59 -08:00
fjy ac407fb6ba clean up defaults 2014-12-15 15:05:02 -08:00
fjy e872952390 fix working path default bug 2014-12-15 14:51:58 -08:00
Fangjin Yang b3fe91bb50 Merge pull request #830 from metamx/union-merge-on-historical
Union merge on historical
2014-12-15 13:36:47 -07:00
Charles Allen bed3e7e1d2 Merge pull request #966 from metamx/fix-tasklog-streaming
fix task log streaming
2014-12-14 09:31:41 -08:00
Xavier Léauté bd91a40491 fix task log streaming 2014-12-13 15:22:55 -08:00
Xavier Léauté 092dfe0309 fix IndexTaskTest tmp dir
- Create local firehose files in a clean temp directory to avoid
firehose reading other random temp files that start with 'druid'
2014-12-12 17:05:45 -08:00
fjy 123db3da4d fix another broken ut 2014-12-09 15:47:28 -08:00
nishantmonu51 1a1b0e6f23 merge from master and review comments 2014-12-09 13:16:45 +05:30
Charles Allen a0f9f9877e Changed all "application/json" to MediaType.APPLICATION_JSON except for in druid.js 2014-12-08 14:21:49 -08:00
nishantmonu51 6e03a6245f Merge branch 'master' into onheap-incremental-index 2014-12-05 10:40:28 +05:30
Xavier Léauté 7cd45a6e1f IncrementalIndex throws exception if limit exceeded
- For now uses a hardcoded ratio of aggregator to timeanddim buffer sizes
- canAppendRow is a workaround for realtime index since the
Firehose currently does not have a way of rolling back the last event in
case of error
- canAppendRow needs a fudge factor; there is a race between checking
if we can add a row and actually adding a row, because of the way MapDB
reports its size.
2014-12-04 14:38:16 -08:00
Charles Allen 18234a2f00 Fix confusing error message in HadoopIndexTask 2014-12-04 10:57:57 -08:00
Gian Merlino 20a7239ffd Replace google-http-client imports with real guava imports. 2014-12-04 10:57:57 -08:00
Xavier Léauté 0c521e0a77 update joda-time and fix min/max instant 2014-12-04 10:57:56 -08:00
Fangjin Yang 27d4b2bdea Merge pull request #934 from metamx/fix-hadoop-metadata-injection
metadata update handler injection not needed for indexing service
2014-12-04 10:45:20 -07:00
Xavier Léauté 2e6c254937 metadata injection not needed for indexing service 2014-12-03 15:09:31 -08:00
Charles Allen 325a5c4abc Update ForkingTaskRunner to remove @Deprecated Files method usage 2014-12-03 13:18:33 -08:00
xvrl 2681da4420 Merge pull request #929 from metamx/google-cleanup
Replace google-http-client imports with real guava imports.
2014-12-03 11:50:19 -08:00
Charles Allen b6f71d3fd6 Fix confusing error message in HadoopIndexTask 2014-12-03 11:11:53 -08:00
Gian Merlino d388a8fe89 Replace google-http-client imports with real guava imports. 2014-12-03 10:52:57 -08:00
nishantmonu51 da8bd7836b Introduce buffer size 2014-12-03 16:28:22 +05:30
Xavier Léauté a79389a9e5 update joda-time and fix min/max instant 2014-12-02 17:27:22 -08:00
Xavier Léauté d23fd1e1ab make host+port more explicit
- document the behavior for node host/port initialization
- throw exception if settings make no sense
- fixes announcement for nodes without host/port defaults
- makes code clearer as to when host vs. host+port are used
2014-11-26 22:03:25 -08:00
Fangjin Yang 3ff569ef2d Merge pull request #879 from metamx/rtr-with-pref
Rewrite autoscaling and enable easier configuration of worker selection and autoscaling behaviour
2014-11-24 17:54:28 -07:00
fjy 3808411340 address some cr 2014-11-24 16:54:47 -08:00
fjy 13cae41f6c Merge branch 'master' into refactor-examples 2014-11-24 11:00:26 -08:00
fjy 9b701bbc76 a few more code review fixes 2014-11-24 10:54:29 -08:00
fjy 1aaea9a0d7 address code review 2014-11-24 10:52:30 -08:00
fjy 8ee4d12562 Refactor structure for examples and extensions 2014-11-21 14:45:24 -08:00
fjy 580e1172c1 move IndexTask to use hashed partition; fixes #815 2014-11-21 11:15:25 -08:00
fjy fdeab0c6af make Druid case sensitive 2014-11-19 14:27:31 -08:00
Fangjin Yang 109fdf0b34 Merge pull request #852 from metamx/druid-0.7.x-TaskLogStreamer
(DO NOT MERGE YET) Update logging to https://github.com/druid-io/druid-api/pull/27
2014-11-19 15:03:12 -07:00
Fangjin Yang 590d31799e Merge pull request #876 from metamx/remove-backwards-compatible
Remove backwards compatible
2014-11-19 14:33:14 -07:00
fjy 64719b15e0 rewrite autoscaling with tests 2014-11-18 15:41:06 -08:00
xvrl a96eaeb036 Merge pull request #882 from metamx/now_with_OPEN_SOURCE
Added src jar build to maven poms and re-formatted to conform to style guidelines.
2014-11-18 13:00:04 -08:00
Charles Allen dc66e1708e Added src jar build to maven poms and re-formatted to conform to style guidelines. 2014-11-18 09:05:30 -08:00