Michael Schiff
6ad451a44a
JobHelper.ensurePaths will set job properties from config (tuningConfig.jobProperties) before adding input paths to the config.
Adding input paths will create Path and FileSystem instances which may depend on the values in the job config.
This allows all properties to be set from the spec file, avoiding having to directly edit cluster xml files.
IndexGeneratorJob.run adds job properties before adding input paths (adding input paths may depend on having job properties set)
JobHelperTest confirms that JobHelper.ensurePaths adds job properties
javadoc for addInputPaths to explain relationship with addJobProperties
2015-07-01 12:45:32 -07:00
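The ordering described in this commit can be illustrated with a minimal sketch. It does not use the real Hadoop or Druid classes: the job configuration is modelled as a plain map, and `EnsurePathsSketch`, `addInputPath`, and the `input.paths` key are hypothetical names chosen for illustration. Only `fs.defaultFS` is a real Hadoop property name. The point is that a path resolved before the job properties are applied sees the default file system, not the one from the spec file.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

public class EnsurePathsSketch {
    // Real Hadoop property name; everything else here is a stand-in.
    static final String DEFAULT_FS_KEY = "fs.defaultFS";

    // Copy user-supplied job properties (e.g. from tuningConfig.jobProperties)
    // into the job configuration.
    static void addJobProperties(Map<String, String> jobProperties, Map<String, String> conf) {
        conf.putAll(jobProperties);
    }

    // Resolve an input path against whatever default file system the
    // configuration holds at the moment of the call, then record it.
    static URI addInputPath(String path, Map<String, String> conf) {
        URI base = URI.create(conf.getOrDefault(DEFAULT_FS_KEY, "file:///"));
        URI resolved = base.resolve(path);
        conf.merge("input.paths", resolved.toString(), (a, b) -> a + "," + b);
        return resolved;
    }

    // Properties first, paths second: the order this commit establishes.
    static URI ensurePaths(Map<String, String> jobProperties, String inputPath, Map<String, String> conf) {
        addJobProperties(jobProperties, conf);
        return addInputPath(inputPath, conf);
    }

    public static void main(String[] args) {
        Map<String, String> jobProperties = new HashMap<>();
        jobProperties.put(DEFAULT_FS_KEY, "hdfs://namenode:8020");

        // Wrong order: the path is resolved before the properties are applied.
        URI tooEarly = addInputPath("/data/events", new HashMap<>());
        // Right order: properties are applied, then the path is resolved.
        URI inOrder = ensurePaths(jobProperties, "/data/events", new HashMap<>());

        System.out.println("scheme when resolved too early: " + tooEarly.getScheme());
        System.out.println("scheme when resolved in order:  " + inOrder.getScheme());
    }
}
```

In the real code the same dependency arises because creating Path and FileSystem instances reads the job configuration, as the commit message notes.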
Davide Anastasia
4a3a7dd1ad
read hadoop-indexer configuration file from HDFS
2015-06-24 14:08:53 -07:00
Hao Xia
1931491c9f
A couple of hdfs related fixes
* Class loading issue with hdfs-storage extension
* Exception when using hdfs with non-fully qualified segment path
2015-06-19 17:22:20 -07:00
Xavier Léauté
0a5bb909a2
[maven-release-plugin] prepare for next development iteration
2015-06-18 17:35:19 -07:00
Xavier Léauté
59c6b2b279
[maven-release-plugin] prepare release druid-0.8.0-rc1
2015-06-18 17:35:14 -07:00
Charles Allen
94a567732a
Wipe FileContext off the face of the earth
* Fixes https://github.com/druid-io/druid/issues/1433
* Works around https://issues.apache.org/jira/browse/HADOOP-10643
* Reverts to the prior method of renaming
2015-06-16 09:48:09 -07:00
Charles Allen
6230ac90ae
Use IndexMerger for conversion
2015-06-10 11:34:58 -07:00
Charles Allen
056cab93ed
Add Hadoop Converter Job and task
* Fixes https://github.com/druid-io/druid/issues/1363
* Add extra utils in JobHelper based on PR feedback
2015-06-09 14:47:38 -07:00
Charles Allen
2a76bdc60a
Abstractify hadoopy indexer configuration.
* Moves many items to JobHelper
* Remove dependencies of these functions on HadoopDruidIndexerConfig in favor of more general items
* Changes functionalities of some of the path methods to always return a path with scheme
* Adds retry to uploads
* Change output loadSpec determining from using outputFS.getClass().getName() to using outputFS.getScheme()
2015-06-08 10:53:27 -07:00
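The last item in this commit, choosing the output loadSpec from the file system scheme instead of the file system class name, might look roughly like the sketch below. It uses `java.net.URI` in place of a Hadoop FileSystem, and the class name `LoadSpecSketch`, the method `loadSpecFor`, and the loadSpec keys and type values are illustrative assumptions, not taken from this log. Keying on the scheme means any FileSystem implementation registered for, say, `hdfs` yields the same loadSpec regardless of its concrete class.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

public class LoadSpecSketch {
    // Build a loadSpec-like map from the scheme of the segment location.
    // The type names and keys below are assumptions for illustration only.
    static Map<String, String> loadSpecFor(URI segmentLocation) {
        String scheme = segmentLocation.getScheme();
        if (scheme == null) {
            throw new IllegalArgumentException("Path has no scheme: " + segmentLocation);
        }
        Map<String, String> loadSpec = new LinkedHashMap<>();
        switch (scheme) {
            case "hdfs":
                loadSpec.put("type", "hdfs");
                loadSpec.put("path", segmentLocation.toString());
                break;
            case "s3":
            case "s3n":
                loadSpec.put("type", "s3_zip");
                loadSpec.put("bucket", segmentLocation.getHost());
                // Drop the leading slash to get the object key.
                loadSpec.put("key", segmentLocation.getPath().substring(1));
                break;
            case "file":
                loadSpec.put("type", "local");
                loadSpec.put("path", segmentLocation.getPath());
                break;
            default:
                throw new IllegalArgumentException("Unknown file system scheme: " + scheme);
        }
        return loadSpec;
    }

    public static void main(String[] args) {
        System.out.println(loadSpecFor(URI.create("hdfs://namenode:8020/druid/segments/index.zip")));
        System.out.println(loadSpecFor(URI.create("s3n://my-bucket/druid/segments/index.zip")));
        System.out.println(loadSpecFor(URI.create("file:///tmp/druid/segments/index.zip")));
    }
}
```

This also shows why the same commit makes the path methods always return a path with a scheme: a scheme-less path would have nothing to switch on.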
fjy
be2a35188e
Additional schema validations and better logs for common extensions
2015-05-27 16:25:02 -07:00
Xavier Léauté
4466e77b25
Merge pull request #1371 from guobingkun/unit_test
Unit test for IndexGeneratorJob
2015-05-22 10:34:24 -04:00
flow
07659f30ab
bug fix: hdfs task log and indexing task do not work properly with Hadoop HA
2015-05-21 20:49:42 +08:00
Bingkun Guo
b46aff12ae
Unit test for IndexGeneratorJob
2015-05-18 12:31:16 -05:00
fjy
7a6acf5c1b
update pom to 0.8
2015-05-11 19:41:58 -06:00
Fangjin Yang
a2dc58cd2d
Merge pull request #1345 from pjain1/unit_test_warn_fix
fix warn msg and some unit tests
2015-05-08 08:06:20 -07:00
Parag Jain
01448d264c
Fix warn msg and added some unit tests
2015-05-07 17:10:05 -05:00
fjy
b19435d172
fix typos with batch ingestion in docs
2015-05-07 14:46:17 -07:00
Bingkun Guo
1ee550dd91
Fix a potential issue in DeterminePartitionsJob by making HadoopDruidIndexerConfig non-static, and add two unit tests for DeterminePartitionsJob and LocalDataSegmentKiller
2015-05-04 20:00:29 -07:00
Xavier Léauté
3a3046ccf3
add support for dimension compression
- compression for single-value dimensions using CompressedVSizeIntsIndexedSupplier
- makes dimension compression configurable via IndexSpec
- IndexSpec also enables configuring bitmap and metric compression
2015-04-14 10:44:18 -07:00
Prajwal Tuladhar
3044bf5592
use Job.getInstance() to fix deprecated warnings
2015-04-09 13:22:21 -04:00
Xavier Léauté
8b5fa8f85d
always upload SNAPSHOT self-contained jars
2015-04-03 21:18:09 -07:00
fjy
aea7f9d192
[maven-release-plugin] prepare for next development iteration
2015-03-30 16:35:24 -07:00
fjy
060d7aef03
[maven-release-plugin] prepare release druid-0.7.1
2015-03-30 16:35:20 -07:00
Dia Kharrat
3a6dc99384
log invalid rows in mapper of Hadoop indexer
2015-03-19 22:31:04 -07:00
Dia Kharrat
58d5f5e7f0
Honor ignoreInvalidRows in Hadoop indexer
The reducer of the hadoop indexer now ignores lines with parsing
exceptions (if enabled by the indexer config).
2015-03-19 22:31:04 -07:00
Himanshu Gupta
8c1f0834ba
Remove MapWritableInputRowParser from indexing-hadoop; it should really be an extension if a user needs it
2015-03-19 18:37:08 -05:00
Xavier Léauté
a98187f798
Merge pull request #1177 from himanshug/custom_input_format1
Feature: Make hadoop input format configurable for batch ingestion
2015-03-19 15:49:36 -07:00
fjy
b389cfe404
[maven-release-plugin] prepare for next development iteration
2015-03-19 12:38:17 -07:00
fjy
60e7d543cc
[maven-release-plugin] prepare release druid-0.7.1-rc1
2015-03-19 12:38:13 -07:00
Himanshu Gupta
3f7a7ba5d3
For batch hadoop indexing, make the hadoop input format configurable. The given input format must extend org.apache.hadoop.mapreduce.InputFormat
2015-03-18 16:09:45 -05:00
fjy
bfe10bd156
This fixes arbitrary gran spec breaking
2015-03-17 12:19:43 -07:00
Himanshu Gupta
6a0405de20
fail early if there is no input data for batch hadoop indexing
2015-03-07 12:45:57 -06:00
Himanshu Gupta
30f64ff19e
UTs update for indexing-hadoop
2015-02-25 15:45:57 -08:00
Xavier Léauté
0784d7e30e
Merge pull request #1152 from himanshug/metastorage-pwd-provider
support for metadata store PasswordProvider interface
2015-02-25 15:19:37 -08:00
Fangjin Yang
708f35151d
Merge pull request #1121 from gianm/issue-1116
Use the proper FileSystems for writing segments and caching jars. (for issue #1116)
2015-02-25 13:03:59 -08:00
Fangjin Yang
6424815f88
Merge pull request #1097 from metamx/better-hadoop-sort-key
Sort HadoopIndexer rows by time+dim bucket to help reduce spilling
2015-02-25 12:49:58 -08:00
Fangjin Yang
3d50a3771a
Merge pull request #1151 from himanshug/remove-s3-fs-dep
removing dependency on NativeS3FileSystem and other file systems
2015-02-25 12:31:45 -08:00
Himanshu Gupta
126262edce
support for the PasswordProvider interface, so that a druid extension can fetch the metadata store password from a secured location (or anywhere else) instead of a plain text properties file
2015-02-25 14:05:19 -06:00
Xavier Léauté
b167dcf82c
[maven-release-plugin] prepare for next development iteration
2015-02-23 14:28:06 -08:00
Xavier Léauté
e81ac2ba43
[maven-release-plugin] prepare release druid-0.7.0
2015-02-23 14:27:58 -08:00
Himanshu Gupta
01a4f19ea2
removing dependency on NativeS3FileSystem and other file systems
2015-02-23 14:27:50 -06:00
Xavier Léauté
78df7f6165
Move Druid release artifacts to Sonatype
- Switch to using Druid parent POM
- Add required fields for Sonatype
- Common plugin versions and settings have been moved to the parent pom
- Cleanup artifacts and POMs for consistent formatting
- Remove org.hyperic.sigar dependency and update docs to reflect necessary jars to add at runtime when sigar is needed
2015-02-13 14:26:31 -08:00
Gian Merlino
fd5a7d1f08
Use the proper FileSystems for writing segments and caching jars. (for issue #1116)
2015-02-12 16:20:10 -08:00
fjy
d29740ed9f
[maven-release-plugin] prepare for next development iteration
2015-02-12 16:16:00 -08:00
fjy
211fd15b7e
[maven-release-plugin] prepare release druid-0.7.0-rc3
2015-02-12 16:15:56 -08:00
Xavier Léauté
b1ec7afc12
Sort HadoopIndexer rows by time+dim bucket to help reduce spilling
2015-02-10 14:26:28 -08:00
fjy
1f12c5b2f1
[maven-release-plugin] prepare for next development iteration
2015-02-03 12:06:49 -08:00
fjy
e82d431be7
[maven-release-plugin] prepare release druid-0.7.0-rc2
2015-02-03 12:06:41 -08:00
Fangjin Yang
92e616de11
Merge pull request #1077 from metamx/remove-unused-imports
remove unused imports
2015-02-02 10:45:27 -08:00
nishantmonu51
ba932bb1f2
remove unused imports
2015-02-02 21:53:39 +05:30
fjy
d05032b98a
towards a community led druid
2015-01-31 20:57:36 -08:00
fjy
1f94de22c6
[maven-release-plugin] prepare for next development iteration
2015-01-20 14:23:55 -08:00
fjy
17476edc31
[maven-release-plugin] prepare release druid-0.7.0-rc1
2015-01-20 14:23:51 -08:00
Xavier Léauté
cd9635ff5e
Merge pull request #1034 from druid-io/minor-rename
minor rename of things in hadoop ingestion config to match 0.6.x
2015-01-15 15:46:13 -08:00
fjy
ccddbf8747
minor rename of things in hadoop ingestion config to match 0.6.x
2015-01-15 14:04:55 -08:00
Fangjin Yang
5bfcc43377
Merge pull request #1008 from metamx/stringConversionJavaUtilUpdate
Update all String conversions to and from byte[] to use the java-util StringUtils functions
2015-01-15 13:50:27 -08:00
Fangjin Yang
852e863425
Merge pull request #981 from druid-io/strictModuleTyping
Use Module instead of generic Object in Guice related items
2015-01-05 12:43:20 -08:00
Charles Allen
b1b5c9099e
Update all String conversions to and from byte[] to use the java-util StringUtils functions
* Speedup of GroupBy with JavaScript filters by ~10%
* Requires https://github.com/metamx/java-util/pull/15
2015-01-05 11:22:32 -08:00
Xavier Léauté
f1375b0bfb
workaround to pass down bitmap type to map-reduce tasks
2015-01-02 17:29:00 -08:00
Charles Allen
7c8d4a7433
Use Module instead of generic Object in Guice related items
2014-12-19 10:54:06 -08:00
fjy
43d27ddaf0
update http client and fix logging
2014-12-15 16:59:57 -08:00
fjy
e872952390
fix working path default bug
2014-12-15 14:51:58 -08:00
fjy
28b72a69ad
redocumenting ingestion
2014-12-08 16:15:46 -08:00
nishantmonu51
40f223215a
fix buffer pool usage
2014-12-05 16:09:26 +05:30
nishantmonu51
6e03a6245f
Merge branch 'master' into onheap-incremental-index
2014-12-05 10:40:28 +05:30
Xavier Léauté
7cd45a6e1f
IncrementalIndex throws exception if limit exceeded
- For now uses a hardcoded ratio of aggregator to timeanddim buffer sizes
- canAppendRow is a workaround for realtime index since the Firehose currently does not have a way of rolling back the last event in case of error
- canAppendRow needs a fudge factor; there is a race between checking if we can add a row and actually adding a row, because of the way MapDB reports its size.
2014-12-04 14:38:16 -08:00
Gian Merlino
20a7239ffd
Replace google-http-client imports with real guava imports.
2014-12-04 10:57:57 -08:00
Charles Allen
c2add5730b
Fix Hadoop CLI jobs
* Change "schema" --> "spec" for cli hadoop to keep up with internal hadoop
* Added check for HadoopDruidIndexerConfig deserialization from Map to see if it is trying to get a HadoopDruidIndexerConfig or a HadoopIngestionSpec
2014-12-04 10:57:56 -08:00
xvrl
c867d59ee0
fix error message
2014-12-03 15:30:32 -08:00
Xavier Léauté
2e6c254937
metadata injection not needed for indexing service
2014-12-03 15:09:31 -08:00
Gian Merlino
d388a8fe89
Replace google-http-client imports with real guava imports.
2014-12-03 10:52:57 -08:00
nishantmonu51
4dc0fdba8a
consider mapped size in limit calculation & review comments
2014-12-03 23:47:30 +05:30
nishantmonu51
da8bd7836b
Introduce buffer size
2014-12-03 16:28:22 +05:30
Charles Allen
7cd689be75
Fix Hadoop CLI jobs
* Change "schema" --> "spec" for cli hadoop to keep up with internal hadoop
* Added check for HadoopDruidIndexerConfig deserialization from Map to see if it is trying to get a HadoopDruidIndexerConfig or a HadoopIngestionSpec
2014-12-02 11:23:04 -08:00
nishantmonu51
eac776f1a7
tests passing with on heap incremental index
2014-12-02 22:29:28 +05:30
Xavier Léauté
59542c41f8
fix port not set in DruidNode
2014-12-01 14:37:28 -08:00
Charles Allen
8b3652a67a
Modify HadoopDruidIndexerConfig to give a port of 0 instead of -1 when binding DruidNode @Self annotation
2014-12-01 14:08:41 -08:00
fjy
fdeab0c6af
make Druid case sensitive
2014-11-19 14:27:31 -08:00
Fangjin Yang
590d31799e
Merge pull request #876 from metamx/remove-backwards-compatible
Remove backwards compatible
2014-11-19 14:33:14 -07:00
Charles Allen
dc66e1708e
Added src jar build to maven poms and re-formatted to conform to style guidelines.
2014-11-18 09:05:30 -08:00
nishantmonu51
f0452c5968
merge from master
2014-11-18 19:34:51 +05:30
nishantmonu51
edf0fc0851
Make hashed partitions spec default
- make hashed partitionsSpec as default partitions spec for 0.7
2014-11-17 19:48:12 +05:30
nishantmonu51
0c2d06475d
merge from master
2014-11-17 19:19:18 +05:30
Xavier Léauté
0498df25df
override metadata storage injection in CliHadoopIndexer
2014-11-07 13:44:56 -08:00
Xavier Léauté
50a191425c
fix injection on MetadataStorageUpdaterJob
2014-11-07 11:11:14 -08:00
Xavier Léauté
20a9aef96a
fix test
2014-11-06 17:27:05 -08:00
Xavier Léauté
9c06db021f
rename db->metadata postgres->postgresql
2014-10-31 10:30:27 -07:00
jisookim0513
aa754b86e8
build success!
2014-10-24 11:28:42 -07:00
fjy
bef74104d9
merge with 0.7.x and resolve any conflicts
2014-10-23 17:24:06 -07:00
fjy
3b29e77866
[maven-release-plugin] prepare for next development iteration
2014-10-22 16:25:32 -07:00
fjy
dcab2997f2
[maven-release-plugin] prepare release druid-0.6.160
2014-10-22 16:25:27 -07:00
fjy
d76d57d95d
update docs
2014-10-22 16:16:28 -07:00
jisookim0513
37979282fe
enabled ansi-quote in mysql; insert statement should now work
2014-10-21 00:09:19 -07:00
jisookim0513
7d5c5f2083
fixed createTable; fixed miscellaneous stuff; added DerbyMetadataRuleManagerProvider
2014-10-17 00:10:36 -07:00
nishantmonu51
41e88baeca
Add test for bucket selection
2014-10-15 23:09:28 +05:30
nishantmonu51
f4a97aebbc
fix rollup for hashed partitions
truncate timestamp while calculating the partitionNumber
2014-10-15 22:32:56 +05:30
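Why truncating the timestamp matters for rollup can be seen in a small sketch. This is not the Druid implementation: `HashedPartitionSketch`, `truncate`, and `partitionNumber` are hypothetical names, the granularity is modelled as a fixed number of milliseconds, and `Objects.hash` stands in for whatever hash function the real code uses. Rows that would roll up into one row (same truncated timestamp, same dimension values) must hash to the same partition, which is only guaranteed if the hash input uses the truncated timestamp.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

public class HashedPartitionSketch {
    // Round a timestamp down to the start of its granularity bucket.
    static long truncate(long timestampMillis, long granularityMillis) {
        return Math.floorDiv(timestampMillis, granularityMillis) * granularityMillis;
    }

    // Partition number from the truncated timestamp plus the dimension values.
    static int partitionNumber(long timestampMillis, List<String> dims,
                               long granularityMillis, int numPartitions) {
        long bucket = truncate(timestampMillis, granularityMillis);
        return Math.floorMod(Objects.hash(bucket, dims), numPartitions);
    }

    public static void main(String[] args) {
        long hour = 3_600_000L;
        List<String> dims = Arrays.asList("us-east", "mobile");

        // Two events in the same hour with identical dimension values.
        long t1 = 10 * hour + 5_000L;
        long t2 = 10 * hour + 1_800_000L;

        int p1 = partitionNumber(t1, dims, hour, 8);
        int p2 = partitionNumber(t2, dims, hour, 8);
        System.out.println("same partition: " + (p1 == p2));
    }
}
```

Hashing the raw timestamps instead would usually scatter `t1` and `t2` across different partitions, leaving rows that should have been rolled up together in separate segments.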
nishantmonu51
b5d66381f3
more cleanup
2014-10-14 18:32:40 +05:30
fjy
a4c8f04409
[maven-release-plugin] prepare for next development iteration
2014-10-13 12:50:45 -07:00
fjy
7fd1747ffa
[maven-release-plugin] prepare release druid-0.6.159
2014-10-13 12:50:41 -07:00
nishantmonu51
454acd3f5a
remove backwards compatible code
1) remove backwards compatible and deprecated code
2) make hashed partitions spec default
2014-10-13 19:30:44 +05:30