Charles Allen
056cab93ed
Add Hadoop Converter Job and task
...
* Fixes https://github.com/druid-io/druid/issues/1363
* Add extra utils in JobHelper based on PR feedback
2015-06-09 14:47:38 -07:00
Charles Allen
2a76bdc60a
Abstractify hadoopy indexer configuration.
...
* Moves many items to JobHelper
* Remove dependencies of these functions on HadoopDruidIndexerConfig in favor of more general items
* Changes functionalities of some of the path methods to always return a path with scheme
* Adds retry to uploads
* Change output loadSpec determining from using outputFS.getClass().getName() to using outputFS.getScheme()
2015-06-08 10:53:27 -07:00
fjy
be2a35188e
Additional schema validations and better logs for common extensions
2015-05-27 16:25:02 -07:00
Xavier Léauté
4466e77b25
Merge pull request #1371 from guobingkun/unit_test
...
Unit test for IndexGeneratorJob
2015-05-22 10:34:24 -04:00
flow
07659f30ab
bug fix: hdfs task log and indexing task not work properly with Hadoop HA
2015-05-21 20:49:42 +08:00
Bingkun Guo
b46aff12ae
Unit test for IndexGeneratorJob
2015-05-18 12:31:16 -05:00
Fangjin Yang
a2dc58cd2d
Merge pull request #1345 from pjain1/unit_test_warn_fix
...
fix warn msg and some unit tests
2015-05-08 08:06:20 -07:00
Parag Jain
01448d264c
Fix warn msg and added some unit tests
2015-05-07 17:10:05 -05:00
fjy
b19435d172
fix typos with batch ingestion in docs
2015-05-07 14:46:17 -07:00
Bingkun Guo
1ee550dd91
Fix a potential issue in DeterminePartitionsJob by making HadoopDruidIndexerConfig non-static, and two unit tests for DeterminPartitionsJob and LocalDataSegmentKiller
2015-05-04 20:00:29 -07:00
Xavier Léauté
3a3046ccf3
add support for dimension compression
...
- compression for single-value dimensions using CompressedVSizeIntsIndexedSupplier
- makes dimension compression configurable via IndexSpec
- IndexSpec also enables configuring bitmap and metric compression
2015-04-14 10:44:18 -07:00
Prajwal Tuladhar
3044bf5592
use Job.getInstance() to fix deprecated warnings
2015-04-09 13:22:21 -04:00
Xavier Léauté
8b5fa8f85d
always upload SNAPSHOT self-contained jars
2015-04-03 21:18:09 -07:00
Dia Kharrat
3a6dc99384
log invalid rows in mapper of Hadoop indexer
2015-03-19 22:31:04 -07:00
Dia Kharrat
58d5f5e7f0
Honor ignoreInvalidRows in Hadoop indexer
...
The reducer of the hadoop indexer now ignores lines with parsing
exceptions (if enabled by the indexer config).
2015-03-19 22:31:04 -07:00
Himanshu Gupta
8c1f0834ba
Removing MapWritableInputRowParser from indexing-hadoop it should really be an extension if user needs
2015-03-19 18:37:08 -05:00
Himanshu Gupta
3f7a7ba5d3
For batch hadoop indexing, make hadoop input format configuration. Given input format must extend from org.apache.hadoop.mapreduce.InputFormat
2015-03-18 16:09:45 -05:00
fjy
bfe10bd156
This fixes arbitrary gran spec breaking
2015-03-17 12:19:43 -07:00
Himanshu Gupta
6a0405de20
fail early if there is no input data for batch hadoop indexing
2015-03-07 12:45:57 -06:00
Himanshu Gupta
30f64ff19e
UTs update for indexing-hadoop
2015-02-25 15:45:57 -08:00
Xavier Léauté
0784d7e30e
Merge pull request #1152 from himanshug/metastorage-pwd-provider
...
support for metadata store PasswordProvider interface
2015-02-25 15:19:37 -08:00
Fangjin Yang
708f35151d
Merge pull request #1121 from gianm/issue-1116
...
Use the proper FileSystems for writing segments and caching jars. (for issue #1116 )
2015-02-25 13:03:59 -08:00
Fangjin Yang
6424815f88
Merge pull request #1097 from metamx/better-hadoop-sort-key
...
Sort HadoopIndexer rows by time+dim bucket to help reduce spilling
2015-02-25 12:49:58 -08:00
Himanshu Gupta
126262edce
support for PasswordProvider interface to enable writing druid extension which can get metadata store password from secured location or anywhere instead of plain text properties file
2015-02-25 14:05:19 -06:00
Himanshu Gupta
01a4f19ea2
removing dependency on NativeS3FileSystem and other file systems
2015-02-23 14:27:50 -06:00
Gian Merlino
fd5a7d1f08
Use the proper FileSystems for writing segments and caching jars. (for issue #1116 )
2015-02-12 16:20:10 -08:00
Xavier Léauté
b1ec7afc12
Sort HadoopIndexer rows by time+dim bucket to help reduce spilling
2015-02-10 14:26:28 -08:00
Fangjin Yang
92e616de11
Merge pull request #1077 from metamx/remove-unused-imports
...
remove unused imports
2015-02-02 10:45:27 -08:00
nishantmonu51
ba932bb1f2
remove unused imports
2015-02-02 21:53:39 +05:30
fjy
d05032b98a
towards a community led druid
2015-01-31 20:57:36 -08:00
Xavier Léauté
cd9635ff5e
Merge pull request #1034 from druid-io/minor-rename
...
minor rename of things in hadoop ingestion config to match 0.6.x
2015-01-15 15:46:13 -08:00
fjy
ccddbf8747
minor rename of things in hadoop ingestion config to match 0.6.x
2015-01-15 14:04:55 -08:00
Fangjin Yang
5bfcc43377
Merge pull request #1008 from metamx/stringConversionJavaUtilUpdate
...
Update all String conversions to and from byte[] to use the java-util StringUtils functions
2015-01-15 13:50:27 -08:00
Fangjin Yang
852e863425
Merge pull request #981 from druid-io/strictModuleTyping
...
Use Module instead of generic Object in Guice related items
2015-01-05 12:43:20 -08:00
Charles Allen
b1b5c9099e
Update all String conversions to and from byte[] to use the java-util StringUtils functions
...
* Speedup of GroupBy with javaScript filters by ~10%
* Requires https://github.com/metamx/java-util/pull/15
2015-01-05 11:22:32 -08:00
Xavier Léauté
f1375b0bfb
workaround to pass down bitmap type to map-reduce tasks
2015-01-02 17:29:00 -08:00
Charles Allen
7c8d4a7433
Use Module instead of generic Object in Guice related items
2014-12-19 10:54:06 -08:00
fjy
43d27ddaf0
update http client and fix logging
2014-12-15 16:59:57 -08:00
fjy
e872952390
fix working path default bug
2014-12-15 14:51:58 -08:00
fjy
28b72a69ad
redocumenting ingestion
2014-12-08 16:15:46 -08:00
nishantmonu51
40f223215a
fix buffer pool usage
2014-12-05 16:09:26 +05:30
nishantmonu51
6e03a6245f
Merge branch 'master' into onheap-incremental-index
2014-12-05 10:40:28 +05:30
Xavier Léauté
7cd45a6e1f
IncrementalIndex throws exception if limit exceeded
...
- For now uses a hardcoded ratio of aggregator to timeanddim buffer sizes
- canAppendRow is a workaround for realtime index since the
Firehose currently does not have a way of rolling back the last event in
case of error
- canAppendRow needs a fudge factor; there is a race between checking
if we can add a row and actually adding a row, because of the way MapDB
reports its size.
2014-12-04 14:38:16 -08:00
Gian Merlino
20a7239ffd
Replace google-http-client imports with real guava imports.
2014-12-04 10:57:57 -08:00
Charles Allen
c2add5730b
Fix Hadoop CLI jobs
...
* Change "schema" --> "spec" for cli hadoop to keep up with internal hadoop
* Added check for HadoopDruidIndexerConfig deserialization from Map to see if it is trying to get a HadoopDruidIndexerConfig or a HadoopIngestionSpec
2014-12-04 10:57:56 -08:00
xvrl
c867d59ee0
fix error message
2014-12-03 15:30:32 -08:00
Xavier Léauté
2e6c254937
metadata injection not needed for indexing service
2014-12-03 15:09:31 -08:00
Gian Merlino
d388a8fe89
Replace google-http-client imports with real guava imports.
2014-12-03 10:52:57 -08:00
nishantmonu51
4dc0fdba8a
consider mapped size in limit calculation & review comments
2014-12-03 23:47:30 +05:30
nishantmonu51
da8bd7836b
Introduce buffer size
2014-12-03 16:28:22 +05:30
Charles Allen
7cd689be75
Fix Hadoop CLI jobs
...
* Change "schema" --> "spec" for cli hadoop to keep up with internal hadoop
* Added check for HadoopDruidIndexerConfig deserialization from Map to see if it is trying to get a HadoopDruidIndexerConfig or a HadoopIngestionSpec
2014-12-02 11:23:04 -08:00
nishantmonu51
eac776f1a7
tests passing with on heap incremental index
2014-12-02 22:29:28 +05:30
Xavier Léauté
59542c41f8
fix port not set in DruidNode
2014-12-01 14:37:28 -08:00
Charles Allen
8b3652a67a
Modify HadoopDruidIndexerConfig to give a port of 0 instead of -1 when binding DruidNode @Self annotation
2014-12-01 14:08:41 -08:00
fjy
fdeab0c6af
make Druid case sensitive
2014-11-19 14:27:31 -08:00
nishantmonu51
f0452c5968
merge from master
2014-11-18 19:34:51 +05:30
nishantmonu51
edf0fc0851
Make hashed partitions spec default
...
- make hashed partitionsSpec as default partitions spec for 0.7
2014-11-17 19:48:12 +05:30
nishantmonu51
0c2d06475d
merge from master
2014-11-17 19:19:18 +05:30
Xavier Léauté
0498df25df
override metadata storage injection in CliHadoopIndexer
2014-11-07 13:44:56 -08:00
Xavier Léauté
50a191425c
fix injection on MetadataStorageUpdaterJob
2014-11-07 11:11:14 -08:00
Xavier Léauté
20a9aef96a
fix test
2014-11-06 17:27:05 -08:00
Xavier Léauté
9c06db021f
rename db->metadata postgres->postgresql
2014-10-31 10:30:27 -07:00
jisookim0513
aa754b86e8
build success!
2014-10-24 11:28:42 -07:00
fjy
bef74104d9
merge with 0.7.x and resolve any conflicts
2014-10-23 17:24:06 -07:00
fjy
d76d57d95d
update docs
2014-10-22 16:16:28 -07:00
jisookim0513
37979282fe
enabled ansi-quote in mysql; insert statement should now work
2014-10-21 00:09:19 -07:00
jisookim0513
7d5c5f2083
fixed createTable; fixed miscellaneous stuff; added DerbyMetadataRuleManagerProvider
2014-10-17 00:10:36 -07:00
nishantmonu51
41e88baeca
Add test for bucket selection
2014-10-15 23:09:28 +05:30
nishantmonu51
f4a97aebbc
fix rollup for hashed partitions
...
truncate timestamp while calculating the partitionNumber
2014-10-15 22:32:56 +05:30
nishantmonu51
b5d66381f3
more cleanup
2014-10-14 18:32:40 +05:30
nishantmonu51
454acd3f5a
remove backwards compatible code
...
1) remove backwards compatible and deprecated code
2) make hashed partitions spec default
2014-10-13 19:30:44 +05:30
fjy
c7b4d5b7b4
Merge branch 'master' into druid-0.7.x
...
Conflicts:
processing/src/test/java/io/druid/segment/filter/SpatialFilterTest.java
2014-10-02 18:12:10 -07:00
nishantmonu51
ad75a21040
separate offheapIncrementalIndex implementation
2014-10-01 13:58:51 +05:30
jisookim0513
9d7b5d9b0f
fixed javadoc; fixed pom files; deleted unnecessary class
2014-09-30 13:47:35 -07:00
nishantmonu51
358ff915bb
fix merge conflicts
2014-09-30 22:19:18 +05:30
nishantmonu51
2789536bed
merge changes from druid-0.7.x
2014-09-30 22:05:49 +05:30
nishantmonu51
61c7fd2e6e
make ingestOffheap tuneable
2014-09-30 15:30:02 +05:30
nishantmonu51
adb4a65e0a
Merge branch 'offheap-incremental-index' into mapdb-branch
2014-09-29 12:38:31 +05:30
jisookim0513
74565c9371
cleaned up the code
2014-09-27 13:10:01 -07:00
jisookim0513
aa887edb73
added two seperate modules for mysql and postgres
2014-09-27 13:08:53 -07:00
flow
2dd62979bb
Fixed the issue of batch ingestion with indexing service to hdfs end up with the path of metadata in mysql missing "hdfs://host" prefix. The detail describe can be found here: https://groups.google.com/forum/#!topic/druid-development/ofvSxiPpCxI
2014-09-27 22:26:52 +08:00
jisookim0513
6a641621b2
finished merging into druid-0.7.x; derby not working (to be fixed)
2014-09-26 14:24:53 -07:00
jisookim0513
43cc6283d3
trying to revert files that have overwritten changes
2014-09-26 12:38:04 -07:00
fjy
eaf0a48b92
Merge branch 'master' into druid-0.7.x
...
Conflicts:
cassandra-storage/pom.xml
common/pom.xml
examples/pom.xml
hdfs-storage/pom.xml
histogram/pom.xml
indexing-hadoop/pom.xml
indexing-service/pom.xml
kafka-eight/pom.xml
kafka-seven/pom.xml
pom.xml
processing/pom.xml
processing/src/main/java/io/druid/guice/PropertiesModule.java
rabbitmq/pom.xml
s3-extensions/pom.xml
server/pom.xml
services/pom.xml
2014-09-26 11:39:24 -07:00
jisookim0513
3bf39cc9f8
attempted to fix merge-conflicts
2014-09-24 15:55:42 -07:00
nishantmonu51
f51ab84386
merge changes from druid-0.7.x
2014-09-22 23:48:45 +05:30
nishantmonu51
443e5788fb
make OffheapIncrementalIndex tuneable
2014-09-22 19:26:10 +05:30
jisookim0513
273205f217
initial attempt for abstraction; druid cluster works with Derby as a default
2014-09-19 17:39:59 -07:00
nishantmonu51
8eb6466487
revert buffer size and add back rowFlushBoundary
2014-09-19 23:06:04 +05:30
Xavier Léauté
d501b052ea
remove unused columnConfig
2014-09-15 13:02:47 -07:00
Xavier Léauté
e57e2d97ba
make constants final
2014-09-15 12:53:40 -07:00
fjy
469ccbbe5e
Merge branch 'master' into druid-0.7.x
...
Conflicts:
cassandra-storage/pom.xml
common/pom.xml
examples/pom.xml
hdfs-storage/pom.xml
histogram/pom.xml
indexing-hadoop/pom.xml
indexing-service/pom.xml
kafka-eight/pom.xml
kafka-seven/pom.xml
pom.xml
processing/pom.xml
processing/src/main/java/io/druid/query/FinalizeResultsQueryRunner.java
processing/src/main/java/io/druid/query/UnionQueryRunner.java
processing/src/main/java/io/druid/query/groupby/GroupByQueryRunnerFactory.java
processing/src/main/java/io/druid/query/topn/TopNQueryEngine.java
processing/src/main/java/io/druid/query/topn/TopNQueryRunnerFactory.java
rabbitmq/pom.xml
s3-extensions/pom.xml
server/pom.xml
server/src/test/java/io/druid/server/initialization/JettyTest.java
services/pom.xml
2014-09-11 16:20:50 -07:00
fjy
fec7b43fcb
make making v9 segments something completely configurable
2014-09-10 15:28:30 -07:00
fjy
351afb8be7
allow legacy index generator
2014-09-09 17:04:35 -07:00
Xavier Léauté
58ab759fc6
remove unused imports
2014-08-29 14:03:47 -07:00
Xavier Léauté
ac05836833
make Java 8 javadoc happy
2014-08-29 13:58:50 -07:00
fjy
12f4147df5
switch index gen job to use logging indicator
2014-08-21 13:28:15 -07:00
fjy
d64879ccca
more cleanup
2014-08-20 13:22:42 -07:00
fjy
bb73b2556e
fix compilation
2014-08-20 13:17:00 -07:00
fjy
92f26d9a1f
cleanup rowflushboundary
2014-08-20 13:09:37 -07:00