Commit Graph

277 Commits

Author SHA1 Message Date
nishantmonu51 b5d66381f3 more cleanup 2014-10-14 18:32:40 +05:30
nishantmonu51 454acd3f5a remove backwards compatible code
1) remove backwards compatible and deprecated code
2) make hashed partitions spec default
2014-10-13 19:30:44 +05:30
fjy c7b4d5b7b4 Merge branch 'master' into druid-0.7.x
Conflicts:
	processing/src/test/java/io/druid/segment/filter/SpatialFilterTest.java
2014-10-02 18:12:10 -07:00
nishantmonu51 ad75a21040 separate offheapIncrementalIndex implementation 2014-10-01 13:58:51 +05:30
jisookim0513 9d7b5d9b0f fixed javadoc; fixed pom files; deleted unnecessary class 2014-09-30 13:47:35 -07:00
nishantmonu51 358ff915bb fix merge conflicts 2014-09-30 22:19:18 +05:30
nishantmonu51 2789536bed merge changes from druid-0.7.x 2014-09-30 22:05:49 +05:30
nishantmonu51 61c7fd2e6e make ingestOffheap tuneable 2014-09-30 15:30:02 +05:30
nishantmonu51 adb4a65e0a Merge branch 'offheap-incremental-index' into mapdb-branch 2014-09-29 12:38:31 +05:30
jisookim0513 74565c9371 cleaned up the code 2014-09-27 13:10:01 -07:00
jisookim0513 aa887edb73 added two separate modules for mysql and postgres 2014-09-27 13:08:53 -07:00
flow 2dd62979bb Fixed the issue where batch ingestion with the indexing service to HDFS ends up with the metadata path in MySQL missing the "hdfs://host" prefix. A detailed description can be found here: https://groups.google.com/forum/#!topic/druid-development/ofvSxiPpCxI 2014-09-27 22:26:52 +08:00
jisookim0513 6a641621b2 finished merging into druid-0.7.x; derby not working (to be fixed) 2014-09-26 14:24:53 -07:00
jisookim0513 43cc6283d3 trying to revert files that have overwritten changes 2014-09-26 12:38:04 -07:00
fjy eaf0a48b92 Merge branch 'master' into druid-0.7.x
Conflicts:
	cassandra-storage/pom.xml
	common/pom.xml
	examples/pom.xml
	hdfs-storage/pom.xml
	histogram/pom.xml
	indexing-hadoop/pom.xml
	indexing-service/pom.xml
	kafka-eight/pom.xml
	kafka-seven/pom.xml
	pom.xml
	processing/pom.xml
	processing/src/main/java/io/druid/guice/PropertiesModule.java
	rabbitmq/pom.xml
	s3-extensions/pom.xml
	server/pom.xml
	services/pom.xml
2014-09-26 11:39:24 -07:00
jisookim0513 3bf39cc9f8 attempted to fix merge-conflicts 2014-09-24 15:55:42 -07:00
nishantmonu51 f51ab84386 merge changes from druid-0.7.x 2014-09-22 23:48:45 +05:30
nishantmonu51 443e5788fb make OffheapIncrementalIndex tuneable 2014-09-22 19:26:10 +05:30
jisookim0513 273205f217 initial attempt for abstraction; druid cluster works with Derby as a default 2014-09-19 17:39:59 -07:00
nishantmonu51 8eb6466487 revert buffer size and add back rowFlushBoundary 2014-09-19 23:06:04 +05:30
Xavier Léauté d501b052ea remove unused columnConfig 2014-09-15 13:02:47 -07:00
Xavier Léauté e57e2d97ba make constants final 2014-09-15 12:53:40 -07:00
fjy 469ccbbe5e Merge branch 'master' into druid-0.7.x
Conflicts:
	cassandra-storage/pom.xml
	common/pom.xml
	examples/pom.xml
	hdfs-storage/pom.xml
	histogram/pom.xml
	indexing-hadoop/pom.xml
	indexing-service/pom.xml
	kafka-eight/pom.xml
	kafka-seven/pom.xml
	pom.xml
	processing/pom.xml
	processing/src/main/java/io/druid/query/FinalizeResultsQueryRunner.java
	processing/src/main/java/io/druid/query/UnionQueryRunner.java
	processing/src/main/java/io/druid/query/groupby/GroupByQueryRunnerFactory.java
	processing/src/main/java/io/druid/query/topn/TopNQueryEngine.java
	processing/src/main/java/io/druid/query/topn/TopNQueryRunnerFactory.java
	rabbitmq/pom.xml
	s3-extensions/pom.xml
	server/pom.xml
	server/src/test/java/io/druid/server/initialization/JettyTest.java
	services/pom.xml
2014-09-11 16:20:50 -07:00
fjy fec7b43fcb make making v9 segments something completely configurable 2014-09-10 15:28:30 -07:00
fjy 351afb8be7 allow legacy index generator 2014-09-09 17:04:35 -07:00
Xavier Léauté 58ab759fc6 remove unused imports 2014-08-29 14:03:47 -07:00
Xavier Léauté ac05836833 make Java 8 javadoc happy 2014-08-29 13:58:50 -07:00
fjy 12f4147df5 switch index gen job to use logging indicator 2014-08-21 13:28:15 -07:00
fjy d64879ccca more cleanup 2014-08-20 13:22:42 -07:00
fjy bb73b2556e fix compilation 2014-08-20 13:17:00 -07:00
fjy 92f26d9a1f cleanup rowflushboundary 2014-08-20 13:09:37 -07:00
nishantmonu51 79ff993b31 increase default buffer size to 512m 2014-08-20 22:15:06 +05:30
nishantmonu51 33354cf7fe replace maxRowsInMemory with BufferSize 2014-08-20 20:59:44 +05:30
fjy 88a904e0b3 address cr about progress ind 2014-08-19 12:59:01 -07:00
nishantmonu51 c6712739dc merge changes from druid-0.7.x 2014-08-12 15:47:42 +05:30
nishantmonu51 9598a524a8 review comment - move index closure to finally 2014-08-12 14:58:55 +05:30
nishantmonu51 637bd35785 merge changes from druid-0.7.x 2014-07-31 16:07:22 +05:30
nishantmonu51 4ce12470a1 Add way to skip determine partitions for index task
Add a way to skip determinePartitions for IndexTask by manually
specifying numShards.
2014-07-18 18:52:15 +05:30
nishantmonu51 f5f05e3a9b Sync changes from branch new-ingestion PR #599
Sync and Resolve Conflicts
2014-07-11 16:15:10 +05:30
nishantmonu51 fa43049240 review comments & pom changes 2014-07-10 11:48:46 +05:30
nishantmonu51 36fc85736c Add ShardSpec Lookup
Optimize choosing shardSpec for Hash Partitions
2014-07-08 18:01:31 +05:30
fjy 4c40e71e54 address cr 2014-06-19 14:48:46 -07:00
fjy a870fe5cbe inject column config 2014-06-19 14:47:57 -07:00
Xavier Léauté 09346b0a3c make column cache configurable 2014-06-19 14:43:03 -07:00
fjy a63cda3281 Merge branch 'master' into new-guava
Conflicts:
	server/src/main/java/io/druid/server/QueryResource.java
2014-06-13 10:08:10 -07:00
nishantmonu51 a7e19ad892 configure buffer sizes 2014-06-12 19:32:37 +05:30
nishantmonu51 6265613bb9 Merge branch 'master' into offheap-incremental-index 2014-06-05 17:42:57 +05:30
nishantmonu51 01e8a713b6 unit tests passing with offheap-indexing 2014-06-05 17:42:53 +05:30
Gian Merlino 1ca7bf03b8 IndexGeneratorJob needs to respect isCombineText, too. 2014-06-04 17:54:31 -07:00
fjy adc00f2bcf make combine text configurable 2014-06-04 16:24:56 -07:00
fjy bb4105ed1a fix broken standalone hadoop ingestion 2014-06-04 09:23:46 -07:00
fjy 77ec4df797 update guava, java-util, and druid-api 2014-06-03 13:43:38 -07:00
fjy 4c13327297 more logging for determine hashed 2014-05-30 16:19:20 -07:00
fjy 7be93a770a make all firehoses work with tasks, add a lot more documentation about configuration 2014-05-28 16:33:59 -07:00
Deepak 7d92cf2b3b Update IndexGeneratorJob.java
Using CombineTextInputFormat instead of TextInputFormat combines multiple splits for a single mapper and reduces the strain on the Hadoop platform. It greatly improves job completion time, as there are fewer mappers to bookkeep.
2014-05-22 15:08:12 +05:30
Deepak de0a7b27e7 Update DetermineHashedPartitionsJob.java
Using CombineTextInputFormat instead of TextInputFormat combines multiple splits for a single mapper and reduces the strain on the Hadoop platform. It greatly improves job completion time, as there are fewer mappers to bookkeep.
2014-05-22 15:06:56 +05:30
Xavier Léauté 9ec7c71e0f fix compilation error with updated druid-api 2014-05-19 14:06:23 -07:00
fjy 1100d2f2a1 rename configs to make a bit more sense 2014-05-06 14:52:50 -07:00
fjy b6fb4245aa Merge branch 'master' into new-schema
Conflicts:
	indexing-hadoop/src/main/java/io/druid/indexer/HadoopDriverConfig.java
	indexing-hadoop/src/main/java/io/druid/indexer/HadoopDruidIndexerConfig.java
	indexing-hadoop/src/main/java/io/druid/indexer/HadoopDruidIndexerConfigBuilder.java
	pom.xml
	server/src/main/java/io/druid/segment/realtime/RealtimeManager.java
	server/src/main/java/io/druid/segment/realtime/firehose/EventReceiverFirehoseFactory.java
2014-05-06 14:32:51 -07:00
Gian Merlino bdf9e74a3b Allow config-based overriding of hadoop job properties. 2014-05-06 09:11:31 -07:00
fjy f9523274ac remove extra println 2014-05-01 15:06:51 -07:00
nishantmonu51 5137031304 use same logic for compression
Use same logic for compression across creating files, reading from
files, and checking file existence
2014-05-01 15:20:47 +05:30
nishantmonu51 728f1e8ee3 fix exists check with compression 2014-05-01 15:01:10 +05:30
nishantmonu51 01e84f10b7 add the checks again.
removing these checks breaks when there is no data for any interval
2014-05-01 14:35:09 +05:30
fjy 76e0a48527 Merge branch 'master' into new-schema
Conflicts:
	indexing-hadoop/src/main/java/io/druid/indexer/DbUpdaterJob.java
	indexing-hadoop/src/test/java/io/druid/indexer/HadoopDruidIndexerConfigTest.java
	indexing-service/src/main/java/io/druid/indexing/common/task/HadoopIndexTask.java
	server/src/main/java/io/druid/segment/realtime/plumber/RealtimePlumber.java
	server/src/main/java/io/druid/segment/realtime/plumber/RealtimePlumberSchool.java
2014-04-25 14:03:28 -07:00
fjy 2d1f33e59f Merge pull request #500 from metamx/batch-ingestion-fixes
Batch ingestion fixes
2014-04-22 17:59:24 -06:00
nishantmonu51 357bbf5127 add all the shard specs 2014-04-23 05:23:11 +05:30
nishantmonu51 625a5418d2 minor fix 2014-04-23 05:05:51 +05:30
nishantmonu51 1ca61237c1 review comments- use final variables 2014-04-23 03:33:28 +05:30
nishantmonu51 0d8c1ffe54 review comments and add partitioner 2014-04-23 03:30:30 +05:30
nishantmonu51 ea4a80e8d2 Add serde test for shardCount 2014-04-23 00:24:08 +05:30
nishantmonu51 e920cec5d0 remove unused import 2014-04-23 00:13:30 +05:30
nishantmonu51 0748eabe9b batch ingestion fixes
1) Fix path when mapped output is compressed
2) Add number of reducers to the determine hashed partitions job
manually
3) Add a way to disable determine partitions and specify shardCount in
HashedPartitionsSpec
2014-04-23 00:05:08 +05:30
Crystark 40a6804192 Support for postgresql
I think it was the last request using 'end' missing the postgresql support.
2014-04-07 17:37:03 +02:00
fjy 2adcf07f5f Merge branch 'master' into new-schema
Conflicts:
	indexing-hadoop/src/main/java/io/druid/indexer/DetermineHashedPartitionsJob.java
	indexing-service/src/main/java/io/druid/indexing/common/task/RealtimeIndexTask.java
	indexing-service/src/test/java/io/druid/indexing/common/task/TaskSerdeTest.java
	processing/src/test/java/io/druid/segment/TestIndex.java
	server/src/main/java/io/druid/segment/realtime/RealtimeManager.java
	server/src/main/java/io/druid/segment/realtime/plumber/RealtimePlumberSchool.java
2014-03-17 10:59:31 -07:00
nishantmonu51 4ec1959c30 Use druid implementation of HyperLogLog
remove dependency on Clearspring analytics
2014-03-07 00:06:40 +05:30
fjy 5db00afb37 clean up and default values 2014-03-04 14:38:27 -08:00
fjy c4c4d80336 make local testing pass 2014-03-03 14:52:43 -08:00
fjy 46b9ac78e7 Merge branch 'master' into new-schema
Conflicts:
	indexing-hadoop/src/test/java/io/druid/indexer/HadoopDruidIndexerConfigTest.java
	pom.xml
	publications/whitepaper/druid.pdf
	publications/whitepaper/druid.tex
2014-03-03 14:48:15 -08:00
fjy 13c7f1c7b1 remove dead code 2014-02-27 15:52:19 -08:00
fjy bf2ddda897 unit tests passing after more refactoring 2014-02-27 15:21:09 -08:00
nishantmonu51 5e0d418b4b fix determine partitions partitioner to work in local mode 2014-02-26 16:31:42 +05:30
nishantmonu51 1ed5254d5b improvements
1) Number of reducers use 1 only when intervals are to be determined
2) Read only useful bytes from BytesWritable
2014-02-26 02:51:45 +05:30
nishantmonu51 8af63005a6 refactor randomPartitionsSpec to hashedPartitionsSpec
refactor to a more appropriate name
2014-02-25 03:07:31 +05:30
fjy 5d2367f0fd unit tests pass at this point 2014-02-20 15:52:12 -08:00
fjy 20cac8c506 not compiling yet but close 2014-02-19 15:54:27 -08:00
fjy 4b7c76762d unit tests passing at this point, finished rt port maybe 2014-02-18 15:14:38 -08:00
nishantmonu51 fde7269c86 check published segments before the intermediate files are cleaned up 2014-02-15 04:30:28 +05:30
fjy 3979eb270c Revert "Revert "Merge branch 'determine-partitions-improvements'""
This reverts commit 189b3e2b9b.
2014-02-14 12:58:56 -08:00
fjy a8c4362d72 rejiggering druid api 2014-02-14 12:57:52 -08:00
fjy 189b3e2b9b Revert "Merge branch 'determine-partitions-improvements'"
This reverts commit 7ad228ceb5, reversing
changes made to 9c55e2b779.
2014-02-14 12:47:34 -08:00
nishantmonu51 48d0c37f98 documentation for random partition spec 2014-02-05 15:30:44 +05:30
nishantmonu51 bacc72415f correct locking and partitionsSpec 2014-02-05 03:17:47 +05:30
nishantmonu51 569452121e fix partitioner for local mode 2014-01-31 21:59:17 +05:30
nishantmonu51 82b748ad43 review comments 2014-01-31 20:19:33 +05:30
nishantmonu51 97e5d68635 determine intervals working with determine partitions 2014-01-31 19:04:52 +05:30
nishantmonu51 5fd76067cd remove logging and use new determine partition job 2014-01-31 13:51:38 +05:30
nishantmonu51 7ca87d59df Determine partitions using cardinality 2014-01-31 00:49:11 +05:30
fjy f898c29e20 fix batch indexing and prepare for next release 2014-01-17 15:52:04 -08:00
fjy 3b17c4c03c a whole bunch of docs and fixes 2014-01-13 18:01:56 -08:00
fjy 1ecc94cfb6 another attempt at index task 2014-01-10 17:56:22 -08:00
Hagen Rother 52746b8ea6 fix hadoop intake's parser exception catching (was too specific) 2013-12-19 07:04:47 +01:00
fjy a1c09df17f make the hadoop index task work again 2013-10-16 09:45:17 -07:00
cheddar c47fe202c7 Fix HadoopDruidIndexer to work with the new way of things
There are multiple and sundry changes in here.

First, "HadoopDruidIndexer" has been split into two pieces, (1) CliHadoop which pulls the hadoop version and builds up the right classpath with the proper hadoop version to run the indexer and (2) CliInternalHadoopIndexer which actually runs the indexer.

In order to work around a bunch of jets3t version conflicts with Hadoop and Druid, I needed to extract the S3 deep storage stuff into its own module.  I then also moved the HDFS stuff into its own module so that I could eliminate the dependency on Hadoop for druid-server.

In doing these changes, I wanted to make the extensions buildable with only the druid-api jar, so a few other things had to move out of Druid and into druid-api.  They are all API-level things, however, so they really belong in druid-api instead.

Lastly, I removed the druid-realtime module and put it all in druid-server.
2013-10-09 15:15:44 -05:00
fjy a79ad7bab4 make dynamic master resource configuration work again 2013-09-27 15:00:40 -07:00
fjy 8bc56daa66 fix things up according to code review comments 2013-09-26 11:35:45 -07:00
fjy 87259321b6 port hadoop druid indexer to new guice framework 2013-09-26 11:04:42 -07:00
cheddar 3c39f90c89 1) Move Firehose interface and dependencies to druid-api
2) Move DataSegment* interfaces and dependencies to druid-api
2013-08-31 16:43:28 -05:00
cheddar 5ab671050e No more com.metamx.druid, it is now all io.druid! 2013-08-30 19:42:12 -05:00
cheddar bd0756e360 More stuff moved, things still compiling and tests still passing. Yay! 2013-08-30 18:58:35 -05:00
cheddar 56e2b956d0 OMG!!! A lot of stuff has been moved. Modules have been created and destroyed, but everything is compiling and unit tests are passing, OMFG this is awesome.! 2013-08-30 18:21:04 -05:00
cheddar 2a46086e20 1) Didn't remove the io.druid files from client. Remove those and make sure things compile
2) Switch DefaultObjectMapper to CommonObjectMapper
3) Create new DefaultObjectMapper in client that has Query stuff registered on it by default
2013-08-29 15:25:36 -05:00
cheddar 9c30ced5ea 1) Move various "api" classes to io.druid packages and make sure things compile and stuff 2013-08-28 15:51:02 -05:00
cheddar 5fa944dd26 Merge branch 'master' into guice
Conflicts:
	client/src/main/java/com/metamx/druid/coordination/BatchDataSegmentAnnouncer.java
	client/src/main/java/com/metamx/druid/curator/announcement/Announcer.java
	client/src/main/java/com/metamx/druid/query/filter/SelectorDimFilter.java
	client/src/main/java/com/metamx/druid/query/search/SearchQueryQueryToolChest.java
	indexing-service/src/main/java/com/metamx/druid/indexing/common/tasklogs/S3TaskLogs.java
	indexing-service/src/main/java/com/metamx/druid/indexing/coordinator/ForkingTaskRunner.java
	indexing-service/src/main/java/com/metamx/druid/indexing/coordinator/RemoteTaskRunner.java
	indexing-service/src/main/java/com/metamx/druid/indexing/worker/WorkerCuratorCoordinator.java
	indexing-service/src/test/java/com/metamx/druid/indexing/coordinator/RemoteTaskRunnerTest.java
	pom.xml
	server/src/main/java/com/metamx/druid/http/MasterMain.java
	server/src/main/java/com/metamx/druid/http/MasterServletModule.java
	server/src/main/java/com/metamx/druid/master/DruidMasterConfig.java
	server/src/test/java/com/metamx/druid/master/DruidMasterTest.java
	server/src/test/java/com/metamx/druid/query/group/GroupByQueryRunnerTest.java
2013-08-27 14:27:32 -05:00
fjy d11d0a8284 fix according to code review 2013-08-22 10:49:46 -07:00
fjy 778fd0f10e Fix persist of empty indexes in index generator job 2013-08-22 10:16:43 -07:00
cheddar eee1efdcb5 Merge branch 'master' into guice
Conflicts:
	client/src/main/java/com/metamx/druid/client/DruidServerConfig.java
	indexing-service/src/main/java/com/metamx/druid/indexing/common/index/ChatHandlerProvider.java
	indexing-service/src/main/java/com/metamx/druid/indexing/coordinator/TaskMasterLifecycle.java
	indexing-service/src/main/java/com/metamx/druid/indexing/worker/executor/ExecutorNode.java
	indexing-service/src/test/java/com/metamx/druid/indexing/coordinator/TaskLifecycleTest.java
2013-08-06 13:33:31 -07:00
cheddar 3c808b15c3 1) Fix HadoopDruidIndexerConfigTest to actually verify the current correct behavior. 2013-08-05 11:37:20 -07:00
cheddar 2b71505421 1) Fix HadoopDruidIndexerConfig to no longer replace ":" with "_" on the segmentOutputDir. The segmentOutputDir is user-supplied so they should have the ability to just not set a bad directory. 2013-08-05 11:22:26 -07:00
cheddar 2361e0112a Make it all compile again... 2013-08-02 10:14:46 -07:00
cheddar 9e78bb38f5 Merge branch 'master' into guice
Conflicts:
	client/src/main/java/com/metamx/druid/QueryableNode.java
	client/src/main/java/com/metamx/druid/client/ServerInventoryView.java
	client/src/main/java/com/metamx/druid/coordination/SingleDataSegmentAnnouncer.java
	client/src/main/java/com/metamx/druid/initialization/CuratorDiscoveryConfig.java
	client/src/main/java/com/metamx/druid/query/MetricsEmittingExecutorService.java
	indexing-hadoop/src/test/java/com/metamx/druid/indexer/HadoopDruidIndexerConfigTest.java
	indexing-service/src/main/java/com/metamx/druid/indexing/common/TaskToolbox.java
	indexing-service/src/main/java/com/metamx/druid/indexing/coordinator/http/IndexerCoordinatorNode.java
	indexing-service/src/main/java/com/metamx/druid/indexing/worker/executor/ExecutorNode.java
	indexing-service/src/main/java/com/metamx/druid/indexing/worker/http/WorkerNode.java
	pom.xml
	server/src/main/java/com/metamx/druid/coordination/ServerManager.java
	server/src/main/java/com/metamx/druid/coordination/ZkCoordinator.java
	server/src/main/java/com/metamx/druid/db/DatabaseRuleManager.java
	server/src/main/java/com/metamx/druid/db/DatabaseSegmentManager.java
	server/src/main/java/com/metamx/druid/http/ComputeNode.java
	server/src/main/java/com/metamx/druid/http/MasterMain.java
	server/src/main/java/com/metamx/druid/loading/SegmentLoaderConfig.java
	server/src/main/java/com/metamx/druid/loading/SingleSegmentLoader.java
	server/src/main/java/com/metamx/druid/master/DruidMaster.java
2013-08-01 16:42:47 -07:00
Jan Rudert ad087a7a22 correct segment path for hadoop indexer 2013-07-10 09:21:45 +02:00
cheddar 2f56c24259 1) Inject IndexingServiceClient
2) Switch all the DBI references to IDBI
2013-06-07 17:37:33 -07:00
cheddar f68df7ab69 1) Make tests work and continue trying to make the DruidMaster start up with just Guice 2013-06-07 12:01:46 -07:00
fjy 42cc87a294 Merge branch 'master' into refactor-indexing
Conflicts:
	indexing-service/src/main/java/com/metamx/druid/indexing/common/task/IndexTask.java
	pom.xml
2013-05-31 17:28:59 -07:00
fjy 08d84001ba Merge branch 'master' into refactor-indexing 2013-05-16 16:03:29 -07:00
fjy 26e0eb62cb merge and other refactorings 2013-05-15 17:28:08 -07:00