druid

Commit Graph

Author	SHA1	Message	Date
Xavier Léauté	fa6142e217	cleanup and remove unused imports	2015-11-11 12:25:21 -08:00
Charles Allen	abae47850a	Add backwards compatability for PR #1922	2015-11-11 10:27:00 -08:00
Gian Merlino	dfbd0e2b60	Merge pull request #1925 from gianm/fix-index-generator Fix reference to INDEX_MAKER in IndexGeneratorJob.	2015-11-06 09:56:30 -08:00
Gian Merlino	75122dc396	Fix reference to INDEX_MAKER in IndexGeneratorJob.	2015-11-06 09:19:58 -08:00
Himanshu Gupta	6bed633121	do not use LoggingProcessIndicator in IndexGeneratorJob because that uses Stopwatch methods from guava not available in older guava versions, this makes the behavior same as LegacyIndexGeneratorJob	2015-11-06 00:40:51 -06:00
Charles Allen	929b981710	Change DefaultObjectMapper to NOT overwrite final fields unless explicitly asked to	2015-11-05 18:10:13 -08:00
Xavier Léauté	223d1ebe9f	fix a very old todo	2015-11-05 13:00:30 -08:00
fjy	8f231fd3e3	cleanup druid codebase	2015-11-04 13:59:53 -08:00
Himanshu Gupta	84f7d8d264	making static final variables in HadoopDruidIndexerConfig upper case	2015-11-02 23:24:26 -06:00
Himanshu Gupta	8b67417ac8	make methods in Index[Merger,Maker,IO] non-static so that they can have appropriate ObjectMapper injected instead of creating one statically	2015-11-02 23:24:26 -06:00
Nishant	3641a0e553	Fix Race in jar upload during hadoop indexing - https://github.com/druid-io/druid/issues/582 few fixes delete intermediate file early better exception handling use static pattern instead of compiling it every time Add retry for transient exceptions remove usage of deprecated method. Add test fix imports fix javadoc review comment. review comment: handle crazy snapshot naming review comments remove default retry count in favour of already present constant review comment make random intermediate and final paths. review comment, use temporaryFolder where possible	2015-10-22 21:41:07 +05:30
Himanshu Gupta	0368260018	For dataSource inputSpec in hadoop batch ingestion, use configured query granularity for reading existing segments instead of NONE	2015-10-12 22:19:44 -05:00
Gian Merlino	3aba401ee0	SQLMetadataConnector: Retry table creation, in case something goes wrong. Also rejigger table creation methods to not take a DBI. It's already available inside the connector, and everyone was just using that one anyway.	2015-09-24 21:39:36 -07:00
Himanshu Gupta	e8b9ee85a7	HadoopyStringInputRowParser to convert stringy Text, BytesWritable etc into InputRow	2015-09-16 10:58:13 -05:00
Himanshu Gupta	74f4572bd4	Lazily deserialize "parser" to InputRowParser in DataSchema so that user hadoop related InputRowParsers are created only when needed this allows overlord to accept a HadoopIndexTask with a hadoopy InputRowParser and not fail because hadoopy InputRowParser might need hadoop libraries	2015-09-16 10:58:13 -05:00
Himanshu Gupta	9ca6106128	user specified hadoop settings are ignored if explicitly set in code	2015-08-31 10:50:18 -05:00
Gian Merlino	940e1aa3eb	Replace funky imports with standard ones. 1) Lots of Guava imports were not coming from the actual Guava 2) junit.framework.Assert should be org.junit.Assert	2015-08-28 18:02:05 -07:00
jon-wei	e5c4927b14	Add support for parsing BytesWritable strings to Hadoop Indexer	2015-08-28 14:27:14 -07:00
Gian Merlino	414a6fb477	Fix overlapping segments in IngestSegmentFirehose, DatasourceInputFormat. Fixes #1678. IngestSegmentFirehose (and its users) need to remember which windows of which segments should actually be read, based on a timeline.	2015-08-28 07:32:41 -07:00
Himanshu Gupta	2e0dd1d792	adding UTs and addressing review comments to firehoseV2 addition to Realtime[Manager|Plumber], essential segment metadata persist support, kafka-simple-consumer-firehose extension patch	2015-08-27 20:50:46 -05:00
lvjq	2237a8cf0f	kafka 8 simple consumer firehose	2015-08-27 20:50:46 -05:00
Charles Allen	e38cf54bc8	Migrate TestDerbyConnector to a JUnit @Rule	2015-08-26 21:47:40 -07:00
Himanshu Gupta	b3c570e78d	update BatchDeltaIngestion.testDeltaIngestion(..) to check for proper glob path handling	2015-08-20 21:36:34 -05:00
Himanshu Gupta	85e3ce9096	split hadoop glob path before adding it to MultipleInputs This can be safely reverted once https://issues.apache.org/jira/browse/MAPREDUCE-5061 is fixed	2015-08-20 21:36:34 -05:00
Himanshu Gupta	a603bd9547	HadoopGlobPathSplitter implementation to split hadoop glob paths This can be safely reverted once https://issues.apache.org/jira/browse/MAPREDUCE-5061 is fixed	2015-08-20 21:36:34 -05:00
Himanshu Gupta	cf3ec8eb46	helpful cause explaining why SegmentDescriptorInfo did not exist	2015-08-19 10:29:04 -05:00
Himanshu Gupta	a3bab5b7d9	IndexGeneratorJobTest type unit test for batch delta ingestion and reindexing	2015-08-16 14:07:35 -05:00
Himanshu Gupta	15fa43dd43	changing DatasourcePathSpec, to get segment list, so that hadoop indexer uses overlord action to get list of segments and passes when running as an overlord task. and, uses metadata store directly when running as standalone hadoop indexer also, serialized list of segments is passed to DatasourcePathSpec so that hadoop classloader issues do not creep up	2015-08-16 14:07:35 -05:00
Himanshu Gupta	45947a1021	add ability to specify Multiple PathSpecs in batch ingestion, so that we can grab data from multiple places in same ingestion Conflicts: indexing-hadoop/src/main/java/io/druid/indexer/HadoopDruidIndexerConfig.java indexing-hadoop/src/main/java/io/druid/indexer/JobHelper.java Conflicts: indexing-hadoop/src/main/java/io/druid/indexer/path/PathSpec.java	2015-08-16 13:15:38 -05:00
Himanshu Gupta	1ae56f139b	Druid Hadoop InputFormat and pathSpec Conflicts: indexing-hadoop/src/main/java/io/druid/indexer/path/PathSpec.java indexing-service/pom.xml	2015-08-16 13:15:38 -05:00
Himanshu Gupta	f1d309a671	do not run parser if value from InputFormat is already an InputRow	2015-08-14 14:44:22 -05:00
Himanshu Gupta	0eec1bbee2	json serde tests for HadoopTuningConfig	2015-07-20 12:01:53 -05:00
Himanshu Gupta	f836c3a7ac	adding flag useCombiner to hadoop tuning config that can be used to add a hadoop combiner to hadoop batch ingestion to do merges on the mappers if possible	2015-07-20 12:01:53 -05:00
Himanshu Gupta	4ef484048a	take control of InputRow serde between Mapper/Reducer in Hadoop Indexing This allows for arbitrary InputFormat while hadoop batch ingestion that can return records of value type other than Text	2015-07-20 12:01:53 -05:00
Himanshu Gupta	f7a92db332	generic byte[] serde for InputRow	2015-07-20 12:01:53 -05:00
Charles Allen	b2bc46be17	Merge pull request #1484 from tubemogul/feature/1463 JobHelper.ensurePaths will set job properties from config (tuningConf…	2015-07-07 10:58:16 -07:00
Michael Schiff	6ad451a44a	JobHelper.ensurePaths will set job properties from config (tuningConfig.jobProperties) before adding input paths to the config. Adding input paths will create Path and FileSystem instances which may depend on the values in the job config. This allows all properties to be set from the spec file, avoiding having to directly edit cluster xml files. IndexGeneratorJob.run adds job properties before adding input paths (adding input paths may depend on having job properies set) JobHelperTest confirms that JobHelper.ensurePaths adds job properties javadoc for addInputPaths to explain relationship with addJobProperties	2015-07-01 12:45:32 -07:00
Davide Anastasia	4a3a7dd1ad	read hadoop-indexer configuration file from HDFS	2015-06-24 14:08:53 -07:00
Hao Xia	1931491c9f	A couple of hdfs related fixes * Class loading issue with hdfs-storage extension * Exception when using hdfs with non-fully qualified segment path	2015-06-19 17:22:20 -07:00
Charles Allen	94a567732a	Wipe FileContext off the face of the earth * Fixes https://github.com/druid-io/druid/issues/1433 * Works arround https://issues.apache.org/jira/browse/HADOOP-10643 * Reverts to the prior method of renaming	2015-06-16 09:48:09 -07:00
Charles Allen	6230ac90ae	Use IndexMerger for conversion	2015-06-10 11:34:58 -07:00
Charles Allen	056cab93ed	Add Hadoop Converter Job and task * Fixes https://github.com/druid-io/druid/issues/1363 * Add extra utils in JobHelper based on PR feedback	2015-06-09 14:47:38 -07:00
Charles Allen	2a76bdc60a	Abstractify hadoopy indexer configuration. * Moves many items to JobHelper * Remove dependencies of these functions on HadoopDruidIndexerConfig in favor of more general items * Changes functionalities of some of the path methods to always return a path with scheme * Adds retry to uploads * Change output loadSpec determining from using outputFS.getClass().getName() to using outputFS.getScheme()	2015-06-08 10:53:27 -07:00
fjy	be2a35188e	Additional schema validations and better logs for common extensions	2015-05-27 16:25:02 -07:00
Xavier Léauté	4466e77b25	Merge pull request #1371 from guobingkun/unit_test Unit test for IndexGeneratorJob	2015-05-22 10:34:24 -04:00
flow	07659f30ab	bug fix: hdfs task log and indexing task not work properly with Hadoop HA	2015-05-21 20:49:42 +08:00
Bingkun Guo	b46aff12ae	Unit test for IndexGeneratorJob	2015-05-18 12:31:16 -05:00
Fangjin Yang	a2dc58cd2d	Merge pull request #1345 from pjain1/unit_test_warn_fix fix warn msg and some unit tests	2015-05-08 08:06:20 -07:00
Parag Jain	01448d264c	Fix warn msg and added some unit tests	2015-05-07 17:10:05 -05:00
fjy	b19435d172	fix typos with batch ingestion in docs	2015-05-07 14:46:17 -07:00
Bingkun Guo	1ee550dd91	Fix a potential issue in DeterminePartitionsJob by making HadoopDruidIndexerConfig non-static, and two unit tests for DeterminPartitionsJob and LocalDataSegmentKiller	2015-05-04 20:00:29 -07:00
Xavier Léauté	3a3046ccf3	add support for dimension compression - compression for single-value dimensions using CompressedVSizeIntsIndexedSupplier - makes dimension compression configurable via IndexSpec - IndexSpec also enables configuring bitmap and metric compression	2015-04-14 10:44:18 -07:00
Prajwal Tuladhar	3044bf5592	use Job.getInstance() to fix deprecated warnings	2015-04-09 13:22:21 -04:00
Xavier Léauté	8b5fa8f85d	always upload SNAPSHOT self-contained jars	2015-04-03 21:18:09 -07:00
Dia Kharrat	3a6dc99384	log invalid rows in mapper of Hadoop indexer	2015-03-19 22:31:04 -07:00
Dia Kharrat	58d5f5e7f0	Honor ignoreInvalidRows in Hadoop indexer The reducer of the hadoop indexer now ignores lines with parsing exceptions (if enabled by the indexer config).	2015-03-19 22:31:04 -07:00
Himanshu Gupta	8c1f0834ba	Removing MapWritableInputRowParser from indexing-hadoop it should really be an extension if user needs	2015-03-19 18:37:08 -05:00
Himanshu Gupta	3f7a7ba5d3	For batch hadoop indexing, make hadoop input format configuration. Given input format must extend from org.apache.hadoop.mapreduce.InputFormat	2015-03-18 16:09:45 -05:00
fjy	bfe10bd156	This fixes arbitrary gran spec breaking	2015-03-17 12:19:43 -07:00
Himanshu Gupta	6a0405de20	fail early if there is no input data for batch hadoop indexing	2015-03-07 12:45:57 -06:00
Himanshu Gupta	30f64ff19e	UTs update for indexing-hadoop	2015-02-25 15:45:57 -08:00
Xavier Léauté	0784d7e30e	Merge pull request #1152 from himanshug/metastorage-pwd-provider support for metadata store PasswordProvider interface	2015-02-25 15:19:37 -08:00
Fangjin Yang	708f35151d	Merge pull request #1121 from gianm/issue-1116 Use the proper FileSystems for writing segments and caching jars. (for issue #1116)	2015-02-25 13:03:59 -08:00
Fangjin Yang	6424815f88	Merge pull request #1097 from metamx/better-hadoop-sort-key Sort HadoopIndexer rows by time+dim bucket to help reduce spilling	2015-02-25 12:49:58 -08:00
Himanshu Gupta	126262edce	support for PasswordProvider interface to enable writing druid extension which can get metadata store password from secured location or anywhere instead of plain text properties file	2015-02-25 14:05:19 -06:00
Himanshu Gupta	01a4f19ea2	removing dependency on NativeS3FileSystem and other file systems	2015-02-23 14:27:50 -06:00
Gian Merlino	fd5a7d1f08	Use the proper FileSystems for writing segments and caching jars. (for issue #1116 )	2015-02-12 16:20:10 -08:00
Xavier Léauté	b1ec7afc12	Sort HadoopIndexer rows by time+dim bucket to help reduce spilling	2015-02-10 14:26:28 -08:00
Fangjin Yang	92e616de11	Merge pull request #1077 from metamx/remove-unused-imports remove unused imports	2015-02-02 10:45:27 -08:00
nishantmonu51	ba932bb1f2	remove unused imports	2015-02-02 21:53:39 +05:30
fjy	d05032b98a	towards a community led druid	2015-01-31 20:57:36 -08:00
Xavier Léauté	cd9635ff5e	Merge pull request #1034 from druid-io/minor-rename minor rename of things in hadoop ingestion config to match 0.6.x	2015-01-15 15:46:13 -08:00
fjy	ccddbf8747	minor rename of things in hadoop ingestion config to match 0.6.x	2015-01-15 14:04:55 -08:00
Fangjin Yang	5bfcc43377	Merge pull request #1008 from metamx/stringConversionJavaUtilUpdate Update all String conversions to and from byte[] to use the java-util StringUtils functions	2015-01-15 13:50:27 -08:00
Fangjin Yang	852e863425	Merge pull request #981 from druid-io/strictModuleTyping Use Module instead of generic Object in Guice related items	2015-01-05 12:43:20 -08:00
Charles Allen	b1b5c9099e	Update all String conversions to and from byte[] to use the java-util StringUtils functions * Speedup of GroupBy with javaScript filters by ~10% * Requires https://github.com/metamx/java-util/pull/15	2015-01-05 11:22:32 -08:00
Xavier Léauté	f1375b0bfb	workaround to pass down bitmap type to map-reduce tasks	2015-01-02 17:29:00 -08:00
Charles Allen	7c8d4a7433	Use Module instead of generic Object in Guice related items	2014-12-19 10:54:06 -08:00
fjy	43d27ddaf0	update http client and fix logging	2014-12-15 16:59:57 -08:00
fjy	e872952390	fix working path default bug	2014-12-15 14:51:58 -08:00
fjy	28b72a69ad	redocumenting ingestion	2014-12-08 16:15:46 -08:00
nishantmonu51	40f223215a	fix buffer pool usage	2014-12-05 16:09:26 +05:30
nishantmonu51	6e03a6245f	Merge branch 'master' into onheap-incremental-index	2014-12-05 10:40:28 +05:30
Xavier Léauté	7cd45a6e1f	IncrementalIndex throws exception if limit exceeded - For now uses a hardcoded ratio of aggregator to timeanddim buffer sizes - canAppendRow is a workaround for realtime index since the Firehose currently does not have a way of rolling back the last event in case of error - canAppendRow needs a fudge factor; there is a race between checking if we can add a row and actually adding a row, because of the way MapDB reports its size.	2014-12-04 14:38:16 -08:00
Gian Merlino	20a7239ffd	Replace google-http-client imports with real guava imports.	2014-12-04 10:57:57 -08:00
Charles Allen	c2add5730b	Fix Hadoop CLI jobs * Change "schema" --> "spec" for cli hadoop to keep up with internal hadoop * Added check for HadoopDruidIndexerConfig deserialization from Map to see if it is trying to get a HadoopDruidIndexerConfig or a HadoopIngestionSpec	2014-12-04 10:57:56 -08:00
xvrl	c867d59ee0	fix error message	2014-12-03 15:30:32 -08:00
Xavier Léauté	2e6c254937	metadata injection not needed for indexing service	2014-12-03 15:09:31 -08:00
Gian Merlino	d388a8fe89	Replace google-http-client imports with real guava imports.	2014-12-03 10:52:57 -08:00
nishantmonu51	4dc0fdba8a	consider mapped size in limit calculation & review comments	2014-12-03 23:47:30 +05:30
nishantmonu51	da8bd7836b	Introduce buffer size	2014-12-03 16:28:22 +05:30
Charles Allen	7cd689be75	Fix Hadoop CLI jobs * Change "schema" --> "spec" for cli hadoop to keep up with internal hadoop * Added check for HadoopDruidIndexerConfig deserialization from Map to see if it is trying to get a HadoopDruidIndexerConfig or a HadoopIngestionSpec	2014-12-02 11:23:04 -08:00
nishantmonu51	eac776f1a7	tests passing with on heap incremental index	2014-12-02 22:29:28 +05:30
Xavier Léauté	59542c41f8	fix port not set in DruidNode	2014-12-01 14:37:28 -08:00
Charles Allen	8b3652a67a	Modify HadoopDruidIndexerConfig to give a port of 0 instead of -1 when binding DruidNode @Self annotation	2014-12-01 14:08:41 -08:00
fjy	fdeab0c6af	make Druid case sensitive	2014-11-19 14:27:31 -08:00
nishantmonu51	f0452c5968	merge from master	2014-11-18 19:34:51 +05:30
nishantmonu51	edf0fc0851	Make hashed partitions spec default - make hashed partitionsSpec as default partitions spec for 0.7	2014-11-17 19:48:12 +05:30
nishantmonu51	0c2d06475d	merge from master	2014-11-17 19:19:18 +05:30
Xavier Léauté	0498df25df	override metadata storage injection in CliHadoopIndexer	2014-11-07 13:44:56 -08:00
Xavier Léauté	50a191425c	fix injection on MetadataStorageUpdaterJob	2014-11-07 11:11:14 -08:00
Xavier Léauté	20a9aef96a	fix test	2014-11-06 17:27:05 -08:00
Xavier Léauté	9c06db021f	rename db->metadata postgres->postgresql	2014-10-31 10:30:27 -07:00
jisookim0513	aa754b86e8	build success!	2014-10-24 11:28:42 -07:00
fjy	bef74104d9	merge with 0.7.x and resolve any conflicts	2014-10-23 17:24:06 -07:00
fjy	d76d57d95d	update docs	2014-10-22 16:16:28 -07:00
jisookim0513	37979282fe	enabled ansi-quote in mysql; insert statement should now work	2014-10-21 00:09:19 -07:00
jisookim0513	7d5c5f2083	fixed createTable; fixed miscellaneous stuff; added DerbyMetadataRuleManagerProvider	2014-10-17 00:10:36 -07:00
nishantmonu51	41e88baeca	Add test for bucket selection	2014-10-15 23:09:28 +05:30
nishantmonu51	f4a97aebbc	fix rollup for hashed partitions truncate timestamp while calculating the partitionNumber	2014-10-15 22:32:56 +05:30
nishantmonu51	b5d66381f3	more cleanup	2014-10-14 18:32:40 +05:30
nishantmonu51	454acd3f5a	remove backwards compatible code 1) remove backwards compatible and deprecated code 2) make hashed partitions spec default	2014-10-13 19:30:44 +05:30
fjy	c7b4d5b7b4	Merge branch 'master' into druid-0.7.x Conflicts: processing/src/test/java/io/druid/segment/filter/SpatialFilterTest.java	2014-10-02 18:12:10 -07:00
nishantmonu51	ad75a21040	separate offheapIncrementalIndex implementation	2014-10-01 13:58:51 +05:30
jisookim0513	9d7b5d9b0f	fixed javadoc; fixed pom files; deleted unnecessary class	2014-09-30 13:47:35 -07:00
nishantmonu51	358ff915bb	fix merge conflicts	2014-09-30 22:19:18 +05:30
nishantmonu51	2789536bed	merge changes from druid-0.7.x	2014-09-30 22:05:49 +05:30
nishantmonu51	61c7fd2e6e	make ingestOffheap tuneable	2014-09-30 15:30:02 +05:30
nishantmonu51	adb4a65e0a	Merge branch 'offheap-incremental-index' into mapdb-branch	2014-09-29 12:38:31 +05:30
jisookim0513	74565c9371	cleaned up the code	2014-09-27 13:10:01 -07:00
jisookim0513	aa887edb73	added two seperate modules for mysql and postgres	2014-09-27 13:08:53 -07:00
flow	2dd62979bb	Fixed the issue of batch ingestion with indexing service to hdfs end up with the path of metadata in mysql missing "hdfs://host" prefix. The detail describe can be found here: https://groups.google.com/forum/#!topic/druid-development/ofvSxiPpCxI	2014-09-27 22:26:52 +08:00
jisookim0513	6a641621b2	finished merging into druid-0.7.x; derby not working (to be fixed)	2014-09-26 14:24:53 -07:00
jisookim0513	43cc6283d3	trying to revert files that have overwritten changes	2014-09-26 12:38:04 -07:00
fjy	eaf0a48b92	Merge branch 'master' into druid-0.7.x Conflicts: cassandra-storage/pom.xml common/pom.xml examples/pom.xml hdfs-storage/pom.xml histogram/pom.xml indexing-hadoop/pom.xml indexing-service/pom.xml kafka-eight/pom.xml kafka-seven/pom.xml pom.xml processing/pom.xml processing/src/main/java/io/druid/guice/PropertiesModule.java rabbitmq/pom.xml s3-extensions/pom.xml server/pom.xml services/pom.xml	2014-09-26 11:39:24 -07:00
jisookim0513	3bf39cc9f8	attempted to fix merge-conflicts	2014-09-24 15:55:42 -07:00
nishantmonu51	f51ab84386	merge changes from druid-0.7.x	2014-09-22 23:48:45 +05:30
nishantmonu51	443e5788fb	make OffheapIncrementalIndex tuneable	2014-09-22 19:26:10 +05:30
jisookim0513	273205f217	initial attempt for abstraction; druid cluster works with Derby as a default	2014-09-19 17:39:59 -07:00
nishantmonu51	8eb6466487	revert buffer size and add back rowFlushBoundary	2014-09-19 23:06:04 +05:30
Xavier Léauté	d501b052ea	remove unused columnConfig	2014-09-15 13:02:47 -07:00
Xavier Léauté	e57e2d97ba	make constants final	2014-09-15 12:53:40 -07:00
fjy	469ccbbe5e	Merge branch 'master' into druid-0.7.x Conflicts: cassandra-storage/pom.xml common/pom.xml examples/pom.xml hdfs-storage/pom.xml histogram/pom.xml indexing-hadoop/pom.xml indexing-service/pom.xml kafka-eight/pom.xml kafka-seven/pom.xml pom.xml processing/pom.xml processing/src/main/java/io/druid/query/FinalizeResultsQueryRunner.java processing/src/main/java/io/druid/query/UnionQueryRunner.java processing/src/main/java/io/druid/query/groupby/GroupByQueryRunnerFactory.java processing/src/main/java/io/druid/query/topn/TopNQueryEngine.java processing/src/main/java/io/druid/query/topn/TopNQueryRunnerFactory.java rabbitmq/pom.xml s3-extensions/pom.xml server/pom.xml server/src/test/java/io/druid/server/initialization/JettyTest.java services/pom.xml	2014-09-11 16:20:50 -07:00
fjy	fec7b43fcb	make making v9 segments something completely configurable	2014-09-10 15:28:30 -07:00
fjy	351afb8be7	allow legacy index generator	2014-09-09 17:04:35 -07:00
Xavier Léauté	58ab759fc6	remove unused imports	2014-08-29 14:03:47 -07:00
Xavier Léauté	ac05836833	make Java 8 javadoc happy	2014-08-29 13:58:50 -07:00
fjy	12f4147df5	switch index gen job to use logging indicator	2014-08-21 13:28:15 -07:00
fjy	d64879ccca	more cleanup	2014-08-20 13:22:42 -07:00
fjy	bb73b2556e	fix compilation	2014-08-20 13:17:00 -07:00
fjy	92f26d9a1f	cleanup rowflushboundary	2014-08-20 13:09:37 -07:00
nishantmonu51	79ff993b31	increase default buffer size to 512m	2014-08-20 22:15:06 +05:30
nishantmonu51	33354cf7fe	replace maxRowsInMemory with BufferSize	2014-08-20 20:59:44 +05:30
fjy	88a904e0b3	address cr about progress ind	2014-08-19 12:59:01 -07:00
nishantmonu51	c6712739dc	merge changes from druid-0.7.x	2014-08-12 15:47:42 +05:30
nishantmonu51	9598a524a8	review comment - move index closure to finally	2014-08-12 14:58:55 +05:30
nishantmonu51	637bd35785	merge changes from druid-0.7.x	2014-07-31 16:07:22 +05:30
nishantmonu51	4ce12470a1	Add way to skip determine partitions for index task Add a way to skip determinePartitions for IndexTask by manually specifying numShards.	2014-07-18 18:52:15 +05:30
nishantmonu51	f5f05e3a9b	Sync changes from branch new-ingestion PR #599 Sync and Resolve Conflicts	2014-07-11 16:15:10 +05:30
nishantmonu51	fa43049240	review comments & pom changes	2014-07-10 11:48:46 +05:30
nishantmonu51	36fc85736c	Add ShardSpec Lookup Optimize choosing shardSpec for Hash Partitions	2014-07-08 18:01:31 +05:30
fjy	4c40e71e54	address cr	2014-06-19 14:48:46 -07:00
fjy	a870fe5cbe	inject column config	2014-06-19 14:47:57 -07:00
Xavier Léauté	09346b0a3c	make column cache configurable	2014-06-19 14:43:03 -07:00
fjy	a63cda3281	Merge branch 'master' into new-guava Conflicts: server/src/main/java/io/druid/server/QueryResource.java	2014-06-13 10:08:10 -07:00
nishantmonu51	a7e19ad892	configure buffer sizes	2014-06-12 19:32:37 +05:30
nishantmonu51	6265613bb9	Merge branch 'master' into offheap-incremental-index	2014-06-05 17:42:57 +05:30
nishantmonu51	01e8a713b6	unit tests passing with offheap-indexing	2014-06-05 17:42:53 +05:30
Gian Merlino	1ca7bf03b8	IndexGeneratorJob needs to respect isCombineText, too.	2014-06-04 17:54:31 -07:00
fjy	adc00f2bcf	make combine text configurable	2014-06-04 16:24:56 -07:00
fjy	bb4105ed1a	fix broken standalone hadoop ingestion	2014-06-04 09:23:46 -07:00
fjy	77ec4df797	update guava, java-util, and druid-api	2014-06-03 13:43:38 -07:00
fjy	4c13327297	more logging for determine hashed	2014-05-30 16:19:20 -07:00
fjy	7be93a770a	make all firehoses work with tasks, add a lot more documentation about configuration	2014-05-28 16:33:59 -07:00
Deepak	7d92cf2b3b	Update IndexGeneratorJob.java CombineTextInputFormat instead of TextInputFormat combines multiple splits for a single mapper and reduces the strain on hadoop platform. It greatly improves job completion time as there are fewer number of mappers to bookkeep.	2014-05-22 15:08:12 +05:30
Deepak	de0a7b27e7	Update DetermineHashedPartitionsJob.java CombineTextInputFormat instead of TextInputFormat combines multiple splits for a single mapper and reduces the strain on hadoop platform. It greatly improves job completion time as there are fewer number of mappers to bookkeep.	2014-05-22 15:06:56 +05:30
Xavier Léauté	9ec7c71e0f	fix compilation error with updated druid-api	2014-05-19 14:06:23 -07:00
fjy	1100d2f2a1	rename configs to make a bit more sense	2014-05-06 14:52:50 -07:00
fjy	b6fb4245aa	Merge branch 'master' into new-schema Conflicts: indexing-hadoop/src/main/java/io/druid/indexer/HadoopDriverConfig.java indexing-hadoop/src/main/java/io/druid/indexer/HadoopDruidIndexerConfig.java indexing-hadoop/src/main/java/io/druid/indexer/HadoopDruidIndexerConfigBuilder.java pom.xml server/src/main/java/io/druid/segment/realtime/RealtimeManager.java server/src/main/java/io/druid/segment/realtime/firehose/EventReceiverFirehoseFactory.java	2014-05-06 14:32:51 -07:00
Gian Merlino	bdf9e74a3b	Allow config-based overriding of hadoop job properties.	2014-05-06 09:11:31 -07:00
fjy	f9523274ac	remove extra println	2014-05-01 15:06:51 -07:00
nishantmonu51	5137031304	use same logic for compression Use same logic for compression across creating files, reading from files, and checking file existence	2014-05-01 15:20:47 +05:30
nishantmonu51	728f1e8ee3	fix exists check with compression	2014-05-01 15:01:10 +05:30
nishantmonu51	01e84f10b7	add the checks again. removing these checks breaks when there is no data for any interval	2014-05-01 14:35:09 +05:30
fjy	76e0a48527	Merge branch 'master' into new-schema Conflicts: indexing-hadoop/src/main/java/io/druid/indexer/DbUpdaterJob.java indexing-hadoop/src/test/java/io/druid/indexer/HadoopDruidIndexerConfigTest.java indexing-service/src/main/java/io/druid/indexing/common/task/HadoopIndexTask.java server/src/main/java/io/druid/segment/realtime/plumber/RealtimePlumber.java server/src/main/java/io/druid/segment/realtime/plumber/RealtimePlumberSchool.java	2014-04-25 14:03:28 -07:00
fjy	2d1f33e59f	Merge pull request #500 from metamx/batch-ingestion-fixes Batch ingestion fixes	2014-04-22 17:59:24 -06:00
nishantmonu51	357bbf5127	add all the shard specs	2014-04-23 05:23:11 +05:30
nishantmonu51	625a5418d2	minor fix	2014-04-23 05:05:51 +05:30
nishantmonu51	1ca61237c1	review comments- use final variables	2014-04-23 03:33:28 +05:30
nishantmonu51	0d8c1ffe54	review comments and add partitioner	2014-04-23 03:30:30 +05:30
nishantmonu51	ea4a80e8d2	Add serde test for shardCount	2014-04-23 00:24:08 +05:30
nishantmonu51	e920cec5d0	remove unused import	2014-04-23 00:13:30 +05:30
nishantmonu51	0748eabe9b	batch ingestion fixes 1) Fix path when mapped output is compressed 2) Add number of reducers to the determine hashed partitions job manually 3) Add a way to disable determine partitions and specify shardCount in HashedPartitionsSpec	2014-04-23 00:05:08 +05:30
Crystark	40a6804192	Support for postgresql I think it was the last request using 'end' missing the postgresql support.	2014-04-07 17:37:03 +02:00
fjy	2adcf07f5f	Merge branch 'master' into new-schema Conflicts: indexing-hadoop/src/main/java/io/druid/indexer/DetermineHashedPartitionsJob.java indexing-service/src/main/java/io/druid/indexing/common/task/RealtimeIndexTask.java indexing-service/src/test/java/io/druid/indexing/common/task/TaskSerdeTest.java processing/src/test/java/io/druid/segment/TestIndex.java server/src/main/java/io/druid/segment/realtime/RealtimeManager.java server/src/main/java/io/druid/segment/realtime/plumber/RealtimePlumberSchool.java	2014-03-17 10:59:31 -07:00
nishantmonu51	4ec1959c30	Use druid implementation of HyperLogLog remove dependency on clear spring analytics	2014-03-07 00:06:40 +05:30
fjy	5db00afb37	clean up and default values	2014-03-04 14:38:27 -08:00
fjy	c4c4d80336	make local testing pass	2014-03-03 14:52:43 -08:00
fjy	46b9ac78e7	Merge branch 'master' into new-schema Conflicts: indexing-hadoop/src/test/java/io/druid/indexer/HadoopDruidIndexerConfigTest.java pom.xml publications/whitepaper/druid.pdf publications/whitepaper/druid.tex	2014-03-03 14:48:15 -08:00
fjy	13c7f1c7b1	remove dead code	2014-02-27 15:52:19 -08:00
fjy	bf2ddda897	unit tests passing after more refactoring	2014-02-27 15:21:09 -08:00
nishantmonu51	5e0d418b4b	fix determine partitions partitioner to work in local mode	2014-02-26 16:31:42 +05:30
nishantmonu51	1ed5254d5b	improvements 1) Number of reducers use 1 only when intervals are to be determined 2) Read only useful bytes from BytesWritable	2014-02-26 02:51:45 +05:30
nishantmonu51	8af63005a6	refactor randomPartitionsSpec to hashedPartitionsSpec refactor to a more appropriate name	2014-02-25 03:07:31 +05:30
fjy	5d2367f0fd	unit tests pass at this point	2014-02-20 15:52:12 -08:00
fjy	20cac8c506	not compiling yet but close	2014-02-19 15:54:27 -08:00
fjy	4b7c76762d	unit tests passingn at this point, finished rt port maybe	2014-02-18 15:14:38 -08:00
nishantmonu51	fde7269c86	check published segments before the intermediate files are cleaned up	2014-02-15 04:30:28 +05:30
fjy	3979eb270c	Revert "Revert "Merge branch 'determine-partitions-improvements'"" This reverts commit `189b3e2b9b`.	2014-02-14 12:58:56 -08:00
fjy	a8c4362d72	rejiggering druid api	2014-02-14 12:57:52 -08:00
fjy	189b3e2b9b	Revert "Merge branch 'determine-partitions-improvements'" This reverts commit `7ad228ceb5`, reversing changes made to `9c55e2b779`.	2014-02-14 12:47:34 -08:00
nishantmonu51	48d0c37f98	documentation for random partition spec	2014-02-05 15:30:44 +05:30
nishantmonu51	bacc72415f	correct locking and partitionsSpec	2014-02-05 03:17:47 +05:30
nishantmonu51	569452121e	fix partitioner for loca mode	2014-01-31 21:59:17 +05:30
nishantmonu51	82b748ad43	review comments	2014-01-31 20:19:33 +05:30
nishantmonu51	97e5d68635	determine intervals working with determine partitions	2014-01-31 19:04:52 +05:30
nishantmonu51	5fd76067cd	remove logging and use new determine partition job	2014-01-31 13:51:38 +05:30
nishantmonu51	7ca87d59df	Determine partitions using cardinality	2014-01-31 00:49:11 +05:30
fjy	f898c29e20	fix batch indexing and prepare for next release	2014-01-17 15:52:04 -08:00
fjy	3b17c4c03c	a whole bunch of docs and fixes	2014-01-13 18:01:56 -08:00
fjy	1ecc94cfb6	another attempt at index task	2014-01-10 17:56:22 -08:00
Hagen Rother	52746b8ea6	fix hadoop intake's parser exception catching (was too specific)	2013-12-19 07:04:47 +01:00
fjy	a1c09df17f	make the hadoop index task work again	2013-10-16 09:45:17 -07:00
cheddar	c47fe202c7	Fix HadoopDruidIndexer to work with the new way of things There are multiple and sundry changes in here. First, "HadoopDruidIndexer" has been split into two pieces, (1) CliHadoop which pulls the hadoop version and builds up the right classpath with the proper hadoop version to run the indexer and (2) CliInternalHadoopIndexer which actually runs the indexer. In order to work around a bunch of jets3t version conflicts with Hadoop and Druid, I needed to extract the S3 deep storage stuff into its own module. I then also moved the HDFS stuff into its own module so that I could eliminate the dependency on Hadoop for druid-server. In doing these changes, I wanted to make the extensions buildable with only the druid-api jar, so a few other things had to move out of Druid and into druid-api. They are all API-level things, however, so they really belong in druid-api instead. Lastly, I removed the druid-realtime module and put it all in druid-server.	2013-10-09 15:15:44 -05:00
fjy	a79ad7bab4	make dynamic master resource configuration work again	2013-09-27 15:00:40 -07:00
fjy	8bc56daa66	fix things up according to code review comments	2013-09-26 11:35:45 -07:00
fjy	87259321b6	port hadoop druid indexer to new guice framework	2013-09-26 11:04:42 -07:00
cheddar	3c39f90c89	1) Move Firehose interface and dependencies to druid-api 2) Move DataSegment* interfaces and dependencies to druid-api	2013-08-31 16:43:28 -05:00
cheddar	5ab671050e	No more com.metamx.druid, it is now all io.druid!	2013-08-30 19:42:12 -05:00
cheddar	bd0756e360	More stuff moved, things still compiling and tests still passing. Yay!	2013-08-30 18:58:35 -05:00
cheddar	56e2b956d0	OMG!!! A lot of stuff has been moved. Modules have been created and destroyed, but everything is compiling and unit tests are passing, OMFG this is awesome.!	2013-08-30 18:21:04 -05:00
cheddar	2a46086e20	1) Didn't remove the io.druid files from client. Remove those and make sure things compile 2) Switch DefaultObjectMapper to CommonObjectMapper 3) Create new DefaultObjectMapper in client that has Query stuff registered on it by default	2013-08-29 15:25:36 -05:00
cheddar	9c30ced5ea	1) Move various "api" classes to io.druid packages and make sure things compile and stuff	2013-08-28 15:51:02 -05:00
cheddar	5fa944dd26	Merge branch 'master' into guice Conflicts: client/src/main/java/com/metamx/druid/coordination/BatchDataSegmentAnnouncer.java client/src/main/java/com/metamx/druid/curator/announcement/Announcer.java client/src/main/java/com/metamx/druid/query/filter/SelectorDimFilter.java client/src/main/java/com/metamx/druid/query/search/SearchQueryQueryToolChest.java indexing-service/src/main/java/com/metamx/druid/indexing/common/tasklogs/S3TaskLogs.java indexing-service/src/main/java/com/metamx/druid/indexing/coordinator/ForkingTaskRunner.java indexing-service/src/main/java/com/metamx/druid/indexing/coordinator/RemoteTaskRunner.java indexing-service/src/main/java/com/metamx/druid/indexing/worker/WorkerCuratorCoordinator.java indexing-service/src/test/java/com/metamx/druid/indexing/coordinator/RemoteTaskRunnerTest.java pom.xml server/src/main/java/com/metamx/druid/http/MasterMain.java server/src/main/java/com/metamx/druid/http/MasterServletModule.java server/src/main/java/com/metamx/druid/master/DruidMasterConfig.java server/src/test/java/com/metamx/druid/master/DruidMasterTest.java server/src/test/java/com/metamx/druid/query/group/GroupByQueryRunnerTest.java	2013-08-27 14:27:32 -05:00
fjy	d11d0a8284	fix according to code review	2013-08-22 10:49:46 -07:00
fjy	778fd0f10e	Fix persist of empty indexes in index generator job	2013-08-22 10:16:43 -07:00
cheddar	eee1efdcb5	Merge branch 'master' into guice Conflicts: client/src/main/java/com/metamx/druid/client/DruidServerConfig.java indexing-service/src/main/java/com/metamx/druid/indexing/common/index/ChatHandlerProvider.java indexing-service/src/main/java/com/metamx/druid/indexing/coordinator/TaskMasterLifecycle.java indexing-service/src/main/java/com/metamx/druid/indexing/worker/executor/ExecutorNode.java indexing-service/src/test/java/com/metamx/druid/indexing/coordinator/TaskLifecycleTest.java	2013-08-06 13:33:31 -07:00
cheddar	3c808b15c3	1) Fix HadoopDruidIndexerConfigTest to actually verify the current correct behavior.	2013-08-05 11:37:20 -07:00
cheddar	2b71505421	1) Fix HadoopDruidIndexerConfig to no longer replace ":" with "_" on the segmentOutputDir. The segmentOutputDir is user-supplied so they should have the ability to just not set a bad directory.	2013-08-05 11:22:26 -07:00
cheddar	2361e0112a	Make it all compile again...	2013-08-02 10:14:46 -07:00
cheddar	9e78bb38f5	Merge branch 'master' into guice Conflicts: client/src/main/java/com/metamx/druid/QueryableNode.java client/src/main/java/com/metamx/druid/client/ServerInventoryView.java client/src/main/java/com/metamx/druid/coordination/SingleDataSegmentAnnouncer.java client/src/main/java/com/metamx/druid/initialization/CuratorDiscoveryConfig.java client/src/main/java/com/metamx/druid/query/MetricsEmittingExecutorService.java indexing-hadoop/src/test/java/com/metamx/druid/indexer/HadoopDruidIndexerConfigTest.java indexing-service/src/main/java/com/metamx/druid/indexing/common/TaskToolbox.java indexing-service/src/main/java/com/metamx/druid/indexing/coordinator/http/IndexerCoordinatorNode.java indexing-service/src/main/java/com/metamx/druid/indexing/worker/executor/ExecutorNode.java indexing-service/src/main/java/com/metamx/druid/indexing/worker/http/WorkerNode.java pom.xml server/src/main/java/com/metamx/druid/coordination/ServerManager.java server/src/main/java/com/metamx/druid/coordination/ZkCoordinator.java server/src/main/java/com/metamx/druid/db/DatabaseRuleManager.java server/src/main/java/com/metamx/druid/db/DatabaseSegmentManager.java server/src/main/java/com/metamx/druid/http/ComputeNode.java server/src/main/java/com/metamx/druid/http/MasterMain.java server/src/main/java/com/metamx/druid/loading/SegmentLoaderConfig.java server/src/main/java/com/metamx/druid/loading/SingleSegmentLoader.java server/src/main/java/com/metamx/druid/master/DruidMaster.java	2013-08-01 16:42:47 -07:00
Jan Rudert	ad087a7a22	correct segment path for hadoop indexer	2013-07-10 09:21:45 +02:00
cheddar	2f56c24259	1) Inject IndexingServiceClient 2) Switch all the DBI references to IDBI	2013-06-07 17:37:33 -07:00
cheddar	f68df7ab69	1) Make tests work and continue trying to make the DruidMaster start up with just Guice	2013-06-07 12:01:46 -07:00
fjy	42cc87a294	Merge branch 'master' into refactor-indexing Conflicts: indexing-service/src/main/java/com/metamx/druid/indexing/common/task/IndexTask.java pom.xml	2013-05-31 17:28:59 -07:00
fjy	08d84001ba	Merge branch 'master' into refactor-indexing	2013-05-16 16:03:29 -07:00
fjy	26e0eb62cb	merge and other refactorings	2013-05-15 17:28:08 -07:00

487 Commits