druid

Commit Graph

Author	SHA1	Message	Date
Himanshu Gupta	aeffeaf3e2	fixing hadoop test scope dependencies in indexing-hadoop	2015-10-26 17:09:39 -05:00
Nishant	3641a0e553	Fix Race in jar upload during hadoop indexing - https://github.com/druid-io/druid/issues/582 few fixes delete intermediate file early better exception handling use static pattern instead of compiling it every time Add retry for transient exceptions remove usage of deprecated method. Add test fix imports fix javadoc review comment. review comment: handle crazy snapshot naming review comments remove default retry count in favour of already present constant review comment make random intermediate and final paths. review comment, use temporaryFolder where possible	2015-10-22 21:41:07 +05:30
Xavier Léauté	e4ac78e43d	bump next snapshot to 0.9.0	2015-10-20 13:46:13 -07:00
Xavier Léauté	4c2c7a2c37	update version to 0.8.3	2015-10-14 21:40:55 -07:00
Himanshu Gupta	0368260018	For dataSource inputSpec in hadoop batch ingestion, use configured query granularity for reading existing segments instead of NONE	2015-10-12 22:19:44 -05:00
Gian Merlino	3aba401ee0	SQLMetadataConnector: Retry table creation, in case something goes wrong. Also rejigger table creation methods to not take a DBI. It's already available inside the connector, and everyone was just using that one anyway.	2015-09-24 21:39:36 -07:00
Himanshu Gupta	e8b9ee85a7	HadoopyStringInputRowParser to convert stringy Text, BytesWritable etc into InputRow	2015-09-16 10:58:13 -05:00
Himanshu Gupta	74f4572bd4	Lazily deserialize "parser" to InputRowParser in DataSchema so that user hadoop related InputRowParsers are created only when needed this allows overlord to accept a HadoopIndexTask with a hadoopy InputRowParser and not fail because hadoopy InputRowParser might need hadoop libraries	2015-09-16 10:58:13 -05:00
Himanshu Gupta	9ca6106128	user specified hadoop settings are ignored if explicitly set in code	2015-08-31 10:50:18 -05:00
Gian Merlino	940e1aa3eb	Replace funky imports with standard ones. 1) Lots of Guava imports were not coming from the actual Guava 2) junit.framework.Assert should be org.junit.Assert	2015-08-28 18:02:05 -07:00
jon-wei	e5c4927b14	Add support for parsing BytesWritable strings to Hadoop Indexer	2015-08-28 14:27:14 -07:00
Gian Merlino	414a6fb477	Fix overlapping segments in IngestSegmentFirehose, DatasourceInputFormat. Fixes #1678. IngestSegmentFirehose (and its users) need to remember which windows of which segments should actually be read, based on a timeline.	2015-08-28 07:32:41 -07:00
Himanshu Gupta	2e0dd1d792	adding UTs and addressing review comments to firehoseV2 addition to Realtime[Manager|Plumber], essential segment metadata persist support, kafka-simple-consumer-firehose extension patch	2015-08-27 20:50:46 -05:00
lvjq	2237a8cf0f	kafka 8 simple consumer firehose	2015-08-27 20:50:46 -05:00
Charles Allen	e38cf54bc8	Migrate TestDerbyConnector to a JUnit @Rule	2015-08-26 21:47:40 -07:00
Himanshu Gupta	b3c570e78d	update BatchDeltaIngestion.testDeltaIngestion(..) to check for proper glob path handling	2015-08-20 21:36:34 -05:00
Himanshu Gupta	85e3ce9096	split hadoop glob path before adding it to MultipleInputs This can be safely reverted once https://issues.apache.org/jira/browse/MAPREDUCE-5061 is fixed	2015-08-20 21:36:34 -05:00
Himanshu Gupta	a603bd9547	HadoopGlobPathSplitter implementation to split hadoop glob paths This can be safely reverted once https://issues.apache.org/jira/browse/MAPREDUCE-5061 is fixed	2015-08-20 21:36:34 -05:00
Himanshu Gupta	cf3ec8eb46	helpful cause explaining why SegmentDescriptorInfo did not exist	2015-08-19 10:29:04 -05:00
Xavier Léauté	3b2e41e42a	update for next release	2015-08-18 17:16:46 -07:00
Himanshu Gupta	a3bab5b7d9	IndexGeneratorJobTest type unit test for batch delta ingestion and reindexing	2015-08-16 14:07:35 -05:00
Himanshu Gupta	15fa43dd43	changing DatasourcePathSpec, to get segment list, so that hadoop indexer uses overlord action to get list of segments and passes when running as an overlord task. and, uses metadata store directly when running as standalone hadoop indexer also, serialized list of segments is passed to DatasourcePathSpec so that hadoop classloader issues do not creep up	2015-08-16 14:07:35 -05:00
Himanshu Gupta	45947a1021	add ability to specify Multiple PathSpecs in batch ingestion, so that we can grab data from multiple places in same ingestion Conflicts: indexing-hadoop/src/main/java/io/druid/indexer/HadoopDruidIndexerConfig.java indexing-hadoop/src/main/java/io/druid/indexer/JobHelper.java Conflicts: indexing-hadoop/src/main/java/io/druid/indexer/path/PathSpec.java	2015-08-16 13:15:38 -05:00
Himanshu Gupta	1ae56f139b	Druid Hadoop InputFormat and pathSpec Conflicts: indexing-hadoop/src/main/java/io/druid/indexer/path/PathSpec.java indexing-service/pom.xml	2015-08-16 13:15:38 -05:00
Himanshu Gupta	f1d309a671	do not run parser if value from InputFormat is already an InputRow	2015-08-14 14:44:22 -05:00
Himanshu Gupta	0eec1bbee2	json serde tests for HadoopTuningConfig	2015-07-20 12:01:53 -05:00
Himanshu Gupta	f836c3a7ac	adding flag useCombiner to hadoop tuning config that can be used to add a hadoop combiner to hadoop batch ingestion to do merges on the mappers if possible	2015-07-20 12:01:53 -05:00
Himanshu Gupta	4ef484048a	take control of InputRow serde between Mapper/Reducer in Hadoop Indexing This allows for arbitrary InputFormat while hadoop batch ingestion that can return records of value type other than Text	2015-07-20 12:01:53 -05:00
Himanshu Gupta	f7a92db332	generic byte[] serde for InputRow	2015-07-20 12:01:53 -05:00
Xavier Léauté	4cfb00bc8a	inrement version	2015-07-15 13:09:05 -07:00
Charles Allen	b2bc46be17	Merge pull request #1484 from tubemogul/feature/1463 JobHelper.ensurePaths will set job properties from config (tuningConf…	2015-07-07 10:58:16 -07:00
Michael Schiff	6ad451a44a	JobHelper.ensurePaths will set job properties from config (tuningConfig.jobProperties) before adding input paths to the config. Adding input paths will create Path and FileSystem instances which may depend on the values in the job config. This allows all properties to be set from the spec file, avoiding having to directly edit cluster xml files. IndexGeneratorJob.run adds job properties before adding input paths (adding input paths may depend on having job properies set) JobHelperTest confirms that JobHelper.ensurePaths adds job properties javadoc for addInputPaths to explain relationship with addJobProperties	2015-07-01 12:45:32 -07:00
Davide Anastasia	4a3a7dd1ad	read hadoop-indexer configuration file from HDFS	2015-06-24 14:08:53 -07:00
Hao Xia	1931491c9f	A couple of hdfs related fixes * Class loading issue with hdfs-storage extension * Exception when using hdfs with non-fully qualified segment path	2015-06-19 17:22:20 -07:00
Xavier Léauté	0a5bb909a2	[maven-release-plugin] prepare for next development iteration	2015-06-18 17:35:19 -07:00
Xavier Léauté	59c6b2b279	[maven-release-plugin] prepare release druid-0.8.0-rc1	2015-06-18 17:35:14 -07:00
Charles Allen	94a567732a	Wipe FileContext off the face of the earth * Fixes https://github.com/druid-io/druid/issues/1433 * Works arround https://issues.apache.org/jira/browse/HADOOP-10643 * Reverts to the prior method of renaming	2015-06-16 09:48:09 -07:00
Charles Allen	6230ac90ae	Use IndexMerger for conversion	2015-06-10 11:34:58 -07:00
Charles Allen	056cab93ed	Add Hadoop Converter Job and task * Fixes https://github.com/druid-io/druid/issues/1363 * Add extra utils in JobHelper based on PR feedback	2015-06-09 14:47:38 -07:00
Charles Allen	2a76bdc60a	Abstractify hadoopy indexer configuration. * Moves many items to JobHelper * Remove dependencies of these functions on HadoopDruidIndexerConfig in favor of more general items * Changes functionalities of some of the path methods to always return a path with scheme * Adds retry to uploads * Change output loadSpec determining from using outputFS.getClass().getName() to using outputFS.getScheme()	2015-06-08 10:53:27 -07:00
fjy	be2a35188e	Additional schema validations and better logs for common extensions	2015-05-27 16:25:02 -07:00
Xavier Léauté	4466e77b25	Merge pull request #1371 from guobingkun/unit_test Unit test for IndexGeneratorJob	2015-05-22 10:34:24 -04:00
flow	07659f30ab	bug fix: hdfs task log and indexing task not work properly with Hadoop HA	2015-05-21 20:49:42 +08:00
Bingkun Guo	b46aff12ae	Unit test for IndexGeneratorJob	2015-05-18 12:31:16 -05:00
fjy	7a6acf5c1b	update pom to 0.8	2015-05-11 19:41:58 -06:00
Fangjin Yang	a2dc58cd2d	Merge pull request #1345 from pjain1/unit_test_warn_fix fix warn msg and some unit tests	2015-05-08 08:06:20 -07:00
Parag Jain	01448d264c	Fix warn msg and added some unit tests	2015-05-07 17:10:05 -05:00
fjy	b19435d172	fix typos with batch ingestion in docs	2015-05-07 14:46:17 -07:00
Bingkun Guo	1ee550dd91	Fix a potential issue in DeterminePartitionsJob by making HadoopDruidIndexerConfig non-static, and two unit tests for DeterminPartitionsJob and LocalDataSegmentKiller	2015-05-04 20:00:29 -07:00
Xavier Léauté	3a3046ccf3	add support for dimension compression - compression for single-value dimensions using CompressedVSizeIntsIndexedSupplier - makes dimension compression configurable via IndexSpec - IndexSpec also enables configuring bitmap and metric compression	2015-04-14 10:44:18 -07:00

725 Commits