25 Commits

Author SHA1 Message Date
Gian Merlino
77a3f3cbe0 Merge branch 'master' into determine-partitions
Conflicts:
	indexer/src/main/java/com/metamx/druid/indexer/IndexGeneratorJob.java
2013-01-21 14:46:13 -08:00
Gian Merlino
d9e6f1d954 DeterminePartitions follow-up
HadoopDruidIndexerConfig:
- Add partitionsSpec (backwards compatible with targetPartitionSize and partitionDimension)
- Add assumeGrouped flag to partitionsSpec

DeterminePartitionsJob:
- Skip group-by job if assumeGrouped is set
- Clean up code a bit
2013-01-21 14:38:35 -08:00
Eric Tschetter
c8cb96b006 1) Remove vast majority of usages of IndexIO.mapDir() and deprecate it. IndexIO.loadIndex() is the new IndexIO.mapDir()
2) Fix bug with IndexMerger and null columns
3) Add QueryableIndexIndexableAdapter so that QueryableIndexes can be merged
4) Adjust twitter example to have multiple values for each hash tag
5) Adjusted GroupByQueryEngine to just drop dimensions that don't exist instead of throwing an NPE
2013-01-16 17:10:33 -06:00
Gian Merlino
7b42ee6a6e Rework DeterminePartitionsJob in the hadoop indexer
- Can handle non-rolled-up input (by grouping input rows using an additional MR stage)
- Can select its own partitioning dimension, if none is supplied
- Can detect and avoid oversized shards due to bad dimension value distribution
- Shares input parsing code with IndexGeneratorJob
2013-01-16 08:15:01 -08:00
Gian Merlino
616415cb7e UniformGranularitySpec: Only return bucketInterval for timestamps that legitimately
overlap our input intervals
2013-01-15 22:30:17 -08:00
Fangjin Yang
5822f4f5f7 refactor master to run rules before cleaning up; more master stats; general improvements
2012-12-03 14:43:04 -08:00
Fangjin Yang
2e5e1ce989 first commit of tiers for compute nodes; working UT at this point
2012-11-28 17:37:08 -08:00
Eric Tschetter
0f63cb4f00 1) Have IndexGeneratorJob write the descriptors for each of the segments it creates to a path in the temporary working directory (generally HDFS)
2) Have the DbUpdaterJob read descriptors from the temporary working directory instead of looking in the final segment output location (often the eventually consistent S3)
3) 1 and 2 Fixes #30
2012-11-20 15:30:50 -06:00
Eric Tschetter
701cc9562b 1) Adjust the StorageAdapters to lowercase names of metrics and dimensions before looking them up.
2) Add some docs to InputRow/Row to indicate that column names passed into the methods are *always* lowercase and that the rows need to act accordingly. (fixes #29, or at least clarifies the behavior...)
2012-11-19 17:01:17 -06:00
Fangjin Yang
0ef40171a8 nodes no longer inherit from interfaces but instead extend classes
2012-11-13 13:18:31 -08:00
Fangjin Yang
24564d73e1 register subtypes for reducer
2012-11-12 16:41:34 -08:00
Fangjin Yang
57468d39ef reverting some of the last changes
2012-11-12 16:14:48 -08:00
Fangjin Yang
c20dccd0f4 modifying the way registering serdes works to hopefully be a bit easier to use
2012-11-12 13:58:43 -08:00
Fangjin Yang
6da047b5fa fix backwards compatibility issues
2012-11-08 15:09:00 -08:00
Fangjin Yang
34cb352cf8 working indexer with registererers
2012-11-06 14:26:53 -08:00
Fangjin Yang
5698f640d7 fix last commit with version
2012-11-06 12:40:53 -08:00
Fangjin Yang
0b6dd99452 set default version if one is not set
2012-11-06 12:36:55 -08:00
Fangjin Yang
34a221a586 fix bug with jackson conversion
2012-11-06 11:56:48 -08:00
Fangjin Yang
68e5adde33 register registererers in the config
2012-11-06 11:49:17 -08:00
Fangjin Yang
eb2b5a61fa fix setters for hadoop node
2012-11-05 18:40:54 -08:00
Fangjin Yang
2ae0a15b5a add register abilities to mapper
2012-11-05 18:31:23 -08:00
Fangjin Yang
9fbee29eb4 change hadoop indexer to be node based
2012-11-05 18:19:04 -08:00
Fangjin Yang
7b2522ff3f allow hadoop druid indexer to register registererers
2012-11-05 16:13:50 -08:00
Eric Tschetter
27999caca0 1) Create LICENSE
2) Attach copyright and notice of license to files
2012-10-24 05:09:47 -04:00
Eric Tschetter
9d41599967 Initial commit of OSS Druid Code
2012-10-24 03:39:51 -04:00