Himanshu Gupta
61aaa09012
support multiple intervals in dataSource input spec
2015-12-03 21:28:04 -06:00
Fangjin Yang
21c84b5ff7
Merge pull request #1896 from gianm/allocate-segment
...
SegmentAllocateAction (fixes #1515)
2015-11-18 21:05:46 -08:00
Gian Merlino
e4e5f0375b
SegmentAllocateAction (fixes #1515)
...
This is a feature meant to allow realtime tasks to work without being told upfront
what shardSpec they should use (so we can potentially publish a variable number
of segments per interval).
The idea is that there is a "pendingSegments" table in the metadata store that
tracks allocated segments. Each one has a segment id (the same segment id we know
and love) and is also part of a sequence.
The sequences are an idea from @cheddar that offers a way of doing replication.
If there are N tasks reading exactly the same data with exactly the same logic
(think Kafka tasks reading a fixed range of offsets) then you can place them
in the same sequence, and they will generate the same sequence of segments.
2015-11-11 16:54:35 -08:00
Xavier Léauté
fa6142e217
cleanup and remove unused imports
2015-11-11 12:25:21 -08:00
Charles Allen
abae47850a
Add backwards compatibility for PR #1922
2015-11-11 10:27:00 -08:00
Gian Merlino
dfbd0e2b60
Merge pull request #1925 from gianm/fix-index-generator
...
Fix reference to INDEX_MAKER in IndexGeneratorJob.
2015-11-06 09:56:30 -08:00
Gian Merlino
75122dc396
Fix reference to INDEX_MAKER in IndexGeneratorJob.
2015-11-06 09:19:58 -08:00
Himanshu Gupta
6bed633121
do not use LoggingProcessIndicator in IndexGeneratorJob because it uses Stopwatch methods from guava that are not available in older guava versions; this makes the behavior the same as LegacyIndexGeneratorJob
2015-11-06 00:40:51 -06:00
Charles Allen
929b981710
Change DefaultObjectMapper to NOT overwrite final fields unless explicitly asked to
2015-11-05 18:10:13 -08:00
Xavier Léauté
223d1ebe9f
fix a very old todo
2015-11-05 13:00:30 -08:00
fjy
8f231fd3e3
cleanup druid codebase
2015-11-04 13:59:53 -08:00
Himanshu Gupta
84f7d8d264
making static final variables in HadoopDruidIndexerConfig upper case
2015-11-02 23:24:26 -06:00
Himanshu Gupta
8b67417ac8
make methods in Index[Merger,Maker,IO] non-static so that they can have
...
appropriate ObjectMapper injected instead of creating one statically
2015-11-02 23:24:26 -06:00
Nishant
3641a0e553
Fix Race in jar upload during hadoop indexing - https://github.com/druid-io/druid/issues/582
...
few fixes
delete intermediate file early
better exception handling
use static pattern instead of compiling it every time
Add retry for transient exceptions
remove usage of deprecated method.
Add test
fix imports
fix javadoc
review comment.
review comment: handle crazy snapshot naming
review comments
remove default retry count in favour of already present constant
review comment
make random intermediate and final paths.
review comment, use temporaryFolder where possible
2015-10-22 21:41:07 +05:30
Himanshu Gupta
0368260018
For dataSource inputSpec in hadoop batch ingestion, use configured query granularity for reading existing segments instead of NONE
2015-10-12 22:19:44 -05:00
Gian Merlino
3aba401ee0
SQLMetadataConnector: Retry table creation, in case something goes wrong.
...
Also rejigger table creation methods to not take a DBI. It's already available
inside the connector, and everyone was just using that one anyway.
2015-09-24 21:39:36 -07:00
Himanshu Gupta
e8b9ee85a7
HadoopyStringInputRowParser to convert stringy Text, BytesWritable etc into InputRow
2015-09-16 10:58:13 -05:00
Himanshu Gupta
74f4572bd4
Lazily deserialize "parser" to InputRowParser in DataSchema
...
so that user hadoop related InputRowParsers are created only when needed
this allows overlord to accept a HadoopIndexTask with a hadoopy InputRowParser
and not fail because hadoopy InputRowParser might need hadoop libraries
2015-09-16 10:58:13 -05:00
Himanshu Gupta
9ca6106128
user specified hadoop settings are ignored if explicitly set in code
2015-08-31 10:50:18 -05:00
Gian Merlino
940e1aa3eb
Replace funky imports with standard ones.
...
1) Lots of Guava imports were not coming from the actual Guava
2) junit.framework.Assert should be org.junit.Assert
2015-08-28 18:02:05 -07:00
jon-wei
e5c4927b14
Add support for parsing BytesWritable strings to Hadoop Indexer
2015-08-28 14:27:14 -07:00
Gian Merlino
414a6fb477
Fix overlapping segments in IngestSegmentFirehose, DatasourceInputFormat.
...
Fixes #1678. IngestSegmentFirehose (and its users) need to remember which
windows of which segments should actually be read, based on a timeline.
2015-08-28 07:32:41 -07:00
Himanshu Gupta
2e0dd1d792
adding UTs and addressing review comments to
...
firehoseV2 addition to Realtime[Manager|Plumber],
essential segment metadata persist support,
kafka-simple-consumer-firehose extension patch
2015-08-27 20:50:46 -05:00
lvjq
2237a8cf0f
kafka 8 simple consumer firehose
2015-08-27 20:50:46 -05:00
Charles Allen
e38cf54bc8
Migrate TestDerbyConnector to a JUnit @Rule
2015-08-26 21:47:40 -07:00
Himanshu Gupta
b3c570e78d
update BatchDeltaIngestion.testDeltaIngestion(..) to check for proper glob path handling
2015-08-20 21:36:34 -05:00
Himanshu Gupta
85e3ce9096
split hadoop glob path before adding it to MultipleInputs
...
This can be safely reverted once https://issues.apache.org/jira/browse/MAPREDUCE-5061 is fixed
2015-08-20 21:36:34 -05:00
Himanshu Gupta
a603bd9547
HadoopGlobPathSplitter implementation to split hadoop glob paths
...
This can be safely reverted once https://issues.apache.org/jira/browse/MAPREDUCE-5061 is fixed
2015-08-20 21:36:34 -05:00
Himanshu Gupta
cf3ec8eb46
helpful cause explaining why SegmentDescriptorInfo did not exist
2015-08-19 10:29:04 -05:00
Himanshu Gupta
a3bab5b7d9
IndexGeneratorJobTest type unit test for batch delta ingestion and reindexing
2015-08-16 14:07:35 -05:00
Himanshu Gupta
15fa43dd43
changing DatasourcePathSpec to get the segment list, so that the hadoop indexer uses an overlord action to get the list of segments and passes it along when running as an overlord task, and uses the metadata store directly when running as a standalone hadoop indexer
...
also, serialized list of segments is passed to DatasourcePathSpec so that hadoop classloader issues do not creep up
2015-08-16 14:07:35 -05:00
Himanshu Gupta
45947a1021
add ability to specify Multiple PathSpecs in batch ingestion, so that we can grab data from multiple places in same ingestion
...
Conflicts:
indexing-hadoop/src/main/java/io/druid/indexer/HadoopDruidIndexerConfig.java
indexing-hadoop/src/main/java/io/druid/indexer/JobHelper.java
Conflicts:
indexing-hadoop/src/main/java/io/druid/indexer/path/PathSpec.java
2015-08-16 13:15:38 -05:00
Himanshu Gupta
1ae56f139b
Druid Hadoop InputFormat and pathSpec
...
Conflicts:
indexing-hadoop/src/main/java/io/druid/indexer/path/PathSpec.java
indexing-service/pom.xml
2015-08-16 13:15:38 -05:00
Himanshu Gupta
f1d309a671
do not run parser if value from InputFormat is already an InputRow
2015-08-14 14:44:22 -05:00
Himanshu Gupta
0eec1bbee2
json serde tests for HadoopTuningConfig
2015-07-20 12:01:53 -05:00
Himanshu Gupta
f836c3a7ac
adding flag useCombiner to hadoop tuning config that can be used to add a
...
hadoop combiner to hadoop batch ingestion to do merges on the mappers if possible
2015-07-20 12:01:53 -05:00
Himanshu Gupta
4ef484048a
take control of InputRow serde between Mapper/Reducer in Hadoop Indexing
...
This allows for an arbitrary InputFormat during hadoop batch ingestion that
can return records of value type other than Text
2015-07-20 12:01:53 -05:00
Himanshu Gupta
f7a92db332
generic byte[] serde for InputRow
2015-07-20 12:01:53 -05:00
Charles Allen
b2bc46be17
Merge pull request #1484 from tubemogul/feature/1463
...
JobHelper.ensurePaths will set job properties from config (tuningConf…
2015-07-07 10:58:16 -07:00
Michael Schiff
6ad451a44a
JobHelper.ensurePaths will set job properties from config (tuningConfig.jobProperties) before adding input paths to the config.
...
Adding input paths will create Path and FileSystem instances which may depend on the values in the job config.
This allows all properties to be set from the spec file, avoiding having to directly edit cluster xml files.
IndexGeneratorJob.run adds job properties before adding input paths (adding input paths may depend on having job properties set)
JobHelperTest confirms that JobHelper.ensurePaths adds job properties
javadoc for addInputPaths to explain relationship with
addJobProperties
2015-07-01 12:45:32 -07:00
Davide Anastasia
4a3a7dd1ad
read hadoop-indexer configuration file from HDFS
2015-06-24 14:08:53 -07:00
Hao Xia
1931491c9f
A couple of hdfs related fixes
...
* Class loading issue with hdfs-storage extension
* Exception when using hdfs with non-fully qualified segment path
2015-06-19 17:22:20 -07:00
Charles Allen
94a567732a
Wipe FileContext off the face of the earth
...
* Fixes https://github.com/druid-io/druid/issues/1433
* Works around https://issues.apache.org/jira/browse/HADOOP-10643
* Reverts to the prior method of renaming
2015-06-16 09:48:09 -07:00
Charles Allen
6230ac90ae
Use IndexMerger for conversion
2015-06-10 11:34:58 -07:00
Charles Allen
056cab93ed
Add Hadoop Converter Job and task
...
* Fixes https://github.com/druid-io/druid/issues/1363
* Add extra utils in JobHelper based on PR feedback
2015-06-09 14:47:38 -07:00
Charles Allen
2a76bdc60a
Abstractify hadoopy indexer configuration.
...
* Moves many items to JobHelper
* Remove dependencies of these functions on HadoopDruidIndexerConfig in favor of more general items
* Changes functionalities of some of the path methods to always return a path with scheme
* Adds retry to uploads
* Change output loadSpec determining from using outputFS.getClass().getName() to using outputFS.getScheme()
2015-06-08 10:53:27 -07:00
fjy
be2a35188e
Additional schema validations and better logs for common extensions
2015-05-27 16:25:02 -07:00
Xavier Léauté
4466e77b25
Merge pull request #1371 from guobingkun/unit_test
...
Unit test for IndexGeneratorJob
2015-05-22 10:34:24 -04:00
flow
07659f30ab
bug fix: hdfs task log and indexing task do not work properly with Hadoop HA
2015-05-21 20:49:42 +08:00
Bingkun Guo
b46aff12ae
Unit test for IndexGeneratorJob
2015-05-18 12:31:16 -05:00
Fangjin Yang
a2dc58cd2d
Merge pull request #1345 from pjain1/unit_test_warn_fix
...
fix warn msg and some unit tests
2015-05-08 08:06:20 -07:00
Parag Jain
01448d264c
Fix warn msg and added some unit tests
2015-05-07 17:10:05 -05:00
fjy
b19435d172
fix typos with batch ingestion in docs
2015-05-07 14:46:17 -07:00
Bingkun Guo
1ee550dd91
Fix a potential issue in DeterminePartitionsJob by making HadoopDruidIndexerConfig non-static, and two unit tests for DeterminePartitionsJob and LocalDataSegmentKiller
2015-05-04 20:00:29 -07:00
Xavier Léauté
3a3046ccf3
add support for dimension compression
...
- compression for single-value dimensions using CompressedVSizeIntsIndexedSupplier
- makes dimension compression configurable via IndexSpec
- IndexSpec also enables configuring bitmap and metric compression
2015-04-14 10:44:18 -07:00
Prajwal Tuladhar
3044bf5592
use Job.getInstance() to fix deprecated warnings
2015-04-09 13:22:21 -04:00
Xavier Léauté
8b5fa8f85d
always upload SNAPSHOT self-contained jars
2015-04-03 21:18:09 -07:00
Dia Kharrat
3a6dc99384
log invalid rows in mapper of Hadoop indexer
2015-03-19 22:31:04 -07:00
Dia Kharrat
58d5f5e7f0
Honor ignoreInvalidRows in Hadoop indexer
...
The reducer of the hadoop indexer now ignores lines with parsing
exceptions (if enabled by the indexer config).
2015-03-19 22:31:04 -07:00
Himanshu Gupta
8c1f0834ba
Removing MapWritableInputRowParser from indexing-hadoop; it should really be an extension if a user needs it
2015-03-19 18:37:08 -05:00
Himanshu Gupta
3f7a7ba5d3
For batch hadoop indexing, make hadoop input format configurable. The given input format must extend from org.apache.hadoop.mapreduce.InputFormat
2015-03-18 16:09:45 -05:00
fjy
bfe10bd156
This fixes arbitrary gran spec breaking
2015-03-17 12:19:43 -07:00
Himanshu Gupta
6a0405de20
fail early if there is no input data for batch hadoop indexing
2015-03-07 12:45:57 -06:00
Himanshu Gupta
30f64ff19e
UTs update for indexing-hadoop
2015-02-25 15:45:57 -08:00
Xavier Léauté
0784d7e30e
Merge pull request #1152 from himanshug/metastorage-pwd-provider
...
support for metadata store PasswordProvider interface
2015-02-25 15:19:37 -08:00
Fangjin Yang
708f35151d
Merge pull request #1121 from gianm/issue-1116
...
Use the proper FileSystems for writing segments and caching jars. (for issue #1116)
2015-02-25 13:03:59 -08:00
Fangjin Yang
6424815f88
Merge pull request #1097 from metamx/better-hadoop-sort-key
...
Sort HadoopIndexer rows by time+dim bucket to help reduce spilling
2015-02-25 12:49:58 -08:00
Himanshu Gupta
126262edce
support for PasswordProvider interface to enable writing druid extension which can get metadata store password from secured location or anywhere instead of plain text properties file
2015-02-25 14:05:19 -06:00
Himanshu Gupta
01a4f19ea2
removing dependency on NativeS3FileSystem and other file systems
2015-02-23 14:27:50 -06:00
Gian Merlino
fd5a7d1f08
Use the proper FileSystems for writing segments and caching jars. (for issue #1116)
2015-02-12 16:20:10 -08:00
Xavier Léauté
b1ec7afc12
Sort HadoopIndexer rows by time+dim bucket to help reduce spilling
2015-02-10 14:26:28 -08:00
Fangjin Yang
92e616de11
Merge pull request #1077 from metamx/remove-unused-imports
...
remove unused imports
2015-02-02 10:45:27 -08:00
nishantmonu51
ba932bb1f2
remove unused imports
2015-02-02 21:53:39 +05:30
fjy
d05032b98a
towards a community led druid
2015-01-31 20:57:36 -08:00
Xavier Léauté
cd9635ff5e
Merge pull request #1034 from druid-io/minor-rename
...
minor rename of things in hadoop ingestion config to match 0.6.x
2015-01-15 15:46:13 -08:00
fjy
ccddbf8747
minor rename of things in hadoop ingestion config to match 0.6.x
2015-01-15 14:04:55 -08:00
Fangjin Yang
5bfcc43377
Merge pull request #1008 from metamx/stringConversionJavaUtilUpdate
...
Update all String conversions to and from byte[] to use the java-util StringUtils functions
2015-01-15 13:50:27 -08:00
Fangjin Yang
852e863425
Merge pull request #981 from druid-io/strictModuleTyping
...
Use Module instead of generic Object in Guice related items
2015-01-05 12:43:20 -08:00
Charles Allen
b1b5c9099e
Update all String conversions to and from byte[] to use the java-util StringUtils functions
...
* Speedup of GroupBy with javaScript filters by ~10%
* Requires https://github.com/metamx/java-util/pull/15
2015-01-05 11:22:32 -08:00
Xavier Léauté
f1375b0bfb
workaround to pass down bitmap type to map-reduce tasks
2015-01-02 17:29:00 -08:00
Charles Allen
7c8d4a7433
Use Module instead of generic Object in Guice related items
2014-12-19 10:54:06 -08:00
fjy
43d27ddaf0
update http client and fix logging
2014-12-15 16:59:57 -08:00
fjy
e872952390
fix working path default bug
2014-12-15 14:51:58 -08:00
fjy
28b72a69ad
redocumenting ingestion
2014-12-08 16:15:46 -08:00
nishantmonu51
40f223215a
fix buffer pool usage
2014-12-05 16:09:26 +05:30
nishantmonu51
6e03a6245f
Merge branch 'master' into onheap-incremental-index
2014-12-05 10:40:28 +05:30
Xavier Léauté
7cd45a6e1f
IncrementalIndex throws exception if limit exceeded
...
- For now uses a hardcoded ratio of aggregator to timeanddim buffer sizes
- canAppendRow is a workaround for realtime index since the
Firehose currently does not have a way of rolling back the last event in
case of error
- canAppendRow needs a fudge factor; there is a race between checking
if we can add a row and actually adding a row, because of the way MapDB
reports its size.
2014-12-04 14:38:16 -08:00
Gian Merlino
20a7239ffd
Replace google-http-client imports with real guava imports.
2014-12-04 10:57:57 -08:00
Charles Allen
c2add5730b
Fix Hadoop CLI jobs
...
* Change "schema" --> "spec" for cli hadoop to keep up with internal hadoop
* Added check for HadoopDruidIndexerConfig deserialization from Map to see if it is trying to get a HadoopDruidIndexerConfig or a HadoopIngestionSpec
2014-12-04 10:57:56 -08:00
xvrl
c867d59ee0
fix error message
2014-12-03 15:30:32 -08:00
Xavier Léauté
2e6c254937
metadata injection not needed for indexing service
2014-12-03 15:09:31 -08:00
Gian Merlino
d388a8fe89
Replace google-http-client imports with real guava imports.
2014-12-03 10:52:57 -08:00
nishantmonu51
4dc0fdba8a
consider mapped size in limit calculation & review comments
2014-12-03 23:47:30 +05:30
nishantmonu51
da8bd7836b
Introduce buffer size
2014-12-03 16:28:22 +05:30
Charles Allen
7cd689be75
Fix Hadoop CLI jobs
...
* Change "schema" --> "spec" for cli hadoop to keep up with internal hadoop
* Added check for HadoopDruidIndexerConfig deserialization from Map to see if it is trying to get a HadoopDruidIndexerConfig or a HadoopIngestionSpec
2014-12-02 11:23:04 -08:00
nishantmonu51
eac776f1a7
tests passing with on heap incremental index
2014-12-02 22:29:28 +05:30
Xavier Léauté
59542c41f8
fix port not set in DruidNode
2014-12-01 14:37:28 -08:00
Charles Allen
8b3652a67a
Modify HadoopDruidIndexerConfig to give a port of 0 instead of -1 when binding DruidNode @Self annotation
2014-12-01 14:08:41 -08:00
fjy
fdeab0c6af
make Druid case sensitive
2014-11-19 14:27:31 -08:00
nishantmonu51
f0452c5968
merge from master
2014-11-18 19:34:51 +05:30
nishantmonu51
edf0fc0851
Make hashed partitions spec default
...
- make hashed partitionsSpec as default partitions spec for 0.7
2014-11-17 19:48:12 +05:30
nishantmonu51
0c2d06475d
merge from master
2014-11-17 19:19:18 +05:30
Xavier Léauté
0498df25df
override metadata storage injection in CliHadoopIndexer
2014-11-07 13:44:56 -08:00
Xavier Léauté
50a191425c
fix injection on MetadataStorageUpdaterJob
2014-11-07 11:11:14 -08:00
Xavier Léauté
20a9aef96a
fix test
2014-11-06 17:27:05 -08:00
Xavier Léauté
9c06db021f
rename db->metadata postgres->postgresql
2014-10-31 10:30:27 -07:00
jisookim0513
aa754b86e8
build success!
2014-10-24 11:28:42 -07:00
fjy
bef74104d9
merge with 0.7.x and resolve any conflicts
2014-10-23 17:24:06 -07:00
fjy
d76d57d95d
update docs
2014-10-22 16:16:28 -07:00
jisookim0513
37979282fe
enabled ansi-quote in mysql; insert statement should now work
2014-10-21 00:09:19 -07:00
jisookim0513
7d5c5f2083
fixed createTable; fixed miscellaneous stuff; added DerbyMetadataRuleManagerProvider
2014-10-17 00:10:36 -07:00
nishantmonu51
41e88baeca
Add test for bucket selection
2014-10-15 23:09:28 +05:30
nishantmonu51
f4a97aebbc
fix rollup for hashed partitions
...
truncate timestamp while calculating the partitionNumber
2014-10-15 22:32:56 +05:30
nishantmonu51
b5d66381f3
more cleanup
2014-10-14 18:32:40 +05:30
nishantmonu51
454acd3f5a
remove backwards compatible code
...
1) remove backwards compatible and deprecated code
2) make hashed partitions spec default
2014-10-13 19:30:44 +05:30
fjy
c7b4d5b7b4
Merge branch 'master' into druid-0.7.x
...
Conflicts:
processing/src/test/java/io/druid/segment/filter/SpatialFilterTest.java
2014-10-02 18:12:10 -07:00
nishantmonu51
ad75a21040
separate offheapIncrementalIndex implementation
2014-10-01 13:58:51 +05:30
jisookim0513
9d7b5d9b0f
fixed javadoc; fixed pom files; deleted unnecessary class
2014-09-30 13:47:35 -07:00
nishantmonu51
358ff915bb
fix merge conflicts
2014-09-30 22:19:18 +05:30
nishantmonu51
2789536bed
merge changes from druid-0.7.x
2014-09-30 22:05:49 +05:30
nishantmonu51
61c7fd2e6e
make ingestOffheap tuneable
2014-09-30 15:30:02 +05:30
nishantmonu51
adb4a65e0a
Merge branch 'offheap-incremental-index' into mapdb-branch
2014-09-29 12:38:31 +05:30
jisookim0513
74565c9371
cleaned up the code
2014-09-27 13:10:01 -07:00
jisookim0513
aa887edb73
added two separate modules for mysql and postgres
2014-09-27 13:08:53 -07:00
flow
2dd62979bb
Fixed the issue where batch ingestion with the indexing service to hdfs ended up with the path in the mysql metadata missing the "hdfs://host" prefix. A detailed description can be found here: https://groups.google.com/forum/#!topic/druid-development/ofvSxiPpCxI
2014-09-27 22:26:52 +08:00
jisookim0513
6a641621b2
finished merging into druid-0.7.x; derby not working (to be fixed)
2014-09-26 14:24:53 -07:00
jisookim0513
43cc6283d3
trying to revert files that have overwritten changes
2014-09-26 12:38:04 -07:00
fjy
eaf0a48b92
Merge branch 'master' into druid-0.7.x
...
Conflicts:
cassandra-storage/pom.xml
common/pom.xml
examples/pom.xml
hdfs-storage/pom.xml
histogram/pom.xml
indexing-hadoop/pom.xml
indexing-service/pom.xml
kafka-eight/pom.xml
kafka-seven/pom.xml
pom.xml
processing/pom.xml
processing/src/main/java/io/druid/guice/PropertiesModule.java
rabbitmq/pom.xml
s3-extensions/pom.xml
server/pom.xml
services/pom.xml
2014-09-26 11:39:24 -07:00
jisookim0513
3bf39cc9f8
attempted to fix merge-conflicts
2014-09-24 15:55:42 -07:00
nishantmonu51
f51ab84386
merge changes from druid-0.7.x
2014-09-22 23:48:45 +05:30
nishantmonu51
443e5788fb
make OffheapIncrementalIndex tuneable
2014-09-22 19:26:10 +05:30
jisookim0513
273205f217
initial attempt for abstraction; druid cluster works with Derby as a default
2014-09-19 17:39:59 -07:00
nishantmonu51
8eb6466487
revert buffer size and add back rowFlushBoundary
2014-09-19 23:06:04 +05:30
Xavier Léauté
d501b052ea
remove unused columnConfig
2014-09-15 13:02:47 -07:00
Xavier Léauté
e57e2d97ba
make constants final
2014-09-15 12:53:40 -07:00
fjy
469ccbbe5e
Merge branch 'master' into druid-0.7.x
...
Conflicts:
cassandra-storage/pom.xml
common/pom.xml
examples/pom.xml
hdfs-storage/pom.xml
histogram/pom.xml
indexing-hadoop/pom.xml
indexing-service/pom.xml
kafka-eight/pom.xml
kafka-seven/pom.xml
pom.xml
processing/pom.xml
processing/src/main/java/io/druid/query/FinalizeResultsQueryRunner.java
processing/src/main/java/io/druid/query/UnionQueryRunner.java
processing/src/main/java/io/druid/query/groupby/GroupByQueryRunnerFactory.java
processing/src/main/java/io/druid/query/topn/TopNQueryEngine.java
processing/src/main/java/io/druid/query/topn/TopNQueryRunnerFactory.java
rabbitmq/pom.xml
s3-extensions/pom.xml
server/pom.xml
server/src/test/java/io/druid/server/initialization/JettyTest.java
services/pom.xml
2014-09-11 16:20:50 -07:00
fjy
fec7b43fcb
make making v9 segments something completely configurable
2014-09-10 15:28:30 -07:00
fjy
351afb8be7
allow legacy index generator
2014-09-09 17:04:35 -07:00
Xavier Léauté
58ab759fc6
remove unused imports
2014-08-29 14:03:47 -07:00
Xavier Léauté
ac05836833
make Java 8 javadoc happy
2014-08-29 13:58:50 -07:00
fjy
12f4147df5
switch index gen job to use logging indicator
2014-08-21 13:28:15 -07:00
fjy
d64879ccca
more cleanup
2014-08-20 13:22:42 -07:00
fjy
bb73b2556e
fix compilation
2014-08-20 13:17:00 -07:00
fjy
92f26d9a1f
cleanup rowflushboundary
2014-08-20 13:09:37 -07:00
nishantmonu51
79ff993b31
increase default buffer size to 512m
2014-08-20 22:15:06 +05:30
nishantmonu51
33354cf7fe
replace maxRowsInMemory with BufferSize
2014-08-20 20:59:44 +05:30
fjy
88a904e0b3
address cr about progress ind
2014-08-19 12:59:01 -07:00
nishantmonu51
c6712739dc
merge changes from druid-0.7.x
2014-08-12 15:47:42 +05:30
nishantmonu51
9598a524a8
review comment - move index closure to finally
2014-08-12 14:58:55 +05:30
nishantmonu51
637bd35785
merge changes from druid-0.7.x
2014-07-31 16:07:22 +05:30
nishantmonu51
4ce12470a1
Add way to skip determine partitions for index task
...
Add a way to skip determinePartitions for IndexTask by manually
specifying numShards.
2014-07-18 18:52:15 +05:30
nishantmonu51
f5f05e3a9b
Sync changes from branch new-ingestion PR #599
...
Sync and Resolve Conflicts
2014-07-11 16:15:10 +05:30
nishantmonu51
fa43049240
review comments & pom changes
2014-07-10 11:48:46 +05:30
nishantmonu51
36fc85736c
Add ShardSpec Lookup
...
Optimize choosing shardSpec for Hash Partitions
2014-07-08 18:01:31 +05:30
fjy
4c40e71e54
address cr
2014-06-19 14:48:46 -07:00
fjy
a870fe5cbe
inject column config
2014-06-19 14:47:57 -07:00
Xavier Léauté
09346b0a3c
make column cache configurable
2014-06-19 14:43:03 -07:00
fjy
a63cda3281
Merge branch 'master' into new-guava
...
Conflicts:
server/src/main/java/io/druid/server/QueryResource.java
2014-06-13 10:08:10 -07:00
nishantmonu51
a7e19ad892
configure buffer sizes
2014-06-12 19:32:37 +05:30
nishantmonu51
6265613bb9
Merge branch 'master' into offheap-incremental-index
2014-06-05 17:42:57 +05:30
nishantmonu51
01e8a713b6
unit tests passing with offheap-indexing
2014-06-05 17:42:53 +05:30
Gian Merlino
1ca7bf03b8
IndexGeneratorJob needs to respect isCombineText, too.
2014-06-04 17:54:31 -07:00
fjy
adc00f2bcf
make combine text configurable
2014-06-04 16:24:56 -07:00
fjy
bb4105ed1a
fix broken standalone hadoop ingestion
2014-06-04 09:23:46 -07:00
fjy
77ec4df797
update guava, java-util, and druid-api
2014-06-03 13:43:38 -07:00
fjy
4c13327297
more logging for determine hashed
2014-05-30 16:19:20 -07:00
fjy
7be93a770a
make all firehoses work with tasks, add a lot more documentation about configuration
2014-05-28 16:33:59 -07:00
Deepak
7d92cf2b3b
Update IndexGeneratorJob.java
...
Using CombineTextInputFormat instead of TextInputFormat combines multiple splits for a single mapper and reduces the strain on the hadoop platform. It greatly improves job completion time as there are fewer mappers to bookkeep.
2014-05-22 15:08:12 +05:30
Deepak
de0a7b27e7
Update DetermineHashedPartitionsJob.java
...
Using CombineTextInputFormat instead of TextInputFormat combines multiple splits for a single mapper and reduces the strain on the hadoop platform. It greatly improves job completion time as there are fewer mappers to bookkeep.
2014-05-22 15:06:56 +05:30
Xavier Léauté
9ec7c71e0f
fix compilation error with updated druid-api
2014-05-19 14:06:23 -07:00
fjy
1100d2f2a1
rename configs to make a bit more sense
2014-05-06 14:52:50 -07:00
fjy
b6fb4245aa
Merge branch 'master' into new-schema
...
Conflicts:
indexing-hadoop/src/main/java/io/druid/indexer/HadoopDriverConfig.java
indexing-hadoop/src/main/java/io/druid/indexer/HadoopDruidIndexerConfig.java
indexing-hadoop/src/main/java/io/druid/indexer/HadoopDruidIndexerConfigBuilder.java
pom.xml
server/src/main/java/io/druid/segment/realtime/RealtimeManager.java
server/src/main/java/io/druid/segment/realtime/firehose/EventReceiverFirehoseFactory.java
2014-05-06 14:32:51 -07:00
Gian Merlino
bdf9e74a3b
Allow config-based overriding of hadoop job properties.
2014-05-06 09:11:31 -07:00
fjy
f9523274ac
remove extra println
2014-05-01 15:06:51 -07:00
nishantmonu51
5137031304
use same logic for compression
...
Use same logic for compression across creating files, reading from
files, and checking file existence
2014-05-01 15:20:47 +05:30
nishantmonu51
728f1e8ee3
fix exists check with compression
2014-05-01 15:01:10 +05:30
nishantmonu51
01e84f10b7
add the checks again.
...
removing these checks breaks when there is no data for any interval
2014-05-01 14:35:09 +05:30
fjy
76e0a48527
Merge branch 'master' into new-schema
...
Conflicts:
indexing-hadoop/src/main/java/io/druid/indexer/DbUpdaterJob.java
indexing-hadoop/src/test/java/io/druid/indexer/HadoopDruidIndexerConfigTest.java
indexing-service/src/main/java/io/druid/indexing/common/task/HadoopIndexTask.java
server/src/main/java/io/druid/segment/realtime/plumber/RealtimePlumber.java
server/src/main/java/io/druid/segment/realtime/plumber/RealtimePlumberSchool.java
2014-04-25 14:03:28 -07:00
fjy
2d1f33e59f
Merge pull request #500 from metamx/batch-ingestion-fixes
...
Batch ingestion fixes
2014-04-22 17:59:24 -06:00
nishantmonu51
357bbf5127
add all the shard specs
2014-04-23 05:23:11 +05:30
nishantmonu51
625a5418d2
minor fix
2014-04-23 05:05:51 +05:30
nishantmonu51
1ca61237c1
review comments- use final variables
2014-04-23 03:33:28 +05:30
nishantmonu51
0d8c1ffe54
review comments and add partitioner
2014-04-23 03:30:30 +05:30
nishantmonu51
ea4a80e8d2
Add serde test for shardCount
2014-04-23 00:24:08 +05:30
nishantmonu51
e920cec5d0
remove unused import
2014-04-23 00:13:30 +05:30
nishantmonu51
0748eabe9b
batch ingestion fixes
...
1) Fix path when mapped output is compressed
2) Add number of reducers to the determine hashed partitions job
manually
3) Add a way to disable determine partitions and specify shardCount in
HashedPartitionsSpec
2014-04-23 00:05:08 +05:30
Crystark
40a6804192
Support for postgresql
...
I think it was the last request using 'end' missing the postgresql support.
2014-04-07 17:37:03 +02:00
fjy
2adcf07f5f
Merge branch 'master' into new-schema
...
Conflicts:
indexing-hadoop/src/main/java/io/druid/indexer/DetermineHashedPartitionsJob.java
indexing-service/src/main/java/io/druid/indexing/common/task/RealtimeIndexTask.java
indexing-service/src/test/java/io/druid/indexing/common/task/TaskSerdeTest.java
processing/src/test/java/io/druid/segment/TestIndex.java
server/src/main/java/io/druid/segment/realtime/RealtimeManager.java
server/src/main/java/io/druid/segment/realtime/plumber/RealtimePlumberSchool.java
2014-03-17 10:59:31 -07:00
nishantmonu51
4ec1959c30
Use druid implementation of HyperLogLog
...
remove dependency on clearspring analytics
2014-03-07 00:06:40 +05:30
fjy
5db00afb37
clean up and default values
2014-03-04 14:38:27 -08:00
fjy
c4c4d80336
make local testing pass
2014-03-03 14:52:43 -08:00
fjy
46b9ac78e7
Merge branch 'master' into new-schema
...
Conflicts:
indexing-hadoop/src/test/java/io/druid/indexer/HadoopDruidIndexerConfigTest.java
pom.xml
publications/whitepaper/druid.pdf
publications/whitepaper/druid.tex
2014-03-03 14:48:15 -08:00
fjy
13c7f1c7b1
remove dead code
2014-02-27 15:52:19 -08:00
fjy
bf2ddda897
unit tests passing after more refactoring
2014-02-27 15:21:09 -08:00
nishantmonu51
5e0d418b4b
fix determine partitions partitioner to work in local mode
2014-02-26 16:31:42 +05:30
nishantmonu51
1ed5254d5b
improvements
...
1) Number of reducers use 1 only when intervals are to be determined
2) Read only useful bytes from BytesWritable
2014-02-26 02:51:45 +05:30
nishantmonu51
8af63005a6
refactor randomPartitionsSpec to hashedPartitionsSpec
...
refactor to a more appropriate name
2014-02-25 03:07:31 +05:30
fjy
5d2367f0fd
unit tests pass at this point
2014-02-20 15:52:12 -08:00
fjy
20cac8c506
not compiling yet but close
2014-02-19 15:54:27 -08:00
fjy
4b7c76762d
unit tests passing at this point, finished rt port maybe
2014-02-18 15:14:38 -08:00
nishantmonu51
fde7269c86
check published segments before the intermediate files are cleaned up
2014-02-15 04:30:28 +05:30
fjy
3979eb270c
Revert "Revert "Merge branch 'determine-partitions-improvements'""
...
This reverts commit 189b3e2b9b.
2014-02-14 12:58:56 -08:00
fjy
a8c4362d72
rejiggering druid api
2014-02-14 12:57:52 -08:00
fjy
189b3e2b9b
Revert "Merge branch 'determine-partitions-improvements'"
...
This reverts commit 7ad228ceb5, reversing
changes made to 9c55e2b779.
2014-02-14 12:47:34 -08:00
nishantmonu51
48d0c37f98
documentation for random partition spec
2014-02-05 15:30:44 +05:30
nishantmonu51
bacc72415f
correct locking and partitionsSpec
2014-02-05 03:17:47 +05:30
nishantmonu51
569452121e
fix partitioner for local mode
2014-01-31 21:59:17 +05:30
nishantmonu51
82b748ad43
review comments
2014-01-31 20:19:33 +05:30
nishantmonu51
97e5d68635
determine intervals working with determine partitions
2014-01-31 19:04:52 +05:30
nishantmonu51
5fd76067cd
remove logging and use new determine partition job
2014-01-31 13:51:38 +05:30
nishantmonu51
7ca87d59df
Determine partitions using cardinality
2014-01-31 00:49:11 +05:30
fjy
f898c29e20
fix batch indexing and prepare for next release
2014-01-17 15:52:04 -08:00
fjy
3b17c4c03c
a whole bunch of docs and fixes
2014-01-13 18:01:56 -08:00
fjy
1ecc94cfb6
another attempt at index task
2014-01-10 17:56:22 -08:00
Hagen Rother
52746b8ea6
fix hadoop intake's parser exception catching (was too specific)
2013-12-19 07:04:47 +01:00
fjy
a1c09df17f
make the hadoop index task work again
2013-10-16 09:45:17 -07:00
cheddar
c47fe202c7
Fix HadoopDruidIndexer to work with the new way of things
...
There are multiple and sundry changes in here.
First, "HadoopDruidIndexer" has been split into two pieces, (1) CliHadoop which pulls the hadoop version and builds up the right classpath with the proper hadoop version to run the indexer and (2) CliInternalHadoopIndexer which actually runs the indexer.
In order to work around a bunch of jets3t version conflicts with Hadoop and Druid, I needed to extract the S3 deep storage stuff into its own module. I then also moved the HDFS stuff into its own module so that I could eliminate the dependency on Hadoop for druid-server.
In doing these changes, I wanted to make the extensions buildable with only the druid-api jar, so a few other things had to move out of Druid and into druid-api. They are all API-level things, however, so they really belong in druid-api instead.
Lastly, I removed the druid-realtime module and put it all in druid-server.
2013-10-09 15:15:44 -05:00
fjy
a79ad7bab4
make dynamic master resource configuration work again
2013-09-27 15:00:40 -07:00
fjy
8bc56daa66
fix things up according to code review comments
2013-09-26 11:35:45 -07:00
fjy
87259321b6
port hadoop druid indexer to new guice framework
2013-09-26 11:04:42 -07:00
cheddar
3c39f90c89
1) Move Firehose interface and dependencies to druid-api
...
2) Move DataSegment* interfaces and dependencies to druid-api
2013-08-31 16:43:28 -05:00
cheddar
5ab671050e
No more com.metamx.druid, it is now all io.druid!
2013-08-30 19:42:12 -05:00
cheddar
bd0756e360
More stuff moved, things still compiling and tests still passing. Yay!
2013-08-30 18:58:35 -05:00
cheddar
56e2b956d0
OMG!!! A lot of stuff has been moved. Modules have been created and destroyed, but everything is compiling and unit tests are passing, OMFG this is awesome!
2013-08-30 18:21:04 -05:00
cheddar
2a46086e20
1) Didn't remove the io.druid files from client. Remove those and make sure things compile
...
2) Switch DefaultObjectMapper to CommonObjectMapper
3) Create new DefaultObjectMapper in client that has Query stuff registered on it by default
2013-08-29 15:25:36 -05:00
cheddar
9c30ced5ea
1) Move various "api" classes to io.druid packages and make sure things compile and stuff
2013-08-28 15:51:02 -05:00
cheddar
5fa944dd26
Merge branch 'master' into guice
...
Conflicts:
client/src/main/java/com/metamx/druid/coordination/BatchDataSegmentAnnouncer.java
client/src/main/java/com/metamx/druid/curator/announcement/Announcer.java
client/src/main/java/com/metamx/druid/query/filter/SelectorDimFilter.java
client/src/main/java/com/metamx/druid/query/search/SearchQueryQueryToolChest.java
indexing-service/src/main/java/com/metamx/druid/indexing/common/tasklogs/S3TaskLogs.java
indexing-service/src/main/java/com/metamx/druid/indexing/coordinator/ForkingTaskRunner.java
indexing-service/src/main/java/com/metamx/druid/indexing/coordinator/RemoteTaskRunner.java
indexing-service/src/main/java/com/metamx/druid/indexing/worker/WorkerCuratorCoordinator.java
indexing-service/src/test/java/com/metamx/druid/indexing/coordinator/RemoteTaskRunnerTest.java
pom.xml
server/src/main/java/com/metamx/druid/http/MasterMain.java
server/src/main/java/com/metamx/druid/http/MasterServletModule.java
server/src/main/java/com/metamx/druid/master/DruidMasterConfig.java
server/src/test/java/com/metamx/druid/master/DruidMasterTest.java
server/src/test/java/com/metamx/druid/query/group/GroupByQueryRunnerTest.java
2013-08-27 14:27:32 -05:00
fjy
d11d0a8284
fix according to code review
2013-08-22 10:49:46 -07:00
fjy
778fd0f10e
Fix persist of empty indexes in index generator job
2013-08-22 10:16:43 -07:00
cheddar
eee1efdcb5
Merge branch 'master' into guice
...
Conflicts:
client/src/main/java/com/metamx/druid/client/DruidServerConfig.java
indexing-service/src/main/java/com/metamx/druid/indexing/common/index/ChatHandlerProvider.java
indexing-service/src/main/java/com/metamx/druid/indexing/coordinator/TaskMasterLifecycle.java
indexing-service/src/main/java/com/metamx/druid/indexing/worker/executor/ExecutorNode.java
indexing-service/src/test/java/com/metamx/druid/indexing/coordinator/TaskLifecycleTest.java
2013-08-06 13:33:31 -07:00
cheddar
3c808b15c3
1) Fix HadoopDruidIndexerConfigTest to actually verify the current correct behavior.
2013-08-05 11:37:20 -07:00
cheddar
2b71505421
1) Fix HadoopDruidIndexerConfig to no longer replace ":" with "_" on the segmentOutputDir. The segmentOutputDir is user-supplied so they should have the ability to just not set a bad directory.
2013-08-05 11:22:26 -07:00
cheddar
2361e0112a
Make it all compile again...
2013-08-02 10:14:46 -07:00
cheddar
9e78bb38f5
Merge branch 'master' into guice
...
Conflicts:
client/src/main/java/com/metamx/druid/QueryableNode.java
client/src/main/java/com/metamx/druid/client/ServerInventoryView.java
client/src/main/java/com/metamx/druid/coordination/SingleDataSegmentAnnouncer.java
client/src/main/java/com/metamx/druid/initialization/CuratorDiscoveryConfig.java
client/src/main/java/com/metamx/druid/query/MetricsEmittingExecutorService.java
indexing-hadoop/src/test/java/com/metamx/druid/indexer/HadoopDruidIndexerConfigTest.java
indexing-service/src/main/java/com/metamx/druid/indexing/common/TaskToolbox.java
indexing-service/src/main/java/com/metamx/druid/indexing/coordinator/http/IndexerCoordinatorNode.java
indexing-service/src/main/java/com/metamx/druid/indexing/worker/executor/ExecutorNode.java
indexing-service/src/main/java/com/metamx/druid/indexing/worker/http/WorkerNode.java
pom.xml
server/src/main/java/com/metamx/druid/coordination/ServerManager.java
server/src/main/java/com/metamx/druid/coordination/ZkCoordinator.java
server/src/main/java/com/metamx/druid/db/DatabaseRuleManager.java
server/src/main/java/com/metamx/druid/db/DatabaseSegmentManager.java
server/src/main/java/com/metamx/druid/http/ComputeNode.java
server/src/main/java/com/metamx/druid/http/MasterMain.java
server/src/main/java/com/metamx/druid/loading/SegmentLoaderConfig.java
server/src/main/java/com/metamx/druid/loading/SingleSegmentLoader.java
server/src/main/java/com/metamx/druid/master/DruidMaster.java
2013-08-01 16:42:47 -07:00
Jan Rudert
ad087a7a22
correct segment path for hadoop indexer
2013-07-10 09:21:45 +02:00
cheddar
2f56c24259
1) Inject IndexingServiceClient
...
2) Switch all the DBI references to IDBI
2013-06-07 17:37:33 -07:00
cheddar
f68df7ab69
1) Make tests work and continue trying to make the DruidMaster start up with just Guice
2013-06-07 12:01:46 -07:00
fjy
42cc87a294
Merge branch 'master' into refactor-indexing
...
Conflicts:
indexing-service/src/main/java/com/metamx/druid/indexing/common/task/IndexTask.java
pom.xml
2013-05-31 17:28:59 -07:00
fjy
08d84001ba
Merge branch 'master' into refactor-indexing
2013-05-16 16:03:29 -07:00
fjy
26e0eb62cb
merge and other refactorings
2013-05-15 17:28:08 -07:00