Commit Graph

95 Commits

Author SHA1 Message Date
Gian Merlino 940e1aa3eb Replace funky imports with standard ones.
1) Lots of Guava imports were not coming from the actual Guava
2) junit.framework.Assert should be org.junit.Assert
2015-08-28 18:02:05 -07:00
Himanshu Gupta 2e0dd1d792 adding UTs and addressing review comments to
firehoseV2 addition to Realtime[Manager|Plumber],
essential segment metadata persist support,
kafka-simple-consumer-firehose extension patch
2015-08-27 20:50:46 -05:00
lvjq 2237a8cf0f kafka 8 simple consumer firehose 2015-08-27 20:50:46 -05:00
Fangjin Yang 33b862166a Merge pull request #1659 from himanshug/segment_kill_update
on kill segment, dont leave version, interval and dataSource dir behind on HDFS
2015-08-26 07:23:20 -07:00
Xavier Léauté c4d0e8d29b remove unnecessary pom verbiage 2015-08-25 16:07:03 -07:00
Gian Merlino 2bf9a70bfa Consolidate SQL retrying by moving logic into the connectors.
Also change boolean removeLock to void addLock in MetadataStorageActionHandler.
2015-08-25 12:42:29 -07:00
Himanshu Gupta 5b5a76ef6c adding unit test for HdfsDataSegmentKiller.testKill(..) 2015-08-23 22:21:03 -05:00
Himanshu Gupta c2bebfe39e delete version, interval, dataSource directories on segment deletion if possible, so that they are not left behind and consume ns quota on HDFS 2015-08-23 22:06:12 -05:00
Himanshu Gupta 9b54124cd0 pseudo integration tests for approximate histogram 2015-08-20 01:27:20 -05:00
Xavier Léauté 1abcd75696 Merge pull request #1624 from metamx/expandTimeouts
Expand timeouts on JDBCExtractionNamespaceTest
2015-08-18 21:32:50 -07:00
Xavier Léauté 3b2e41e42a update for next release 2015-08-18 17:16:46 -07:00
Charles Allen 38110820c3 Expand timeouts on JDBCExtractionNamespaceTest 2015-08-18 14:28:40 -07:00
Charles Allen db19d2d547 Revert "Update to guice 4.0" 2015-08-14 09:26:07 -07:00
Charles Allen 76fbb12959 Increase timeout in tests for NamespaceExtractionCacheManagerExecutorsTest 2015-08-11 13:54:54 -07:00
Charles Allen 7e61216287 Update to guice 4.0
- Mark a lot of `@Provides` methods as final since guice 4.0 disallows overriding them
2015-08-10 13:57:18 -07:00
Charles Allen 8be82c00bd Better handling of slow stuff in NamespaceExtractionCacheManagerExecutorsTest 2015-08-07 15:11:54 -07:00
Charles Allen e6226968a6 Merge pull request #1589 from druid-io/fix-firehose-doc
Add a lot more docs for firehoses
2015-08-06 12:45:24 -07:00
Charles Allen 8cdcf69714 Better handle timeouts in namespace tests 2015-08-06 10:20:18 -07:00
fjy 012fff6616 fix firehose docs 2015-08-04 09:52:23 -07:00
Fangjin Yang 22567946cf Merge pull request #1259 from metamx/queryTimeLookup
Query Time Lookup
2015-07-28 11:43:05 -10:00
Himanshu cc50217eb0 Merge pull request #1568 from metamx/detailedSegmentLoadingErrors
More detailed error logging on segment activities
2015-07-28 13:31:16 -05:00
Charles Allen 86ede702b1 Add namespaced lookups as extensions
* Adds kafka, URI, and JDBC namespace defintions
* Add ability to explicitly rename using a "namespace" which is a particular data collection that is loaded on all realtime, historic nodes, and brokers. If any of these nodes has the namespace extension, ALL nodes have the namespace extension.
* Add namespace caching and populating (can be on heap or off heap)
* Add NamespaceExtractionCacheManager for handling caches
* Added ExtractionNamespace for handling metadata on the extraction namespaces
* Added ExtractionNamespaceUpdate for handling metadata related to updates
* Add extension which caches renames from a kafka stream (requires kafka8)
* Added README.md for the namespace kafka extension
* Added docs
* Added namespace/size, namespace/count, namespace/deltaTasksStarted metrics

Add static config for namespaces via `druid.query.extraction.namespace`
* This is a rebase of https://github.com/b-slim/druid/tree/static_config_only
2015-07-28 11:14:14 -07:00
Charles Allen c492d4448d More detailed S3DataSegmentKiller error messages 2015-07-27 13:45:03 -07:00
Charles Allen fe7818ddd2 More detailed AzureDataSegmentKiller error messgaes 2015-07-27 13:44:59 -07:00
Charles Allen 3f901e7291 More detailed logging of error message on S3DataSegmentMover 2015-07-27 13:28:54 -07:00
Charles Allen e051e93d19 Merge pull request #1518 from RealROI/more-azure-features
Azure Blob Store support for Firehose and Indexing Service Logs
2015-07-17 16:10:22 -07:00
Zak Kristjanson 0bda7af52c Add more support for Azure Blob Store
Azure Blob Store support for Task Logs and a firehose for data ingestion
2015-07-17 15:38:21 -07:00
Xavier Léauté 4cfb00bc8a inrement version 2015-07-15 13:09:05 -07:00
Hao Xia 1931491c9f A couple of hdfs related fixes
* Class loading issue with hdfs-storage extension
* Exception when using hdfs with non-fully qualified segment path
2015-06-19 17:22:20 -07:00
Xavier Léauté 0a5bb909a2 [maven-release-plugin] prepare for next development iteration 2015-06-18 17:35:19 -07:00
Xavier Léauté 59c6b2b279 [maven-release-plugin] prepare release druid-0.8.0-rc1 2015-06-18 17:35:14 -07:00
Charles Allen f48db09e35 Add optimizations for ExtractionFn by enabling MANY_TO_ONE vs ONE_TO_ONE codepaths
* Also adds LookupExtractionFn and MapLookupExtractor which takes in an explicit mapping of renames
* Add injective to javascript extraction fn
2015-06-02 12:22:56 -07:00
fjy be2a35188e Additional schema validations and better logs for common extensions 2015-05-27 16:25:02 -07:00
cheddar c1b1752595 Merge pull request #1383 from metamx/psql-transient
retry transient exceptions for PostgreSQL, fixes #1382
2015-05-22 13:01:53 -07:00
Xavier Léauté 6b23e02d2b retry transient exceptions for PostgreSQL, fixes #1382 2015-05-22 14:47:27 -04:00
flow 07659f30ab bug fix: hdfs task log and indexing task not work properly with Hadoop HA 2015-05-21 20:49:42 +08:00
Xavier Léauté 3c3db7229c Merge pull request #1355 from himanshug/long_max_min_aggregators
Long max/min aggregators
2015-05-13 12:08:11 -07:00
Himanshu Gupta d0ec945129 adding aliases doubleMax and doubleMin for max and min respectively
renamed all [Max/Min]*.java to [DoubleMax/DoubleMin]*.java and created [Max/Min]AggregatorFactory.java which can be removed when we dont need the min/max aggregator type backward compatibility
2015-05-13 09:25:41 -05:00
fjy 7a6acf5c1b update pom to 0.8 2015-05-11 19:41:58 -06:00
David Rodrigues 11a76169b4 Overall improvement on Azure Deep Storage extension.
* Remove hard-coded azure path manipulation from the puller.
  * Fix segment size not being zero after uploading it do Azure.
  * Remove both index and desc files only on a success upload to Azure.
  * Add Azure container name to load spec.
      This patch would help future-proof azure deep-storage module and avoid
      having to introduce ugly backwards-compatibility fixes when we want to
      support multiple containers or moving data between containers.
2015-05-05 15:17:25 -07:00
Charles Allen 16a0c40d4c Fix concatenated gzip files in StaticS3FirehoseFactory 2015-04-24 15:06:28 -07:00
David Pinheiro baeef08c4c Add Microsoft Azure as a Deep Storage option. 2015-04-16 15:39:36 -07:00
Charles Allen abdeaa0746 Add stricter checking for potential coding errors
Can use via `mvn clean compile test-compile -P strict'
2015-04-15 14:52:25 -07:00
Charles Allen b29816bddb Minor fix in hdfs-storage pom.xml 2015-04-08 14:29:16 -07:00
Fangjin Yang 208e307915 Merge pull request #1251 from metamx/uriSegmentLoaders
Revert "Revert "Overhaul of SegmentPullers to add consistency and retries""
2015-03-30 17:43:51 -07:00
fjy aea7f9d192 [maven-release-plugin] prepare for next development iteration 2015-03-30 16:35:24 -07:00
fjy 060d7aef03 [maven-release-plugin] prepare release druid-0.7.1 2015-03-30 16:35:20 -07:00
Charles Allen 1c6cbea89c Revert "Revert "Overhaul of SegmentPullers to add consistency and retries""
This reverts commit f904bc7858.
2015-03-30 13:40:04 -07:00
Fangjin Yang f904bc7858 Revert "Overhaul of SegmentPullers to add consistency and retries" 2015-03-30 13:15:50 -07:00
Charles Allen 6d407e8677 Add URI handling to SegmentPullers
* Requires https://github.com/druid-io/druid-api/pull/37
* Requires https://github.com/metamx/java-util/pull/22
* Moves the puller logic to use a more standard workflow going through java-util helpers instead of re-writing the handlers for each impl
  * General workflow goes like this: 1) LoadSpec makes sure the correct Puller is called with the correct parameters. 2) The Puller sets up general information like how to make an InputStream, how to find a file name (for .gz files for example), and when to retry. 3) CompressionUtils does most of the heavy lifting when it can
2015-03-30 12:33:23 -07:00