Commit Graph

170 Commits

Author SHA1 Message Date
Charles Allen abae47850a Add backwards compatibility for PR #1922 2015-11-11 10:27:00 -08:00
Charles Allen 1df4baf489 Move Jackson Guice adapters into io.druid
* Removes access to protected methods in com.fasterxml
* Eliminates druid-common's use of foreign package com.fasterxml
2015-11-09 10:50:45 -08:00
Charles Allen 929b981710 Change DefaultObjectMapper to NOT overwrite final fields unless explicitly asked to 2015-11-05 18:10:13 -08:00
Lou Marvin Caraig c924f9fe56 Added cloudfiles-extensions in order to support Rackspace's cloudfiles as deep storage 2015-11-04 17:44:48 +01:00
Himanshu Gupta e9cfb7f46f refer to top level property for hadoop version instead of hardcoding 2.3.0 2015-10-26 15:51:48 -05:00
Xavier Léauté e4ac78e43d bump next snapshot to 0.9.0 2015-10-20 13:46:13 -07:00
Xavier Léauté 4c2c7a2c37 update version to 0.8.3 2015-10-14 21:40:55 -07:00
Gian Merlino e3bb93e8c7 Revert "Merge pull request #1781 from dclim/nested-groupby-multiple-same-aggregator-fix-v2"
This reverts commit dae488b7c0, reversing
changes made to 397be4b897.
2015-10-01 00:05:59 -04:00
Fangjin Yang dae488b7c0 Merge pull request #1781 from dclim/nested-groupby-multiple-same-aggregator-fix-v2
Fix failure in nested groupBy with multiple aggregators with same fie…
2015-09-30 22:28:34 -04:00
David Lim 70ae5ca922 Fix failure in nested groupBy with multiple aggregators with same fieldName
Version 2 - Throws an exception if an outer query references an
aggregator that doesn't exist in the inner query, and then uses the
inner query aggregator names to form the columns for the intermediate
incremental index.

Also deleted all the getRequiredColumns() methods which are no longer
being used.

We do something wacky by adding an aggregator factory for the post
aggregators when building the intermediate incremental index, otherwise
queries on post aggregate results fail because the data isn't in the
incremental index.

Closes #1419
2015-09-30 15:43:11 -06:00
Charles Allen bc22d4ff6c Cleanup kafka-extraction-namespace
Remove extra build defines in kafka-extraction-namespace's pom.xml
2015-09-30 11:33:04 -07:00
Xavier Léauté 8a21b4cae3 Merge pull request #1697 from metamx/betterMissingQTLLogging
Better logging of URIExtractionNamespace failures due to missing files
2015-09-15 15:29:27 -07:00
Charles Allen f5ed6e885c Merge pull request #1702 from himanshug/double_datasource_in_storage_dir
do not have dataSource twice in path to segment storage on hdfs
2015-09-15 14:00:35 -07:00
Fangjin Yang 34ef81572d Merge pull request #1700 from himanshug/update_agg_test_helper
update indexing in the helper to use multiple persists and merge
2015-09-14 06:56:29 -07:00
Himanshu Gupta b989a7054c fix for "java.io.IOException: No FileSystem for scheme: hdfs" error
aka workaround for https://issues.apache.org/jira/browse/HDFS-8750
2015-09-11 15:35:59 -05:00
Himanshu Gupta 67aa3dc153 on HDFS store segments in "dataSource/interval/.." and not in "dataSource/dataSource/interval.." 2015-09-09 11:12:01 -05:00
Himanshu Gupta 5da58e48e0 use Rule based TemporaryFolder for cleanup of temp directory/files 2015-09-09 11:10:33 -05:00
Charles Allen 1977ac9c5d Better logging of URIExtractionNamespace failures due to missing files 2015-09-08 13:33:32 -07:00
Charles Allen 0b8a3035c6 Better timing and locking in NamespaceExtractionCacheManagerExecutorsTest 2015-09-04 13:02:14 -07:00
Nishant 0096e6a0a0 Merge pull request #1658 from metamx/cleanupJDBCExtractionNamespaceTest
Hopefully add better timeouts and ordering to JDBCExtractionNamespaceTest
2015-09-02 23:49:49 +05:30
Xavier Léauté 82f9ecf56b Merge pull request #1620 from metamx/longFriendlyQTL
Allow long values in the key or value fields for URIExtractionNamespace
2015-09-02 10:55:35 -07:00
cheddar 4f61b42f40 Merge pull request #1578 from b-slim/fix_extraction_filter_2
Fix UT and documentation to the extraction filter
2015-09-01 10:46:20 -07:00
Gian Merlino 940e1aa3eb Replace funky imports with standard ones.
1) Lots of Guava imports were not coming from the actual Guava
2) junit.framework.Assert should be org.junit.Assert
2015-08-28 18:02:05 -07:00
Himanshu Gupta 2e0dd1d792 adding UTs and addressing review comments to
firehoseV2 addition to Realtime[Manager|Plumber],
essential segment metadata persist support,
kafka-simple-consumer-firehose extension patch
2015-08-27 20:50:46 -05:00
lvjq 2237a8cf0f kafka 8 simple consumer firehose 2015-08-27 20:50:46 -05:00
Charles Allen ac8e32b58e Hopefully add better timeouts and ordering to JDBCExtractionNamespaceTest 2015-08-26 23:05:51 -07:00
Charles Allen b24a88b328 Allow long values in the key or value fields for URIExtractionNamespace 2015-08-26 09:44:03 -07:00
Fangjin Yang 33b862166a Merge pull request #1659 from himanshug/segment_kill_update
on kill segment, dont leave version, interval and dataSource dir behind on HDFS
2015-08-26 07:23:20 -07:00
Xavier Léauté c4d0e8d29b remove unnecessary pom verbiage 2015-08-25 16:07:03 -07:00
Gian Merlino 2bf9a70bfa Consolidate SQL retrying by moving logic into the connectors.
Also change boolean removeLock to void addLock in MetadataStorageActionHandler.
2015-08-25 12:42:29 -07:00
Himanshu Gupta 5b5a76ef6c adding unit test for HdfsDataSegmentKiller.testKill(..) 2015-08-23 22:21:03 -05:00
Himanshu Gupta c2bebfe39e delete version, interval, dataSource directories on segment deletion if possible, so that they are not left behind and consume ns quota on HDFS 2015-08-23 22:06:12 -05:00
Himanshu Gupta 9b54124cd0 pseudo integration tests for approximate histogram 2015-08-20 01:27:20 -05:00
Xavier Léauté 1abcd75696 Merge pull request #1624 from metamx/expandTimeouts
Expand timeouts on JDBCExtractionNamespaceTest
2015-08-18 21:32:50 -07:00
Xavier Léauté 3b2e41e42a update for next release 2015-08-18 17:16:46 -07:00
Charles Allen 38110820c3 Expand timeouts on JDBCExtractionNamespaceTest 2015-08-18 14:28:40 -07:00
Charles Allen db19d2d547 Revert "Update to guice 4.0" 2015-08-14 09:26:07 -07:00
Charles Allen 76fbb12959 Increase timeout in tests for NamespaceExtractionCacheManagerExecutorsTest 2015-08-11 13:54:54 -07:00
Charles Allen 7e61216287 Update to guice 4.0
- Mark a lot of `@Provides` methods as final since guice 4.0 disallows overriding them
2015-08-10 13:57:18 -07:00
Charles Allen 8be82c00bd Better handling of slow stuff in NamespaceExtractionCacheManagerExecutorsTest 2015-08-07 15:11:54 -07:00
Charles Allen e6226968a6 Merge pull request #1589 from druid-io/fix-firehose-doc
Add a lot more docs for firehoses
2015-08-06 12:45:24 -07:00
Charles Allen 8cdcf69714 Better handle timeouts in namespace tests 2015-08-06 10:20:18 -07:00
fjy 012fff6616 fix firehose docs 2015-08-04 09:52:23 -07:00
Slim Bouguerra 7848429cbf unused imports 2015-08-03 14:50:52 -05:00
Fangjin Yang 22567946cf Merge pull request #1259 from metamx/queryTimeLookup
Query Time Lookup
2015-07-28 11:43:05 -10:00
Himanshu cc50217eb0 Merge pull request #1568 from metamx/detailedSegmentLoadingErrors
More detailed error logging on segment activities
2015-07-28 13:31:16 -05:00
Charles Allen 86ede702b1 Add namespaced lookups as extensions
* Adds kafka, URI, and JDBC namespace definitions
* Add ability to explicitly rename using a "namespace" which is a particular data collection that is loaded on all realtime, historic nodes, and brokers. If any of these nodes has the namespace extension, ALL nodes have the namespace extension.
* Add namespace caching and populating (can be on heap or off heap)
* Add NamespaceExtractionCacheManager for handling caches
* Added ExtractionNamespace for handling metadata on the extraction namespaces
* Added ExtractionNamespaceUpdate for handling metadata related to updates
* Add extension which caches renames from a kafka stream (requires kafka8)
* Added README.md for the namespace kafka extension
* Added docs
* Added namespace/size, namespace/count, namespace/deltaTasksStarted metrics

Add static config for namespaces via `druid.query.extraction.namespace`
* This is a rebase of https://github.com/b-slim/druid/tree/static_config_only
2015-07-28 11:14:14 -07:00
Charles Allen c492d4448d More detailed S3DataSegmentKiller error messages 2015-07-27 13:45:03 -07:00
Charles Allen fe7818ddd2 More detailed AzureDataSegmentKiller error messages 2015-07-27 13:44:59 -07:00
Charles Allen 3f901e7291 More detailed logging of error message on S3DataSegmentMover 2015-07-27 13:28:54 -07:00
Charles Allen e051e93d19 Merge pull request #1518 from RealROI/more-azure-features
Azure Blob Store support for Firehose and Indexing Service Logs
2015-07-17 16:10:22 -07:00
Zak Kristjanson 0bda7af52c Add more support for Azure Blob Store
Azure Blob Store support for Task Logs and a firehose for data ingestion
2015-07-17 15:38:21 -07:00
Xavier Léauté 4cfb00bc8a increment version 2015-07-15 13:09:05 -07:00
Hao Xia 1931491c9f A couple of hdfs related fixes
* Class loading issue with hdfs-storage extension
* Exception when using hdfs with non-fully qualified segment path
2015-06-19 17:22:20 -07:00
Xavier Léauté 0a5bb909a2 [maven-release-plugin] prepare for next development iteration 2015-06-18 17:35:19 -07:00
Xavier Léauté 59c6b2b279 [maven-release-plugin] prepare release druid-0.8.0-rc1 2015-06-18 17:35:14 -07:00
Charles Allen f48db09e35 Add optimizations for ExtractionFn by enabling MANY_TO_ONE vs ONE_TO_ONE codepaths
* Also adds LookupExtractionFn and MapLookupExtractor which takes in an explicit mapping of renames
* Add injective to javascript extraction fn
2015-06-02 12:22:56 -07:00
fjy be2a35188e Additional schema validations and better logs for common extensions 2015-05-27 16:25:02 -07:00
cheddar c1b1752595 Merge pull request #1383 from metamx/psql-transient
retry transient exceptions for PostgreSQL, fixes #1382
2015-05-22 13:01:53 -07:00
Xavier Léauté 6b23e02d2b retry transient exceptions for PostgreSQL, fixes #1382 2015-05-22 14:47:27 -04:00
flow 07659f30ab bug fix: hdfs task log and indexing task not work properly with Hadoop HA 2015-05-21 20:49:42 +08:00
Xavier Léauté 3c3db7229c Merge pull request #1355 from himanshug/long_max_min_aggregators
Long max/min aggregators
2015-05-13 12:08:11 -07:00
Himanshu Gupta d0ec945129 adding aliases doubleMax and doubleMin for max and min respectively
renamed all [Max/Min]*.java to [DoubleMax/DoubleMin]*.java and created [Max/Min]AggregatorFactory.java which can be removed when we dont need the min/max aggregator type backward compatibility
2015-05-13 09:25:41 -05:00
fjy 7a6acf5c1b update pom to 0.8 2015-05-11 19:41:58 -06:00
David Rodrigues 11a76169b4 Overall improvement on Azure Deep Storage extension.
* Remove hard-coded azure path manipulation from the puller.
  * Fix segment size not being zero after uploading it to Azure.
  * Remove both index and desc files only on a successful upload to Azure.
  * Add Azure container name to load spec.
      This patch would help future-proof azure deep-storage module and avoid
      having to introduce ugly backwards-compatibility fixes when we want to
      support multiple containers or moving data between containers.
2015-05-05 15:17:25 -07:00
Charles Allen 16a0c40d4c Fix concatenated gzip files in StaticS3FirehoseFactory 2015-04-24 15:06:28 -07:00
David Pinheiro baeef08c4c Add Microsoft Azure as a Deep Storage option. 2015-04-16 15:39:36 -07:00
Charles Allen abdeaa0746 Add stricter checking for potential coding errors
Can use via `mvn clean compile test-compile -P strict`
2015-04-15 14:52:25 -07:00
Charles Allen b29816bddb Minor fix in hdfs-storage pom.xml 2015-04-08 14:29:16 -07:00
Fangjin Yang 208e307915 Merge pull request #1251 from metamx/uriSegmentLoaders
Revert "Revert "Overhaul of SegmentPullers to add consistency and retries""
2015-03-30 17:43:51 -07:00
fjy aea7f9d192 [maven-release-plugin] prepare for next development iteration 2015-03-30 16:35:24 -07:00
fjy 060d7aef03 [maven-release-plugin] prepare release druid-0.7.1 2015-03-30 16:35:20 -07:00
Charles Allen 1c6cbea89c Revert "Revert "Overhaul of SegmentPullers to add consistency and retries""
This reverts commit f904bc7858.
2015-03-30 13:40:04 -07:00
Fangjin Yang f904bc7858 Revert "Overhaul of SegmentPullers to add consistency and retries" 2015-03-30 13:15:50 -07:00
Charles Allen 6d407e8677 Add URI handling to SegmentPullers
* Requires https://github.com/druid-io/druid-api/pull/37
* Requires https://github.com/metamx/java-util/pull/22
* Moves the puller logic to use a more standard workflow going through java-util helpers instead of re-writing the handlers for each impl
  * General workflow goes like this: 1) LoadSpec makes sure the correct Puller is called with the correct parameters. 2) The Puller sets up general information like how to make an InputStream, how to find a file name (for .gz files for example), and when to retry. 3) CompressionUtils does most of the heavy lifting when it can
2015-03-30 12:33:23 -07:00
Prajwal Tuladhar fb7005435b use ByteSink and ByteSource instead of OutputSupplier and InputSupplier
They are being deprecated and will eventually be removed in Guava 18.0
2015-03-26 14:45:00 -04:00
Charles Allen 3ed4b19201 Update mysql-connector-java to 5.1.34 2015-03-23 15:43:34 -07:00
fjy b389cfe404 [maven-release-plugin] prepare for next development iteration 2015-03-19 12:38:17 -07:00
fjy 60e7d543cc [maven-release-plugin] prepare release druid-0.7.1-rc1 2015-03-19 12:38:13 -07:00
fjy 6a47c1530c update versions to prepare for rc release 2015-03-19 11:39:38 -07:00
Xavier Léauté 11b3230602 update to kafka 0.8.2.1, because it's better™ 2015-03-12 09:59:24 -07:00
Xavier Léauté 217e674063 Handling aggregators and post aggregators with duplicate names
* add test for same-name groupBy hyperUniques post-agg
* add test for same-name post-agg in groupby with approx histogram
* Fixes https://github.com/druid-io/druid/issues/1045
* Throws an error if post aggs and aggs do not have unique names
* Add more groupBy tests for Having filters
2015-03-10 17:10:43 -07:00
Fangjin Yang e8605c63a9 Merge pull request #1150 from himanshug/broker-parallel-chunk-process
interval chunk query runner now processes individual chunk in a threadpool
2015-03-02 13:50:23 -08:00
Himanshu Gupta 29039fd541 interval chunk query runner now processes individual chunk in a thread pool and prints metrics query/time per chunk 2015-03-02 15:45:09 -06:00
Xavier Léauté b167dcf82c [maven-release-plugin] prepare for next development iteration 2015-02-23 14:28:06 -08:00
Xavier Léauté e81ac2ba43 [maven-release-plugin] prepare release druid-0.7.0 2015-02-23 14:27:58 -08:00
Xavier Léauté 38e8dfdc98 replace Kafka 0.8.1.1 with 0.8.2.0 stable 2015-02-13 14:48:36 -08:00
Xavier Léauté 1971c1679c do not build kafka-seven extension by default 2015-02-13 14:32:47 -08:00
Xavier Léauté 78df7f6165 Move Druid release artifacts to Sonatype
- Switch to using Druid parent POM
- Add required fields for Sonatype
- Common plugin versions and settings have been moved to the parent pom
- Cleanup artifacts and POMs for consistent formatting
- Remove org.hyperic.sigar dependency and update docs to reflect necessary jars to add at runtime when sigar is needed
2015-02-13 14:26:31 -08:00
fjy d29740ed9f [maven-release-plugin] prepare for next development iteration 2015-02-12 16:16:00 -08:00
fjy 211fd15b7e [maven-release-plugin] prepare release druid-0.7.0-rc3 2015-02-12 16:15:56 -08:00
fjy 1f12c5b2f1 [maven-release-plugin] prepare for next development iteration 2015-02-03 12:06:49 -08:00
fjy e82d431be7 [maven-release-plugin] prepare release druid-0.7.0-rc2 2015-02-03 12:06:41 -08:00
Fangjin Yang 92e616de11 Merge pull request #1077 from metamx/remove-unused-imports
remove unused imports
2015-02-02 10:45:27 -08:00
nishantmonu51 ba932bb1f2 remove unused imports 2015-02-02 21:53:39 +05:30
fjy d05032b98a towards a community led druid 2015-01-31 20:57:36 -08:00
Xavier Léauté f00872c41b move common AWS related classes into a separate module 2015-01-29 13:55:49 -08:00
fjy 1f94de22c6 [maven-release-plugin] prepare for next development iteration 2015-01-20 14:23:55 -08:00
fjy 17476edc31 [maven-release-plugin] prepare release druid-0.7.0-rc1 2015-01-20 14:23:51 -08:00
Xavier Léauté c532d07635 Merge pull request #1011 from metamx/log4j2
Upgrade to log4j2
2015-01-20 12:51:07 -08:00