Commit Graph

1385 Commits

Author SHA1 Message Date
Roman Leventov ae900a4934 Update versions to 0.11.0-SNAPSHOT (#4483) 2017-06-28 17:05:58 -07:00
David Lim 0f99467cfb rollback to previous httpclient/httpcore versions (#4457) 2017-06-23 21:43:49 -05:00
Himanshu 61c38b66ad exclude aws-java-sdk from hadoop-aws dep in hdfs-storage module (#4437)
* exclude aws-java-sdk from hdfs-storage module

* address review comments
2017-06-22 15:56:35 -05:00
Roman Leventov 5285eb961b Update dependencies (#4313)
* Update dependencies

* Downgrade curator

* Rollback aws-java-sdk dependency to 1.10.77

* Revert exclusions in integration-tests

* Depend only on aws-java-sdk-ec2 instead of umbrella aws-java-sdk (fixes #4382)
2017-06-09 14:32:07 -07:00
Roman Leventov 31d33b333e Make using implicit system Charset an error (#4326)
* Make using implicit system charset an error

* Use StringUtils.toUtf8() and fromUtf8() instead of String.getBytes() and new String()

* Use English locale in StringUtils.safeFormat()

* Restore comment
2017-06-05 23:57:25 -07:00
Slim a2584d214a Delagate creation of segmentPath/LoadSpec to DataSegmentPushers and add S3a support (#4116)
* Adding s3a schema and s3a implem to hdfs storage module.

* use 2.7.3

* use segment pusher to make loadspec

* move getStorageDir and makeLoad spec under DataSegmentPusher

* fix uts

* fix comment part1

* move to hadoop 2.8

* inject deep storage properties

* set version to 2.7.3

* fix build issue about static class

* fix comments

* fix default hadoop default coordinate

* fix create filesytem

* downgrade aws sdk

* bump the version
2017-06-04 00:55:09 -06:00
Kenji Noguchi 3400f601db Protobuf extension (#4039)
* move ProtoBufInputRowParser from processing module to protobuf extensions

* Ported PR #3509

* add DynamicMessage

* fix local test stuff that slipped in

* add license header

* removed redundant type name

* removed commented code

* fix code style

* rename ProtoBuf -> Protobuf

* pom.xml: shade protobuf classes, handle .desc resource file as binary file

* clean up error messages

* pick first message type from descriptor if not specified

* fix protoMessageType null check. add test case

* move protobuf-extension from contrib to core

* document: add new configuration keys, and descriptions

* update document. add examples

* move protobuf-extension from contrib to core (2nd try)

* touch

* include protobuf extensions in the distribution

* fix whitespace

* include protobuf example in the distribution

* example: create new pb obj everytime

* document: use properly quoted json

* fix whitespace

* bump parent version to 0.10.1-SNAPSHOT

* ignore Override check

* touch
2017-05-30 13:11:58 -07:00
Jihoon Son 000b0ffed7 Increase the max heap size for strict compilation (#4306) 2017-05-21 03:42:44 +09:00
Roman Leventov b7a52286e8 Make @Override annotation obligatory (#4274)
* Make MissingOverride an error

* Make travis stript to fail fast

* Add missing Override annotations

* Comment
2017-05-16 13:30:30 -05:00
Roman Leventov 1ebfa22955 Update Error prone configuration; Fix bugs (#4252)
* Make Errorprone the default compiler

* Address comments

* Make Error Prone's ClassCanBeStatic rule a error

* Preconditions allow only %s pattern

* Fix DruidCoordinatorBalancerTester

* Try to give the compiler more memory

* Remove distribution module activation on jdk 1.8 because only jdk 1.8 is used now

* Don't show compiler warnings

* Try different travis script

* Fix travis.yml

* Make Error Prone optional again

* For error-prone compiler

* Increase compiler's maxmem

* Don't run Error Prone for benchmarks because of OOM

* Skip install step in Travis

* Remove MetricHolder.writeToChannel()

* In travis.yml, check compilation before tests, because it may fail faster
2017-05-12 15:55:17 +09:00
Gian Merlino 2ca7b00346 Update versions to 0.10.1-SNAPSHOT. (#4191) 2017-04-20 18:12:28 -07:00
Dongkyu Hwangbo 0d2e91ed50 Adding Kafka-emitter (#3860)
* Initial commit

* Apply another config: clustername

* Rename variable

* Fix bug

* Add retry logic

* Edit retry logic

* Upgrade kafka-clients version to the most recent release

* Make callback single object

* Write documentation

* Rewrite error message and emit logic

* Handling AlertEvent

* Override toString()

* make clusterName more optional

* bump up druid version

* add producer.config option which make user can apply another optional config value of kafka producer

* remove potential blocking in emit()

* using MemoryBoundLinkedBlockingQueue

* Fixing coding convention

* Remove logging every exception and just increment counting

* refactoring

* trivial modification

* logging when callback has exception

* Replace kafka-clients 0.10.1.1 with 0.10.2.0

* Resolve the problem related of classloader

* adopt try statement

* code reformatting

* make variables final

* rewrite toString
2017-04-04 14:07:43 -07:00
Gian Merlino 81d6b49d69 Downgrade Curator. (#4103)
Reverts #4060, fixes #4095, unfixes #4056, #3837. Better the devil you
know than the devil you don't, I always say.

See also https://issues.apache.org/jira/browse/CURATOR-394.
2017-03-23 13:44:00 -07:00
Roman Leventov 84fe91ba0b Monomorphic processing of TopN queries with 1 and 2 aggregators (key part of #3798) (#3889)
* Monomorphic processing: add HotLoopCallee, CalledFromHotLoop, RuntimeShapeInspector, SpecializationService. Specialize topN queries with 1 or 2 aggregators. Add Cursor.advanceUninterruptibly() and isDoneOrInterrupted() for exception-free query processing.

* Use Execs.singleThreaded()

* RuntimeShapeInspector to support nullable fields

* Make CalledFromHotLoop annotation Inherited

* Remove unnecessary conversion of array of ColumnSelectorPluses to list and back to array in CardinalityAggregatorFactory

* Close InputStream in SpecializationService

* Formatting

* Test specialized PooledTopNScanners

* Set flags in PooledTopNAlgorithm directly

* Fix tests, dependent on CountAggragatorFactory toString() form

* Fix

* Revert CountAggregatorFactory changes

* Implement inspectRuntimeShape() for LongWrappingDimensionSelector and FloatWrappingDimensionSelector

* Remove duplicate RoaringBitmap dependency in the extendedset pom.xml

* Fix

* Treat ByteBuffers specially in StringRuntimeShape

* Doc fix

* Annotate BufferAggregator.init() with CalledFromHotLoop

* Make triggerSpecializationIterationsThreshold an int

* Remove SpecializationService.PerPrototypeClassState.of()

* Add comments

* Limit the amount of specializations that SpecializationService could make

* Add default implementation for BufferAggregator.inspectRuntimeShape(), for compatibility with extensions

* Use more efficient ConcurrentMap's idioms in SpecializationService
2017-03-17 14:44:36 -05:00
Gian Merlino 9cd666282c Update Curator to 2.12.0. (#4060)
Fixes #4056, #3837.
2017-03-15 09:38:31 -07:00
Charles Allen 805d85afda Allow compilation as Java8 source and target (#3328)
* Allow compilation as Java8 source and target for everything except API

* Remove conditions in tests which assume that we may run with Java 7

* Update easymock to 3.4

* Make Animal Sniffer to check Java 1.8 usage; remove redundant druid-caffeine-cache configuration

* Use try-with-resources in LargeColumnSupportedComplexColumnSerializerTest.testSanity()

* Remove java7 special for druid-api
2017-03-14 22:23:47 -06:00
Eugene Sevastyanov 16bf62bacc BACKEND-564: Emitter upgrade from 0.4.0 to 0.4.1 (#3977) 2017-03-01 13:03:01 -08:00
Gian Merlino 78b0d134ae Require Java 8 and include some Java 8 dependencies. (#3914)
* Require Java 8 and include some Java 8 dependencies.

- Upgrade Jetty to 9.3.16.v20170120.
- Upgrade DataSketches to 0.8.4.
- Bundle caffeine-cache by default.
- Still target Java 7 when compiling base Druid classes.

* Update cluster, quickstart docs.

* Remove oraclejdk7 from travis.yml.
2017-02-14 12:51:51 -08:00
Roman Leventov 38000576ea Optimizations of union, intersection and iterators of concise bitsets (part of #3798) (#3883)
* Port of metamx/extendedset#10, metamx/extendedset#13, metamx/extendedset#14, metamx/extendedset#15, metamx/bytebuffer-collections@9b199e3349, metamx/bytebuffer-collections#38 to Druid, remove unused code from extendedset module

* Remove ConciseSet.modCount

* Replace comments with assertions in ImmutableConciseSet

* Fix comments

* Fix asssertions in ImmutableConciseSet

* Add tests

* Comment fix
2017-02-10 18:02:26 -08:00
Gian Merlino 9191588656 Fix mvn javadoc:jar failure due to HadoopFsWrapper. (#3912) 2017-02-08 13:54:41 -06:00
Gian Merlino 12317fd001 Bump version to 0.10.0-SNAPSHOT. (#3913) 2017-02-06 17:54:35 -08:00
DaimonPl 93b71e265e Extract HLL related code to separate module (#3900) 2017-02-03 09:45:11 -08:00
Nishant Bangarwa a457cded28 Druid Extension to enable Authentication using Kerberos. (#3853)
* Add extension for supporting kerberos security

- This PR adds an extension for supporting druid authentication via
Kerberos.
- Working on the docs.

* Add docs

* review comments

* more review comments

* Block all paths by default

* more review comments - use proper Oid

* Allow extensions to override httpclient for integration tests

* Add kerberos lock to prevent multithreaded issues.

* review comment - remove enabled flag and fix router injection

* Add Cookie Handling and more detailed docs

* review comment - rename DruidKerberosConfig -> AuthKerberosConfig

* review comments

* fix travis failure on jdk7
2017-02-02 14:55:21 -06:00
Himanshu efb1b40fe0 build sqlserver-metadata-storage contrib extension (#3871) 2017-01-20 14:39:15 -08:00
kaijianding 33ae9dd485 streaming version of select query (#3307)
* streaming version of select query

* use columns instead of dimensions and metrics;prepare for valueVector;remove granularity

* respect query limit within historical

* use constant

* fix thread name corrupted bug when using jetty qtp thread rather than processing thread while working with SpecificSegmentQueryRunner

* add some test for scan query

* add scan query document

* fix merge conflicts

* add compactedList resultFormat, this format is better for json ser/der

* respect query timeout

* respect query limit on broker

* use static consts and remove unused code
2017-01-19 16:09:53 -06:00
Slim ae5a349a54 Exclude the transitive dependency LGPL jar since it is not needed (#3865)
* Exclude the transitive dependency LGPL jar since it is not needed

* add reason why exclude

* exclude from the root dependency

* add banning tool  to enforce exclusions
2017-01-19 11:49:08 -08:00
Akash Dwivedi dd0c4e2ead Migrating extendedset from Metamarkets. (#3694)
* Migrating extendedset from Metamarkets.

* Notice change

* More details in NOTICE

* NOTICE formatting.

* suppress header checkstlye for extendedset.
2017-01-17 10:10:27 -08:00
Jihoon Son d80bec83cc Enable auto license checking (#3836)
* Enable license checking

* Clean duplicated license headers
2017-01-10 18:13:47 -08:00
Gian Merlino a4f81a6471 Update to Calcite 1.11.0. (#3825) 2017-01-06 14:45:17 -08:00
Gian Merlino 1f35120c7e Downgrade to avatica-server 1.8.0, skip avatica-core. (#3813)
This matches the version bundled by Calcite 1.10.0.
2017-01-03 16:00:37 -08:00
Roman Leventov 76cb06a8d8 Lookup cache refactoring (the main part of #3667) (#3697)
* Lookup cache refactoring (the main part of druid-io/druid#3667)

* Use PowerMock's static methods in NamespaceLookupExtractorFactoryTest

* Fix KafkaLookupExtractorFactoryTest

* Use VisibleForTesting annotation instead of Javadoc comment

* Create a NamespaceExtractionCacheManager separately for each test in NamespaceExtractionCacheManagersTest

* Rename CacheScheduler.NoCache.ENTRY_DISPOSED to ENTRY_CLOSED

* Reduce visibility of NamespaceExtractionCacheManager.cacheCount() and monitor() implementations, and don't run NamespaceExtractionCacheManagerExecutorsTest with off-heap cache (it didn't before)

* In NamespaceLookupExtractorFactory, use safer idiom to check if CacheState is NoCache or VersionedCache

* More logging in CacheHandler constructor and close(), VersionedCache.close()

* PR comments addressed

* Make CacheScheduler.EntryImpl AutoCloseable, avoid 'dispose' verb in comments, logging and naming in CacheScheduler in favor of 'close'

* More Javadoc comments to CacheScheduler

* Fix NPE

* Remove logging in OnHeapNamespaceExtractionCacheManager.expungeCollectedCaches()

* Make NamespaceExtractionCacheManagersTest.testRacyCreation() to have similar load to what it be before the refactoring

* Unwrap NamespaceExtractionCacheManager.scheduledExecutorService from unneeded MoreExecutors.listeningDecorator() and specify that this is ScheduledThreadPoolExecutor, which ensures happens-before between periodic runs of the tasks

* More comments on MapDbCacheDisposer.disposed

* Replace concat with Long.toString()

* Comment on why NamespaceExtractionCacheManager.scheduledExecutorService() returns ScheduledThreadPoolExecutor

* Place logging statements in VersionedCache.close() and CacheHandler.close() after actual closing logic, because logging may fail

* Make JDBCExtractionNamespaceCacheFactory and StaticMapExtractionNamespaceCacheFactory to try to close newly created VersionedCache if population has failed, as it is done already in URIExtractionNamespaceCacheFactory

* Don't close the whole CacheScheduler.Entry, if the cache update task failed

* Replace AtomicLong updateCounter and firstRunLatch with Phaser-based UpdateCounter in CacheScheduler.EntryImpl
2016-12-23 18:04:27 -08:00
Gian Merlino 6440ddcbca Fix #3795 (Java 7 compatibility). (#3796)
* Fix #3795 (Java 7 compatibility).

Also introduce Animal Sniffer checks during build, which would
have caught the original problems.

* Add Animal Sniffer on caffeine-cache for JDK8.
2016-12-21 10:19:13 -08:00
Nishant f576a0ff14 Contrib Extension for Ambari Metrics Emitter (#3767)
* Contrib Extension for Ambari Metrics Emitter

extension to enable druid to send metrics to ambari metrics server
(https://cwiki.apache.org/confluence/display/AMBARI/Metrics)

review comments

switch to public repo

* review comments

* add docs

* fix pom version

* Add link for doc page in extensions.md

* remove unused imports

* review comments

review comments

remove unused dependency

review comment
2016-12-19 11:12:47 -08:00
Gian Merlino dd63f54325 Built-in SQL. (#3682) 2016-12-16 17:15:59 -08:00
Jihoon Son 5e39578eee Enable parallel test (#3774)
* Enable parallel test

* Remove unnecessary NotThreadSafe annocation

* Randomize the start port when finding available ports

* Fix test failure

* Change to handle all negatives
2016-12-14 21:05:56 -08:00
Ninglin Du 469ab21091 [Feature] Thrift support for realtime and batch ingestion (#3418)
* Thrift ingestion plugin

1. thrift binary is platform dependent, use scrooge to generate java files to avoid style check failure
2. stream and hadoop ingesion are both supported, input format can be sequence file and lzo thrift block file.
3. base64 and protocol aware

change header

* fix conlicts in pom
2016-12-13 10:05:15 -08:00
Gleb Smirnov 07384d6f40 Update Apache curator to a non-leaky version (see CURATOR-354) (#3769) 2016-12-12 09:52:40 -08:00
Akash Dwivedi 6386e6a4dc root and java-util pom cleanup (#3764)
* Remove bytebuffer-collections dependency from the root pom and java-util pom cleanup.

* Remove json-smart exclusion from root pom
2016-12-08 11:30:19 -08:00
Gian Merlino 943982b7b0 Configurable HTTP compression. (#3759)
* Configurable HTTP compression.

* Call real-time nodes real-time processes in docs.
2016-12-07 17:40:39 -08:00
Roman Leventov 949e65165c Bitset iteration optimization and improve safety (#3753)
* Deduplicate looking for bitset.nextSetBit() in BitSetIterator.next() and hasNext()

* Add BitmapIterationTest

* More elaborate comment on why Roaring is not tested in BitmapIterationTest
2016-12-07 15:49:16 -08:00
Navis Ryu c74d267f50 Support virtual column for select query (#2511)
* Support virtual column for select query

* Addressed comments
2016-12-05 15:14:35 -08:00
Erik Dubbelboer 7d36f540e8 WIP: Add Google Storage support (#2458)
Also excludes the correct artifacts from #2741
2016-11-16 14:06:45 +05:30
Keuntae Park 094f5b851b Support Min/Max for Timestamp (#3299)
* Min/Max aggregator for Timestamp

* remove unused imports and method

* rebase and zip the test data

* add docs
2016-11-14 23:00:21 -08:00
Roman Leventov fbbb55f867 Update emitter dependency to 0.4.0 and emit "version" dimension for all druid metrics (#3679)
* Update emitter dependency to 0.4.0 and emit "version" dimension for all druid metrics, not only query metrics

* Remove unused imports

* Use empty string instead of "testing-version" as a version placeholder
2016-11-11 17:17:27 -06:00
Akash Dwivedi 3e408497b3 Migrating bytebuffercollections from Metamarkets. (#3647)
* Migrating  bytebuffercollections from Metamarkets.

* resolving code conflicts and removing <p> from bytebuffer-collections.
2016-11-11 10:51:07 -08:00
Gian Merlino 657e4512d2 Checkstyle checks for AvoidStaticImport, UnusedImports. (#3660)
Excludes tests from AvoidStaticImport, since those are used often there and
I didn't want to make this changeset too large. Production code use was minimal
and I switched those to non-static imports.
2016-11-05 11:34:36 -07:00
Akash Dwivedi 4b3bd8bd63 Migrating java-util from Metamarkets. (#3585)
* Migrating java-util from Metamarkets.

* checkstyle and updated license on java-util files.

* Removed unused imports from whole project.

* cherry pick metamx/java-util@826021f.

* Copyright changes on java-util pom, address review comments.
2016-10-21 14:57:07 -07:00
Roman Leventov 5dc95389f7 Add Checkstyle framework (#3551)
* Add Checkstyle framework

* Avoid star import

* Need braces for control flow statements

* Redundant imports

* Add NewLineAtEndOfFile check
2016-10-13 13:37:47 -07:00
Roman Leventov 85ac8eff90 Improve performance of IndexMergerV9 (#3440)
* Improve performance of StringDimensionMergerV9 and StringDimensionMergerLegacy by avoiding primitive int boxing by using IntIterator in IndexedInts instead of Iterator<Integer>; Extract some common logic for V9 and Legacy mergers; Minor improvements to resource handling in StringDimensionMergerV9

* Don't mask index in MergeIntIterator.makeQueueElement()

* DRY conversion RoaringBitmap's IntIterator to fastutil's IntIterator

* Do implement skip(n) in IntIterators extending AbstractIntIterator because original implementation is not reliable

* Use Test(expected=Exception.class) instead of try { } catch (Exception e) { /* ignore */ }
2016-10-13 08:28:46 -07:00
Gian Merlino 40f2fe7893 Bump versions to 0.9.3-SNAPSHOT (#3524) 2016-09-29 13:53:32 -07:00
John Zhang 78b06a7d7e make global http client worker threads configurable (#3514) 2016-09-28 23:18:51 -07:00
Slim 3175e17a3b Cached lookup module. first cut implementing JDBC cache (#2819) 2016-09-16 13:45:54 -07:00
Gian Merlino 76fcbd8fc5 Update Curator, ZK to latest stable versions. (#3461) 2016-09-16 09:16:14 -07:00
Gian Merlino 2613e68477 Update java-util to 0.27.10. (#3337) 2016-08-09 13:37:30 +05:30
Navis Ryu 5b3f0ccb1f Support variance and standard deviation (#2525)
* Support variance and standard deviation

* addressed comments
2016-08-04 17:32:58 -07:00
Keuntae Park 95a58097e2 Hadoop InputRowParser for Orc file (#3019)
* InputRowParser to decode OrcStruct from OrcNewInputFormat

* add unit test for orc hadoop indexing

* update docs and fix test code bug

* doc updated

* resove maven dependency conflict

* remove unused imports

* fix returning array type from Object[] to correct primitive array type

* fix to support getDimension() of MapBasedRow : changing return type of orc list from array to list

* rebase and updated based on comments

* updated based on comments

* on reflecting review comments

* fix bug in typeStringFromParseSpec() and add unit test

* add license header
2016-07-26 09:42:56 -07:00
Nishant 47894c4eff add comment for default hadoop coordinates (#3257)
1) Modify CliHadoopIndexer to share constant from `TaskConfig.DEFAULT_DEFAULT_HADOOP_COORDINATES`
2) add comment to pom.xml as discussed in
https://github.com/druid-io/druid/pull/3044

fix name
2016-07-18 15:23:11 -07:00
Gian Merlino 13d8d96bc6 Update to guice-4.1.0. (#3222) 2016-07-18 08:08:43 -07:00
Charles Allen 3f1681c16c Caffeine cache extension (#3028)
* Initial commit of caffeine cache

* Address code comments

* Move and fixup README.md a bit

* Improve caffeine readme information

* Cleanup caffeine pom

* Address review comments

* Bump caffeine to 2.3.1

* Bump druid version to 0.9.2-SNAPSHOT

* Make test not fail randomly.

See https://github.com/ben-manes/caffeine/pull/93#issuecomment-227617998 for an explanation

* Fix distribution and documentation

* Add caffeine to extensions.md

* Fix links in extensions.md

* Lexicographic
2016-07-06 15:42:54 -07:00
Gian Merlino ebf890fe79 Update master version to 0.9.2-SNAPSHOT. (#3133) 2016-06-13 13:10:38 -07:00
Charles Allen aa2982ee31 Update bytebuffer-collections to 0.2.5 (#3117) 2016-06-13 08:41:20 -07:00
Fangjin Yang 53886a677c include avro in the druid tarball (#3123) 2016-06-13 16:58:21 +05:30
David Lim 6d38dde2f8 exclude slf4j-log4j12 (#3075) 2016-06-03 11:39:23 -07:00
Charles Allen 8024b915e2 [QTL] Implement LookupExtractorFactory of namespaced lookup (#2926)
* support LookupReferencesManager registration of namespaced lookup and eliminate static configurations for lookup from namespecd lookup extensions

- druid-namespace-lookup and druid-kafka-extraction-namespace are modified
- However, druid-namespace-lookup still has configuration about ON/OFF
  HEAP cache manager selection, which is not namespace wide
  configuration but node wide configuration as multiple namespace shares
  the same cache manager

* update KafkaExtractionNamespaceTest to reflect argument signature changes

* Add more synchronization functionality to NamespaceLookupExtractorFactory

* Remove old way of using extraction namespaces

* resolve compile error by supporting LookupIntrospectHandler

* Remove kafka lookups

* Remove unused stuff

* Fix start and stop behavior to be consistent with new javadocs

* Remove unused strings

* Add timeout option

* Address comments on configurations and improve docs

* Add more options and update hash key and replaces

* Move monitoring to the overriding classes

* Add better start/stop logging

* Remove old docs about namespace names

* Fix bad comma

* Add `@JsonIgnore` to lookup factory

* Address code review comments

* Remove ExtractionNamespace from module json registration

* Fix problems with naming and initialization. Add tests

* Optimize imports / reformat

* Fix future not being properly cancelled on failed initial scheduling

* Fix delete returns

* Add more docs about whole introspection

* Add `/version` introspection point for lookups

* Add more tests and address comments

* Add StaticMap extraction namespace for testing. Also add a bunch of tests

* Move cache system property to `druid.lookup.namespace.cache.type`

* Make VERSION lower case

* Change poll period to 0ms  for StaticMap

* Move cache key to bytebuffer

* Change hashCode and equals on static map extraction fn

* Add more comments on StaticMap

* Address comments

* Make scheduleAndWait use a latch

* Sanity renames and fix imports

* Remove extra info in docs

* Fix review comments

* Strengthen failure on start from warn to error

* Address comments

* Rename namespace-lookup to lookups-cached-global

* Fix injective mis-naming
* Also add serde test
2016-05-24 10:56:40 -07:00
Xavier Léauté e79284da59 new interval based cost function (#2972)
* new interval based cost function

Addresses issues with balancing of segments in the existing cost function
- `gapPenalty` led to clusters of segments ~30 days apart
- `recencyPenalty` caused imbalance among recent segments
- size-based cost could be skewed by compression

New cost function is purely based on segment intervals:
- assumes each time-slice of a partition is a constant cost
- cost is additive, i.e. cost(A, B union C) = cost(A, B) + cost(A, C)
- cost decays exponentially based on distance between time-slices

* comments and formatting

* add more comments to explain the calculation
2016-05-17 09:56:00 -07:00
michaelschiff 2203a812bc statsd-emitter (#2410) 2016-04-28 18:41:02 -07:00
Xavier Léauté fc91120b54 Merge pull request #2857 from metamx/upgrade-zk
upgrade zookeeper client dependency to 3.4.8
2016-04-20 10:36:07 +05:30
Xavier Léauté 838768c632 upgrade curator, fixes #2829 (#2849) 2016-04-18 13:17:36 -07:00
Himanshu Gupta 308211cc18 math expression language with parser/lexer generated using ANTLR 2016-04-08 11:40:29 -05:00
DuNinglin [杜宁林] 0f67ff7dfb reoganize code folder according to recent upstream folder changes, seperate it from avro code and take it into extensions-conrib. docs rewite too 2016-03-30 11:21:41 +08:00
Fangjin Yang 62c1dc7a09 Merge pull request #2602 from binlijin/distinctcount
implement special distinctcount
2016-03-28 17:20:17 -07:00
Gian Merlino 977e867ad8 Downgrade geoip2, exclude com.google.http-client.
Reverts "Update com.maxmind.geoip2 to 2.6.0" and exclude the google http client
from com.maxmind.geoip2. This should satisfy the original need from #2646 (wanting
to run Druid along with an upgraded com.google.http-client) while preventing
Jackson conflicts pointed out in #2717.

Fixes #2717.

This reverts commit 21b7572533.
2016-03-25 14:43:22 -07:00
Gian Merlino 7e7a886f65 Move druid-api into the druid repo.
This is from druid-api-0.3.17, as of commit 51884f1d05d5512cacaf62cedfbb28c6ab2535cf
in the druid-api repo.
2016-03-24 11:04:34 -07:00
binlijin 2729efca71 implement special distinctcount 2016-03-24 11:11:11 +08:00
jon-wei a59c9ee1b1 Support use of DimensionSchema class in DimensionsSpec 2016-03-21 13:12:04 -07:00
Gian Merlino 738dcd8cd9 Update version to 0.9.1-SNAPSHOT.
Fixes #2462
2016-03-17 10:34:20 -07:00
Nishant 773d6fe86c Merge pull request #2646 from atomx/update-maxmind
Update com.maxmind.geoip2 to 2.6.0
2016-03-14 11:20:48 -07:00
Erik Dubbelboer 21b7572533 Update com.maxmind.geoip2 to 2.6.0
com.maxmind.geoip2 2.6.0 depends on com.google.http-client 1.15.0-rc (3 years old).
When trying to include other libraries in Druid that require an up to date version of com.google.http-client this causes a problem.
2016-03-12 09:44:00 +00:00
Gian Merlino f22fb2c2cf KafkaIndexTask.
Reads a specific offset range from specific partitions, and can use dataSource metadata
transactions to guarantee exactly-once ingestion.

Each task has a finite lifecycle, so it is expected that some process will be supervising
existing tasks and creating new ones when needed.
2016-03-10 18:41:43 -08:00
Nishant ba1185963b Fix a bunch of dependencies
* Eliminate exclusion groups from pull-deps
* Only consider dependency nodes in pull-deps if they are not in the following scopes
	* provided
	* test
	* system
* Fix a bunch of `<scope>provided</scope>` missing tags
* Better exclusions for a couple of problematic libs
2016-03-10 10:18:08 -08:00
fjy e3e932a4d4 refactor extensions into core and contrib 2016-03-08 17:12:09 -08:00
Gian Merlino 004028b887 Make first few allocatePendingSegment retries quiet.
Some light retrying can happen during normal operation (SELECT -> INSERT races) and the
ensuing log messages would be scary for users.
2016-03-02 13:40:29 -08:00
Fangjin Yang 3a9fe2aad0 Merge pull request #2231 from lizhanhui/pull_request
Add druid-rocketmq module
2016-02-25 17:19:57 -08:00
Bingkun Guo 9e4c908922 generate tarball by mvn package 2016-02-18 16:42:41 -06:00
Slim Bouguerra 4e119b7a24 Adding lookup ref manager and lookup dimension spec impl 2016-02-11 12:11:51 -06:00
Charles Allen 3a26b3926c Identify druid.io as committer in pom.xml 2016-02-02 17:01:58 -08:00
Xavier Léauté e3d1e07b34 Merge pull request #2261 from metamx/improve-segment-ordering
Prioritize loading of segments based on segment interval
2016-01-27 10:05:54 -08:00
Nishant fd6bf3fe22 Use interval comparator instead of bucketMonthComparator
fix when two segments have same interval

review comments
2016-01-27 17:35:43 +05:30
Charles Allen 937ae6ad20 Update druid-api to 0.3.16
Fixes https://github.com/druid-io/druid/issues/2316
2016-01-22 14:37:16 -08:00
Slim Bouguerra e0d90f875c Graphite emitter 2016-01-21 13:43:37 -06:00
Fangjin Yang 1b162a67ff Merge pull request #2235 from druid-io/updateCommonsIO
Update commons-io to 2.4
2016-01-10 08:48:25 -08:00
pdeva 62aa8fec94 Updated log4j version 2016-01-09 10:45:40 -08:00
Charles Allen c1abcc3ef9 Update commons-io to 2.4
Hadoop2.3.0 uses version  2.4 as per http://central.maven.org/maven2/org/apache/hadoop/hadoop-project/2.3.0/hadoop-project-2.3.0.pom
2016-01-08 21:39:50 -08:00
Li Zhanhui 8eb332c1c4 Add druid-rocketmq module 2016-01-08 08:13:04 +08:00
Charles Allen b7b4d9f284 Update bytebuffer-collections to 0.2.4
Pulls in fix for https://github.com/RoaringBitmap/RoaringBitmap/issues/61
2016-01-07 10:21:49 -08:00
Charles Allen 3c4bdb7cc8 Manually update <tag> from <scm> in pom.xml 2016-01-05 14:42:25 -08:00
Gian Merlino b93feb5e77 Update java-util, fixes #2193 2016-01-05 11:16:03 -05:00
Zhao Weinan 5e57ddb8cc Adding avro support to realtime & hadoop batch indexing. 2016-01-05 10:21:27 +08:00
Charles Allen 2097669cce Update bytebuffer-collections to 0.2.3
* Fixes https://github.com/druid-io/druid/issues/2175
2016-01-04 11:20:45 -08:00
Gian Merlino 891d639188 Remove unused kafka-seven extension. 2015-12-29 12:05:27 -05:00
fjy 398a3ec620 add docs for more specs 2015-12-17 18:06:30 -08:00
jon-wei c53bf85d83 Add docs and benchmark for JSON flattening parser 2015-12-09 16:13:30 -08:00
Gian Merlino f6f7bec2b6 Update java-util. 2015-12-08 15:32:27 -08:00
Himanshu 5f2466afd1 Merge pull request #2045 from metamx/updateEmitter036
Update mmx emitter to 0.3.6
2015-12-05 23:20:17 -06:00
Charles Allen ea5fdc30f8 Update mmx emitter to 0.3.6
* 0.3.5 updated better logging messages
* 0.3.6 updates validator dependency to help prevent stale validator jars from being pulled in
2015-12-04 12:50:22 -08:00
Gian Merlino fde4753e25 Disable javadoc linting. 2015-12-03 19:11:29 -08:00
Himanshu Gupta 9c569be11e adding datasketches module to top level pom 2015-11-12 00:04:33 -06:00
Xavier Léauté a57cbfd2c3 Merge pull request #1387 from metamx/enableShutdownLogging
Add special handler to allow logger messages during shutdown
2015-11-09 17:20:09 -08:00
Xavier Léauté c896818241 Update curator to 2.9.1
Lots of bugfixes since 2.8.0
- https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12314425&version=12333324
- https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12314425&version=12332392
2015-11-05 15:53:01 -08:00
Lou Marvin Caraig c924f9fe56 Added cloudfiles-extensions in order to support Rackspace's cloudfiles as deep storage 2015-11-04 17:44:48 +01:00
Nishant dcd4468156 update emitter version
contains changes -
- https://github.com/metamx/emitter/pull/9
- https://github.com/metamx/emitter/pull/13
- https://github.com/metamx/emitter/pull/12
- https://github.com/metamx/emitter/pull/10
2015-10-29 17:43:03 +05:30
Nishant 20a3ebc022 update server metrics version
- fixes Sigar loading for JvmCpuMetrics
https://github.com/metamx/server-metrics/pull/16

update server metrics
2015-10-29 17:37:45 +05:30
Gian Merlino 7df7370935 Merge pull request #1862 from metamx/indexingServiceMMGone
Add timeout to shutdown request to middle manager for indexing service
2015-10-27 14:38:01 -07:00
Charles Allen 7a2ceef690 Add special handler to allow logger messages during shutdown
* Adds a special PropertyChecker interface which is ONLY for setting string properties at the very start of psvm
2015-10-27 14:33:36 -07:00
Charles Allen 44a2b204df Add timeout to shutdown request to middle manager for indexing service 2015-10-27 13:56:03 -07:00
Bingkun Guo 4914925d65 New extension loading mechanism
1) Remove maven client from downloading extensions at runtime.
2) Provide a way to load Druid extensions and hadoop dependencies through file system.
3) Refactor pull-deps so that it can download extensions into extension directories.
4) Add documents on how to use this new extension loading mechanism.
5) Change the way how Druid tarball is generated. Now all the extensions + hadoop-client 2.3.0
are packaged within the Druid tarball.
2015-10-21 14:22:36 -05:00
Xavier Léauté e4ac78e43d bump next snapshot to 0.9.0 2015-10-20 13:46:13 -07:00
Xavier Léauté 4c2c7a2c37 update version to 0.8.3 2015-10-14 21:40:55 -07:00
Xavier Léauté 5a98d4e650 update coveralls plugin 2015-10-14 10:25:23 -07:00
Charles Allen e9b81430f4 Bump server-metrics to 0.2.5 to catch a few fixes. 2015-10-08 11:05:51 -07:00
Nishant 38664904e2 Revert Jetty version
update to 9.2.10, latest version that is working

revert to jetty 9.2.5, last known good version
2015-10-08 21:53:13 +05:30
Nishant 42e971d1c1 Merge pull request #1797 from himanshug/fix_ingest_segment_firehose_ut
ingest segment firehose ut
2015-10-02 21:57:22 +05:30
Himanshu Gupta e2b16ab281 update java-util dep version 2015-10-01 16:06:04 -05:00
Charles Allen bdae0cb135 Update httpcomponents and aws-sdk 2015-10-01 13:28:46 -07:00
Himanshu 63a3a4a254 Merge pull request #1763 from metamx/server-metrics-fixes
fix NPE and duplicate metric keys
2015-09-22 09:39:01 -05:00
Xavier Léauté 0fe9aeb3d6 fix NPE and duplicate metric keys 2015-09-21 22:50:49 -07:00
Charles Allen 045f72505c Merge pull request #1759 from metamx/update-roaring
better faster smaller roaring bitmaps
2015-09-21 18:50:19 -07:00
Xavier Léauté df6988bbd2 better faster smaller roaring bitmaps 2015-09-21 17:00:57 -07:00
Xavier Léauté af86c0e6ea update druid-api + java-util for timstamp parsing speedup 2015-09-21 09:57:29 -07:00
Himanshu Gupta ebdb612933 composing emitter module to use multiple emitters together 2015-09-09 16:45:50 -05:00
Himanshu Gupta 2e0dd1d792 adding UTs and addressing review comments to
firehoseV2 addition to Realtime[Manager|Plumber],
essential segment metadata persist support,
kafka-simple-consumer-firehose extension patch
2015-08-27 20:50:46 -05:00
lvjq 2237a8cf0f kafka 8 simple consumer firehose 2015-08-27 20:50:46 -05:00
Xavier Léauté 25853aee9b update druid-api for jackson 2.4.6 2015-08-26 19:23:37 -07:00
Gian Merlino 2a866f49df Downgrade Jackson to 2.4.6. 2015-08-26 18:25:55 -07:00
Xavier Léauté d5143c0807 update java-util to 0.27.2 2015-08-25 16:07:02 -07:00
Xavier Léauté 21aadb8927 update joda to 2.8.2 2015-08-25 16:07:02 -07:00
Xavier Léauté dc11f907c9 update jdbi to 2.63.1 2015-08-25 16:07:02 -07:00
Xavier Léauté 123f6340d5 update coveralls plugin to 3.2.0 2015-08-25 16:07:02 -07:00
Xavier Léauté 3a83a0fe40 update irc library to 1.0-0014 2015-08-25 16:07:02 -07:00
Xavier Léauté 2fa87c4bba update joda-time to 2.8 2015-08-25 16:07:02 -07:00
Xavier Léauté f681b2022d update aws-sdk to 1.10.12 2015-08-25 16:07:01 -07:00
Xavier Léauté 8be5fe091f update slf4j to 1.7.12 2015-08-25 16:07:01 -07:00
Xavier Léauté 172a44b794 update log4j to 2.3 2015-08-25 16:07:01 -07:00
Xavier Léauté 51f6a9a2c9 update jackson to 2.6.1 2015-08-25 16:07:01 -07:00
Xavier Léauté ac4a856a17 update jetty to 9.2.13.v20150730 (latest Java 7 compatible version) 2015-08-25 16:07:01 -07:00
Xavier Léauté 8b294d4d98 update mapdb to 1.0.8 2015-08-25 16:07:00 -07:00
Xavier Léauté edaaf528ab update jets3t to 0.9.4 2015-08-25 16:07:00 -07:00
Xavier Léauté 9a0c15c52c update airline to 0.7 2015-08-25 16:07:00 -07:00
Xavier Léauté d601038271 update druid-api to 0.3.11 2015-08-25 16:07:00 -07:00
Fangjin Yang e21954a70f Merge pull request #1619 from metamx/update-server-metrics
update server metrics
2015-08-25 13:58:28 -07:00
Xavier Léauté 2093187c91 rework tarball distribution:
- move assembly out of druid-services into a 'distribution' module
- create separate 'extensions-distribution' module and assembly to
  package extensions and their dependencies into a local maven
  repository
- include this extensions maven repository in the binaries tarball
2015-08-18 18:32:33 -07:00
Xavier Léauté 3b2e41e42a update for next release 2015-08-18 17:16:46 -07:00
Charles Allen db19d2d547 Revert "Update to guice 4.0" 2015-08-14 09:26:07 -07:00
Nishant a6a9886339 update server metrics
changes include -
1) https://github.com/metamx/server-metrics/pull/8 - Add custom
dimensions for Jvm and Sys monitors
2) https://github.com/metamx/server-metrics/pull/9 - Add net, tcp,
uptime and cpu metrics
2015-08-12 22:52:17 +05:30
Charles Allen c8c8169c69 Bump druid-api to 0.3.10 to include guice 4.0 update 2015-08-10 13:57:55 -07:00
Charles Allen 7e61216287 Update to guice 4.0
- Mark a lot of `@Provides` methods as final since guice 4.0 disallows overriding them
2015-08-10 13:57:18 -07:00
Charles Allen 7d5a77b882 Bumb Jersey to 1.19 2015-08-07 17:32:27 -07:00
Himanshu Gupta a9ee2a383f update druid-api to 0.3.9 2015-07-30 16:35:37 -05:00
Charles Allen 86ede702b1 Add namespaced lookups as extensions
* Adds kafka, URI, and JDBC namespace defintions
* Add ability to explicitly rename using a "namespace" which is a particular data collection that is loaded on all realtime, historic nodes, and brokers. If any of these nodes has the namespace extension, ALL nodes have the namespace extension.
* Add namespace caching and populating (can be on heap or off heap)
* Add NamespaceExtractionCacheManager for handling caches
* Added ExtractionNamespace for handling metadata on the extraction namespaces
* Added ExtractionNamespaceUpdate for handling metadata related to updates
* Add extension which caches renames from a kafka stream (requires kafka8)
* Added README.md for the namespace kafka extension
* Added docs
* Added namespace/size, namespace/count, namespace/deltaTasksStarted metrics

Add static config for namespaces via `druid.query.extraction.namespace`
* This is a rebase of https://github.com/b-slim/druid/tree/static_config_only
2015-07-28 11:14:14 -07:00
Himanshu Gupta efef20e754 update curator version to 2.8.0
to get fix for https://issues.apache.org/jira/browse/CURATOR-168
2015-07-23 13:49:55 -05:00
Xavier Léauté 4cfb00bc8a inrement version 2015-07-15 13:09:05 -07:00
Nishant 6fa49b771c Merge pull request #1453 from metamx/concicebenchmarkComplement
Update bytebuffer-collections and add basic ConciseComplementBenchmark for testing concise complements
2015-06-23 08:41:42 +05:30
Charles Allen 88b2ef5d6b Update bytebuffer-collections 2015-06-22 18:07:24 -07:00
Xavier Léauté f1951b253c Fix bad distribution of cache keys across nodes
With the existing hash function some nodes could end up with 3 times the
number of keys as others. The following changes improve that to roughly
less than 5% differences across nodes.

- switch from fnv-1a to murmur3_128 hash
- increase repetitions for ketama algorithm
- test to analyze distribution

Also updates spymemcached for recent bugfixes
2015-06-19 15:30:35 -07:00
Xavier Léauté 0a5bb909a2 [maven-release-plugin] prepare for next development iteration 2015-06-18 17:35:19 -07:00
Xavier Léauté 59c6b2b279 [maven-release-plugin] prepare release druid-0.8.0-rc1 2015-06-18 17:35:14 -07:00
Charles Allen 056cab93ed Add Hadoop Converter Job and task
* Fixes https://github.com/druid-io/druid/issues/1363
* Add extra utils in JobHelper based on PR feedback
2015-06-09 14:47:38 -07:00
nishant 81415282aa Enabling compression for multiValued dimension
Add test and refactoring

Add benchmark tests
2015-05-27 00:09:14 +05:30
fjy 7a6acf5c1b update pom to 0.8 2015-05-11 19:41:58 -06:00
fjy d6ef2c20df Update server metrics to use schemaless design 2015-04-27 15:06:25 -07:00
fjy 0c734c691a update druid-api version to fix bug 2015-04-18 15:52:39 -07:00
fjy d260515a43 update druid-api version 2015-04-17 14:58:35 -07:00
David Pinheiro baeef08c4c Add Microsoft Azure as a Deep Storage option. 2015-04-16 15:39:36 -07:00
Fangjin Yang 1cfc13cc41 Merge pull request #1286 from metamx/fix-java-util-conflict
updated dependencies to java-util 0.27
2015-04-15 17:36:30 -07:00
Xavier Léauté b535d30912 updated dependencies to java-util 0.27
java-util 0.27 is not binary compatible with 0.26
2015-04-15 16:23:01 -07:00
Charles Allen abdeaa0746 Add stricter checking for potential coding errors
Can use via `mvn clean compile test-compile -P strict'
2015-04-15 14:52:25 -07:00
Xavier Léauté 2c773dee60 work around Travis CI not setting appropriate heap limits https://github.com/travis-ci/travis-ci/issues/3396 2015-04-08 13:19:40 -07:00
Fangjin Yang 208e307915 Merge pull request #1251 from metamx/uriSegmentLoaders
Revert "Revert "Overhaul of SegmentPullers to add consistency and retries""
2015-03-30 17:43:51 -07:00
fjy aea7f9d192 [maven-release-plugin] prepare for next development iteration 2015-03-30 16:35:24 -07:00
fjy 060d7aef03 [maven-release-plugin] prepare release druid-0.7.1 2015-03-30 16:35:20 -07:00
Charles Allen 1c6cbea89c Revert "Revert "Overhaul of SegmentPullers to add consistency and retries""
This reverts commit f904bc7858.
2015-03-30 13:40:04 -07:00
Fangjin Yang f904bc7858 Revert "Overhaul of SegmentPullers to add consistency and retries" 2015-03-30 13:15:50 -07:00
Charles Allen 6d407e8677 Add URI handling to SegmentPullers
* Requires https://github.com/druid-io/druid-api/pull/37
* Requires https://github.com/metamx/java-util/pull/22
* Moves the puller logic to use a more standard workflow going through java-util helpers instead of re-writing the handlers for each impl
  * General workflow goes like this: 1) LoadSpec makes sure the correct Puller is called with the correct parameters. 2) The Puller sets up general information like how to make an InputStream, how to find a file name (for .gz files for example), and when to retry. 3) CompressionUtils does most of the heavy lifting when it can
2015-03-30 12:33:23 -07:00
Charles Allen 9cd6c08e96 Exclude log4j from curator dependencies in favor of log4j-1.2-api 2015-03-26 13:05:12 -07:00
Xavier Léauté 95602de34e Merge pull request #1230 from metamx/updateMapDbVersion107
Update MapDB to version 1.0.7
2015-03-23 14:04:25 -07:00
Charles Allen 5250d89adf Update MapDB to version 1.0.7
The update fixes a bunch of HTree issues which we have not encountered yet, but we use HTRee in many of our MapDB usages
http://www.mapdb.org/changelog.html#Version_107_2015-02-19
2015-03-20 08:31:23 -07:00
fjy b389cfe404 [maven-release-plugin] prepare for next development iteration 2015-03-19 12:38:17 -07:00
fjy 60e7d543cc [maven-release-plugin] prepare release druid-0.7.1-rc1 2015-03-19 12:38:13 -07:00
fjy 6a47c1530c update versions to prepare for rc release 2015-03-19 11:39:38 -07:00
Slim Bouguerra 03547b4088 Using new release of the http-client.
This new version 1.0.1 it has a bug fix that was  causing can not add a handler exceptions.
Bug description can be found [here](https://github.com/metamx/http-client/pull/16)
2015-03-18 09:33:45 -05:00
fjy 2c834b878e Update jets3t version to support aws v4 auth 2015-03-16 10:50:04 -07:00
cheddar 526a386f50 Merge pull request #1193 from metamx/reduce-test-verbosity
move test output to file for cleaner build logs
2015-03-12 15:42:34 -07:00
Xavier Léauté 7fb4b1d2bb Merge pull request #1144 from metamx/javaLoggingManager
Add log4j2 hooks to other loggings
2015-03-12 14:38:51 -07:00
Charles Allen edfcea18d8 Add log4j2 hooks to standard java logging 2015-03-12 14:29:41 -07:00
Xavier Léauté fc613771d2 move test output to file for cleaner build logs
- removes the need for special test log4j2.xml
2015-03-11 17:56:19 -07:00
Xavier Léauté ef842b2eae add test coverage 2015-03-09 14:32:13 -07:00
Himanshu Gupta 492148c837 updating java-util version to 0.26.15 2015-03-05 18:29:50 -06:00
Himanshu Gupta fba9d31f6e updating druid-api version to 0.3.5 2015-03-05 18:07:32 -06:00
Himanshu Gupta e6ee98e2d2 UTs update for common 2015-02-25 15:40:26 -08:00
Fangjin Yang 005f4da2c0 Merge pull request #1143 from metamx/update-rhino-1.7rc5
Update Rhino to 1.7RC5
2015-02-25 12:50:23 -08:00