Commit Graph

1171 Commits

Author SHA1 Message Date
Xavier Léauté e79284da59 new interval based cost function (#2972)
* new interval based cost function

Addresses issues with balancing of segments in the existing cost function
- `gapPenalty` led to clusters of segments ~30 days apart
- `recencyPenalty` caused imbalance among recent segments
- size-based cost could be skewed by compression

New cost function is purely based on segment intervals:
- assumes each time-slice of a partition is a constant cost
- cost is additive, i.e. cost(A, B union C) = cost(A, B) + cost(A, C)
- cost decays exponentially based on distance between time-slices

* comments and formatting

* add more comments to explain the calculation
2016-05-17 09:56:00 -07:00
michaelschiff 2203a812bc statsd-emitter (#2410) 2016-04-28 18:41:02 -07:00
Xavier Léauté fc91120b54 Merge pull request #2857 from metamx/upgrade-zk
upgrade zookeeper client dependency to 3.4.8
2016-04-20 10:36:07 +05:30
Xavier Léauté 838768c632 upgrade curator, fixes #2829 (#2849) 2016-04-18 13:17:36 -07:00
Himanshu Gupta 308211cc18 math expression language with parser/lexer generated using ANTLR 2016-04-08 11:40:29 -05:00
DuNinglin [杜宁林] 0f67ff7dfb reoganize code folder according to recent upstream folder changes, seperate it from avro code and take it into extensions-conrib. docs rewite too 2016-03-30 11:21:41 +08:00
Fangjin Yang 62c1dc7a09 Merge pull request #2602 from binlijin/distinctcount
implement special distinctcount
2016-03-28 17:20:17 -07:00
Gian Merlino 977e867ad8 Downgrade geoip2, exclude com.google.http-client.
Reverts "Update com.maxmind.geoip2 to 2.6.0" and exclude the google http client
from com.maxmind.geoip2. This should satisfy the original need from #2646 (wanting
to run Druid along with an upgraded com.google.http-client) while preventing
Jackson conflicts pointed out in #2717.

Fixes #2717.

This reverts commit 21b7572533.
2016-03-25 14:43:22 -07:00
Gian Merlino 7e7a886f65 Move druid-api into the druid repo.
This is from druid-api-0.3.17, as of commit 51884f1d05d5512cacaf62cedfbb28c6ab2535cf
in the druid-api repo.
2016-03-24 11:04:34 -07:00
binlijin 2729efca71 implement special distinctcount 2016-03-24 11:11:11 +08:00
jon-wei a59c9ee1b1 Support use of DimensionSchema class in DimensionsSpec 2016-03-21 13:12:04 -07:00
Gian Merlino 738dcd8cd9 Update version to 0.9.1-SNAPSHOT.
Fixes #2462
2016-03-17 10:34:20 -07:00
Nishant 773d6fe86c Merge pull request #2646 from atomx/update-maxmind
Update com.maxmind.geoip2 to 2.6.0
2016-03-14 11:20:48 -07:00
Erik Dubbelboer 21b7572533 Update com.maxmind.geoip2 to 2.6.0
com.maxmind.geoip2 2.6.0 depends on com.google.http-client 1.15.0-rc (3 years old).
When trying to include other libraries in Druid that require an up to date version of com.google.http-client this causes a problem.
2016-03-12 09:44:00 +00:00
Gian Merlino f22fb2c2cf KafkaIndexTask.
Reads a specific offset range from specific partitions, and can use dataSource metadata
transactions to guarantee exactly-once ingestion.

Each task has a finite lifecycle, so it is expected that some process will be supervising
existing tasks and creating new ones when needed.
2016-03-10 18:41:43 -08:00
Nishant ba1185963b Fix a bunch of dependencies
* Eliminate exclusion groups from pull-deps
* Only consider dependency nodes in pull-deps if they are not in the following scopes
	* provided
	* test
	* system
* Fix a bunch of `<scope>provided</scope>` missing tags
* Better exclusions for a couple of problematic libs
2016-03-10 10:18:08 -08:00
fjy e3e932a4d4 refactor extensions into core and contrib 2016-03-08 17:12:09 -08:00
Gian Merlino 004028b887 Make first few allocatePendingSegment retries quiet.
Some light retrying can happen during normal operation (SELECT -> INSERT races) and the
ensuing log messages would be scary for users.
2016-03-02 13:40:29 -08:00
Fangjin Yang 3a9fe2aad0 Merge pull request #2231 from lizhanhui/pull_request
Add druid-rocketmq module
2016-02-25 17:19:57 -08:00
Bingkun Guo 9e4c908922 generate tarball by mvn package 2016-02-18 16:42:41 -06:00
Slim Bouguerra 4e119b7a24 Adding lookup ref manager and lookup dimension spec impl 2016-02-11 12:11:51 -06:00
Charles Allen 3a26b3926c Identify druid.io as committer in pom.xml 2016-02-02 17:01:58 -08:00
Xavier Léauté e3d1e07b34 Merge pull request #2261 from metamx/improve-segment-ordering
Prioritize loading of segments based on segment interval
2016-01-27 10:05:54 -08:00
Nishant fd6bf3fe22 Use interval comparator instead of bucketMonthComparator
fix when two segments have same interval

review comments
2016-01-27 17:35:43 +05:30
Charles Allen 937ae6ad20 Update druid-api to 0.3.16
Fixes https://github.com/druid-io/druid/issues/2316
2016-01-22 14:37:16 -08:00
Slim Bouguerra e0d90f875c Graphite emitter 2016-01-21 13:43:37 -06:00
Fangjin Yang 1b162a67ff Merge pull request #2235 from druid-io/updateCommonsIO
Update commons-io to 2.4
2016-01-10 08:48:25 -08:00
pdeva 62aa8fec94 Updated log4j version 2016-01-09 10:45:40 -08:00
Charles Allen c1abcc3ef9 Update commons-io to 2.4
Hadoop2.3.0 uses version  2.4 as per http://central.maven.org/maven2/org/apache/hadoop/hadoop-project/2.3.0/hadoop-project-2.3.0.pom
2016-01-08 21:39:50 -08:00
Li Zhanhui 8eb332c1c4 Add druid-rocketmq module 2016-01-08 08:13:04 +08:00
Charles Allen b7b4d9f284 Update bytebuffer-collections to 0.2.4
Pulls in fix for https://github.com/RoaringBitmap/RoaringBitmap/issues/61
2016-01-07 10:21:49 -08:00
Charles Allen 3c4bdb7cc8 Manually update <tag> from <scm> in pom.xml 2016-01-05 14:42:25 -08:00
Gian Merlino b93feb5e77 Update java-util, fixes #2193 2016-01-05 11:16:03 -05:00
Zhao Weinan 5e57ddb8cc Adding avro support to realtime & hadoop batch indexing. 2016-01-05 10:21:27 +08:00
Charles Allen 2097669cce Update bytebuffer-collections to 0.2.3
* Fixes https://github.com/druid-io/druid/issues/2175
2016-01-04 11:20:45 -08:00
Gian Merlino 891d639188 Remove unused kafka-seven extension. 2015-12-29 12:05:27 -05:00
fjy 398a3ec620 add docs for more specs 2015-12-17 18:06:30 -08:00
jon-wei c53bf85d83 Add docs and benchmark for JSON flattening parser 2015-12-09 16:13:30 -08:00
Gian Merlino f6f7bec2b6 Update java-util. 2015-12-08 15:32:27 -08:00
Himanshu 5f2466afd1 Merge pull request #2045 from metamx/updateEmitter036
Update mmx emitter to 0.3.6
2015-12-05 23:20:17 -06:00
Charles Allen ea5fdc30f8 Update mmx emitter to 0.3.6
* 0.3.5 updated better logging messages
* 0.3.6 updates validator dependency to help prevent stale validator jars from being pulled in
2015-12-04 12:50:22 -08:00
Gian Merlino fde4753e25 Disable javadoc linting. 2015-12-03 19:11:29 -08:00
Himanshu Gupta 9c569be11e adding datasketches module to top level pom 2015-11-12 00:04:33 -06:00
Xavier Léauté a57cbfd2c3 Merge pull request #1387 from metamx/enableShutdownLogging
Add special handler to allow logger messages during shutdown
2015-11-09 17:20:09 -08:00
Xavier Léauté c896818241 Update curator to 2.9.1
Lots of bugfixes since 2.8.0
- https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12314425&version=12333324
- https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12314425&version=12332392
2015-11-05 15:53:01 -08:00
Lou Marvin Caraig c924f9fe56 Added cloudfiles-extensions in order to support Rackspace's cloudfiles as deep storage 2015-11-04 17:44:48 +01:00
Nishant dcd4468156 update emitter version
contains changes -
- https://github.com/metamx/emitter/pull/9
- https://github.com/metamx/emitter/pull/13
- https://github.com/metamx/emitter/pull/12
- https://github.com/metamx/emitter/pull/10
2015-10-29 17:43:03 +05:30
Nishant 20a3ebc022 update server metrics version
- fixes Sigar loading for JvmCpuMetrics
https://github.com/metamx/server-metrics/pull/16

update server metrics
2015-10-29 17:37:45 +05:30
Gian Merlino 7df7370935 Merge pull request #1862 from metamx/indexingServiceMMGone
Add timeout to shutdown request to middle manager for indexing service
2015-10-27 14:38:01 -07:00
Charles Allen 7a2ceef690 Add special handler to allow logger messages during shutdown
* Adds a special PropertyChecker interface which is ONLY for setting string properties at the very start of psvm
2015-10-27 14:33:36 -07:00