Commit Graph

173 Commits

Author SHA1 Message Date
Fangjin Yang d435a9b1e9 Merge pull request #2448 from metamx/fixBigJarHadoopPlace
Fix dependencies.
2016-02-11 10:47:00 -08:00
Will Lauer 189376a6f9 Adding optional error bounds to sketch aggs and post-aggs
By setting a new optional parameter, `errorBoundsStdDev`, to the number
of standard deviations to use when computing error bounds, the return
type for both the SketchMergeAggregator and the SketchEstimate
PostAggregator can be changed from a simple double (estimate) to a JSON
object containing the estimate, expected high bound, expected low bound,
and standard deviations used when computing bounds (same value as passed
in).
2016-02-11 10:18:16 -06:00
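The return-type switch described in the commit above can be sketched as follows. This is a minimal illustration, not the extension's code: the function name `sketch_result`, the JSON field names (`estimate`, `highBound`, `lowBound`, `numStdDev`), and the symmetric bound formula based on a relative standard error of roughly 1/sqrt(k) are all assumptions made for the example. The real aggregator delegates the bound computation to the sketch library.

```python
import math
from typing import Optional, Union


def sketch_result(
    estimate: float,
    nominal_entries: int,
    error_bounds_std_dev: Optional[int] = None,
) -> Union[float, dict]:
    """Return a bare estimate, or an object with bounds when requested.

    Hypothetical helper: names and the bound formula are assumptions.
    """
    # Parameter omitted: keep the original behaviour, a plain double.
    if error_bounds_std_dev is None:
        return estimate

    # Assumed approximation: relative standard error ~ 1 / sqrt(k),
    # where k is the nominal number of entries in the sketch.
    rse = 1.0 / math.sqrt(nominal_entries)
    delta = error_bounds_std_dev * rse * estimate

    return {
        "estimate": estimate,
        "highBound": estimate + delta,
        "lowBound": max(0.0, estimate - delta),
        # Same value as passed in, echoed back to the caller.
        "numStdDev": error_bounds_std_dev,
    }
```

Calling `sketch_result(1000.0, 16384)` yields the plain estimate, while passing a third argument such as `2` yields the object form with bounds two standard deviations either side of the estimate.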
Charles Allen 40ade32a1f Fix dependencies.
* Don't put druid****selfcontained.jar at the end of the hadoop isolated classpath
* Add `<scope>provided</scope>` to prevent repeated dependency inclusion in the extension directories
2016-02-11 07:30:14 -08:00
Erik Dubbelboer f72b613499 Remove incorrect comment.
The CloudFilesDataSegmentPuller can't handle URI data pulls.
This comment was obviously copied from the s3 module and never removed.
2016-01-30 11:02:44 +00:00
Charles Allen 508734c8b0 Long constant reformatting in tests `l` --> `L` 2016-01-27 08:59:19 -08:00
Gian Merlino cac4651da0 Fix spelling of 'dimReverseExtractionNamespace'. 2016-01-26 23:08:02 -08:00
Charles Allen e941303bc6 Remove sorting of dimensions in AvroStreamInputRowParserTest
Due to https://github.com/druid-io/druid-api/pull/68
2016-01-22 16:01:41 -08:00
Slim Bouguerra e0d90f875c Graphite emitter 2016-01-21 13:43:37 -06:00
Fangjin Yang 0c31f007fc Merge pull request #1728 from himanshug/aggregators_in_segment_metadata
Store AggregatorFactory[] in segment metadata
2016-01-19 12:55:49 -08:00
Himanshu Gupta a99aef29a1 adding aggregators to segment metadata 2016-01-19 14:23:39 -06:00
Himanshu Gupta 52eb0f04a7 adding a new method getMergingFactory(..) to AggregatorFactory 2016-01-18 22:03:46 -06:00
Himanshu Gupta 77fc86c015 making AggregatorFactory abstract class 2016-01-18 22:03:46 -06:00
Himanshu Gupta dcd3a24f59 adding log line for segment being killed in HdfsDataSegmentKiller 2016-01-18 21:51:04 -06:00
Kurt Young 1f2168fae5 add IndexMergerV9
add unit tests for IndexMergerV9 and fix some bugs

add more unit tests and fix bugs

handle null values and add more tests

minor changes & use LoggingProgressIndicator in IndexGeneratorReducer

make some static class public from IndexMerger

minor changes and add some comments

changes for comments
2016-01-16 11:25:28 +08:00
Charles Allen 13c63bad72 Make timeouts more explicit on what is failing in JDBCExtractionNamespaceTest 2016-01-07 11:16:36 -08:00
Fangjin Yang aaea95ed1b Merge pull request #2207 from himanshug/theta_sketch_select_query
fix bug for thetaSketch metric not working with select queries
2016-01-07 09:46:09 -08:00
fjy 2103906a48 add pusher tests for all deep storages 2016-01-05 22:22:48 -08:00
Himanshu Gupta c6634d7c2c adding json for thetaSketch Memory object representation 2016-01-05 22:12:52 -06:00
Himanshu Gupta 62e5e45da8 add select query UT for thetaSketch 2016-01-05 22:12:52 -06:00
Himanshu Gupta 3f048f0b15 adding support to execute Select queries in AggregationTestHelper so that Select query based UTs can be written for complex aggregator implementations 2016-01-05 21:54:55 -06:00
Charles Allen 6d886da7d9 Merge pull request #2191 from duilio/fix-rackspace-cloudfiles-segment-size
store uncompressed index size on cloudfiles storage extension
2016-01-05 17:17:35 -08:00
Zhao Weinan 5e57ddb8cc Adding avro support to realtime & hadoop batch indexing. 2016-01-05 10:21:27 +08:00
Charles Allen 957646be2c Fixes to JDBCExtractionNamespaceTest 2016-01-04 09:56:07 -08:00
maurizio 5ea0b96d9a store uncompressed index size instead of the compressed one in cf storage extension 2016-01-04 14:50:27 +01:00
fjy 57d91d754d Comment out buggy unit tests, fix #2185 2016-01-03 09:50:16 -08:00
fjy 89fc18bb55 increase timeouts for jdbc tearDown 2016-01-01 20:08:06 -08:00
fjy ca46f1d40c attempt to fix transient tests again 2015-12-30 21:39:28 -08:00
Bingkun Guo 492adeaaa7 Merge pull request #2172 from gianm/remove-kafka-seven
Remove unused kafka-seven extension.
2015-12-29 15:19:28 -06:00
Fangjin Yang b1261035a7 Merge pull request #1861 from guobingkun/insert_segment_tool
insert-segment tool
2015-12-29 10:06:07 -08:00
Gian Merlino 891d639188 Remove unused kafka-seven extension. 2015-12-29 12:05:27 -05:00
fjy 38b0f1fbc2 fix transient failures in unit tests 2015-12-28 20:03:30 -08:00
Fangjin Yang e490650865 Merge pull request #2110 from navis/fix-sporadic-testfail
Fix sporadic fail of URIExtractionNamespaceFunctionFactoryTest#testReverseFunction
2015-12-27 14:45:09 -08:00
Charles Allen 05c9e1b598 Reorder Before/After in JDBCExtractionNamespaceTest
* Fixes https://github.com/druid-io/druid/issues/2120
2015-12-22 11:39:46 -08:00
Bingkun Guo 89b477970f DataSegmentFinder tool
`insert-segment-to-db` is a tool that can insert segments into Druid metadata storage. It is intended to be used
to update the segment table in metadata storage after people manually migrate segments from one place to another.
It can also be used to insert missing segments into Druid, or even recover metadata storage by telling it where the
segments are stored.

Note: This tool expects users to have the Druid cluster running in a "safe" mode, where there are no active tasks to interfere with
the segments being inserted. Users can optionally bring down the cluster to make 100% sure nothing is interfering.
2015-12-21 00:02:04 -06:00
Fangjin Yang 1b46ea7b3d Merge pull request #2121 from metamx/jdbcExtractionNamespaceLocking
Add nicer locking and shorter timeouts to JDBCExtractionNamespaceTest
2015-12-18 19:02:36 -08:00
Fangjin Yang 14229ba0f2 Merge pull request #1922 from metamx/jsonIgnoresFinalFields
Change DefaultObjectMapper to NOT overwrite final fields unless explicitly asked to
2015-12-18 15:38:32 -08:00
Charles Allen 409eb0b7c6 Add nicer locking and shorter timeouts to JDBCExtractionNamespaceTest 2015-12-18 10:33:38 -08:00
navis.ryu 31b205afcd Fix sporadic fail of URIExtractionNamespaceFunctionFactoryTest#testReverseFunction 2015-12-18 14:37:00 +09:00
Slim Bouguerra ee1a39801a adding bulk lookup and reverse lookup 2015-12-10 08:29:41 -06:00
Fangjin Yang f4ba13a1ac Merge pull request #2029 from b-slim/add_reverse_fn
Adding reverse lookup function to LookupExtractor.
2015-12-09 12:50:13 -08:00
Slim Bouguerra 85f339b687 introduction and implem of reverse lookup function unApply. 2015-12-09 10:02:57 -06:00
Gian Merlino f6f7bec2b6 Update java-util. 2015-12-08 15:32:27 -08:00
Himanshu Gupta 62ba9ade37 unifying license header in all java files 2015-12-05 22:16:23 -06:00
Himanshu Gupta f99bad7988 reformat datasketches module to satisfy druid style guidelines 2015-11-19 01:07:03 -06:00
Himanshu Gupta fde9df2720 update to sketches-core-0.2.2.
adds support for "cardinality" aggregator.
do not create sketch per event at ingestion time to make realtime ingestion faster
2015-11-19 01:05:59 -06:00
Fangjin Yang 21c84b5ff7 Merge pull request #1896 from gianm/allocate-segment
SegmentAllocateAction (fixes #1515)
2015-11-18 21:05:46 -08:00
Xavier Léauté ba41f37ce1 fix #1701 - MySQL 5.7 defaults break database character set check 2015-11-17 15:51:58 -08:00
Fangjin Yang 148153b47c Merge pull request #1897 from himanshug/new_sketch_aggregation
complex aggregator based on http://datasketches.github.io
2015-11-12 09:01:01 -08:00
Himanshu Gupta 338f88b86b further simplifying the api, users just need to use thetaSketch as aggregator 2015-11-12 00:04:34 -06:00
Himanshu Gupta 88ae3c43f9 changing names to be explicit about theta sketch algorithm
old names are still valid though so as to be backwards compatible for now
2015-11-12 00:04:34 -06:00