Commit Graph

7192 Commits

Author SHA1 Message Date
Himanshu ea3281ad78 Merge pull request #2645 from atomx/gs-scheme
Add gs:// hdfs support
2016-03-14 22:15:42 -05:00
Gian Merlino a938f0853b Additional ports docs. 2016-03-14 19:11:18 -07:00
Robin e7a7ecd65d add test for batch indexing from hadoop 2016-03-14 20:02:30 -05:00
Charles Allen 2ac8a22173 Merge pull request #2579 from metamx/closerIsCloser
Make CloserRule use guava's Closer
2016-03-14 17:18:19 -07:00
Jonathan Wei 5ec5ac92c6 Merge pull request #2382 from himanshug/broker_segment_tier_selection
at broker, if configured, only add segments from specific tiers to the timeline
2016-03-14 16:53:06 -07:00
navis.ryu 83e1d5d7bf Add hint message for missing `druid.selectors.coordinator.serviceName` 2016-03-15 08:39:07 +09:00
Charles Allen a64979463f Make CloserRule use guava's Closer 2016-03-14 15:01:24 -07:00
Fangjin Yang 06813b510a Merge pull request #2571 from himanshug/gp_by_avoid_sort
avoid sort while doing groupBy merging when possible
2016-03-14 14:46:51 -07:00
Fangjin Yang a41a70d370 Merge pull request #2651 from gianm/ports-docs
Docs on default ports.
2016-03-14 14:15:52 -07:00
Charles Allen 02805a74a1 Merge pull request #2648 from chtefi/master
Ignore case when testing for table existence
2016-03-14 13:57:53 -07:00
Fangjin Yang dbdbacaa18 Merge pull request #2260 from navis/cardinality-for-searchquery
Support cardinality for search query
2016-03-14 13:24:40 -07:00
Gian Merlino e51277b96c Docs on default ports. 2016-03-14 11:25:21 -07:00
Nishant 773d6fe86c Merge pull request #2646 from atomx/update-maxmind
Update com.maxmind.geoip2 to 2.6.0
2016-03-14 11:20:48 -07:00
Nishant 9cceff2274 Use ImmutableWorkerInfo instead of ZKWorker
review comments

add test for equals and hashcode
2016-03-14 11:17:15 -07:00
Himanshu d51a0a0cf4 Merge pull request #2220 from gianm/appenderator-kafka
Appenderators, DataSource metadata, KafkaIndexTask
2016-03-14 13:14:36 -05:00
rasahner 2861e854f0 Merge pull request #2540 from pjain1/remove_kill
Remove extra parameter from deleteDataSourceSpecificInterval endpoint and correct exception message for invalid interval
2016-03-14 11:16:23 -05:00
Slim 8cc3582e70 Merge pull request #2644 from metamx/optimize-timeboundary
optimize timeboundary for min or max bound
2016-03-13 13:16:24 -05:00
Stéphane Derosiaux 416cb03687 Ignore case when testing for table existence 2016-03-13 11:17:30 +01:00
Erik Dubbelboer 21b7572533 Update com.maxmind.geoip2 to 2.6.0
The previous com.maxmind.geoip2 version depends on com.google.http-client 1.15.0-rc (3 years old).
This causes a problem when including other libraries in Druid that require an up-to-date version of com.google.http-client.
2016-03-12 09:44:00 +00:00
Erik Dubbelboer 375620cfb3 Add gs:// hdfs support
Used to access google cloud storage
2016-03-12 08:57:57 +00:00
navis.ryu be341bf4e3 Support cardinality for search query (Fix for #2260) 2016-03-12 09:51:01 +09:00
Nishant cf7f6da392 Merge pull request #2634 from gianm/stopGracefully-avoid-interrupt
ThreadPoolTaskRunner: Make graceful shutdown logs less scary.
2016-03-11 16:36:10 -08:00
Charles Allen a3f0048ea4 Merge pull request #2631 from gianm/plumbers-rpe
Better logging for ParseExceptions on index aggregation, and remove unnecessary exception handling.
2016-03-11 14:22:58 -08:00
Xavier Léauté 6f0d6ef0e9 optimize timeboundary for min or max bound 2016-03-11 14:11:47 -08:00
Fangjin Yang f381c6066e Merge pull request #2638 from gianm/attempt-fix-2637
WorkerTaskMonitor: stop() waits for mainLoop to exit.
2016-03-11 13:27:24 -08:00
Gian Merlino 79a95f7789 WorkerTaskMonitor: stop() waits for mainLoop to exit.
Fixes #2637.
2016-03-11 11:40:13 -08:00
Himanshu 61cd838d6b Merge pull request #2640 from guobingkun/fix_doc
fix broken link for Tasks
2016-03-11 12:55:04 -06:00
Bingkun Guo 96c981cd0a fix broken link for Tasks 2016-03-11 11:36:34 -06:00
Gian Merlino 05397a9b4f ThreadPoolTaskRunner: Make graceful shutdown logs less scary.
- It's okay to suppress InterruptedException during graceful shutdown, as
  tasks may use it to accelerate their own shutdown.
- It's okay to ignore return statuses during graceful shutdown (which may
  be FAILED!) because it actually doesn't matter what they are.
2016-03-11 07:49:29 -08:00
Fangjin Yang f4ab1c2e52 Merge pull request #2632 from gianm/examples-druid-provided
examples: Switch druid-server, druid-common to "provided".
2016-03-10 19:57:32 -08:00
Gian Merlino d63473e0d5 examples: Switch druid-server, druid-common to "provided". 2016-03-10 18:43:29 -08:00
Gian Merlino f22fb2c2cf KafkaIndexTask.
Reads a specific offset range from specific partitions, and can use dataSource metadata
transactions to guarantee exactly-once ingestion.

Each task has a finite lifecycle, so it is expected that some process will be supervising
existing tasks and creating new ones when needed.
2016-03-10 18:41:43 -08:00
Gian Merlino 187569e702 DataSource metadata.
Geared towards supporting transactional inserts of new segments. This involves an
interface "DataSourceMetadata" that allows combining of partially specified metadata
(useful for partitioned ingestion).

DataSource metadata is stored in a new "dataSource" table.
2016-03-10 17:41:50 -08:00
Gian Merlino 3d2214377d Appenderatoring.
Appenderators are a way of getting more control over the ingestion process
than a Plumber allows. The idea is that existing Plumbers could be implemented
using Appenderators, but you could also implement things that Plumbers can't do.

FiniteAppenderatorDrivers help simplify indexing a finite stream of data.

Also:
- Sink: Ability to consider itself "finished" vs "still writable".
- Sink: Ability to return the number of rows contained within the sink.
2016-03-10 17:41:50 -08:00
Gian Merlino 08284fea62 Publish test-jar for indexing-service. 2016-03-10 16:50:37 -08:00
Gian Merlino 92c828f904 Make SegmentHandoffNotifier Closeable. 2016-03-10 16:50:37 -08:00
Gian Merlino ad5ffdf483 Nix Committers.supplierOf; Suppliers.ofInstance is good enough. 2016-03-10 16:50:37 -08:00
Gian Merlino 8a11161b20 Plumbers: Move plumber.add out of try/catch for ParseException.
The incremental indexes handle that now so it's not necessary.

Also, add debug logging and more detailed exceptions to the incremental
indexes for the case where there are parse exceptions during aggregation.
2016-03-10 16:39:26 -08:00
Himanshu Gupta dc0214bddb while GroupBy merging use unsorted facts in IncrementalIndex wherever possible 2016-03-10 16:11:48 -06:00
Himanshu Gupta 02dfd5cd80 update IncrementalIndex to support unsorted facts map that can be used in groupBy merging to improve performance 2016-03-10 16:11:48 -06:00
Fangjin Yang 1e49092ce7 Merge pull request #2627 from himanshug/fix_datasource_inputformat_locations
fix regression - bug in DatasourceInputFormat best effort split location finder code
2016-03-10 13:46:04 -08:00
Xavier Léauté 90d7409e1a Merge pull request #2611 from himanshug/gp_by_max_limit
only allow lowering maxResults and maxIntermediateRows from groupBy query context
2016-03-10 13:44:13 -08:00
Himanshu 863aa66808 Merge pull request #2597 from metamx/forwardPortMMX23
Forward port - Fix dependency problems
2016-03-10 14:56:52 -06:00
Himanshu Gupta eab8a0b54d in DatasourceInputFormat code for determining segment block locations avoid the split calculation by helper TextInputFormat 2016-03-10 14:28:53 -06:00
Charles Allen 7b1bfbf704 Add documentation to modules about what should be excluded. 2016-03-10 10:18:33 -08:00
Charles Allen d299540efc Make HadoopTask load hadoop dependency classes LAST for local isolated classrunner 2016-03-10 10:18:23 -08:00
Nishant ba1185963b Fix a bunch of dependencies
* Eliminate exclusion groups from pull-deps
* Only consider dependency nodes in pull-deps if they are not in the following scopes
	* provided
	* test
	* system
* Fix a bunch of `<scope>provided</scope>` missing tags
* Better exclusions for a couple of problematic libs
2016-03-10 10:18:08 -08:00
Fangjin Yang cf3965c82e Merge pull request #2625 from gianm/clarify-parser-docs
Clarify parser docs.
2016-03-10 09:44:23 -08:00
Gian Merlino a2b1652787 Clarify parser docs.
- Clarify what parseSpecs are used for.
- Avro, Protobuf should use timeAndDims parseSpecs.
- Hadoop jobs should use hadoopyString string parsers.
2016-03-10 08:45:04 -08:00
Fangjin Yang 68cffe1d91 Merge pull request #2615 from gianm/timeseries-skipEmptyBuckets-cache
Fix caching of skipEmptyBuckets for TimeseriesQuery.
2016-03-09 18:45:59 -08:00