druid

Commit Graph

Author	SHA1	Message	Date
Gian Merlino	2f275497b6	Fix caching of extension classloaders. (#3289 )	2016-07-26 15:19:15 -07:00
Gian Merlino	8030f1cb67	Be more respectful of maxRowsInMemory. (#3284 ) - Appenderator: Respect maxRowsInMemory across all sinks. - KafkaIndexTask: Respect maxRowsInMemory across all partitions.	2016-07-26 15:02:35 -06:00
Charles Allen	188a4bc89a	Revert "Optionally intern ServerInventoryView inventory objects. (#3238 )" (#3286 ) This reverts commit `a931debf79`. Fixes #3283 The core issue here is that realtime nodes announce their size as 0, so a coordinator which interns the realtime version of the data segment will not be able to see the new sized announcement when handoff occurs. This is caused by the `eauals` method on a `DataSegment` only evaluating the identifier. the `eauals` method should be correct for object equivalence, and things which need to check equivalence of some sub-portion of the object should do so explicitly.	2016-07-26 11:47:34 -07:00
kaijianding	3dc2974894	Add timestampSpec to metadata.drd and SegmentMetadataQuery (#3227 ) * save TimestampSpec in metadata.drd * add timestampSpec info in SegmentMetadataQuery	2016-07-25 15:45:30 -07:00
Gian Merlino	b316cde554	Appenderator tests for disjoint query intervals. (#3281 )	2016-07-23 19:48:15 -07:00
Charles Allen	c58bbfa0c6	Intern DataSegments in SQLMetadataSegmentManager (#3267 ) * Helps with heap pressure on coordinator	2016-07-21 16:46:08 -07:00
Parag Jain	fd798d32bc	fix testSecuredGetServer ut (#3262 )	2016-07-20 10:20:13 -07:00
Gian Merlino	06624c40c0	Share query handling between Appenderator and RealtimePlumber. (#3248 ) Fixes inconsistent metric handling between the two implementations. Formerly, RealtimePlumber only emitted query/segmentAndCache/time and query/wait and Appenderator only emitted query/partial/time and query/wait (all per sink). Now they both do the same thing: - query/segmentAndCache/time, query/segment/time are the time spent per sink. - query/cpu/time is the CPU time spent per query. - query/wait/time is the executor waiting time per sink. These generally match historical metrics, except segmentAndCache & segment mean the same thing here, because one Sink may be partially cached and partially uncached and we aren't splitting that out.	2016-07-19 22:15:13 -05:00
Himanshu	3f82108d15	optionally enable coordinator auto kill tasks on all dataSources via dynamic config (#3250 )	2016-07-17 18:47:52 -07:00
Gian Merlino	6cd1f5375b	Better harmonized dimensions for query metrics. (#3245 ) All query metrics now start with toolChest.makeMetricBuilder, and all of those now start with DruidMetrics.makePartialQueryTimeMetric. Also, "id" moved to common code, since all query metrics added it anyway. In particular this will add query-type specific dimensions like "threshold" and "numDimensions" to servlet-originated metrics like query/time.	2016-07-14 11:55:51 -07:00
Hyukjin Kwon	55e7a52475	Replace deprecated usage for StringInputRowParser and JSONParseSpec (#3215 )	2016-07-14 09:19:17 -07:00
Nishant	a1715c8cda	fix-3237 (#3244 ) DruidBroker use FilteredServerInventoryView instead of ServerInventoryView	2016-07-13 22:30:35 -07:00
Charles Allen	a931debf79	Optionally intern ServerInventoryView inventory objects. (#3238 )	2016-07-14 08:49:26 +05:30
Charles Allen	5d9fd0a713	Migrate IndexerSQLMetadataStorageCoordinator.getUnusedSegmentsForInterval to streaming (#3043 ) * Migrate IndexerSQLMetadataStorageCoordinator.getUnusedSegmentsForInterval to streaming * Missed query from #2859 * Make inReadOnlyTransaction part of SQLMetadataConnector	2016-07-06 16:55:27 -07:00
Himanshu	e1313e4b90	add log msg when event recvr firehose buffer is full (#3209 )	2016-07-01 17:35:30 -05:00
Xavier Léauté	485e381387	remove datasource from hadoop output path (#3196 ) fixes #2083, follow-up to #1702	2016-06-29 08:53:45 -07:00
Gian Merlino	4c9aeb7353	Revert "update druid console version (#3189 )" (#3203 ) This reverts commit `496b801bc3`.	2016-06-29 08:29:57 -07:00
Xavier Léauté	496b801bc3	update druid console version (#3189 )	2016-06-27 18:02:40 -07:00
Hyukjin Kwon	45f553fc28	Replace the deprecated usage of NoneShardSpec (#3166 )	2016-06-25 10:27:25 -07:00
Gian Merlino	4cc39b2ee7	Alternative groupBy strategy. (#2998 ) This patch introduces a GroupByStrategy concept and two strategies: "v1" is the current groupBy strategy and "v2" is a new one. It also introduces a merge buffers concept in DruidProcessingModule, to try to better manage memory used for merging. Both of these are described in more detail in #2987. There are two goals of this patch: 1. Make it possible for historical/realtime nodes to return larger groupBy result sets, faster, with better memory management. 2. Make it possible for brokers to merge streams when there are no order-by columns, avoiding materialization. This patch does not do anything to help with memory management on the broker when there are order-by columns or when there are nested queries. That could potentially be done in a future patch.	2016-06-24 18:06:09 -07:00
Dave Li	8a08398977	Add segment pruning based on secondary partition dimension (#2982 ) * add get dimension rangeset to filters * add get domain to ShardSpec and added chunk filter in caching clustered client * add null check and modified not filter, started with unit test * add filter test with caching * refactor and some comments * extract filtershard to helper function * fixup * minor changes * update javadoc	2016-06-24 14:52:19 -07:00
Charles Allen	15f833a861	Make extension classloader caching keyed on directory (#3165 ) * Make extension classloaders keyed by extension directory * Fixes #3163 * Add in same-directory-name unit test	2016-06-23 17:13:19 -07:00
michaelschiff	66d8ad36d7	adds new coordinator metrics 'segment/unavailable/count' and (#3176 ) 'segment/underReplicated/count' (#3173)	2016-06-23 14:53:15 -07:00
Gian Merlino	ebf890fe79	Update master version to 0.9.2-SNAPSHOT. (#3133 )	2016-06-13 13:10:38 -07:00
Nishant	0d427923c0	fix caching for search results (#3119 ) * fix caching for search results properly read count when reading from cache. * fix NPE during merging search count and add test * Update cache key to invalidate prev results	2016-06-09 17:49:47 -07:00
Himanshu	ab4209c82a	killDataSourceWhitelist in CoordinatorDynamicConfig accepts comma separated list of strings in addition to json array of strings so that coordinator console can do the updates correctly (#3095 )	2016-06-07 15:39:41 -07:00
Keuntae Park	e6b32c24ae	bug fix for getNewNodes() in ListenerDiscoverer (#3092 )	2016-06-07 16:32:42 +05:30
Gian Merlino	2db5f49f35	Fix JavaScriptConfig. (#3062 )	2016-06-02 23:59:00 -07:00
Charles Allen	bbc5509078	Limit number of jetty threads that can be used by lookups (#3068 )	2016-06-02 22:33:12 -07:00
Slim	545cdd63ab	ensure the cleaning of overshadowed unloaded segments (#3048 ) * ensure the cleaning of overshadowed unloaded segments * add testing plus comments	2016-06-02 09:03:58 -05:00
Gian Merlino	874a0a4bdd	MetadataResource: Fix handling of includeDisabled. (#3042 )	2016-06-01 11:56:37 -07:00
John Wang	e662efa79f	segment interface refactor for proposal 2965 (#2990 )	2016-05-26 20:36:41 -07:00
John Wang	a004f1e1c5	appenderator plumber work from gian's branch (#2913 )	2016-05-26 14:46:32 -07:00
Charles Allen	847501a939	Add better messages around LookupCoordinatorManager failures (#3027 ) * Add better messages around LookupCoordinatorManager failures * Catches #3026 * A few more little tests * Add more forceful shutdown	2016-05-26 14:32:35 -05:00
jaehong choi	e2653a8cf4	handle a NPE in LookupCoordinatorManager.start() (#3026 )	2016-05-26 09:55:33 -07:00
David Lim	3ef24c03b3	Validate X-Druid-Task-Id header in request/response and support retrying on outdated TaskLocation information, add KafkaIndexTaskClient unit tests (#3006 ) * validate X-Druid-Task-Id header in request and add header to response * modify KafkaIndexTaskClient to take a TaskLocationProvider as the TaskLocation may not remain constant	2016-05-25 22:05:18 -07:00
Nishant	0ac1b27d53	Allow manually setting of shutoffTime for EventReceiverFirehose (#2803 ) * Allow dynamically setting of shutoffTime for EventReceiverFirehose Allow dynamically setting shutoffTime for EventReceiverFirehose review comments and tests * shut down exec on close	2016-05-24 07:24:00 -07:00
Gian Merlino	970614875b	Fix race where results from an IncrementalIndexSegment could be cached. (#2983 )	2016-05-18 13:57:50 +05:30
Charles Allen	15ccf451f9	Move QueryGranularity static fields to QueryGranularities (#2980 ) * Move QueryGranularity static fields to QueryGranularityUtil * Fixes #2979 * Add test showing #2979 * change name to QueryGranularities	2016-05-17 16:23:48 -07:00
Charles Allen	eaaad01de7	[QTL] Datasource as lookupTier (#2955 ) * Datasource as lookup tier * Adds an option to let indexing service tasks pull their lookup tier from the datasource they are working for. * Fix bad docs for lookups lookupTier * Add Datasource name holder * Move task and datasource to be pulled from Task file * Make LookupModule pull from bound dataSource * Fix test * Fix code style on imports * Fix formatting * Make naming better * Address code comments about naming	2016-05-17 15:44:42 -07:00
Xavier Léauté	e79284da59	new interval based cost function (#2972 ) * new interval based cost function Addresses issues with balancing of segments in the existing cost function - `gapPenalty` led to clusters of segments ~30 days apart - `recencyPenalty` caused imbalance among recent segments - size-based cost could be skewed by compression New cost function is purely based on segment intervals: - assumes each time-slice of a partition is a constant cost - cost is additive, i.e. cost(A, B union C) = cost(A, B) + cost(A, C) - cost decays exponentially based on distance between time-slices * comments and formatting * add more comments to explain the calculation	2016-05-17 09:56:00 -07:00
Parag Jain	681ffdb417	try to make DruidCoordinatorTest deterministic (#2967 )	2016-05-13 14:43:28 -07:00
Nishant	a9b721a01b	Allow user to set cost balancer threads more than or equal to the number of cores. (#2964 ) * Allow user to set cost balancer threads more than the number of cores. Allow user to set cost balancer threads more than the number of cores. * modify test	2016-05-13 13:27:42 -05:00
Charles Allen	81cab8a7bb	Make lookups more idempotent on update requests. (#2954 ) * No longer fails if an update fails but it shouldn't have replaced it	2016-05-11 11:22:35 -07:00
Jonathan Wei	f2510cf125	Remove DataSchema equals() and hashcode()	2016-05-10 16:09:28 -07:00
Charles Allen	6332bd70f4	Add smile provider (#2951 )	2016-05-10 16:03:39 -07:00
Charles Allen	0c04650e69	Lookup Announcer eager starting (#2944 )	2016-05-10 12:21:47 +05:30
Charles Allen	454bb034f1	Nicer toString on ListneingAnnouncerConfig (#2936 ) * Helps with debugging	2016-05-10 12:21:06 +05:30
David Lim	2cfd337378	Merge pull request #2933 from dclim/SQLMetadataSupervisorManagerTest-fix add uuid to primary key for supervisor table	2016-05-09 10:41:32 -06:00
Nishant	a2dd57cf65	Optimize CostBalancerStrategy (#2910 ) * Optimize CostBalancerStrategy Ignore benchmark test in normal run fix test review comments fix compilation fix test * review comments * review comment	2016-05-05 08:29:08 -07:00
David Lim	b489f63698	Supervisor for KafkaIndexTask (#2656 ) * supervisor for kafka indexing tasks * cr changes	2016-05-04 23:13:13 -07:00
binlijin	841be5c61f	periodically emit metric segment/scan/pending (#2854 )	2016-05-02 22:38:13 -07:00
Himanshu	6c5bf91f9a	publish metrics numJettyConns to see how number of active jetty connections change over time (#2839 ) this can be compared with numer of active queries to see if requests are waiting in jetty queue	2016-05-02 14:08:25 -07:00
Charles Allen	6b957aa072	[QTL] Make URI Exctraction Namespace take more sane arguments (#2738 ) * Make URI Exctraction Namespace take more sane arguments * Fixes https://github.com/druid-io/druid/issues/2669 * Update docs * Rename error message * Undo overzealous deletion of docs * Explain caching mechanism a bit more in docs	2016-05-02 12:54:34 -07:00
David Lim	5f0a9ccc57	fix ClassCastException in FiniteAppenderatorDriver (#2896 )	2016-04-28 18:39:24 -07:00
Parag Jain	0d745ee120	Basic authorization support in Druid (#2424 ) - Introduce `AuthorizationInfo` interface, specific implementations of which would be provided by extensions - If the `druid.auth.enabled` is set to `true` then the `isAuthorized` method of `AuthorizationInfo` will be called to perform authorization checks - `AuthorizationInfo` object will be created in the servlet filters of specific extension and will be passed as a request attribute with attribute name as `AuthConfig.DRUID_AUTH_TOKEN` - As per the scope of this PR, all resources that needs to be secured are divided into 3 types - `DATASOURCE`, `CONFIG` and `STATE`. For any type of resource, possible actions are - `READ` or `WRITE` - Specific ResourceFilters are used to perform auth checks for all endpoints that corresponds to a specific resource type. This prevents duplication of logic and need to inject HttpServletRequest inside each endpoint. For example - `DatasourceResourceFilter` is used for endpoints where the datasource information is present after "datasources" segment in the request Path such as `/druid/coordinator/v1/datasources/`, `/druid/coordinator/v1/metadata/datasources/`, `/druid/v2/datasources/` - `RulesResourceFilter` is used where the datasource information is present after "rules" segment in the request Path such as `/druid/coordinator/v1/rules/` - `TaskResourceFilter` is used for endpoints is used where the datasource information is present after "task" segment in the request Path such as `druid/indexer/v1/task` - `ConfigResourceFilter` is used for endpoints like `/druid/coordinator/v1/config`, `/druid/indexer/v1/worker`, `/druid/worker/v1` etc - `StateResourceFilter` is used for endpoints like `/druid/broker/v1/loadstatus`, `/druid/coordinator/v1/leader`, `/druid/coordinator/v1/loadqueue`, `/druid/coordinator/v1/rules` etc - For endpoints where a list of resources is returned like `/druid/coordinator/v1/datasources`, `/druid/indexer/v1/completeTasks` etc. the list is filtered to return only the resources to which the requested user has access. In these cases, `HttpServletRequest` instance needs to be injected in the endpoint method. Note - JAX-RS specification provides an interface called `SecurityContext`. However, we did not use this but provided our own interface `AuthorizationInfo` mainly because it provides more flexibility. For example, `SecurityContext` has a method called `isUserInRole(String role)` which would be used for auth checks and if used then the mapping of what roles can access what resource needs to be modeled inside Druid either using some convention or some other means which is not very flexible as Druid has dynamic resources like datasources. Fixes #2355 with PR #2424	2016-04-28 16:50:28 -07:00
Nishant	c29cb7d711	add pending task based resource management strategy (#2086 )	2016-04-27 10:40:53 -07:00
Slim	984a518c9f	Merge pull request #2734 from b-slim/LookupIntrospection2 [QTL][Lookup] adding introspection endpoint	2016-04-21 12:15:57 -05:00
Gian Merlino	c74391e54c	JavaScript: Ability to disable. (#2853 ) Fixes #2852.	2016-04-21 09:43:15 -05:00
Xavier Léauté	5938d9085b	Stream segments from database (#2859 ) * Avoids fetching all segment records into heap by JDBC driver * Set connection to read-only to help database optimize queries * Update JDBC drivers (MySQL has fixes for streaming results)	2016-04-21 05:40:07 +08:00
Xavier Léauté	fc91120b54	Merge pull request #2857 from metamx/upgrade-zk upgrade zookeeper client dependency to 3.4.8	2016-04-20 10:36:07 +05:30
Nishant	dbf63f738f	Add ability to filter segments for specific dataSources on broker without creating tiers (#2848 ) * Add back FilteredServerView removed in `a32906c7fd` to reduce memory usage using watched tiers. * Add functionality to specify "druid.broker.segment.watchedDataSources"	2016-04-19 10:10:06 -07:00
Gian Merlino	08c784fbf6	KafkaIndexTask: Use a separate sequence per Kafka partition in order to make (#2844 ) segment creation deterministic. This means that each segment will contain data from just one Kafka partition. So, users will probably not want to have a super high number of Kafka partitions... Fixes #2703.	2016-04-18 22:29:52 -07:00
Jisoo Kim	7b65ca7889	refactor ClientQuerySegmentWalker (#2837 ) * refactor ClientQuerySegmentWalker * add header to FluentQueryRunnerBuilder * refactor QueryRunnerTestHelper	2016-04-18 14:00:47 -07:00
binlijin	c1e690288c	Improve some log (#2807 )	2016-04-15 09:34:26 -07:00
Nishant	632b21472b	fix test failure (#2818 ) formatting changes	2016-04-14 21:40:19 -07:00
Fangjin Yang	886ee4e30d	Merge pull request #2821 from metamx/review-comments-2784 handle review comments for PR 2784	2016-04-12 10:20:43 -07:00
Fangjin Yang	b486eff6b7	Merge pull request #2805 from metamx/query-time-start request log should reflect time the query was received	2016-04-12 09:44:42 -07:00
Nishant	deb6ecf919	handle review comments for PR 2784 https://github.com/druid-io/druid/pull/2784#discussion_r59062021	2016-04-12 21:52:00 +05:30
Himanshu Gupta	aa6a230c90	remove DruidSQL.g4, its failing with newer version of ANTLR, will bring it back and fix if needed later	2016-04-08 11:46:21 -05:00
Xavier Léauté	d4d1d615c1	request log should reflect time the query was received, as opposed to processed	2016-04-07 12:39:34 -07:00
Nishant	edd74f2b67	Allow Lite DataSegment Announcements separate config for each skipping dimensions, metrics and loadSpec Add test fix test comment Add docs	2016-04-07 18:24:12 +05:30
jon-wei	0e481d6f93	Allow filters to use extraction functions	2016-04-05 13:24:56 -07:00
Xavier Léauté	728da75224	remove unused code	2016-04-01 13:10:35 -07:00
Fangjin Yang	9cb197adec	Merge pull request #2722 from himanshug/fix_hadoop_jar_upload config to explicitly specify classpath for hadoop container during hadoop ingestion	2016-03-28 14:49:03 -07:00
Fangjin Yang	7a84c267f7	Merge pull request #2743 from metamx/fixlookuptest Fix LookupCoordinatorManager and Test for alerts	2016-03-28 14:41:37 -07:00
Parag Jain	89a8277ae2	Merge pull request #2712 from guobingkun/make_runnables_pluggable make Coordinator IndexingService helpers pluggable	2016-03-28 12:18:18 -05:00
Charles Allen	05151bc325	Fix LookupCoordinatorManagerTest for alerts * Also fixes bad alerting on missing nodes	2016-03-28 09:41:47 -07:00
Fangjin Yang	3c4691aa5a	Merge pull request #2741 from gianm/examples-wiki Downgrade geoip2, exclude com.google.http-client.	2016-03-25 23:08:38 -07:00
Xavier Léauté	01f3221a62	Merge pull request #2665 from jisookim0513/remove-druid-server-serialization remove serialization of DruidServer	2016-03-25 15:54:05 -07:00
Gian Merlino	977e867ad8	Downgrade geoip2, exclude com.google.http-client. Reverts "Update com.maxmind.geoip2 to 2.6.0" and exclude the google http client from com.maxmind.geoip2. This should satisfy the original need from #2646 (wanting to run Druid along with an upgraded com.google.http-client) while preventing Jackson conflicts pointed out in #2717. Fixes #2717. This reverts commit `21b7572533`.	2016-03-25 14:43:22 -07:00
jisookim	0d3c5a3b6c	remove serialization of Druid Server and add tests for ServersResource	2016-03-25 12:27:27 -07:00
Bingkun Guo	0872448ff0	make Coordinator IndexingService helpers pluggable Fixes #2682 IndexingService helpers are added according to the settings in runtime.properties. Rather than having all the config.isXXX checks there, it makes sense to have a pluggable approach for allowing the dynamic configuration to bring in implementations for helpers without having to have hard-coded sets of available helpers. Plus, it will also make it possible for extensions to plug helpers in. With https://github.com/druid-io/druid-api/pull/76, we could conditionally bind a helper to Coordinator's runlist. The condition is driven by the value set in the runtime.properties.	2016-03-25 11:48:54 -05:00
Himanshu Gupta	e78a469fb7	UTs for ExtensionsConfig	2016-03-25 10:51:28 -05:00
Himanshu Gupta	004b00bb96	config to explicitly specify classpath for hadoop container during hadoop ingestion	2016-03-25 10:51:28 -05:00
Gian Merlino	713062053c	Filters: Add filter.toFilter method, use that instead of the instanceof chain in Filters. I believe that the instanceof chain in Filters exists because in the past, Filter and DimFilter were in different packages (DimFilter was in druid-client and Filter was in druid-processing). And since druid-client didn't depend on druid-processing, DimFilter couldn't have a toFilter method. But now it can.	2016-03-23 17:03:49 -07:00
kilida	f25b2ed6f8	Duplicate statement in ReservoirSegmentSamplerTest.java	2016-03-22 22:14:36 -04:00
Fangjin Yang	826b371259	Merge pull request #2697 from guobingkun/remove_duplicate_version_converter remove duplicated DruidCoordinatorVersionConverter	2016-03-22 15:48:09 -07:00
Bingkun Guo	a6e9ff48ec	Merge pull request #2688 from pjain1/props_cli do not inject properties directly in module	2016-03-22 15:27:19 -05:00
Bingkun Guo	3778adf1f4	remove duplicated DruidCoordinatorVersionConverter	2016-03-22 14:45:52 -05:00
Parag Jain	7b93195dc6	do not inject properties directly in module	2016-03-22 14:30:10 -05:00
Himanshu	00d7021291	Merge pull request #2607 from jon-wei/dim_schema Support use of DimensionSchema class in DimensionsSpec	2016-03-22 11:53:46 -05:00
Himanshu	3220b109ad	Merge pull request #2570 from binlijin/single_dimension_partitioning Single dimension hash-based partitioning	2016-03-22 11:51:06 -05:00
binlijin	bce600f5d5	Single dimension hash-based partitioning	2016-03-22 13:15:33 +08:00
jon-wei	a59c9ee1b1	Support use of DimensionSchema class in DimensionsSpec	2016-03-21 13:12:04 -07:00
Xavier Léauté	25967d0ed8	fix servlet startup sequence, fixes #2681	2016-03-18 15:06:15 -07:00
Charles Allen	5da9a280b6	Query Time Lookup - Dynamic Configuration	2016-03-18 09:45:05 -07:00
Charles Allen	45c413af7e	Merge pull request #2674 from metamx/fix-broadcast-lockup separate HTTP client pool for cancellation requests	2016-03-17 15:23:42 -07:00
Xavier Léauté	1718a7224b	separate HTTP pool for cancellation requests * reduces contention between queries and cancellation requests * more aggressive timeouts for cancellation requests	2016-03-17 12:11:18 -07:00
Charles Allen	c716af5b04	Merge pull request #2678 from metamx/fixImports Fix some google related imports	2016-03-17 11:53:16 -07:00
Charles Allen	a52c6d3bee	Fix some google related imports	2016-03-17 11:03:29 -07:00
Gian Merlino	738dcd8cd9	Update version to 0.9.1-SNAPSHOT. Fixes #2462	2016-03-17 10:34:20 -07:00
Parag Jain	948b19a088	do not silently ingnore rows	2016-03-16 09:30:19 -05:00
Fangjin Yang	ec949d76e3	Merge pull request #2655 from navis/hint-coordinator-client Add hint message for missing `druid.selectors.coordinator.serviceName`	2016-03-14 20:57:40 -07:00
Jonathan Wei	5ec5ac92c6	Merge pull request #2382 from himanshug/broker_segment_tier_selection at broker, if configured, only add segments from specific tiers to the timeline	2016-03-14 16:53:06 -07:00
navis.ryu	83e1d5d7bf	Add hint message for missing `druid.selectors.coordinator.serviceName`	2016-03-15 08:39:07 +09:00
Fangjin Yang	06813b510a	Merge pull request #2571 from himanshug/gp_by_avoid_sort avoid sort while doing groupBy merging when possible	2016-03-14 14:46:51 -07:00
Nishant	773d6fe86c	Merge pull request #2646 from atomx/update-maxmind Update com.maxmind.geoip2 to 2.6.0	2016-03-14 11:20:48 -07:00
Himanshu	d51a0a0cf4	Merge pull request #2220 from gianm/appenderator-kafka Appenderators, DataSource metadata, KafkaIndexTask	2016-03-14 13:14:36 -05:00
rasahner	2861e854f0	Merge pull request #2540 from pjain1/remove_kill Remove extra parameter from deleteDataSourceSpecificInterval endpoint and correct exception message for invalid interval	2016-03-14 11:16:23 -05:00
Erik Dubbelboer	21b7572533	Update com.maxmind.geoip2 to 2.6.0 com.maxmind.geoip2 2.6.0 depends on com.google.http-client 1.15.0-rc (3 years old). When trying to include other libraries in Druid that require an up to date version of com.google.http-client this causes a problem.	2016-03-12 09:44:00 +00:00
Gian Merlino	187569e702	DataSource metadata. Geared towards supporting transactional inserts of new segments. This involves an interface "DataSourceMetadata" that allows combining of partially specified metadata (useful for partitioned ingestion). DataSource metadata is stored in a new "dataSource" table.	2016-03-10 17:41:50 -08:00
Gian Merlino	3d2214377d	Appenderatoring. Appenderators are a way of getting more control over the ingestion process than a Plumber allows. The idea is that existing Plumbers could be implemented using Appenderators, but you could also implement things that Plumbers can't do. FiniteAppenderatorDrivers help simplify indexing a finite stream of data. Also: - Sink: Ability to consider itself "finished" vs "still writable". - Sink: Ability to return the number of rows contained within the sink.	2016-03-10 17:41:50 -08:00
Gian Merlino	92c828f904	Make SegmentHandoffNotifier Closeable.	2016-03-10 16:50:37 -08:00
Gian Merlino	ad5ffdf483	Nix Committers.supplierOf; Suppliers.ofInstance is good enough.	2016-03-10 16:50:37 -08:00
Gian Merlino	8a11161b20	Plumbers: Move plumber.add out of try/catch for ParseException. The incremental indexes handle that now so it's not necessary. Also, add debug logging and more detailed exceptions to the incremental indexes for the case where there are parse exceptions during aggregation.	2016-03-10 16:39:26 -08:00
Himanshu Gupta	02dfd5cd80	update IncrementalIndex to support unsorted facts map that can be used in groupBy merging to improve performance	2016-03-10 16:11:48 -06:00
Gian Merlino	708bc674fa	Make specifying query context booleans more consistent. Before, some needed to be strings and some needed to be real booleans. Now they can all be either one.	2016-03-08 19:38:26 -08:00
Fangjin Yang	9c2420a1bc	Merge pull request #2599 from himanshug/datasource_isolation make coordinator db polling for list of segments more robust	2016-03-08 12:43:49 -08:00
Fangjin Yang	e7018f524f	Merge pull request #2598 from himanshug/handoff_timeout optional ability to configure handoff wait timeout on realtime tasks	2016-03-08 12:43:36 -08:00
Slim Bouguerra	c72438ead0	override metric name	2016-03-08 10:58:12 -06:00
Himanshu Gupta	1288784bde	in coordinator db polling for available segments, ignore corrupted entries in segments table so that coordinator continues to load new segments even if there are few corrupted segment entries	2016-03-07 15:13:10 -06:00
Himanshu Gupta	0402636598	configurable handoffConditionTimeout in realtime tasks for segment handoff wait	2016-03-05 10:14:54 -06:00
Fangjin Yang	d06c1c5c85	Merge pull request #2583 from guobingkun/fix_multiple_specs_2 update querySegmentSpec when passing query to getQueryRunner	2016-03-02 18:05:34 -08:00
David Lim	9e74772d6b	Merge pull request #2574 from gianm/allostuff Make first few allocatePendingSegment retries quiet.	2016-03-02 16:16:53 -07:00
Bingkun Guo	cfe2dbf1eb	Merge pull request #2580 from gianm/rtc-basePersist RealtimeTuningConfig: Use different default basePersistDirectory per instance.	2016-03-02 16:56:49 -06:00
Bingkun Guo	4a58462fc7	update querySegmentSpec when passing query to getQueryRunner After finding the FireChief for a specific partition, Druid will need to find the specific queryRunner for each segment being queried by passing the query to FireChief. Currently Druid is passing the original query that contains all the segments need to be queried, it's possible that fireChief.getQueryRunner(query) returns more than 1 queryRunner because query.getIntervals() is not specific to a single segment. In this patch, for each segment being queried, Druid will update the query with its corresponding SpecificSegmentSpec.	2016-03-02 16:44:56 -06:00
Gian Merlino	e65e6a49a5	RealtimeTuningConfig: Use different default basePersistDirectory per instance.	2016-03-02 13:57:53 -08:00
Gian Merlino	004028b887	Make first few allocatePendingSegment retries quiet. Some light retrying can happen during normal operation (SELECT -> INSERT races) and the ensuing log messages would be scary for users.	2016-03-02 13:40:29 -08:00
Fangjin Yang	612e327426	Merge pull request #2581 from gianm/fix-deadlock CliPeon: Fix deadlock on startup by eagerly creating ExecutorLifecycle, ChatHandlerResource.	2016-03-02 11:37:49 -08:00
Gian Merlino	7557eb2800	CliPeon: Fix deadlock on startup by eagerly creating ExecutorLifecycle, ChatHandlerResource. See stack traces here, from current master: https://gist.github.com/gianm/bd9a66c826995f97fc8f 1. The thread "qtp925672150-62" holds the lock on InternalInjectorCreator.class, used by Scopes.SINGLETON, and wants the lock on "handlers" in Lifecycle.addMaybeStartHandler called by DiscoveryModule.getServiceAnnouncer. 2. The main thread holds the lock on "handlers" in Lifecycle.addMaybeStartHandler, which it took because it's trying to add the ExecutorLifecycle to the lifecycle. main is trying to get the InternalInjectorCreator.class lock because it's running ExecutorLifecycle.start, which does some Jackson deserialization, and Jackson needs that lock in order to inject stuff into the Task it's deserializing. This patch eagerly instantiates ChatHandlerResource (which I believe is what's trying to create the ServiceAnnouncer in the qtp925672150-62 jetty thread) and the ExecutorLifecycle.	2016-03-02 10:53:42 -08:00
Gian Merlino	102fc92120	SQLMetadataConnector: Fix overzealous retries that were preventing EntryExistsException from making it out.	2016-03-01 17:20:33 -08:00
Fangjin Yang	9340cae985	Merge pull request #2457 from bjozet/docs/fixes Default value for maxRowsInMemory	2016-03-01 07:43:26 -08:00
Björn Zettergren	2462c82c0e	New defaults for maxRowsInMemory rowFlushBoundary To bring consistency to docs and source this commit changes the default values for maxRowsInMemory and rowFlushBoundary to 75000 after discussion in PR https://github.com/druid-io/druid/pull/2457. The previous default was 500000 and it's lower now on the grounds that it's better for a default to be somewhat less efficient, and work, than to reach for the stars and possibly result in "OutOfMemoryError: java heap space" errors.	2016-03-01 13:50:28 +01:00
Bingkun Guo	4edcb1b861	Refactor FireChief + UTs for RealtimeManagerTest Add tests that verify whether RealtimeManager is querying the correct FireChief for a specific partition make FireChief static and package private, add latches in the UT	2016-02-29 14:41:10 -06:00
Eric Tschetter	68631d89e9	Allow realtime nodes to have multiple shards of the same datasource	2016-02-29 12:30:25 -06:00
Parag Jain	6b3c96c63a	better exception for invalid interval	2016-02-26 10:02:38 -06:00
Fangjin Yang	29d29ba98d	Merge pull request #2263 from jon-wei/flex_dims3 Allow IncrementalIndex to store Long/Float dimensions	2016-02-25 17:23:02 -08:00
Gian Merlino	b331fb4a83	Fix parsing of druid.indexer.server.maxChatRequests.	2016-02-25 14:47:15 -08:00
Parag Jain	b82b487f20	remove extra kill parameter	2016-02-24 17:16:18 -06:00
jon-wei	c17ce02467	Allow IncrementalIndex to store Long/Float dimensions	2016-02-24 13:51:57 -08:00
Himanshu Gupta	a3b37e9225	In persistAndMerge, increase the scope of try-catch block so that any exception while persisting hydrants is caught and consequently that sink is abandoned or the task will forever wait for handoff to happen.	2016-02-23 22:22:33 -06:00
Nishant	6c9e1a28ad	Merge pull request #2519 from gianm/unparseable-handling Better handling of ParseExceptions.	2016-02-24 04:46:29 +05:30
Fangjin Yang	93540c0631	Merge pull request #2503 from gianm/jetty-qos Add druid.indexer.server.maxChatRequests for QoS; deprecate separate ports.	2016-02-23 10:35:53 -08:00
Gian Merlino	3534483433	Better handling of ParseExceptions. Two changes: - Allow IncrementalIndex to suppress ParseExceptions on "aggregate". - Add "reportParseExceptions" option to realtime tuning configs. By default this is "false". Behavior of the counters should now be: - processed: Number of rows indexed, including rows where some fields could be parsed and some could not. - thrownAway: Number of rows thrown away due to rejection policy. - unparseable: Number of rows thrown away due to being completely unparseable (no fields salvageable at all). If "reportParseExceptions" is true then "unparseable" will always be zero (because a parse error would cause an exception to be thrown). In addition, "processed" will only include fully parseable rows (because even partial parse failures will cause exceptions to be thrown). Fixes #2510.	2016-02-23 10:11:43 -08:00
Fangjin Yang	0c984f9e32	Merge pull request #2109 from himanshug/segments_in_delta_ingestion idempotent batch delta ingestion	2016-02-22 14:00:45 -08:00
Fangjin Yang	3bdd757024	Merge pull request #1773 from b-slim/log_details Adding downstream source when throwing QueryInterruptedException	2016-02-22 10:16:07 -08:00
Himanshu Gupta	21b0b8a07d	new coordinator endpoint to get list of used segment given a dataSource and list of intervals	2016-02-21 23:17:58 -06:00
Slim Bouguerra	77925cc061	adding downstream source of QueryInterruptedException	2016-02-20 13:05:14 -06:00
Gian Merlino	23c993c9e7	Add druid.indexer.server.maxChatRequests for QoS; deprecate separate ports. - Add druid.indexer.server.maxChatRequests, which sets up a QoSFilter on the main Jetty server. - Deprecate druid.indexer.runner.separateIngestionEndpoint - Deprecate druid.indexer.server.chathandler.*	2016-02-19 13:36:09 -08:00

1 2 3 4 5 ...

2845 Commits