druid

Commit Graph

Author	SHA1	Message	Date
Gian Merlino	4cc39b2ee7	Alternative groupBy strategy. (#2998 ) This patch introduces a GroupByStrategy concept and two strategies: "v1" is the current groupBy strategy and "v2" is a new one. It also introduces a merge buffers concept in DruidProcessingModule, to try to better manage memory used for merging. Both of these are described in more detail in #2987. There are two goals of this patch: 1. Make it possible for historical/realtime nodes to return larger groupBy result sets, faster, with better memory management. 2. Make it possible for brokers to merge streams when there are no order-by columns, avoiding materialization. This patch does not do anything to help with memory management on the broker when there are order-by columns or when there are nested queries. That could potentially be done in a future patch.	2016-06-24 18:06:09 -07:00
Dave Li	8a08398977	Add segment pruning based on secondary partition dimension (#2982 ) * add get dimension rangeset to filters * add get domain to ShardSpec and added chunk filter in caching clustered client * add null check and modified not filter, started with unit test * add filter test with caching * refactor and some comments * extract filtershard to helper function * fixup * minor changes * update javadoc	2016-06-24 14:52:19 -07:00
Charles Allen	15f833a861	Make extension classloader caching keyed on directory (#3165 ) * Make extension classloaders keyed by extension directory * Fixes #3163 * Add in same-directory-name unit test	2016-06-23 17:13:19 -07:00
michaelschiff	66d8ad36d7	adds new coordinator metrics 'segment/unavailable/count' and 'segment/underReplicated/count' (#3173) (#3176 )	2016-06-23 14:53:15 -07:00
Gian Merlino	ebf890fe79	Update master version to 0.9.2-SNAPSHOT. (#3133 )	2016-06-13 13:10:38 -07:00
Nishant	0d427923c0	fix caching for search results (#3119 ) * fix caching for search results properly read count when reading from cache. * fix NPE during merging search count and add test * Update cache key to invalidate prev results	2016-06-09 17:49:47 -07:00
Himanshu	ab4209c82a	killDataSourceWhitelist in CoordinatorDynamicConfig accepts comma separated list of strings in addition to json array of strings so that coordinator console can do the updates correctly (#3095 )	2016-06-07 15:39:41 -07:00
Keuntae Park	e6b32c24ae	bug fix for getNewNodes() in ListenerDiscoverer (#3092 )	2016-06-07 16:32:42 +05:30
Gian Merlino	2db5f49f35	Fix JavaScriptConfig. (#3062 )	2016-06-02 23:59:00 -07:00
Charles Allen	bbc5509078	Limit number of jetty threads that can be used by lookups (#3068 )	2016-06-02 22:33:12 -07:00
Slim	545cdd63ab	ensure the cleaning of overshadowed unloaded segments (#3048 ) * ensure the cleaning of overshadowed unloaded segments * add testing plus comments	2016-06-02 09:03:58 -05:00
Gian Merlino	874a0a4bdd	MetadataResource: Fix handling of includeDisabled. (#3042 )	2016-06-01 11:56:37 -07:00
John Wang	e662efa79f	segment interface refactor for proposal 2965 (#2990 )	2016-05-26 20:36:41 -07:00
John Wang	a004f1e1c5	appenderator plumber work from gian's branch (#2913 )	2016-05-26 14:46:32 -07:00
Charles Allen	847501a939	Add better messages around LookupCoordinatorManager failures (#3027 ) * Add better messages around LookupCoordinatorManager failures * Catches #3026 * A few more little tests * Add more forceful shutdown	2016-05-26 14:32:35 -05:00
jaehong choi	e2653a8cf4	handle a NPE in LookupCoordinatorManager.start() (#3026 )	2016-05-26 09:55:33 -07:00
David Lim	3ef24c03b3	Validate X-Druid-Task-Id header in request/response and support retrying on outdated TaskLocation information, add KafkaIndexTaskClient unit tests (#3006 ) * validate X-Druid-Task-Id header in request and add header to response * modify KafkaIndexTaskClient to take a TaskLocationProvider as the TaskLocation may not remain constant	2016-05-25 22:05:18 -07:00
Nishant	0ac1b27d53	Allow manually setting of shutoffTime for EventReceiverFirehose (#2803 ) * Allow dynamically setting of shutoffTime for EventReceiverFirehose Allow dynamically setting shutoffTime for EventReceiverFirehose review comments and tests * shut down exec on close	2016-05-24 07:24:00 -07:00
Gian Merlino	970614875b	Fix race where results from an IncrementalIndexSegment could be cached. (#2983 )	2016-05-18 13:57:50 +05:30
Charles Allen	15ccf451f9	Move QueryGranularity static fields to QueryGranularities (#2980 ) * Move QueryGranularity static fields to QueryGranularityUtil * Fixes #2979 * Add test showing #2979 * change name to QueryGranularities	2016-05-17 16:23:48 -07:00
Charles Allen	eaaad01de7	[QTL] Datasource as lookupTier (#2955 ) * Datasource as lookup tier * Adds an option to let indexing service tasks pull their lookup tier from the datasource they are working for. * Fix bad docs for lookups lookupTier * Add Datasource name holder * Move task and datasource to be pulled from Task file * Make LookupModule pull from bound dataSource * Fix test * Fix code style on imports * Fix formatting * Make naming better * Address code comments about naming	2016-05-17 15:44:42 -07:00
Xavier Léauté	e79284da59	new interval based cost function (#2972 ) * new interval based cost function Addresses issues with balancing of segments in the existing cost function - `gapPenalty` led to clusters of segments ~30 days apart - `recencyPenalty` caused imbalance among recent segments - size-based cost could be skewed by compression New cost function is purely based on segment intervals: - assumes each time-slice of a partition is a constant cost - cost is additive, i.e. cost(A, B union C) = cost(A, B) + cost(A, C) - cost decays exponentially based on distance between time-slices * comments and formatting * add more comments to explain the calculation	2016-05-17 09:56:00 -07:00
Parag Jain	681ffdb417	try to make DruidCoordinatorTest deterministic (#2967 )	2016-05-13 14:43:28 -07:00
Nishant	a9b721a01b	Allow user to set cost balancer threads more than or equal to the number of cores. (#2964 ) * Allow user to set cost balancer threads more than the number of cores. Allow user to set cost balancer threads more than the number of cores. * modify test	2016-05-13 13:27:42 -05:00
Charles Allen	81cab8a7bb	Make lookups more idempotent on update requests. (#2954 ) * No longer fails if an update fails but it shouldn't have replaced it	2016-05-11 11:22:35 -07:00
Jonathan Wei	f2510cf125	Remove DataSchema equals() and hashcode()	2016-05-10 16:09:28 -07:00
Charles Allen	6332bd70f4	Add smile provider (#2951 )	2016-05-10 16:03:39 -07:00
Charles Allen	0c04650e69	Lookup Announcer eager starting (#2944 )	2016-05-10 12:21:47 +05:30
Charles Allen	454bb034f1	Nicer toString on ListeningAnnouncerConfig (#2936 ) * Helps with debugging	2016-05-10 12:21:06 +05:30
David Lim	2cfd337378	Merge pull request #2933 from dclim/SQLMetadataSupervisorManagerTest-fix add uuid to primary key for supervisor table	2016-05-09 10:41:32 -06:00
Nishant	a2dd57cf65	Optimize CostBalancerStrategy (#2910 ) * Optimize CostBalancerStrategy Ignore benchmark test in normal run fix test review comments fix compilation fix test * review comments * review comment	2016-05-05 08:29:08 -07:00
David Lim	b489f63698	Supervisor for KafkaIndexTask (#2656 ) * supervisor for kafka indexing tasks * cr changes	2016-05-04 23:13:13 -07:00
binlijin	841be5c61f	periodically emit metric segment/scan/pending (#2854 )	2016-05-02 22:38:13 -07:00
Himanshu	6c5bf91f9a	publish metrics numJettyConns to see how number of active jetty connections change over time (#2839 ) this can be compared with number of active queries to see if requests are waiting in jetty queue	2016-05-02 14:08:25 -07:00
Charles Allen	6b957aa072	[QTL] Make URI Extraction Namespace take more sane arguments (#2738 ) * Make URI Extraction Namespace take more sane arguments * Fixes https://github.com/druid-io/druid/issues/2669 * Update docs * Rename error message * Undo overzealous deletion of docs * Explain caching mechanism a bit more in docs	2016-05-02 12:54:34 -07:00
David Lim	5f0a9ccc57	fix ClassCastException in FiniteAppenderatorDriver (#2896 )	2016-04-28 18:39:24 -07:00
Parag Jain	0d745ee120	Basic authorization support in Druid (#2424 ) - Introduce `AuthorizationInfo` interface, specific implementations of which would be provided by extensions - If the `druid.auth.enabled` is set to `true` then the `isAuthorized` method of `AuthorizationInfo` will be called to perform authorization checks - `AuthorizationInfo` object will be created in the servlet filters of specific extension and will be passed as a request attribute with attribute name as `AuthConfig.DRUID_AUTH_TOKEN` - As per the scope of this PR, all resources that need to be secured are divided into 3 types - `DATASOURCE`, `CONFIG` and `STATE`. For any type of resource, possible actions are - `READ` or `WRITE` - Specific ResourceFilters are used to perform auth checks for all endpoints that correspond to a specific resource type. This prevents duplication of logic and need to inject HttpServletRequest inside each endpoint. For example - `DatasourceResourceFilter` is used for endpoints where the datasource information is present after "datasources" segment in the request Path such as `/druid/coordinator/v1/datasources/`, `/druid/coordinator/v1/metadata/datasources/`, `/druid/v2/datasources/` - `RulesResourceFilter` is used where the datasource information is present after "rules" segment in the request Path such as `/druid/coordinator/v1/rules/` - `TaskResourceFilter` is used for endpoints where the datasource information is present after "task" segment in the request Path such as `druid/indexer/v1/task` - `ConfigResourceFilter` is used for endpoints like `/druid/coordinator/v1/config`, `/druid/indexer/v1/worker`, `/druid/worker/v1` etc - `StateResourceFilter` is used for endpoints like `/druid/broker/v1/loadstatus`, `/druid/coordinator/v1/leader`, `/druid/coordinator/v1/loadqueue`, `/druid/coordinator/v1/rules` etc - For endpoints where a list of resources is returned like `/druid/coordinator/v1/datasources`, `/druid/indexer/v1/completeTasks` etc. the list is filtered to return only the resources to which the requested user has access. In these cases, `HttpServletRequest` instance needs to be injected in the endpoint method. Note - JAX-RS specification provides an interface called `SecurityContext`. However, we did not use this but provided our own interface `AuthorizationInfo` mainly because it provides more flexibility. For example, `SecurityContext` has a method called `isUserInRole(String role)` which would be used for auth checks and if used then the mapping of what roles can access what resource needs to be modeled inside Druid either using some convention or some other means which is not very flexible as Druid has dynamic resources like datasources. Fixes #2355 with PR #2424	2016-04-28 16:50:28 -07:00
Nishant	c29cb7d711	add pending task based resource management strategy (#2086 )	2016-04-27 10:40:53 -07:00
Slim	984a518c9f	Merge pull request #2734 from b-slim/LookupIntrospection2 [QTL][Lookup] adding introspection endpoint	2016-04-21 12:15:57 -05:00
Gian Merlino	c74391e54c	JavaScript: Ability to disable. (#2853 ) Fixes #2852.	2016-04-21 09:43:15 -05:00
Xavier Léauté	5938d9085b	Stream segments from database (#2859 ) * Avoids fetching all segment records into heap by JDBC driver * Set connection to read-only to help database optimize queries * Update JDBC drivers (MySQL has fixes for streaming results)	2016-04-21 05:40:07 +08:00
Xavier Léauté	fc91120b54	Merge pull request #2857 from metamx/upgrade-zk upgrade zookeeper client dependency to 3.4.8	2016-04-20 10:36:07 +05:30
Nishant	dbf63f738f	Add ability to filter segments for specific dataSources on broker without creating tiers (#2848 ) * Add back FilteredServerView removed in `a32906c7fd` to reduce memory usage using watched tiers. * Add functionality to specify "druid.broker.segment.watchedDataSources"	2016-04-19 10:10:06 -07:00
Gian Merlino	08c784fbf6	KafkaIndexTask: Use a separate sequence per Kafka partition in order to make segment creation deterministic. (#2844 ) This means that each segment will contain data from just one Kafka partition. So, users will probably not want to have a super high number of Kafka partitions... Fixes #2703.	2016-04-18 22:29:52 -07:00
Jisoo Kim	7b65ca7889	refactor ClientQuerySegmentWalker (#2837 ) * refactor ClientQuerySegmentWalker * add header to FluentQueryRunnerBuilder * refactor QueryRunnerTestHelper	2016-04-18 14:00:47 -07:00
binlijin	c1e690288c	Improve some log (#2807 )	2016-04-15 09:34:26 -07:00
Nishant	632b21472b	fix test failure (#2818 ) formatting changes	2016-04-14 21:40:19 -07:00
Fangjin Yang	886ee4e30d	Merge pull request #2821 from metamx/review-comments-2784 handle review comments for PR 2784	2016-04-12 10:20:43 -07:00
Fangjin Yang	b486eff6b7	Merge pull request #2805 from metamx/query-time-start request log should reflect time the query was received	2016-04-12 09:44:42 -07:00
Nishant	deb6ecf919	handle review comments for PR 2784 https://github.com/druid-io/druid/pull/2784#discussion_r59062021	2016-04-12 21:52:00 +05:30
Himanshu Gupta	aa6a230c90	remove DruidSQL.g4, it's failing with newer version of ANTLR, will bring it back and fix if needed later	2016-04-08 11:46:21 -05:00
Xavier Léauté	d4d1d615c1	request log should reflect time the query was received, as opposed to processed	2016-04-07 12:39:34 -07:00
Nishant	edd74f2b67	Allow Lite DataSegment Announcements separate config for each skipping dimensions, metrics and loadSpec Add test fix test comment Add docs	2016-04-07 18:24:12 +05:30
jon-wei	0e481d6f93	Allow filters to use extraction functions	2016-04-05 13:24:56 -07:00
Xavier Léauté	728da75224	remove unused code	2016-04-01 13:10:35 -07:00
Fangjin Yang	9cb197adec	Merge pull request #2722 from himanshug/fix_hadoop_jar_upload config to explicitly specify classpath for hadoop container during hadoop ingestion	2016-03-28 14:49:03 -07:00
Fangjin Yang	7a84c267f7	Merge pull request #2743 from metamx/fixlookuptest Fix LookupCoordinatorManager and Test for alerts	2016-03-28 14:41:37 -07:00
Parag Jain	89a8277ae2	Merge pull request #2712 from guobingkun/make_runnables_pluggable make Coordinator IndexingService helpers pluggable	2016-03-28 12:18:18 -05:00
Charles Allen	05151bc325	Fix LookupCoordinatorManagerTest for alerts * Also fixes bad alerting on missing nodes	2016-03-28 09:41:47 -07:00
Fangjin Yang	3c4691aa5a	Merge pull request #2741 from gianm/examples-wiki Downgrade geoip2, exclude com.google.http-client.	2016-03-25 23:08:38 -07:00
Xavier Léauté	01f3221a62	Merge pull request #2665 from jisookim0513/remove-druid-server-serialization remove serialization of DruidServer	2016-03-25 15:54:05 -07:00
Gian Merlino	977e867ad8	Downgrade geoip2, exclude com.google.http-client. Reverts "Update com.maxmind.geoip2 to 2.6.0" and exclude the google http client from com.maxmind.geoip2. This should satisfy the original need from #2646 (wanting to run Druid along with an upgraded com.google.http-client) while preventing Jackson conflicts pointed out in #2717. Fixes #2717. This reverts commit `21b7572533`.	2016-03-25 14:43:22 -07:00
jisookim	0d3c5a3b6c	remove serialization of Druid Server and add tests for ServersResource	2016-03-25 12:27:27 -07:00
Bingkun Guo	0872448ff0	make Coordinator IndexingService helpers pluggable Fixes #2682 IndexingService helpers are added according to the settings in runtime.properties. Rather than having all the config.isXXX checks there, it makes sense to have a pluggable approach for allowing the dynamic configuration to bring in implementations for helpers without having to have hard-coded sets of available helpers. Plus, it will also make it possible for extensions to plug helpers in. With https://github.com/druid-io/druid-api/pull/76, we could conditionally bind a helper to Coordinator's runlist. The condition is driven by the value set in the runtime.properties.	2016-03-25 11:48:54 -05:00
Himanshu Gupta	e78a469fb7	UTs for ExtensionsConfig	2016-03-25 10:51:28 -05:00
Himanshu Gupta	004b00bb96	config to explicitly specify classpath for hadoop container during hadoop ingestion	2016-03-25 10:51:28 -05:00
Gian Merlino	713062053c	Filters: Add filter.toFilter method, use that instead of the instanceof chain in Filters. I believe that the instanceof chain in Filters exists because in the past, Filter and DimFilter were in different packages (DimFilter was in druid-client and Filter was in druid-processing). And since druid-client didn't depend on druid-processing, DimFilter couldn't have a toFilter method. But now it can.	2016-03-23 17:03:49 -07:00
kilida	f25b2ed6f8	Duplicate statement in ReservoirSegmentSamplerTest.java	2016-03-22 22:14:36 -04:00
Fangjin Yang	826b371259	Merge pull request #2697 from guobingkun/remove_duplicate_version_converter remove duplicated DruidCoordinatorVersionConverter	2016-03-22 15:48:09 -07:00
Bingkun Guo	a6e9ff48ec	Merge pull request #2688 from pjain1/props_cli do not inject properties directly in module	2016-03-22 15:27:19 -05:00
Bingkun Guo	3778adf1f4	remove duplicated DruidCoordinatorVersionConverter	2016-03-22 14:45:52 -05:00
Parag Jain	7b93195dc6	do not inject properties directly in module	2016-03-22 14:30:10 -05:00
Himanshu	00d7021291	Merge pull request #2607 from jon-wei/dim_schema Support use of DimensionSchema class in DimensionsSpec	2016-03-22 11:53:46 -05:00
Himanshu	3220b109ad	Merge pull request #2570 from binlijin/single_dimension_partitioning Single dimension hash-based partitioning	2016-03-22 11:51:06 -05:00
binlijin	bce600f5d5	Single dimension hash-based partitioning	2016-03-22 13:15:33 +08:00
jon-wei	a59c9ee1b1	Support use of DimensionSchema class in DimensionsSpec	2016-03-21 13:12:04 -07:00
Xavier Léauté	25967d0ed8	fix servlet startup sequence, fixes #2681	2016-03-18 15:06:15 -07:00
Charles Allen	5da9a280b6	Query Time Lookup - Dynamic Configuration	2016-03-18 09:45:05 -07:00
Charles Allen	45c413af7e	Merge pull request #2674 from metamx/fix-broadcast-lockup separate HTTP client pool for cancellation requests	2016-03-17 15:23:42 -07:00
Xavier Léauté	1718a7224b	separate HTTP pool for cancellation requests * reduces contention between queries and cancellation requests * more aggressive timeouts for cancellation requests	2016-03-17 12:11:18 -07:00
Charles Allen	c716af5b04	Merge pull request #2678 from metamx/fixImports Fix some google related imports	2016-03-17 11:53:16 -07:00
Charles Allen	a52c6d3bee	Fix some google related imports	2016-03-17 11:03:29 -07:00
Gian Merlino	738dcd8cd9	Update version to 0.9.1-SNAPSHOT. Fixes #2462	2016-03-17 10:34:20 -07:00
Parag Jain	948b19a088	do not silently ignore rows	2016-03-16 09:30:19 -05:00
Fangjin Yang	ec949d76e3	Merge pull request #2655 from navis/hint-coordinator-client Add hint message for missing `druid.selectors.coordinator.serviceName`	2016-03-14 20:57:40 -07:00
Jonathan Wei	5ec5ac92c6	Merge pull request #2382 from himanshug/broker_segment_tier_selection at broker, if configured, only add segments from specific tiers to the timeline	2016-03-14 16:53:06 -07:00
navis.ryu	83e1d5d7bf	Add hint message for missing `druid.selectors.coordinator.serviceName`	2016-03-15 08:39:07 +09:00
Fangjin Yang	06813b510a	Merge pull request #2571 from himanshug/gp_by_avoid_sort avoid sort while doing groupBy merging when possible	2016-03-14 14:46:51 -07:00
Nishant	773d6fe86c	Merge pull request #2646 from atomx/update-maxmind Update com.maxmind.geoip2 to 2.6.0	2016-03-14 11:20:48 -07:00
Himanshu	d51a0a0cf4	Merge pull request #2220 from gianm/appenderator-kafka Appenderators, DataSource metadata, KafkaIndexTask	2016-03-14 13:14:36 -05:00
rasahner	2861e854f0	Merge pull request #2540 from pjain1/remove_kill Remove extra parameter from deleteDataSourceSpecificInterval endpoint and correct exception message for invalid interval	2016-03-14 11:16:23 -05:00
Erik Dubbelboer	21b7572533	Update com.maxmind.geoip2 to 2.6.0 com.maxmind.geoip2 2.6.0 depends on com.google.http-client 1.15.0-rc (3 years old). When trying to include other libraries in Druid that require an up to date version of com.google.http-client this causes a problem.	2016-03-12 09:44:00 +00:00
Gian Merlino	187569e702	DataSource metadata. Geared towards supporting transactional inserts of new segments. This involves an interface "DataSourceMetadata" that allows combining of partially specified metadata (useful for partitioned ingestion). DataSource metadata is stored in a new "dataSource" table.	2016-03-10 17:41:50 -08:00
Gian Merlino	3d2214377d	Appenderatoring. Appenderators are a way of getting more control over the ingestion process than a Plumber allows. The idea is that existing Plumbers could be implemented using Appenderators, but you could also implement things that Plumbers can't do. FiniteAppenderatorDrivers help simplify indexing a finite stream of data. Also: - Sink: Ability to consider itself "finished" vs "still writable". - Sink: Ability to return the number of rows contained within the sink.	2016-03-10 17:41:50 -08:00
Gian Merlino	92c828f904	Make SegmentHandoffNotifier Closeable.	2016-03-10 16:50:37 -08:00
Gian Merlino	ad5ffdf483	Nix Committers.supplierOf; Suppliers.ofInstance is good enough.	2016-03-10 16:50:37 -08:00
Gian Merlino	8a11161b20	Plumbers: Move plumber.add out of try/catch for ParseException. The incremental indexes handle that now so it's not necessary. Also, add debug logging and more detailed exceptions to the incremental indexes for the case where there are parse exceptions during aggregation.	2016-03-10 16:39:26 -08:00
Himanshu Gupta	02dfd5cd80	update IncrementalIndex to support unsorted facts map that can be used in groupBy merging to improve performance	2016-03-10 16:11:48 -06:00
Gian Merlino	708bc674fa	Make specifying query context booleans more consistent. Before, some needed to be strings and some needed to be real booleans. Now they can all be either one.	2016-03-08 19:38:26 -08:00
Fangjin Yang	9c2420a1bc	Merge pull request #2599 from himanshug/datasource_isolation make coordinator db polling for list of segments more robust	2016-03-08 12:43:49 -08:00
Fangjin Yang	e7018f524f	Merge pull request #2598 from himanshug/handoff_timeout optional ability to configure handoff wait timeout on realtime tasks	2016-03-08 12:43:36 -08:00
Slim Bouguerra	c72438ead0	override metric name	2016-03-08 10:58:12 -06:00
Himanshu Gupta	1288784bde	in coordinator db polling for available segments, ignore corrupted entries in segments table so that coordinator continues to load new segments even if there are few corrupted segment entries	2016-03-07 15:13:10 -06:00
Himanshu Gupta	0402636598	configurable handoffConditionTimeout in realtime tasks for segment handoff wait	2016-03-05 10:14:54 -06:00
Fangjin Yang	d06c1c5c85	Merge pull request #2583 from guobingkun/fix_multiple_specs_2 update querySegmentSpec when passing query to getQueryRunner	2016-03-02 18:05:34 -08:00
David Lim	9e74772d6b	Merge pull request #2574 from gianm/allostuff Make first few allocatePendingSegment retries quiet.	2016-03-02 16:16:53 -07:00
Bingkun Guo	cfe2dbf1eb	Merge pull request #2580 from gianm/rtc-basePersist RealtimeTuningConfig: Use different default basePersistDirectory per instance.	2016-03-02 16:56:49 -06:00
Bingkun Guo	4a58462fc7	update querySegmentSpec when passing query to getQueryRunner After finding the FireChief for a specific partition, Druid will need to find the specific queryRunner for each segment being queried by passing the query to FireChief. Currently Druid is passing the original query that contains all the segments need to be queried, it's possible that fireChief.getQueryRunner(query) returns more than 1 queryRunner because query.getIntervals() is not specific to a single segment. In this patch, for each segment being queried, Druid will update the query with its corresponding SpecificSegmentSpec.	2016-03-02 16:44:56 -06:00
Gian Merlino	e65e6a49a5	RealtimeTuningConfig: Use different default basePersistDirectory per instance.	2016-03-02 13:57:53 -08:00
Gian Merlino	004028b887	Make first few allocatePendingSegment retries quiet. Some light retrying can happen during normal operation (SELECT -> INSERT races) and the ensuing log messages would be scary for users.	2016-03-02 13:40:29 -08:00
Fangjin Yang	612e327426	Merge pull request #2581 from gianm/fix-deadlock CliPeon: Fix deadlock on startup by eagerly creating ExecutorLifecycle, ChatHandlerResource.	2016-03-02 11:37:49 -08:00
Gian Merlino	7557eb2800	CliPeon: Fix deadlock on startup by eagerly creating ExecutorLifecycle, ChatHandlerResource. See stack traces here, from current master: https://gist.github.com/gianm/bd9a66c826995f97fc8f 1. The thread "qtp925672150-62" holds the lock on InternalInjectorCreator.class, used by Scopes.SINGLETON, and wants the lock on "handlers" in Lifecycle.addMaybeStartHandler called by DiscoveryModule.getServiceAnnouncer. 2. The main thread holds the lock on "handlers" in Lifecycle.addMaybeStartHandler, which it took because it's trying to add the ExecutorLifecycle to the lifecycle. main is trying to get the InternalInjectorCreator.class lock because it's running ExecutorLifecycle.start, which does some Jackson deserialization, and Jackson needs that lock in order to inject stuff into the Task it's deserializing. This patch eagerly instantiates ChatHandlerResource (which I believe is what's trying to create the ServiceAnnouncer in the qtp925672150-62 jetty thread) and the ExecutorLifecycle.	2016-03-02 10:53:42 -08:00
Gian Merlino	102fc92120	SQLMetadataConnector: Fix overzealous retries that were preventing EntryExistsException from making it out.	2016-03-01 17:20:33 -08:00
Fangjin Yang	9340cae985	Merge pull request #2457 from bjozet/docs/fixes Default value for maxRowsInMemory	2016-03-01 07:43:26 -08:00
Björn Zettergren	2462c82c0e	New defaults for maxRowsInMemory rowFlushBoundary To bring consistency to docs and source this commit changes the default values for maxRowsInMemory and rowFlushBoundary to 75000 after discussion in PR https://github.com/druid-io/druid/pull/2457. The previous default was 500000 and it's lower now on the grounds that it's better for a default to be somewhat less efficient, and work, than to reach for the stars and possibly result in "OutOfMemoryError: java heap space" errors.	2016-03-01 13:50:28 +01:00
Bingkun Guo	4edcb1b861	Refactor FireChief + UTs for RealtimeManagerTest Add tests that verify whether RealtimeManager is querying the correct FireChief for a specific partition make FireChief static and package private, add latches in the UT	2016-02-29 14:41:10 -06:00
Eric Tschetter	68631d89e9	Allow realtime nodes to have multiple shards of the same datasource	2016-02-29 12:30:25 -06:00
Parag Jain	6b3c96c63a	better exception for invalid interval	2016-02-26 10:02:38 -06:00
Fangjin Yang	29d29ba98d	Merge pull request #2263 from jon-wei/flex_dims3 Allow IncrementalIndex to store Long/Float dimensions	2016-02-25 17:23:02 -08:00
Gian Merlino	b331fb4a83	Fix parsing of druid.indexer.server.maxChatRequests.	2016-02-25 14:47:15 -08:00
Parag Jain	b82b487f20	remove extra kill parameter	2016-02-24 17:16:18 -06:00
jon-wei	c17ce02467	Allow IncrementalIndex to store Long/Float dimensions	2016-02-24 13:51:57 -08:00
Himanshu Gupta	a3b37e9225	In persistAndMerge, increase the scope of try-catch block so that any exception while persisting hydrants is caught and consequently that sink is abandoned or the task will forever wait for handoff to happen.	2016-02-23 22:22:33 -06:00
Nishant	6c9e1a28ad	Merge pull request #2519 from gianm/unparseable-handling Better handling of ParseExceptions.	2016-02-24 04:46:29 +05:30
Fangjin Yang	93540c0631	Merge pull request #2503 from gianm/jetty-qos Add druid.indexer.server.maxChatRequests for QoS; deprecate separate ports.	2016-02-23 10:35:53 -08:00
Gian Merlino	3534483433	Better handling of ParseExceptions. Two changes: - Allow IncrementalIndex to suppress ParseExceptions on "aggregate". - Add "reportParseExceptions" option to realtime tuning configs. By default this is "false". Behavior of the counters should now be: - processed: Number of rows indexed, including rows where some fields could be parsed and some could not. - thrownAway: Number of rows thrown away due to rejection policy. - unparseable: Number of rows thrown away due to being completely unparseable (no fields salvageable at all). If "reportParseExceptions" is true then "unparseable" will always be zero (because a parse error would cause an exception to be thrown). In addition, "processed" will only include fully parseable rows (because even partial parse failures will cause exceptions to be thrown). Fixes #2510.	2016-02-23 10:11:43 -08:00
Fangjin Yang	0c984f9e32	Merge pull request #2109 from himanshug/segments_in_delta_ingestion idempotent batch delta ingestion	2016-02-22 14:00:45 -08:00
Fangjin Yang	3bdd757024	Merge pull request #1773 from b-slim/log_details Adding downstream source when throwing QueryInterruptedException	2016-02-22 10:16:07 -08:00
Himanshu Gupta	21b0b8a07d	new coordinator endpoint to get list of used segment given a dataSource and list of intervals	2016-02-21 23:17:58 -06:00
Slim Bouguerra	77925cc061	adding downstream source of QueryInterruptedException	2016-02-20 13:05:14 -06:00
Gian Merlino	23c993c9e7	Add druid.indexer.server.maxChatRequests for QoS; deprecate separate ports. - Add druid.indexer.server.maxChatRequests, which sets up a QoSFilter on the main Jetty server. - Deprecate druid.indexer.runner.separateIngestionEndpoint - Deprecate druid.indexer.server.chathandler.*	2016-02-19 13:36:09 -08:00
Gian Merlino	243ac5399b	Harmonize realtime indexing loop across the task and standalone nodes. - Both now catch ParseExceptions on plumber.add (see https://groups.google.com/d/topic/druid-user/wmiRDvx2RvM/discussion) - Standalone now treats IndexSizeExceededException as fatal (previously only the task did)	2016-02-19 07:34:15 -08:00
Gian Merlino	e0c049c0b0	Make startup properties logging optional. Off by default, but enabled in the example config files. See also #2452.	2016-02-12 14:12:16 -08:00
Fangjin Yang	1430bc2c88	Merge pull request #2276 from harshjain2/feature-2021 Fix for issue 2021.	2016-02-10 17:04:45 -08:00
Gian Merlino	fa92b77f5a	Harmonize znode writing code in RTR and Worker. - Throw most exceptions rather than suppressing them, which should help detect problems. Continue suppressing exceptions that make sense to suppress. - Handle payload length checks consistently, and improve error message. - Remove unused WorkerCuratorCoordinator.announceTaskAnnouncement method. - Max znode length should be int, not long. - Add tests.	2016-02-10 14:52:00 -08:00
Harsh Jain	a3eb863c8e	Fix for issue 2021	2016-02-10 22:19:12 +05:30
Himanshu Gupta	d1cb17d3f7	at broker - only add segments from specific tiers to the timeline	2016-02-09 22:33:22 -06:00
Himanshu Gupta	b40c342cd1	make Global stupid pool cache size configurable	2016-02-05 14:18:06 -06:00
Parag Jain	9002548eeb	increase test time out and general clean up	2016-02-03 13:26:37 -06:00
Charles Allen	5111fd52f2	Add check for log4j-core in Log4jShutterDownerModule	2016-02-02 15:56:48 -08:00
Himanshu	dc89cdd0f9	Merge pull request #2336 from himanshug/fix_2331 limit size of X-Druid-Response-Context header to 7K	2016-02-02 12:06:59 -06:00
navis.ryu	c03918f89a	AsyncQueryForwardingServletTest#testDeleteBroadcast sometimes fails by port conflict	2016-01-29 19:28:58 +09:00
Himanshu Gupta	f6b4dbd697	bug fix and unit tests for DruidCoordinatorSegmentKiller	2016-01-28 14:10:17 -06:00
Himanshu Gupta	ab3edfa8fc	moving DruidCoordinatorSegmentKiller class out of DruidCoordinator	2016-01-28 14:03:56 -06:00
Nishant	3880f54b87	Merge pull request #2332 from himanshug/configurable_partial make populateUncoveredIntervals a configuration in query context	2016-01-28 10:34:35 +05:30
Himanshu Gupta	a7bde8f4da	limit size of X-Druid-Response-Context header to 7K due to https://github.com/druid-io/druid/issues/2331	2016-01-27 15:18:08 -06:00
Xavier Léauté	5a3642bb93	Merge pull request #2247 from metamx/pedanticBuild Enable strict building in travis	2016-01-27 10:27:03 -08:00
Charles Allen	508734c8b0	Long constant reformatting in tests `l` --> `L`	2016-01-27 08:59:19 -08:00
Nishant	fd6bf3fe22	Use interval comparator instead of bucketMonthComparator fix when two segments have same interval review comments	2016-01-27 17:35:43 +05:30
Himanshu Gupta	3719b6e3c8	make populateUncoveredIntervals a configuration in query context	2016-01-26 15:13:45 -06:00


2826 Commits