druid

Commit Graph

Author	SHA1	Message	Date
Gian Merlino	970614875b	Fix race where results from an IncrementalIndexSegment could be cached. (#2983 )	2016-05-18 13:57:50 +05:30
Charles Allen	15ccf451f9	Move QueryGranularity static fields to QueryGranularities (#2980 ) * Move QueryGranularity static fields to QueryGranularityUtil * Fixes #2979 * Add test showing #2979 * change name to QueryGranularities	2016-05-17 16:23:48 -07:00
Charles Allen	eaaad01de7	[QTL] Datasource as lookupTier (#2955 ) * Datasource as lookup tier * Adds an option to let indexing service tasks pull their lookup tier from the datasource they are working for. * Fix bad docs for lookups lookupTier * Add Datasource name holder * Move task and datasource to be pulled from Task file * Make LookupModule pull from bound dataSource * Fix test * Fix code style on imports * Fix formatting * Make naming better * Address code comments about naming	2016-05-17 15:44:42 -07:00
Xavier Léauté	e79284da59	new interval based cost function (#2972 ) * new interval based cost function Addresses issues with balancing of segments in the existing cost function - `gapPenalty` led to clusters of segments ~30 days apart - `recencyPenalty` caused imbalance among recent segments - size-based cost could be skewed by compression New cost function is purely based on segment intervals: - assumes each time-slice of a partition is a constant cost - cost is additive, i.e. cost(A, B union C) = cost(A, B) + cost(A, C) - cost decays exponentially based on distance between time-slices * comments and formatting * add more comments to explain the calculation	2016-05-17 09:56:00 -07:00
Parag Jain	681ffdb417	try to make DruidCoordinatorTest deterministic (#2967 )	2016-05-13 14:43:28 -07:00
Nishant	a9b721a01b	Allow user to set cost balancer threads more than or equal to the number of cores. (#2964 ) * Allow user to set cost balancer threads more than the number of cores. Allow user to set cost balancer threads more than the number of cores. * modify test	2016-05-13 13:27:42 -05:00
Charles Allen	81cab8a7bb	Make lookups more idempotent on update requests. (#2954 ) * No longer fails if an update fails but it shouldn't have replaced it	2016-05-11 11:22:35 -07:00
Jonathan Wei	f2510cf125	Remove DataSchema equals() and hashcode()	2016-05-10 16:09:28 -07:00
Charles Allen	6332bd70f4	Add smile provider (#2951 )	2016-05-10 16:03:39 -07:00
Charles Allen	0c04650e69	Lookup Announcer eager starting (#2944 )	2016-05-10 12:21:47 +05:30
Charles Allen	454bb034f1	Nicer toString on ListneingAnnouncerConfig (#2936 ) * Helps with debugging	2016-05-10 12:21:06 +05:30
David Lim	2cfd337378	Merge pull request #2933 from dclim/SQLMetadataSupervisorManagerTest-fix add uuid to primary key for supervisor table	2016-05-09 10:41:32 -06:00
Nishant	a2dd57cf65	Optimize CostBalancerStrategy (#2910 ) * Optimize CostBalancerStrategy Ignore benchmark test in normal run fix test review comments fix compilation fix test * review comments * review comment	2016-05-05 08:29:08 -07:00
David Lim	b489f63698	Supervisor for KafkaIndexTask (#2656 ) * supervisor for kafka indexing tasks * cr changes	2016-05-04 23:13:13 -07:00
binlijin	841be5c61f	periodically emit metric segment/scan/pending (#2854 )	2016-05-02 22:38:13 -07:00
Himanshu	6c5bf91f9a	publish metrics numJettyConns to see how number of active jetty connections change over time (#2839 ) this can be compared with numer of active queries to see if requests are waiting in jetty queue	2016-05-02 14:08:25 -07:00
Charles Allen	6b957aa072	[QTL] Make URI Exctraction Namespace take more sane arguments (#2738 ) * Make URI Exctraction Namespace take more sane arguments * Fixes https://github.com/druid-io/druid/issues/2669 * Update docs * Rename error message * Undo overzealous deletion of docs * Explain caching mechanism a bit more in docs	2016-05-02 12:54:34 -07:00
David Lim	5f0a9ccc57	fix ClassCastException in FiniteAppenderatorDriver (#2896 )	2016-04-28 18:39:24 -07:00
Parag Jain	0d745ee120	Basic authorization support in Druid (#2424 ) - Introduce `AuthorizationInfo` interface, specific implementations of which would be provided by extensions - If the `druid.auth.enabled` is set to `true` then the `isAuthorized` method of `AuthorizationInfo` will be called to perform authorization checks - `AuthorizationInfo` object will be created in the servlet filters of specific extension and will be passed as a request attribute with attribute name as `AuthConfig.DRUID_AUTH_TOKEN` - As per the scope of this PR, all resources that needs to be secured are divided into 3 types - `DATASOURCE`, `CONFIG` and `STATE`. For any type of resource, possible actions are - `READ` or `WRITE` - Specific ResourceFilters are used to perform auth checks for all endpoints that corresponds to a specific resource type. This prevents duplication of logic and need to inject HttpServletRequest inside each endpoint. For example - `DatasourceResourceFilter` is used for endpoints where the datasource information is present after "datasources" segment in the request Path such as `/druid/coordinator/v1/datasources/`, `/druid/coordinator/v1/metadata/datasources/`, `/druid/v2/datasources/` - `RulesResourceFilter` is used where the datasource information is present after "rules" segment in the request Path such as `/druid/coordinator/v1/rules/` - `TaskResourceFilter` is used for endpoints is used where the datasource information is present after "task" segment in the request Path such as `druid/indexer/v1/task` - `ConfigResourceFilter` is used for endpoints like `/druid/coordinator/v1/config`, `/druid/indexer/v1/worker`, `/druid/worker/v1` etc - `StateResourceFilter` is used for endpoints like `/druid/broker/v1/loadstatus`, `/druid/coordinator/v1/leader`, `/druid/coordinator/v1/loadqueue`, `/druid/coordinator/v1/rules` etc - For endpoints where a list of resources is returned like `/druid/coordinator/v1/datasources`, `/druid/indexer/v1/completeTasks` etc. the list is filtered to return only the resources to which the requested user has access. In these cases, `HttpServletRequest` instance needs to be injected in the endpoint method. Note - JAX-RS specification provides an interface called `SecurityContext`. However, we did not use this but provided our own interface `AuthorizationInfo` mainly because it provides more flexibility. For example, `SecurityContext` has a method called `isUserInRole(String role)` which would be used for auth checks and if used then the mapping of what roles can access what resource needs to be modeled inside Druid either using some convention or some other means which is not very flexible as Druid has dynamic resources like datasources. Fixes #2355 with PR #2424	2016-04-28 16:50:28 -07:00
Nishant	c29cb7d711	add pending task based resource management strategy (#2086 )	2016-04-27 10:40:53 -07:00
Slim	984a518c9f	Merge pull request #2734 from b-slim/LookupIntrospection2 [QTL][Lookup] adding introspection endpoint	2016-04-21 12:15:57 -05:00
Gian Merlino	c74391e54c	JavaScript: Ability to disable. (#2853 ) Fixes #2852.	2016-04-21 09:43:15 -05:00
Xavier Léauté	5938d9085b	Stream segments from database (#2859 ) * Avoids fetching all segment records into heap by JDBC driver * Set connection to read-only to help database optimize queries * Update JDBC drivers (MySQL has fixes for streaming results)	2016-04-21 05:40:07 +08:00
Xavier Léauté	fc91120b54	Merge pull request #2857 from metamx/upgrade-zk upgrade zookeeper client dependency to 3.4.8	2016-04-20 10:36:07 +05:30
Nishant	dbf63f738f	Add ability to filter segments for specific dataSources on broker without creating tiers (#2848 ) * Add back FilteredServerView removed in `a32906c7fd` to reduce memory usage using watched tiers. * Add functionality to specify "druid.broker.segment.watchedDataSources"	2016-04-19 10:10:06 -07:00
Gian Merlino	08c784fbf6	KafkaIndexTask: Use a separate sequence per Kafka partition in order to make (#2844 ) segment creation deterministic. This means that each segment will contain data from just one Kafka partition. So, users will probably not want to have a super high number of Kafka partitions... Fixes #2703.	2016-04-18 22:29:52 -07:00
Jisoo Kim	7b65ca7889	refactor ClientQuerySegmentWalker (#2837 ) * refactor ClientQuerySegmentWalker * add header to FluentQueryRunnerBuilder * refactor QueryRunnerTestHelper	2016-04-18 14:00:47 -07:00
binlijin	c1e690288c	Improve some log (#2807 )	2016-04-15 09:34:26 -07:00
Nishant	632b21472b	fix test failure (#2818 ) formatting changes	2016-04-14 21:40:19 -07:00
Fangjin Yang	886ee4e30d	Merge pull request #2821 from metamx/review-comments-2784 handle review comments for PR 2784	2016-04-12 10:20:43 -07:00
Fangjin Yang	b486eff6b7	Merge pull request #2805 from metamx/query-time-start request log should reflect time the query was received	2016-04-12 09:44:42 -07:00
Nishant	deb6ecf919	handle review comments for PR 2784 https://github.com/druid-io/druid/pull/2784#discussion_r59062021	2016-04-12 21:52:00 +05:30
Himanshu Gupta	aa6a230c90	remove DruidSQL.g4, its failing with newer version of ANTLR, will bring it back and fix if needed later	2016-04-08 11:46:21 -05:00
Xavier Léauté	d4d1d615c1	request log should reflect time the query was received, as opposed to processed	2016-04-07 12:39:34 -07:00
Nishant	edd74f2b67	Allow Lite DataSegment Announcements separate config for each skipping dimensions, metrics and loadSpec Add test fix test comment Add docs	2016-04-07 18:24:12 +05:30
jon-wei	0e481d6f93	Allow filters to use extraction functions	2016-04-05 13:24:56 -07:00
Xavier Léauté	728da75224	remove unused code	2016-04-01 13:10:35 -07:00
Fangjin Yang	9cb197adec	Merge pull request #2722 from himanshug/fix_hadoop_jar_upload config to explicitly specify classpath for hadoop container during hadoop ingestion	2016-03-28 14:49:03 -07:00
Fangjin Yang	7a84c267f7	Merge pull request #2743 from metamx/fixlookuptest Fix LookupCoordinatorManager and Test for alerts	2016-03-28 14:41:37 -07:00
Parag Jain	89a8277ae2	Merge pull request #2712 from guobingkun/make_runnables_pluggable make Coordinator IndexingService helpers pluggable	2016-03-28 12:18:18 -05:00
Charles Allen	05151bc325	Fix LookupCoordinatorManagerTest for alerts * Also fixes bad alerting on missing nodes	2016-03-28 09:41:47 -07:00
Fangjin Yang	3c4691aa5a	Merge pull request #2741 from gianm/examples-wiki Downgrade geoip2, exclude com.google.http-client.	2016-03-25 23:08:38 -07:00
Xavier Léauté	01f3221a62	Merge pull request #2665 from jisookim0513/remove-druid-server-serialization remove serialization of DruidServer	2016-03-25 15:54:05 -07:00
Gian Merlino	977e867ad8	Downgrade geoip2, exclude com.google.http-client. Reverts "Update com.maxmind.geoip2 to 2.6.0" and exclude the google http client from com.maxmind.geoip2. This should satisfy the original need from #2646 (wanting to run Druid along with an upgraded com.google.http-client) while preventing Jackson conflicts pointed out in #2717. Fixes #2717. This reverts commit `21b7572533`.	2016-03-25 14:43:22 -07:00
jisookim	0d3c5a3b6c	remove serialization of Druid Server and add tests for ServersResource	2016-03-25 12:27:27 -07:00
Bingkun Guo	0872448ff0	make Coordinator IndexingService helpers pluggable Fixes #2682 IndexingService helpers are added according to the settings in runtime.properties. Rather than having all the config.isXXX checks there, it makes sense to have a pluggable approach for allowing the dynamic configuration to bring in implementations for helpers without having to have hard-coded sets of available helpers. Plus, it will also make it possible for extensions to plug helpers in. With https://github.com/druid-io/druid-api/pull/76, we could conditionally bind a helper to Coordinator's runlist. The condition is driven by the value set in the runtime.properties.	2016-03-25 11:48:54 -05:00
Himanshu Gupta	e78a469fb7	UTs for ExtensionsConfig	2016-03-25 10:51:28 -05:00
Himanshu Gupta	004b00bb96	config to explicitly specify classpath for hadoop container during hadoop ingestion	2016-03-25 10:51:28 -05:00
Gian Merlino	713062053c	Filters: Add filter.toFilter method, use that instead of the instanceof chain in Filters. I believe that the instanceof chain in Filters exists because in the past, Filter and DimFilter were in different packages (DimFilter was in druid-client and Filter was in druid-processing). And since druid-client didn't depend on druid-processing, DimFilter couldn't have a toFilter method. But now it can.	2016-03-23 17:03:49 -07:00
kilida	f25b2ed6f8	Duplicate statement in ReservoirSegmentSamplerTest.java	2016-03-22 22:14:36 -04:00
Fangjin Yang	826b371259	Merge pull request #2697 from guobingkun/remove_duplicate_version_converter remove duplicated DruidCoordinatorVersionConverter	2016-03-22 15:48:09 -07:00
Bingkun Guo	a6e9ff48ec	Merge pull request #2688 from pjain1/props_cli do not inject properties directly in module	2016-03-22 15:27:19 -05:00
Bingkun Guo	3778adf1f4	remove duplicated DruidCoordinatorVersionConverter	2016-03-22 14:45:52 -05:00
Parag Jain	7b93195dc6	do not inject properties directly in module	2016-03-22 14:30:10 -05:00
Himanshu	00d7021291	Merge pull request #2607 from jon-wei/dim_schema Support use of DimensionSchema class in DimensionsSpec	2016-03-22 11:53:46 -05:00
Himanshu	3220b109ad	Merge pull request #2570 from binlijin/single_dimension_partitioning Single dimension hash-based partitioning	2016-03-22 11:51:06 -05:00
binlijin	bce600f5d5	Single dimension hash-based partitioning	2016-03-22 13:15:33 +08:00
jon-wei	a59c9ee1b1	Support use of DimensionSchema class in DimensionsSpec	2016-03-21 13:12:04 -07:00
Xavier Léauté	25967d0ed8	fix servlet startup sequence, fixes #2681	2016-03-18 15:06:15 -07:00
Charles Allen	5da9a280b6	Query Time Lookup - Dynamic Configuration	2016-03-18 09:45:05 -07:00
Charles Allen	45c413af7e	Merge pull request #2674 from metamx/fix-broadcast-lockup separate HTTP client pool for cancellation requests	2016-03-17 15:23:42 -07:00
Xavier Léauté	1718a7224b	separate HTTP pool for cancellation requests * reduces contention between queries and cancellation requests * more aggressive timeouts for cancellation requests	2016-03-17 12:11:18 -07:00
Charles Allen	c716af5b04	Merge pull request #2678 from metamx/fixImports Fix some google related imports	2016-03-17 11:53:16 -07:00
Charles Allen	a52c6d3bee	Fix some google related imports	2016-03-17 11:03:29 -07:00
Gian Merlino	738dcd8cd9	Update version to 0.9.1-SNAPSHOT. Fixes #2462	2016-03-17 10:34:20 -07:00
Parag Jain	948b19a088	do not silently ingnore rows	2016-03-16 09:30:19 -05:00
Fangjin Yang	ec949d76e3	Merge pull request #2655 from navis/hint-coordinator-client Add hint message for missing `druid.selectors.coordinator.serviceName`	2016-03-14 20:57:40 -07:00
Jonathan Wei	5ec5ac92c6	Merge pull request #2382 from himanshug/broker_segment_tier_selection at broker, if configured, only add segments from specific tiers to the timeline	2016-03-14 16:53:06 -07:00
navis.ryu	83e1d5d7bf	Add hint message for missing `druid.selectors.coordinator.serviceName`	2016-03-15 08:39:07 +09:00
Fangjin Yang	06813b510a	Merge pull request #2571 from himanshug/gp_by_avoid_sort avoid sort while doing groupBy merging when possible	2016-03-14 14:46:51 -07:00
Nishant	773d6fe86c	Merge pull request #2646 from atomx/update-maxmind Update com.maxmind.geoip2 to 2.6.0	2016-03-14 11:20:48 -07:00
Himanshu	d51a0a0cf4	Merge pull request #2220 from gianm/appenderator-kafka Appenderators, DataSource metadata, KafkaIndexTask	2016-03-14 13:14:36 -05:00
rasahner	2861e854f0	Merge pull request #2540 from pjain1/remove_kill Remove extra parameter from deleteDataSourceSpecificInterval endpoint and correct exception message for invalid interval	2016-03-14 11:16:23 -05:00
Erik Dubbelboer	21b7572533	Update com.maxmind.geoip2 to 2.6.0 com.maxmind.geoip2 2.6.0 depends on com.google.http-client 1.15.0-rc (3 years old). When trying to include other libraries in Druid that require an up to date version of com.google.http-client this causes a problem.	2016-03-12 09:44:00 +00:00
Gian Merlino	187569e702	DataSource metadata. Geared towards supporting transactional inserts of new segments. This involves an interface "DataSourceMetadata" that allows combining of partially specified metadata (useful for partitioned ingestion). DataSource metadata is stored in a new "dataSource" table.	2016-03-10 17:41:50 -08:00
Gian Merlino	3d2214377d	Appenderatoring. Appenderators are a way of getting more control over the ingestion process than a Plumber allows. The idea is that existing Plumbers could be implemented using Appenderators, but you could also implement things that Plumbers can't do. FiniteAppenderatorDrivers help simplify indexing a finite stream of data. Also: - Sink: Ability to consider itself "finished" vs "still writable". - Sink: Ability to return the number of rows contained within the sink.	2016-03-10 17:41:50 -08:00
Gian Merlino	92c828f904	Make SegmentHandoffNotifier Closeable.	2016-03-10 16:50:37 -08:00
Gian Merlino	ad5ffdf483	Nix Committers.supplierOf; Suppliers.ofInstance is good enough.	2016-03-10 16:50:37 -08:00
Gian Merlino	8a11161b20	Plumbers: Move plumber.add out of try/catch for ParseException. The incremental indexes handle that now so it's not necessary. Also, add debug logging and more detailed exceptions to the incremental indexes for the case where there are parse exceptions during aggregation.	2016-03-10 16:39:26 -08:00
Himanshu Gupta	02dfd5cd80	update IncrementalIndex to support unsorted facts map that can be used in groupBy merging to improve performance	2016-03-10 16:11:48 -06:00
Gian Merlino	708bc674fa	Make specifying query context booleans more consistent. Before, some needed to be strings and some needed to be real booleans. Now they can all be either one.	2016-03-08 19:38:26 -08:00
Fangjin Yang	9c2420a1bc	Merge pull request #2599 from himanshug/datasource_isolation make coordinator db polling for list of segments more robust	2016-03-08 12:43:49 -08:00
Fangjin Yang	e7018f524f	Merge pull request #2598 from himanshug/handoff_timeout optional ability to configure handoff wait timeout on realtime tasks	2016-03-08 12:43:36 -08:00
Slim Bouguerra	c72438ead0	override metric name	2016-03-08 10:58:12 -06:00
Himanshu Gupta	1288784bde	in coordinator db polling for available segments, ignore corrupted entries in segments table so that coordinator continues to load new segments even if there are few corrupted segment entries	2016-03-07 15:13:10 -06:00
Himanshu Gupta	0402636598	configurable handoffConditionTimeout in realtime tasks for segment handoff wait	2016-03-05 10:14:54 -06:00
Fangjin Yang	d06c1c5c85	Merge pull request #2583 from guobingkun/fix_multiple_specs_2 update querySegmentSpec when passing query to getQueryRunner	2016-03-02 18:05:34 -08:00
David Lim	9e74772d6b	Merge pull request #2574 from gianm/allostuff Make first few allocatePendingSegment retries quiet.	2016-03-02 16:16:53 -07:00
Bingkun Guo	cfe2dbf1eb	Merge pull request #2580 from gianm/rtc-basePersist RealtimeTuningConfig: Use different default basePersistDirectory per instance.	2016-03-02 16:56:49 -06:00
Bingkun Guo	4a58462fc7	update querySegmentSpec when passing query to getQueryRunner After finding the FireChief for a specific partition, Druid will need to find the specific queryRunner for each segment being queried by passing the query to FireChief. Currently Druid is passing the original query that contains all the segments need to be queried, it's possible that fireChief.getQueryRunner(query) returns more than 1 queryRunner because query.getIntervals() is not specific to a single segment. In this patch, for each segment being queried, Druid will update the query with its corresponding SpecificSegmentSpec.	2016-03-02 16:44:56 -06:00
Gian Merlino	e65e6a49a5	RealtimeTuningConfig: Use different default basePersistDirectory per instance.	2016-03-02 13:57:53 -08:00
Gian Merlino	004028b887	Make first few allocatePendingSegment retries quiet. Some light retrying can happen during normal operation (SELECT -> INSERT races) and the ensuing log messages would be scary for users.	2016-03-02 13:40:29 -08:00
Fangjin Yang	612e327426	Merge pull request #2581 from gianm/fix-deadlock CliPeon: Fix deadlock on startup by eagerly creating ExecutorLifecycle, ChatHandlerResource.	2016-03-02 11:37:49 -08:00
Gian Merlino	7557eb2800	CliPeon: Fix deadlock on startup by eagerly creating ExecutorLifecycle, ChatHandlerResource. See stack traces here, from current master: https://gist.github.com/gianm/bd9a66c826995f97fc8f 1. The thread "qtp925672150-62" holds the lock on InternalInjectorCreator.class, used by Scopes.SINGLETON, and wants the lock on "handlers" in Lifecycle.addMaybeStartHandler called by DiscoveryModule.getServiceAnnouncer. 2. The main thread holds the lock on "handlers" in Lifecycle.addMaybeStartHandler, which it took because it's trying to add the ExecutorLifecycle to the lifecycle. main is trying to get the InternalInjectorCreator.class lock because it's running ExecutorLifecycle.start, which does some Jackson deserialization, and Jackson needs that lock in order to inject stuff into the Task it's deserializing. This patch eagerly instantiates ChatHandlerResource (which I believe is what's trying to create the ServiceAnnouncer in the qtp925672150-62 jetty thread) and the ExecutorLifecycle.	2016-03-02 10:53:42 -08:00
Gian Merlino	102fc92120	SQLMetadataConnector: Fix overzealous retries that were preventing EntryExistsException from making it out.	2016-03-01 17:20:33 -08:00
Fangjin Yang	9340cae985	Merge pull request #2457 from bjozet/docs/fixes Default value for maxRowsInMemory	2016-03-01 07:43:26 -08:00
Björn Zettergren	2462c82c0e	New defaults for maxRowsInMemory rowFlushBoundary To bring consistency to docs and source this commit changes the default values for maxRowsInMemory and rowFlushBoundary to 75000 after discussion in PR https://github.com/druid-io/druid/pull/2457. The previous default was 500000 and it's lower now on the grounds that it's better for a default to be somewhat less efficient, and work, than to reach for the stars and possibly result in "OutOfMemoryError: java heap space" errors.	2016-03-01 13:50:28 +01:00
Bingkun Guo	4edcb1b861	Refactor FireChief + UTs for RealtimeManagerTest Add tests that verify whether RealtimeManager is querying the correct FireChief for a specific partition make FireChief static and package private, add latches in the UT	2016-02-29 14:41:10 -06:00
Eric Tschetter	68631d89e9	Allow realtime nodes to have multiple shards of the same datasource	2016-02-29 12:30:25 -06:00
Parag Jain	6b3c96c63a	better exception for invalid interval	2016-02-26 10:02:38 -06:00

1 2 3 4 5 ...

2758 Commits