druid

Commit Graph

Author	SHA1	Message	Date
Kamal Gurala	dcb07d6958	Option to configure default analysis types in SegmentMetadataQuery (#4259 ) * Option to configure default analysis types * Updated Docs and renamed * Added serde tests and Null handling * Fixed Documentation * Updated implementation * Updated implementation * Updated implementation * Added usingDefaultIntervals in Builder * Updated implementation * Updated implementation and added failing test * filterSegments implementation updated * Updated imlementation * Padding * Add missing Override * Updated implementation * Fixed a naming bug * Fixed bug * Removed comment	2017-05-26 12:12:39 -07:00
zwang180	2c55a935f8	Delete a duplicate "Bucket Extraction Function" section at the bottom of "Querying"-"DimensionSpec" page (#4331 )	2017-05-25 14:16:00 -07:00
Jihoon Son	11b7b1bea6	Add support for HttpFirehose (#4297 ) * Add support for HttpFirehose * Fix document * Add documents	2017-05-25 16:13:04 -05:00
李成露(StefanLee)	22977780aa	Doc (#4217 ) * Fixed (#4216) Modify the default value of `druid.server.http.numThreads` to `Math.max(10, (Runtime.getRuntime().availableProcessors() * 17) / 16 + 2) + 30` * Fixed(#4216) Modify the default value of `druid.server.http.numThreads` to `max(10, (Number of cores * 17) / 16 + 2) + 30` * Fixed(#4216) Modify the default value of `druid.server.http.numThreads` to `max(10, (Number of cores * 17) / 16 + 2) + 30`	2017-05-23 17:04:52 +09:00
Gian Merlino	adeecc0e72	Add /isLeader call to overlord and coordinator. (#4282 ) This is useful for putting them behind load balancers or proxies, as it lets the load balancer know which server is currently active through an http health check. Also makes the method naming a little more consistent between coordinator and overlord code.	2017-05-18 20:46:13 -05:00
Jihoon Son	733dfc9b30	Add PrefetchableTextFilesFirehoseFactory for cloud storage types (#4193 ) * Add PrefetcheableTextFilesFirehoseFactory * fix comment * exception handling * Fix wrong json property * Remove ReplayableFirehoseFactory and fix misspelling * Defer object initialization * Add a temporaryDirectory parameter to FirehoseFactory.connect() * fix when cache and fetch are disabled * Address comments * Add more test * Increase timeout for test * Add wrapObjectStream * Move methods to Firehose from PrefetchableFirehoseFactory * Cleanup comment * add directory listing to s3 firehose * Rename a variable * Addressing comments * Update document * Support disabling prefetch * Fix race condition * Add fetchLock * Remove ReplayableFirehoseFactoryTest * Fix compilation error * Fix test failure * Address comments * Add default implementation for new method	2017-05-18 15:37:18 +09:00
David Lim	8333043b7b	add skipOffsetGaps flag (#4256 )	2017-05-16 12:19:28 -06:00
Himanshu	136b2fae72	improve query timeout handling and limit max scatter-gather bytes (#4229 ) * improve query timeout handling and limit max scatter-gather bytes * address review comments	2017-05-16 12:47:32 -05:00
Jihoon Son	50a4ec2b0b	Add support for headers and skipping thereof for CSV and TSV (#4254 ) * initial commit * small fixes * fix bug * fix bug * address code review * more cr * more cr * more cr * fix * Skip head rows for CSV and TSV * Move checking skipHeadRows to FileIteratingFirehose * Remove checking null iterators * Remove unused imports * Address comments * Fix compilation error * Address comments * Add more tests * Add a comment to ReplayableFirehose * Addressing comments * Add docs and fix typos	2017-05-15 22:57:31 -07:00
Himanshu	462f6482df	optionally add extensions to explicitly specified hadoopContainerClassPath (#4230 ) * optionally add extensions to explicitly specified hadoopContainerClassPath * note extensions always pushed in hadoop container when druid.extensions.hadoopContainerDruidClasspath is not provided explicitly	2017-05-08 14:24:14 -05:00
Himanshu	417714d228	additional lookup status discovery http endpoints at coordinator (#4228 ) * additional lookup status discovery http endpoints at coordinator * more changes * jsonize the error msgs as well * fix tests	2017-05-04 11:15:30 -07:00
Parag Jain	4502c207af	fix injection bug and documentation (#4243 )	2017-05-03 15:07:43 -05:00
hzy001	0c464f4a84	Fix docs (#4225 ) * Fix one typo Signed-off-by: Hao Ziyu <haoziyu@qiyi.com> * Fix deprecated links Signed-off-by: Hao Ziyu <haoziyu@qiyi.com>	2017-05-01 09:55:43 -07:00
Jihoon Son	7411b18df9	Add BroadcastDistributionRule (#4077 ) * Add BroadcastDistributionRule * Add missing null check * Rename variable 'colocateDataSource' to 'colocatedDatasource' * Address comments * Document for broadcast rules * Drop segments which are not co-located anymore * Remove duplicated segment loading and dropping * Add caveat * address comments	2017-05-01 09:55:17 -07:00
Himanshu	5a5a2749cd	improvements to coordinator lookups management (#3855 ) * coordinator lookups mgmt improvements * revert replaces removal, deprecate it instead * convert and use older specs stored in db * more tests and updates * review comments * add behavior for 0.10.0 to 0.9.2 downgrade * incorporating more review comments * remove explicit lock and use LifecycleLock in LookupReferencesManager. use LifecycleLock in LookupCoordinatorManager as well * wip on LookupCoordinatorManager * lifecycle lock * refactor thread creation into utility method * more review comments addressed * support smooth roll back of lookup snapshots from 0.10.0 to 0.9.2 * correctly use LifecycleLock in LookupCoordinatorManager and remove synchronization from start/stop * run lookup mgmt on leader coordinator only * wip: changes to do multiple start() and stop() on LookupCoordinatorManager * lifecycleLock fix usage in LookupReferencesManagerTest * add LifecycleLock back * fix license hdr * some fixes * make LookupReferencesManager.getAllLookupsState() consistent while still being lockless * address review comments * addressing leventov's comments * address charle's comments * add IOE.java * for safety in LookupReferencesManager mainThread check for lifecycle started state on each loop in addition to interrupt * move thread creation utility method to Execs * fix names * add tests for LookupCoordinatorManager.lookupManagementLoop() * add further tests for figuring out toBeLoaded and toBeDropped on LookupCoordinatorManager * address leventov comments * remove LookupsStateWithMap and parameterize LookupsState * address review comments * address more review comments * misc fixes	2017-04-28 08:41:38 -05:00
Gian Merlino	631068b099	Fix broken DataSketches link. (#4221 ) * Fix broken DataSketches link. * Better fixed link.	2017-04-27 17:37:12 -07:00
Himanshu	40057570f3	doc update on overlord console url when coordinator is acting as overlord (#4213 )	2017-04-26 15:03:54 -07:00
asrayousuf	e4fbc2bc5b	Updating the description of useCache (#4200 ) Updating the description of useCache Updating query-context doc based on Gian's comment Updating query-context doc based on Gian's comment Updating query-context doc based on Gian's comment Updating query-context doc based on Gian's comment	2017-04-25 10:26:15 -07:00
satishbhor	d51097c809	Fix lz4 library incompatibility in kafka-indexing-service extension (#4115 ) * Fix lz4 library incompatibility in kafka-indexing-service extension #3266 * Bumped Kafka version to 0.10.2.0 for : Fix lz4 library incompatibility in kafka-indexing-service extension #3266 * Replaced Lists.newArrayList() with Collections.singletonList() For Fix lz4 library incompatibility in kafka-indexing-service extension #4115	2017-04-25 12:23:51 +09:00
Jihoon Son	5b69f2eff2	Make timeout behavior consistent to document (#4134 ) * Make timeout behavior consistent to document * Refactoring BlockingPool and add more methods to QueryContexts * remove unused imports * Addressed comments * Address comments * remove unused method * Make default query timeout configurable * Fix test failure * Change timeout from period to millis	2017-04-19 09:47:53 +09:00
Gian Merlino	b2954d5fea	Better groupBy error messages and docs around resource limits. (#4162 ) * Better groupBy error messages and docs around resource limits. * Fix BufferGrouper test from datasketches. * Further clarify.	2017-04-13 10:38:53 -07:00
Xiuming Chen	7e4e5510e0	Outdated property names (#4146 ) Outdated property names?	2017-04-05 16:37:38 -07:00
Dongkyu Hwangbo	0d2e91ed50	Adding Kafka-emitter (#3860 ) * Initial commit * Apply another config: clustername * Rename variable * Fix bug * Add retry logic * Edit retry logic * Upgrade kafka-clients version to the most recent release * Make callback single object * Write documentation * Rewrite error message and emit logic * Handling AlertEvent * Override toString() * make clusterName more optional * bump up druid version * add producer.config option which make user can apply another optional config value of kafka producer * remove potential blocking in emit() * using MemoryBoundLinkedBlockingQueue * Fixing coding convention * Remove logging every exception and just increment counting * refactoring * trivial modification * logging when callback has exception * Replace kafka-clients 0.10.1.1 with 0.10.2.0 * Resolve the problem related of classloader * adopt try statement * code reformatting * make variables final * rewrite toString	2017-04-04 14:07:43 -07:00
JackyWoo	a0f2cf05d5	Add EqualDistributionWithAffinityWorkerSelectStrategy which balance w… (#3998 ) * Add EqualDistributionWithAffinityWorkerSelectStrategy which balance work load within affinity workers. * add docs to equalDistributionWithAffinity	2017-03-25 19:15:49 -07:00
Gian Merlino	dd6c0ab509	Add SQL REGEXP_EXTRACT function; add "index" to "regex" extractionFn. (#4055 ) * Add SQL REGEXP_EXTRACT function; add "index" to "regex" extractionFn. * Fix tests.	2017-03-24 17:38:36 -07:00
Himanshu	de081c711b	RealtimeIndexTask to support alertTimeout in context (#4089 ) * RealtimeIndexTask to support alertTimeout in context and raise alert if task process exists after the timeout * move alertTimeout config to tuningConfig and document	2017-03-24 12:48:12 -07:00
Gian Merlino	b4289c0004	Remove "granularity" from IngestSegmentFirehose. (#4110 ) It wasn't doing anything useful (the sequences were being concatted, and cursor.getTime() wasn't being called) and it defaulted to Granularities.NONE. Changing it to Granularities.ALL gave me a 700x+ performance boost on a small dataset I was reindexing (2m27s to 365ms). Most of that was from avoiding making a lot of unnecessary column selectors.	2017-03-24 10:28:54 -07:00
Erik Dubbelboer	2cbc4764f8	Comparing dimensions to each other in a filter (#3928 ) Comparing dimensions to each other using a select filter	2017-03-23 18:23:46 -07:00
Gian Merlino	db15d494ca	Update docs for query filter HavingSpecs. (#4063 )	2017-03-15 13:59:09 -04:00
hzy001	c4f44c0590	Update the docs (#4059 ) Signed-off-by: Hao Ziyu <haoziyu@qiyi.com>	2017-03-15 10:32:29 -04:00
Gian Merlino	3216134f8c	SQL: Make row extractions extensible and add one for lookups. (#3991 ) This is a reopening of #3989, since that PR was merged to master prematurely and accidentally.	2017-03-13 21:56:16 -07:00
Gian Merlino	cab2e2f5d5	Add docs about filtering and indexes on numeric columns. (#4035 )	2017-03-10 12:48:59 -08:00
Gian Merlino	960769c583	SQL: Fix example INFORMATION_SCHEMA query. (#4017 )	2017-03-06 16:07:47 -08:00
Gian Merlino	4ca5270e88	Ignore chunkPeriod for groupBy v2, fix chunkPeriod for irregular periods. (#4004 ) * Ignore chunkPeriod for groupBy v2, fix chunkPeriod for irregular periods. Includes two fixes: - groupBy v2 now ignores chunkPeriod, since it wouldn't have helped anyway (its mergeResults returns a lazy sequence) and it generates incorrect results. - Fix chunkPeriod handling for periods of irregular length, like "P1M" or "P1Y". Also includes doc and test fixes: - groupBy v1 was no longer being tested by GroupByQueryRunnerTest since #3953, now it is once again. - chunkPeriod documentation was misleading due to its checkered past. Updated it to be more accurate. * Remove unused import. * Restore buffer size.	2017-03-06 12:27:02 -06:00
kaijianding	19ac1c7c2c	Add SameIntervalMergeTask for easier usage of MergeTask (#3981 ) * Add SameIntervalMergeTask for easier usage of MergeTask * fix a bug and add ut * remove same_interval_merge_sub from Task.java and remove other no needed code	2017-03-06 11:21:32 -06:00
Gian Merlino	337f3870d8	Fix TimeFormatExtractionFn getCacheKey when tz, locale are not provided. (#4007 ) * Fix TimeFormatExtractionFn getCacheKey when tz, locale are not provided. * Remove unused import. * Use defaults in cache key.	2017-03-04 17:41:59 -08:00
Gian Merlino	af5a4cce3c	SQL: Clarify approximate distinct count behavior. (#4000 )	2017-03-03 13:42:30 -08:00
Himanshu	e7e3c2dc5a	support singleThreaded flag for groupBy-v2 as well (#3992 )	2017-03-03 23:43:06 +05:30
Gian Merlino	4a56d7d8a0	SQL: Ability to generate exact distinct count queries. (#3999 )	2017-03-03 23:40:36 +05:30
Gian Merlino	3e8dbd59f8	Fix groupBy docs to reflect that 'v2' is default. (#3993 )	2017-03-02 15:13:39 -08:00
Gian Merlino	e63eefd7ff	Revert "SQL: Make row extractions extensible and add one for lookups. (#3989 )" The PR was merged to master accidentally. This reverts commit `23927a3c96`.	2017-03-01 17:06:12 -08:00
Jonathan Wei	5fb1638534	Add default configuration for select query 'fromNext' parameter (#3986 ) * Add default configuration for select query 'fromNext' parameter * PR comments * Fix PagingSpec config injection * Injection fix for test	2017-03-01 17:05:35 -08:00
Gian Merlino	23927a3c96	SQL: Make row extractions extensible and add one for lookups. (#3989 ) * SQL: Make row extractions extensible and add one for lookups. * Fix QuantileSqlAggregatorTest.	2017-03-01 17:03:43 -08:00
Aseem Bansal	b8ba237f78	Update toc.md (#3704 )	2017-03-01 14:33:39 -08:00
Fokko Driesprong	add17fa7db	Remove the metadataUpdateSpec from specfile (#3973 ) Get rid of the metadataUpdateSpec section in the json example to ingest parquet into druid. When this element is present, it will fail start an indexing job.	2017-03-01 14:24:36 -08:00
Akash Dwivedi	94da5e80f9	Namespace optimization for hdfs data segments. (#3877 ) * NN optimization for hdfs data segments. * HdfsDataSegmentKiller, HdfsDataSegment finder changes to use new storage format.Docs update. * Common utility function in DataSegmentPusherUtil. * new static method `makeSegmentOutputPathUptoVersionForHdfs` in JobHelper * reuse getHdfsStorageDirUptoVersion in DataSegmentPusherUtil.getHdfsStorageDir() * Addressed comments. * Review comments. * HdfsDataSegmentKiller requested changes. * extra newline * Add maprfs.	2017-03-01 09:51:20 -08:00
Jonathan Wei	a08660a9ca	Support ingestion of long/float dimensions (#3966 ) * Support ingestion for long/float dimensions * Allow non-arrays for key components in indexing type strategy interfaces * Add numeric index merge test, fixes * Docs for numeric dims at ingestion * Remove unused import * Adjust docs, add aggregate on numeric dims tests * remove unused imports * Throw exception for bitmap method on numerics * Move typed selector creation to DimensionIndexer interface * unused imports * Fix * Remove unused DimensionSpec from indexer methods, check for dims first in inc index storage adapter * Remove spaces	2017-02-28 19:04:41 -08:00
kaijianding	ef6a19c81b	buildV9Directly in MergeTask and AppendTask (#3976 ) * buildV9Directly in MergeTask and AppendTask * add doc	2017-02-28 10:04:32 -08:00
praveev	c3bf40108d	One granularity (#3850 ) * Refactor Segment Granularity * Beginning of one granularity * Copy the fix for custom periods in segment-grunalrity over here. * Remove the custom serialization for now. * Compilation cleanup * Reformat code * Fixing unit tests * Unify to use a single iterable * Backward compatibility for rolling upgrade * Minor check style. Cosmetic changes. * Rename length and millis to duration * CR feedback * Minor changes.	2017-02-25 01:02:29 -06:00
Aseem Bansal	1098ba7a7f	Update toc.md (#3703 )	2017-02-23 09:39:06 -08:00


1348 Commits