druid

Commit Graph

Author	SHA1	Message	Date
Jonathan Wei	a42ccb6d19	Support filtering on long columns (including __time) (#3180) * Support filtering on __time column * Rename DruidPredicate * Add docs for ValueMatcherFactory, add comment on getColumnCapabilities * Combine ValueMatcherFactory predicate methods to accept DruidCompositePredicate * Address PR comments (support filter on all long columns) * Use predicate factory instead of composite predicate * Address PR comments * Lazily initialize long handling in selector/in filter * Move long value parsing from InFilter to InDimFilter, make long value parsing thread-safe * Add multithreaded selector/in filter test * Fix non-final lock object in SelectorDimFilter	2016-07-20 17:08:49 -07:00
Navis Ryu	cd7337fc8a	Calculate max split size based on numMapTask in DatasourceInputFormat (#2882) * Calculate max split size based on numMapTask * updated docs & fixed possible ArithmeticException	2016-07-20 16:53:51 -07:00
Parag Jain	fd798d32bc	fix testSecuredGetServer ut (#3262)	2016-07-20 10:20:13 -07:00
Gian Merlino	06624c40c0	Share query handling between Appenderator and RealtimePlumber. (#3248) Fixes inconsistent metric handling between the two implementations. Formerly, RealtimePlumber only emitted query/segmentAndCache/time and query/wait and Appenderator only emitted query/partial/time and query/wait (all per sink). Now they both do the same thing: - query/segmentAndCache/time, query/segment/time are the time spent per sink. - query/cpu/time is the CPU time spent per query. - query/wait/time is the executor waiting time per sink. These generally match historical metrics, except segmentAndCache & segment mean the same thing here, because one Sink may be partially cached and partially uncached and we aren't splitting that out.	2016-07-19 22:15:13 -05:00
Gian Merlino	50db86cb17	Quickstart: Use hadoopyString for batch indexing instead of string. (#3263)	2016-07-19 10:18:10 -07:00
Nishant	47894c4eff	add comment for default hadoop coordinates (#3257) 1) Modify CliHadoopIndexer to share constant from `TaskConfig.DEFAULT_DEFAULT_HADOOP_COORDINATES` 2) add comment to pom.xml as discussed in https://github.com/druid-io/druid/pull/3044 fix name	2016-07-18 15:23:11 -07:00
Emanuele Cesena	a9a73c5f71	Distribution: pull-deps compiled hadoop version (#3044)	2016-07-18 09:39:15 -07:00
Gian Merlino	13d8d96bc6	Update to guice-4.1.0. (#3222)	2016-07-18 08:08:43 -07:00
Gian Merlino	dd4ec751d0	Update docs for working with Hadoop dependencies. (#3252) - Attempt to make things clearer in general - Point out that HDFS deep storage and MR jobs don't use the same loading mechanism - Recommend using mapreduce.job.classloader = true when possible	2016-07-18 07:47:58 -05:00
Himanshu	3f82108d15	optionally enable coordinator auto kill tasks on all dataSources via dynamic config (#3250)	2016-07-17 18:47:52 -07:00
Nishant	7995818220	Increase test timeout to prevent failing on slow machines (#3224) constantly timing out on one of slow build machines, increasing the timeout fixed it. Running io.druid.granularity.QueryGranularityTest Tests run: 33, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 11.776 sec - in io.druid.granularity.QueryGranularityTest	2016-07-17 18:44:48 -07:00
Gian Merlino	90f5d8cd17	Fix path in cluster.md. (#3253)	2016-07-17 08:30:20 -07:00
Gian Merlino	6cd1f5375b	Better harmonized dimensions for query metrics. (#3245) All query metrics now start with toolChest.makeMetricBuilder, and all of those now start with DruidMetrics.makePartialQueryTimeMetric. Also, "id" moved to common code, since all query metrics added it anyway. In particular this will add query-type specific dimensions like "threshold" and "numDimensions" to servlet-originated metrics like query/time.	2016-07-14 11:55:51 -07:00
Hyukjin Kwon	55e7a52475	Replace deprecated usage for StringInputRowParser and JSONParseSpec (#3215)	2016-07-14 09:19:17 -07:00
Nishant	a1715c8cda	fix-3237 (#3244) DruidBroker use FilteredServerInventoryView instead of ServerInventoryView	2016-07-13 22:30:35 -07:00
Gian Merlino	6a03a0cfec	Fix ingest/persist/backPressure docs. (#3243)	2016-07-13 21:56:28 -07:00
Gian Merlino	c622a25236	BenchmarkDataGenerator: Don't generate timestamps at the end instant of the interval. (#3242) Because timestamps at the end instant are not actually part of the interval. This affected benchmark numbers, since it meant some data points would not be queried (the interval for the query was based on getDataInterval) and also the TimestampCheckingOffsets could not use the allWithinThreshold optimization.	2016-07-14 10:20:10 +05:30
Charles Allen	a931debf79	Optionally intern ServerInventoryView inventory objects. (#3238)	2016-07-14 08:49:26 +05:30
Gian Merlino	3ab4a4efbc	Fix formatting in granularities doc. (#3229)	2016-07-08 09:29:58 -07:00
Gian Merlino	ea03906fcf	Configurable compressRunOnSerialization for Roaring bitmaps. (#3228) Defaults to true, which is a change in behavior (this used to be false and unconfigurable).	2016-07-08 10:24:19 +05:30
Charles Allen	5d9fd0a713	Migrate IndexerSQLMetadataStorageCoordinator.getUnusedSegmentsForInterval to streaming (#3043) * Migrate IndexerSQLMetadataStorageCoordinator.getUnusedSegmentsForInterval to streaming * Missed query from #2859 * Make inReadOnlyTransaction part of SQLMetadataConnector	2016-07-06 16:55:27 -07:00
Charles Allen	3f1681c16c	Caffeine cache extension (#3028) * Initial commit of caffeine cache * Address code comments * Move and fixup README.md a bit * Improve caffeine readme information * Cleanup caffeine pom * Address review comments * Bump caffeine to 2.3.1 * Bump druid version to 0.9.2-SNAPSHOT * Make test not fail randomly. See https://github.com/ben-manes/caffeine/pull/93#issuecomment-227617998 for an explanation * Fix distribution and documentation * Add caffeine to extensions.md * Fix links in extensions.md * Lexicographic	2016-07-06 15:42:54 -07:00
Gian Merlino	b8a4f4ea7b	DumpSegment: Add --dump bitmaps option. (#3221) Also make --dump metadata respect --column.	2016-07-06 12:42:50 -07:00
Gian Merlino	fdc7e88a7d	Allow queries with no aggregators. (#3216) This is actually reasonable for a groupBy or lexicographic topN that is being used to do a "COUNT DISTINCT" kind of query. No aggregators are needed for that query, and including a dummy aggregator wastes 8 bytes per row. It's kind of silly for timeseries, but why not.	2016-07-06 20:38:54 +05:30
Charles Allen	bfa5c05aaa	Make global lookup cache introspector class public (#3199) * Make global lookup cache introspector class public * Fixes #3187 * Make KafkaLookupExtractorIntrospectionHandler a public static class	2016-07-01 15:50:57 -07:00
Himanshu	e1313e4b90	add log msg when event recvr firehose buffer is full (#3209)	2016-07-01 17:35:30 -05:00
Fangjin Yang	8eeae2e844	remove bad docs on setting up clusters (#3188)	2016-07-01 15:41:40 -05:00
Parag Jain	99844dfeb5	remove need for tmp extensions dir (#3211) correct lib path relative to base distribution dir	2016-07-01 12:55:57 -07:00
Bingkun Guo	d2636d1a64	[pull-deps] If --clean flag is not set, skip creating root extension directories if they already exist. (#3130)	2016-07-01 11:18:57 -05:00
Charles Allen	8b7d9750ee	Update extension docs for global lookup module (#3206)	2016-06-29 12:51:52 -07:00
Xavier Léauté	485e381387	remove datasource from hadoop output path (#3196) fixes #2083, follow-up to #1702	2016-06-29 08:53:45 -07:00
Gian Merlino	4c9aeb7353	Revert "update druid console version (#3189)" (#3203) This reverts commit `496b801bc3`.	2016-06-29 08:29:57 -07:00
Jonathan Wei	f3a3662133	Fix compile error in SearchBinaryFnTest (#3201)	2016-06-29 09:44:45 -05:00
David Lim	b24425a280	update docs with new behavior (#3200)	2016-06-28 16:17:04 -07:00
jaehong choi	efbcbf5315	Support alphanumeric sort in search query (#2593) * support alphanumeric sort in search query * address a comment about handling equals() and hashCode() * address comments * add Ut for string comparators * address a comment about space indentations.	2016-06-28 15:06:18 -07:00
David Lim	1d40df4bb7	fix kafka consumer concurrent access during shutdown (#3193)	2016-06-28 13:23:17 -07:00
Xavier Léauté	496b801bc3	update druid console version (#3189)	2016-06-27 18:02:40 -07:00
du00cs	bf53490d70	fix: no split file will throw IndexOutOfBounds Exception (#3179)	2016-06-26 12:50:18 -07:00
Hyukjin Kwon	45f553fc28	Replace the deprecated usage of NoneShardSpec (#3166)	2016-06-25 10:27:25 -07:00
Gian Merlino	4cc39b2ee7	Alternative groupBy strategy. (#2998) This patch introduces a GroupByStrategy concept and two strategies: "v1" is the current groupBy strategy and "v2" is a new one. It also introduces a merge buffers concept in DruidProcessingModule, to try to better manage memory used for merging. Both of these are described in more detail in #2987. There are two goals of this patch: 1. Make it possible for historical/realtime nodes to return larger groupBy result sets, faster, with better memory management. 2. Make it possible for brokers to merge streams when there are no order-by columns, avoiding materialization. This patch does not do anything to help with memory management on the broker when there are order-by columns or when there are nested queries. That could potentially be done in a future patch.	2016-06-24 18:06:09 -07:00
Nishant	0aa7d71ca5	Add doc link to eclipse formatting settings as well (#3131)	2016-06-24 15:27:50 -07:00
Nishant	94b3c74cdc	Druid launch script improvements (#3175) * Add status command to launch scripts * make druid init script to pick up config directories from environment variables	2016-06-24 15:02:34 -07:00
Dave Li	8a08398977	Add segment pruning based on secondary partition dimension (#2982) * add get dimension rangeset to filters * add get domain to ShardSpec and added chunk filter in caching clustered client * add null check and modified not filter, started with unit test * add filter test with caching * refactor and some comments * extract filtershard to helper function * fixup * minor changes * update javadoc	2016-06-24 14:52:19 -07:00
Charles Allen	6be18376c0	Make forking task runner have more informative thread names during the long-blocking part (#3172) * Make forking task runner have more informative thread names during the long-blocking part * Make string.format do the work	2016-06-24 08:56:01 -07:00
Charles Allen	15f833a861	Make extension classloader caching keyed on directory (#3165) * Make extension classloaders keyed by extension directory * Fixes #3163 * Add in same-directory-name unit test	2016-06-23 17:13:19 -07:00
michaelschiff	66d8ad36d7	adds new coordinator metrics 'segment/unavailable/count' and 'segment/underReplicated/count' (#3173) (#3176)	2016-06-23 14:53:15 -07:00
Gian Merlino	da660bb592	DumpSegment tool. (#3182) Fixes #2723.	2016-06-23 14:37:50 -07:00
Gian Merlino	a437fb150b	Fix SegmentMetadataQuery when queryGranularity is requested but not present. (#3181)	2016-06-23 14:30:50 -07:00
Nishant	2696b0c451	Retry for transient exceptions while doing cleanup for Hadoop Jobs (#3177) * fix 1828 fixes https://github.com/druid-io/druid/issues/1828 * remove unused import * Review comment	2016-06-23 13:38:47 -07:00
Nishant	6f330dc816	Better handling for parseExceptions for Batch Ingestion (#3171) * Better handling for parseExceptions * make parseException handling consistent with Realtime * change combiner default val to true * review comments * review comments	2016-06-22 16:38:29 -07:00

7480 Commits

All Branches