druid

Commit Graph

Author	SHA1	Message	Date
Dave Li	c4e8440c22	Adds long compression methods (#3148 ) * add read * update deprecated guava calls * add write and vsizeserde * add benchmark * separate encoding and compression * add header and reformat * update doc * address PR comment * fix buffer order * generate benchmark files * separate encoding strategy and format * fix benchmark * modify supplier write to channel * add float NONE handling * address PR comment * address PR comment 2	2016-08-30 16:17:46 -07:00
Hamlet Lee	e4f0eac8e6	Fix issue #2707 (#2708 )	2016-08-16 12:19:44 -05:00
Gian Merlino	a2bcd97512	IncrementalIndex: Fix multi-value dimensions returned from iterators. (#3344 ) They had arrays as values, which MapBasedRow doesn't understand and toStrings rather than converting to lists.	2016-08-10 08:47:29 -07:00
Gian Merlino	9437a7a313	HLL: Avoid some allocations when possible. (#3314 ) - HLLC.fold avoids duplicating the other buffer by saving and restoring its position. - HLLC.makeCollector(buffer) no longer duplicates incoming BBs. - Updated call sites where appropriate to duplicate BBs passed to HLLC.	2016-08-03 18:08:52 -07:00
kaijianding	50d52a24fc	ability to not rollup at index time, make pre aggregation an option (#3020 ) * ability to not rollup at index time, make pre aggregation an option * rename getRowIndexForRollup to getPriorIndex * fix doc misspelling * test query using no-rollup indexes * fix benchmark fail due to jmh bug	2016-08-02 11:13:05 -07:00
kaijianding	3dc2974894	Add timestampSpec to metadata.drd and SegmentMetadataQuery (#3227 ) * save TimestampSpec in metadata.drd * add timestampSpec info in SegmentMetadataQuery	2016-07-25 15:45:30 -07:00
Navis Ryu	cd7337fc8a	Calculate max split size based on numMapTask in DatasourceInputFormat (#2882 ) * Calculate max split size based on numMapTask * updated docs & fixed possible ArithmeticException	2016-07-20 16:53:51 -07:00
Hyukjin Kwon	55e7a52475	Replace deprecated usage for StringInputRowParser and JSONParseSpec (#3215 )	2016-07-14 09:19:17 -07:00
Gian Merlino	ea03906fcf	Configurable compressRunOnSerialization for Roaring bitmaps. (#3228 ) Defaults to true, which is a change in behavior (this used to be false and unconfigurable).	2016-07-08 10:24:19 +05:30
Hyukjin Kwon	45f553fc28	Replace the deprecated usage of NoneShardSpec (#3166 )	2016-06-25 10:27:25 -07:00
Nishant	2696b0c451	Retry for transient exceptions while doing cleanup for Hadoop Jobs (#3177 ) * fix 1828 fixes https://github.com/druid-io/druid/issues/1828 * remove unused import * Review comment	2016-06-23 13:38:47 -07:00
Nishant	6f330dc816	Better handling for parseExceptions for Batch Ingestion (#3171 ) * Better handling for parseExceptions * make parseException handling consistent with Realtime * change combiner default val to true * review comments * review comments	2016-06-22 16:38:29 -07:00
Gian Merlino	ebf890fe79	Update master version to 0.9.2-SNAPSHOT. (#3133 )	2016-06-13 13:10:38 -07:00
Nishant	778f97a80e	attempt to fix-2906 (#2985 ) * attempt to fix-2984 * review comments * Add test	2016-05-18 15:12:38 -05:00
Charles Allen	15ccf451f9	Move QueryGranularity static fields to QueryGranularities (#2980 ) * Move QueryGranularity static fields to QueryGranularityUtil * Fixes #2979 * Add test showing #2979 * change name to QueryGranularities	2016-05-17 16:23:48 -07:00
David Lim	b489f63698	Supervisor for KafkaIndexTask (#2656 ) * supervisor for kafka indexing tasks * cr changes	2016-05-04 23:13:13 -07:00
Navis Ryu	49ef4d96cb	Merge pull request #2802 from navis/optimize_multiplepath_concat Optimize adding lots of paths to pathspec	2016-04-11 23:35:28 -05:00
jon-wei	0e481d6f93	Allow filters to use extraction functions	2016-04-05 13:24:56 -07:00
Gian Merlino	977e867ad8	Downgrade geoip2, exclude com.google.http-client. Reverts "Update com.maxmind.geoip2 to 2.6.0" and exclude the google http client from com.maxmind.geoip2. This should satisfy the original need from #2646 (wanting to run Druid along with an upgraded com.google.http-client) while preventing Jackson conflicts pointed out in #2717. Fixes #2717. This reverts commit `21b7572533`.	2016-03-25 14:43:22 -07:00
Gian Merlino	ff25325f3b	Improved docs for multi-value dimensions. - Add central doc for multi-value dimensions, with some content from other docs. - Link to multi-value dimension doc from topN and groupBy docs. - Fixes a broken link from dimensionspecs.md, which was presciently already linking to this nonexistent doc. - Resolve inconsistent naming in docs & code (sometimes "multi-valued", sometimes "multi-value") in favor of "multi-value".	2016-03-22 14:40:55 -07:00
Himanshu	00d7021291	Merge pull request #2607 from jon-wei/dim_schema Support use of DimensionSchema class in DimensionsSpec	2016-03-22 11:53:46 -05:00
Himanshu	3220b109ad	Merge pull request #2570 from binlijin/single_dimension_partitioning Single dimension hash-based partitioning	2016-03-22 11:51:06 -05:00
binlijin	bce600f5d5	Single dimension hash-based partitioning	2016-03-22 13:15:33 +08:00
jon-wei	a59c9ee1b1	Support use of DimensionSchema class in DimensionsSpec	2016-03-21 13:12:04 -07:00
Gian Merlino	738dcd8cd9	Update version to 0.9.1-SNAPSHOT. Fixes #2462	2016-03-17 10:34:20 -07:00
Himanshu	ea3281ad78	Merge pull request #2645 from atomx/gs-scheme Add gs:// hdfs support	2016-03-14 22:15:42 -05:00
Erik Dubbelboer	375620cfb3	Add gs:// hdfs support Used to access google cloud storage	2016-03-12 08:57:57 +00:00
Gian Merlino	187569e702	DataSource metadata. Geared towards supporting transactional inserts of new segments. This involves an interface "DataSourceMetadata" that allows combining of partially specified metadata (useful for partitioned ingestion). DataSource metadata is stored in a new "dataSource" table.	2016-03-10 17:41:50 -08:00
Fangjin Yang	1e49092ce7	Merge pull request #2627 from himanshug/fix_datasource_inputformat_locations fix regression - bug in DatasourceInputFormat best effort split location finder code	2016-03-10 13:46:04 -08:00
Himanshu Gupta	eab8a0b54d	in DatasourceInputFormat code for determining segment block locations avoid the split calulation by helper TextInputFormat	2016-03-10 14:28:53 -06:00
Nishant	ba1185963b	Fix a bunch of dependencies * Eliminate exclusion groups from pull-deps * Only consider dependency nodes in pull-deps if they are not in the following scopes * provided * test * system * Fix a bunch of `<scope>provided</scope>` missing tags * Better exclusions for a couple of problematic libs	2016-03-10 10:18:08 -08:00
Bingkun Guo	c20d7682a9	log exceptions correctly in DatasourceInputFormat and IndexGeneratorJob	2016-03-09 13:41:31 -06:00
gaodayue	a6dc3703ca	use ISODataTimeFormat for both hdfs and viewfs schema to support Federationed HDFS	2016-03-08 13:55:05 +08:00
Björn Zettergren	2462c82c0e	New defaults for maxRowsInMemory rowFlushBoundary To bring consistency to docs and source this commit changes the default values for maxRowsInMemory and rowFlushBoundary to 75000 after discussion in PR https://github.com/druid-io/druid/pull/2457. The previous default was 500000 and it's lower now on the grounds that it's better for a default to be somewhat less efficient, and work, than to reach for the stars and possibly result in "OutOfMemoryError: java heap space" errors.	2016-03-01 13:50:28 +01:00
Gian Merlino	3534483433	Better handling of ParseExceptions. Two changes: - Allow IncrementalIndex to suppress ParseExceptions on "aggregate". - Add "reportParseExceptions" option to realtime tuning configs. By default this is "false". Behavior of the counters should now be: - processed: Number of rows indexed, including rows where some fields could be parsed and some could not. - thrownAway: Number of rows thrown away due to rejection policy. - unparseable: Number of rows thrown away due to being completely unparseable (no fields salvageable at all). If "reportParseExceptions" is true then "unparseable" will always be zero (because a parse error would cause an exception to be thrown). In addition, "processed" will only include fully parseable rows (because even partial parse failures will cause exceptions to be thrown). Fixes #2510.	2016-02-23 10:11:43 -08:00
Himanshu Gupta	09ffcae4ae	give user the option to specify the segments for dataSource inputSpec	2016-02-21 23:15:31 -06:00
Himanshu Gupta	2faae9d0d1	In JobHelper.makeSegmentOutputPath(..) use DataSegmentPusherUtils to construct the segment storage path	2016-02-09 21:42:32 -06:00
Himanshu Gupta	b3437825f0	add ignoreWhenNoSegments flag to optionally ignore the dataSource inputSpec when no segments were found	2016-01-26 17:23:55 -06:00
binlijin	cd1c71ceb4	rename persistBackgroundCount to numBackgroundPersistThreads	2016-01-22 14:29:41 +08:00
Charles Allen	2a69a58570	Merge pull request #2149 from binlijin/master Do persist IncrementalIndex in another thread in IndexGeneratorReducer	2016-01-20 17:06:42 -08:00
Fangjin Yang	996c1173c6	Merge pull request #2223 from navis/besteffort-split-locations Best effort to find locations for input splits	2016-01-20 16:53:43 -08:00
Fangjin Yang	695f107870	Merge pull request #2302 from metamx/lowerCaseGranPathTest Make GranularityPathSpecTest check with lower-case enums	2016-01-20 09:18:06 -08:00
Charles Allen	3c5ca3a5f2	Make GranularityPathSpecTest check with lower-case enums	2016-01-20 08:35:13 -08:00
binlijin	8e43e2c446	Do persist IncrementalIndex in another thread in IndexGeneratorReducer	2016-01-20 09:20:09 +08:00
jon-wei	747343e621	Preserve dimension order across indexes during ingestion	2016-01-19 13:34:11 -08:00
Jonathan Wei	df2906a91c	Merge pull request #2290 from gianm/index-merger-v9-stuff Respect buildV9Directly in PlumberSchools, so it works on standalone realtime.	2016-01-19 13:04:00 -08:00
Gian Merlino	1dcf22edb7	Respect buildV9Directly in PlumberSchools, so it works on standalone realtime nodes. Also parameterize some tests to run with/without buildV9Directly: - IndexGeneratorJobTest - RealtimeIndexTaskTest - RealtimePlumberSchoolTest	2016-01-19 12:15:06 -08:00
Himanshu Gupta	164b0aad7a	removing Map<String,Object> segmentMetadata from methods in Index[Maker/Merger] and using Metadata class instead of a Map to store segment metadata	2016-01-18 22:03:46 -06:00
navis.ryu	f03f7fb625	Best effort to find locations for input splits	2016-01-18 08:31:05 +09:00
Kurt Young	82ff98c2bf	add config for build v9 directly and update docs	2016-01-16 11:26:34 +08:00

1 2 3 4 5 ...

797 Commits