druid

Commit Graph

Author	SHA1	Message	Date
Jihoon Son	84ff0d2352	Fix TSV bugs (#9199) * working * - support multi-char delimiter for tsv - respect "delimiter" property for tsv * default value check for findColumnsFromHeader * remove CSVParser to have a true and only CSVParser * fix tests * fix another test	2020-01-17 15:35:14 -08:00
singh	936b9bdfd0	add deets about the keyfile (#9209)	2020-01-17 11:24:49 -08:00
Fokko Driesprong	12b84cfb33	Bump Jackson to 2.10.2 (#9173)	2020-01-17 11:39:32 +01:00
Vadim Ogievetsky	ab2672514b	allow empty values to be set in the auto form (#9198)	2020-01-16 21:06:51 -08:00
Maytas Monsereenusorn	68ed2a2c8f	Fix LATEST / EARLIEST Buffer Aggregator does not work on String column (#9197) * fix buff limit bug * add tests * add test * add tests * fix checkstyle	2020-01-16 21:02:37 -08:00
Gian Merlino	448da78765	Speed up String first/last aggregators when folding isn't needed. (#9181) * Speed up String first/last aggregators when folding isn't needed. Examines the value column, and disables fold checking via a needsFoldCheck flag if that column can't possibly contain SerializableLongStringPairs. This is helpful because it avoids calling getObject on the value selector when unnecessary; say, because the time selector didn't yield an earlier or later value. * PR comments. * Move fastLooseChop to StringUtils.	2020-01-16 21:02:02 -08:00
Fokko Driesprong	486c0fd149	Bump Apache Parquet to 1.11.0 (#9129) * Bump Parquet to 1.11.0 * Update licenses.yaml * Add parquet-format-structures	2020-01-16 16:24:25 -08:00
Gian Merlino	bd49ec03bc	Move result-to-array logic from SQL layer into QueryToolChests. (#9130) * Move result-to-array logic from SQL layer into QueryToolChests. * Checkstyle adjustment. * Fix typo.	2020-01-16 15:42:10 -08:00
Gian Merlino	bfcb30e48f	Add javadocs and small improvements to join code. (#9196) A follow-up to #9111.	2020-01-16 15:25:38 -08:00
Maytas Monsereenusorn	42359c93dd	Implement ANY aggregator (#9187) * Implement ANY aggregator * Add copyright headers * Add unit tests * fix BufferAggregator * Fix bug in BufferAggregator * hook up the SQL command * add check for buffer aggregator * Address comment * address comments * add docs * Address comments * add more tests for numeric columns that have null values when run in sql compatible null mode * fix checkstyle errors * fix failing tests * fix failing tests	2020-01-16 14:40:32 -08:00
Gian Merlino	a87db7f353	Add HashJoinSegment, a virtual segment for joins. (#9111) * Add HashJoinSegment, a virtual segment for joins. An initial step towards #8728. This patch adds enough functionality to implement a joining cursor on top of a normal datasource. It does not include enough to actually do a query. For that, future patches will need to wire this low-level functionality into the query language. * Fixups. * Fix missing format argument. * Various tests and minor improvements. * Changes. * Remove or add tests for unused stuff. * Fix up package locations.	2020-01-16 13:14:20 -08:00
Vadim Ogievetsky	09efd20b42	fix refresh button (#9195)	2020-01-16 10:13:47 -08:00
Suneet Saldanha	92ac22d060	Link javaOpts to middlemanager runtime.properties docs (#9101) * Link javaOpts to middlemanager runtime.properties docs * fix broken link * reword config links	2020-01-15 21:22:49 -08:00
Suneet Saldanha	85a3d416b0	Tutorials use new ingestion spec where possible (#9155) * Tutorials use new ingestion spec where possible There are 2 main changes * Use task type index_parallel instead of index * Remove the use of parser + firehose in favor of inputFormat + inputSource index_parallel is the preferred method starting in 0.17. Setting the job to index_parallel with the default maxNumConcurrentSubTasks(1) is the equivalent of an index task Instead of using a parserSpec, dimensionSpec and timestampSpec have been promoted to the dataSchema. The format is described in the ioConfig as the inputFormat. There are a few cases where the new format is not supported * Hadoop must use firehoses instead of the inputSource and inputFormat * There is no equivalent of a combining firehose as an inputSource * A Combining firehose does not support index_parallel * fix typo	2020-01-15 14:08:29 -08:00
Lucas Capistrant	4716e0b585	Fix concurrency of ComplexMetrics.java (#9134)	2020-01-15 17:19:45 +03:00
Chi Cao Minh	b2877119d0	Suppress CVE-2019-20330 for htrace-core-4.0.1 (#9189) CVE-2019-20330 was updated on 14 Jan 2020, which now gets flagged by the security vulnerability scan. Since the CVE is for jackson-databind, via htrace-core-4.0.1, it can be added to the existing list of security vulnerability suppressions for that dependency.	2020-01-14 21:15:24 -08:00
Chi Cao Minh	1fd05bef9a	Add jackson-mapper-asl for hdfs-storage extension (#9178) Previously jackson-mapper-asl was excluded to remove a security vulnerability; however, it is required for functionality (e.g., org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator).	2020-01-14 09:50:45 -08:00
Atul Mohan	ea51bc45bf	Fix nullhandling in tests (#9119)	2020-01-12 20:19:12 -08:00
Atul Mohan	b642b1aa5b	Fix deserialization of maxBytesInMemory (#9092) * Fix deserialization of maxBytesInMemory * Add maxBytes check	2020-01-12 20:08:07 -08:00
Clint Wylie	85219ece13	fix null handling for arithmetic post aggregator comparator (#9159) * fix null handling for arithmetic postagg comparator, add test for comparator for min/max/quantile postaggs in histogram ext * fix	2020-01-10 13:49:19 -08:00
Jonathan Wei	8c53818fa9	Add numeric nulls to sample data, fix some numeric null handling issues (#9154) * Fix LongSumAggregator comparator null handling * Remove unneeded GroupBy test change * Checkstyle * Update other processing tests for new sample data * Remove unused code * Fix SearchQueryRunner column selectors * Fix DimensionIndexer null handling and ScanQueryRunnerTest * Fix TeamCity errors	2020-01-10 13:49:06 -08:00
Clint Wylie	f245292e5d	add middle manager and indexer worker category to tier column of services view (#9158)	2020-01-09 12:20:42 -08:00
Jihoon Son	e27a1e8604	Fix handling nullable writableComparable in OrcStructConverter (#9138) * Handle nullable writableComparable in OrcStructConverter * add missing dependency	2020-01-08 13:40:24 -08:00
Clint Wylie	7439f73c23	web console services tab treat indexer as a real service (#9139)	2020-01-07 18:14:04 -08:00
Clint Wylie	28edd3b44e	data loader style fix for double typed columns (#9137)	2020-01-07 16:07:30 -08:00
Jonathan Wei	d1500c1328	Update Kinesis resharding information about task failures (#9104)	2020-01-07 15:44:48 -08:00
Clint Wylie	f540216931	fix InputFormat serde issue with SeekableStream based supervisors (#9136)	2020-01-07 16:18:54 -06:00
Clint Wylie	c248e00984	fix moment sketch null handling (#9075)	2020-01-07 14:15:59 -06:00
Clint Wylie	7af85250cb	null handling for doubles sketch and array of doubles sketch aggs (#9112) * doubles sketch and array of doubles sketch aggs now skip rows with nulls in sql compatible null handling mode * formatting	2020-01-07 14:15:32 -06:00
Clint Wylie	14702429a0	fix web console data loader dimension types (#9135)	2020-01-06 20:56:58 -08:00
Jonathan Wei	58d337186b	Graduation update for ASF release process guide and download links (#9126) * Graduation update for ASF release process guide and download links * Fix release vote thread typo * Fix pom.xml	2020-01-06 15:00:33 -06:00
Gian Merlino	66657012bf	Replace CaseFilteredAggregatorRule with Calcite equivalent. (#9113) AggregateCaseToFilterRule was added to Calcite in https://issues.apache.org/jira/browse/CALCITE-3144, and was originally copied from Druid's CaseFilteredAggregatorRule. So there isn't a good reason to keep using our version.	2020-01-04 19:11:18 -08:00
Suneet Saldanha	bdd0d0d8a5	Add avro dependency to parquet extension (#9124) * Add avro dependency to parquet extension If the parquet extension is loaded and an ingestionSpec uses the older format specifying a 'parser' instead of using an 'inputFormat' the job fails with the following error java.lang.TypeNotPresentException: Type org.apache.avro.generic.GenericRecord not present This change removes the exclusion of the avro package so that the missing class can be found. * Address review comments and add dependency version	2020-01-03 20:11:13 -06:00
Jonathan Wei	aa539177ec	De-incubation cleanup in code, docs, packaging (#9108) * De-incubation cleanup in code, docs, packaging * remove unused docs script	2020-01-03 12:33:19 -05:00
Gian Merlino	eb124a3068	Fix DistinctCountGroupByQueryTest Y2020 bug. (#9120) It used data with the current timestamp alongside a query that had an end instant of 2020-01-01.	2020-01-02 21:10:32 -05:00
Jonathan Wei	4e8368a5d9	Set version to 0.18.0-SNAPSHOT (#9109)	2020-01-02 17:55:10 -05:00
Gian Merlino	18eb456fe6	S3: Improvements to prefix listing (including fix for an infinite loop) (#9098) * S3: Improvements to prefix listing (including fix for an infinite loop) 1) Fixes #9097, an infinite loop that occurs when more than one batch of objects is retrieved during a prefix listing. 2) Removes the Access Denied fallback code added in #4444. I don't think the behavior is reasonable: its purpose is to fall back from a prefix listing to a single-object access, but it's only activated when the end user supplied a prefix, so it would be better to simply fail, so the end user knows that their request for a prefix-based load is not going to work. Presumably the end user can switch from supplying 'prefixes' to supplying 'uris' if desired. 3) Filters out directory placeholders when walking prefixes. 4) Splits LazyObjectSummariesIterator into its own class and adds tests. * Adjust S3InputSourceTest. * Changes from review. * Include hamcrest-core.	2019-12-31 19:06:49 -05:00
Suneet Saldanha	dec619ebf4	Optimize CachingLocalSegmentAllocator#getSequenceName (#8909) * Optimize CachingLocalSegmentAllocator#getSequenceName Replace StringUtils#format with string addition to generate the sequence name for an interval and partition. This is faster because format uses a Matcher under the covers to replace the string format with the variables. * fix imports and add test * Add comment about optimization * Use renamed function for TaskToolbox * Move tests after refactor * Rename tests	2019-12-23 18:33:22 -08:00
Vadim Ogievetsky	320c50d24a	Web console: fix spec reset (#9081) * extract spec type * better text * better copy * de incubate the console * fix status dialog scss	2019-12-23 18:23:14 -08:00
Samarth Jain	9ec9619143	Handle null values for metrics in TDigest aggregators. (#9073) Add support for rollup during ingestion.	2019-12-23 17:49:06 -08:00
Vadim Ogievetsky	a24e2f347f	make supervisor statistics dialog more robust (#9089)	2019-12-23 17:43:08 -08:00
Benedict Jin	7a7c948595	Exclude .asf.yaml from the configuration of the rat plugin (#9088)	2019-12-23 13:08:23 -08:00
Fangjin Yang	2231e69b7f	Update README.md	2019-12-20 20:56:53 -08:00
Chi Cao Minh	513bb1f6da	Get proper Kinesis index task AWS credentials (#9082) Previously, the configured S3 credentials would be used instead of the ones configured for Kinesis for Kinesis index tasks.	2019-12-20 19:35:05 -08:00
Gian Merlino	342107b4c2	Add .asf.yaml. (#9083) Based on the docs at https://cwiki.apache.org/confluence/display/INFRA/.asf.yaml+features+for+git+repositories.	2019-12-20 16:45:38 -08:00
Clint Wylie	8ccce9857a	fix vectorized query engine numeric filter matchers against null values (#9063) * fix druid-sql issue with filtering numeric columns by null values * fix vector numeric column matchers to check null vector for null matches	2019-12-20 13:15:48 -08:00
Fangjin Yang	60d896a67c	Update README.md	2019-12-19 22:32:08 -08:00
Clint Wylie	c2e9ab8100	benchmark schema with numeric dimensions and null column values (#9036) * benchmark schema with null column values * oops * adjustments * rename again, different null percentage so rows more variety * more schema	2019-12-19 17:45:19 -08:00
Jihoon Son	3c31493772	Add missing docs for http client configurations (#9054) * Add missing docs for http client configurations * fix typo * backticks	2019-12-19 17:41:04 -08:00
Suneet Saldanha	3c13444167	Fix flaky ITBasicAuthConfigurationTest (#9072) This test was failing to authenticate using the admin credentials. These should be available by default in the metadata store. This indicates that the credentials are not successfully being syncd before the test is run. This change increases the number of retries to 20 so that the services are syncd before the test runs	2019-12-19 17:38:55 -08:00

10338 Commits