druid

Commit Graph

Author	SHA1	Message	Date
frank chen	55a01a030a	Clarify that Broker caching for groupBy v2 queries does not work (#11370 ) * Add a note * Update docs/configuration/index.md Co-authored-by: sthetland <steve.hetland@imply.io> * clarify that both of non-result level cache and result level cache are not supported Co-authored-by: sthetland <steve.hetland@imply.io>	2021-08-03 10:01:15 -07:00
Yi Yuan	f1e52ab356	add doc (#11531 ) Co-authored-by: yuanyi <yuanyi@freewheel.tv>	2021-08-03 12:20:29 +08:00
Agustin Gonzalez	a2da407b70	Add error msg to parallel task's TaskStatus (#11486 ) * Add error msg to parallel task's TaskStatus * Consolidate failure block * Add failure test * Make it fail * Add fail while stopped * Simplify hash task test using a runner that fails after so many runs (parameter) * Remove unthrown exception * Use runner names to identify phase * Added range partition kill test & fixed a timing bug with the custom runner * Forbidden api * Style * Unit test code cleanup * Added message to invalid state exception and improved readability of the phase error messages for the parallel task failure unit tests	2021-08-02 12:11:28 -07:00
dependabot[bot]	cf674c833c	Bump maven-resources-plugin from 3.1.0 to 3.2.0 (#11525 ) Bumps [maven-resources-plugin](https://github.com/apache/maven-resources-plugin) from 3.1.0 to 3.2.0. - [Release notes](https://github.com/apache/maven-resources-plugin/releases) - [Commits](https://github.com/apache/maven-resources-plugin/compare/maven-resources-plugin-3.1.0...maven-resources-plugin-3.2.0) --- updated-dependencies: - dependency-name: org.apache.maven.plugins:maven-resources-plugin dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2021-08-02 09:38:34 -07:00
dependabot[bot]	2e850b5655	Bump aws.sdk.version from 1.11.884 to 1.12.37 (#11367 ) * Bump aws.sdk.version from 1.11.884 to 1.12.8 Bumps `aws.sdk.version` from 1.11.884 to 1.12.8. Updates `aws-java-sdk-core` from 1.11.884 to 1.12.8 - [Release notes](https://github.com/aws/aws-sdk-java/releases) - [Changelog](https://github.com/aws/aws-sdk-java/blob/master/CHANGELOG.md) - [Commits](https://github.com/aws/aws-sdk-java/compare/1.11.884...1.12.8) Updates `aws-java-sdk-ec2` from 1.11.884 to 1.12.8 - [Release notes](https://github.com/aws/aws-sdk-java/releases) - [Changelog](https://github.com/aws/aws-sdk-java/blob/master/CHANGELOG.md) - [Commits](https://github.com/aws/aws-sdk-java/compare/1.11.884...1.12.8) Updates `aws-java-sdk-s3` from 1.11.884 to 1.12.8 - [Release notes](https://github.com/aws/aws-sdk-java/releases) - [Changelog](https://github.com/aws/aws-sdk-java/blob/master/CHANGELOG.md) - [Commits](https://github.com/aws/aws-sdk-java/compare/1.11.884...1.12.8) Updates `aws-java-sdk-sts` from 1.11.884 to 1.12.8 - [Release notes](https://github.com/aws/aws-sdk-java/releases) - [Changelog](https://github.com/aws/aws-sdk-java/blob/master/CHANGELOG.md) - [Commits](https://github.com/aws/aws-sdk-java/compare/1.11.884...1.12.8) Updates `aws-java-sdk-kinesis` from 1.11.884 to 1.12.8 - [Release notes](https://github.com/aws/aws-sdk-java/releases) - [Changelog](https://github.com/aws/aws-sdk-java/blob/master/CHANGELOG.md) - [Commits](https://github.com/aws/aws-sdk-java/compare/1.11.884...1.12.8) Updates `aws-java-sdk-rds` from 1.11.884 to 1.12.8 - [Release notes](https://github.com/aws/aws-sdk-java/releases) - [Changelog](https://github.com/aws/aws-sdk-java/blob/master/CHANGELOG.md) - [Commits](https://github.com/aws/aws-sdk-java/compare/1.11.884...1.12.8) --- updated-dependencies: - dependency-name: com.amazonaws:aws-java-sdk-core dependency-type: direct:production update-type: version-update:semver-minor - dependency-name: com.amazonaws:aws-java-sdk-ec2 dependency-type: direct:production update-type: version-update:semver-minor - dependency-name: com.amazonaws:aws-java-sdk-s3 dependency-type: direct:production update-type: version-update:semver-minor - dependency-name: com.amazonaws:aws-java-sdk-sts dependency-type: direct:production update-type: version-update:semver-minor - dependency-name: com.amazonaws:aws-java-sdk-kinesis dependency-type: direct:production update-type: version-update:semver-minor - dependency-name: com.amazonaws:aws-java-sdk-rds dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> * Fix license and bump to latest aws Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Suneet Saldanha <suneet@apache.org>	2021-08-01 00:12:57 -07:00
Dongjoon Hyun	dbed4424b5	Upgrade ORC to 1.6.9 (#11518 )	2021-07-31 23:33:03 -07:00
Jihoon Son	8ba7f6a48c	Fix incorrect result of exact topN on an inner join with limit (#11517 )	2021-07-31 15:55:49 -07:00
Jihoon Son	98312d54cf	Fix CI for master (#11522 )	2021-07-30 15:41:21 -07:00
Victoria Lim	949484728f	docs fix for doubleMean description (#11513 ) * fix for doubleMean description * include quantile aggregator description from Suneet * update hyperlink to quantiles aggregator	2021-07-30 12:39:44 -07:00
Maytas Monsereenusorn	05a7da792f	compaction/status API retains status for datasources that no longer existed causing in-memory used to grow unbounded (#11510 ) * fix compaction status api * fix checkstyle * address comment	2021-07-30 22:19:24 +07:00
Harini Rajendran	995d99d9e4	add ingest/notices/queueSize metric to give visibility into supervisor notices queue size (#11417 )	2021-07-30 07:59:26 -07:00
Yuanli Han	b83742179a	Reduce method invocation of reservoir sampling (#11257 ) * reduce method invocation of reservoir sampling * add a dynamic parameter and add benchmark * rebase	2021-07-30 22:09:50 +08:00
Jihoon Son	59e61e127a	Add a section for updating the release branch before making a tag in the release process (#11449 )	2021-07-29 17:05:33 -07:00
Xavier Léauté	4bca7f014e	update error-prone to 2.8.0 with fix for crashing check (#11494 ) * error-prone 2.8.0 fixes https://github.com/google/error-prone/issues/2396 * fix for a few ignored return values * fix unknown args in sub-modules	2021-07-29 09:13:46 -07:00
Jonathan Wei	9b250c54aa	Allow kill task to mark segments as unused (#11501 ) * Allow kill task to mark segments as unused * Add IndexerSQLMetadataStorageCoordinator test * Update docs/ingestion/data-management.md Co-authored-by: Jihoon Son <jihoonson@apache.org> * Add warning to kill task doc Co-authored-by: Jihoon Son <jihoonson@apache.org>	2021-07-29 10:48:43 -05:00
John Gozde	280c08045f	Update awesome-code-style (#11503 )	2021-07-28 09:25:18 -07:00
Peter Marshall	0de1837ff7	Docs - partitioning note re: skew / dim concatenation + nav update (#11488 ) * Update native-batch.md Knowledge from https://the-asf.slack.com/archives/CJ8D1JTB8/p1595434977062400 * Update native-batch.md * Fixed broken link + some grammar	2021-07-27 09:17:01 -07:00
Kashif Faraz	8a4e27f51d	Select broker based on query context parameter `brokerService` (#11495 ) This change allows the selection of a specific broker service (or broker tier) by the Router. The newly added ManualTieredBrokerSelectorStrategy works as follows: Check for the parameter brokerService in the query context. If this is a valid broker service, use it. Check if the field defaultManualBrokerService has been set in the strategy. If this is a valid broker service, use it. Move on to the next strategy	2021-07-27 20:56:05 +05:30
Peter Marshall	60fdf7a734	Rollup measurement query amended (#11479 ) By user request from https://groups.google.com/g/druid-user/c/bFkOtE-1eQg - gives the measure as a floating point instead of an integer.	2021-07-27 06:29:29 -07:00
Jonathan Wei	676efb1b3f	Fix integration test credential resource path handling (#11487 ) This PR fixes an issue with the integration test copy_resources.sh script. The "install druid jars" portion was removing the $SHARED_DIR/docker directory, which wipes out the $SHARED_DIR/docker/extensions and $SHARED_DIR/docker/credentials directories created just before, which leads to issues later in the script when copying resources to the $SHARED_DIR/docker/credentials/ dir.	2021-07-27 12:32:34 +05:30
Maytas Monsereenusorn	c068906fca	Make intermediate store for shuffle tasks an extension point (#11492 ) * add interface * add docs * fix errors * fix injection * fix injection * update javadoc	2021-07-27 11:29:43 +07:00
Suneet Saldanha	3f456fe305	Address CVE-2021-35515 CVE-2021-36090 (#11496 ) * Address CVE-2021-35515 CVE-2021-36090 Bump commons-compress to deal with new CVEs * fix licenses	2021-07-26 14:54:32 -07:00
Peter Marshall	973e5bf7d0	Docs - HLL lgK tip and slight layout change (#11482 ) * HLL lgK and a tip Knowledge transfer from https://the-asf.slack.com/archives/CJ8D1JTB8/p1600699967024200. Attempted to make a connection between the SQL HLL function and the HLL underneath without getting too complicated. Also added a note about using K over 16 being pretty much pointless. * Corrected spelling * Create datasketches-hll.md Put roll-up back to rollup * Update docs/development/extensions-core/datasketches-hll.md Co-authored-by: Abhishek Agarwal <1477457+abhishekagarwal87@users.noreply.github.com> Co-authored-by: Abhishek Agarwal <1477457+abhishekagarwal87@users.noreply.github.com>	2021-07-26 12:28:53 -07:00
Abhishek Agarwal	fcb908d505	Make buildQueryRunnerForSegment protected in ServerManager(#11493 ) This is a minor change in ServerManager. Any sub-class can access the buildQueryRunnerForSegment in an extension if required.	2021-07-24 11:08:28 +05:30
Rohan Garg	c98e7c3aa3	Fix left join SQL queries with IS NOT NULL filter (#11434 ) This PR fixes the incorrect results for query: SELECT dim1, l1.k FROM foo LEFT JOIN (select k || '' as k from lookup.lookyloo group by 1) l1 ON foo.dim1 = l1.k WHERE l1.k IS NOT NULL (in CalciteQueryTests) In the current code, the WHERE clause gets removed from the top of the left join and is pushed to the table foo leading to incorrect results. The fix for such a situation is done by: Converting such left joins into inner joins (since logically the mentioned left join query is equivalent to an inner join) using Calcite while maintaining that the druid execution layer can execute such inner joins. Preferring converted inner joins over original left joins in our cost model	2021-07-23 20:57:19 +05:30
Maytas Monsereenusorn	161f4dbc0e	Add integration tests for S3 Assume Role ingestion feature (#11472 ) * add IT for S3 assume role * fix checkstyle * fix test * fix pom * fix test	2021-07-23 10:09:09 +07:00
Lucas Capistrant	9767b42e85	Add a new metric query/segments/count that is not emitted by default (#11394 ) * Add a new metric query/segments/count that is not emitted by default * docs * test the default implementation of the metric * fix spelling error in docs * document the fact that query retries will result in additional metric emissions * update using recommended text from @jihoonson	2021-07-22 17:57:35 -07:00
Abhishek Agarwal	ce1faa5635	Make SegmentLoader extensible and customizable (#11398 ) This PR refactors the code related to segment loading specifically SegmentLoader and SegmentLoaderLocalCacheManager. SegmentLoader is marked UnstableAPI which means, it can be extended outside core druid in custom extensions. Here is a summary of changes SegmentLoader returns an instance of ReferenceCountingSegment instead of Segment. Earlier, SegmentManager was wrapping Segment objects inside ReferenceCountingSegment. That is now moved to SegmentLoader. With this, a custom implementation can track the references of segments. It also allows them to create custom ReferenceCountingSegment implementations. For this reason, the constructor visibility in ReferenceCountingSegment is changed from private to protected. SegmentCacheManager has two additional methods called - reserve(DataSegment) and release(DataSegment). These methods let the caller reserve or release space without calling SegmentLoader#getSegment. We already had similar methods in StorageLocation and now they are available in SegmentCacheManager too which wraps multiple locations. Refactoring to simplify the code in SegmentCacheManager wherever possible. There is no change in the functionality.	2021-07-22 18:00:49 +05:30
benkrug	167c45260c	Update druid-vs-kudu.md (#11470 ) small typo - "need" to "needed"	2021-07-21 22:58:14 +08:00
Maytas Monsereenusorn	6ce3b6ca2d	Improve documentation for druid.indexer.autoscale.workerCapacityHint config (#11444 ) * fix doc * address comments * Update docs/configuration/index.md Co-authored-by: Charles Smith <38529548+techdocsmith@users.noreply.github.com> Co-authored-by: Charles Smith <38529548+techdocsmith@users.noreply.github.com>	2021-07-21 12:48:56 +07:00
Jihoon Son	0453e461f6	Add errorMessage in taskStatus for task failures in middleManagers/indexers/peons (#11446 ) * Add error message; add unit tests for ForkingTaskRunner * add tests * fix comment * unused import * add exit code in error message * fix test	2021-07-20 21:34:53 -07:00
Jihoon Son	84c957f541	Add more sql tests for groupby queries (#11454 ) * Add more sql tests for simple groupby queries * unused import * fix tests * javadocs * unused import	2021-07-20 21:05:11 -07:00
zachjsh	a2538d264d	Add back missing unit test coverage in AvroFlattenerMakerTest (#11451 ) * Add back missing unit test coverage in AvroFlattenerMakerTest Adds back test coverage for Avro flattener that was mistakenly removed in https://github.com/apache/druid/pull/10505. Refactored the tests a bit too. * resolve checkstyle warnings	2021-07-20 18:27:00 -07:00
Paul Rogers	aa8c615ac2	Updates to source and doc build pages (#11464 ) * Updates to source and doc build pages. Clarifies a few points for newbies. * Fixed spelling error And added spellcheck info to website README file.	2021-07-20 18:07:34 -07:00
Vadim Ogievetsky	aee2f2e24f	Web console: better handle BigInt math (#11450 ) * better handle BigInt math * correctly brace bigint * feedback fixes and tests	2021-07-20 17:17:19 -07:00
Suneet Saldanha	1937b5c0da	Pin Jetty version to 9.4.x (#11453 ) Major version bumps in jetty are too scary for now. So let's keep up to date with the latest 9.4.x	2021-07-20 14:50:39 -07:00
Abhishek Agarwal	94c1671eaf	Split SegmentLoader into SegmentLoader and SegmentCacheManager (#11466 ) This PR splits current SegmentLoader into SegmentLoader and SegmentCacheManager. SegmentLoader - this class is responsible for building the segment object but does not expose any methods for downloading, cache space management, etc. Default implementation delegates the download operations to SegmentCacheManager and only contains the logic for building segments once downloaded. This class will be used in SegmentManager to construct Segment objects. SegmentCacheManager - this class manages the segment cache on the local disk. It fetches the segment files to the local disk, can clean up the cache, and in the future, support reserve and release on cache space. [See https://github.com/Make SegmentLoader extensible and customizable #11398]. This class will be used in ingestion tasks such as compaction, re-indexing where segment files need to be downloaded locally.	2021-07-21 00:14:19 +05:30
Junegunn Choi	69b0c6a47b	"druid.request.logging.type" should allow "noop" value (#10774 )	2021-07-20 09:10:46 -07:00
jerryleooo	c7fdf1d685	Fix typo in ingestion spec sample (#11433 ) * Update index.md Fix typo in the ingestion spec sample * fixed more typos	2021-07-19 22:02:21 -07:00
Dongjoon Hyun	5037493e45	Bump commons-io to 2.11.0 (#11460 ) * Bump commons-io to 2.11.0 * Address comments * Remove try catch * Fix checkstyle	2021-07-19 15:47:14 -07:00
Clint Wylie	2705fe98fa	Fix avro json serde issues (#11455 )	2021-07-20 00:32:05 +08:00
Jihoon Son	8729b40893	Add the error message in taskStatus for task failures in overlord (#11419 ) * add error messages in taskStatus for task failures in overlord * unused imports * add helper message for logs to look up * fix tests * fix counting the same task failures more than once * same fix for HttpRemoteTaskRunner	2021-07-15 13:14:28 -07:00
sthetland	a366753ba5	Consolidate multi-value dimension doc and highlight configurability (#11428 ) * Clarify options for multi-value dims * Add first example	2021-07-15 10:19:10 -07:00
Maytas Monsereenusorn	8d7d60d18e	Improve Auto scaler pendingTaskBased provisioning strategy to handle when there are no currently running worker node better (#11440 ) * fix pendingTaskBased * fix doc * address comments * address comments * address comments * address comments * address comments * address comments * address comments	2021-07-15 06:52:25 +07:00
Maytas Monsereenusorn	d3e82b1114	speed up test (#11442 )	2021-07-14 21:14:38 +07:00
zachjsh	ace4b807f4	update dependency-check cron job to purge cache before checking (#11436 ) The dependency-check cron job now purges any cached NVD data before performing dependency check. Without this, a high CVE vulnerability was reported in this job a few months after the nvd was updated for it.	2021-07-13 01:43:31 -07:00
zachjsh	73711a456a	Suppress CVE-2021-27568 from json-smart 2.3 dependency (#11438 ) Dependency on hadoop 2.8.5 is preventing us from updating this dependency to a later version. We don't believe that this is a major concern since Druid eats uncaught exceptions, and only displays them in logs. This issue also should only affect ingestion jobs, which can only be run by admin type users.	2021-07-12 22:58:06 -04:00
Maytas Monsereenusorn	05d5dd9289	compaction/status API retains status for datasources that no longer existed causing in-memory used to grow unbounded (#11426 ) * compaction/status API retains status for datasources that no longer existed causing in-memory used to grow unbounded * compaction/status API retains status for datasources that no longer existed causing in-memory used to grow unbounded * compaction/status API retains status for datasources that no longer existed causing in-memory used to grow unbounded * fix test * fix test	2021-07-13 09:48:06 +07:00
Maytas Monsereenusorn	f5d53569ca	Supervisor metadata auto cleanup failing as missing Guice injection (#11424 ) * Fix Supervisor metadata auto cleanup failing as missing Guice injection * Fix Supervisor metadata auto cleanup failing as missing Guice injection * fix IT * fix IT * Update services/src/main/java/org/apache/druid/cli/CliCoordinator.java Co-authored-by: Clint Wylie <cjwylie@gmail.com> * fix * fix * fix * fix * fix * fix * fix Co-authored-by: Clint Wylie <cjwylie@gmail.com>	2021-07-13 09:47:49 +07:00
frank chen	2236cf2234	eliminate extra object instantiation (#11345 )	2021-07-12 18:31:39 -07:00


11239 Commits

All Branches