druid

mirror of https://github.com/apache/druid.git synced 2025-02-09 11:34:54 +00:00

Author	SHA1	Message	Date
Suneet Saldanha	7510e6e722	Fix potential NPEs in joins (#9760 ) * Fix potential NPEs in joins intelliJ reported issues with potential NPEs. This was first hit in testing with a filter being pushed down to the left hand table when joining against an indexed table. * More null check cleanup * Optimize filter value rewrite for IndexedTable * Add unit tests for LookupJoinable * Add tests for IndexedTableJoinable * Add non null assert for dimension selector * Supress null warning in LookupJoinMatcher * remove some null checks on hot path	2020-04-29 11:03:13 -07:00
Aleksei Chumagin	0642f778fa	changed Preview to Apply (#9757 )	2020-04-29 09:53:25 -07:00
Maytas Monsereenusorn	6bc64b731f	Improve "waiting for tasks complete" logic in integration tests (#9759 ) * improve waiting for tasks complete logic in integration tests * improve waiting for tasks complete logic in integration tests * fix forbidden check	2020-04-29 08:53:45 -07:00
Maytas Monsereenusorn	a107ee3ed2	Fix problem when running single integration test using -Dit.test= (#9778 ) * fix running single it * fix checksyle	2020-04-29 08:53:25 -07:00
James Dalton	b279e04a31	table fix (#9769 )	2020-04-28 11:23:24 -07:00
Francesco Nidito	e7e41e3a36	Adding support for autoscaling in GCE (#8987 ) * Adding support for autoscaling in GCE * adding extra google deps also in gce pom * fix link in doc * remove unused deps * adding terms to spelling file * version in pom 0.17.0-incubating-SNAPSHOT --> 0.18.0-SNAPSHOT * GCEXyz -> GceXyz in naming for consistency * add preconditions * add VisibleForTesting annotation * typos in comments * use StringUtils.format instead of String.format * use custom exception instead of exit * factorize interval time between retries * making literal value a constant * iter all network interfaces * use provided on google (non api) deps * adding missing dep * removing unneded this and use Objects methods instead o 3-way if in hash and comparison * adding import * adding retries around getRunningInstances and adding limit for operation end waiting * refactor GceEnvironmentConfig.hashCode * 0.18.0-SNAPSHOT -> 0.19.0-SNAPSHOT * removing unused config * adding tests to hash and equals * adding nullable to waitForOperationEnd * adding testTerminate * adding unit tests for createComputeService * increasing retries in unrelated integration-test to prevent sporadic failure (hopefully) * reverting queryResponseTemplate change * adding comment for Compute.Builder.build() returning null	2020-04-28 03:13:39 -07:00
Maytas Monsereenusorn	8b78eebdbd	Test reading from empty kafka/kinesis partitions (#9729 ) * add test for stream sequence number returns null * fix checkstyle * add index test for when stream returns null * retrigger test	2020-04-27 10:23:56 -07:00
Jonathan Wei	fe000a9e4b	Adjust string comparators used for ingestion (#9742 ) * Adjust string comparators used for ingestion * Small tweak * Fix inspection, more javadocs * Address PR comment * Add rollup comment * Add ordering test * Fix IncrementaIndexRowCompTest	2020-04-25 13:47:07 -07:00
Clint Wylie	7711f776a0	fix issue where CloseableIterator.flatMap does not close inner CloseableIterator (#9761 ) * fix issue where CloseableIterator.flatMap does not close inner CloseableIterator * more test * style * clarify test	2020-04-24 13:52:50 -07:00
Clint Wylie	fc5383cd00	revert datasketches-java version to 1.1.0-incubating until new version is released (#9751 ) * revert datasketches-java version to 1.1.0-incubating until fix is in place * fix tests * checkstyle	2020-04-24 12:52:12 -07:00
Jihoon Son	7fa72fbf15	Initialize SettableByteEntityReader only when inputFormat is not null (#9734 ) * Lazy initialization of SettableByteEntityReader to avoid NPE * toInputFormat for tsv * address comments * common code	2020-04-24 10:22:51 -07:00
BIGrey	c5bfe36011	Optimize FileWriteOutBytes to avoid high system cpu usage (#9722 ) * optimize FileWriteOutBytes to avoid high sys cpu * optimize FileWriteOutBytes to avoid high sys cpu -- remove IOException * optimize FileWriteOutBytes to avoid high sys cpu -- remove IOException in writeOutBytes.size * Revert "optimize FileWriteOutBytes to avoid high sys cpu -- remove IOException in writeOutBytes.size" This reverts commit 965f7421 * Revert "optimize FileWriteOutBytes to avoid high sys cpu -- remove IOException" This reverts commit 149e08c0 * optimize FileWriteOutBytes to avoid high sys cpu -- avoid IOEception never thrown check * Fix size counting to handle IOE in FileWriteOutBytes + tests * remove unused throws IOException in WriteOutBytes.size() * Remove redundant throws IOExcpetion clauses * Parameterize IndexMergeBenchmark Co-authored-by: huanghui.bigrey <huanghui.bigrey@bytedance.com> Co-authored-by: Suneet Saldanha <suneet.saldanha@imply.io>	2020-04-23 20:18:42 -07:00
Gian Merlino	4087a015e8	Datasource doc structure adjustments. (#9716 ) - Reorder both the datasource and query-execution page orderings to table, lookup, union, inline, query, join. (Roughly increasing order of conceptual "fanciness".) - Add more crosslinks from datasource page to query-execution page: one per datasource type.	2020-04-23 16:04:59 -07:00
Maytas Monsereenusorn	16f5ae4405	Add integration tests for kafka ingestion (#9724 ) * add kafka admin and kafka writer * refactor kinesis IT * fix typo refactor * parallel * parallel * parallel * parallel works now * add kafka it * add doc to readme * fix tests * fix failing test * test * test * test * test * address comments * addressed comments	2020-04-22 10:43:34 -07:00
Gian Merlino	479c290fb9	Add QueryResource to log4j2 template. (#9735 )	2020-04-22 09:18:45 -07:00
Maytas Monsereenusorn	cff39892ba	Fixes intermittent failure in ITAutoCompactionTest (#9739 ) * fix intermittent failure in ITAutoCompactionTest * fix typo * update javadoc	2020-04-21 20:56:17 -07:00
calvinhkf	b146f8a2a7	Align library version (#9636 ) * align JUnitParams version 1.1.1,1.0.4 to 1.1.1 * aligin junit version 4.8.1,4.12 to 4.12 * exclude explicitly specified version	2020-04-21 20:19:38 -07:00
Abhishek Radhakrishnan	8abcbf671d	Fix numbered list formatting in markdown. (#9664 )	2020-04-21 20:18:12 -07:00
Clint Wylie	68cc0b2e1c	fixes for inline subqueries when multi-value dimension is present (#9698 ) * fixes for inline subqueries when multi-value dimension is present * fix test * allow missing capabilities for vectorized group by queries to be treated as single dims since it means that column doesnt exist * add comment	2020-04-21 18:44:26 -07:00
Clint Wylie	28f56978ab	web-console clean coverage report on build clean (#9718 )	2020-04-21 17:02:05 -07:00
Jenson	b9ad250c00	Fix misuse of Integer.SIZE in FileWriteOutBytes.writeInt (#9723 ) * change Integer.SIZE to Integer.BYTES in FileWriteOutBytes#writeInt * Add ASF header Co-authored-by: jenson <junstan@paypal.com>	2020-04-19 18:16:53 +08:00
Clint Wylie	e677c62484	document useFilterCNF query context parameter (#9647 ) * document useFilterCNF query context parameter * move context key to QueryContexts * Update .spelling	2020-04-16 22:12:20 -07:00
mcbrewster	b7fdb29423	add joins to column tree menu (#9705 ) * add joins to column tree menu * fix capitalization * add keyword, keep columns if replaced * actually fix capitalization * add keywords	2020-04-16 17:51:59 -07:00
Clint Wylie	b89ad49396	disable group by config applyLimitPushDownToSegment by default (#9711 ) * disable group by config applyLimitPushDownToSegment by default * document	2020-04-16 03:03:35 -07:00
Gian Merlino	42590ae64b	Refresh query docs. (#9704 ) * Refresh query docs. Larger changes: - New doc: querying/datasource.md describes the various kinds of datasources you can use, and has examples for both SQL and native. - New doc: querying/query-execution.md describes how native queries are executed at a high level. It doesn't go into the details of specific query engines or how queries run at a per-segment level. But I think it would be good to add or link that content here in the future. - Refreshed doc: querying/sql.md updated to refer to joins, reformatted a bit, added a new "Query translation" section that explains how queries are translated from SQL to native, and removed configuration details (moved to configuration/index.md). - Refreshed doc: querying/joins.md updated to refer to join datasources. Smaller changes: - Add helpful banners to the top of query documentation pages telling people whether a given page describes SQL, native, or both. - Add SQL metrics to operations/metrics.md. - Add some color and cross-links in various places. - Add native query component docs to the sidebar, and renamed them so they look nicer. - Remove Select query from the sidebar. - Fix Broker SQL configs in configuration/index.md. Remove them from querying/sql.md. - Combined querying/searchquery.md and querying/searchqueryspec.md. * Updates. * Fix numbering. * Fix glitches. * Add new words to spellcheck file. * Assorted changes. * Further adjustments. * Add missing punctuation.	2020-04-15 16:12:20 -07:00
Himanshu	b082262a2a	druid-pac4j:add custom SSL handling to com.nimbusds.oauth2.sdk.http.HTTPRequest objects (#9695 )	2020-04-15 15:59:24 -07:00
Maytas Monsereenusorn	8328d91b30	Add missing integration tests for the compaction by the coordinator (#9644 ) * Add API to trigger a compaction by the coordinator for integration tests * Add missing integration tests for the compaction by the coordinator * address comments	2020-04-15 14:27:33 -07:00
Jihoon Son	b8f7128b2d	Revert "remove ServerDiscoverySelector from DruidLeaderClient (#9481 )" (#9702 ) * Revert "remove ServerDiscoverySelector from DruidLeaderClient (#9481)" This reverts commit 072bbe210f162228e85391b464f114da448df24a. * fix build	2020-04-14 20:42:56 -07:00
Will Salisbury	cda9f41e69	s/S3/GCS/g (#9700 ) fix typo [ at least I hope this was a typo… ]	2020-04-14 18:39:54 -07:00
Chi Cao Minh	2262e33316	Fix flaky web console E2E test (#9685 ) web-console/e2e-tests/tutorial-batch.spec.ts would occasionally timeout between the transition from the data loader "configure schema" and "partition" steps due to missing waits when toggling the rollup setting. Also, fix shellcheck warnings for script/druid.	2020-04-14 15:27:16 -07:00
Maytas Monsereenusorn	d930f04e6a	Test file format extensions for inputSource (orc, parquet) (#9632 ) * Test file format extensions for inputSource (orc, parquet) * Test file format extensions for inputSource (orc, parquet) * fix path * resolve merge conflict * fix typo	2020-04-13 13:03:56 -07:00
Jihoon Son	6a52bdc605	Skip license check for dependency reduced pom files (#9687 )	2020-04-11 18:11:53 -07:00
Chi Cao Minh	e6dd6a4119	Skip node dev dependency vulnerability scan (#9684 ) Since they are not production dependencies, security vulnerabilities in the dev dependencies can be ignored.	2020-04-11 14:24:25 -07:00
Abhishek Radhakrishnan	cbbfd63bed	Add 0.18.0 to .backportrc.json to facilitate backport. (#9661 )	2020-04-11 13:49:04 -07:00
Clint Wylie	0ff926b1a1	fix issue with group by limit pushdown for extractionFn, expressions, joins, etc (#9662 ) * fix issue with group by limit pushdown for extractionFn, expressions, joins, etc * remove unused * fix test * revert unintended change * more tests * consider capabilities for StringGroupByColumnSelectorStrategy * fix test * fix and more test * revert because im scared	2020-04-11 01:18:11 -07:00
Jihoon Son	1b60148ec6	Missing license changes for sources in licenses.yaml (#9678 )	2020-04-10 23:06:33 -07:00
Gian Merlino	5249155284	Fix off-by-one in IndexedTableJoinMatcher.getCardinality. (#9674 ) * Fix off-by-one in IndexedTableJoinMatcher.getCardinality. It would report a cardinality that is one lower than the actual cardinality. The missing value is the phantom null that can be generated by outer joins. * Fix tests.	2020-04-10 18:11:05 -07:00
Himanshu	ca369e5768	druid-pac4j: add ability to use custom ssl trust store while talking to auth server (#9637 ) * druid-pac4j: add ability for custom ssl trust store for talking to auth server * fix nimbusds DefaultResourceRetriever name in comment	2020-04-10 18:01:59 -07:00
Suneet Saldanha	332ca19621	Fix potential integer overflow issues (#9609 ) ApproximateHistogram - seems unlikely SegmentAnalyzer - unclear if this is an actual issue GenericIndexedWriter - unclear if this is an actual issue IncrementalIndexRow and OnheapIncrementalIndex are non-issues becaus it's very unlikely for the number of dims to be large enough to hit the overflow condition	2020-04-10 11:47:08 -07:00
Suneet Saldanha	22d3eed80c	Do not use external input in format strings (#9665 ) https://lgtm.com/rules/7900080/	2020-04-10 10:46:04 -07:00
Suneet Saldanha	bd1cff24a2	Remove no-op assert statement in ClientQuerySegmentWalker (#9607 ) * Remove no-op assert statement The assert statement in ClientQuerySegmentWalker will always be true because of the preceeding while loop which has the same condition. This change removes dead code to fix an error reported by LGTM * Suppress lgtm * cleanup whitespace	2020-04-10 10:41:29 -07:00
Suneet Saldanha	642fe83897	Indexing Service validates externally received taskId (#9666 ) Addresses issues flagged by https://lgtm.com/rules/5970070/	2020-04-10 10:36:26 -07:00
Suneet Saldanha	1ced3b33fb	IntelliJ inspections cleanup (#9339 ) * IntelliJ inspections cleanup * Standard Charset object can be used * Redundant Collection.addAll() call * String literal concatenation missing whitespace * Statement with empty body * Redundant Collection operation * StringBuilder can be replaced with String * Type parameter hides visible type * fix warnings in test code * more test fixes * remove string concatenation inspection error * fix extra curly brace * cleanup AzureTestUtils * fix charsets for RangerAdminClient * review comments	2020-04-10 10:04:40 -07:00
Jihoon Son	e157fb089a	Fix wrong cardinality computation in BufferArrayGrouper (#9655 ) * Fix wrong cardinality computation in BufferArrayGrouper * fix javadoc	2020-04-10 09:05:38 -07:00
yuanli	8ccc0b241a	Fix some flaws of KafkaEmitter (#9573 ) * fix flaws of KafkaEmitter * fix flaws of KafkaEmitter * fix flaws of KafkaEmitter * Update extensions-contrib/kafka-emitter/src/main/java/org/apache/druid/emitter/kafka/KafkaEmitter.java Co-Authored-By: Himanshu <g.himanshu@gmail.com> * Update extensions-contrib/kafka-emitter/src/main/java/org/apache/druid/emitter/kafka/KafkaEmitter.java Co-Authored-By: Himanshu <g.himanshu@gmail.com> Co-authored-by: Himanshu <g.himanshu@gmail.com>	2020-04-09 23:31:32 -07:00
Suneet Saldanha	65de636893	Fix potential integer overflow in BufferArrayGrouper (#9605 ) This change fixes a potential integer overflow in BufferArrayGrouper that was flagged by LGTM. It also adds a check that the vectorized arrays are initialized before aggregateVector is called. The changes in HashTableUtils should not have any effect since the numbers being multiplied are small, but the change will remove the warnings from being flagged in LGTM.	2020-04-09 17:46:15 -07:00
Suneet Saldanha	9888268000	Suppress LGTM warnings about stack trace exposure (#9631 ) Since Druid is an open source project, these warnings are not concerning as the information it may potentially leak is already available in the open.	2020-04-09 17:31:03 -07:00
Gian Merlino	75c543b50f	SQL: More straightforward handling of join planning. (#9648 ) * SQL: More straightforward handling of join planning. Two changes that simplify how joins are planned: 1) Stop using JoinProjectTransposeRule as a way of guiding subquery removal. Instead, add logic to DruidJoinRule that identifies removable subqueries and removes them at the point of creating a DruidJoinQueryRel. This approach reduces the size of the planning space and allows the planner to complete quickly. 2) Remove rules that reorder joins. Not because of an impact on the planning time (it seems minimal), but because the decisions that the planner was making in the new tests were sometimes worse than the user-provided order. I think we'll need to go with the user-provided order for now, and revisit reordering when we can add more smarts to the cost estimator. A third change updates numeric ExprEval classes to store their value as a boxed type that corresponds to what it is supposed to be. This is useful because it affects the behavior of "asString", and is included in this patch because it is needed for the new test "testInnerJoinTwoLookupsToTableUsingNumericColumnInReverse". This test relies on CAST('6', 'DOUBLE') stringifying to "6.0" like an actual double would. Fixes #9646. * Fix comments. * Fix tests.	2020-04-09 16:21:43 -07:00
Chi Cao Minh	eb45981b60	Upgrade netty 4 to fix CVE-2020-11612 (#9651 )	2020-04-09 13:26:14 -07:00
Chi Cao Minh	84c1c2505d	Web console basic end-to-end-test (#9595 ) Load data and query (i.e., automate https://druid.apache.org/docs/latest/tutorials/tutorial-batch.html) to have some basic checks ensuring the web console is wired up to druid correctly. The new end-to-end tests (tutorial-batch.spec.ts) are added to `web-console/e2e-tests`. Within that directory: - `components` represent the various tabs of the web console. Currently, abstractions for `load data`, `ingestion`, `datasources`, and `query` are implemented. - `components/load-data/data-connector` contains abstractions for the different data source options available to the data loader's `Connect` step. Currently, only the `Local file` data source connector is implemented. - `components/load-data/config` contains abstractions for the different configuration options available for each step of the data loader flow. Currently, the `Configure Schema`, `Partition`, and `Publish` steps have initial implementation of their configuration options. - `util` contains various helper methods for the tests and does not contain abstractions of the web console. Changes to add the new tests to CI: - `.travis.yml`: New "web console end-to-end tests" job - `web-console/jest.*.js`: Refactor jest configurations to have different flavors for unit tests and for end-to-end tests. In particular, the latter adds a jest setup configuration to wait for the web console to be ready (`web-console/e2e-tests/util/setup.ts`). - `web-console/package.json`: Refactor run scripts to add new script for running end-to-end tests. - `web-console/script/druid`: Utility scripts for building, starting, and stopping druid. Other changes: - `pom.xml`: Refactor various settings disable java static checks and to disable java tests into two new maven profiles. Since the same settings are used in several places (e.g., .travis.yml, Dockerfiles, etc.), having them in maven profiles makes it more maintainable. 
- `web-console/src/console-application.tsx`: Fix typo ("the the").	2020-04-09 12:38:09 -07:00
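
Note on commit b9ad250c00 (#9723) above: its message says `Integer.SIZE` was changed to `Integer.BYTES` in `FileWriteOutBytes#writeInt`. The two constants differ by a factor of eight, because `Integer.SIZE` is the width of an `int` in bits and `Integer.BYTES` is the width in bytes. The sketch below only illustrates that distinction. `IntWidthDemo` and `encodeInt` are hypothetical names made up for this note and are not Druid code.

```java
import java.nio.ByteBuffer;

public class IntWidthDemo {
    // Hypothetical helper for illustration only; this is not Druid's FileWriteOutBytes.
    // Encodes one int into a buffer sized in bytes, which is the unit buffers work in.
    static byte[] encodeInt(int value) {
        ByteBuffer buffer = ByteBuffer.allocate(Integer.BYTES);
        buffer.putInt(value);
        return buffer.array();
    }

    public static void main(String[] args) {
        // Integer.SIZE counts bits, Integer.BYTES counts bytes.
        System.out.println("Integer.SIZE  = " + Integer.SIZE);
        System.out.println("Integer.BYTES = " + Integer.BYTES);
        System.out.println("encoded length = " + encodeInt(42).length);
    }
}
```

Any size check or position arithmetic on a byte buffer should therefore use `Integer.BYTES`; using `Integer.SIZE` in its place would treat a 4-byte value as if it occupied 32 bytes.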

10297 Commits