druid

Commit Graph

Author	SHA1	Message	Date
zachjsh	e855c7fe1b	Allow Cloud Deep Storage configs without segment bucket or path specified (#9588 ) * Allow Cloud SegmentKillers to be instantiated without segment bucket or path This change fixes a bug that was introduced that causes ingestion to fail if data is ingested from one of the supported cloud storages (Azure, Google, S3), and the user is using another type of storage for deep storage. In this case the all segment killer implementations are instantiated. A change recently made forced a dependency between the supported cloud storage type SegmentKiller classes and the deep storage configuration for that storage type being set, which forced the deep storage bucket and prefix to be non-null. This caused a NullPointerException to be thrown when instantiating the SegmentKiller classes during ingestion. To fix this issue, the respective deep storage segment configs for the cloud storage types supported in druid are now allowed to have nullable bucket and prefix configurations * * Allow google deep storage bucket to be null	2020-04-01 11:57:32 -07:00
Jihoon Son	0da8ffc3ff	Bump up development version to 0.19.0-SNAPSHOT (#9586 )	2020-03-30 16:24:04 -07:00
Himanshu	839379246a	remove commons-lang3 usage from DoubleMeanAggregatorFactoryTest (#9578 )	2020-03-30 14:31:50 -07:00
Clint Wylie	fa5da6693c	add lane enforcement for joinish queries (#9563 ) * add lane enforcement for joinish queries * oops * style * review stuffs	2020-03-30 11:58:16 -07:00
Chi Cao Minh	c0195a19e4	Fix HDFS input source split (#9574 ) Fixes an issue where splitting an HDFS input source for use in native parallel batch ingestion would cause the subtasks to get a split with an invalid HDFS path.	2020-03-28 15:45:57 -07:00
Stanislav Poryadnyi	9081b5f25c	fix MAX_INTERMEDIATE_SIZE for DoubleMeanHolder (#9568 ) * fix MAX_INTERMEDIATE_SIZE for DoubleMeanHolder * byte[] type handling in deserialize and finalizeComputation for DoubleMeanAggregatorFactory * DoubleMeanAggregatorFactory tests: Max Intermediate Size, Deserialize, finalizeComputation * moved byte[] check to first position Co-authored-by: Stanislav <S.Poryadnyi@abcconsulting.ru>	2020-03-27 22:26:31 -07:00
Xavier Léauté	b4ad3d0d88	fix nullhandling exceptions related to test ordering (#9570 ) * fix nullhandling exceptions related to test ordering Tests might get executed in different order depending on the maven version and the test environment. This may lead to "NullHandling module not initialized" errors for some tests where we do not initialize null-handling explicitly. * use InitializedNullHandlingTest	2020-03-27 09:46:31 -07:00
Clint Wylie	2c49f6d89a	error on value counter overflow instead of writing sad segments (#9559 )	2020-03-26 16:54:48 -07:00
Suneet Saldanha	e6e2836b0e	Instructions to run integration tests against quickstart (#9560 ) * Instructions to run integration tests against quickstart * Address review comments * actually exclude the test group * Revert "actually exclude the test group" This reverts commit `66f366409e`. * update comment	2020-03-26 13:22:53 -07:00
Suneet Saldanha	55c08e0746	DruidSegmentReader should work if timestamp is specified as a dimension (#9530 ) * DruidSegmentReader should work if timestamp is specified as a dimension * Add integration tests Tests for compaction and re-indexing a datasource with the timestamp column * Instructions to run integration tests against quickstart * address pr	2020-03-25 13:47:34 -07:00
Maytas Monsereenusorn	3f521943fc	S3 ingestion spec should not use the default credentials provider chain when environment value password provider is misconfigured. (#9552 ) * fix s3 optional cred * S3 ingestion spec uses the default credentials provider chain when environment value password provider is misconfigured. * fix failing test	2020-03-24 15:09:02 -07:00
mcbrewster	e1b201c279	Add view values to lookup actions menu (#9549 ) * add test, add query * jest -u * add limit, explicitly get columns, remove map	2020-03-24 09:57:33 -07:00
Neil Volungis	0ac875a8b4	Update docker.md readme to note memory requirements (#9529 ) * Update docker.md readme to note memory requirements * Fix grammatical error Co-Authored-By: Suneet Saldanha <44787917+suneet-s@users.noreply.github.com> Co-authored-by: Suneet Saldanha <44787917+suneet-s@users.noreply.github.com>	2020-03-24 03:33:29 -07:00
Maytas Monsereenusorn	e97695d9da	fix Hadoop ingestion fails due to error 'JavaScript is disabled' on certain config (#9553 ) * fix Hadoop ingestion fails due to error 'JavaScript is disabled', if determine partition hadoop job is run * add test * fix checkstyle * address comments * address comments	2020-03-23 23:09:21 -07:00
JaeGeun	57018adf23	change backtick() and fix broken links (#9550 )	2020-03-23 20:57:03 -07:00
Clint Wylie	2bc29543e5	modify QueryCapacityExceededException to provide better messaging (#9547 ) * modify QueryCapacityExceededException to provide better messaging * style	2020-03-23 20:05:11 -07:00
Clint Wylie	bf85ea19b2	roaring bitmaps by default (#9548 ) * it is finally time * fix it * more docs * fix doc	2020-03-23 18:15:57 -07:00
Himanshu	5604ac7963	druid extension for OpenID Connect auth using pac4j lib (#8992 ) * druid pac4j security extension for OpenID Connect OAuth 2.0 authentication * update version in druid-pac4j pom * introducing unauthorized resource filter * authenticated but authorized /unified-webconsole.html * use httpReq.getRequestURI() for matching callback path * add documentation * minor doc addition * licesne file updates * make dependency analyze succeed * fix doc build * hopefully fixes doc build * hopefully fixes license check build * yet another try on fixing license build * revert unintentional changes to website folder * update version to 0.18.0-SNAPSHOT * check session and its expiry on each request * add crypto service * code for encrypting the cookie * update doc with cookiePassphrase * update license yaml * make sessionstore in Pac4jFilter private non static * make Pac4jFilter fields final * okta: use sha256 for hmac * remove incubating * add UTs for crypto util and session store impl * use standard charsets * add license header * remove unused file * add org.objenesis.objenesis to license.yaml * a bit of nit changes in CryptoService and embedding EncryptionResult for clarity * rename alg to cipherAlgName * take cipher alg name, mode and padding as input * add java doc for CryptoService and make it more understandable * another UT for CryptoService * cache pac4j Config * use generics clearly in Pac4jSessionStore * update cookiePassphrase doc to mention PasswordProvider * mark stuff Nullable where appropriate in Pac4jSessionStore * update doc to mention jdbc * add error log on reaching callback resource * javadoc for Pac4jCallbackResource * introduce NOOP_HTTP_ACTION_ADAPTER * add correct module name in license file * correct extensions folder name in licenses.yaml * replace druid-kubernetes-extensions to druid-pac4j * cache SecureRandom instance * rename UnauthorizedResourceFilter to AuthenticationOnlyResourceFilter	2020-03-23 18:15:45 -07:00
Vadim Ogievetsky	cdf4a26904	clean up spec before reopening in data loader (#9536 )	2020-03-23 16:57:51 -07:00
Clint Wylie	d8833316c4	fix broken links (#9537 ) * fix broken links * missing / * adjustment	2020-03-22 17:41:18 -07:00
Gian Merlino	54c9325256	SQL support for joins on subqueries. (#9545 ) * SQL support for joins on subqueries. Changes to SQL module: - DruidJoinRule: Allow joins on subqueries (left/right are no longer required to be scans or mappings). - DruidJoinRel: Add cost estimation code for joins on subqueries. - DruidSemiJoinRule, DruidSemiJoinRel: Removed, since DruidJoinRule can handle this case now. - DruidRel: Remove Nullable annotation from toDruidQuery, because it is no longer needed (it was used by DruidSemiJoinRel). - Update Rules constants to reflect new rules available in our current version of Calcite. Some of these are useful for optimizing joins on subqueries. - Rework cost estimation to be in terms of cost per row, and place all relevant constants in CostEstimates. Other changes: - RowBasedColumnSelectorFactory: Don't set hasMultipleValues. The lack of isComplete is enough to let callers know that columns might have multiple values, and explicitly setting it to true causes ExpressionSelectors to think it definitely has multiple values, and treat the inputs as arrays. This behavior interfered with some of the new tests that involved queries on lookups. - QueryContexts: Add maxSubqueryRows parameter, and use it in druid-sql tests. * Fixes for tests. * Adjustments.	2020-03-22 16:43:55 -07:00
Maytas Monsereenusorn	5f127a1829	Add integration tests for HDFS (#9542 ) * HDFS IT * HDFS IT * HDFS IT * fix checkstyle	2020-03-20 15:46:08 -07:00
zachjsh	4870ad7b56	Azure deep storage does not work with datasource name containing non-ASCII chars (#9525 ) * Azure deep storage does not work with datasource name containing non-ASCII chars Fixed a bug where recording the segment file location fails when using Azure Deep Storage, if the datasource has any special characters * * update jacoco thresholds * * resolve merge conflicts * address review comments	2020-03-19 12:32:35 -07:00
Jihoon Son	1e667362eb	Do not use UnmodifiableList in auto compaction (#9535 )	2020-03-19 11:43:33 -07:00
Clint Wylie	68013fbc64	fix issue where total limit was being applied even when not configured (#9534 ) * fix issue where total limit was being applied even when not configured * fix inspection * add reserved lane name check to manual laning strategy	2020-03-18 18:05:59 -07:00
zachjsh	838735411f	Ability to Delete task logs and segments from Google Storage (#9519 ) * Ability to Delete task logs and segments from Google Storage * implement ability to delete all tasks logs or all task logs written before a particular date when written to Google storage * implement ability to delete all segments from Google deep storage * * Address review comments	2020-03-18 18:00:43 -07:00
zachjsh	b18dd2b7a9	Ability to Delete task logs and segments from Azure Storage (#9523 ) * Ability to Delete task logs and segments from Azure Storage * implement ability to delete all tasks logs or all task logs written before a particular date when written to Azure storage * implement ability to delete all segments from Azure deep storage * * Address review comments	2020-03-18 17:59:17 -07:00
Vadim Ogievetsky	3b536eea7f	Web console: expose props for S3 (#9432 ) * expose props for S3 * added env inputs * add scary warning * use .password * put the warning front and center * Update web-console/src/views/load-data-view/load-data-view.tsx Co-Authored-By: Suneet Saldanha <44787917+suneet-s@users.noreply.github.com> * let prettier rewrap the text Co-authored-by: Suneet Saldanha <44787917+suneet-s@users.noreply.github.com>	2020-03-18 15:32:12 -07:00
Gian Merlino	1ef25a438f	Broker: Add ability to inline subqueries. (#9533 ) * Broker: Add ability to inline subqueries. The main changes: - ClientQuerySegmentWalker: Add ability to inline queries. - Query: Add "getSubQueryId" and "withSubQueryId" methods. - QueryMetrics: Add "subQueryId" dimension. - ServerConfig: Add new "maxSubqueryRows" parameter, which is used by ClientQuerySegmentWalker to limit how many rows can be inlined per query. - IndexedTableJoinMatcher: Allow creating keys on top of unknown types, by assuming they are strings. This is useful because not all types are known for fields in query results. - InlineDataSource: Store RowSignature rather than component parts. Add more zealous "equals" and "hashCode" methods to ease testing. - Moved QuerySegmentWalker test code from CalciteTests and SpecificSegmentsQueryWalker in druid-sql to QueryStackTests in druid-server. Use this to spin up a new ClientQuerySegmentWalkerTest. * Adjustments from CI. * Fix integration test.	2020-03-18 15:06:45 -07:00
Maytas Monsereenusorn	4c620b8f1c	Adding s3, gcs, azure integration tests (#9501 ) * exclude pulling s3 segments for tests that doesnt need it * fix script * fix script * fix script * add s3 test * refactor sample data script * add tests * add tests * add license header * fix failing tests * change bucket and path to config * update integration test readme * fix typo	2020-03-17 03:08:44 -07:00
Jonathan Wei	b1847364b0	More efficient join filter rewrites (#9516 ) * More efficient join filter rewrites * Rebase * Remove unused functions * PR comments, fix compile * Adjust comment * Allow filter rewrite when join condition has LHS expression * Fix inspections * Fix tests	2020-03-16 22:16:14 -07:00
Clint Wylie	142742f291	add kinesis lag metric (#9509 ) * add kinesis lag metric * fixes * heh * do it right this time * more test * split out supervisor report lags into lagMillis, remove latest offsets from kinesis supervisor report since always null, review stuffs	2020-03-16 21:39:53 -07:00
Vadim Ogievetsky	7626be26ca	Web console: add config control for the query context (#9499 ) * add default and mandatory query contexts * added config docs	2020-03-16 14:34:19 -07:00
Maytas Monsereenusorn	09600db8f2	Add the option to start Hadoop docker container when running integration tests (#9513 ) * hadoop docker it * hadoop docker container it * fix hadoop container	2020-03-16 12:04:05 -07:00
Chi Cao Minh	e7b3dd9cd1	Update to mysql connector 5.1.48 (#9514 )	2020-03-16 10:38:31 -07:00
Chi Cao Minh	100d587583	Suppress CWE-400 for node-sass:4.13.1 (#9517 ) The vulnerability is fixed in 4.13.1: https://github.com/sass/node-sass/issues/2816#issuecomment-575136455 But the dependency check plugin thinks its still broken as the affected/fixed versions has not been updated yet on Sonatype OSS Index: https://ossindex.sonatype.org/vuln/c97f4ae7-be1f-4f71-b238-7c095b126e74	2020-03-16 09:42:33 -07:00
Clint Wylie	69af760a19	add manual laning strategy, integration test (#9492 ) * add manual laning strategy, integration test, json config test * share percent conversion method * wrong assert * review stuffs * doc adjustments * more tests * test adjustment * adjust docs * Update index.md	2020-03-13 20:06:55 -07:00
mcbrewster	bcb9a632c7	Web console: update druid-query-toolkit to version 0.4.x (#9500 ) * add support for new version of DQT * update druid-query-toolkit * fix direction css * fix remove * update package * remove useless conditional * bump package * jest -u Co-authored-by: Maggie Brewster <maggiebrewster@implydata20sMBP.attlocal.net>	2020-03-13 18:09:47 -07:00
Clint Wylie	6afd55c8f4	threshold based automatic query prioritization (#9493 ) * threshold based automatic query prioritization * fixes * spelling and fixes * fix docs * spelling * checkstyle * adjustments * doc fix	2020-03-13 01:41:54 -07:00
Chi Cao Minh	6b02991464	Match GREATEST/LEAST function behavior to other DBs (#9488 ) * Match GREATEST/LEAST function behavior Change the behavior of the GREATEST / LEAST functions to be similar to how it is implemented in other databases (as functions instead of aggregators). The GREATEST/LEAST functions are not in the SQL standard, but users will expect behavior similar to what other databases provide. * Match postgres behavior & handle more SQL types * Fix imports	2020-03-12 15:10:11 -07:00
Vadim Ogievetsky	ddc6f87920	Web console: standardize the spec format (#9477 ) * standardize the spec format * fix spec upgrade	2020-03-12 14:21:23 -07:00
Himanshu	1ba1a3c523	fix worker category on Indexer node (#9510 )	2020-03-12 14:11:02 -07:00
Gian Merlino	ff59d2e78b	Move RowSignature from druid-sql to druid-processing and make use of it. (#9508 ) * Move RowSignature from druid-sql to druid-processing and make use of it. 1) Moved (most of) RowSignature from sql to processing. Left behind the SQL-specific stuff in a RowSignatures utility class. It also picked up some new convenience methods along the way. 2) There were a lot of places in the code where Map<String, ValueType> was used to associate columns with type info. These are now all replaced with RowSignature. 3) QueryToolChest's resultArrayFields method is replaced with resultArraySignature, and it now provides type info. * Fix up extensions. * Various fixes	2020-03-12 11:06:44 -07:00
Jonathan Wei	3082b9289a	Fix NPE when using IndexedTable and all left rows are filtered out (#9490 ) * Fix NPE when using IndexedTable and all left rows are filtered out * Fix compile * Add constant for uninitialized current row * Fix checkstyle	2020-03-11 19:23:05 -07:00
Gian Merlino	2ef5c17441	Link up row-based datasources to serving layer. (#9503 ) * Link up row-based datasources to serving layer. - Add SegmentWrangler interface that allows linking of DataSources to Segments. - Add LocalQuerySegmentWalker that uses SegmentWranglers to compute queries on data that is available locally. - Modify ClientQuerySegmentWalker to use LocalQuerySegmentWalker when the base datasource is concrete and not a table. - Add SegmentWranglerModule to the Broker so it has them available and can properly instantiate LocalQuerySegmentWalkers. - Set InlineDataSource and LookupDataSource to concrete, since they can be directly queried now. * Fix tests.	2020-03-11 11:32:27 -07:00
Maytas Monsereenusorn	e9888f41cb	Modify check java version script to indicate experimental support for Java 11 (#9455 ) * Modify check java version script to indicate experimental support for Java 11 * update docs	2020-03-11 09:22:39 -07:00
Maytas Monsereenusorn	9231f2acb3	Integration test compile with Java 8 and run with Java 8 and 11 (#9491 ) * test integration compile with 8 and run with 11 * Integration test compile with Java 8 and run with Java 8 and 11	2020-03-11 09:22:27 -07:00
Gian Merlino	4f085896c6	Ability to directly query row-based datasources. (#9502 ) * Ability to directly query row-based datasources. Includes: - Foundational classes RowBasedSegment, RowBasedStorageAdapter, RowBasedCursor provide a queryable interface on top of a RowBasedColumnSelectorFactory. - Add LookupSegment: A RowBasedSegment that is built on lookup data. - Improve capability reporting in RowBasedColumnSelectorFactory. * Fix import. * Remove unthrown IOException.	2020-03-10 20:39:01 -07:00
Samarth Jain	c74749f0f4	Don't exclude null dimension values from the map based query response (#9438 )	2020-03-10 15:06:03 -07:00
Jihoon Son	7401bb3f93	Improve OvershadowableManager performance (#9441 ) * Use the iterator instead of higherKey(); use the iterator API instead of stream * Fix tests; fix a concurrency bug in timeline * fix test * add tests for findNonOvershadowedObjectsInInterval * fix test * add missing tests; fix a bug in QueueEntry * equals tests * fix test	2020-03-10 13:22:19 -07:00

10226 Commits
