druid

Commit Graph

Author	SHA1	Message	Date
Jian Wang	20fd72bd13	Fix NPE when brokers use custom priority list (#9878)	2020-06-26 17:28:54 -07:00
Jihoon Son	aaee72c781	Allow append to existing datasources when dynamic partitioning is used (#10033 ) * Fill in the core partition set size properly for batch ingestion with dynamic partitioning * incomplete javadoc * Address comments * fix tests * fix json serde, add tests * checkstyle * Set core partition set size for hash-partitioned segments properly in batch ingestion * test for both parallel and single-threaded task * unused variables * fix test * unused imports * add hash/range buckets * some test adjustment and missing json serde * centralized partition id allocation in parallel and simple tasks * remove string partition chunk * revive string partition chunk * fill numCorePartitions for hadoop * clean up hash stuffs * resolved todos * javadocs * Fix tests * add more tests * doc * unused imports * Allow append to existing datasources when dynamic partitioing is used * fix test * checkstyle * checkstyle * fix test * fix test * fix other tests.. * checkstyle * hansle unknown core partitions size in overlord segment allocation * fail to append when numCorePartitions is unknown * log * fix comment; rename to be more intuitive * double append test * cleanup complete(); add tests * fix build * add tests * address comments * checkstyle	2020-06-25 13:37:31 -07:00
Parag Jain	422a8af14e	Fix balancer strategy (#10070 ) * fix server overassignment * fix random balancer strategy, add more tests * comment * added more tests * fix forbidden apis * fix typo	2020-06-25 16:45:00 +05:30
Dylan Wylie	0470fcc9da	change default number of segment loading threads (#9856 ) * change default number of segment loading threads * fix docs * missed file * min -> max for segment loading threads Co-authored-by: Dylan <dwylie@spotx.tv>	2020-06-23 13:56:44 -07:00
Maytas Monsereenusorn	191572ad5e	Add safeguard to make sure new Rules added are aware of Rule usage in loadstatus API (#10054 ) * Add safeguard to make sure new Rules added are aware of Rule usuage in loadstatus API * address comments * address comments * add tests	2020-06-19 20:18:56 -07:00
Clint Wylie	c2f5d453f8	fix topn on string columns with non-sorted or non-unique dictionaries (#10053 ) * fix topn on string columns with non-sorted or non-unique dictionaries * fix metadata tests * refactor, clarify comments and code, fix ci failures	2020-06-19 11:35:18 -07:00
Jonathan Wei	37e150c075	Fix join filter rewrites with nested queries (#10015 ) * Fix join filter rewrites with nested queries * Fix test, inspection, coverage * Remove clauses from group key * Fix import order Co-authored-by: Gian Merlino <gianmerlino@gmail.com>	2020-06-18 21:32:29 -07:00
Jihoon Son	d644a27f1a	Create packed core partitions for hash/range-partitioned segments in native batch ingestion (#10025 ) * Fill in the core partition set size properly for batch ingestion with dynamic partitioning * incomplete javadoc * Address comments * fix tests * fix json serde, add tests * checkstyle * Set core partition set size for hash-partitioned segments properly in batch ingestion * test for both parallel and single-threaded task * unused variables * fix test * unused imports * add hash/range buckets * some test adjustment and missing json serde * centralized partition id allocation in parallel and simple tasks * remove string partition chunk * revive string partition chunk * fill numCorePartitions for hadoop * clean up hash stuffs * resolved todos * javadocs * Fix tests * add more tests * doc * unused imports	2020-06-18 18:40:43 -07:00
Maytas Monsereenusorn	857e5204bf	Coordinator loadstatus API full format does not consider Broadcast rules (#10048 ) * Coordinator loadstatus API full format does not consider Broadcast rules * address comments * fix checkstyle * minor optimization * address comments	2020-06-18 17:52:33 -07:00
Clint Wylie	b5e6569d2c	global table only if joinable (#10041 ) * global table if only joinable * oops * fix style, add more tests * Update sql/src/test/java/org/apache/druid/sql/calcite/schema/DruidSchemaTest.java * better information schema columns, distinguish broadcast from joinable * fix javadoc * fix mistake Co-authored-by: Jihoon Son <jihoonson@apache.org>	2020-06-18 17:32:10 -07:00
Aleksey Plekhanov	2c384b61ff	IntelliJ inspection and checkstyle rule for "Collection.EMPTY_* field accesses replaceable with Collections.empty()" (#9690 ) IntelliJ inspection and checkstyle rule for "Collection.EMPTY_* field accesses replaceable with Collections.empty()" Reverted checkstyle rule * Added tests to pass CI * Codestyle	2020-06-18 09:47:07 -07:00
Maytas Monsereenusorn	1a2620606d	API to verify a datasource has the latest ingested data (#9965 ) * API to verify a datasource has the latest ingested data * API to verify a datasource has the latest ingested data * API to verify a datasource has the latest ingested data * API to verify a datasource has the latest ingested data * API to verify a datasource has the latest ingested data * fix checksyle * API to verify a datasource has the latest ingested data * API to verify a datasource has the latest ingested data * API to verify a datasource has the latest ingested data * API to verify a datasource has the latest ingested data * fix spelling * address comments * fix checkstyle * update docs * fix tests * fix doc * address comments * fix typo * fix spelling * address comments * address comments * fix typo in docs	2020-06-16 20:48:30 -10:00
Clint Wylie	68aa384190	global table datasource for broadcast segments (#10020 ) * global table datasource for broadcast segments * tests * fix * fix test * comments and javadocs * review stuffs * use generated equals and hashcode	2020-06-16 17:58:05 -07:00
Gian Merlino	9330ca9717	Remove LegacyDataSource. (#10037 ) * Remove LegacyDataSource. Its purpose was to enable deserialization of strings into TableDataSources. But we can do this more straightforwardly with Jackson annotations. * Slight test improvement.	2020-06-16 14:40:35 -07:00
Jihoon Son	9a10f8352b	Set the core partition set size properly for batch ingestion with dynamic partitioning (#10012 ) * Fill in the core partition set size properly for batch ingestion with dynamic partitioning * incomplete javadoc * Address comments * fix tests * fix json serde, add tests * checkstyle	2020-06-12 21:39:37 -07:00
Clint Wylie	8a7e7e773a	fix balancer + broadcast segments npe (#10021)	2020-06-12 13:09:22 -07:00
Jonathan Wei	fe2f656427	Fix broadcast rule drop and docs (#10019 ) * Fix broadcast rule drop and docs * Remove racy test check * Don't drop non-broadcast segments on tasks, add overshadowing handling * Don't use realtimes for overshadowing * Fix dropping for ingestion services	2020-06-12 02:33:28 -07:00
Stefan Birkner	369ed2503e	Remove duplicate parameters from test (#10022 ) Commit `771870ae2d` removed constructor arguments from the rules. Therefore multiple parameters of the test are now the same and can be removed.	2020-06-11 14:15:02 -07:00
Clint Wylie	96eb69e475	ignore brokers in broker views (#10017)	2020-06-10 12:29:30 -07:00
Clint Wylie	f8b643ec72	make joinables closeable (#9982 ) * make joinables closeable * tests and adjustments * refactor to make join stuffs impelement ReferenceCountedObject instead of Closable, more tests * fixes * javadocs and stuff * fix bugs * more test * fix lgtm alert * simplify * fixup javadoc * review stuffs * safeguard against exceptions * i hate this checkstyle rule * make IndexedTable extend Closeable	2020-06-09 20:12:36 -07:00
Atul Mohan	17cf8ea8f2	Add Sql InputSource (#9449 ) * Add Sql InputSource * Add spelling * Use separate DruidModule * Change module name * Fix docs * Use sqltestutils for tests * Add additional tests * Fix inspection * Add module test * Fix md in docs * Remove annotation Co-authored-by: Atul Mohan <atulmohan@yahoo-inc.com>	2020-06-09 12:55:20 -07:00
Jonathan Wei	771870ae2d	Load broadcast datasources on broker and tasks (#9971 ) * Load broadcast datasources on broker and tasks * Add javadocs * Support HTTP segment management * Fix indexer maxSize * inspection fix * Make segment cache optional on non-historicals * Fix build * Fix inspections, some coverage, failed tests * More tests * Add CliIndexer to MainTest * Fix inspection * Rename UnprunedDataSegment to LoadableDataSegment * Address PR comments * Fix	2020-06-08 20:15:59 -07:00
Gian Merlino	3dfd7c30c0	Add REGEXP_LIKE, fix bugs in REGEXP_EXTRACT. (#9893 ) * Add REGEXP_LIKE, fix empty-pattern bug in REGEXP_EXTRACT. - Add REGEXP_LIKE function that returns a boolean, and is useful in WHERE clauses. - Fix REGEXP_EXTRACT return type (should be nullable; causes incorrect filter elision). - Fix REGEXP_EXTRACT behavior for empty patterns: should always match (previously, they threw errors). - Improve error behavior when REGEXP_EXTRACT and REGEXP_LIKE are passed non-literal patterns. - Improve documentation of REGEXP_EXTRACT. * Changes based on PR review. * Fix arg check. * Important fixes! * Add speller. * wip * Additional tests. * Fix up tests. * Add validation error tests. * Additional tests. * Remove useless call.	2020-06-03 14:31:37 -07:00
Maytas Monsereenusorn	0d22462e07	Document unsupported Join on multi-value column (#9948 ) * Document Unsupported Join on multi-value column * Document Unsupported Join on multi-value column * address comments * Add unit tests * address comments * add tests	2020-06-03 09:55:52 -10:00
Xavier Léauté	a934b2664c	remove ListenableFutures and revert to using the Guava implementation (#9944 ) This change removes ListenableFutures.transformAsync in favor of the existing Guava Futures.transform implementation. Our own implementation had a bug which did not fail the future if the applied function threw an exception, resulting in the future never completing. An attempt was made to fix this bug, however when running againts Guava's own tests, our version failed another half dozen tests, so it was decided to not continue down that path and scrap our own implementation. Explanation for how was this bug manifested itself: An exception thrown in BaseAppenderatorDriver.publishInBackground when invoked via transformAsync in StreamAppenderatorDriver.publish will cause the resulting future to never complete. This explains why when encountering https://github.com/apache/druid/issues/9845 the task will never complete, forever waiting for the publishFuture to register the handoff. As a result, the corresponding "Error while publishing segments ..." message only gets logged once the index task times out and is forcefully shutdown when the future is force-cancelled by the executor.	2020-06-03 10:46:03 -07:00
Xavier Léauté	acfcfd35b1	fix unsafe concurrent access in StreamAppenderatorDriver (#9943 ) during segment publishing we do streaming operations on a collection not safe for concurrent modification. To guarantee correct results we must also guard any operations on the stream itself. This may explain the issue seen in https://github.com/apache/druid/issues/9845	2020-05-31 09:12:25 -07:00
Clint Wylie	2e9548d93d	refactor SeekableStreamSupervisor usage of RecordSupplier (#9819 ) * refactor SeekableStreamSupervisor usage of RecordSupplier to reduce contention between background threads and main thread, refactor KinesisRecordSupplier, refactor Kinesis lag metric collection and emitting * fix style and test * cleanup, refactor, javadocs, test * fixes * keep collecting current offsets and lag if unhealthy in background reporting thread * review stuffs * add comment	2020-05-16 14:09:39 -07:00
mcbrewster	28be107a1c	add flag to flattenSpec to keep null columns (#9814 ) * add flag to flattenSpec to keep null columns * remove changes to inputFormat interface * add comment * change comment message * update web console e2e test * move keepNullColmns to JSONParseSpec * fix merge conflicts * fix tests * set keepNullColumns to false by default * fix lgtm * change Boolean to boolean, add keepNullColumns to hash, add tests for keepKeepNullColumns false + true with no nuulul columns * Add equals verifier tests	2020-05-08 21:53:39 -07:00
Francesco Nidito	e7e41e3a36	Adding support for autoscaling in GCE (#8987 ) * Adding support for autoscaling in GCE * adding extra google deps also in gce pom * fix link in doc * remove unused deps * adding terms to spelling file * version in pom 0.17.0-incubating-SNAPSHOT --> 0.18.0-SNAPSHOT * GCEXyz -> GceXyz in naming for consistency * add preconditions * add VisibleForTesting annotation * typos in comments * use StringUtils.format instead of String.format * use custom exception instead of exit * factorize interval time between retries * making literal value a constant * iter all network interfaces * use provided on google (non api) deps * adding missing dep * removing unneded this and use Objects methods instead o 3-way if in hash and comparison * adding import * adding retries around getRunningInstances and adding limit for operation end waiting * refactor GceEnvironmentConfig.hashCode * 0.18.0-SNAPSHOT -> 0.19.0-SNAPSHOT * removing unused config * adding tests to hash and equals * adding nullable to waitForOperationEnd * adding testTerminate * adding unit tests for createComputeService * increasing retries in unrelated integration-test to prevent sporadic failure (hopefully) * reverting queryResponseTemplate change * adding comment for Compute.Builder.build() returning null	2020-04-28 03:13:39 -07:00
Clint Wylie	68cc0b2e1c	fixes for inline subqueries when multi-value dimension is present (#9698 ) * fixes for inline subqueries when multi-value dimension is present * fix test * allow missing capabilities for vectorized group by queries to be treated as single dims since it means that column doesnt exist * add comment	2020-04-21 18:44:26 -07:00
Maytas Monsereenusorn	8328d91b30	Add missing integration tests for the compaction by the coordinator (#9644 ) * Add API to trigger a compaction by the coordinator for integration tests * Add missing integration tests for the compaction by the coordinator * address comments	2020-04-15 14:27:33 -07:00
Jihoon Son	b8f7128b2d	Revert "remove ServerDiscoverySelector from DruidLeaderClient (#9481)" (#9702) * Revert "remove ServerDiscoverySelector from DruidLeaderClient (#9481)" This reverts commit `072bbe210f`. * fix build	2020-04-14 20:42:56 -07:00
Gian Merlino	5249155284	Fix off-by-one in IndexedTableJoinMatcher.getCardinality. (#9674 ) * Fix off-by-one in IndexedTableJoinMatcher.getCardinality. It would report a cardinality that is one lower than the actual cardinality. The missing value is the phantom null that can be generated by outer joins. * Fix tests.	2020-04-10 18:11:05 -07:00
Suneet Saldanha	22d3eed80c	Do not use external input in format strings (#9665) https://lgtm.com/rules/7900080/	2020-04-10 10:46:04 -07:00
Suneet Saldanha	bd1cff24a2	Remove no-op assert statement in ClientQuerySegmentWalker (#9607 ) * Remove no-op assert statement The assert statement in ClientQuerySegmentWalker will always be true because of the preceeding while loop which has the same condition. This change removes dead code to fix an error reported by LGTM * Suppress lgtm * cleanup whitespace	2020-04-10 10:41:29 -07:00
Suneet Saldanha	1ced3b33fb	IntelliJ inspections cleanup (#9339 ) * IntelliJ inspections cleanup * Standard Charset object can be used * Redundant Collection.addAll() call * String literal concatenation missing whitespace * Statement with empty body * Redundant Collection operation * StringBuilder can be replaced with String * Type parameter hides visible type * fix warnings in test code * more test fixes * remove string concatenation inspection error * fix extra curly brace * cleanup AzureTestUtils * fix charsets for RangerAdminClient * review comments	2020-04-10 10:04:40 -07:00
Abhishek Radhakrishnan	08851c0198	Preserve the null values for numeric type dimensions post-compaction. (#9622 ) * Add selector null check to preserve null values as-is. * Fix typo. * add wrapping dimension selector test. * Address review comments. * nit: replace exception type. * uh, float is indeed NOT a special case.	2020-04-08 18:56:06 -07:00
Clint Wylie	7bf2dfa3b1	fix flaky jetty test (#9633)	2020-04-08 10:07:06 -07:00
Maytas Monsereenusorn	b95a1b9878	Fix NPE in RemoteTaskRunner event handler causes JVM shutdown (#9610 ) * Fix NPE in RemoteTaskRunner event handler causes JVM shutdown * address comments * fix compile * fix checkstyle * fix lgtm * fix merge * fix test * fix tests * change scope * address comments * address comments	2020-04-07 14:53:51 -07:00
Clint Wylie	d267b1c414	check paths used for shuffle intermediary data manager get and delete (#9630 ) * check paths used for shuffle intermediary data manager get and delete * add test * newline * meh	2020-04-07 09:47:18 -07:00
Suneet Saldanha	7bf1ebb0b8	Add tests for valid and invalid datasource names (#9614 ) * Add tests for valid and invalid datasource names * code review * clean up dependencies	2020-04-06 16:02:50 -07:00
Clint Wylie	4d277dbf99	Fix double count ssl connection metrics (#9594 ) * fix double counted jetty/numOpenConnections metric for ssl connections * tests * more better * style	2020-04-03 23:29:23 -07:00
Jihoon Son	0da8ffc3ff	Bump up development version to 0.19.0-SNAPSHOT (#9586)	2020-03-30 16:24:04 -07:00
Clint Wylie	fa5da6693c	add lane enforcement for joinish queries (#9563 ) * add lane enforcement for joinish queries * oops * style * review stuffs	2020-03-30 11:58:16 -07:00
Clint Wylie	2bc29543e5	modify QueryCapacityExceededException to provide better messaging (#9547 ) * modify QueryCapacityExceededException to provide better messaging * style	2020-03-23 20:05:11 -07:00
Clint Wylie	bf85ea19b2	roaring bitmaps by default (#9548 ) * it is finally time * fix it * more docs * fix doc	2020-03-23 18:15:57 -07:00
Himanshu	5604ac7963	druid extension for OpenID Connect auth using pac4j lib (#8992 ) * druid pac4j security extension for OpenID Connect OAuth 2.0 authentication * update version in druid-pac4j pom * introducing unauthorized resource filter * authenticated but authorized /unified-webconsole.html * use httpReq.getRequestURI() for matching callback path * add documentation * minor doc addition * licesne file updates * make dependency analyze succeed * fix doc build * hopefully fixes doc build * hopefully fixes license check build * yet another try on fixing license build * revert unintentional changes to website folder * update version to 0.18.0-SNAPSHOT * check session and its expiry on each request * add crypto service * code for encrypting the cookie * update doc with cookiePassphrase * update license yaml * make sessionstore in Pac4jFilter private non static * make Pac4jFilter fields final * okta: use sha256 for hmac * remove incubating * add UTs for crypto util and session store impl * use standard charsets * add license header * remove unused file * add org.objenesis.objenesis to license.yaml * a bit of nit changes in CryptoService and embedding EncryptionResult for clarity * rename alg to cipherAlgName * take cipher alg name, mode and padding as input * add java doc for CryptoService and make it more understandable * another UT for CryptoService * cache pac4j Config * use generics clearly in Pac4jSessionStore * update cookiePassphrase doc to mention PasswordProvider * mark stuff Nullable where appropriate in Pac4jSessionStore * update doc to mention jdbc * add error log on reaching callback resource * javadoc for Pac4jCallbackResource * introduce NOOP_HTTP_ACTION_ADAPTER * add correct module name in license file * correct extensions folder name in licenses.yaml * replace druid-kubernetes-extensions to druid-pac4j * cache SecureRandom instance * rename UnauthorizedResourceFilter to AuthenticationOnlyResourceFilter	2020-03-23 18:15:45 -07:00
Gian Merlino	54c9325256	SQL support for joins on subqueries. (#9545 ) * SQL support for joins on subqueries. Changes to SQL module: - DruidJoinRule: Allow joins on subqueries (left/right are no longer required to be scans or mappings). - DruidJoinRel: Add cost estimation code for joins on subqueries. - DruidSemiJoinRule, DruidSemiJoinRel: Removed, since DruidJoinRule can handle this case now. - DruidRel: Remove Nullable annotation from toDruidQuery, because it is no longer needed (it was used by DruidSemiJoinRel). - Update Rules constants to reflect new rules available in our current version of Calcite. Some of these are useful for optimizing joins on subqueries. - Rework cost estimation to be in terms of cost per row, and place all relevant constants in CostEstimates. Other changes: - RowBasedColumnSelectorFactory: Don't set hasMultipleValues. The lack of isComplete is enough to let callers know that columns might have multiple values, and explicitly setting it to true causes ExpressionSelectors to think it definitely has multiple values, and treat the inputs as arrays. This behavior interfered with some of the new tests that involved queries on lookups. - QueryContexts: Add maxSubqueryRows parameter, and use it in druid-sql tests. * Fixes for tests. * Adjustments.	2020-03-22 16:43:55 -07:00
Jihoon Son	1e667362eb	Do not use UnmodifiableList in auto compaction (#9535)	2020-03-19 11:43:33 -07:00
Clint Wylie	68013fbc64	fix issue where total limit was being applied even when not configured (#9534 ) * fix issue where total limit was being applied even when not configured * fix inspection * add reserved lane name check to manual laning strategy	2020-03-18 18:05:59 -07:00
Gian Merlino	1ef25a438f	Broker: Add ability to inline subqueries. (#9533 ) * Broker: Add ability to inline subqueries. The main changes: - ClientQuerySegmentWalker: Add ability to inline queries. - Query: Add "getSubQueryId" and "withSubQueryId" methods. - QueryMetrics: Add "subQueryId" dimension. - ServerConfig: Add new "maxSubqueryRows" parameter, which is used by ClientQuerySegmentWalker to limit how many rows can be inlined per query. - IndexedTableJoinMatcher: Allow creating keys on top of unknown types, by assuming they are strings. This is useful because not all types are known for fields in query results. - InlineDataSource: Store RowSignature rather than component parts. Add more zealous "equals" and "hashCode" methods to ease testing. - Moved QuerySegmentWalker test code from CalciteTests and SpecificSegmentsQueryWalker in druid-sql to QueryStackTests in druid-server. Use this to spin up a new ClientQuerySegmentWalkerTest. * Adjustments from CI. * Fix integration test.	2020-03-18 15:06:45 -07:00
Jonathan Wei	b1847364b0	More efficient join filter rewrites (#9516 ) * More efficient join filter rewrites * Rebase * Remove unused functions * PR comments, fix compile * Adjust comment * Allow filter rewrite when join condition has LHS expression * Fix inspections * Fix tests	2020-03-16 22:16:14 -07:00
Clint Wylie	69af760a19	add manual laning strategy, integration test (#9492 ) * add manual laning strategy, integration test, json config test * share percent conversion method * wrong assert * review stuffs * doc adjustments * more tests * test adjustment * adjust docs * Update index.md	2020-03-13 20:06:55 -07:00
Clint Wylie	6afd55c8f4	threshold based automatic query prioritization (#9493 ) * threshold based automatic query prioritization * fixes * spelling and fixes * fix docs * spelling * checkstyle * adjustments * doc fix	2020-03-13 01:41:54 -07:00
Gian Merlino	ff59d2e78b	Move RowSignature from druid-sql to druid-processing and make use of it. (#9508 ) * Move RowSignature from druid-sql to druid-processing and make use of it. 1) Moved (most of) RowSignature from sql to processing. Left behind the SQL-specific stuff in a RowSignatures utility class. It also picked up some new convenience methods along the way. 2) There were a lot of places in the code where Map<String, ValueType> was used to associate columns with type info. These are now all replaced with RowSignature. 3) QueryToolChest's resultArrayFields method is replaced with resultArraySignature, and it now provides type info. * Fix up extensions. * Various fixes	2020-03-12 11:06:44 -07:00
Gian Merlino	2ef5c17441	Link up row-based datasources to serving layer. (#9503 ) * Link up row-based datasources to serving layer. - Add SegmentWrangler interface that allows linking of DataSources to Segments. - Add LocalQuerySegmentWalker that uses SegmentWranglers to compute queries on data that is available locally. - Modify ClientQuerySegmentWalker to use LocalQuerySegmentWalker when the base datasource is concrete and not a table. - Add SegmentWranglerModule to the Broker so it has them available and can properly instantiate . LocalQuerySegmentWalkers. - Set InlineDataSource and LookupDataSource to concrete, since they can be directly queried now. * Fix tests.	2020-03-11 11:32:27 -07:00
Jihoon Son	7401bb3f93	Improve OvershadowableManager performance (#9441 ) * Use the iterator instead of higherKey(); use the iterator API instead of stream * Fix tests; fix a concurrency bug in timeline * fix test * add tests for findNonOvershadowedObjectsInInterval * fix test * add missing tests; fix a bug in QueueEntry * equals tests * fix test	2020-03-10 13:22:19 -07:00
Himanshu	75a5591448	remove old unused zookeeper dependent lookups code (#9480 ) * remove old unused zookeeper dependent lookups code * make intellij inspector happy	2020-03-10 12:12:48 -07:00
Clint Wylie	8b9fe6f584	query laning and load shedding (#9407 ) * prototype * merge QueryScheduler and QueryManager * everything in its right place * adjustments * docs * fixes * doc fixes * use resilience4j instead of semaphore * more tests * simplify * checkstyle * spelling * oops heh * remove unused * simplify * concurrency tests * add SqlResource tests, refactor error response * add json config tests * use LongAdder instead of AtomicLong * remove test only stuffs from scheduler * javadocs, etc * style * partial review stuffs * adjust * review stuffs * more javadoc * error response documentation * spelling * preserve user specified lane for NoSchedulingStrategy * more test, why not * doc adjustment * style * missed review for make a thing a constant * fixes and tests * fix test * Update docs/configuration/index.md Co-Authored-By: sthetland <steve.hetland@imply.io> * doc update Co-authored-by: sthetland <steve.hetland@imply.io>	2020-03-10 02:57:16 -07:00
Jonathan Wei	0136dba95d	Add option to control join filter rewrites (#9472 ) * Add option to control join filter rewrites * Fix inspections	2020-03-09 17:36:07 -07:00
Himanshu	072bbe210f	remove ServerDiscoverySelector from DruidLeaderClient (#9481)	2020-03-09 12:13:59 -07:00
Lijia Liu	063811710e	#8690 use utc interval when create pedding segments (#9142 ) Co-authored-by: Gian Merlino <gianmerlino@gmail.com>	2020-02-26 13:20:59 -08:00
Clint Wylie	6d8dd5ec10	string -> expression -> string -> expression (#9367 ) * add Expr.stringify which produces parseable expression strings, parser support for null values in arrays, and parser support for empty numeric arrays * oops, macros are expressions too * style * spotbugs * qualified type arrays * review stuffs * simplify grammar * more permissive array parsing * reuse expr joiner * fix it	2020-02-21 15:43:02 -08:00
Chi Cao Minh	e7eb45e648	Run IntelliJ inspections on Travis (#9179 ) * Run IntelliJ inspections on Travis Running IntelliJ inspections currently takes about 90 minutes, but they can be run in about 30 minutes on Travis. * Restore assert statements	2020-02-19 11:34:19 +03:00
Jihoon Son	3bb9e7e53a	Inject things instead of subclassing everything for parallel task testing (#9353 ) * Inject things instead of subclassing everything for parallel task testing * javadoc * fix compilation * fix wrong merge * Address comments	2020-02-16 13:00:12 -08:00
Atul Mohan	043abd5529	Fix compatibility issues with SqlFirehose (#9365 ) * Make SqlFirehose compatible with FiniteFirehose * Fix build	2020-02-14 17:45:12 -08:00
Jihoon Son	bcf8f91e46	Add unit tests for CoordinatorRuleManager (#9318)	2020-02-13 19:29:57 -08:00
Adam Peck	e9aebd994a	Fix for building in Eclipse & VS Code. (#7481 ) Fixes #6866 Reverse dependencies from /main/ to /test/ Add generated-test-sources to source path for Eclipse	2020-02-13 14:58:32 -08:00
Suneet Saldanha	b1f38131af	Fix timestamp extract fn to match postgreSQL (#9337 ) * Fix timestamp extract fn to match postgres Update the timestamp extract function so that it matches the PostgreSQL docs. Examples from the PostgreSQL docs were added as tests for DECADE, CENTURY and MILLENIUM extraction. There were bugs in CENTURY and MILLENIUM that were spotted because of intelliJ inspections - 'Integer division in floating point context' * Update CalciteQueryTest * remove useless round * mark integer division as an error	2020-02-12 15:39:19 -08:00
Maytas Monsereenusorn	c30579e47b	ANY Aggregator should not skip null values implementation (#9317 ) * ANY Aggregator should not skip null values implementation * add tests * add more tests * Update documentation * add more tests * address review comments * optimize StringAnyBufferAggregator * fix failing tests * address pr comments	2020-02-12 14:01:41 -08:00
Jonathan Wei	b2c00b3a79	Add query context option to disable join filter push down (#9335)	2020-02-11 15:31:34 -08:00
Clint Wylie	831ec172f1	Logging large segment list handling (#9312 ) * better handling of large segment lists in logs * more * adjust * exceptions * fixes * refactor * debug * heh * dang	2020-02-07 21:42:45 -08:00
Jihoon Son	e81230f9ab	Refactoring some codes around ingestion (#9274 ) * Refactoring codes around ingestion: - Parallel index task and simple task now use the same segment allocator implementation. This is reusable for the future implementation as well. - Added PartitionAnalysis to store the analysis of the partitioning - Move some util methods to SegmentLockHelper and rename it to TaskLockHelper * fix build * fix SingleDimensionShardSpecFactory * optimize SingledimensionShardSpecFactory * fix test * shard spec builder * import order * shardSpecBuilder -> partialShardSpec * build -> complete * fix comment; add unit tests for partitionBoundaries * add more tests and fix javadoc * fix toString(); add serde tests for HashBasedNumberedPartialShardSpec and SegmentAllocateAction * fix test * add equality test for hash and range partial shard specs	2020-02-07 16:23:07 -08:00
Lucas Capistrant	53bb45fc9a	Forbid easily misused HashSet and HashMap constructors (#9165 ) * Forbid easily misused HashSet and HashMap constructors * Add two LinkedHashMap constructors to forbidden-apis and create utility method as replacement for them * Fix visibility of constant in CollectionUtils.java * Make an exception for an instance of LinkedHashMap#<init>(int) because proper sizing is used * revert changes to sql module tests that should be in separate PR * Finish reverting changes to sql module tests that were flagged in checkstyle during CI * Add netty dependency resulting from SupressForbidden	2020-02-07 10:44:09 +03:00
Lucas Capistrant	2e1dbe598c	Create new dynamic config to pause coordinator helpers when needed (#9224 ) * Create new dynamic config to pause coordinator helpers when needed * Fix spelling mistakes flagged in Travis build * Add an integration test for coordinator pause dynamic config * Improve documentation for new dynamic coordinator config and remove un-needed info logs in favor of debug * address naming convention of 'deep store' vs 'deep storage' in new configs doc line * Fix newline at end of configuration index.md * Last try to resolve newline issue in configuration readme * fix spell checks from travis build * Fix another flagges spelling error from Travis	2020-02-05 15:33:42 -08:00
Zhenxiao Luo	98cefc61fa	Not use ConcurrentHashMap in CoordinatorRuleManager.rules (#9302)	2020-02-05 15:33:31 -08:00
Gian Merlino	475b90c3a6	Remove EasyMock dependency from CalciteTests. (#9310 ) * Remove EasyMock dependency from CalciteTests. Useful because CalciteTests is used by other modules (e.g. druid-benchmarks) and we don't want them to have to pull in EasyMock. * CalciteTests no longer needs curator-x-discovery either.	2020-02-04 22:10:17 -08:00
lamber-ken	48b95f02f2	Remove unnecessary casts (#9208 )	2020-02-04 21:52:33 -08:00
Gian Merlino	b411443d22	SQL join support for lookups. (#9294 ) * SQL join support for lookups. 1) Add LookupSchema to SQL, so lookups show up in the catalog. 2) Add join-related rels and rules to SQL, allowing joins to be planned into native Druid queries. * Add two missing LookupSchema calls in tests. * Fix tests. * Fix typo.	2020-01-31 23:51:16 -08:00
Gian Merlino	4963a113dc	Make JoinableFactoryModule tests look at all the actual mappings. (#9295 ) By depending on JoinableFactoryModule.FACTORY_MAPPINGS, we verify that the bound JoinableFactory can actually handle and create all default classes.	2020-01-31 17:15:38 -08:00
Gian Merlino	204ba9966f	Add LookupJoinableFactory. (#9281 ) * Add LookupJoinableFactory. Enables joins where the right-hand side is a lookup. Includes an integration test. Also, includes changes to LookupExtractorFactoryContainerProvider: 1) Add "getAllLookupNames", which will be needed to eventually connect lookups to Druid's SQL catalog. 2) Convert "get" from nullable to Optional return. 3) Swap out most usages of LookupReferencesManager in favor of the simpler LookupExtractorFactoryContainerProvider interface. * Fixes for tests. * Fix another test. * Java 11 message fix. * Fixups. * Fixup benchmark class.	2020-01-30 14:46:21 -08:00
Suneet Saldanha	6b44d4aa80	Add getRightEquiConditionKeys to JoinConditionAnalysis (#9287 ) * Add getRightColumns to JoinConditionAnalysis This change other implementations of JoinableFactory to ask the analysis for the right key columns instead of having to calculate it themselves. * Address some review comments * more code review stuff	2020-01-29 22:31:29 -08:00
Suneet Saldanha	303b02eba1	intelliJ inspections cleanup (#9260 ) * intelliJ inspections cleanup - remove redundant escapes - performance warnings - access static member via instance reference - static method declared final - inner class may be static Most of these changes are aesthetic, however, they will allow inspections to be enabled as part of CI checks going forward The valuable changes in this delta are: - using StringBuilder instead of string addition in a loop indexing-hadoop/.../Utils.java processing/.../ByteBufferMinMaxOffsetHeap.java - Use class variables instead of static variables for parameterized test processing/src/.../ScanQueryLimitRowIteratorTest.java * Add intelliJ inspection warnings as errors to druid profile * one more static inner class	2020-01-29 11:50:52 -08:00
Suneet Saldanha	6ee0afa8e5	Rename MapDataSourceJoinableFactoryWarehouse (#9275 )	2020-01-28 19:00:07 -08:00
Suneet Saldanha	0ccfe5ca89	Expose JoinableFactory through Guice Bindings (#9271 ) * Make JoinableFactory an extension point This change makes it so that extensions can register a JoinableFactory that should be used for a DataSource. Extensions can provide the factories via DruidBinders#joinableFactoryBinder Known DataSources - like InlineDataSource are provided in the JoinableFactoryModule. This module installs a FactoryWarehouse that is used to decide which factory should be used to generate the Joinable for the provided DataSource. The ExtensionPoint is marked as Beta since it is not yet clear if this needs to remain available to other extensions or if the best way to register a factory is by using the datasource class. * Add module test * remove useless bindings in test * remove ExtensionPoint annotation * Make LifecycleLock not final to help with testing	2020-01-28 13:59:06 -08:00
Roman Leventov	b9186f8f9f	Reconcile terminology and method naming to 'used/unused segments'; Rename MetadataSegmentManager to MetadataSegmentsManager (#7306 ) * Reconcile terminology and method naming to 'used/unused segments'; Don't use terms 'enable/disable data source'; Rename MetadataSegmentManager to MetadataSegments; Make REST API methods which mark segments as used/unused to return server error instead of an empty response in case of error * Fix brace * Import order * Rename withKillDataSourceWhitelist to withSpecificDataSourcesToKill * Fix tests * Fix tests by adding proper methods without interval parameters to IndexerMetadataStorageCoordinator instead of hacking with Intervals.ETERNITY * More aligned names of DruidCoordinatorHelpers, rename several CoordinatorDynamicConfig parameters * Rename ClientCompactTaskQuery to ClientCompactionTaskQuery for consistency with CompactionTask; ClientCompactQueryTuningConfig to ClientCompactionTaskQueryTuningConfig * More variable and method renames * Rename MetadataSegments to SegmentsMetadata * Javadoc update * Simplify SegmentsMetadata.getUnusedSegmentIntervals(), more javadocs * Update Javadoc of VersionedIntervalTimeline.iterateAllObjects() * Reorder imports * Rename SegmentsMetadata.tryMark... methods to mark... and make them to return boolean and the numbers of segments changed and relay exceptions to callers * Complete merge * Add CollectionUtils.newTreeSet(); Refactor DruidCoordinatorRuntimeParams creation in tests * Remove MetadataSegmentManager * Rename millisLagSinceCoordinatorBecomesLeaderBeforeCanMarkAsUnusedOvershadowedSegments to leadingTimeMillisBeforeCanMarkAsUnusedOvershadowedSegments * Fix tests, refactor DruidCluster creation in tests into DruidClusterBuilder * Fix inspections * Fix SQLMetadataSegmentManagerEmptyTest and rename it to SqlSegmentsMetadataEmptyTest * Rename SegmentsAndMetadata to SegmentsAndCommitMetadata to reduce the similarity with SegmentsMetadata; Rename some methods * Rename DruidCoordinatorHelper to CoordinatorDuty, refactor DruidCoordinator * Unused import * Optimize imports * Rename IndexerSQLMetadataStorageCoordinator.getDataSourceMetadata() to retrieveDataSourceMetadata() * Unused import * Update terminology in datasource-view.tsx * Fix label in datasource-view.spec.tsx.snap * Fix lint errors in datasource-view.tsx * Doc improvements * Another attempt to please TSLint * Another attempt to please TSLint * Style fixes * Fix IndexerSQLMetadataStorageCoordinator.createUsedSegmentsSqlQueryForIntervals() (wrong merge) * Try to fix docs build issue * Javadoc and spelling fixes * Rename SegmentsMetadata to SegmentsMetadataManager, address other comments * Address more comments	2020-01-27 11:24:29 -08:00
Gian Merlino	19b427e8f3	Add JoinableFactory interface and use it in the query stack. (#9247 ) * Add JoinableFactory interface and use it in the query stack. Also includes InlineJoinableFactory, which enables joining against inline datasources. This is the first patch where a basic join query actually works. It includes integration tests. * Fix test issues. * Adjustments from code review.	2020-01-24 13:10:01 -08:00
Gian Merlino	f0f68570ec	Use DataSourceAnalysis throughout the query stack. (#9239 ) Builds on #9235, using the datasource analysis functionality to replace various ad-hoc approaches. The most interesting changes are in ClientQuerySegmentWalker (brokers), ServerManager (historicals), and SinkQuerySegmentWalker (indexing tasks). Other changes related to improving how we analyze queries: 1) Changes TimelineServerView to return an Optional timeline, which I thought made the analysis changes cleaner to implement. 2) Added QueryToolChest#canPerformSubquery, which is now used by query entry points to determine whether it is safe to pass a subquery dataSource to the query toolchest. Fixes an issue introduced in #5471 where subqueries under non-groupBy-typed queries were silently ignored, since neither the query entry point nor the toolchest did anything special with them. 3) Removes the QueryPlus.withQuerySegmentSpec method, which was mostly being used in error-prone ways (ignoring any potential subqueries, and not verifying that the underlying data source is actually a table). Replaces with a new function, Queries.withSpecificSegments, that includes sanity checks.	2020-01-23 14:07:14 -08:00
Zhenxiao Luo	479c09751c	Add MostAvailableSizeStorageLocationSelectorStrategy (#8879 ) * Add MostAvailableSize LocationSelectorStrategy * Add doc for mostAvailableSize strategy * Fix docs for mostAvailableSize	2020-01-23 13:42:03 -08:00
Gian Merlino	d886463253	Add join-related DataSource types, and analysis functionality. (#9235 ) * Add join-related DataSource types, and analysis functionality. Builds on #9111 and implements the datasource analysis mentioned in #8728. Still can't handle join datasources, but we're a step closer. Join-related DataSource types: 1) Add "join", "lookup", and "inline" datasources. 2) Add "getChildren" and "withChildren" methods to DataSource, which will be used in the future for query rewriting (e.g. inlining of subqueries). DataSource analysis functionality: 1) Add DataSourceAnalysis class, which breaks down datasources into three components: outer queries, a base datasource (left-most of the highest level left-leaning join tree), and other joined-in leaf datasources (the right-hand branches of the left-leaning join tree). 2) Add "isConcrete", "isGlobal", and "isCacheable" methods to DataSource in order to support analysis. Other notes: 1) Renamed DataSource#getNames to DataSource#getTableNames, which I think is clearer. Also, made it a Set, so implementations don't need to worry about duplicates. 2) The addition of "isCacheable" should work around #8713, since UnionDataSource now returns false for cacheability. * Remove javadoc comment. * Updates reflecting code review. * Add comments. * Add more comments.	2020-01-22 14:54:47 -08:00
Gian Merlino	d21054f7c5	Remove the deprecated interval-chunking stuff. (#9216 ) * Remove the deprecated interval-chunking stuff. See https://github.com/apache/druid/pull/6591, https://github.com/apache/druid/pull/4004#issuecomment-284171911 for details. * Remove unused import. * Remove chunkInterval too.	2020-01-19 17:14:23 -08:00
Gian Merlino	bd49ec03bc	Move result-to-array logic from SQL layer into QueryToolChests. (#9130 ) * Move result-to-array logic from SQL layer into QueryToolChests. * Checkstyle adjustment. * Fix typo.	2020-01-16 15:42:10 -08:00
Atul Mohan	b642b1aa5b	Fix deserialization of maxBytesInMemory (#9092 ) * Fix deserialization of maxBytesInMemory * Add maxBytes check	2020-01-12 20:08:07 -08:00
Jonathan Wei	aa539177ec	De-incubation cleanup in code, docs, packaging (#9108 ) * De-incubation cleanup in code, docs, packaging * remove unused docs script	2020-01-03 12:33:19 -05:00
Jonathan Wei	4e8368a5d9	Set version to 0.18.0-SNAPSHOT (#9109 )	2020-01-02 17:55:10 -05:00
Suneet Saldanha	301c0649a7	Fix equalsAndHashCode in ClientCompactQueryTuningConfig (#9035 ) * Fix equalsAndHashCode in ClientCompactQueryTuningConfig This change introduces a dependency to EqualsVerifier for the test scope. The dependency is licensed under Apache 2. The library makes it trivial to add equals and hashCode checks to prevent bugs like this from happening in the future * fix checkstyle * fix test name	2019-12-16 14:33:00 -08:00
Himanshu	9236dd9467	optionally enable Jetty ForwardedRequestCustomizer (#9010 ) * optionally enable Jetty ForwardedRequestCustomizer * fix doc build	2019-12-12 17:00:08 -08:00
Jihoon Son	e5e1e9c4ee	Fix broken master (#9005 ) * Multibinding for NodeRole * Fix endpoints * fix doc * fix test	2019-12-11 15:56:36 -08:00
Jonathan Wei	8af41d7cd0	Update version to 0.18.0-incubating-SNAPSHOT (#9009 )	2019-12-11 14:04:03 -08:00
Parag Jain	24fe824055	add readiness endpoints to processes having initialization delays (#8841 )	2019-12-10 17:26:13 -08:00
Parag Jain	9640f9649a	fix npe while logging sql/query request (#9001 ) * fix npe while logging sql/query request * forbid forbidden DateTime API	2019-12-09 12:02:11 -08:00
Roman Leventov	1c62987783	Add SelfDiscoveryResource; rename org.apache.druid.discovery.No… (#6702 ) * Add SelfDiscoveryResource * Rename org.apache.druid.discovery.NodeType to NodeRole. Refactor CuratorDruidNodeDiscoveryProvider. Make SelfDiscoveryResource to listen to updates only about a single node (itself). * Extended docs * Fix brace * Remove redundant throws in Lifecycle.Handler.stop() * Import order * Remove unresolvable link * Address comments * tmp * tmp * Rollback docker changes * Remove extra .sh files * Move filter * Fix SecurityResourceFilterTest	2019-12-08 18:47:58 +03:00
Clint Wylie	06cd30460e	add query metrics for broker parallel merges, off by default (#8981 ) * add a bunch of metrics for broker parallel merges, off by default, and tests * fix tests * review stuffs * propogateIfPossible	2019-12-06 13:42:53 -08:00
Chi Cao Minh	af74acaa85	Address security vulnerabilities CVSS >= 7 (#8980 ) * Address security vulnerabilities CVSS >= 7 Update dependencies to address security vulnerabilities with CVSS scores of 7 or higher. A new Travis CI job is added to prevent new high/critical security vulnerabilities from being added. Updated dependencies: - api-util 1.0.0 -> 1.0.3 - jackson 2.9.10 -> 2.10.1 - kafka 2.1.0 -> 2.1.1 - libthrift 0.10.0 -> 0.13.0 - protobuf 3.2.0 -> 3.11.0 The following high/critical security vulnerabilities are currently suppressed (so that the new Travis CI job can be added now) and are left as future work to fix: - hibernate-validator:5.2.5 - jackson-mapper-asl:1.9.13 - libthrift:0.6.1 - netty:3.10.6 - nimbus-jose-jwt:4.41.1 * Rename EDL1 license file * Fix inspection errors	2019-12-05 14:34:35 -08:00
Fangyuan Deng	187cf0dd3f	[Improvement] historical fast restart by lazy load columns metadata(20X faster) (#6988 ) * historical fast restart by lazy load columns metadata * delete repeated code * add documentation for druid.segmentCache.lazyLoadOnStart * fix unit test fail * fix spellcheck * update docs * update docs mentioning a catch	2019-12-03 09:47:01 -08:00
Jonathan Wei	00ce18a0ea	Additional Kinesis resharding fixes (#8870 ) * Additional Kinesis resharding fixes * Address PR comments * Remove unused method * Adjust SegmentTransactionalInsertAction null handling * Check for unchanged metadata on empty publish * Add logs for empty publish * Fix javadoc * Clear offset when invalid endOffsets are seen * Fix LGTM alert * Fix build * Add resharding note to Kinesis docs * Checkstyle * Spelling * Address PR comments * Checkstyle	2019-11-28 12:59:01 -08:00
jon-wei	dfbc066163	Revert "[maven-release-plugin] prepare release druid-0.16.1-incubating-rc1" This reverts commit `a0f21d9b07`.	2019-11-27 23:22:43 -08:00
jon-wei	0402ff85b8	Revert "[maven-release-plugin] prepare for next development iteration" This reverts commit `8ffa71e7e6`.	2019-11-27 23:22:32 -08:00
jon-wei	8ffa71e7e6	[maven-release-plugin] prepare for next development iteration	2019-11-27 23:18:48 -08:00
jon-wei	a0f21d9b07	[maven-release-plugin] prepare release druid-0.16.1-incubating-rc1	2019-11-27 23:18:37 -08:00
Chi Cao Minh	fba876b607	Update jackson to 2.9.10 (#8940 ) Addresses security vulnerabilities: - sonatype-2016-0397: https://github.com/FasterXML/jackson-core/issues/315 - sonatype-2017-0355: https://github.com/FasterXML/jackson-core/pull/322	2019-11-26 21:41:14 -08:00
Gian Merlino	e0eb85ace7	Add FileUtils.createTempDir() and enforce its usage. (#8932 ) * Add FileUtils.createTempDir() and enforce its usage. The purpose of this is to improve error messages. Previously, the error message on a nonexistent or unwritable temp directory would be "Failed to create directory within 10,000 attempts". * Further updates. * Another update. * Remove commons-io from benchmark. * Fix tests.	2019-11-22 19:48:49 -08:00
SeKing	9955107e8e	RandomLocationSelectorStrategy to Choose an available disk(location) to store a segment. With unit tests. (#8461 )	2019-11-22 03:46:54 -08:00
Jihoon Son	934547a215	RetryingInputEntity to retry on transient errors (#8923 ) * RetryingInputEntity to retry on transient errors * fix some javadoc and httpEntity * Make it interface * Javadoc for offset	2019-11-21 21:32:18 -08:00
Chi Cao Minh	ff6217365b	Refactor parallel indexing perfect rollup partitioning (#8852 ) * Refactor parallel indexing perfect rollup partitioning Refactoring to make it easier to later add range partitioning for perfect rollup parallel indexing. This is accomplished by adding several new base classes (e.g., PerfectRollupWorkerTask) and new classes for encapsulating logic that needs to be changed for different partitioning strategies (e.g., IndexTaskInputRowIteratorBuilder). The code is functionally equivalent to before except for the following small behavior changes: 1) PartialSegmentMergeTask: Previously, this task had a priority of DEFAULT_TASK_PRIORITY. It now has a priority of DEFAULT_BATCH_INDEX_TASK_PRIORITY (via the new PerfectRollupWorkerTask base class), since it is a batch index task. 2) ParallelIndexPhaseRunner: A decorator was added to subTaskSpecIterator to ensure the subtasks are generated with unique ids. Previously, only tests (i.e., MultiPhaseParallelIndexingTest) would have this decorator, but this behavior is desired for non-test code as well. * Fix forbidden apis and pmd warnings * Fix analyze dependencies warnings * Fix IndexTask json and add IT diags * Fix parallel index supervisor<->worker serde * Fix TeamCity inspection errors/warnings * Fix TeamCity inspection errors/warnings again * Integrate changes with those from #8823 * Address review comments * Address more review comments * Fix forbidden apis * Address more review comments	2019-11-20 17:24:12 -08:00
Jihoon Son	ac6d703814	Support inputFormat and inputSource for sampler (#8901 ) * Support inputFormat and inputSource for sampler * Cleanup javadocs and names * fix style * fix timed shutoff input source reader * fix timed shutoff input source reader again * tidy up timed shutoff reader * unused imports * fix tc	2019-11-20 14:51:25 -08:00
Gian Merlino	c44452f0c1	Tidy up lifecycle, query, and ingestion logging. (#8889 ) * Tidy up lifecycle, query, and ingestion logging. The goal of this patch is to improve the clarity and usefulness of Druid's logging for cluster operators. For more information, see https://twitter.com/cowtowncoder/status/1195469299814555648. Concretely, this patch does the following: - Changes a lot of INFO logs to DEBUG, and DEBUG to TRACE, with the goal of reducing redundancy and improving clarity by avoiding showing rarely-useful log messages. This includes most "starting" and "stopping" messages, and most messages related to individual columns. - Adds new log4j2 templates that show operators how to enabled DEBUG logging for certain important packages. - Eliminate stack traces for query errors, unless log level is DEBUG or more. This is useful because query errors often indicate user error rather than system error, but dumping stack trace often gave operators the impression that there was a system failure. - Adds task id to Appenderator, AppenderatorDriver thread names. In the default log4j2 configuration, this will put them in log lines as well. It's very useful if a user is using the Indexer, where multiple tasks run in the same JVM. - More consistent terminology when it comes to "sequences" (sets of segments that are handed-off together by Kafka ingestion) and "offsets" (cursors in partitions). These terms had been confused in some log messages due to the fact that Kinesis calls offsets "sequence numbers". - Replaces some ugly toString calls with either the JSONification or something more operator-accessible (like a URL or segment identifier, instead of JSON object representing the same). * Adjustments. * Adjust integration test.	2019-11-19 13:57:58 -08:00
Chi Cao Minh	8365bdf62a	Address security vulnerabilities (#8878 ) * Address security vulnerabilities Security vulnerabilities addressed by upgrading 3rd party libs: - Upgrade avro-ipc to 1.9.1 - sonatype-2019-0115 - Upgrade caffeine to 2.8.0 - sonatype-2019-0282 - Upgrade commons-beanutils to 1.9.4 - CVE-2014-0114 - Upgrade commons-codec to 1.13 - sonatype-2012-0050 - Upgrade commons-compress to 1.19 - CVE-2019-12402 - sonatype-2018-0293 - Upgrade hadoop-common to 2.8.5 - CVE-2018-11767 - Upgrade hadoop-mapreduce-client-core to 2.8.5 - CVE-2017-3166 - Upgrade hibernate-validator to 5.2.5 - CVE-2017-7536 - Upgrade httpclient to 4.5.10 - sonatype-2017-0359 - Upgrade icu4j to 55.1 - CVE-2014-8147 - Upgrade jackson-databind to 2.6.7.3: - CVE-2017-7525 - Upgrade jetty-http to 9.4.12: - CVE-2017-7657 - CVE-2017-7658 - CVE-2017-7656 - CVE-2018-12545 - Upgrade log4j-core to 2.8.2 - CVE-2017-5645: - Upgrade netty to 3.10.6 - CVE-2015-2156 - Upgrade netty-common to 4.1.42 - CVE-2019-9518 - Upgrade netty-codec-http to 4.1.42 - CVE-2019-16869 - Upgrade nimbus-jose-jwt to 4.41.1 - CVE-2017-12972 - CVE-2017-12974 - Upgrade plexus-utils to 3.0.24 - CVE-2017-1000487 - sonatype-2015-0173 - sonatype-2016-0398 - Upgrade postgresql to 42.2.8 - CVE-2018-10936 Note that if users are using JDBC lookups with postgres, they may need to update the JDBC jar used by the lookup extension. * Fix license for postgresql	2019-11-19 09:14:33 -08:00
Vadim Ogievetsky	17d773dca2	Web console: replace (and remove) old consoles (#8838 ) * first steps * clean licenses * fix capabilities * fix specs * more tests * new web console on coordinator and overlord, remove setup for old consoles, old configs * better message * update licenses * sync license files * more button * fix tslint issue * jetty-rewrite dependency to add redirects for old console paths * put dependency in the right place * fix overlord detection * fix notices, dedupe licenses * make segment timeline work in no SQL mode * update license * revert hard coded coordinator mode from testing * update restricted mode copy	2019-11-15 19:45:14 -08:00
Jihoon Son	1611792855	Add InputSource and InputFormat interfaces (#8823 ) * Add InputSource and InputFormat interfaces * revert orc dependency * fix dimension exclusions and failing unit tests * fix tests * fix test * fix test * fix firehose and inputSource for parallel indexing task * fix tc * fix tc: remove unused method * Formattable * add needsFormat(); renamed to ObjectSource; pass metricsName for reader * address comments * fix closing resource * fix checkstyle * fix tests * remove verify from csv * Revert "remove verify from csv" This reverts commit `1ea7758489`. * address comments * fix import order and javadoc * flatMap * sampleLine * Add IntermediateRowParsingReader * Address comments * move csv reader test * remove test for verify * adjust comments * Fix InputEntityIteratingReader * rename source -> entity * address comments	2019-11-15 09:22:09 -08:00
Clint Wylie	7aafcf8bca	parallel broker merges on fork join pool (#8578 ) * sketch of broker parallel merges done in small batches on fork join pool * fix non-terminating sequences, auto compute parallelism * adjust benches * adjust benchmarks * now hella more faster, fixed dumb * fix * remove comments * log.info for debug * javadoc * safer block for sequence to yielder conversion * refactor LifecycleForkJoinPool into LifecycleForkJoinPoolProvider which wraps a ForkJoinPool * smooth yield rate adjustment, more logs to help tune * cleanup, less logs * error handling, bug fixes, on by default, more parallel, more tests * remove unused var * comments * timeboundary mergeFn * simplify, more javadoc * formatting * pushdown config * use nanos consistently, move logs back to debug level, bit more javadoc * static terminal result batch * javadoc for nullability of createMergeFn * cleanup * oops * fix race, add docs * spelling, remove todo, add unhandled exception log * cleanup, revert unintended change * another unintended change * review stuff * add ParallelMergeCombiningSequenceBenchmark, fixes * hyper-threading is the enemy * fix initial start delay, lol * parallelism computer now balances partition sizes to partition counts using sqrt of sequence count instead of sequence count by 2 * fix those important style issues with the benchmarks code * lazy sequence creation for benchmarks * more benchmark comments * stable sequence generation time * update defaults to use 100ms target time, 4096 batch size, 16384 initial yield, also update user docs * add jmh thread based benchmarks, cleanup some stuff * oops * style * add spread to jmh thread benchmark start range, more comments to benchmarks parameters and purpose * retool benchmark to allow modeling more typical heterogenous heavy workloads * spelling * fix * refactor benchmarks * formatting * docs * add maxThreadStartDelay parameter to threaded benchmark * why does catch need to be on its own line but else doesnt	2019-11-07 11:58:46 -08:00
Zhenxiao Luo	a9aa416c3d	In DirectDruidClient, don't run Future cancellation listener in… (#8700 ) * In DirectDruidClient, don't run Future cancellation listener in HTTP library executor * extract cancelQuery as a method of DirectDruidClient * Fix testCancel * Add exception as the first argument to log.error	2019-11-07 21:12:18 +03:00
Roman Leventov	5c0fc0a13a	Fix ambiguity about IndexerSQLMetadataStorageCoordinator.getUsedSegmentsForInterval() returning only non-overshadowed or all used segments (#8564 ) * IndexerSQLMetadataStorageCoordinator.getTimelineForIntervalsWithHandle() don't fetch abutting intervals; simplify getUsedSegmentsForIntervals() * Add VersionedIntervalTimeline.findNonOvershadowedObjectsInInterval() method; Propagate the decision about whether only visible segmetns or visible and overshadowed segments should be returned from IndexerMetadataStorageCoordinator's methods to the user logic; Rename SegmentListUsedAction to RetrieveUsedSegmentsAction, SegmetnListUnusedAction to RetrieveUnusedSegmentsAction, and UsedSegmentLister to UsedSegmentsRetriever * Fix tests * More fixes * Add javadoc notes about returning Collection instead of Set. Add JacksonUtils.readValue() to reduce boilerplate code * Fix KinesisIndexTaskTest, factor out common parts from KinesisIndexTaskTest and KafkaIndexTaskTest into SeekableStreamIndexTaskTestBase * More test fixes * More test fixes * Add a comment to VersionedIntervalTimelineTestBase * Fix tests * Set DataSegment.size(0) in more tests * Specify DataSegment.size(0) in more places in tests * Fix more tests * Fix DruidSchemaTest * Set DataSegment's size in more tests and benchmarks * Fix HdfsDataSegmentPusherTest * Doc changes addressing comments * Extended doc for visibility * Typo * Typo 2 * Address comment	2019-11-06 11:07:04 -08:00
Jihoon Son	511fa74fa2	Move maxFetchRetry to FetchConfig; rename OpenObject (#8776 )	2019-11-04 08:26:33 -08:00
yuanli	bca649e492	Case sensitive comparison of nonbinary string in MySQL metadata storage (#8758 )	2019-10-30 20:48:08 -07:00
Clint Wylie	3ff5e02237	remove select query (#8739 ) * remove select query * thanks teamcity * oops * oops * add back a SelectQuery class that throws RuntimeExceptions linking to docs * adjust text * update docs per review * deprecated	2019-10-30 19:29:56 -07:00
Jihoon Son	094936ca03	Remove commit() method Firehose (#8688 ) * Remove commit() method Firehose * fix javadoc	2019-10-23 16:52:02 -07:00
Jihoon Son	2518478b20	Remove deprecated parameter for Checkpoint request (#8707 ) * Remove deprecated parameter for Checkpoint request * fix wrong doc	2019-10-23 16:51:16 -07:00
Jihoon Son	f5b9bf5525	Cluster-wide configuration for query vectorization (#8657 ) * Cluster-wide configuration for query vectorization * add doc * fix build * fix doc * rename to QueryConfig and add javadoc * fix checkstyle * fix variable names	2019-10-23 21:44:28 +08:00
Surekha	98f59ddd7e	Add `sys.supervisors` table to system tables (#8547 ) * Add supervisors table to SystemSchema * Add docs * fix checkstyle * fix test * fix CI * Add comments * Fix javadoc teamcity error * comments * fix links in docs * fix links * rename fullStatus query param to system and remove it from docs	2019-10-18 15:16:42 -07:00
Jihoon Son	30c15900be	Auto compaction based on parallel indexing (#8570 ) * Auto compaction based on parallel indexing * javadoc and doc * typo * update spell * addressing comments * address comments * fix log * fix build * fix test * increase default max input segment bytes per task * fix test	2019-10-18 13:24:14 -07:00
Mingming Qiu	2c758ef5ff	Support assign tasks to run on different categories of MiddleManagers (#7066 ) * Support assign tasks to run on different tiers of MiddleManagers * address comments * address comments * rename tier to category and docs * doc * fix doc * fix spelling errors * docs	2019-10-17 12:57:19 -07:00
Aditya	75527f09cd	implement FiniteFirehoseFactory in InlineFirehose (#8682 ) * implement FiniteFirehoseFactory in InlineFirehose * override isSplittable in InlineFirehoseFactory & improve tests	2019-10-16 19:28:55 -07:00
Jihoon Son	4046c86d62	Stateful auto compaction (#8573 ) * Stateful auto compaction * javaodc * add removed test back * fix test * adding indexSpec to compactionState * fix build * add lastCompactionState * address comments * extract CompactionState * fix doc * fix build and test * Add a task context to store compaction state; add javadoc * fix it test	2019-10-15 22:57:42 -07:00
Sashidhar Thallam	124efa85f6	Fix for RoundRobinStorageLocationSelectorStrategy not to pick the same storage location over and again (#8634 ) * Fix for 8614: iterators returned by strategy don't share iteration state. * Adding comments	2019-10-12 09:53:52 +08:00
Jihoon Son	96d8523ecb	Use hash of Segment IDs instead of a list of explicit segments in auto compaction (#8571 ) * IOConfig for compaction task * add javadoc, doc, unit test * fix webconsole test * add spelling * address comments * fix build and test * address comments	2019-10-09 11:12:00 -07:00
Chi Cao Minh	b6b5517c20	Speed up ParallelIndexSupervisorTask tests (#8633 ) Previously, some tests for ParallelIndexSupervisorTask were being run twice unnecessarily.	2019-10-08 19:56:12 -07:00
Nishant Bangarwa	0853273091	Add tier based usage metrics for historical nodes to help with autoscaling (#8636 ) * Add tier based usage metrics for historical nodes to help with druid historical autoscaling Add tier based usage metrics for historical nodes to help druid cluster orchestration systems understand the historical node usage and requirements. Following metrics would be helpful - tier/required/capacity- total capacity in bytes required in each tier. Dimensions - tier tier/total/capacity - total capacity in bytes available in a given tier. Dimension - tier tier/historical/count - no. of historical nodes available in each tier. Dimension - tier tier/replication/factor - configured maximum replication factor in given tier. Dimension - tier * fix unit test failures	2019-10-08 19:55:32 -07:00
Himanshu	0f7e0ff030	CuratorDruidNodeDiscoveryProvider: do not ignore exception in listener execution and log it (#8616 )	2019-10-02 08:50:28 -07:00
Sashidhar Thallam	51a7235ebc	Making optimal usage of multiple segment cache locations (#8038 ) * #7641 - Changing segment distribution algorithm to distribute segments to multiple segment cache locations * Fixing indentation * WIP * Adding interface for location strategy selection, least bytes used strategy impl, round-robin strategy impl, locationSelectorStrategy config with least bytes used strategy as the default strategy * fixing code style * Fixing test * Adding a method visible only for testing, fixing tests * 1. Changing the method contract to return an iterator of locations instead of a single best location. 2. Check style fixes * fixing the conditional statement * Added testSegmentDistributionUsingLeastBytesUsedStrategy, fixed testSegmentDistributionUsingRoundRobinStrategy * to trigger CI build * Add documentation for the selection strategy configuration * to re trigger CI build * updated docs as per review comments, made LeastBytesUsedStorageLocationSelectorStrategy.getLocations a synchronzied method, other minor fixes * In checkLocationConfigForNull method, using getLocations() to check for null instead of directly referring to the locations variable so that tests overriding getLocations() method do not fail * Implementing review comments. Added tests for StorageLocationSelectorStrategy * Checkstyle fixes * Adding java doc comments for StorageLocationSelectorStrategy interface * checkstyle * empty commit to retrigger build * Empty commit * Adding suppressions for words leastBytesUsed and roundRobin of ../docs/configuration/index.md file * Impl review comments including updating docs as suggested * Removing checkLocationConfigForNull(), @NotEmpty annotation serves the purpose * Round robin iterator to keep track of the no. of iterations, impl review comments, added tests for round robin strategy * Fixing the round robin iterator * Removed numLocationsToTry, updated java docs * changing property attribute value from tier to type * Fixing assert messages	2019-09-28 00:17:44 -06:00
Faxian Zhao	e1b4a3ab71	bug fix for lookup leak when we remove the last lookup from lookup tier (#8598 ) * bug fix for lookup leak when we remove the last lookup from lookup tier * warnings about lookups that will never be loaded * fix unit test	2019-09-27 03:55:02 -07:00
Clint Wylie	7781820dea	JsonParserIterator.init future timeout (#8550 ) * add timeout support for JsonParserIterator init future * add queryId * should be less than 1 * fix * fix npe * fix lgtm * adjust exception, nullable * fix test * refactor * revert queryId change * add log.warn to tie exception to json parser iterator	2019-09-27 09:13:37 +09:00
Fangyuan Deng	a280c5dc03	fix queuedSize not decrease in HttpLoadQueuePeon when load failed (#8596 )	2019-09-26 08:48:00 -07:00
Nishant Bangarwa	a75ddaad9e	Add TrustedDomain Authenticator (#8248 ) * Add TrustedDomain Authenticator update javadoc Add nullable annotations Add cautionary note fix travis failure * add IP to spell checker	2019-09-25 11:25:03 -07:00
Clint Wylie	eabddffd6e	fix http firehose factory leaky connection in constructor (#8576 ) * fix http firehose factory leaky connection in constructor * stylin	2019-09-24 17:08:43 -06:00
SandishKumarHN	ade8d1922d	#8156 : StructuralSearchInspection, Prohibit check on Thread.ge… (#8394 ) * StructuralSearchInspection, Prohibit check on Thread.getState() * review changes - 1 * review changes 2 * review changes 3 * test fix * review changes-2 * review changes-3	2019-09-22 14:12:05 +03:00
Chi Cao Minh	187b507b3d	Fix CuratorModule flaky test (#8562 ) For CuratorModuleTest. exitsJvmWhenMaxRetriesExceeded(), the expected log message is intermittently not the first one in the list of captured log messages. For example, it is the second one in https://travis-ci.org/apache/incubator-druid/jobs/586792178#L754.	2019-09-20 13:36:22 -07:00
Chi Cao Minh	5f39ee21ff	Add diags for flaky CuratorModuleTest (#8528 ) Add diagnostic messages to debug CuratorModuleTest.exitsJvmWhenMaxRetriesExceeded() intermittent test failures.	2019-09-14 11:55:26 -07:00
Jihoon Son	762f4d0e58	Check targetCompactionSizeBytes to search for candidate segments in auto compaction (#8495 ) * Check targetCompactionSizeBytes to search for candidate segments in auto compaction * fix logs * add javadoc * rename	2019-09-09 23:11:08 -07:00
Chi Cao Minh	5f61374cb3	Fix dependency analyze warnings (#8230 ) * Fix dependency analyze warnings Update the maven dependency plugin to the latest version and fix all warnings for unused declared and used undeclared dependencies in the compile scope. Added new travis job to add the check to CI. Also fixed some source code files to use the correct packages for their imports and updated druid-forbidden-apis to prevent regressions. * Address review comments * Adjust scope for org.glassfish.jaxb:jaxb-runtime * Fix dependencies for hdfs-storage * Consolidate netty4 versions	2019-09-09 14:37:21 -07:00
Chi Cao Minh	14a8613d69	Exit JVM on curator unhandled errors (#8458 ) * Exit JVM on curator unhandled errors If an unhandled error occurs when curator is talking to ZooKeeper, exit the JVM in addition to stopping the lifecycle to prevent the process from being left in a zombie state. With this change, BoundedExponentialBackoffRetryWithQuit is no longer needed as when curator exceeds the configured retries, it triggers its unhandled error listeners. A new "connectionTimeoutMs" CuratorConfig setting is added mostly to facilitate testing curator unhandled errors, but it may be useful for users as well. * Address review comments	2019-09-06 16:43:59 -07:00
Rye	645799f977	disallow whitespace characters except space in data source names (#8465 ) * disallow whitespace characters in data source names * wrapped preconditions in a function, and simplify unit tests code * Fixed regex to allow space, simplified repeat logic * Fixed import style against mvn checkstyle * Add msg in case test fails, use emptyMap(), improved naming * Changes on assertion functions * change wording of "whitespace" to "whitespace except space" to avoid misleading	2019-09-06 08:55:21 -07:00
Clint Wylie	c73a489335	bump master version to 0.17.0-incubating-SNAPSHOT (#8421 )	2019-08-28 01:58:36 -07:00
Himanshu	4d87a19547	Logging emitter to publish query and other metric events as valid json objects (#8359 ) * LoggingEmitter: print event as json * use DefaultRequestLogEventBuilderFactory in emitting request logger by default * print context in query metric as json * removed unused jsonMapper from DefaultQueryMetrics * add comment * remove change to DefaultRequestLogEventBuilderFactory.java	2019-08-27 15:00:23 -07:00
Jihoon Son	e5ef5ddafa	Fix the shuffle with TLS enabled for parallel indexing; add an integration test; improve unit tests (#8350 ) * Fix shuffle with tls enabled; add an integration test; improve unit tests * remove debug log * fix tests * unused import * add javadoc * rename to getContent	2019-08-26 19:27:41 -07:00
Xavier Léauté	d5e3c53e74	workaround for Guava 16 bug using Java 9 and above	2019-08-25 00:58:59 -04:00
SandishKumarHN	33f0753a70	Add Checkstyle for constant name static final (#8060 ) * check ctyle for constant field name * check ctyle for constant field name * check ctyle for constant field name * check ctyle for constant field name * check ctyle for constant field name * check ctyle for constant field name * check ctyle for constant field name * check ctyle for constant field name * check ctyle for constant field name * merging with upstream * review-1 * unknow changes * unknow changes * review-2 * merging with master * review-2 1 changes * review changes-2 2 * bug fix	2019-08-23 13:13:54 +03:00
Jonathan Wei	96e2142ea3	Cleanup appenderators and segment walkers in UnifiedIndexerAppenderatorsManager (#8287 ) * Cleanup Appenderators in UnifiedIndexerAppenderatorsManager * PR comments * More PR comments * Fix test	2019-08-22 12:18:46 -07:00
Jihoon Son	22d6384d36	Fix unrealistic test variables in KafkaSupervisorTest and tidy up unused variable in checkpointing process (#7319 ) * Fix unrealistic test arguments in KafkaSupervisorTest * remove currentCheckpoint from checkpoint action * rename variable	2019-08-21 10:58:22 -07:00
Benedict Jin	14a4238381	Bump JUnitParams from 1.0.4 to 1.1.1 (#8017 )	2019-08-20 16:15:12 -07:00
Fokko Driesprong	cb1339e19a	Bump derby from 10.11.1.1 to 10.14.2.0 (#8292 ) * Bump derby from 10.11.1.1 to 10.15.1.3 * Update server/pom.xml as well * Move to derby 10.14.2.0 10.15.* is Java9+ https://db.apache.org/derby/derby_downloads.html	2019-08-20 14:03:32 -07:00
pphust	ffcbd1ecb8	Ensure ReferenceCountingSegment.decrement() is invoked correctly (#8323 ) * fix issue8291. Make sure ReferenceCountingSegment.decrement() is invoked correctly * add some comments in SinkQuerySegmentWalker.java * extracting perSegmentRunners and perHydrantRunners to reduce the level of nesting	2019-08-20 22:01:16 +03:00
Benedict Jin	781873ba53	Fix resource leak (#8337 ) * Fix resource leak * Patch comments	2019-08-20 12:55:41 +03:00
Benedict Jin	566dc8c719	Fix missing format argument (#8331 )	2019-08-19 16:19:44 +08:00
Jihoon Son	5dac6375f3	Add support for parallel native indexing with shuffle for perfect rollup (#8257 ) * Add TaskResourceCleaner; fix a couple of concurrency bugs in batch tasks * kill runner when it's ready * add comment * kill run thread * fix test * Take closeable out of Appenderator * add javadoc * fix test * fix test * update javadoc * add javadoc about killed task * address comment * Add support for parallel native indexing with shuffle for perfect rollup. * Add comment about volatiles * fix test * fix test * handling missing exceptions * more clear javadoc for stopGracefully * unused import * update javadoc * Add missing statement in javadoc * address comments; fix doc * add javadoc for isGuaranteedRollup * Rename confusing variable name and fix typos * fix typos; move fetch() to a better home; fix the expiration time * add support https	2019-08-15 17:43:35 -07:00
Fokko Driesprong	1a3aa1cfc0	Bump commons-io from 2.5 to 2.6 (#8006 ) * Bump commons-io from 2.5 to 2.6 * Update licenses.yaml * Address comments	2019-08-13 17:10:37 -07:00
Jihoon Son	a5c9c2950f	Add missing maxBytesInMemory in tuningConfig for auto compaction (#8274 ) * Add missing tuningConfigs for auto compaciton * Add doc * add test	2019-08-13 14:10:26 -05:00
Jihoon Son	312cdc2452	Add TaskResourceCleaner; fix a couple of concurrency bugs in batch tasks (#8236 ) * Add TaskResourceCleaner; fix a couple of concurrency bugs in batch tasks * kill runner when it's ready * add comment * kill run thread * fix test * Take closeable out of Appenderator * add javadoc * fix test * fix test * update javadoc * add javadoc about killed task * address comment * handling missing exceptions * more clear javadoc for stopGracefully * update javadoc * Add missing statement in javadoc * typo	2019-08-12 19:42:06 -05:00
Clint Wylie	1054d85171	add mechanism to control filter optimization in historical query processing (#8209 ) * add support for mechanism to control filter optimization in historical query processing * oops * adjust * woo * javadoc * review comments * fix * default * oops * oof * this will fix it * more nullable, refactor DimFilter.getRequiredColumns to use Set, formatting * extract class DimFilterToStringBuilder with common code from custom DimFilter toString implementations * adjust variable naming * missing nullable * more nullable * fix javadocs * nullable * address review comments * javadocs, precondition * nullable * rename method to be consistent * review comments * remove tuning from ColumnComparisonFilter/ColumnComparisonDimFilter	2019-08-09 16:36:18 -07:00
Jonathan Wei	e88bbe71c0	Adjust default globalIngestionHeapLimitBytes for indexer, add more docs (#8255 )	2019-08-07 23:04:07 -07:00
Jihoon Son	8fa114c349	Fix bugs in overshadowableManager and add unit tests (#8222 ) * Fix bugs in overshadowableManager and add unit tests * Fix SegmentManager * add segment manager test * Address comments * Address comments	2019-08-07 15:51:21 -05:00
Jonathan Wei	5e57492298	Add docs for CliIndexer as an experimental feature (#8245 ) * Experimental CliIndexer docs * PR comments	2019-08-06 15:57:17 -07:00
Lucas Capistrant	e252abedc5	Enable toggling request logging on/off for different query types (#7562 ) * Enable ability to toggle SegmentMetadata request logging on/off * Move SegmentMetadata query log filter to FilteredRequestLogger * Update documentation to reflect the segment metadata flag moving to the filtered request logger * Modify patch to allow blacklist of query types to not log to request logger * Address styling and naming requests following latest code review * Fix indentation on multiple locations per Druid style rules	2019-08-06 15:47:30 +03:00
Jihoon Son	ab5b3be6c6	Add shuffleSegmentPusher for data shuffle (#8115 ) * Fix race between canHandle() and addSegment() in StorageLocation * add comment * Add shuffleSegmentPusher which is a dataSegmentPusher used for writing shuffle data in local storage. * add comments * unused import * add comments * fix test * address comments * remove <p> tag from javadoc * address comments * comparingLong * Address comments * fix test	2019-08-05 13:38:35 -07:00
Eugene Sevastianov	3f3162b85e	Enum of ResponseContext keys (#8157 ) * Refactored ResponseContext and aggregated its keys into Enum * Added unit tests for ResponseContext and refactored the serialization * Removed unused methods * Fixed code style * Fixed code style * Fixed code style * Made SerializationResult static * Updated according to the PR discussion: Renamed an argument Updated comparator Replaced Pair usage with Map.Entry Added a comment about quadratic complexity Removed boolean field with an expression Renamed SerializationResult field Renamed the method merge to add and renamed several context keys Renamed field and method related to scanRowsLimit Updated a comment Simplified a block of code Renamed a variable * Added JsonProperty annotation to renamed ScanQuery field * Extension-friendly context key implementation * Refactored ResponseContext: updated delegate type, comments and exceptions Reducing serialized context length by removing some of its' collection elements * Fixed tests * Simplified response context truncation during serialization * Extracted a method of removing elements from a response context and added some comments * Fixed typos and updated comments	2019-08-03 12:05:21 +03:00
Fokko Driesprong	91743eeebe	Spotbugs: NP_NONNULL_PARAM_VIOLATION (#8129 )	2019-08-02 19:20:22 +03:00
Chi Cao Minh	7783b31846	Add IPv4 druid expressions (#8197 ) * Add IPv4 druid expressions New druid expressions for filtering IPv4 addresses: - ipv4address_match: Check if IP address belongs to a subnet - ipv4address_parse: Convert string IP address to long - ipv4address_stringify: Convert long IP address to string These expressions operate on IP addresses represented as either strings or longs, so that they can be applied to dimensions with mixed representation of IP addresses. The filtering is more efficient when operating on IP addresses as longs. In other words, the intended use case is: 1) Use ipv4address_parse to convert to long at ingestion time 2) Use ipv4address_match to filter (on longs) at query time 3) Use ipv4adress_stringify to convert to (readable) string at query time * Fix licenses and null handling * Simplify IPv4 expressions * Fix tests * Fix check for valid ipv4 address string	2019-08-01 11:45:04 -07:00
Jonathan Wei	41893d4647	Simple memory allocation for CliIndexer tasks (#8201 ) * Simple memory allocation for CliIndexer * PR comments * Checkstyle	2019-08-01 10:22:41 +08:00
Gian Merlino	77297f4e6f	GroupBy array-based result rows. (#8196 ) * GroupBy array-based result rows. Fixes #8118; see that proposal for details. Other than the GroupBy changes, the main other "interesting" classes are: - ResultRow: The array-based result type. - BaseQuery: T is no longer required to be Comparable. - QueryToolChest: Adds "decorateObjectMapper" to enable query-aware serialization and deserialization of result rows (necessary due to their positional nature). - QueryResource: Uses the new decoration functionality. - DirectDruidClient: Also uses the new decoration functionality. - QueryMaker (in Druid SQL): Modifications to read ResultRows. These classes weren't changed, but got some new javadocs: - BySegmentQueryRunner - FinalizeResultsQueryRunner - Query * Adjustments for TC stuff.	2019-07-31 16:15:12 -07:00
Fokko Driesprong	faf51107d5	Add SuppressWarnings SS_SHOULD_BE_STATIC (#8138 ) * Spotbugs: SS_SHOULD_BE_STATIC (#8073) * Add SuppressWarnings SS_SHOULD_BE_STATIC Fixes #8073 * Fix the voilation * Make them non-final * Remove @Nonnull	2019-07-31 19:44:42 +03:00
Jihoon Son	385f492a55	Use PartitionsSpec for all task types (#8141 ) * Use partitionsSpec for all task types * fix doc * fix typos and revert to use isPushRequired * address comments * move partitionsSpec to core * remove hadoopPartitionsSpec	2019-07-30 17:24:39 -07:00
Clint Wylie	653b558134	sql firehose and firehose doc adjustments (#8067 ) * firehose doc adjustments * fix typo * additional information on parser types in ingestion docs * clarify ingest segment firehose docs, add sql firehose examples to sql extension pages * fixit * make sql firehose more forgiving my always constructing a MapInputRowParser from the parseSpec of whatever actual InputRowParser impl is provided, remove doc references to map based parsers * transforms * fix tests	2019-07-30 15:28:10 -07:00
Fokko Driesprong	e016995d1f	Enable Spotbugs: WMI_WRONG_MAP_ITERATOR (#8005 ) * WMI_WRONG_MAP_ITERATOR * Fixed missing loop	2019-07-30 19:51:53 +03:00
Jonathan Wei	640b7afc1c	Add CliIndexer process type and initial task runner implementation (#8107 ) * Add CliIndexer process type and initial task runner implementation * Fix HttpRemoteTaskRunnerTest * Remove batch sanity check on PeonAppenderatorsManager * Fix paralle index tests * PR comments * Adjust Jersey resource logging * Additional cleanup * Fix SystemSchemaTest * Add comment to LocalDataSegmentPusherTest absolute path test * More PR comments * Use Server annotated with RemoteChatHandler * More PR comments * Checkstyle * PR comments * Add task shutdown to stopGracefully * Small cleanup * Compile fix * Address PR comments * Adjust TaskReportFileWriter and fix nits * Remove unnecessary closer * More PR comments * Minor adjustments * PR comments * ThreadingTaskRunner: cancel task run future not shutdownFuture and remove thread from workitem	2019-07-29 17:06:33 -07:00
Chi Cao Minh	ab71a2e1e4	Revert "Fix dependency analyze warnings (#8128 )" (#8189 ) This reverts commit `5dd0d8e873`.	2019-07-29 11:42:16 -07:00
Jihoon Son	adf7bafb9f	Fix race between canHandle() and addSegment() in StorageLocation (#8114 ) * Fix race between canHandle() and addSegment() in StorageLocation * add comment * add comments * fix test * address comments * remove <p> tag from javadoc * address comments * comparingLong	2019-07-27 11:11:06 +03:00
Chi Cao Minh	5dd0d8e873	Fix dependency analyze warnings (#8128 ) * Fix dependency analyze warnings Update the maven dependency plugin to the latest version and fix all warnings for unused declared and used undeclared dependencies in the compile scope. Added new travis job to add the check to CI. Also fixed some source code files to use the correct packages for their imports. * Fix licenses and dependencies * Fix licenses and dependencies again * Fix integration test dependency * Address review comments * Fix unit test dependencies * Fix integration test dependency * Fix integration test dependency again * Fix integration test dependency third time * Fix integration test dependency fourth time * Fix compile error * Fix assert package	2019-07-26 10:49:03 -07:00
Parag Jain	31a29d8883	add noop type name to prevent jackson exception when setting type to noop (#8133 )	2019-07-25 16:07:08 -07:00
Jihoon Son	db14946207	Add support minor compaction with segment locking (#7547 ) * Segment locking * Allow both timeChunk and segment lock in the same gruop * fix it test * Fix adding same chunk to atomicUpdateGroup * resolving todos * Fix segments to lock * fix segments to lock * fix kill task * resolving todos * resolving todos * fix teamcity * remove unused class * fix single map * resolving todos * fix build * fix SQLMetadataSegmentManager * fix findInputSegments * adding more tests * fixing task lock checks * add SegmentTransactionalOverwriteAction * changing publisher * fixing something * fix for perfect rollup * fix test * adjust package-lock.json * fix test * fix style * adding javadocs * remove unused classes * add more javadocs * unused import * fix test * fix test * Support forceTimeChunk context and force timeChunk lock for parallel index task if intervals are missing * fix travis * fix travis * unused import * spotbug * revert getMaxVersion * address comments * fix tc * add missing error handling * fix backward compatibility * unused import * Fix perf of versionedIntervalTimeline * fix timeline * fix tc * remove remaining todos * add comment for parallel index * fix javadoc and typos * typo * address comments	2019-07-24 17:35:46 -07:00
Clint Wylie	0695e487e7	fix issue with CuratorLoadQueuePeon shutting down executors it does not own (#8140 ) * fix issue with CuratorLoadQueuePeon shutting down executors it does not own * use lifecycled executors * maybe this	2019-07-24 10:59:43 -07:00
Eugene Sevastianov	799d20249f	Response context refactoring (#8110 ) * Response context refactoring * Serialization/Deserialization of ResponseContext * Added java doc comments * Renamed vars related to ResponseContext * Renamed empty() methods to createEmpty() * Fixed ResponseContext usage * Renamed multiple ResponseContext static fields * Added PublicApi annotations * Renamed QueryResponseContext class to ResourceIOReaderWriter * Moved the protected method below public static constants * Added createEmpty method to ResponseContext with DefaultResponseContext creation * Fixed inspection error * Added comments to the ResponseContext length limit and ResponseContext http header name * Added a comment of possible future refactoring * Removed .gitignore file of indexing-service * Removed a never-used method * VisibleForTesting method reducing boilerplate Co-Authored-By: Clint Wylie <cjwylie@gmail.com> * Reduced boilerplate * Renamed the method serialize to serializeWith * Removed unused import * Fixed incorrectly refactored test method * Added comments for ResponseContext keys * Fixed incorrectly refactored test method * Fixed IntervalChunkingQueryRunnerTest mocks	2019-07-24 18:29:03 +03:00
Clint Wylie	0388581493	Revert "Spotbugs: SS_SHOULD_BE_STATIC (#8073 )" (#8145 ) This reverts commit `04a180a5fb`.	2019-07-23 22:57:19 -07:00
Clint Wylie	83514958db	remove unnecessary lock in ForegroundCachePopulator leading to a lot of contention (#8116 ) * remove unecessary lock in ForegroundCachePopulator leading to a lot of contention * mutableboolean, javadocs,document some cache configs that were missing * more doc stuff * adjustments * remove background documentation	2019-07-23 10:57:59 -07:00
Fokko Driesprong	04a180a5fb	Spotbugs: SS_SHOULD_BE_STATIC (#8073 )	2019-07-23 18:18:49 +08:00
Fokko Driesprong	e1a745717e	Spotbugs: NP_STORE_INTO_NONNULL_FIELD (#8021 )	2019-07-21 21:23:47 +08:00
Clint Wylie	f24e2f16af	fix npe with sql metadata manager polling and empty database (#8106 ) * fix npe with sql metadata manager polling and empty database * treat null segments separately * use preconditions check * add test	2019-07-20 19:09:02 -07:00
Himanshu	54a7b54d2d	avoid 'must return non-void type' warning (#8105 )	2019-07-18 15:02:27 -07:00
Jihoon Son	c7eb7cd018	Add intermediary data server for shuffle (#8088 ) * Add intermediary data server for shuffle * javadoc * adjust timeout * resolved todo * fix test * style * address comments * rename to shuffleDataLocations * Address comments * bit adjustment StorageLocation * fix test * address comment & fix test * handle interrupted exception	2019-07-18 14:46:47 -07:00
Clint Wylie	03e55d30eb	add CachingClusteredClient benchmark, refactor some stuff (#8089 ) * add CachingClusteredClient benchmark, refactor some stuff * revert WeightedServerSelectorStrategy to ConnectionCountServerSelectorStrategy and remove getWeight since felt artificial, default mergeResults in toolchest implementation for topn, search, select * adjust javadoc * adjustments * oops * use it * use BinaryOperator, remove CombiningFunction, use Comparator instead of Ordering, other review adjustments * rename createComparator to createResultComparator, fix typo, firstNonNull nullable parameters	2019-07-18 13:16:28 -07:00
Sashidhar Thallam	72496d3712	#7858 Throwing UnsupportedOperationException from ImmutableDrui… (#7933 ) * #7858 Throwing UnsupportedOperationException from ImmutableDruidDataSource's equals() and hashCode() methods. * 1. Turning ImmutableDruidDataSource into a data container. 2. Adding a Util method to be used in tests for checking equality of ImmutableDruidDataSource objects. * Removing unused method * Fixing assert equals * Fixing assert equals in TestUtils.java * Adding java doc comments, Using ExpectedException in tests * Fixing test cases * Fixed expected exception message in tests * fixed line width * line width fix * code style fixes * code indentation fixes * fixing method name	2019-07-18 22:35:19 +03:00
Surekha	da16144495	Refactoring to use `CollectionUtils.mapValues` (#8059 ) * doc updates and changes to use the CollectionUtils.mapValues utility method * Add Structural Search patterns to intelliJ * refactoring from PR comments * put -> putIfAbsent * do single key lookup	2019-07-17 23:02:22 -07:00
Roman Leventov	ceb969903f	Refactor SQLMetadataSegmentManager; Change contract of REST met… (#7653 ) * Refactor SQLMetadataSegmentManager; Change contract of REST methods in DataSourcesResource * Style fixes * Unused imports * Fix tests * Fix style * Comments * Comment fix * Remove unresolvable Javadoc references; address comments * Add comments to ImmutableDruidDataSource * Merge with master * Fix bad web-console merge * Fixes in api-reference.md * Rename in DruidCoordinatorRuntimeParams * Fix compilation * Residual changes	2019-07-17 17:18:48 +03:00
Clint Wylie	15fbf5983d	add Class.getCanonicalName to forbidden-apis (#8086 ) * add checkstyle to forbid unecessary use of Class.getCanonicalName * use forbiddin-api instead of checkstyle * add space	2019-07-16 15:21:50 -07:00
Chi Cao Minh	da3d141dd2	Add inline firehose (#8056 ) * Add inline firehose To allow users to quickly parsing and schema, add a firehose that reads data that is inlined in its spec. * Address review comments * Remove suppression of sonar warnings	2019-07-11 21:43:46 -07:00
Atul Mohan	631cda649b	Include replicated segment size property for datasources endpoint (#8039 ) * Add replication size * Summon comma	2019-07-11 01:10:38 -07:00
Himanshu	14aec7fcec	add config to optionally disable all compression in intermediate segment persists while ingestion (#7919 ) * disable all compression in intermediate segment persists while ingestion * more changes and build fix * by default retain existing indexingSpec for intermediate persisted segments * document indexSpecForIntermediatePersists index tuning config * fix build issues * update serde tests	2019-07-10 12:22:24 -07:00
Gian Merlino	338b8b3fef	SupervisorManager: Add authorization checks to bulk endpoints. (#8044 ) The endpoints added in #6272 were missing authorization checks. This patch removes the bulk methods from SupervisorManager, and instead has SupervisorResource run the full list through filterAuthorizedSupervisorIds before calling resume/suspend/terminate one by one.	2019-07-09 13:16:54 -07:00
Parag Jain	027291a90d	set DRUID_AUTHORIZATION_CHECKED attribute for router endpoints (#8026 ) * add state resource filter to router endpoints * add RouterResource to ResourceFilter test framework	2019-07-09 00:51:36 -07:00
Sashidhar Thallam	6701dc08fe	Making StatusResponseHandler singleton and fixing all its instantiation invocations (#7969 ) * Making StatusResponseHandler singleton and fixing all its instantiation invocations * Using StatusResponseHandler.getInstance() where applicable	2019-07-08 13:33:00 +05:30
Chi Cao Minh	1166bbcb75	Remove static imports from tests (#8036 ) Make static imports forbidden in tests and remove all occurrences to be consistent with the non-test code. Also, various changes to files affected by above: - Reformat to adhere to druid style guide - Fix various IntelliJ warnings - Fix various SonarLint warnings (e.g., the expected/actual args to Assert.assertEquals() were flipped)	2019-07-06 09:33:12 -07:00
Clint Wylie	42a7b8849a	remove FirehoseV2 and realtime node extensions (#8020 ) * remove firehosev2 and realtime node extensions * revert intellij stuff * rat exclusion	2019-07-04 15:40:22 -07:00
Clint Wylie	f7283378ac	remove deprecated standalone realtime node (#7915 ) * remove CliRealtime, RealtimeManager, etc * add redirects for deleted page to page that explains the deleted thing * adjust docs	2019-07-02 18:12:17 -07:00
Fokko Driesprong	c6baa59f77	Enable DLS_DEAD_LOCAL_STORE (#7967 )	2019-06-28 04:39:42 +08:00
Fokko Driesprong	82b248cc17	Spotbugs: Enable MS_SHOULD_BE_FINAL (#7946 )	2019-06-23 15:42:18 -07:00
SandishKumarHN	e80297efef	Set Test timeout higher for robust performance (#7890 )	2019-06-17 22:01:54 -07:00
SandishKumarHN	01881e3a98	Use only com.google.errorprone.annotations.concurrent.GuardedBy, not javax.annotations.concurrent.GuardedBy (#7889 )	2019-06-17 15:58:51 +02:00
Himanshu	b3328b2785	endpoint to delete lookup tier and remove tier on last lookup deletion (#7852 )	2019-06-15 17:55:50 -07:00
Sashidhar Thallam	3bee6adcf7	Use map.putIfAbsent() or map.computeIfAbsent() as appropriate instead of containsKey() + put() (#7764 ) * https://github.com/apache/incubator-druid/issues/7316 Use Map.putIfAbsent() instead of containsKey() + put() * fixing indentation * Using map.computeIfAbsent() instead of map.putIfAbsent() where appropriate * fixing checkstyle * Changing the recommendation text * Reverting auto changes made by IDE * Implementing recommendation: A ConcurrentHashMap on which computeIfAbsent() is called should be assigned into variables of ConcurrentHashMap type, not ConcurrentMap * Removing unused import	2019-06-14 17:59:36 +02:00
Clint Wylie	3fbb0a5e00	Supervisor list api with states and health (#7839 ) * allow optionally listing all supervisors with their state and health * docs * add state to full * clean * casing * format * spelling	2019-06-07 16:26:33 -07:00
Surekha	ea752ef562	Optimize overshadowed segments computation (#7595 ) * Move the overshadowed segment computation to SQLMetadataSegmentManager's poll * rename method in MetadataSegmentManager * Fix tests * PR comments * PR comments * PR comments * fix indentation * fix tests * fix test * add test for SegmentWithOvershadowedStatus serde format * PR comments * PR comments * fix test * remove snapshot updates outside poll * PR comments * PR comments * PR comments * removed unused import	2019-06-07 19:15:54 +02:00
Xue Yu	d482da6e9b	fix timestamp ceil lower bound bug (#7823 )	2019-06-04 01:16:31 -07:00
Fokko Driesprong	f2b00023f8	Bump Checkstyle to 8.21 (#7826 )	2019-06-04 01:02:46 -07:00
Jihoon Son	61ec521135	Remove keepSegmentGranularity option for compaction (#7747 ) * Remove keepSegmentGranularity option from compaction * fix it test * clean up * remove from web console * fix test	2019-06-03 12:59:15 -07:00
Justin Borromeo	8032c4add8	Add errors and state to stream supervisor status API endpoint (#7428 ) * Add state and error tracking for seekable stream supervisors * Fixed nits in docs * Made inner class static and updated spec test with jackson inject * Review changes * Remove redundant config param in supervisor * Style * Applied some of Jon's recommendations * Add transience field * write test * implement code review changes except for reconsidering logic of markRunFinishedAndEvaluateHealth() * remove transience reporting and fix SeekableStreamSupervisorStateManager impl * move call to stateManager.markRunFinished() from RunNotice to runInternal() for tests * remove stateHistory because it wasn't adding much value, some fixes, and add more tests * fix tests * code review changes and add HTTP health check status * fix test failure * refactor to split into a generic SupervisorStateManager and a specific SeekableStreamSupervisorStateManager * fixup after merge * code review changes - add additional docs * cleanup KafkaIndexTaskTest * add additional documentation for Kinesis indexing * remove unused throws class	2019-05-31 17:16:01 -07:00
Jihoon Son	7abfbb066a	Bump up snapshot version to 0.16.0 (#7802 )	2019-05-30 17:17:33 -07:00
Roman Leventov	782863ed0f	Fix some problems reported by PVS-Studio (#7738 ) * Fix some problems reported by PVS-Studio * Address comments	2019-05-29 11:20:45 -07:00
Gian Merlino	7ec7257e1d	Fix lookup serde on node types that don't load lookups. (#7752 ) This includes the router, overlord, middleManager, and coordinator. Does the following things: - Loads LookupSerdeModule on MM, overlord, and coordinator. - Adds LookupExprMacro to LookupSerdeModule, which allows these node types to understand that the 'lookup' function exists. - Adds a test to make sure that LookupSerdeModule works for virtual columns, filters, transforms, and dimension specs. This is implementing the technique discussed on these two issues: - https://github.com/apache/incubator-druid/issues/7724#issuecomment-494723333 - https://github.com/apache/incubator-druid/pull/7082#discussion_r264888771	2019-05-24 12:30:49 -07:00
Merlin Lee	26fad7e06a	Add checkstyle for "Local variable names shouldn't start with capital" (#7681 ) * Add checkstyle for "Local variable names shouldn't start with capital" * Adjust some local variables to constants * Replace StringUtils.LINE_SEPARATOR with System.lineSeparator()	2019-05-23 18:40:28 +02:00
Jonathan Wei	6901123a53	Fix compareAndSwap() in SQLMetadataConnector (#7661 ) * Fix compareAndSwap() in SQLMetadataConnector * Catch serialization_failure and retry for Postgres	2019-05-15 14:53:04 -07:00
Clint Wylie	b87c8f0314	fix lookup editor to use lookup tiers instead of historical tiers (#7647 ) * fix lookup editor to use lookup tiers instead of historical tiers * use default tier if empty response, fix if configured lookups is null * fixes * fix typo	2019-05-14 13:30:51 -07:00
Fokko Driesprong	2aa9613bed	Bump Checkstyle to 8.20 (#7651 ) * Bump Checkstyle to 8.20 Moderate severity vulnerability that affects: com.puppycrawl.tools:checkstyle Checkstyle prior to 8.18 loads external DTDs by default, which can potentially lead to denial of service attacks or the leaking of confidential information. Affected versions: < 8.18 * Oops, missed one * Oops, missed a few	2019-05-14 11:53:37 -07:00
Xavier Léauté	1d49364d08	Set direct memory if unable to detect JVM config (#7606 ) * Set direct memory if unable to detect JVM config Java 9 and above prevents us from detecting the maximum available direct memory. This change adds a fallback method to use at most 25% of maximum heap size, which should be a reasonable default. Unless -XX:MaxDirectMemorySize is set, recent JVMs will default maximum direct memory to match the maximum heap size, so this should work out of the box in most cases. For completeness we print instructions in the log to explain how to adjust settings if necessary. * skip test rather than succeeding * reword log message Co-Authored-By: Himanshu <g.himanshu@gmail.com>	2019-05-09 22:30:42 -07:00
Jihoon Son	18e0d6acb4	Fix resultLevelCache for timeseries with grandTotal (#7624 ) * Fix resultLevelCache for timeseries with grandTotal * Address comment * fix test	2019-05-09 18:11:04 -07:00
Jonathan Wei	dadf6a2f11	Add tool for migrating from local deep storage/Derby metadata (#7598 ) * Add tool for migrating from local deep storage/Derby metadata * Split deep storage and metadata migration docs * Support import into Derby * Fix create tables cmd * Fix create tables cmd * Fix commands * PR comment * Add -p	2019-05-06 23:39:40 -07:00
Xavier Léauté	c58aa2f2ab	Remove unnecessary cast to URLClassLoader (#7603 ) Java 9 and above will fail trying to cast the system classloader	2019-05-06 20:17:22 -07:00
Xavier Léauté	f7bfe8f269	Update mocking libraries for Java 11 support (#7596 ) * update easymock / powermock for to 4.0.2 / 2.0.2 for JDK11 support * update tests to use new easymock interfaces * fix tests failing due to easymock fixes * remove dependency on jmockit * fix race condition in ResourcePoolTest	2019-05-06 12:28:56 -07:00
Samarth Jain	afbcb9c07f	Improve parallelism of zookeeper based segment change processing (#7088 ) * V1 - improve parallelism of zookeeper based segment change processing * Create zk nodes in batches. Address code review comments. Introduce various configs. * Add documentation for the newly added configs * Fix test failures * Fix more test failures * Remove prinstacktrace statements * Address code review comments * Use a single queue * Address code review comments Since we have a separate load peon for every historical, just having a single SegmentChangeProcessor task per historical is enough. This commit also gets rid of the associated config druid.coordinator.loadqueuepeon.curator.numCreateThreads * Resolve merge conflict * Fix compilation failure * Remove batching since we already have a dynamic config maxSegmentsInNodeLoadingQueue that provides that control * Fix NPE in test * Remove documentation for configs that are no longer needed * Address code review comments * Address more code review comments * Fix checkstyle issue * Address code review comments * Code review comments * Add back monitor node remove executor * Cleanup code to isolate null checks and minor refactoring * Change param name since it conflicts with member variable name	2019-05-03 15:58:42 +02:00
Jonathan Wei	a013350018	Adjust required permissions for system schema (#7579 ) * Adjust required permissions for system schema * PR comments, fix current_size handling * Checkstyle * Set curr_size instead of current_size * Adjust information schema docs * Fix merge conflict * Update tests	2019-05-02 07:18:02 -07:00
David Lim	ec8562c885	Data loader (sampler component) (#7531 ) * sampler initial check-in fix checkstyle issues add sampler fix to process CSV files from cache properly change to composition and rename some classes add tests and report num rows read and indexed remove excludedByFilter flag and don't send filtered out data fix tests to handle both settings for druid.generic.useDefaultValueForNull * wrap sampler firehose in TimedShutoffFirehoseFactory to support timeouts * code review changes - add additional comments, limit maxRows	2019-05-01 22:37:14 -07:00
Surekha	15d19f3059	Add is_overshadowed column to sys.segments table (#7425 ) * Add is_overshadowed column to sys.segments table * update docs * Rename class and variables * PR comments * PR comments * remove unused variables in MetadataResource * move constants together * add getFullyOvershadowedSegments method to ImmutableDruidDataSource * Fix compareTo of SegmentWithOvershadowedStatus * PR comment * PR comments * PR comments * PR comments * PR comments * fix issue with already consumed stream * minor refactoring * PR comments	2019-05-01 18:00:57 +02:00
Xavier Léauté	6d4181191f	replace jdk internal exceptions with closest publicly available one	2019-04-30 14:21:45 -07:00
Gian Merlino	7b8bc9a5ef	EmitterModule: Throw an error on invalid emitter types. (#7328 ) * EmitterModule: Throw an error on invalid emitter types. The current behavior of silently using the "noop" emitter is unhelpful and makes it difficult to debug config typos. * Add comments.	2019-04-29 19:23:53 +02:00
Gian Merlino	ce7298b51e	BaseAppenderatorDriver: Fix potentially overeager segment cleanup. (#7558 ) * BaseAppenderatorDriver: Fix potentially overeager segment cleanup. Here is a thing that I think can go wrong: 1. We push some segments, then try to publish them transactionally. 2. The segments are actually published, but the 200 OK response gets lost (connection dropped, whatever). 3. We try again, and on the second try, the publish fails (because the transaction baseline start metadata no longer matches). 4. Because the publish failed, we delete the pushed segments. 5. But this is bad, because the publish didn't really fail, it actually succeeded in step 2. I haven't seen this in the wild, but thought about it while reviewing #7537. This patch also cleans up logging a bit, making it more accurate and somewhat less chatty. * Avoid wrapping exceptions when not necessary.	2019-04-29 09:55:04 -07:00
Justin Borromeo	07dd742e35	Fix time-ordered scan queries on realtime segments (#7546 ) * Initial commit * Added test for int to long conversion * Add appenderator test for realtime scan query * get rid of todo * Fix forbidden apis * Jon's recommendations * Formatting	2019-04-26 16:12:10 -07:00
Adam Peck	ebdf07b69f	Add reload by interval API (#7490 ) * Add reload by interval API Implements the reload proposal of #7439 Added tests and updated docs * PR updates * Only build timeline with required segments Use 404 with message when a segmentId is not found Fix typo in doc Return number of segments modified. * Fix checkstyle errors * Replace String.format with StringUtils.format * Remove return value * Expand timeline to segments that overlap for intervals Restrict update call to only segments that need updating. * Only add overlapping enabled segments to the timeline * Some renames for clarity Added comments * Don't rely on cached poll data Only fetch required information from DB * Match error style * Merge and cleanup doc * Fix String.format call * Add unit tests * Fix unit tests that check for overshadowing	2019-04-26 16:01:50 -07:00
Surekha	8308ffef1f	API to drop data by interval (#7494 ) * Add api to drop data by interval * update to address comments * unused imports * PR comments + add tests in SQLMetadataSegmentManagerTest * update tests and docs	2019-04-25 14:24:40 -07:00
Jihoon Son	c60e7feab8	Fix encoded taskId check in chatHandlerResource (#7520 ) * Fix encoded taskId check in chatHandlerResource * fix tests	2019-04-20 18:08:34 -07:00
Surekha	c2a42e05bb	Fix result-level cache for queries (#7325 ) * Add SegmentDescriptor interval in the hash while calculating Etag * Add computeResultLevelCacheKey to CacheStrategy Make HavingSpec cacheable and implement getCacheKey for subclasses Add unit tests for computeResultLevelCacheKey * Add more tests * Use CacheKeyBuilder for HavingSpec's getCacheKey * Initialize aggregators map to avoid NPE * adjust cachekey builder for HavingSpec to ignore aggregators * unused import * PR comments	2019-04-18 13:31:29 -07:00
Xavier Léauté	4322ce3303	Java 9 compatible cleaner operations (#7487 ) Java 9 removed support for sun.misc.Cleaner in favor of java.lang.ref.Cleaner. This change adds a thin abstraction to switch between Cleaner implementations based on JDK version at runtime	2019-04-17 08:04:52 -07:00
Lucas Capistrant	8acad27d99	Enhance the Http Firehose to work with URIs requiring basic authentication (#7145 ) * Enhance the HttpFirehose to work with both insecure URIs and URIs requiring basic authentication * Improve security of enhanced HttpFirehoseFactory by not logging auth credentials * Fix checkstyle failure in HttpFirehoseFactory.java * Update docs and fix TeamCity build with required noinspection * Indentation cleanup and logic modification for HttpFirehose object stream * Remove default Empty string password provider in http firehose * Add JavaDoc for MixIn describing its intended use * Reverting documentation notation for json code to be inline with rest of doc * Improve instantiation of ObjectMappers that require MixIn for redacting password from task logs * Add comment to clarify fully qualified references of Objects in SQLMetadataStorageActionHandler	2019-04-15 14:29:01 -07:00
Surekha	4654e1e851	Remove unnecessary collection (#7350 ) From the discussion [here](https://github.com/apache/incubator-druid/pull/6901#discussion_r265741002) Remove the collection and filter datasources from the stream. Also remove StreamingOutput and JsonFactory constructs.	2019-04-15 19:49:21 +02:00
Gian Merlino	3854cfd15e	SQLMetadataSegmentManager: Comments, formatting adjustments (#7452 ) Follow up to #7447.	2019-04-11 21:57:50 -07:00
Gian Merlino	a517f8ce49	Coordinator: Allow dropping all segments. (#7447 ) Removes the coordinator sanity check that prevents it from dropping all segments. It's useful to get rid of this, since the behavior is unintuitive for dev/testing clusters where users might regularly want to drop all their data to get back to a clean slate. But the sanity check was there for a reason: to prevent a race condition where the coordinator might drop all segments if it ran before the first metadata store poll finished. This patch addresses that concern differently, by allowing methods in MetadataSegmentManager to return null if a poll has not happened yet, and canceling coordinator runs in that case. This patch also makes the "dataSources" reference in SQLMetadataSegmentManager volatile. I'm not sure why it wasn't volatile before, but it seems necessary to me: it's not final, and it's dereferenced from multiple threads without synchronization.	2019-04-11 08:45:38 -07:00
Justin Borromeo	2771ed50b0	Support Kafka supervisor adopting running tasks between versions (#7212 ) * Recompute hash in isTaskCurrent() and added tests * Fixed checkstyle stuff * Fixed failing tests * Make TestableKafkaSupervisorWithCustomIsTaskCurrent static * Add doc * baseSequenceName change * Added comment * WIP * Fixed imports * Undid lambda change for diff sake * Cleanup * Added comment * Reinsert Kafka tests * Readded kinesis test * Readd bad partition assignment in kinesis supervisor test * Nit * Misnamed var	2019-04-10 18:16:38 -07:00
Clint Wylie	76b4a5c62e	refactor lookups to be more chill to router (#7222 ) * refactor lookups to be more chill to router * remove accidental change * fix and combine LookupIntrospectionResourceTest * fix inspection * rename RouterLookupModule to LookupSerdeModule and RouterLookupExtractorFactoryContainerProvider to NoopLookupExtractorFactoryContainerProvider * make comment generic * use ConfigResourceFilter instead of StateResourceFilter * fix indentation * unused import * another unused import * refactor some stuff into processing module, split up LookupModule.java classes into their own files	2019-04-05 14:49:41 -07:00
Gian Merlino	78745fea84	Fix two issues with Coordinator -> Overlord communication. (#7412 ) * Fix two issues with Coordinator -> Overlord communication. 1) ClientCompactQuery needs to recognize the potential for 'intervals' to be set instead of 'segments'. The lack of this led to a NullPointerException on DruidCoordinatorSegmentCompactor.java:102. 2) In two locations (DruidCoordinatorSegmentCompactor, DruidCoordinatorCleanupPendingSegments) tasks were being retrieved using waiting/pending/running tasks in the wrong order: by checking 'running' first and then 'pending', tasks could be missed if they moved from 'pending' to 'running' in between the two calls. Replaced these methods with calls to 'getActiveTasks', a new method that does the calls in the right order. * Remove unused import.	2019-04-04 10:25:18 -07:00
David Glasser	4e23c11345	Make IngestSegmentFirehoseFactory splittable for parallel ingestion (#7048 ) * Make IngestSegmentFirehoseFactory splittable for parallel ingestion * Code review feedback - Get rid of WindowedSegment - Don't document 'segments' parameter or support splitting firehoses that use it - Require 'intervals' in WindowedSegmentId (since it won't be written by hand) * Add missing @JsonProperty * Integration test passes * Add unit test * Remove two FIXME comments from CompactionTask I'd like to leave this PR in a potentially mergeable state, but I still would appreciate reviewer eyes on the questions I'm removing here. * Updates from code review	2019-04-02 14:59:17 -07:00
Michael Trelinski	347779b17a	Zookeeper loss (#6740 ) * Update init Fix bin/init to source from proper directory. * Fix for Proposal #6518: Shutdown druid processes upon complete loss of ZK connectivity * Zookeeper Loss: - Add feature documentation - Cosmetic refactors - Variable extractions - Remove getter * - Change config key name and reword documentation - Switch from Function<Void,Void> to Runnable/Lambda - try { … } finally { … } * Fix line length too long * - change to formatted string for logging - use System.err.println after lifecycle stops * commenting on makeEnsembleProvider()-created Zookeeper termination * Add javadoc * added java doc reference back to apache discussion thread. * move comment to other class * favor two-slash comments instead of multiline comments	2019-03-29 15:10:42 -07:00
Jihoon Son	62c3e89266	maxTotalRows should be checked in DataSourceCompactionConfig before setting targetCompactionSizeBytes (#7368 ) * maxTotalRows should be checked in DataSourceCompactionConfig before setting targetCompactionSizeBytes * remove unnecessary default values * remove flaky test * fix build * Add comments	2019-03-28 20:25:10 -07:00
Justin Borromeo	ad7862c58a	Time Ordering On Scans (#7133 ) * Moved Scan Builder to Druids class and started on Scan Benchmark setup * Need to form queries * It runs. * Stuff for time-ordered scan query * Move ScanResultValue timestamp comparator to a separate class for testing * Licensing stuff * Change benchmark * Remove todos * Added TimestampComparator tests * Change number of benchmark iterations * Added time ordering to the scan benchmark * Changed benchmark params * More param changes * Benchmark param change * Made Jon's changes and removed TODOs * Broke some long lines into two lines * nit * Decrease segment size for less memory usage * Wrote tests for heapsort scan result values and fixed bug where iterator wasn't returning elements in correct order * Wrote more tests for scan result value sort * Committing a param change to kick teamcity * Fixed codestyle and forbidden API errors * . * Improved conciseness * nit * Created an error message for when someone tries to time order a result set > threshold limit * Set to spaces over tabs * Fixing tests WIP * Fixed failing calcite tests * Kicking travis with change to benchmark param * added all query types to scan benchmark * Fixed benchmark queries * Renamed sort function * Added javadoc on ScanResultValueTimestampComparator * Unused import * Added more javadoc * improved doc * Removed unused import to satisfy PMD check * Small changes * Changes based on Gian's comments * Fixed failing test due to null resultFormat * Added config and get # of segments * Set up time ordering strategy decision tree * Refactor and pQueue works * Cleanup * Ordering is correct on n-way merge -> still need to batch events into ScanResultValues * WIP * Sequence stuff is so dirty :( * Fixed bug introduced by replacing deque with list * Wrote docs * Multi-historical setup works * WIP * Change so batching only occurs on broker for time-ordered scans Restricted batching to broker for time-ordered queries and adjusted tests Formatting 
Cleanup * Fixed mistakes in merge * Fixed failing tests * Reset config * Wrote tests and added Javadoc * Nit-change on javadoc * Checkstyle fix * Improved test and appeased TeamCity * Sorry, checkstyle * Applied Jon's recommended changes * Checkstyle fix * Optimization * Fixed tests * Updated error message * Added error message for UOE * Renaming * Finish rename * Smarter limiting for pQueue method * Optimized n-way merge strategy * Rename segment limit -> segment partitions limit * Added a bit of docs * More comments * Fix checkstyle and test * Nit comment * Fixed failing tests -> allow usage of all types of segment spec * Fixed failing tests -> allow usage of all types of segment spec * Revert "Fixed failing tests -> allow usage of all types of segment spec" This reverts commit `ec470288c7`. * Revert "Merge branch '6088-Time-Ordering-On-Scans-N-Way-Merge' of github.com:justinborromeo/incubator-druid into 6088-Time-Ordering-On-Scans-N-Way-Merge" This reverts commit `57033f36df`, reversing changes made to `8f01d8dd16`. * Check type of segment spec before using for time ordering * Fix bug in numRowsScanned * Fix bug messing up count of rows * Fix docs and flipped boolean in ScanQueryLimitRowIterator * Refactor n-way merge * Added test for n-way merge * Refixed regression * Checkstyle and doc update * Modified sequence limit to accept longs and added test for long limits * doc fix * Implemented Clint's recommendations	2019-03-28 14:37:09 -07:00
Charles Allen	eeb3dbe79d	Move GCP to a core extension (#6953 ) * Move GCP to a core extension * Don't provide druid-core >.< * Keep AWS and GCP modules separate * Move AWSModule to its own module * Add aws ec2 extension and more modules in more places * Fix bad imports * Fix test jackson module * Include AWS and GCP core in server * Add simple empty method comment * Update version to 15 * One more 0.13.0-->0.15.0 change * Fix multi-binding problem * Grep for s3-extensions and update docs * Update extensions.md	2019-03-27 09:00:43 -07:00
Jihoon Son	543324f8a9	Fix logging in IndexerSQLMetadataStorageCoordinator (#7349 )	2019-03-26 20:36:19 -07:00
Jihoon Son	4d37edac1e	Suppress stack trace in warning (#7348 )	2019-03-26 17:27:29 -07:00
Jihoon Son	5294277cb4	Fix exclusive start partitions for sequenceMetadata (#7339 ) * Fix exclusive start partitions for sequenceMetadata * add empty check	2019-03-26 14:39:07 -07:00
Roman Leventov	bca40dcdaf	Fix some IntelliJ inspections (#7273 ) Prepare TeamCity for IntelliJ 2018.3.1 upgrade. Mostly removed redundant exceptions declarations in `throws` clauses.	2019-03-25 21:11:01 -03:00
Jihoon Son	f410c28af6	Always convert start metadata to start (#7332 )	2019-03-22 21:12:15 -07:00
Jihoon Son	0c5dcf5586	Fix exclusivity for start offset in kinesis indexing service & check exclusivity properly in IndexerSQLMetadataStorageCoordinator (#7291 ) * Fix exclusivity for start offset in kinesis indexing service * some adjustment * Fix SeekableStreamDataSourceMetadata * Add missing javadocs * Add missing comments and unit test * fix SeekableStreamStartSequenceNumbers.plus and add comments * remove extra exclusivePartitions in KafkaIOConfig and fix downgrade issue * Add javadocs * fix compilation * fix test * remove unused variable	2019-03-21 13:12:22 -07:00
Roman Leventov	dfd27e00c0	Avoid many unnecessary materializations of collections of 'all segments in cluster' cardinality (#7185 ) * Avoid many unnecessary materializations of collections of 'all segments in cluster' cardinality * Fix DruidCoordinatorTest; Renamed DruidCoordinator.getReplicationStatus() to computeUnderReplicationCountsPerDataSourcePerTier() * More Javadocs, typos, refactor DruidCoordinatorRuntimeParams.createAvailableSegmentsSet() * Style * typo * Disable StaticPseudoFunctionalStyleMethod inspection because of too much false positives * Fixes	2019-03-19 18:22:56 -03:00
Jihoon Son	e18d5d96d9	Ignore bad JSON entries in SQLMetadataSupervisorManager.getAll() (#7278 )	2019-03-18 14:28:11 +08:00
Jihoon Son	892d1d35d6	Deprecate NoneShardSpec and drop support for automatic segment merge (#6883 ) * Deprecate noneShardSpec * clean up noneShardSpec constructor * revert unnecessary change * Deprecate mergeTask * add more doc * remove convert from indexMerger * Remove mergeTask * remove HadoopDruidConverterConfig * fix build * fix build * fix teamcity * fix teamcity * fix ServerModule * fix compilation * fix compilation	2019-03-15 23:29:25 -07:00
Atul Mohan	2daeb50008	Add support for optional client authentication on TLS (#7250 ) * Add optional client auth * Add docs	2019-03-15 15:14:34 -07:00
Furkan KAMACI	7ada1c49f9	Prohibit Throwables.propagate() (#7121 ) * Throw caught exception. * Throw caught exceptions. * Related checkstyle rule is added to prevent further bugs. * RuntimeException() is used instead of Throwables.propagate(). * Missing import is added. * Throwables are propagated if possible. * Throwables are propagated if possible. * Throwables are propagated if possible. * Throwables are propagated if possible. * * Checkstyle definition is improved. * Throwables.propagate() usages are removed. * Checkstyle pattern is changed for only scanning "Throwables.propagate(" instead of checking lookbehind. * Throwable is kept before firing a Runtime Exception. * Fix unused assignments.	2019-03-14 18:28:33 -03:00
Hongze Zhang	f9d99b245b	Add missing doc link for operations/http-compression.html; Fix magic numbers in test cases using JettyServerInitUtils.wrapWithDefaultGzipHandler (#7110 )	2019-03-13 14:09:19 -07:00
Clint Wylie	3895914aa2	consolidate CompressionUtils.java since now in the same jar (#6908 )	2019-03-13 11:02:44 -04:00
Clint Wylie	4d3987c1dd	lifecycle stage refactor to ensure proper start and stop ordering of servers and announcements (#7234 ) * lifecycle stage refactor to ensure proper ordering of servers and announcements * move DerivativeDataSourceManager to Lifecycle.Stage.NORMAL	2019-03-12 07:09:03 -07:00
Jihoon Son	e240fba247	Fix logs in SegmentLoaderLocalCacheManager (#7229 )	2019-03-11 21:16:03 -07:00
Gian Merlino	dcfca03718	More accurate RealtimeMetricsMonitor messages. (#7230 ) The old messages did not reflect the full range of reasons why messages could be thrown away.	2019-03-11 19:50:32 -04:00
Samarth Jain	8804bd0dc1	Remove unnecessary check for contains() in LoadRule (#7073 ) See https://github.com/apache/incubator-druid/issues/7072	2019-03-11 13:52:46 -03:00
Clint Wylie	5cc171419c	move jetty module to Lifecycle.Stage.LAST to allow graceful shutdown to work with lookups and stuff, put http-client on lifecycle modules lifecycle (#7215 )	2019-03-09 15:14:09 -08:00
Jihoon Son	9bebf113ba	Fix race in historical when loading segments in parallel (#7203 ) * Fix race in historical when loading segments in parallel * revert unnecessary change * remove synchronized * add reference counting locking * fix build * fix comment	2019-03-08 17:54:05 -08:00
Clint Wylie	a44df6522c	rename maintenance mode to decommission (#7154 ) * rename maintenance mode to decommission * review changes * missed one * fix straggler, add doc about decommissioning stalling if no active servers * fix missed typo, docs * refine docs * doc changes, replace generals * add explicit comment to mention suppressed stats for balanceTier * rename decommissioningVelocity to decommissioningMaxSegmentsToMovePercent and update docs * fix precondition check * decommissioningMaxPercentOfMaxSegmentsToMove * fix test * fix test * fixes	2019-03-08 16:33:51 -08:00
Roman Leventov	10c9f6d708	Fix and document concurrency of EventReceiverFirehose and TimedShutoffFirehose; Refine concurrency specification of Firehose (#7038 ) #### `EventReceiverFirehoseFactory` Fixed several concurrency bugs in `EventReceiverFirehoseFactory`: - Race condition over putting an entry into `producerSequences` in `checkProducerSequence()`. - `Stopwatch` used to measure time across threads, but it's a non-thread-safe class. - Use `System.nanoTime()` instead of `System.currentTimeMillis()` because the latter is [not suitable](https://stackoverflow.com/a/351571/648955) for measuring time intervals. - `close()` was not synchronized but could be called from multiple threads concurrently. Removed unnecessary `readLock` (protecting `hasMore()` and `nextRow()` which are always called from a single thread). Removed unnecessary `volatile` modifiers. Documented threading model and concurrent control flow of `EventReceiverFirehose` instances. Important: please read the updated Javadoc for `EventReceiverFirehose.addAll()`. It allows events from different requests (batches) to be interleaved in the buffer. Is this OK? #### `TimedShutoffFirehoseFactory` - Fixed a race condition that was possible because `close()` was not properly synchronized. Documented threading model and concurrent control flow of `TimedShutoffFirehose` instances. #### `Firehose` Refined concurrency contract of `Firehose` based on `EventReceiverFirehose` implementation. Importantly, now it states that `close()` doesn't affect `hasMore()` and `nextRow()` and could be called concurrently with them. In other words, specified that `close()` is for "row supply" side rather than "row consume" side. However, I didn't check that other `Firehose` implementations adhere to this contract. This issue is the result of reviewing `EventReceiverFirehose` and `TimedShutoffFirehose` using [this checklist](https://medium.com/@leventov/code-review-checklist-java-concurrency-49398c326154).	2019-03-04 18:50:03 -03:00
Himanshu Pandey	8b803cbc22	Added checkstyle for "Methods starting with Capital Letters" (#7118 ) * Added checkstyle for "Methods starting with Capital Letters" and changed the method names violating this. * Un-abbreviate the method names in the calcite tests * Fixed checkstyle errors * Changed asserts position in the code	2019-02-23 20:10:31 -08:00
David Glasser	1c2753ab90	ParallelIndexSubTask: support ingestSegment in delegating factories (#7089 ) IndexTask had special-cased code to properly send a TaskToolbox to a IngestSegmentFirehoseFactory that's nested inside a CombiningFirehoseFactory, but ParallelIndexSubTask didn't. This change refactors IngestSegmentFirehoseFactory so that it doesn't need a TaskToolbox; it instead gets a CoordinatorClient and a SegmentLoaderFactory directly injected into it. This also refactors SegmentLoaderFactory so it doesn't depend on an injectable SegmentLoaderConfig, since its only method always replaces the preconfigured SegmentLoaderConfig anyway. This makes it possible to use SegmentLoaderFactory without setting druid.segmentCaches.locations to some dummy value. Another goal of this PR is to make it possible for IngestSegmentFirehoseFactory to list data segments outside of connect() --- specifically, to make it a FiniteFirehoseFactory which can query the coordinator in order to calculate its splits. See #7048. This also adds missing datasource name URL-encoding to an API used by CoordinatorBasedSegmentHandoffNotifier.	2019-02-23 17:02:56 -08:00
Jihoon Son	4e2b085201	Remove DataSegmentFinder, InsertSegmentToDb, and descriptor.json file in deep storage (#6911 ) * Remove DataSegmentFinder, InsertSegmentToDb, and descriptor.json file * delete descriptor.file when killing segments * fix test * Add doc for ha * improve warning	2019-02-20 15:10:29 -08:00
Mingming Qiu	dd34691004	Coordinator await initialization before finishing startup (#6847 ) * Curator server inventory await initialization * address comments * print exception object in log * remove throws ISE * cachingCost awaitInitialization default to false	2019-02-20 11:56:23 -08:00
Justin Borromeo	c7eeeabf45	2528 Replace Incremental Index Global Flags with Getters (#7043 ) * Eliminated reportParseExceptions and deserializeComplexMetrics * Removed more global flags * Cleanup * Addressed Surekha's recommendations	2019-02-15 13:36:46 -08:00
Jihoon Son	1701fbcad3	Improve error message for revoked locks (#7035 ) * Improve error message for revoked locks * fix test * fix test * fix test * fix toString	2019-02-13 11:22:48 -08:00
Jihoon Son	d42de574d6	Add an api to get all lookup specs (#7025 ) * Add an api to get all lookup specs * add doc	2019-02-08 11:05:59 -08:00
Jonathan Wei	fafbc4a80e	Set version to 0.15.0-incubating-SNAPSHOT (#7014 )	2019-02-07 14:02:52 -08:00
Jonathan Wei	8bc5eaa908	Set version to 0.14.0-incubating-SNAPSHOT (#7003 )	2019-02-04 19:36:20 -08:00
Egor Riashin	97b6407983	maintenance mode for Historical (#6349 ) * maintenance mode for Historical forbidden api fix, config deserialization fix logging fix, unit tests * addressed comments * addressed comments * a style fix * addressed comments * a unit-test fix due to recent code-refactoring * docs & refactoring * addressed comments * addressed a LoadRule drop flaw * post merge cleaning up	2019-02-04 18:11:00 -08:00
Roman Leventov	0e926e8652	Prohibit assigning concurrent maps into Map-typed variables and fields and fix a race condition in CoordinatorRuleManager (#6898 ) * Prohibit assigning concurrent maps into Map-types variables and fields; Fix a race condition in CoordinatorRuleManager; improve logic in DirectDruidClient and ResourcePool * Enforce that if compute(), computeIfAbsent(), computeIfPresent() or merge() is called on a ConcurrentHashMap, it's stored in a ConcurrentHashMap-typed variable, not ConcurrentMap; add comments explaining get()-before-computeIfAbsent() optimization; refactor Counters; fix a race condition in Intialization.java * Remove unnecessary comment * Checkstyle * Fix getFromExtensions() * Add a reference to the comment about guarded computeIfAbsent() optimization; IdentityHashMap optimization * Fix UriCacheGeneratorTest * Workaround issue with MaterializedViewQueryQueryToolChest * Strengthen Appenderator's contract regarding concurrency	2019-02-04 09:18:12 -08:00
Surekha	7baa33049c	Introduce published segment cache in broker (#6901 ) * Add published segment cache in broker * Change the DataSegment interner so it's not based on DataSEgment's equals only and size is preserved if set * Added a trueEquals to DataSegment class * Use separate interner for realtime and historical segments * Remove trueEquals as it's not used anymore, change log message * PR comments * PR comments * Fix tests * PR comments * Few more modification to * change the coordinator api * removeall segments at once from MetadataSegmentView in order to serve a more consistent view of published segments * Change the poll behaviour to avoid multiple poll execution at same time * minor changes * PR comments * PR comments * Make the segment cache in broker off by default * Added a config to PlannerConfig * Moved MetadataSegmentView to sql module * Add doc for new planner config * Update documentation * PR comments * some more changes * PR comments * fix test * remove unintentional change, whether to synchronize on lifecycleLock is still in discussion in PR * minor changes * some changes to initialization * use pollPeriodInMS * Add boolean cachePopulated to check if first poll succeeds * Remove poll from start() * take the log message out of condition in stop()	2019-02-02 22:27:13 -08:00
Vadim Ogievetsky	7f1b19bfb1	Adding a Unified web console. (#6923 ) * Adding new web console. * fixed css * fix form height * fix typo * do import custom react-table css * added repo field so npm does not complain * ask travis for node 10 * move indexing-service/src/main/resources/indexer_static into web-console * fix resource names and paths * add licenses * fix exclude file * add licenses to misc files and tidy up * remove rebase marker * fix link * updated env variable name * tidy up licenses and surface errors * cleanup * remove unused code, fix missing await * TeamCity does not like the name aux * add more links to tasks view * rm pages * update gitignore * update readme to be accurate * make clean script * removed old console dependancy * update Jetty routes * add a comment for welcome files for coordinator * do not show inital notifaction for now * renamed overlord console back to console.html * fix coordinator console * rename coordinator-console.html to index.html	2019-01-31 17:26:41 -08:00
Jihoon Son	e56c598cc1	Fall back to the old coordinator API for checking segment handoff if new one is not supported (#6966 )	2019-01-31 08:50:46 -08:00
Benedict Jin	72a571fbf7	For performance reasons, use `java.util.Base64` instead of Base64 in Apache Commons Codec and Guava (#6913 ) * * Add few methods about base64 into StringUtils * Use `java.util.Base64` instead of others * Add org.apache.commons.codec.binary.Base64 & com.google.common.io.BaseEncoding into druid-forbidden-apis * Rename encodeBase64String & decodeBase64String * Update druid-forbidden-apis	2019-01-25 17:32:29 -08:00
Roman Leventov	8eae26fd4e	Introduce SegmentId class (#6370 ) * Introduce SegmentId class * tmp * Fix SelectQueryRunnerTest * Fix indentation * Fixes * Remove Comparators.inverse() tests * Refinements * Fix tests * Fix more tests * Remove duplicate DataSegmentTest, fixes #6064 * SegmentDescriptor doc * Fix SQLMetadataStorageUpdaterJobHandler * Fix DataSegment deserialization for ignoring id * Add comments * More comments * Address more comments * Fix compilation * Restore segment2 in SystemSchemaTest according to a comment * Fix style * fix testServerSegmentsTable * Fix compilation * Add comments about why SegmentId and SegmentIdWithShardSpec are separate classes * Fix SystemSchemaTest * Fix style * Compare SegmentDescriptor with SegmentId in Javadoc and comments rather than with DataSegment * Remove a link, see https://youtrack.jetbrains.com/issue/IDEA-205164 * Fix compilation	2019-01-21 11:11:10 -08:00
Clint Wylie	8ba33b2505	add 'init' lifecycle stage for finer control over startup and shutdown (#6864 ) * add Lifecycle.Stage.INIT, put log shutter downer in init stage, tests, rad startup banner * log cleanup * log changes * add task-master lifecycle to module lifecycle to gracefully stop task-master stuff * fix it the right way * remove announce spam * unused import * one more log * updated comments * wrap leadership lifecycle stop to prevent exceptions from wrecking rest of task master stop * add precondition check	2019-01-21 09:01:36 -08:00
Mingming Qiu	b704ebfa37	Let cachingCost balancer strategy only consider segment replicatable nodes (#6879 )	2019-01-17 09:26:33 -08:00
Jihoon Son	a07e66c540	Fix auto compaction to compact only same or abutting intervals (#6808 ) * Fix auto compaction to compact only same or abutting intervals * fix test	2019-01-16 14:54:11 -08:00
Dayue Gao	5b8a221713	Add SQL id, request logs, and metrics (#6302 ) * use SqlLifecyle to manage sql execution, add sqlId * add sql request logger * fix UT * rename sqlId to sqlQueryId, sql/time to sqlQuery/time, etc * add docs and more sql request logger impls * add UT for http and jdbc * fix forbidden use of com.google.common.base.Charsets * fix UT in QuantileSqlAggregatorTest, supressed unused warning of getSqlQueryId * do not use default method in QueryMetrics interface * capitalize 'sql' everywhere in the non-property parts of the docs * use RequestLogger interface to log sql query * minor bugfixes and add switching request logger * add filePattern configs for FileRequestLogger * address review comments, adjust sql request log format * fix inspection error * try SuppressWarnings("RedundantThrows") to fix inspection error on ComposingRequestLoggerProvider	2019-01-15 23:12:59 -08:00
Jonathan Wei	8537a771b0	Some fixes and tests for spaces/non-ASCII chars in datasource names (#6761 ) * Fixes and tests for spaces/non-ASCII datasource names * Some unit test fixes * Fix ITRealtimeIndexTaskTest * Checkstyle * TeamCity * PR comments	2019-01-15 08:33:31 -08:00
Surekha	f72f33f84a	Fix num_replicas count in sys.segments table (#6804 ) * Fix num_replicas count from sys.segments * Adjust unit test for num_replica > 1 * Pass named arguments instead of passing boolean constants * Address PR comments * PR comments	2019-01-15 08:31:29 -08:00
Charles Allen	5d2947cd52	Use Guava Compatible immediate executor service (#6815 ) * Use multi-guava version friendly direct executor implementation * Don't use a singleton * Fix strict compliation complaints * Copy Guava's DirectExecutor * Fix javadoc * Imports are the devil	2019-01-11 10:42:19 -08:00
Jihoon Son	c35a39d70b	Add support maxRowsPerSegment for auto compaction (#6780 ) * Add support maxRowsPerSegment for auto compaction * fix build * fix build * fix teamcity * add test * fix test * address comment	2019-01-10 09:50:14 -08:00
Mingming Qiu	8ebb7b558b	Handoff should ignore segments that are dropped by drop rules (#6676 ) * Handoff should ignore segments that are dropped by drop rules * fix travis-ci * fix tests * address comments * remove line added by accident * address comments * add javadoc and logging the full stack trace of exception * add error message	2019-01-07 14:43:11 -08:00
Mingming Qiu	636964fcb5	Fix issue that tasks failed because of no sink for identifier (#6724 ) * Fix issue that tasks failed because of no sink for identifier * make find sinks to persist run in one callable together with the actual persist work * Revert "make find sinks to persist run in one callable together with the actual persist work" This reverts commit `a24a2d80ae`.	2019-01-04 17:09:11 -08:00
elloooooo	832a3b16ed	Improve slfj logger input for MDC field:datasource (#6787 ) * improve slfj logger MDC datasource input * add some UT and isNested field	2019-01-03 18:00:04 -08:00
Jihoon Son	9ad6a733a5	Add support segmentGranularity for CompactionTask (#6758 ) * Add support segmentGranularity * add doc and fix combination of options * improve doc	2019-01-03 17:50:45 -08:00
Mingming Qiu	114a9fc38f	change propertyBase in ServerViewModule (#6774 )	2019-01-02 16:44:02 +08:00
Jihoon Son	fa7cb906e4	Fix auto compaction to consider intervals of running tasks (#6767 ) * Fix auto compaction to consider intervals of running tasks * adjust initial collection size	2018-12-27 18:03:44 -08:00
Gian Merlino	7a09cde4de	Broker: Await initialization before finishing startup. (#6742 ) * Broker: Await initialization before finishing startup. In particular, hold off on announcing the service and starting the HTTP server until the server view and SQL metadata cache are finished initializing. This closes a window of time where a Broker could return partial results shortly after startup. As part of this, some simplification of server-lifecycle service announcements. This helps ensure that the two different kinds of announcements we do (legacy and new-style) stay in sync. * Remove unused imports. * Fix NPE in ServerRunnable.	2018-12-18 20:32:31 -08:00
Clint Wylie	9505074530	fix log typo (#6755 ) * fix log typo, add DataSegmentUtils.getIdentifiersString util method * fix indecisive oops	2018-12-18 15:10:25 -08:00
Jihoon Son	f0ee6bf898	Fix auto compaction when the firstSegment is in skipOffset (#6738 ) * Fix auto compaction when the firstSegment is in skipOffset * remove duplicate	2018-12-18 19:10:46 +08:00
Clint Wylie	486c6f3cf9	emit logs that are only useful for debugging at debug level (#6741 ) * make logs that are only useful for debugging be at debug level so log volume is much more chill * info level messages for total merge buffer allocated/free * more chill compaction logs	2018-12-17 14:20:28 +08:00
Jonathan Wei	c713116a75	Use @Coordinator leader client in CoordinatorRuleManager (#6729 )	2018-12-16 15:18:09 -08:00
Gian Merlino	04e7c7fbdc	FilteredRequestLogger: Fix start/stop, invalid delegate behavior. (#6637 ) * FilteredRequestLogger: Fix start/stop, invalid delegate behavior. Fixes two bugs: 1) FilteredRequestLogger did not start/stop the delegate. 2) FilteredRequestLogger would ignore an invalid delegate type, and instead silently substitute the "noop" logger. This was due to a larger problem with RequestLoggerProvider setup in general; the fix here is to remove "defaultImpl" from the RequestLoggerProvider interface, and instead have JsonConfigurator be responsible for creating the default implementations. It is stricter about things than the old system was, and is only willing to make a noop logger if it doesn't see any request logger configs. Otherwise, it'll raise a provision error. * Remove unneeded annotations.	2018-12-14 16:55:44 +08:00
dongyifeng	91e3cf7196	add charset UTF-8 to log api (#6709 ) When I retrieve the task log in browser, the Chinese characters all end up as garbage. ![image](https://user-images.githubusercontent.com/1322134/49502749-bd614080-f8b0-11e8-839e-07f7117eebfd.png) After adding charset UTF-8, it was correct. ![image](https://user-images.githubusercontent.com/1322134/49502804-dc5fd280-f8b0-11e8-916b-bda8f1e7f318.png)	2018-12-12 16:31:04 +01:00
Atul Mohan	86e3ae5b48	Add fail message (#6720 )	2018-12-11 08:05:50 -08:00
Mingming Qiu	e8dd3716b8	add close method in Cache interface (#6540 ) * add close method in Cache interface * address comments * address comments and fix travis-ci * use try-finally	2018-12-06 17:28:41 +08:00
Mingming Qiu	607339003b	Add TaskCountStatsMonitor to monitor task count stats (#6657 ) * Add TaskCountStatsMonitor to monitor task count stats * address comments * add file header * tweak test	2018-12-04 13:37:17 -08:00
Clint Wylie	a1c9d0add2	autosize processing buffers based on direct memory sizing by default (#6588 ) * autosize processing buffers based on direct memory sizing * remove oops, more test * max 1gb autosize buffers, test, start of docs * fix oops * revert accidental change * print buffer size in exception * change the things	2018-12-03 18:40:02 -07:00
Clint Wylie	43adb391c2	remove AbstractResourceFilter.isApplicable because it is not (#6691 ) * remove AbstractResourceFilter.isApplicable because it is not, add tests for OverlordResource.doShutdown and OverlordResource.shutdownTasksForDatasource * cleanup	2018-12-01 21:52:31 +08:00
Roman Leventov	ec38df7575	Simplify DruidNodeDiscoveryProvider; add DruidNodeDiscovery.Listener.nodeViewInitialized() (#6606 ) * Simplify DruidNodeDiscoveryProvider; add DruidNodeDiscovery.Listener.nodeViewInitialized() method; prohibit and eliminate some suboptimal Java 8 patterns * Fix style * Fix HttpEmitterTest.timeoutEmptyQueue() * Add DruidNodeDiscovery.Listener.nodeViewInitialized() calls in tests * Clarify code	2018-12-01 01:12:56 +01:00
Jihoon Son	d6539abd0a	Fix overlord api and console (#6686 ) * Fix overlord APIs and console * remove getRunningTasksByDataSource * add missing path to isApplicable	2018-11-29 23:45:28 -08:00
hate13	f4b49f01ff	add rule count on log (#6467 ) * add rule count on log * add final	2018-11-28 16:08:38 +08:00
Mingming Qiu	9a89200607	Emit query metrics even if the ETags are equal (#6663 )	2018-11-27 15:18:01 -08:00
Jihoon Son	219f0965dc	Remove duplicate DataSegmentTest (#6669 )	2018-11-27 15:13:39 -08:00
seoeun	22a5bf97a2	Fix issue that tasks tables in metadata storage are not cleared (#6592 ) * tasks tables in metadata storage are not cleared * address comments. remove tasklogs and revert obsolete changes * address comments. change comment and update doc. * address comments. update doc more detailed * address comments. remove redundant log and update doc more detailed. * address comments. update document	2018-11-22 11:50:31 +08:00
Gian Merlino	92cce04165	Fix missing default config in some calls to coordinator dynamic configs. (#6652 ) * Fix missing default config in some calls to coordinator dynamic configs. The lack of a default config meant that if someone called an API _without_ a default config before one _with_ a default config, then the default value would get stuck at null instead of the intended default value. I noticed this in a cluster where calling /druid/coordinator/v1/config before a coordinator had fully started up would lead to NPEs during DruidCoordinatorRuleRunner. This patch makes the default configs consistent across all calls. * Remove unnecessary null check.	2018-11-22 10:25:39 +08:00
Roman Leventov	87b96fb1fd	Add checkstyle rules about imports and empty lines between members (#6543 ) * Add checkstyle rules about imports and empty lines between members * Add suppressions * Update Eclipse import order * Add empty line * Fix StatsDEmitter	2018-11-20 12:42:15 +01:00
Roman Leventov	8f3fe9cd02	Prohibit String.replace() and String.replaceAll(), fix and prohibit some toString()-related redundancies (#6607 ) * Prohibit String.replace() and String.replaceAll(), fix and prohibit some toString()-related redundancies * Fix bug * Replace checkstyle regexp with IntelliJ inspection	2018-11-15 13:21:34 -08:00
Jihoon Son	0395d554e1	Properly reset total size of segmentsToCompact in NewestSegmentFirstIterator (#6622 ) * Properly reset total size of segmentsToCompact in NewestSegmentFirstIterator * add test	2018-11-15 01:00:51 -08:00
Jihoon Son	7b262b7123	Remove unnecessary path param from auto compaction api (#6594 ) * Remove unnecessary path param from auto compaction api * fix ci	2018-11-13 09:46:13 -08:00
David Lim	afb239b17a	add missing license headers, in particular to MD files; clean up RAT … (#6563 ) * add missing license headers, in particular to MD files; clean up RAT exclusions * revert inadvertent doc changes * docs * cr changes * fix modified druid-production.svg	2018-11-13 09:38:37 -08:00
Roman Leventov	54351a5c75	Fix various bugs; Enable more IntelliJ inspections and update error-prone (#6490 ) * Fix various bugs; Enable more IntelliJ inspections and update error-prone * Fix NPE * Fix inspections * Remove unused imports	2018-11-06 14:38:08 -08:00
Surekha	bcb754d066	Use current coordinator leader instead of cached one (#6551 ) (#6552 ) * Use current coordinator leader instead of cached one (#6551) Check the response status and throw exception if not OK * Modify tests * PR comment * Add the correct check for status of BytesAccumulatingResponseHandler * Move the status check into JsonParserIterator so sql query outputs meaningful message on failure * Fix tests	2018-11-06 13:09:51 -08:00
QiuMM	7b34662462	Period load/drop/broadcast rules should include the future by default (#6414 ) * Period load/drop/broadcast rules should include the future by default * address comments * adjust coordinator console and tweak docs * address comments * fix travis-ci	2018-11-01 09:43:34 -07:00
Roman Leventov	2cdce2e2a6	Add RequestLogEventBuilderFactory (#6477 ) This PR allows to control the fields in `RequestLogEvent`, emitted in `EmittingRequestLogger`. In our case, we want to get rid of the `intervals` fields of the query objects that are a part of `DefaultRequestLogEvent`. They are enormous (thousands of segments) and not useful. Related to #5522, FYI @a2l007.	2018-10-31 22:24:37 +01:00
QiuMM	676f5e6d7f	Prohibit some guava collection APIs and use JDK collection APIs directly (#6511 ) * Prohibit some guava collection APIs and use JDK APIs directly * reset files that changed by accident * sort codestyle/druid-forbidden-apis.txt alphabetically	2018-10-29 13:02:43 +01:00
Jonathan Wei	b2d9b6f23d	Allow custom TLS cert checks (#6432 ) * Allow custom TLS cert checks * PR comment * Checkstyle, PR comment	2018-10-24 16:31:52 -07:00
QiuMM	601183b4c7	Add period drop before rule (#6415 ) * Add period drop before rule * add license header * support period drop before rule in coordinator console * address comments	2018-10-24 12:44:30 -07:00
Roman Leventov	84ac18dc1b	Catch some incorrect method parameter or call argument formatting patterns with checkstyle (#6461 ) * Catch some incorrect method parameter or call argument formatting patterns with checkstyle * Fix DiscoveryModule * Inline parameters_and_arguments.txt * Fix a bug in PolyBind * Fix formatting	2018-10-23 07:17:38 -03:00
Faxian Zhao	c5bf4e7503	update insert pending segments logic to synchronous (#6336 ) * 1. Mysql default transaction isolation is REPEATABLE_READ, treat it as READ_COMMITTED will reduce insert id conflict. 2. Add an index to 'dataSource used end' is work well for the most of scenarios(get recently segments), and it will speed up sync add pending segments in DB. 3. 'select and insert' is not need within transaction. * Use TaskLockbox.doInCriticalSection instead of synchronized syntax to speed up insert pending segments. * fix typo for NullPointerException	2018-10-22 19:48:20 -07:00
Samarth Jain	359576a80b	Implement force push down for nested group by query (#5471 ) * Force nested query push down * Code review changes	2018-10-22 13:43:47 -07:00
QiuMM	f5f4171a45	QueryCountStatsMonitor: emit query/count (#6473 ) Let `QueryCountStatsMonitor` emit `query/count`, then I can monitor QPS of my services, or I have to count it by myself.	2018-10-19 10:15:02 -03:00
patelh	c780aacc03	Add ability to specify dbcp properties file (#6419 ) * Add ability to specify dbcp properties file * Address PR comments, use mock config, remove setter * Add documentation * APRC, updated docs with example file contents * APRC, add @Nullable, @VisibileForTesting, update doc * APRC, remove error log, use props directly as jackson binding * Remove unused files	2018-10-16 12:27:19 -07:00
QiuMM	85a89e2703	make druid node bind address configurable (#6464 ) * make druid node bind address configurable * fix tests * fix travis-ci	2018-10-15 14:19:40 -07:00
Roman Leventov	aa121da25f	Use NodeType enum instead of Strings (#6377 ) * Use NodeType enum instead of Strings * Make NodeType constants uppercase * Fix CommonCacheNotifier and NodeType/ServerType comments * Reconsidering comment * Fix import * Add a comment to CommonCacheNotifier.NODE_TYPES	2018-10-14 20:49:38 -07:00

3871 Commits