druid

Commit Graph

Author	SHA1	Message	Date
Jihoon Son	1524af703d	Fix IllegalArgumentException in TaskLockBox.syncFromStorage() (#6050 )	2018-07-27 10:43:32 -07:00
Surekha	414487a78e	Add support to filter on datasource for active tasks (#5998 ) * Add support to filter on datasource for active tasks * Added datasource filter to sql query for active tasks * Fixed unit tests * Address PR comments	2018-07-19 16:33:46 -07:00
Jihoon Son	c48aa74a30	Fix NPE while handling CheckpointNotice in KafkaSupervisor (#5996 ) * Fix NPE while handling CheckpointNotice * fix code style * Fix test * fix test * add a log for creating a new taskGroup * fix backward compatibility in KafkaIOConfig	2018-07-13 17:14:57 -07:00
Gian Merlino	04ea3c9f8c	Update license headers. (#5976 ) * Update license headers. For compliance with http://www.apache.org/legal/src-headers.html. * More license adjustments. * Fix mistakenly edited package line.	2018-07-11 09:55:18 -07:00
Gian Merlino	948e73da77	Extend various test timeouts. (#5978 ) False failures on Travis due to spurious timeout (in turn due to noisy neighbors) is a bigger problem than legitimate failures taking too long to time out. So it makes sense to extend timeouts.	2018-07-10 13:02:14 -07:00
Jihoon Son	1ccabab98e	Fix the broken Appenderator contract in KafkaIndexTask (#5905 ) * Fix broken Appenderator contract in KafkaIndexTask * fix build * add publishFuture * reuse sequenceToUse if possible	2018-07-03 13:31:29 -07:00
Jihoon Son	b6c957b0d2	Allow reordered segment allocation in kafka indexing service (#5805 ) * Allow reordered segment allocation in kafka indexing service * address comments * fix a bug	2018-07-02 15:09:12 -07:00
Surekha	933b25416c	Handle task deserialization failure in the tasks api (#5911 ) If task payload fails to deserialize json to Java, make the task null and handle null task in OverlordResource	2018-06-29 11:57:48 -07:00
Jonathan Wei	f3e1520360	Fix merge for TrueDimFilter (#5916 ) * Fix merge for TrueDimFilter * remove unused cache ID	2018-06-28 14:46:47 -07:00
Jihoon Son	8c5ded0fad	Splitting KafkaIndexTask for better code maintenance (#5854 ) * Refactoring KafkaIndexTask for better code maintenance * fix bug * fix test * add annotation * fix checkstyle * remove SetEndOffsetsResult	2018-06-22 13:00:03 -07:00
Surekha	8619adb5b9	Improve task retrieval APIs on Overlord (#5801 ) * Add the new tasks api in overlordResource It takes 4 optional query params * state(pending/running/waiting/compelte) * dataSource * interval (applies to completed tasks) * maxCompletedTasks (applies to completed tasks) If all params are null, the api returns all the tasks * Add the state to each task returned by tasks endpoint * divide active tasks into waiting, pending or running * Add more unit tests * Add UNKNOWN state to TaskState * Fix the authorization calls * WIP: PR comments Added new class to capture task info for caching Other refactoring * Refactoring : move TaskStatus class to druid-api so it can be accessed within server And other related classes like TaskState and TaskStatusPlus are in api * Remove unused class and apis accessing it * Add a separate cache for recently completed tasks This is to mainly capture the task type from payload * Ignore a test * Add a RuntimeTaskState to encompass all states a task can be in * Revert "Add a RuntimeTaskState to encompass all states a task can be in" This reverts commit `2a527a0731`. * Fix wrong api call * Fix and unignore tests * Remove waiting,pending state from TaskState * Add RunnerTaskState * Missed the annotation runnerStatusCode * Fix the creationTime * Fix the createdTime and queueInsertionTime for running/active tasks * Clean up tests * Add javadocs * Potentially fix the teamcity build * Address PR comments Get rid of TaskInfoBuilder Make TaskInfoMapper static nested class Other changes fix import in MaterializedViewSupervisor after merge * Address PR comments on * Replace global cache with local map * combine multiple queries into one * Removed unused code * Fix unit tests Fix a bug in securedTaskStatusPlus * Remove getRecentlyFinishedTaskStatuses method Change TaskInfoMapper signature to add generic type * Address PR comments * Passed datasource as argument to be used in sql query * Other minor fixes * Address PR comments Some minor changes, rename method, spacing changes Add early auth check if datasource is not null * Fix test case * Add max limit to getRecentlyFinishedTaskInfo in HeapMemoryTaskStorage * Add TaskLocation to Anytask object * Address PR comments * Fix a bug in test case causing ClassCastException	2018-06-19 11:34:59 -07:00
Jihoon Son	2feec44a55	Fix mismatch in revoked task locks between memory and metastore after sync from storage (#5858 ) * Fix mismatched revoked task locks after sync from storage * fix build * fix log * fix lock release	2018-06-12 10:25:34 -04:00
zhangxinyu	e43e5ebbcd	Materialized view implementation (#5556 ) * implement materialized view * modify code according to jihoonson's comments * modify code according to jihoonson's comments - 2 * add documentation about materialized view * use new HadoopTuningConfig in pr 5583 * add minDataLag and fix optimizer bug * correct value of DEFAULT_MIN_DATA_LAG_MS * modify code according to jihoonson's comments - 3 * use the boolean expression instead of if-else	2018-06-09 12:24:54 -07:00
Jonathan Wei	684b5d18c1	Moving averages for ingestion row stats (#5748 ) * Moving averages for ingestion row stats * PR comments * Make RowIngestionMeters extensible * test and checkstyle fixes * More PR comments * Fix metrics * Add some comments * PR comments * Comments	2018-06-05 09:08:57 -07:00
Atul Mohan	1b9611a60e	Local indexing from RDBMS (#5441 ) * Local indexing from RDBMS * Fix content * Remove pom changes * Remove extraneous space * Add tests and update documentation * Fix comments * Fix docs * Fix build related issue * Handle invalid strings * Make target database independent of metadata storage * Add firehose connector * Fix accessibility * Add docs * Remove unused def * Remove lazy instantiation of jsoniterator * Move unused changes * Move unused changes * Fix build * Make Sqlfirehose method private	2018-05-22 12:33:01 +09:00
Gian Merlino	f2cc6ce4d5	VersionedIntervalTimeline: Optimize construction with heavily populated holders. (#5777 ) * VersionedIntervalTimeline: Optimize construction with heavily populated holders. Each time a segment is "add"ed to a timeline, "isComplete" is called on the holder that it is added to. "isComplete" is an O(segments per chunk) operation, meaning that adding N segments to a chunk is an O(N^2) operation. This blows up badly if we have thousands of segments per chunk. The patch defers the "isComplete" check until after all segments have been inserted. * Fix imports.	2018-05-16 09:16:59 -07:00
Jihoon Son	9dca5ec76b	Simple cleanup for ThreadPoolTaskRunner and SetAndVerifyContextQueryRunner / Add ThreadPoolTaskRunnerTest (#5557 ) * Simple fix for ThreadPoolTaskRunner * fix build * address comments * update javadoc * fix build * fix test * add dependency	2018-05-15 22:53:11 +05:30
Jihoon Son	c9d645103b	Fix metrics for inserting segments (#5749 ) * Fix metrics for inserting segments * Add a comment	2018-05-08 13:07:39 -07:00
Kirill Kozlov	67d0b0ee42	Add taskType dimension to task metrics (#5664 )	2018-05-07 09:42:26 -07:00
Slim Bouguerra	8aa8d9fa5b	Kerberos Spnego Authentication Router Issue (#5706 ) * Adding decoration method to proxy servlet Change-Id: I872f9282fb60bfa20524271535980a36a87b9621 * moving the proxy request decoration to authenticators Change-Id: I7f94b9ff5ecf08e8abf7169b58bc410f33148448 * added docs Change-Id: I901543e52f0faf4666bfea6256a7c05593b1ae70 * use the authentication result to decorate request Change-Id: I052650de9cd02b4faefdbcdaf2332dd3b2966af5 * adding authenticated by name Change-Id: I074d2933460165feeddb19352eac9bd0f96f42ca * ensure that authenticator is not null Change-Id: Idb58e308f90db88224a06f3759114872165b24f5 * fix types and minor bug Change-Id: I6801d49a05d5d8324406fc0280286954eb66db10 * fix typo Change-Id: I390b12af74f44d760d0812a519125fbf0df4e97b * use actual type names Change-Id: I62c3ee763363781e52809ec912aafd50b8486b8e * set authenitcatedBy to null for AutheticationResults created by Escalator. Change-Id: I4a675c372f59ebd8a8d19c61b85a1e4bf227a8ba	2018-05-05 20:33:51 -07:00
kaijianding	c12c16385e	support throw duplcate row during realtime ingestion in RealtimePlumber (#5693 )	2018-05-04 10:12:25 -07:00
Surekha	13c616ba24	'maxBytesInMemory' tuningConfig introduced for ingestion tasks (#5583 ) * This commit introduces a new tuning config called 'maxBytesInMemory' for ingestion tasks Currently a config called 'maxRowsInMemory' is present which affects how much memory gets used for indexing.If this value is not optimal for your JVM heap size, it could lead to OutOfMemoryError sometimes. A lower value will lead to frequent persists which might be bad for query performance and a higher value will limit number of persists but require more jvm heap space and could lead to OOM. 'maxBytesInMemory' is an attempt to solve this problem. It limits the total number of bytes kept in memory before persisting. * The default value is 1/3(Runtime.maxMemory()) * To maintain the current behaviour set 'maxBytesInMemory' to -1 * If both 'maxRowsInMemory' and 'maxBytesInMemory' are present, both of them will be respected i.e. the first one to go above threshold will trigger persist * Fix check style and remove a comment * Add overlord unsecured paths to coordinator when using combined service (#5579) * Add overlord unsecured paths to coordinator when using combined service * PR comment * More error reporting and stats for ingestion tasks (#5418) * Add more indexing task status and error reporting * PR comments, add support in AppenderatorDriverRealtimeIndexTask * Use TaskReport instead of metrics/context * Fix tests * Use TaskReport uploads * Refactor fire department metrics retrieval * Refactor input row serde in hadoop task * Refactor hadoop task loader names * Truncate error message in TaskStatus, add errorMsg to task report * PR comments * Allow getDomain to return disjointed intervals (#5570) * Allow getDomain to return disjointed intervals * Indentation issues * Adding feature thetaSketchConstant to do some set operation in PostAgg (#5551) * Adding feature thetaSketchConstant to do some set operation in PostAggregator * Updated review comments for PR #5551 - Adding thetaSketchConstant * Fixed CI build issue * Updated review comments 2 for PR #5551 - Adding thetaSketchConstant * Fix taskDuration docs for KafkaIndexingService (#5572) * With incremental handoff the changed line is no longer true. * Add doc for automatic pendingSegments (#5565) * Add missing doc for automatic pendingSegments * address comments * Fix indexTask to respect forceExtendableShardSpecs (#5509) * Fix indexTask to respect forceExtendableShardSpecs * add comments * Deprecate spark2 profile in pom.xml (#5581) Deprecated due to https://github.com/druid-io/druid/pull/5382 * CompressionUtils: Add support for decompressing xz, bz2, zip. (#5586) Also switch various firehoses to the new method. Fixes #5585. * This commit introduces a new tuning config called 'maxBytesInMemory' for ingestion tasks Currently a config called 'maxRowsInMemory' is present which affects how much memory gets used for indexing.If this value is not optimal for your JVM heap size, it could lead to OutOfMemoryError sometimes. A lower value will lead to frequent persists which might be bad for query performance and a higher value will limit number of persists but require more jvm heap space and could lead to OOM. 'maxBytesInMemory' is an attempt to solve this problem. It limits the total number of bytes kept in memory before persisting. * The default value is 1/3(Runtime.maxMemory()) * To maintain the current behaviour set 'maxBytesInMemory' to -1 * If both 'maxRowsInMemory' and 'maxBytesInMemory' are present, both of them will be respected i.e. the first one to go above threshold will trigger persist * Address code review comments * Fix the coding style according to druid conventions * Add more javadocs * Rename some variables/methods * Other minor issues * Address more code review comments * Some refactoring to put defaults in IndexTaskUtils * Added check for maxBytesInMemory in AppenderatorImpl * Decrement bytes in abandonSegment * Test unit test for multiple sinks in single appenderator * Fix some merge conflicts after rebase * Fix some style checks * Merge conflicts * Fix failing tests Add back check for 0 maxBytesInMemory in OnHeapIncrementalIndex * Address PR comments * Put defaults for maxRows and maxBytes in TuningConfig * Change/add javadocs * Refactoring and renaming some variables/methods * Fix TeamCity inspection warnings * Added maxBytesInMemory config to HadoopTuningConfig * Updated the docs and examples * Added maxBytesInMemory config in docs * Removed references to maxRowsInMemory under tuningConfig in examples * Set maxBytesInMemory to 0 until used Set the maxBytesInMemory to 0 if user does not set it as part of tuningConfing and set to part of max jvm memory when ingestion task starts * Update toString in KafkaSupervisorTuningConfig * Use correct maxBytesInMemory value in AppenderatorImpl * Update DEFAULT_MAX_BYTES_IN_MEMORY to 1/6 max jvm memory Experimenting with various defaults, 1/3 jvm memory causes OOM * Update docs to correct maxBytesInMemory default value * Minor to rename and add comment * Add more details in docs * Address new PR comments * Address PR comments * Fix spelling typo	2018-05-03 16:25:58 -07:00
David Lim	8ec2d2fe18	Use unique segment paths for Kafka indexing (#5692 ) * support unique segment file paths * forbiddenapis * code review changes * code review changes * code review changes * checkstyle fix	2018-04-29 21:59:48 -07:00
Gian Merlino	762f8829e4	Add task action metrics, add taskId metric dimension. (#5714 ) * Add task action metrics, add taskId metric dimension. Adds two new metrics: task/action/log/time and task/action/run/time. Also adds taskId as a dimension, to give us the ability to drill down into metrics for an individual task. Also standardizes metrics-attachment using two helper methods in IndexTaskUtils. * Fix typo	2018-04-29 21:24:06 -07:00
Jihoon Son	23dc0d5b24	Better logging for taskLockBox (#5703 )	2018-04-28 21:08:10 -07:00
Roman Leventov	9be000758d	Refactor index merging, replace Rowboats with RowIterators and RowPointers (#5335 ) * Refactor index merging, replace Rowboats with RowIterators and RowPointers * Add javadocs * Fix a bug in QueryableIndexIndexableAdapter * Fixes * Remove unused declarations * Remove unused GenericColumn.isNull() method * Fix test * Address comments * Rearrange some code in MergingRowIterator for more clarity * Self-review * Fix style * Improve docs * Fix docs * Rename IndexMergerV9.writeDimValueAndSetupDimConversion to setUpDimConversion() * Update Javadocs * Minor fixes * Doc fixes, more code comments, cleanup of RowCombiningTimeAndDimsIterator * Fix doc link	2018-04-27 17:34:32 -07:00
Roman Leventov	a3a9ada843	Add GenericWhitespace checkstyle check (#5668 )	2018-04-24 01:09:14 +05:30
Jihoon Son	f349e03091	Fix NPE in compactionTask (#5613 ) * Fix NPE in compactionTask * more annotations for metadata * better error message for empty input * fix build * revert some null checks * address comments	2018-04-13 00:11:03 -04:00
Caroline1000	d709d1a59f	correct overlord console header. Because it's not the coordinator console, it's the overlord console. (#5627 )	2018-04-11 20:44:31 -04:00
Nishant Bangarwa	e6efd75a3d	Add config to allow setting up custom unsecured paths for druid nodes. (#5614 ) * Add config to allow setting up custom unsecured paths for druid nodes. * return all resources for Unsecured paths * review comment - Add test * fix tests * fix test	2018-04-11 17:10:07 -07:00
Jihoon Son	298ed1755d	Fix indexTask to respect forceExtendableShardSpecs (#5509 ) * Fix indexTask to respect forceExtendableShardSpecs * add comments	2018-04-05 23:54:59 -07:00
Jonathan Wei	969342cd28	More error reporting and stats for ingestion tasks (#5418 ) * Add more indexing task status and error reporting * PR comments, add support in AppenderatorDriverRealtimeIndexTask * Use TaskReport instead of metrics/context * Fix tests * Use TaskReport uploads * Refactor fire department metrics retrieval * Refactor input row serde in hadoop task * Refactor hadoop task loader names * Truncate error message in TaskStatus, add errorMsg to task report * PR comments	2018-04-05 21:38:57 -07:00
Jihoon Son	7239f56131	Fix NPE in RemoteTaskRunner when some tasks in ZooKeeper but not in Overlord (#5511 ) * Fix NPE in RemoteTaskRunner when some tasks in ZooKeeper but not in Overlord * revert unnecessary change	2018-04-03 21:15:58 -07:00
Jonathan Wei	723f7ac550	Add support for task reports, upload reports to deep storage (#5524 ) * Add support for task reports, upload reports to deep storage * PR comments * Better name for method * Fix report file upload * Use TaskReportFileWriter * Checkstyle * More PR comments	2018-04-02 12:10:56 -07:00
Kirill Kozlov	8878a7ff94	Replace guava Charsets with native java StandardCharsets (#5545 )	2018-03-28 21:00:08 -07:00
Jihoon Son	1ad898bde2	Use the official aws-sdk instead of jet3t (#5382 ) * Use the official aws-sdk instead of jet3t * fix compile and serde tests * address comments and fix test * add http version string * remove redundant dependencies, fix potential NPE, and fix test * resolve TODOs * fix build * downgrade jackson version to 2.6.7 * fix test * resolve the last TODO * support proxy and endpoint configurations * fix build * remove debugging log * downgrade hadoop version to 2.8.3 * fix tests * remove unused log * fix it test * revert KerberosAuthenticator change * change hadoop-aws scope to provided in hdfs-storage * address comments * address comments	2018-03-21 15:36:54 -07:00
Charles Allen	58f110f7f8	Future-proof some Guava usage (#5414 ) * Future-proof some Guava usage * Use a java-util EmptyIterator instead of Guava's * Change some of the guava future handling to do manual async transforms. Guava changes transform into transformAsync by deprecating transform in ONLY Guava 19. Then its gone in 20 * Use `Collections.emptyIterator()` * Pretty formatting * Make listenable future transforms a thing in default druid * Format fix * Add forbidden guava apis * Make the ListenableFutrues.transformAsync have comments * Undo intellij bad pattern matching in comments * Futrues --> Futures * Add empty iterators forbidding * Fix extra `A` * Correct method signature * Address review comments * Finish Gian review comments * Proper syntax from https://github.com/policeman-tools/forbidden-apis/wiki/SignaturesSyntax	2018-03-20 08:59:33 -07:00
Jonathan Wei	b22455b924	Fix supervisor tombstone auth handling (#5504 )	2018-03-19 12:55:47 -07:00
Roman Leventov	693e3575f9	Remove unused code and exception declarations (#5461 ) * Remove unused code and exception declarations * Address comments * Remove redundant Exception declarations * Make FirehoseFactoryV2.connect() to throw IOException again	2018-03-16 22:11:12 +01:00
Jonathan Wei	30e6bdedf3	Authorize supervisor history instead of current active supervisors for supervisor history API (#5501 )	2018-03-16 12:29:17 -07:00
Jihoon Son	9b2a25bd84	Refactor supervisorReport to be type-safe (#5479 ) * refactor supervisorReport * use primitives	2018-03-13 09:28:44 -07:00
Niraja Mishra	96cebfc222	As part of this feature, implemented a new endpoint to get running tasks by datasources (#5260 ) and added datasource information as part of existing endpoint /druid/indexer/v1/runningTasks. Added junit test cases for the newly implemented API and fixed existing junit test cases. Fixed review comments - added new method getCreatedDateTimeAndDataSource into TaskStorageQueryAdapter class and formatted changed files	2018-03-12 23:48:11 -07:00
Alexander Korablev	8a51800693	fix PortFinder issue #5466 (#5467 )	2018-03-05 16:58:49 -08:00
Jonathan Wei	b63f1c0e45	Fix authorization check in supervisor history API (#5460 )	2018-03-02 14:03:07 -08:00
Kevin Conaway	969a12f6ca	#5425 Refactor to use map.get() when asserting the existence of published segments (#5426 )	2018-03-02 19:42:39 +01:00
Gian Merlino	e4eaee3806	Support for disabling bitmap indexes. (#5402 ) * Support for disabling bitmap indexes. Can save space for columns where bitmap indexes are pointless (like free-form text). * Remove import. * Fix CompactionTaskTest. * Update for review comments. * Review comments, tests. * Fix test.	2018-02-28 19:19:56 -08:00
Gian Merlino	3f72537787	Fix missing task type in task payload API. (#5399 ) * Fix missing task type in task payload API. Apparently embedding a polymorphic object inside a Map<String, Object> is a bit too much for Jackson to serialize properly. Fix this by using wrapper classes. * Fix OverlordTest casts. * Remove import. * Remove unused imports. * Clarify comments.	2018-02-21 10:00:13 -08:00
Jihoon Son	cd929000ca	Change early publishing to early pushing in indexTask & refactor AppenderatorDriver (#5297 ) * Fix early publishing to early pushing in batch indexing & refactor appenderatorDriver * fix compile * rename and add more javadocs * Fix conflicts * address comments * revert await executors * fix test	2018-02-14 12:48:33 -08:00
Jonathan Wei	b234a119ac	Log exceptions thrown before persist() for indexing tasks (#5374 ) * Log exceptions thrown before persist() for indexing tasks * PR comment	2018-02-13 09:20:07 -08:00
Roman Leventov	e64ffb10c2	Standartize on using Integer.BYTES instead of Ints.BYTES from Guava, same for other primitives (#5366 )	2018-02-07 13:24:30 -08:00
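
Note on 13c616ba24 ('maxBytesInMemory'): the commit message says that when both 'maxRowsInMemory' and 'maxBytesInMemory' are set, whichever threshold is crossed first triggers a persist, and that setting 'maxBytesInMemory' to -1 keeps the old row-only behaviour. The sketch below only illustrates that rule. `PersistTrigger` and `shouldPersist` are hypothetical names, not Druid classes, and treating a limit as hit when the count reaches it (`>=`) is an assumption.

```java
// Hypothetical illustration of the persist rule described in commit 13c616ba24.
// Not Druid code: class and method names are made up for this sketch.
final class PersistTrigger {
    private final int maxRowsInMemory;
    private final long maxBytesInMemory; // -1 means "no byte limit" (row-only behaviour)

    PersistTrigger(int maxRowsInMemory, long maxBytesInMemory) {
        this.maxRowsInMemory = maxRowsInMemory;
        this.maxBytesInMemory = maxBytesInMemory;
    }

    /** Returns true as soon as either configured limit is reached. */
    boolean shouldPersist(int rowsInMemory, long bytesInMemory) {
        boolean rowLimitHit = rowsInMemory >= maxRowsInMemory;
        // The byte limit is only consulted when it is enabled (positive).
        boolean byteLimitHit = maxBytesInMemory > 0 && bytesInMemory >= maxBytesInMemory;
        return rowLimitHit || byteLimitHit;
    }
}
```

With `new PersistTrigger(1000, -1L)`, only the row count can trigger a persist, which matches the "maintain the current behaviour" case in the message.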

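Note on f2cc6ce4d5 (VersionedIntervalTimeline): the commit message explains that calling "isComplete" after every "add" makes inserting N segments into one chunk holder an O(N^2) operation, and that the patch defers the check until all segments are inserted. The sketch below shows that idea in isolation. `ChunkHolder`, `addEager`, and `addAllDeferred` are hypothetical names for illustration, not the actual Druid classes, and chunks are simplified to integer partition numbers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of deferring an O(n) completeness check.
// Not Druid code: a "chunk" here is just an integer partition number.
final class ChunkHolder {
    private final int expectedChunks;
    private final List<Integer> chunks = new ArrayList<>();
    int completenessChecks = 0; // counts how often the O(n) scan runs

    ChunkHolder(int expectedChunks) {
        this.expectedChunks = expectedChunks;
    }

    /** O(n) scan: complete when every partition 0..expectedChunks-1 is present. */
    boolean isComplete() {
        completenessChecks++;
        boolean[] seen = new boolean[expectedChunks];
        int distinct = 0;
        for (int c : chunks) {
            if (c >= 0 && c < expectedChunks && !seen[c]) {
                seen[c] = true;
                distinct++;
            }
        }
        return distinct == expectedChunks;
    }

    /** Eager style: one scan per insert, so N inserts cost O(N^2) overall. */
    boolean addEager(int chunk) {
        chunks.add(chunk);
        return isComplete();
    }

    /** Deferred style: insert everything first, then scan once. */
    boolean addAllDeferred(List<Integer> newChunks) {
        chunks.addAll(newChunks);
        return isComplete();
    }
}
```

The `completenessChecks` counter exists only to make the difference visible: three eager inserts run the scan three times, while one deferred batch of three runs it once.
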
1434 Commits