druid

Commit Graph

Author	SHA1	Message	Date
Jihoon Son	0c5dcf5586	Fix exclusivity for start offset in kinesis indexing service & check exclusivity properly in IndexerSQLMetadataStorageCoordinator (#7291 ) * Fix exclusivity for start offset in kinesis indexing service * some adjustment * Fix SeekableStreamDataSourceMetadata * Add missing javadocs * Add missing comments and unit test * fix SeekableStreamStartSequenceNumbers.plus and add comments * remove extra exclusivePartitions in KafkaIOConfig and fix downgrade issue * Add javadocs * fix compilation * fix test * remove unused variable	2019-03-21 13:12:22 -07:00
Surekha	e170203876	Consolidate kafka consumer configs (#7249 ) * Consolidate kafka consumer configs * change the order of adding properties * Add consumer properties to fix test it seems kafka consumer does not reveive any message without these configs * Use KafkaConsumerConfigs in integration test * Update zookeeper and kafka versions in the setup.sh for the base druid image * use version 0.2 of base druid image * Try to fix tests in KafkaRecordSupplierTest * unused import * Fix tests in KafkaSupervisorTest	2019-03-21 11:19:49 -07:00
Roman Leventov	dfd27e00c0	Avoid many unnecessary materializations of collections of 'all segments in cluster' cardinality (#7185 ) * Avoid many unnecessary materializations of collections of 'all segments in cluster' cardinality * Fix DruidCoordinatorTest; Renamed DruidCoordinator.getReplicationStatus() to computeUnderReplicationCountsPerDataSourcePerTier() * More Javadocs, typos, refactor DruidCoordinatorRuntimeParams.createAvailableSegmentsSet() * Style * typo * Disable StaticPseudoFunctionalStyleMethod inspection because of too much false positives * Fixes	2019-03-19 18:22:56 -03:00
Jihoon Son	892d1d35d6	Deprecate NoneShardSpec and drop support for automatic segment merge (#6883 ) * Deprecate noneShardSpec * clean up noneShardSpec constructor * revert unnecessary change * Deprecate mergeTask * add more doc * remove convert from indexMerger * Remove mergeTask * remove HadoopDruidConverterConfig * fix build * fix build * fix teamcity * fix teamcity * fix ServerModule * fix compilation * fix compilation	2019-03-15 23:29:25 -07:00
Gian Merlino	a8c7132482	Logic adjustments to SeekableStreamIndexTaskRunner. (#7267 ) * Logic adjustments to SeekableStreamIndexTaskRunner. A mix of simplifications and bug fixes. They are intermingled because some of the bugs were made difficult to fix, and also more likely to happen in the first place, by how the code was structured. I tried to keep restructuring to a minimum. The changes are: - Remove "initialOffsetsSnapshot", which was used to determine when to skip start offsets. Replace it with "lastReadOffsets", which I hope is more intuitive. (There is a connection: start offsets must be skipped if and only if they have already been read, either by a previous task or by a previous sequence in the same task, post-restoring.) - Remove "isStartingSequenceOffsetsExclusive", because it should always be the opposite of isEndOffsetExclusive. The reason is that starts are exclusive exactly when the prior ends are inclusive: they must match up in that way for adjacent reads to link up properly. - Don't call "seekToStartingSequence" after the initial seek. There is no reason to, since we expect to read continuous message streams throughout the task. And calling it makes offset-tracking logic trickier, so better to avoid the need for trickiness. I believe the call being here was causing a bug in Kinesis ingestion where a message might get double-read. - Remove the "continue" calls in the main read loop. They are bad because they prevent keeping currOffsets and lastReadOffsets up to date, and prevent us from detecting that we have finished reading. - Rework "verifyInitialRecordAndSkipExclusivePartition" into "verifyRecordInRange". It no longer has side effects. It does a sanity check on the message offset and also makes sure that it is not past the endOffsets. - Rework "assignPartitions" to replace inline comparisons with "isRecordAlreadyRead" and "isMoreToReadBeforeReadingRecord" calls. I believe this fixes an off-by-one error with Kinesis where the last record would not get read. It also makes the logic easier to read. - When doing the final publish, only adjust end offsets of the final sequence, rather than potentially adjusting any unpublished sequence. Adjusting sequences other than the last one is a mistake since it will extend their endOffsets beyond what they actually read. (I'm not sure if this was an issue in practice, since I'm not sure if real world situations would have more than one unpublished sequence.) - Rename "isEndSequenceOffsetsExclusive" to "isEndOffsetExclusive". It's shorter and more clear, I think. - Add equals/hashCode/toString methods to OrderedSequenceNumber. Kafka test changes: - Added a Kafka "testRestoreAtEndOffset" test to verify that restores at the very end of the task lifecycle still work properly. Kinesis test changes: - Renamed "testRunOnNothing" to "testRunOnSingletonRange". I think that given Kinesis semantics, the right behavior when start offset equals end offset (and there aren't exclusive partitions set) is to read that single offset. This is because they are both meant to be treated as inclusive. - Adjusted "testRestoreAfterPersistingSequences" to expect one more message read. I believe the old test was wrong; it expected the task not to read message number 5. - Adjusted "testRunContextSequenceAheadOfStartingOffsets" to use a checkpoint starting from 1 rather than 2. I believe the old test was wrong here too; it was expecting the task to start reading from the checkpointed offset, but it actually should have started reading from one past the checkpointed offset. - Adjusted "testIncrementalHandOffReadsThroughEndOffsets" to expect 11 messages read instead of 12. It's starting at message 0 and reading up to 10, which should be 11 messages. * Changes from code review.	2019-03-15 00:22:42 -07:00
Furkan KAMACI	7ada1c49f9	Prohibit Throwables.propagate() (#7121 ) * Throw caught exception. * Throw caught exceptions. * Related checkstyle rule is added to prevent further bugs. * RuntimeException() is used instead of Throwables.propagate(). * Missing import is added. * Throwables are propogated if possible. * Throwables are propogated if possible. * Throwables are propogated if possible. * Throwables are propogated if possible. * * Checkstyle definition is improved. * Throwables.propagate() usages are removed. * Checkstyle pattern is changed for only scanning "Throwables.propagate(" instead of checking lookbehind. * Throwable is kept before firing a Runtime Exception. * Fix unused assignments.	2019-03-14 18:28:33 -03:00
Clint Wylie	fb1489d313	fix SequenceMetadata deserialization (#7256 ) * wip * fix tests, stop reading if we are at end offset * fix build * remove restore at end offsets fix in favor of a separate PR * use typereference from method for serialization too	2019-03-13 17:29:39 -04:00
Jihoon Son	873232954f	Fix log level and throw NPE on null currOffset in SeekableStreamIndexTaskRunner (#7253 )	2019-03-13 10:20:43 -04:00
Jihoon Son	32e86ea75e	Fix record validation in SeekableStreamIndexTaskRunner (#7246 ) * Fix record validation in SeekableStreamIndexTaskRunner * add kinesis test	2019-03-12 21:12:21 -07:00
Gian Merlino	dcfca03718	More accurate RealtimeMetricsMonitor messages. (#7230 ) The old messages did not reflect the full range of reasons why messages could be thrown away.	2019-03-11 19:50:32 -04:00
Jihoon Son	e48a9c138e	Reduce default max # of subTasks to 1 for native parallel task (#7181 ) * Reduce # of max subTasks to 2 * fix typo and add more doc * add more doc and link * change default and add warning * fix doc * add test * fix it test	2019-03-05 22:06:36 -08:00
Gian Merlino	fa218f5160	Fix two SeekableStream serde issues. (#7176 ) * Fix two SeekableStream serde issues. 1) Fix backwards-compatibility serde for SeekableStreamPartitions. It is needed for split 0.13 / 0.14 clusters to work properly during a rolling update. 2) Abstract classes don't need JsonCreator constructors; remove them. * Comment fixes.	2019-03-01 22:27:08 -08:00
Jihoon Son	06c8229c08	Kill all running tasks when the supervisor task is killed (#7041 ) * Kill all running tasks when the supervisor task is killed * add some docs and simplify * address comment	2019-03-01 11:28:03 -08:00
Jihoon Son	cacdc83cad	Improve error message for integer overflow in compaction task (#7131 ) * improve error message for integer overflow in compaction task * fix build	2019-02-28 11:07:37 +08:00
Himanshu Pandey	8b803cbc22	Added checkstyle for "Methods starting with Capital Letters" (#7118 ) * Added checkstyle for "Methods starting with Capital Letters" and changed the method names violating this. * Un-abbreviate the method names in the calcite tests * Fixed checkstyle errors * Changed asserts position in the code	2019-02-23 20:10:31 -08:00
David Glasser	1c2753ab90	ParallelIndexSubTask: support ingestSegment in delegating factories (#7089 ) IndexTask had special-cased code to properly send a TaskToolbox to a IngestSegmentFirehoseFactory that's nested inside a CombiningFirehoseFactory, but ParallelIndexSubTask didn't. This change refactors IngestSegmentFirehoseFactory so that it doesn't need a TaskToolbox; it instead gets a CoordinatorClient and a SegmentLoaderFactory directly injected into it. This also refactors SegmentLoaderFactory so it doesn't depend on an injectable SegmentLoaderConfig, since its only method always replaces the preconfigured SegmentLoaderConfig anyway. This makes it possible to use SegmentLoaderFactory without setting druid.segmentCaches.locations to some dummy value. Another goal of this PR is to make it possible for IngestSegmentFirehoseFactory to list data segments outside of connect() --- specifically, to make it a FiniteFirehoseFactory which can query the coordinator in order to calculate its splits. See #7048. This also adds missing datasource name URL-encoding to an API used by CoordinatorBasedSegmentHandoffNotifier.	2019-02-23 17:02:56 -08:00
Jihoon Son	4e2b085201	Remove DataSegmentFinder, InsertSegmentToDb, and descriptor.json file in deep storage (#6911 ) * Remove DataSegmentFinder, InsertSegmentToDb, and descriptor.json file * delete descriptor.file when killing segments * fix test * Add doc for ha * improve warning	2019-02-20 15:10:29 -08:00
David Glasser	a81b1b8c9c	index_parallel: support !appendToExisting with no explicit intervals (#7046 ) * index_parallel: support !appendToExisting with no explicit intervals This enables ParallelIndexSupervisorTask to dynamically request locks at runtime if it is run without explicit intervals in the granularity spec and with appendToExisting set to false. Previously, it behaved as if appendToExisting was set to true, which was undocumented and inconsistent with IndexTask and Hadoop indexing. Also, when ParallelIndexSupervisorTask allocates segments in the explicit interval case, fail if its locks on the interval have been revoked. Also make a few other additions/clarifications to native ingestion docs. Fixes #6989. * Review feedback. PR description on GitHub updated to match. * Make native batch ingestion partitions start at 0 * Fix to previous commit * Unit test. Verified to fail without the other commits on this branch. * Another round of review * Slightly scarier warning	2019-02-20 10:54:26 -08:00
Clint Wylie	cadb6c5280	Missing Overlord and MiddleManager api docs (#7042 ) * document middle manager api * re-arrange * correction * document more missing overlord api calls, minor re-arrange of some code i was referencing * fix it * this will fix it * fixup * link to other docs	2019-02-19 10:52:05 -08:00
Surekha	80a2ef7be4	Support kafka transactional topics (#5404 ) (#6496 ) * Support kafka transactional topics * update kafka to version 2.0.0 * Remove the skipOffsetGaps option since it's not used anymore * Adjust kafka consumer to use transactional semantics * Update tests * Remove unused import from test * Fix compilation * Invoke transaction api to fix a unit test * temporary modification of travis.yml for debugging * another attempt to get travis tasklogs * update kafka to 2.0.1 at all places * Remove druid-kafka-eight dependency from integration-tests, remove the kafka firehose test and deprecate kafka-eight classes * Add deprecated in docs for kafka-eight and kafka-simple extensions * Remove skipOffsetGaps and code changes for transaction support * Fix indentation * remove skipOffsetGaps from kinesis * Add transaction api to KafkaRecordSupplierTest * Fix indent * Fix test * update kafka version to 2.1.0	2019-02-18 11:50:08 -08:00
Jihoon Son	1701fbcad3	Improve error message for revoked locks (#7035 ) * Improve error message for revoked locks * fix test * fix test * fix test * fix toString	2019-02-13 11:22:48 -08:00
Mingming Qiu	d0abf5c20a	fix kafka index task doesn't resume when recieve duplicate request (#6990 ) * fix kafka index task doesn't resume when recieve duplicate request * add unit test	2019-02-12 13:24:28 -08:00
Ankit Kothari	16a4a50e91	[Issue #6967 ] NoClassDefFoundError when using druid-hdfs-storage (#7015 ) * Fix: 1. hadoop-common dependency for druid-hdfs and druid-kerberos extensions Refactoring: 2. Hadoop config call in the inner static class to avoid class path conflicts for stopGracefully kill * Fix: 1. hadoop-common test dependency * Fix: 1. Avoid issue of kill command once the job is actually completed	2019-02-08 18:26:37 -08:00
Jonathan Wei	fafbc4a80e	Set version to 0.15.0-incubating-SNAPSHOT (#7014 )	2019-02-07 14:02:52 -08:00
Furkan KAMACI	58f9507ccf	Improper equals override is fixed to prevent NullPointerException (#6938 ) * Improper equals override is fixed to prevent NullPointerException * Fixed curly brace indentation. * Test method is added for equals method of TaskLockPosse class.	2019-02-06 14:08:50 -08:00
Jonathan Wei	8bc5eaa908	Set version to 0.14.0-incubating-SNAPSHOT (#7003 )	2019-02-04 19:36:20 -08:00
David Glasser	7e48593b57	ParallelIndexSupervisorTask: don't warn about a default value (#6987 ) Native batch indexing doesn't yet support the maxParseExceptions, maxSavedParseExceptions, and logParseExceptions tuning config options, so ParallelIndexSupervisorTask logs if these are set. But the default value for maxParseExceptions is Integer.MAX_VALUE, which means that you'll get the maxParseExceptions flavor of this warning even if you don't configure maxParseExceptions. This PR changes all three warnings to occur if you change the settings from the default; this mostly affects the maxParseExceptions warning.	2019-02-04 12:00:26 -08:00
Roman Leventov	0e926e8652	Prohibit assigning concurrent maps into Map-typed variables and fields and fix a race condition in CoordinatorRuleManager (#6898 ) * Prohibit assigning concurrent maps into Map-types variables and fields; Fix a race condition in CoordinatorRuleManager; improve logic in DirectDruidClient and ResourcePool * Enforce that if compute(), computeIfAbsent(), computeIfPresent() or merge() is called on a ConcurrentHashMap, it's stored in a ConcurrentHashMap-typed variable, not ConcurrentMap; add comments explaining get()-before-computeIfAbsent() optimization; refactor Counters; fix a race condition in Intialization.java * Remove unnecessary comment * Checkstyle * Fix getFromExtensions() * Add a reference to the comment about guarded computeIfAbsent() optimization; IdentityHashMap optimization * Fix UriCacheGeneratorTest * Workaround issue with MaterializedViewQueryQueryToolChest * Strengthen Appenderator's contract regarding concurrency	2019-02-04 09:18:12 -08:00
Vadim Ogievetsky	7f1b19bfb1	Adding a Unified web console. (#6923 ) * Adding new web console. * fixed css * fix form height * fix typo * do import custom react-table css * added repo field so npm does not complain * ask travis for node 10 * move indexing-service/src/main/resources/indexer_static into web-console * fix resource names and paths * add licenses * fix exclude file * add licenses to misc files and tidy up * remove rebase marker * fix link * updated env variable name * tidy up licenses and surface errors * cleanup * remove unused code, fix missing await * TeamCity does not like the name aux * add more links to tasks view * rm pages * update gitignore * update readme to be accurate * make clean script * removed old console dependancy * update Jetty routes * add a comment for welcome files for coordinator * do not show inital notifaction for now * renamed overlord console back to console.html * fix coordinator console * rename coordinator-console.html to index.html	2019-01-31 17:26:41 -08:00
Benedict Jin	72a571fbf7	For performance reasons, use `java.util.Base64` instead of Base64 in Apache Commons Codec and Guava (#6913 ) * * Add few methods about base64 into StringUtils * Use `java.util.Base64` instead of others * Add org.apache.commons.codec.binary.Base64 & com.google.common.io.BaseEncoding into druid-forbidden-apis * Rename encodeBase64String & decodeBase64String * Update druid-forbidden-apis	2019-01-25 17:32:29 -08:00
Ankit Kothari	8492d94f59	Kill Hadoop MR task on kill of Hadoop ingestion task (#6828 ) * KillTask from overlord UI now makes sure that it terminates the underlying MR job, thus saving unnecessary compute Run in jobby is now split into 2 1. submitAndGetHadoopJobId followed by 2. run submitAndGetHadoopJobId is responsible for submitting the job and returning the jobId as a string, run monitors this job for completion JobHelper writes this jobId in the path provided by HadoopIndexTask which in turn is provided by the ForkingTaskRunner HadoopIndexTask reads this path when kill task is clicked to get hte jobId and fire the kill command via the yarn api. This is taken care in the stopGracefully method which is called in SingleTaskBackgroundRunner. Have enabled `canRestore` method to return `true` for HadoopIndexTask in order for the stopGracefully method to be called HadoopJob files have been changed to incorporate the changes to jobby Addressing PR comments * Addressing PR comments - Fix taskDir * Addressing PR comments - For changing the contract of Task.stopGracefully() `SingleTaskBackgroundRunner` calls stopGracefully in stop() and then checks for canRestore condition to return the status of the task * Addressing PR comments 1. Formatting 2. Removing `submitAndGetHadoopJobId` from `Jobby` and calling writeJobIdToFile in the job itself * Addressing PR comments 1. POM change. Moving hadoop dependency to indexing-hadoop * Addressing PR comments 1. stopGracefully now accepts TaskConfig as a param Handling isRestoreOnRestart in stopGracefully for `AppenderatorDriverRealtimeIndexTask, RealtimeIndexTask, SeekableStreamIndexTask` Changing tests to make TaskConfig param isRestoreOnRestart to true	2019-01-25 15:43:06 -08:00
Himanshu Pandey	e1033bb412	Issue#6892- Replaced Math.random() with ThreadLocalRandom.current().nextDouble() (#6914 ) * Replacing Math.random() with ThreadLocalRandom.current().nextDouble() * Added java.lang.Math#random() in forbidden-apis.txt * Minor change in the message - druid-forbidden-apis.txt	2019-01-25 19:49:20 +08:00
Roman Leventov	8eae26fd4e	Introduce SegmentId class (#6370 ) * Introduce SegmentId class * tmp * Fix SelectQueryRunnerTest * Fix indentation * Fixes * Remove Comparators.inverse() tests * Refinements * Fix tests * Fix more tests * Remove duplicate DataSegmentTest, fixes #6064 * SegmentDescriptor doc * Fix SQLMetadataStorageUpdaterJobHandler * Fix DataSegment deserialization for ignoring id * Add comments * More comments * Address more comments * Fix compilation * Restore segment2 in SystemSchemaTest according to a comment * Fix style * fix testServerSegmentsTable * Fix compilation * Add comments about why SegmentId and SegmentIdWithShardSpec are separate classes * Fix SystemSchemaTest * Fix style * Compare SegmentDescriptor with SegmentId in Javadoc and comments rather than with DataSegment * Remove a link, see https://youtrack.jetbrains.com/issue/IDEA-205164 * Fix compilation	2019-01-21 11:11:10 -08:00
Clint Wylie	8ba33b2505	add 'init' lifecycle stage for finer control over startup and shutdown (#6864 ) * add Lifecycle.Stage.INIT, put log shutter downer in init stage, tests, rad startup banner * log cleanup * log changes * add task-master lifecycle to module lifecycle to gracefully stop task-master stuff * fix it the right way * remove announce spam * unused import * one more log * updated comments * wrap leadership lifecycle stop to prevent exceptions from wrecking rest of task master stop * add precondition check	2019-01-21 09:01:36 -08:00
Jonathan Wei	8537a771b0	Some fixes and tests for spaces/non-ASCII chars in datasource names (#6761 ) * Fixes and tests for spaces/non-ASCII datasource names * Some unit test fixes * Fix ITRealtimeIndexTaskTest * Checkstyle * TeamCity * PR comments	2019-01-15 08:33:31 -08:00
Charles Allen	5d2947cd52	Use Guava Compatible immediate executor service (#6815 ) * Use multi-guava version friendly direct executor implementation * Don't use a singleton * Fix strict compliation complaints * Copy Guava's DirectExecutor * Fix javadoc * Imports are the devil	2019-01-11 10:42:19 -08:00
Jihoon Son	39780d914c	Propagate errors to RemoteTaskActionClient (#6837 ) * Propagate errors to RemoteTaskActionClient * style	2019-01-10 20:53:18 -08:00
Jihoon Son	c35a39d70b	Add support maxRowsPerSegment for auto compaction (#6780 ) * Add support maxRowsPerSegment for auto compaction * fix build * fix build * fix teamcity * add test * fix test * address comment	2019-01-10 09:50:14 -08:00
Jihoon Son	934c83bca6	Fix TaskLockbox when there are multiple intervals of the same start but differerent end (#6822 ) * Fix TaskLockbox when there are multiple intervals of the same start but differernt end * fix build * fix npe	2019-01-09 19:38:27 -08:00
Jihoon Son	c4716d1639	Fix ParallelIndexTask when publishing empty segments (#6807 ) * Fix ParallelIndexTask when publishing empty segments * unused import	2019-01-08 17:15:16 -08:00
Jihoon Son	9ad6a733a5	Add support segmentGranularity for CompactionTask (#6758 ) * Add support segmentGranularity * add doc and fix combination of options * improve doc	2019-01-03 17:50:45 -08:00
Joshua Sun	7c7997e8a1	Add Kinesis Indexing Service to core Druid (#6431 ) * created seekablestream classes * created seekablestreamsupervisor class * first attempt to integrate kafa indexing service to use SeekableStream * seekablestream bug fixes * kafkarecordsupplier * integrated kafka indexing service with seekablestream * implemented resume/suspend and refactored some package names * moved kinesis indexing service into core druid extensions * merged some changes from kafka supervisor race condition * integrated kinesis-indexing-service with seekablestream * unite tests for kinesis-indexing-service * various bug fixes for kinesis-indexing-service * refactored kinesisindexingtask * finished up more kinesis unit tests * more bug fixes for kinesis-indexing-service * finsihed refactoring kinesis unit tests * removed KinesisParititons and KafkaPartitions to use SeekableStreamPartitions * kinesis-indexing-service code cleanup and docs * merge #6291 merge #6337 merge #6383 * added more docs and reordered methods * fixd kinesis tests after merging master and added docs in seekablestream * fix various things from pr comment * improve recordsupplier and add unit tests * migrated to aws-java-sdk-kinesis * merge changes from master * fix pom files and forbiddenapi checks * checkpoint JavaType bug fix * fix pom and stuff * disable checkpointing in kinesis * fix kinesis sequence number null in closed shard * merge changes from master * fixes for kinesis tasks * capitalized <partitionType, sequenceType> * removed abstract class loggers * conform to guava api restrictions * add docker for travis other modules test * address comments * improve RecordSupplier to supply records in batch * fix strict compile issue * add test scope for localstack dependency * kinesis indexing task refactoring * comments * github comments * minor fix * removed unneeded readme * fix deserialization bug * fix various bugs * KinesisRecordSupplier unable to catch up to earliest position in stream bug fix * minor changes to kinesis * implement deaggregate for kinesis * Merge remote-tracking branch 'upstream/master' into seekablestream * fix kinesis offset discrepancy with kafka * kinesis record supplier disable getPosition * pr comments * mock for kinesis tests and remove docker dependency for unit tests * PR comments * avg lag in kafkasupervisor #6587 * refacotred SequenceMetadata in taskRunners * small fix * more small fix * recordsupplier resource leak * revert .travis.yml formatting * fix style * kinesis docs * doc part2 * more docs * comments * comments2 revert string replace changes * comments * teamcity * comments part 1 * comments part 2 * comments part 3 * merge #6754 * fix injection binding * comments * KinesisRegion refactor * comments part idk lol * can't think of a commit msg anymore * remove possiblyResetDataSourceMetadata() for IncrementalPublishingTaskRunner * commmmmmmmmmments * extra error handling in KinesisRecordSupplier getRecords * comments * quickfix * typo * oof	2018-12-21 12:49:24 -07:00
Clint Wylie	9505074530	fix log typo (#6755 ) * fix log typo, add DataSegmentUtils.getIdentifiersString util method * fix indecisive oops	2018-12-18 15:10:25 -08:00
Clint Wylie	486c6f3cf9	emit logs that are only useful for debugging at debug level (#6741 ) * make logs that are only useful for debugging be at debug level so log volume is much more chill * info level messages for total merge buffer allocated/free * more chill compaction logs	2018-12-17 14:20:28 +08:00
Gian Merlino	04e7c7fbdc	FilteredRequestLogger: Fix start/stop, invalid delegate behavior. (#6637 ) * FilteredRequestLogger: Fix start/stop, invalid delegate behavior. Fixes two bugs: 1) FilteredRequestLogger did not start/stop the delegate. 2) FilteredRequestLogger would ignore an invalid delegate type, and instead silently substitute the "noop" logger. This was due to a larger problem with RequestLoggerProvider setup in general; the fix here is to remove "defaultImpl" from the RequestLoggerProvider interface, and instead have JsonConfigurator be responsible for creating the default implementations. It is stricter about things than the old system was, and is only willing to make a noop logger if it doesn't see any request logger configs. Otherwise, it'll raise a provision error. * Remove unneeded annotations.	2018-12-14 16:55:44 +08:00
dongyifeng	91e3cf7196	add charset UTF-8 to log api (#6709 ) When I retrieve the task log in browser, the Chinese characters all end up as garbage. ![image](https://user-images.githubusercontent.com/1322134/49502749-bd614080-f8b0-11e8-839e-07f7117eebfd.png) After adding charset UTF-8, it was correct. ![image](https://user-images.githubusercontent.com/1322134/49502804-dc5fd280-f8b0-11e8-916b-bda8f1e7f318.png)	2018-12-12 16:31:04 +01:00
Jihoon Son	f727333b70	Fix shutdownAllTasks API for non-existing dataSource (#6706 )	2018-12-08 09:54:01 -08:00
Mingming Qiu	607339003b	Add TaskCountStatsMonitor to monitor task count stats (#6657 ) * Add TaskCountStatsMonitor to monitor task count stats * address comments * add file header * tweak test	2018-12-04 13:37:17 -08:00
Clint Wylie	a1c9d0add2	autosize processing buffers based on direct memory sizing by default (#6588 ) * autosize processing buffers based on direct memory sizing * remove oops, more test * max 1gb autosize buffers, test, start of docs * fix oops * revert accidental change * print buffer size in exception * change the things	2018-12-03 18:40:02 -07:00
Clint Wylie	43adb391c2	remove AbstractResourceFilter.isApplicable because it is not (#6691 ) * remove AbstractResourceFilter.isApplicable because it is not, add tests for OverlordResource.doShutdown and OverlordResource.shutdownTasksForDatasource * cleanup	2018-12-01 21:52:31 +08:00
Roman Leventov	ec38df7575	Simplify DruidNodeDiscoveryProvider; add DruidNodeDiscovery.Listener.nodeViewInitialized() (#6606 ) * Simplify DruidNodeDiscoveryProvider; add DruidNodeDiscovery.Listener.nodeViewInitialized() method; prohibit and eliminate some suboptimal Java 8 patterns * Fix style * Fix HttpEmitterTest.timeoutEmptyQueue() * Add DruidNodeDiscovery.Listener.nodeViewInitialized() calls in tests * Clarify code	2018-12-01 01:12:56 +01:00
Jihoon Son	d6539abd0a	Fix overlord api and console (#6686 ) * Fix overlord APIs and console * remove getRunningTasksByDataSource * add missing path to isApplicable	2018-11-29 23:45:28 -08:00
Mingming Qiu	c81cb94226	fix cannot resolve param at OverlordResource#getTasks (#6679 )	2018-11-29 10:07:21 +08:00
David Glasser	d150483fd3	indexing-service: fix HTML title on overlord console (#6671 ) This follows a similar fix to the body of the page done in #5627.	2018-11-27 22:34:26 -08:00
Jihoon Son	422b76b33c	Fix IndexTaskClient to retry on ChannelException (#6649 ) * Fix IndexTaskClient to retry on ChannelException * fix travis and add javadoc * address comment	2018-11-27 15:54:38 -08:00
seoeun	22a5bf97a2	Fix issue that tasks tables in metadata storage are not cleared (#6592 ) * tasks tables in metadata storage are not cleared * address comments. remove tasklogs and revert obsolete changes * address comments. change comment and update doc. * address comments. update doc more detailed * address comments. remove redundant log and update doc more detailed. * address comments. update document	2018-11-22 11:50:31 +08:00
Marat	81c9a6177c	Added support for filtering by unused parameter for HeapMemoryTaskStorage (#6510 ) * 1. added support for unused DateTime start parameter in getRecentlyFinishedTaskInfoSince method: HeapMemoryTaskStorage.getRecentlyFinishedTaskInfoSince return the finished tasks by comparing TaskStuff.createdDate with the start time 2. added filtering by status complete to TaskStuff list stream in HeapMemoryTaskStorage.getNRecentlyFinishedTaskInfo method. 3. changed names of methods and parameters to present that public API method OverlordResource.getTasks return the list of completed tasks, which createdDate, not date of completion, belongs to the interval parameter. * 1. added support for unused DateTime start parameter in getRecentlyFinishedTaskInfoSince method: HeapMemoryTaskStorage.getRecentlyFinishedTaskInfoSince return the finished tasks by comparing TaskStuff.createdDate with the start time 2. added filtering by status complete to TaskStuff list stream in HeapMemoryTaskStorage.getNRecentlyFinishedTaskInfo method. 3. changed names of methods and parameters to present that public API method OverlordResource.getTasks return the list of completed tasks, which createdDate, not date of completion, belongs to the interval parameter. * Fixed OverlordResourceTest to Support changed methods names * Changed methods and parameters names to make them more obvious to understand. * Changed String.replace() for the StringUtils.replace()(#6607) * Fixed checkstyle error	2018-11-20 13:42:44 -08:00
Roman Leventov	87b96fb1fd	Add checkstyle rules about imports and empty lines between members (#6543 ) * Add checkstyle rules about imports and empty lines between members * Add suppressions * Update Eclipse import order * Add empty line * Fix StatsDEmitter	2018-11-20 12:42:15 +01:00
Jihoon Son	d738ce4d2a	Enforce logging when killing a task (#6621 ) * Enforce logging when killing a task * fix test * address comment * address comment	2018-11-16 10:01:56 +08:00
Roman Leventov	8f3fe9cd02	Prohibit String.replace() and String.replaceAll(), fix and prohibit some toString()-related redundancies (#6607 ) * Prohibit String.replace() and String.replaceAll(), fix and prohibit some toString()-related redundancies * Fix bug * Replace checkstyle regexp with IntelliJ inspection	2018-11-15 13:21:34 -08:00
QiuMM	f2b73f9df1	fix cannot resolve param at OverlordResource#getTasks (#6593 )	2018-11-13 09:47:11 -08:00
Roman Leventov	54351a5c75	Fix various bugs; Enable more IntelliJ inspections and update error-prone (#6490 ) * Fix various bugs; Enable more IntelliJ inspections and update error-prone * Fix NPE * Fix inspections * Remove unused imports	2018-11-06 14:38:08 -08:00
QiuMM	676f5e6d7f	Prohibit some guava collection APIs and use JDK collection APIs directly (#6511 ) * Prohibit some guava collection APIs and use JDK APIs directly * reset files that changed by accident * sort codestyle/druid-forbidden-apis.txt alphabetically	2018-10-29 13:02:43 +01:00
Clint Wylie	ee1fc93f97	fix exception in Supervisor.start causing overlord unable to become leader (#6516 ) * fix exception thrown by Supervisor.start causing overlord unable to become leader * fix style	2018-10-25 15:44:04 -07:00
Clint Wylie	e1057ad47a	Fix NPE in TaskLockbox that prevents overlord leadership (#6512 ) * fix NPE that prevents overlord from assuming leadership if extension that provides indexing task type is not loaded * heh	2018-10-25 13:06:11 -07:00
Roman Leventov	84ac18dc1b	Catch some incorrect method parameter or call argument formatting patterns with checkstyle (#6461 ) * Catch some incorrect method parameter or call argument formatting patterns with checkstyle * Fix DiscoveryModule * Inline parameters_and_arguments.txt * Fix a bug in PolyBind * Fix formatting	2018-10-23 07:17:38 -03:00
Faxian Zhao	c5bf4e7503	update insert pending segments logic to synchronous (#6336 ) * 1. Mysql default transaction isolation is REPEATABLE_READ, treat it as READ_COMMITTED will reduce insert id conflict. 2. Add an index to 'dataSource used end' is work well for the most of scenarios(get recently segments), and it will speed up sync add pending segments in DB. 3. 'select and insert' is not need within transaction. * Use TaskLockbox.doInCriticalSection instead of synchronized syntax to speed up insert pending segments. * fix typo for NullPointerException	2018-10-22 19:48:20 -07:00
David Lim	e1a53fd17a	fix distribution to not include contrib extensions by default, don't pull the entire AWS SDK bundle (#6494 )	2018-10-19 13:50:05 -07:00
Joshua Sun	b662fe84c5	fix TaskRunnerUtils String formatting issue (#6492 ) * fix TaskRunnerUtils String formatting issue * additional fixes	2018-10-18 19:16:46 -06:00
QiuMM	85a89e2703	make druid node bind address configurable (#6464 ) * make druid node bind address configurable * fix tests * fix travis-ci	2018-10-15 14:19:40 -07:00
Roman Leventov	aa121da25f	Use NodeType enum instead of Strings (#6377 ) * Use NodeType enum instead of Strings * Make NodeType constants uppercase * Fix CommonCacheNotifier and NodeType/ServerType comments * Reconsidering comment * Fix import * Add a comment to CommonCacheNotifier.NODE_TYPES	2018-10-14 20:49:38 -07:00
Clint Wylie	84598fba3b	combine druid-api, druid-common, java-util into druid-core (#6443 ) * combine druid-api, druid-common, java-util * spacing	2018-10-14 20:37:37 -07:00
David Lim	20ab213ba6	change project versions to 0.13.0-incubating-SNAPSHOT (#6453 )	2018-10-11 19:28:01 -07:00
QiuMM	f8f4526b16	Add suspend\|resume\|terminate all supervisors endpoints. (#6272 ) * ability to showdown all supervisors * add doc * address comments * fix code style * address comments * change ternary assignment to if statement * better docs	2018-10-10 21:41:59 -07:00
Jihoon Son	9343cbc63a	Fix CompactionTask to consider only latest segments (#6429 ) * CompactionTask should consider only latest segments * fix test	2018-10-08 21:53:16 -07:00
Jihoon Son	2b76d57347	Fail compactionTask if it fails to run one of indexTaskSpecs (#6428 ) * Fail compactionTask if it fails to run one of indexTaskSpecs * add log	2018-10-08 08:53:32 -07:00
Jihoon Son	88d23b77b7	Add support keepSegmentGranularity for automatic compaction (#6407 ) * Add support keepSegmentGranularity for automatic compaction * skip unknown dataSource * ignore single semgnet to compact * add doc * address comments * address comment	2018-10-07 16:48:58 -07:00
Jihoon Son	45aa51a00c	Add support hash partitioning by a subset of dimensions to indexTask (#6326 ) * Add support hash partitioning by a subset of dimensions to indexTask * add doc * fix style * fix test * fix doc * fix build	2018-10-06 16:45:07 -07:00
Jonathan Wei	c7ac8785a1	Prevent failed KafkaConsumer creation from blocking overlord startup (#6383 ) * Prevent failed KafkaConsumer creation from blocking overlord startup * PR comments * Fix random task ID length * Adjust test timer * Use Integer.SIZE	2018-10-03 19:08:20 -07:00
Roman Leventov	3ae563263a	Renamed 'Generic Column' -> 'Numeric Column'; Fixed a few resource leaks in processing; misc refinements (#5957 ) This PR accumulates many refactorings and small improvements that I did while preparing the next change set of https://github.com/druid-io/druid/projects/2. I finally decided to make them a separate PR to minimize the volume of the main PR. Some of the changes: - Renamed confusing "Generic Column" term to "Numeric Column" (what it actually implies) in many class names. - Generified `ComplexMetricExtractor`	2018-10-02 14:50:22 -03:00
Surekha	42e5385e56	make 0.13 tasks API backwards compatible with 0.12 (#6333 ) (#6334 ) * Replace statusCode with status (#6333) Also changed runnerStatusCode to runnerStatus to keep things consistent * Add unit test * Add status param to TaskStatusPlus Revert to statusCode and runnerStatusCode * Add additional status member to TaskStatusPlus * Change TaskResponseObject to match overlord's response object * Address PR comments * address comments * Add runtime exception after logging error * Remove (deprecated)status member variable from TaskStatusPlus * Minor change	2018-10-01 15:33:24 -07:00
Jihoon Son	cb14a43038	Remove ConvertSegmentTask, HadoopConverterTask, and ConvertSegmentBackwardsCompatibleTask (#6393 ) * Remove ConvertSegmentTask, HadoopConverterTask, and ConvertSegmentBackwardsCompatibleTask * update doc and remove auto conversion * remove remaining doc * fix teamcity	2018-10-01 12:03:35 -07:00
Gian Merlino	9fa4afdb8e	URL encode datasources, task ids, authenticator names. (#5938 ) * URL encode datasources, task ids, authenticator names. * Fix URL encoding for router forwarding servlets. * Fix log-with-offset API. * Fix test. * Test adjustments. * Task client fixes. * Remove unused import.	2018-09-30 12:29:51 -07:00
dyf6372	63ba7f7bec	overlord check task whether is present before get lock (#6308 )	2018-09-28 16:57:40 -07:00
Jihoon Son	122caec7b1	Add support targetCompactionSizeBytes for compactionTask (#6203 ) * Add support targetCompactionSizeBytes for compactionTask * fix test * fix a bug in keepSegmentGranularity * fix wrong noinspection comment * address comments	2018-09-28 11:16:35 -07:00
Jihoon Son	aef022de98	Fix race in taskMaster (#6388 )	2018-09-26 21:48:02 -07:00
Jihoon Son	6fb503c073	Deprecate task audit logging (#6368 ) * Deprecate task audit logging * fix test * fix it test	2018-09-26 16:28:02 -07:00
Clint Wylie	399a5659b2	fix incorrect precondition check in `SupervisorManager.suspendOrResumeSupervisor` (#6364 ) This check is reverse from the intention	2018-09-21 17:40:14 -07:00
Slim Bouguerra	028354eea8	Adding licenses and enable apache-rat-plugin. (#6215 ) * Adding licenses and enable apache-rat-plugi. Change-Id: I4685a2d9f1e147855dba69329b286f2d5bee3c18 * restore the copywrite of demo_table and add it to the list of allowed ones Change-Id: I2a9efde6f4b984bc1ac90483e90d98e71f818a14 * revirew comments Change-Id: I0256c930b7f9a5bb09b44b5e7a149e6ec48cb0ca * more fixup Change-Id: I1355e8a2549e76cd44487abec142be79bec59de2 * align Change-Id: I70bc47ecb577bdf6b91639dd91b6f5642aa6b02f	2018-09-18 08:39:26 -07:00
Roman Leventov	0c4bd2b57b	Prohibit some Random usage patterns (#6226 ) * Prohibit Random usage patterns * Fix FlattenJSONBenchmarkUtil	2018-09-14 13:35:51 -07:00
QiuMM	87ccee05f7	Add ability to specify list of task ports and port range (#6263 ) * support specify list of task ports * fix typos * address comments * remove druid.indexer.runner.separateIngestionEndpoint config * tweak doc * fix doc * code cleanup * keep some useful comments	2018-09-13 19:36:04 -07:00
Roman Leventov	d50b69e6d4	Prohibit LinkedList (#6112 ) * Prohibit LinkedList * Fix tests * Fix * Remove unused import	2018-09-13 18:07:06 -07:00
Clint Wylie	91a37c692d	'suspend' and 'resume' support for supervisors (kafka indexing service, materialized views) (#6234 ) * 'suspend' and 'resume' support for kafka indexing service changes: * introduces `SuspendableSupervisorSpec` interface to describe supervisors which support suspend/resume functionality controlled through the `SupervisorManager`, which will gracefully shutdown the supervisor and it's tasks, update it's `SupervisorSpec` with either a suspended or running state, and update with the toggled spec. Spec updates are provided by `SuspendableSupervisorSpec.createSuspendedSpec` and `SuspendableSupervisorSpec.createRunningSpec` respectively. * `KafkaSupervisorSpec` extends `SuspendableSupervisorSpec` and now supports suspend/resume functionality. The difference in behavior between 'running' and 'suspended' state is whether the supervisor will attempt to ensure that indexing tasks are or are not running respectively. Behavior is identical otherwise. * `SupervisorResource` now provides `/druid/indexer/v1/supervisor/{id}/suspend` and `/druid/indexer/v1/supervisor/{id}/resume` which are used to suspend/resume suspendable supervisors * Deprecated `/druid/indexer/v1/supervisor/{id}/shutdown` and moved it's functionality to `/druid/indexer/v1/supervisor/{id}/terminate` since 'shutdown' is ambiguous verbage for something that effectively stops a supervisor forever * Added ability to get all supervisor specs from `/druid/indexer/v1/supervisor` by supplying the 'full' query parameter `/druid/indexer/v1/supervisor?full` which will return a list of json objects of the form `{"id":<id>, "spec":<SupervisorSpec>}` * Updated overlord console ui to enable suspend/resume, and changed 'shutdown' to 'terminate' * move overlord console status to own column in supervisor table so does not look like garbage * spacing * padding * other kind of spacing * fix rebase fail * fix more better * all supervisors now suspendable, updated materialized view supervisor to support suspend, more tests * fix log	2018-09-13 14:42:18 -07:00
Clint Wylie	e6e068ce60	Add support for 'maxTotalRows' to incremental publishing kafka indexing task and appenderator based realtime task (#6129 ) * resolves #5898 by adding maxTotalRows to incremental publishing kafka index task and appenderator based realtime indexing task, as available in IndexTask * address review comments * changes due to review * merge fail	2018-09-07 13:17:49 -07:00
Gian Merlino	431d3d8497	Rename io.druid to org.apache.druid. (#6266 ) * Rename io.druid to org.apache.druid. * Fix META-INF files and remove some benchmark results. * MonitorsConfig update for metrics package migration. * Reorder some dimensions in inner queries for some reason. * Fix protobuf tests.	2018-08-30 09:56:26 -07:00
Gian Merlino	cb40b6d369	Fix all inspection errors currently reported. (#6236 ) * Fix all inspection errors currently reported. TeamCity builds on master are reporting inspection errors, possibly because there was a while where it was not running due to the Apache migration, and there was some drift. * Fix one more location. * Fix tests. * Another fix.	2018-08-26 18:36:01 -06:00
Himanshu	c3aaf8122d	fix TaskQueue-HRTR deadlock (#6212 ) * fix TaskQueue-HRTR deadlock causing https://github.com/apache/incubator-druid/issues/6201 * address review comments	2018-08-25 14:15:57 -07:00
QiuMM	9803ce954a	fix port conflict for druid peon (#6202 )	2018-08-23 19:05:13 -07:00
QiuMM	ceb8f8e625	remove unnecessary tlsPortFinder to avoid potential port conflicts (#6194 )	2018-08-23 10:41:49 -07:00
Benedict Jin	3647d4c94a	Make time-related variables more readable (#6158 ) * Make time-related variables more readable * Patch some improvements from the code reviewer * Remove unnecessary boxing of Long type variables	2018-08-21 15:29:40 -07:00
Jihoon Son	2bfe1b6a5a	Fix NPE for taskGroupId when rolling update (#6168 ) * Fix NPE for taskGroupId * missing changes * fix wrong annotation * fix potential race * keep baseSequenceName * make deprecated old param	2018-08-17 10:15:45 -07:00
QiuMM	b0cf8d0252	'shutdownAllTasks' API for a dataSource (#6185 ) * 'shutdownAllTasks' API for a dataSource Change-Id: I30d14390457d39e0427d23a48f4f224223dc5777 * fix api path and return Change-Id: Ib463f31ee2c4cb168cf2697f149be845b57c42e5 * optimize implementation Change-Id: I50a8dcd44dd9d36c9ecbfa78e103eb9bff32eab9	2018-08-17 12:57:09 -04:00
Gian Merlino	5ce3185b9c	Fix three bugs with segment publishing. (#6155 ) * Fix three bugs with segment publishing. 1. In AppenderatorImpl: always use a unique path if requested, even if the segment was already pushed. This is important because if we don't do this, it causes the issue mentioned in #6124. 2. In IndexerSQLMetadataStorageCoordinator: Fix a bug that could cause it to return a "not published" result instead of throwing an exception, when there was one metadata update failure, followed by some random exception. This is done by resetting the AtomicBoolean that tracks what case we're in, each time the callback runs. 3. In BaseAppenderatorDriver: Only kill segments if we get an affirmative false publish result. Skip killing if we just got some exception. The reason for this is that we want to avoid killing segments if they are in an unknown state. Two other changes to clarify the contracts a bit and hopefully prevent future bugs: 1. Return SegmentPublishResult from TransactionalSegmentPublisher, to make it more similar to announceHistoricalSegments. 2. Make it explicit, at multiple levels of javadocs, that a "false" publish result must indicate that the publish _definitely_ did not happen. Unknown states must be exceptions. This helps BaseAppenderatorDriver do the right thing. * Remove javadoc-only import. * Updates. * Fix test. * Fix tests.	2018-08-15 13:55:53 -07:00
Karol Woźniak	da3a1f61ac	Fix appenderator_realtime creating shards bigger by 1 than maxRowsPerSegment (#6125 )	2018-08-10 22:29:06 -07:00
Jihoon Son	d6a02de5b5	Add support 'keepSegmentGranularity' for compactionTask (#6095 ) * Add keepSegmentGranularity for compactionTask * fix build * createIoConfig method * fix build * fix build * address comments * fix build	2018-08-09 13:51:20 -07:00
Jihoon Son	577632f5c1	Fix missing argument of TaskToolbox (#6121 )	2018-08-07 17:18:56 -07:00
Gian Merlino	3525d4059e	Cache: Add maxEntrySize config, make groupBy cacheable by default. (#5108 ) * Cache: Add maxEntrySize config. The idea is this makes it more feasible to cache query types that can potentially generate large result sets, like groupBy and select, without fear of writing too much to the cache per query. Includes a refactor of cache population code in CachingQueryRunner and CachingClusteredClient, such that they now use the same CachePopulator interface with two implementations: one for foreground and one for background. The main reason for splitting the foreground / background impls is that the foreground impl can have a more effective implementation of maxEntrySize. It can stop retaining subvalues for the cache early. * Add CachePopulatorStats. * Fix whitespace. * Fix docs. * Fix various tests. * Add tests. * Fix tests. * Better tests * Remove conflict markers. * Fix licenses.	2018-08-07 10:23:15 -07:00
Jihoon Son	56ab4363ea	Native parallel batch indexing without shuffle (#5492 ) * Native parallel indexing without shuffle * fix build * fix ci * fix ingestion without intervals * fix retry * fix retry * add it test * use chat handler * fix build * add docs * fix ITUnionQueryTest * fix failures * disable metrics reporting * working * Fix split of static-s3 firehose * Add endpoints to supervisor task and a unit test for endpoints * increase timeout in test * Added doc * Address comments * Fix overlapping locks * address comments * Fix static s3 firehose * Fix test * fix build * fix test * fix typo in docs * add missing maxBytesInMemory to doc * address comments * fix race in test * fix test * Rename to ParallelIndexSupervisorTask * fix teamcity * address comments * Fix license * addressing comments * addressing comments * indexTaskClient-based segmentAllocator instead of CountingActionBasedSegmentAllocator * Fix race in TaskMonitor and move HTTP endpoints to supervisorTask from runner * Add more javadocs * use StringUtils.nonStrictFormat for logging * fix typo and remove unused class * fix tests * change package * fix strict build * tmp * Fix overlord api according to the recent change in master * Fix it test	2018-08-06 23:59:42 -07:00
Jihoon Son	ef2d6e9118	Fix IllegalArgumentException in TaskLockBox.syncFromStorage() when updating from 0.12.x to 0.12.2 (#6086 ) * Fix TaskLockBox.syncFromStorage() when updating from 0.12.x to 0.12.2 * Make the priority of taskLock nullable * fix test * fix build	2018-08-03 17:13:44 -07:00
Nishant Bangarwa	75c8a87ce1	Part 2 of changes for SQL Compatible Null Handling (#5958 ) * Part 2 of changes for SQL Compatible Null Handling * Review comments - break lines longer than 120 characters * review comments * review comments * fix license * fix test failure * fix CalciteQueryTest failure * Null Handling - Review comments * review comments * review comments * fix checkstyle * fix checkstyle * remove unrelated change * fix test failure * fix failing test * fix travis failures * Make StringLast and StringFirst aggregators nullable and fix travis failures	2018-08-02 08:20:25 -07:00
Roman Leventov	0754d78a2e	Prohibit Lists.newArrayList() with a single argument (#6068 ) * Prohibit Lists.newArrayList() with a single argument * Test fixes * Add Javadoc to Node constructor	2018-07-31 20:09:10 -07:00
Jonathan Wei	91943a24db	Fix CombiningFirehoseFactory with IngestSegmentFirehoseFactory in IndexTask (#6065 ) * Fix CombiningFirehoseFactory with IngestSegmentFirehoseFactory in IndexTask * Make recursive	2018-07-31 15:30:44 -07:00
Gian Merlino	3aa7017975	Remove some unnecessary task storage internal APIs. (#6058 ) * Remove some unnecessary task storage internal APIs. - Remove MetadataStorageActionHandler's getInactiveStatusesSince and getActiveEntriesWithStatus. - Remove TaskStorage's getCreatedDateTimeAndDataSource. - Remove TaskStorageQueryAdapter's getCreatedTime, and getCreatedDateAndDataSource. - Migrated all callers to getActiveTaskInfo and getCompletedTaskInfo. This has one side effect: since getActiveTaskInfo (new) warns and continues when it sees unreadable tasks, but getActiveEntriesWithStatus threw an exception when it encountered those, it means that after this patch bad tasks will be ignored when syncing from metadata storage rather than causing an exception to be thrown. IMO, this is an improvement, since the most likely reason for bad tasks is either: - A new version introduced an additional validation, and a pre-existing task doesn't pass it. - You are rolling back from a newer version to an older version. In both cases, I believe you would want to skip tasks that can't be deserialized, rather than blocking overlord startup. * Remove unused import. * Fix formatting. * Fix formatting.	2018-07-30 18:35:06 -07:00
Gian Merlino	63be028cee	CompactionTask: Reject empty intervals on construction. (#6059 ) * CompactionTask: Reject empty intervals on construction. They don't make sense anyway, and it's better to fail fast. * Switch API.	2018-07-30 08:52:50 -07:00
Benedict Jin	331a0afb98	Remove redundant type parameters and enforce some other style and inspection rules (#5980 ) * Various changes about druid-services module * Patch improvements from reviewer * Add ToArrayCallWithZeroLengthArrayArgument & ArraysAsListWithZeroOrOneArgument into inspection profile * Fix ArraysAsListWithZeroOrOneArgument * Fix conflict * Fix ToArrayCallWithZeroLengthArrayArgument * Fix AliEqualsAvoidNull * Remove blank line * Remove unused import clauses * Fix code style in TopNQueryRunnerTest * Fix conflict * Don't use Collections.singletonList when converting the type of array type * Add argLine into maven-surefire-plugin in druid-process module & increase the timeout value for testMoveSegment testcase * Roll back the latest commit * Add java.io.File#toURL() into druid-forbidden-apis * Using Boolean.parseBoolean instead of Boolean.valueOf for CliCoordinator#isOverlord * Add a new regexp element into stylecode xml file * Fix style error for new regexp * Set the level of ArraysAsListWithZeroOrOneArgument as WARNING * Fix style error for new regexp * Add option BY_LEVEL for ToArrayCallWithZeroLengthArrayArgument in inspection profile * Roll back the level as ToArrayCallWithZeroLengthArrayArgument as ERROR * Add toArray(new Object[0]) regexp into checkstyle config file & fix them * Set the level of ArraysAsListWithZeroOrOneArgument as ERROR & Roll back the level of ToArrayCallWithZeroLengthArrayArgument as WARNING until Youtrack fix it * Add a comment for string equals regexp in checkstyle config * Fix code format * Add RedundantTypeArguments as ERROR level inspection * Fix cannot resolve symbol datasource	2018-07-27 16:56:49 -05:00
Surekha	74ae73df42	Add the correct createdTime to waiting tasks (#6060 )	2018-07-27 14:56:19 -07:00
Jihoon Son	1524af703d	Fix IllegalArgumentException in TaskLockBox.syncFromStorage() (#6050 )	2018-07-27 10:43:32 -07:00
Surekha	414487a78e	Add support to filter on datasource for active tasks (#5998 ) * Add support to filter on datasource for active tasks * Added datasource filter to sql query for active tasks * Fixed unit tests * Address PR comments	2018-07-19 16:33:46 -07:00
Jihoon Son	c48aa74a30	Fix NPE while handling CheckpointNotice in KafkaSupervisor (#5996 ) * Fix NPE while handling CheckpointNotice * fix code style * Fix test * fix test * add a log for creating a new taskGroup * fix backward compatibility in KafkaIOConfig	2018-07-13 17:14:57 -07:00
Gian Merlino	04ea3c9f8c	Update license headers. (#5976 ) * Update license headers. For compliance with http://www.apache.org/legal/src-headers.html. * More license adjustments. * Fix mistakenly edited package line.	2018-07-11 09:55:18 -07:00
Gian Merlino	948e73da77	Extend various test timeouts. (#5978 ) False failures on Travis due to spurious timeout (in turn due to noisy neighbors) is a bigger problem than legitimate failures taking too long to time out. So it makes sense to extend timeouts.	2018-07-10 13:02:14 -07:00
Jihoon Son	1ccabab98e	Fix the broken Appenderator contract in KafkaIndexTask (#5905 ) * Fix broken Appenderator contract in KafkaIndexTask * fix build * add publishFuture * reuse sequenceToUse if possible	2018-07-03 13:31:29 -07:00
Jihoon Son	b6c957b0d2	Allow reordered segment allocation in kafka indexing service (#5805 ) * Allow reordered segment allocation in kafka indexing service * address comments * fix a bug	2018-07-02 15:09:12 -07:00
Surekha	933b25416c	Handle task deserialization failure in the tasks api (#5911 ) If task payload fails to deserialize json to Java, make the task null and handle null task in OverlordResource	2018-06-29 11:57:48 -07:00
Jonathan Wei	f3e1520360	Fix merge for TrueDimFilter (#5916 ) * Fix merge for TrueDimFilter * remove unused cache ID	2018-06-28 14:46:47 -07:00
Jihoon Son	8c5ded0fad	Splitting KafkaIndexTask for better code maintenance (#5854 ) * Refactoring KafkaIndexTask for better code maintenance * fix bug * fix test * add annotation * fix checkstyle * remove SetEndOffsetsResult	2018-06-22 13:00:03 -07:00
Surekha	8619adb5b9	Improve task retrieval APIs on Overlord (#5801 ) * Add the new tasks api in overlordResource It takes 4 optional query params * state(pending/running/waiting/compelte) * dataSource * interval (applies to completed tasks) * maxCompletedTasks (applies to completed tasks) If all params are null, the api returns all the tasks * Add the state to each task returned by tasks endpoint * divide active tasks into waiting, pending or running * Add more unit tests * Add UNKNOWN state to TaskState * Fix the authorization calls * WIP: PR comments Added new class to capture task info for caching Other refactoring * Refactoring : move TaskStatus class to druid-api so it can be accessed within server And other related classes like TaskState and TaskStatusPlus are in api * Remove unused class and apis accessing it * Add a separate cache for recently completed tasks This is to mainly capture the task type from payload * Ignore a test * Add a RuntimeTaskState to encompass all states a task can be in * Revert "Add a RuntimeTaskState to encompass all states a task can be in" This reverts commit `2a527a0731`. * Fix wrong api call * Fix and unignore tests * Remove waiting,pending state from TaskState * Add RunnerTaskState * Missed the annotation runnerStatusCode * Fix the creationTime * Fix the createdTime and queueInsertionTime for running/active tasks * Clean up tests * Add javadocs * Potentially fix the teamcity build * Address PR comments Get rid of TaskInfoBuilder Make TaskInfoMapper static nested class Other changes fix import in MaterializedViewSupervisor after merge * Address PR comments on * Replace global cache with local map * combine multiple queries into one * Removed unused code * Fix unit tests Fix a bug in securedTaskStatusPlus * Remove getRecentlyFinishedTaskStatuses method Change TaskInfoMapper signature to add generic type * Address PR comments * Passed datasource as argument to be used in sql query * Other minor fixes * Address PR comments Some minor changes, rename method, spacing changes Add early auth check if datasource is not null * Fix test case * Add max limit to getRecentlyFinishedTaskInfo in HeapMemoryTaskStorage * Add TaskLocation to Anytask object * Address PR comments * Fix a bug in test case causing ClassCastException	2018-06-19 11:34:59 -07:00
Jihoon Son	2feec44a55	Fix mismatch in revoked task locks between memory and metastore after sync from storage (#5858 ) * Fix mismatched revoked task locks after sync from storage * fix build * fix log * fix lock release	2018-06-12 10:25:34 -04:00
zhangxinyu	e43e5ebbcd	Materialized view implementation (#5556 ) * implement materialized view * modify code according to jihoonson's comments * modify code according to jihoonson's comments - 2 * add documentation about materialized view * use new HadoopTuningConfig in pr 5583 * add minDataLag and fix optimizer bug * correct value of DEFAULT_MIN_DATA_LAG_MS * modify code according to jihoonson's comments - 3 * use the boolean expression instead of if-else	2018-06-09 12:24:54 -07:00
Jonathan Wei	684b5d18c1	Moving averages for ingestion row stats (#5748 ) * Moving averages for ingestion row stats * PR comments * Make RowIngestionMeters extensible * test and checkstyle fixes * More PR comments * Fix metrics * Add some comments * PR comments * Comments	2018-06-05 09:08:57 -07:00
Atul Mohan	1b9611a60e	Local indexing from RDBMS (#5441 ) * Local indexing from RDBMS * Fix content * Remove pom changes * Remove extraneous space * Add tests and update documentation * Fix comments * Fix docs * Fix build related issue * Handle invalid strings * Make target database independent of metadata storage * Add firehose connector * Fix accessibility * Add docs * Remove unused def * Remove lazy instantiation of jsoniterator * Move unused changes * Move unused changes * Fix build * Make Sqlfirehose method private	2018-05-22 12:33:01 +09:00
Gian Merlino	f2cc6ce4d5	VersionedIntervalTimeline: Optimize construction with heavily populated holders. (#5777 ) * VersionedIntervalTimeline: Optimize construction with heavily populated holders. Each time a segment is "add"ed to a timeline, "isComplete" is called on the holder that it is added to. "isComplete" is an O(segments per chunk) operation, meaning that adding N segments to a chunk is an O(N^2) operation. This blows up badly if we have thousands of segments per chunk. The patch defers the "isComplete" check until after all segments have been inserted. * Fix imports.	2018-05-16 09:16:59 -07:00
Jihoon Son	9dca5ec76b	Simple cleanup for ThreadPoolTaskRunner and SetAndVerifyContextQueryRunner / Add ThreadPoolTaskRunnerTest (#5557 ) * Simple fix for ThreadPoolTaskRunner * fix build * address comments * update javadoc * fix build * fix test * add dependency	2018-05-15 22:53:11 +05:30
Jihoon Son	c9d645103b	Fix metrics for inserting segments (#5749 ) * Fix metrics for inserting segments * Add a comment	2018-05-08 13:07:39 -07:00
Kirill Kozlov	67d0b0ee42	Add taskType dimension to task metrics (#5664 )	2018-05-07 09:42:26 -07:00
Slim Bouguerra	8aa8d9fa5b	Kerberos Spnego Authentication Router Issue (#5706 ) * Adding decoration method to proxy servlet Change-Id: I872f9282fb60bfa20524271535980a36a87b9621 * moving the proxy request decoration to authenticators Change-Id: I7f94b9ff5ecf08e8abf7169b58bc410f33148448 * added docs Change-Id: I901543e52f0faf4666bfea6256a7c05593b1ae70 * use the authentication result to decorate request Change-Id: I052650de9cd02b4faefdbcdaf2332dd3b2966af5 * adding authenticated by name Change-Id: I074d2933460165feeddb19352eac9bd0f96f42ca * ensure that authenticator is not null Change-Id: Idb58e308f90db88224a06f3759114872165b24f5 * fix types and minor bug Change-Id: I6801d49a05d5d8324406fc0280286954eb66db10 * fix typo Change-Id: I390b12af74f44d760d0812a519125fbf0df4e97b * use actual type names Change-Id: I62c3ee763363781e52809ec912aafd50b8486b8e * set authenitcatedBy to null for AutheticationResults created by Escalator. Change-Id: I4a675c372f59ebd8a8d19c61b85a1e4bf227a8ba	2018-05-05 20:33:51 -07:00
kaijianding	c12c16385e	support throw duplcate row during realtime ingestion in RealtimePlumber (#5693 )	2018-05-04 10:12:25 -07:00
Surekha	13c616ba24	'maxBytesInMemory' tuningConfig introduced for ingestion tasks (#5583 ) * This commit introduces a new tuning config called 'maxBytesInMemory' for ingestion tasks Currently a config called 'maxRowsInMemory' is present which affects how much memory gets used for indexing.If this value is not optimal for your JVM heap size, it could lead to OutOfMemoryError sometimes. A lower value will lead to frequent persists which might be bad for query performance and a higher value will limit number of persists but require more jvm heap space and could lead to OOM. 'maxBytesInMemory' is an attempt to solve this problem. It limits the total number of bytes kept in memory before persisting. * The default value is 1/3(Runtime.maxMemory()) * To maintain the current behaviour set 'maxBytesInMemory' to -1 * If both 'maxRowsInMemory' and 'maxBytesInMemory' are present, both of them will be respected i.e. the first one to go above threshold will trigger persist * Fix check style and remove a comment * Add overlord unsecured paths to coordinator when using combined service (#5579) * Add overlord unsecured paths to coordinator when using combined service * PR comment * More error reporting and stats for ingestion tasks (#5418) * Add more indexing task status and error reporting * PR comments, add support in AppenderatorDriverRealtimeIndexTask * Use TaskReport instead of metrics/context * Fix tests * Use TaskReport uploads * Refactor fire department metrics retrieval * Refactor input row serde in hadoop task * Refactor hadoop task loader names * Truncate error message in TaskStatus, add errorMsg to task report * PR comments * Allow getDomain to return disjointed intervals (#5570) * Allow getDomain to return disjointed intervals * Indentation issues * Adding feature thetaSketchConstant to do some set operation in PostAgg (#5551) * Adding feature thetaSketchConstant to do some set operation in PostAggregator * Updated review comments for PR #5551 - Adding thetaSketchConstant * Fixed CI build issue * Updated review comments 2 for PR #5551 - Adding thetaSketchConstant * Fix taskDuration docs for KafkaIndexingService (#5572) * With incremental handoff the changed line is no longer true. * Add doc for automatic pendingSegments (#5565) * Add missing doc for automatic pendingSegments * address comments * Fix indexTask to respect forceExtendableShardSpecs (#5509) * Fix indexTask to respect forceExtendableShardSpecs * add comments * Deprecate spark2 profile in pom.xml (#5581) Deprecated due to https://github.com/druid-io/druid/pull/5382 * CompressionUtils: Add support for decompressing xz, bz2, zip. (#5586) Also switch various firehoses to the new method. Fixes #5585. * This commit introduces a new tuning config called 'maxBytesInMemory' for ingestion tasks Currently a config called 'maxRowsInMemory' is present which affects how much memory gets used for indexing.If this value is not optimal for your JVM heap size, it could lead to OutOfMemoryError sometimes. A lower value will lead to frequent persists which might be bad for query performance and a higher value will limit number of persists but require more jvm heap space and could lead to OOM. 'maxBytesInMemory' is an attempt to solve this problem. It limits the total number of bytes kept in memory before persisting. * The default value is 1/3(Runtime.maxMemory()) * To maintain the current behaviour set 'maxBytesInMemory' to -1 * If both 'maxRowsInMemory' and 'maxBytesInMemory' are present, both of them will be respected i.e. the first one to go above threshold will trigger persist * Address code review comments * Fix the coding style according to druid conventions * Add more javadocs * Rename some variables/methods * Other minor issues * Address more code review comments * Some refactoring to put defaults in IndexTaskUtils * Added check for maxBytesInMemory in AppenderatorImpl * Decrement bytes in abandonSegment * Test unit test for multiple sinks in single appenderator * Fix some merge conflicts after rebase * Fix some style checks * Merge conflicts * Fix failing tests Add back check for 0 maxBytesInMemory in OnHeapIncrementalIndex * Address PR comments * Put defaults for maxRows and maxBytes in TuningConfig * Change/add javadocs * Refactoring and renaming some variables/methods * Fix TeamCity inspection warnings * Added maxBytesInMemory config to HadoopTuningConfig * Updated the docs and examples * Added maxBytesInMemory config in docs * Removed references to maxRowsInMemory under tuningConfig in examples * Set maxBytesInMemory to 0 until used Set the maxBytesInMemory to 0 if user does not set it as part of tuningConfing and set to part of max jvm memory when ingestion task starts * Update toString in KafkaSupervisorTuningConfig * Use correct maxBytesInMemory value in AppenderatorImpl * Update DEFAULT_MAX_BYTES_IN_MEMORY to 1/6 max jvm memory Experimenting with various defaults, 1/3 jvm memory causes OOM * Update docs to correct maxBytesInMemory default value * Minor to rename and add comment * Add more details in docs * Address new PR comments * Address PR comments * Fix spelling typo	2018-05-03 16:25:58 -07:00
David Lim	8ec2d2fe18	Use unique segment paths for Kafka indexing (#5692 ) * support unique segment file paths * forbiddenapis * code review changes * code review changes * code review changes * checkstyle fix	2018-04-29 21:59:48 -07:00
Gian Merlino	762f8829e4	Add task action metrics, add taskId metric dimension. (#5714 ) * Add task action metrics, add taskId metric dimension. Adds two new metrics: task/action/log/time and task/action/run/time. Also adds taskId as a dimension, to give us the ability to drill down into metrics for an individual task. Also standardizes metrics-attachment using two helper methods in IndexTaskUtils. * Fix typo	2018-04-29 21:24:06 -07:00
Jihoon Son	23dc0d5b24	Better logging for taskLockBox (#5703 )	2018-04-28 21:08:10 -07:00
Roman Leventov	9be000758d	Refactor index merging, replace Rowboats with RowIterators and RowPointers (#5335 ) * Refactor index merging, replace Rowboats with RowIterators and RowPointers * Add javadocs * Fix a bug in QueryableIndexIndexableAdapter * Fixes * Remove unused declarations * Remove unused GenericColumn.isNull() method * Fix test * Address comments * Rearrange some code in MergingRowIterator for more clarity * Self-review * Fix style * Improve docs * Fix docs * Rename IndexMergerV9.writeDimValueAndSetupDimConversion to setUpDimConversion() * Update Javadocs * Minor fixes * Doc fixes, more code comments, cleanup of RowCombiningTimeAndDimsIterator * Fix doc link	2018-04-27 17:34:32 -07:00
Roman Leventov	a3a9ada843	Add GenericWhitespace checkstyle check (#5668 )	2018-04-24 01:09:14 +05:30
Jihoon Son	f349e03091	Fix NPE in compactionTask (#5613 ) * Fix NPE in compactionTask * more annotations for metadata * better error message for empty input * fix build * revert some null checks * address comments	2018-04-13 00:11:03 -04:00
Caroline1000	d709d1a59f	correct overlord console header. Because it's not the coordinator console, it's the overlord console. (#5627 )	2018-04-11 20:44:31 -04:00
Nishant Bangarwa	e6efd75a3d	Add config to allow setting up custom unsecured paths for druid nodes. (#5614 ) * Add config to allow setting up custom unsecured paths for druid nodes. * return all resources for Unsecured paths * review comment - Add test * fix tests * fix test	2018-04-11 17:10:07 -07:00
Jihoon Son	298ed1755d	Fix indexTask to respect forceExtendableShardSpecs (#5509 ) * Fix indexTask to respect forceExtendableShardSpecs * add comments	2018-04-05 23:54:59 -07:00
Jonathan Wei	969342cd28	More error reporting and stats for ingestion tasks (#5418 ) * Add more indexing task status and error reporting * PR comments, add support in AppenderatorDriverRealtimeIndexTask * Use TaskReport instead of metrics/context * Fix tests * Use TaskReport uploads * Refactor fire department metrics retrieval * Refactor input row serde in hadoop task * Refactor hadoop task loader names * Truncate error message in TaskStatus, add errorMsg to task report * PR comments	2018-04-05 21:38:57 -07:00
Jihoon Son	7239f56131	Fix NPE in RemoteTaskRunner when some tasks in ZooKeeper but not in Overlord (#5511 ) * Fix NPE in RemoteTaskRunner when some tasks in ZooKeeper but not in Overlord * revert unnecessary change	2018-04-03 21:15:58 -07:00
Jonathan Wei	723f7ac550	Add support for task reports, upload reports to deep storage (#5524 ) * Add support for task reports, upload reports to deep storage * PR comments * Better name for method * Fix report file upload * Use TaskReportFileWriter * Checkstyle * More PR comments	2018-04-02 12:10:56 -07:00
Kirill Kozlov	8878a7ff94	Replace guava Charsets with native java StandardCharsets (#5545 )	2018-03-28 21:00:08 -07:00
Jihoon Son	1ad898bde2	Use the official aws-sdk instead of jet3t (#5382 ) * Use the official aws-sdk instead of jet3t * fix compile and serde tests * address comments and fix test * add http version string * remove redundant dependencies, fix potential NPE, and fix test * resolve TODOs * fix build * downgrade jackson version to 2.6.7 * fix test * resolve the last TODO * support proxy and endpoint configurations * fix build * remove debugging log * downgrade hadoop version to 2.8.3 * fix tests * remove unused log * fix it test * revert KerberosAuthenticator change * change hadoop-aws scope to provided in hdfs-storage * address comments * address comments	2018-03-21 15:36:54 -07:00
Charles Allen	58f110f7f8	Future-proof some Guava usage (#5414 ) * Future-proof some Guava usage * Use a java-util EmptyIterator instead of Guava's * Change some of the guava future handling to do manual async transforms. Guava changes transform into transformAsync by deprecating transform in ONLY Guava 19. Then its gone in 20 * Use `Collections.emptyIterator()` * Pretty formatting * Make listenable future transforms a thing in default druid * Format fix * Add forbidden guava apis * Make the ListenableFutrues.transformAsync have comments * Undo intellij bad pattern matching in comments * Futrues --> Futures * Add empty iterators forbidding * Fix extra `A` * Correct method signature * Address review comments * Finish Gian review comments * Proper syntax from https://github.com/policeman-tools/forbidden-apis/wiki/SignaturesSyntax	2018-03-20 08:59:33 -07:00
Jonathan Wei	b22455b924	Fix supervisor tombstone auth handling (#5504 )	2018-03-19 12:55:47 -07:00
Roman Leventov	693e3575f9	Remove unused code and exception declarations (#5461 ) * Remove unused code and exception declarations * Address comments * Remove redundant Exception declarations * Make FirehoseFactoryV2.connect() to throw IOException again	2018-03-16 22:11:12 +01:00
Jonathan Wei	30e6bdedf3	Authorize supervisor history instead of current active supervisors for supervisor history API (#5501 )	2018-03-16 12:29:17 -07:00
Jihoon Son	9b2a25bd84	Refactor supervisorReport to be type-safe (#5479 ) * refactor supervisorReport * use primitives	2018-03-13 09:28:44 -07:00
Niraja Mishra	96cebfc222	As part of this feature, implemented a new endpoint to get running tasks by datasources (#5260 ) and added datasource information as part of existing endpoint /druid/indexer/v1/runningTasks. Added junit test cases for the newly implemented API and fixed existing junit test cases. Fixed review comments - added new method getCreatedDateTimeAndDataSource into TaskStorageQueryAdapter class and formatted changed files	2018-03-12 23:48:11 -07:00
Alexander Korablev	8a51800693	fix PortFinder issue #5466 (#5467 )	2018-03-05 16:58:49 -08:00
Jonathan Wei	b63f1c0e45	Fix authorization check in supervisor history API (#5460 )	2018-03-02 14:03:07 -08:00
Kevin Conaway	969a12f6ca	#5425 Refactor to use map.get() when asserting the existence of published segments (#5426 )	2018-03-02 19:42:39 +01:00
Gian Merlino	e4eaee3806	Support for disabling bitmap indexes. (#5402 ) * Support for disabling bitmap indexes. Can save space for columns where bitmap indexes are pointless (like free-form text). * Remove import. * Fix CompactionTaskTest. * Update for review comments. * Review comments, tests. * Fix test.	2018-02-28 19:19:56 -08:00
Gian Merlino	3f72537787	Fix missing task type in task payload API. (#5399 ) * Fix missing task type in task payload API. Apparently embedding a polymorphic object inside a Map<String, Object> is a bit too much for Jackson to serialize properly. Fix this by using wrapper classes. * Fix OverlordTest casts. * Remove import. * Remove unused imports. * Clarify comments.	2018-02-21 10:00:13 -08:00
Jihoon Son	cd929000ca	Change early publishing to early pushing in indexTask & refactor AppenderatorDriver (#5297 ) * Fix early publishing to early pushing in batch indexing & refactor appenderatorDriver * fix compile * rename and add more javadocs * Fix conflicts * address comments * revert await executors * fix test	2018-02-14 12:48:33 -08:00
Jonathan Wei	b234a119ac	Log exceptions thrown before persist() for indexing tasks (#5374 ) * Log exceptions thrown before persist() for indexing tasks * PR comment	2018-02-13 09:20:07 -08:00
Roman Leventov	e64ffb10c2	Standartize on using Integer.BYTES instead of Ints.BYTES from Guava, same for other primitives (#5366 )	2018-02-07 13:24:30 -08:00
Jihoon Son	2099b43e5f	Add a new config object for compactConfig (#5264 ) * add a new config object for compactConfig * fix test * address comments * Update doc	2018-02-06 12:13:52 -08:00
Kevin Conaway	93fdbcb364	Change RealtimeIndexTask to use AppenderatorDriver (#5261 ) * Change RealtimeIndexTask to use AppenderatorDriver instead of RealtimePlumber. Related to #4774 * Remove unused throwableDuringPublishing * Fix usage of forbidden API * Update realtime index IT to account for not skipping older data any more * Separate out waiting on publish futures and handoff futures to avoid a race condition where the handoff timeout expires before the segment is published * #5261 Add separate AppenderatorDriverRealtimeIndexTask and revert changes to RealtimeIndexTask * #5261 Add separate AppenderatorDriverRealtimeIndexTask and revert changes to RealtimeIndexTask * #5261 Readability improvements in AppenderatorDriverRealtimeIndexTask. Combine publish and handoff futures in to single future * #5261 Add separate tuningConfig for RealtimeAppenderatorIndexTask. Revert changes to RealtimeTuningConfig * #5261 Change JSON type to realtime_appenderator to keep the same naming pattern as RealtimeIndexTask	2018-02-06 10:21:31 -08:00
Gian Merlino	9a62b02cb7	Extensions: Option to load classes from extension jars first. (#5321 ) The behavior is configurable through druid.extensions.useExtensionClassloaderFirst. It is useful when extensions want to load a dependency different from one provided by Druid, for example a different version of geoip or protobuf.	2018-02-06 16:14:03 +05:30
Gian Merlino	7e02408510	Update versions to 0.13.0-SNAPSHOT. (#5323 )	2018-02-02 12:06:38 -06:00
Clint Wylie	1fffc681d2	fix RemoteTaskRunner terminating lazy workers below autoscaler minNumWorkers value (#5310 ) * fix RemoteTaskRunner terminating lazy workers below autoscaler minNumWorkers value * add comment	2018-02-01 17:57:01 +01:00
Jihoon Son	3a69b0e513	Handle nullable taskTypes for rolling upgrade (#5309 )	2018-01-30 13:32:54 -08:00
Himanshu	59250cf19b	add taskType from announcement in HttpRemoteTaskRunnerWorkItem (#5301 )	2018-01-26 15:47:35 -08:00
Jonathan Wei	80419752b5	Add metamx emitter, http clients, and metrics packages to druid java-util (#5289 ) * Add metamx java-util emitter, http clients, and metrics packages to druid java-util * Remove metamx java-util from pom.xml files * Checkstyle fixes * Import fix * TeamCity inspection fixes * Use slf4j, move some version defs to master pom.xml * Use parent jvm-attach-api and maven-surefire-plugin versions * Add ] to log msg, suppress inspection	2018-01-24 22:10:36 +01:00
Roman Leventov	61e6878afd	Check Javadoc reference integrity (#5279 )	2018-01-22 13:51:28 -08:00
Jihoon Son	241efafbb2	Automatic compaction by coordinators (#5102 ) * Automatic compaction by coordinator * add links * skip compaction for very recent segments if they are small * fix finding search interval * fix finding search interval * fix TimelineHolder iteration * add test for newestSegmentFirstPolicy * add CompactionSegmentIterator * add numTargetCompactionSegments * add missing config * fix skipping huge shards * fix handling large number of segments per shard * fix test failure * change recursive call to loop * fix logging * fix build * fix test failure * address comments * change dataSources type * check running pendingTasks at each run * fix test * address comments * fix build * fix test * address comments * address comments * add doc for segment size optimization * address comment	2018-01-13 13:52:37 +09:00
Roman Leventov	8877ce38d6	Enforce modifier order with Checkstyle (#5246 )	2018-01-11 09:50:42 +01:00
Gian Merlino	7b8b0a96d6	IndexTask: Add summary stats line at the end. (#5241 )	2018-01-10 14:06:54 +09:00
Himanshu	a46d34daa2	HTTP based task/worker management. (#5104 ) * just renaming of SegmentChangeRequestHistory etc * additional change history refactoring changes * WorkerTaskManager a replica of WorkerTaskMonitor * HttpServerInventoryView refactoring to extract sync code and robustification * Introducing HttpRemoteTaskRunner * Additional Worker side updates	2018-01-04 19:19:35 -08:00
Roman Leventov	579f9fbedf	Add IndexedInts.debugToString() and AbstractIndex.toString(); Add Sequence.toList() and limit() (#5175 ) * Add IndexedInts.debugToString() and AbstractIndex.toString() * Fix AppenderatorTest	2018-01-04 09:56:47 +09:00
David Lim	a7967ade4d	Support replaceExisting parameter for segments pushers (#5187 ) * support replaceExisting parameter for segments pushers * code review changes * code review changes	2018-01-03 16:13:21 -08:00
Jihoon Son	b31abd03ad	Fix timeout in RemoteTaskRunnerTest (#5191 ) * Fix timeout in RemoteTaskRunnerTest * add message for npe	2017-12-22 17:40:11 +09:00
Jihoon Son	9199d61389	Automatic pendingSegments cleanup (#5149 ) * PendingSegments cleanup * fix build * address comments * address comments * fix potential npe * address comments * fix build * fix test * fix test	2017-12-20 14:46:34 -08:00
Roman Leventov	5787d04fad	Bump Druid version to 0.12.0 (#5138 )	2017-12-15 07:37:01 -08:00
Roman Leventov	64848c7ebf	DataSegment memory optimizations (#5094 ) * Deduplicate DataSegments contents (loadSpec's keys, dimensions and metrics lists as a whole) more aggressively; use ArrayMap instead of default LinkedHashMap for DataSegment.loadSpec, because they have only 3 entries on average; prune DataSegment.loadSpec on brokers * Fix DataSegmentTest * Refinements * Try to fix * Fix the second DataSegmentTest * Nullability * Fix tests * Fix tests, unify to use TestHelper.getJsonMapper() * Revert TestUtil as ServerTestHelper, fix tests * Add newline * Fix indexing tests * Fix s3 tests * Try to fix tests, remove lazy caching of ObjectMapper in TestHelper, rename TestHelper.getJsonMapper() to makeJsonMapper() * Fix HDFS tests * Fix HdfsDataSegmentPusherTest * Capitalize constant names	2017-12-12 11:41:40 -08:00
Gian Merlino	4f5e2b4549	Fix some unemitted alerts. (#5141 )	2017-12-06 18:37:01 -08:00
Roman Leventov	a7a6a0487e	Replace IOPeon with SegmentWriteOutMedium; Improve buffer compression (#4762 ) * Replace IOPeon with OutputMedium; Improve compression * Fix test * Cleanup CompressionStrategy * Javadocs * Add OutputBytesTest * Address comments * Random access in OutputBytes and GenericIndexedWriter * Fix bugs * Fixes * Test OutputBytes.readFully() * Address comments * Rename OutputMedium to SegmentWriteOutMedium and OutputBytes to WriteOutBytes * Add comments to ByteBufferInputStream * Remove unused declarations	2017-12-04 18:04:27 -08:00
Parag Jain	7c01f77b04	Parse Batch support (#5081 ) * add parseBatch and deprecate parse method in InputRowParser add addAll method, skip max rows in memory check for it remove parse method from implemetations transform transformers add string multiplier input row parser fix withParseSpec fix kafka batch indexing fix isPersistRequired comments * add unit test * make persist async * review comments	2017-12-04 16:06:16 -06:00
Jihoon Son	645de02fb2	Remove Injector from IngestSegmentFirehoseFactory (#5045 ) * Add lock check to segment list actions * fix test * remove lock check	2017-11-20 15:54:35 -08:00
Parag Jain	cb03efeb14	Kafka Index Task that supports Incremental handoffs (#4815 ) * Kafka Index Task that supports Incremental handoffs - Incrementally handoff segments when they hit maxRowsPerSegment limit - Decouple segment partitioning from Kafka partitioning, all records from consumed partitions go to a single druid segment - Support for restoring task on middle manager restarts by check pointing end offsets for segments * take care of review comments * make getCurrentOffsets call async, keep track of publishing sequence, review comments * fix setEndoffset duplicate request handling, formatting * fix unit test * backward compatibility * make AppenderatorDriverMetadata backwards compatible * add unit test * fix deadlock between persist and push executors in AppenderatorImpl * fix formatting * use persist dir instead of work dir * review comments * fix deadlock * actually fix deadlock	2017-11-17 16:05:20 -06:00
Jihoon Son	93459f748f	Fix missing intervals after compacting intervals (#5092 ) * Fix missing intervals after compacting intervals * fix build	2017-11-16 11:42:38 -08:00
Jonathan Wei	af44d1142b	Add unsecured /health endpoint, remove auth checks from isLeader (#5087 ) * Add unsecured /health endpoint, remove auth checks from isLeader * PR comments	2017-11-15 14:41:30 -06:00
Akash Dwivedi	c1538f29fc	maxQueryTimeout property in runtime properties. (#4852 ) * maxQueryTimeout property in runtime properties. * extra line * move withTimeoutAndMaxScatterGatherBytes method to QueryLifeCycle. * Fix initialize method. * remove unused import. * doc update. * some more details in doc about query failure.. * minor fix. * decorating QueryRunner to set and verify context. Added by servers. * remove whitespace.	2017-11-13 19:23:11 -06:00
Himanshu	2ecebb3173	Fix coordinator/overlord redirects when TLS is enabled (#5037 ) * Fix coordinator/overlord redirects when TLS is enabled * address review comment * fix UTs * workaround to not ignore URL instance to fix the teamcity build * update tls doc	2017-11-09 13:10:28 -08:00
Jihoon Son	c11c71ab3e	Using ImmutableDruidDataSource as a key for map and set instead of DruidDataSource (#5054 ) * use ImmutableDruidDataSource for map and set * address comments * unused import * allow returning only ImmutableDruidDataSource in MetadataSegmentManager * address comments * remove TreeSet * revert to use TreeSet	2017-11-09 16:07:58 -03:00
Roman Leventov	3541b7544b	Prohibit and remove unused declarations in the processing module (#4930 ) * Prohibit and remove unused declarations in the processing module * Fix tests * Fix integration tests * Suppress unused * Try to remove SuppressWarnings unused in VirtualColumn * Remove reset 'false positives' * Annotate CliCommandCreator as ExtensionPoint * Unused import warning instead of error in IntelliJ * Fixes * Add comment * Fix AzureBlob * Fix CloudFilesBlob * Address comments * Add Project SDK section to INTELLIJ_SETUP.md * Fix image	2017-11-09 09:27:27 -08:00
Jihoon Son	5f3c863d5e	Add compaction task (#4985 ) * Add compaction task * added doc * use combining aggregators * address comments * add support for dimensionsSpec * fix getUniqueDims and getUniqueMetics * find unique dimensionsSpec * fix compilation * add unit test * fix test * fix test * test for different dimension orderings and types, and doc for type and ordering * add control for custom ordering and type * update doc * fix compile * fix compile * add segments param * fix serde error * fix build	2017-11-03 21:55:27 -06:00
Gian Merlino	6c725a7e06	Fix havingSpec on complex aggregators. (#5024 ) * Fix havingSpec on complex aggregators. - Uses the technique from #4883 on DimFilterHavingSpec too. - Also uses Transformers from #4890, necessitating a move of that and other related classes from druid-server to druid-processing. They probably make more sense there anyway. - Adds a SQL query test. Fixes #4957. * Remove unused import.	2017-11-01 12:58:08 -04:00
Jihoon Son	e96daa2593	Fix SQLMetadataSegmentManager (#5001 )	2017-10-31 08:02:41 -07:00
Gian Merlino	0ce406bdf1	Introduce "transformSpec" at ingest-time. (#4890 ) * Introduce "transformSpec" at ingest-time. It accepts a "filter" (standard query filter object) and "transforms" (a list of objects with "name" and "expression"). These can be used to do filtering and single-row transforms without need for a separate data processing job. The "expression" fields use the same expression language as other expression-based feature. * Remove forbidden api. * Fix compile error. * Fix tests. * Some more changes. - Add nullable annotation to Firehose.nextRow. - Add tests for index task, realtime task, kafka task, hadoop mapper, and ingestSegment firehose. * Fix bad merge. * Adjust imports. * Adjust whitespace. * Make Transform into an interface. * Add missing annotation. * Switch logger. * Switch logger. * Adjust test. * Adjustment to handling for DatasourceIngestionSpec. * Fix test. * CR comments. * Remove unused method. * Add javadocs. * More javadocs, and always decorate. * Fix bug in TransformingStringInputRowParser. * Fix bad merge. * Fix ISFF tests. * Fix DORC test.	2017-10-30 17:38:52 -07:00
Gian Merlino	5fc6891404	Reduce code duplication between test ExprMacroTables. (#4979 )	2017-10-18 15:57:49 -05:00
Roman Leventov	dc7cb117a1	Refactor ColumnSelectorFactory; Rely on ColumnValueSelector's polymorphism (#4886 ) * Refactor ColumnSelectorFactory; Rely on ColumnValueSelector's polymorphism * Fix MapVirtualColumn.makeColumnValueSelector() * Minor fixes * Fix IndexGeneratorCombinerTest * DimensionSelector to return zeros when treated as numeric ColumnValueSelector * Fix IncrementalIndexTest * Fix IncrementalIndex.makeColumnSelectorFactory() * Optimize MapBasedRow.getMetric() * Fix VarianceAggregatorTest * Simplify IncrementalIndex.makeColumnSelectorFactory() * Address comments * More comments * Test	2017-10-13 21:44:17 -05:00
Jihoon Son	8d9902831e	Refactoring PrefetchableTextFilesFirehoseFactory (#4836 ) * Refactoring prefetchable firehose * Fix to read cache when prefetch is disabled * More tests * Cleanup codes * Add Fetcher * Fix test failure * Count file size * Fix test * rename generic parameter * address comments * address comments * reuse buffer * move Execs to java-util * use execs * Fix build	2017-10-13 21:39:28 -05:00
Jihoon Son	675c6c00dd	Add checkstyle and intellij rule to prohibit unnecessary qualifiers in interfaces (#4958 ) * add checkstyle and intellij rule * fix tc fail	2017-10-13 07:56:19 -07:00
Atul Mohan	c07678b143	Synchronization of lookups during startup of druid processes (#4758 ) * Changes for lookup synchronization * Refactor of Lookup classes * Minor refactors and doc update * Change coordinator instance to be retrieved by DruidLeaderClient * Wait before thread shutdown * Make disablelookups flag true by default * Update docs * Rename flag * Move executorservice shutdown to finally block * Update LookupConfig * Refactoring and doc changes * Remove lookup config constructor * Revert Lookupconfig constructor changes * Add tests to LookupConfig * Make executorservice local * Update LRM * Move ListeningScheduledExecutorService to ExecutorCompletionService * Move exception to outer block * Remove check to see future is done * Remove unnecessary assignment * Add logging	2017-10-12 21:22:24 -05:00
Jihoon Son	dfa9cdc982	Prioritized locking (#4550 ) * Implementation of prioritized locking * Fix build failure * Fix tc fail * Fix typos * Fix IndexTaskTest * Addressed comments * Fix test * Fix spacing * Fix build error * Fix build error * Add lock status * Cleanup suspicious method * Add nullables * add doInCriticalSection to TaskLockBox and revert return type of task actions * fix build * refactor CriticalAction * make replaceLock transactional * fix formatting * fix javadoc * fix build	2017-10-11 23:16:31 -07:00
Jihoon Son	56fb11ce0b	Lazy initialization for JavaScript functions (#4871 ) * Lazy initialization of JavaScript functions * Fix test failure * Fix thread-safety and postpone js conf check * Fix test fail * Fix test * Fix KafkaIndexTaskTest * Move config check	2017-10-10 21:52:42 -07:00
Gian Merlino	797b54d283	DruidLeaderClient: Throw IOException on retryable errors. (#4913 ) * DruidLeaderClient: Throw IOException on retryable errors. Fixes #4911. * Adjustments.	2017-10-06 15:12:09 -05:00
Parag Jain	535c034c06	assume scheme to be http if not present (#4912 )	2017-10-06 14:50:48 -05:00
Gian Merlino	c19cd23e94	RTR: Demote chatty log message. (#4895 ) "No worker selection strategy set." would get logged any time tryAssignTask runs in the default configuration, which is often. It also doesn't provide much value.	2017-10-03 08:16:32 -07:00
Roman Leventov	3f1009aaa1	Make Overlord auto-scaling and provisioning extensible (#4730 ) * Make AutoScaler, ProvisioningStrategy and BaseWorkerBehaviorConfig extension points; More logging in PendingTaskBasedWorkerProvisioningStrategy * Address comments and fix a bug * Extract method * debug logging * Rename BaseWorkerBehaviorConfig to WorkerBehaviorConfig and WorkerBehaviorConfig to DefaultWorkerBehaviorConfig * Fixes	2017-10-02 20:12:23 -05:00
QiuMM	6f91d9ca1e	change WorkerSelectStrategy's defaultImpl from FillCapacityWorkerSelectStrategy to EqualDistributionWorkerSelectStrategy (#4777 )	2017-10-02 16:52:41 -07:00
Jonathan Wei	5e60ccade1	Add context map to AuthenticationResult (#4870 )	2017-10-02 17:08:14 -05:00
Jihoon Son	ee7eaccbab	Better logging for SegmentAllocateAction (#4884 ) * Better logging for SegmentAllocateAction * Split methods	2017-10-02 09:29:21 -07:00
Gian Merlino	1f2074c247	Bump versions in master to 0.11.1-SNAPSHOT. (#4878 ) * Bump versions in master to 0.11.1-SNAPSHOT. * Missed a few.	2017-09-28 17:09:51 -05:00
Himanshu	f69c9280c4	remove ServerConfig from DruidNode as all information needs to be present in DruidNode serialized form (#4858 ) * remove ServerConfig from DruidNode as all information needs to be present in DruidNode serialized form * sanitize output of /druid/coordinator/v1/cluster endpoint	2017-09-28 10:40:59 -05:00
Roman Leventov	9c126e2aa9	Forbid MapMaker (#4845 ) * Forbid MapMaker * Shorter syntax * Forbid Maps.newConcurrentMap()	2017-09-27 06:49:47 -07:00
Roman Leventov	e267f3901b	Enforce Indentation with Checkstyle (#4799 )	2017-09-21 13:06:48 -07:00
Jonathan Wei	3a4a483bb0	Single auth check for authorized resource filtering (#4818 ) * Single auth check for authorized resource filtering * PR comment * PR comments	2017-09-19 21:46:08 +05:30
Jonathan Wei	c2a0e753b6	Extension points for authentication/authorization (#4271 ) * Extension points for authentication/authorization * Address some PR comments * Authorization result caching * Add unit tests for SecuritySanityCheckFilter and PreResponseAuthorizationCheckFilter * Use Set for auth caching, close outputstreams in filters * Don't close output stream on success in sanity check filter * Add ConfigResourceFilter to coordinator lookups * Fix filtering authorization check for empty resource list * HttpClient users must explicitly escalate the client * Remove response modification from PreResponseAuthorizationCheckFilter * Remove extraneous pom.xml * Fix unit test * Better lifecycle management * Rename AuthorizationManager to Authorizer * Fix authorization denials for empty supervisor list * Address some PR comments * Address more PR comments * Small cleanup * Add Jetty HttpClient wrapper to Authenticator * Remove Authorizer start/stop * Restore immutable context map in DruidConnection, UT fix * Fix/update docs * Add authorization checks to EventReceiverFirehose * Fix router authorization check failure, restore PreResponseAuthorizationFilter changes * Compile fixes * Test fixes * Update Authenticator/Authorizer doc comments * Merge fixes * PR comments * Fix test * Fix IT * More PR comments * PR comments * SSL fix	2017-09-15 23:45:48 -07:00
Egor Riashin	6f3e52b3db	Make optional Peon "stdin" check (#4760 )	2017-09-11 16:37:01 -05:00
Himanshu	834e050bc4	Use internal-discovery and http for talking to overlord/coordinator leaders (#4735 ) * Use internal-discovery and http for talking to overlord/coordinator leaders * CuratorDruidNodeDiscovery.getAllNodes() best effort 30 sec wait for cache initialization * DruidLeaderClientProvider to eagerly instantiate DruidNodeDiscovery when needed so that DruidNodeDiscovery impl cache gets initialized well in time * Revert "DruidLeaderClientProvider to eagerly instantiate DruidNodeDiscovery when needed so that DruidNodeDiscovery impl cache gets initialized well in time" This reverts commit f1a2432614ba56ddc2d55fe47e990d17fcfd6129. * add lifecycle to DruidLeaderClient to early initialize DruidNodeDiscovery so that it has its cache update well in time	2017-09-11 11:18:01 -07:00
dgolitsyn	752151f6cb	Add CachingCostBalancerStrategy (#4731 ) * Add CachingCostBalancerStrategy; Rename ServerView.ServerCallback to ServerRemovedCallback * Fix benchmark units * Style, forbidden-api, review, bug fixes * Add docs * Address comments	2017-09-08 12:23:04 -05:00
Gian Merlino	33c0928bed	Collapse worker select strategies, change default, add strong affinity. (#4534 ) * Collapse worker select strategies, change default, add strong affinity. - Change default worker select strategy to equalDistribution. It is more generally useful than fillCapacity. - Collapse the WithAffinity strategies into the regular ones. The WithAffinity strategies are retained for backwards compatibility. - Change WorkerSelectStrategy to return nullable instead of Optional. - Fix a couple of errors in the docs. * Fix test. * Review adjustments. * Remove unused imports. * Switch to DateTimes.nowUtc. * Simplify code. * Fix tests (worker assignment started off on a different foot)	2017-09-04 14:40:55 -07:00
Himanshu	06ac6678e6	DruidLeaderSelector interface for leader election and Curator based impl. (#4699 ) * DruidLeaderSelector interface for leader election and Curator based impl. DruidCoordinator/TaskMaster are updated to use the new interface. * add fake DruidNode binding in integration-tests module * add docs on DruidLeaderSelector interface * remove start/stop and keep register/unregister Listener in DruidLeaderSelector interface * updated comments on DruidLeaderSelector * cache the listener executor in CuratorDruidLeaderSelector * use same latch owner name that was used before * remove stuff related to druid.zk.paths.indexer.leaderLatchPath config * randomize the delay when giving up leadership and restarting leader latch	2017-09-01 09:49:04 -07:00
Charles Allen	bdfc6fe25e	Move common TypeReference into JacksonUtils (#4738 )	2017-08-31 13:40:16 -07:00
Gian Merlino	daf3c5f927	Add "round" option to cardinality and hyperUnique aggregators. (#4720 ) * Add "round" option to cardinality and hyperUnique aggregators. Also turn it on by default in SQL, to make math on distinct counts work more as expected. * Fix some compile errors. * Fix test. * Formatting.	2017-08-28 14:52:11 -07:00
Roman Leventov	cbd1902db8	Add forbidden-apis plugin; prohibit using system time zone (#4611 ) * Forbidden APIs WIP * Remove some tests * Restore io.druid.math.expr.Function * Integration tests fix * Add comments * Fix in SimpleWorkerProvisioningStrategy * Formatting * Replace String.format() with StringUtils.format() in RemoteTaskRunnerTest * Address comments * Fix GroupByMultiSegmentTest	2017-08-21 13:02:42 -07:00
Himanshu	74a64c88ab	internal-discovery: interfaces for announcement/discovery, curator based impls (#4634 ) * internal-discovery: interfaces for announcement/discovery, curator impls * more tests * address some review comments * more fixes * address more review comments * simplify ObjectMapper setup in CuratorDruidNodeAnnouncerAndDiscoveryTest * fix KafkaIndexTaskTest * make lookupTier overridable via RealtimeIndexTask and KafkaIndexTask context * make teamcity build happy	2017-08-16 13:07:16 -07:00
Roman Leventov	bf28d0775b	Remove QueryRunner.run(Query, responseContext) and related legacy methods (#4482 ) * Remove QueryRunner.run(Query, responseContext) and related legacy methods * Remove local var	2017-08-11 09:12:38 +09:00
Jihoon Son	d5606bc558	Passing lockTimeout as a parameter for TaskLockbox.lock() (#4549 ) * Passing lockTimeout as a parameter for TaskLockbox.lock() * Remove TIME_UNIT * Fix tc fail * Add taskLockTimeout to TaskContext * Add caution	2017-08-08 18:21:07 -07:00
Roman Leventov	486b7a2347	TaskMaster deadlock fix (#4548 ) * Stop RemoteTaskRunner's cleanupExec using TaskMaster's lifecycle, not global injected lifecycle * Prohibit starting Lifecycle twice; Make Lifecycle to reject addMaybeStartHandler() attempts in the process of stopping rather than entering deadlock * Fix Lifecycle.addMaybeStartHandler() * Remove RemoteTaskRunnerFactoryTest * Add docs * Language * Address comments * Fix RemoteTaskRunnerTestUtils	2017-08-07 14:28:43 -07:00
Roman Leventov	aa7e4ae5e4	Enforce correct spacing with Checkstyle (#4651 )	2017-08-05 10:18:25 -07:00
David Lim	dd0b84e766	Fix bugs in RTR related to blacklisting, change default worker strategy (#4619 ) * fix bugs in RTR related to blacklisting, change default worker strategy to equalDistribution * code review and additional changes * fix errorprone * code review changes	2017-08-03 10:34:45 -07:00
Gian Merlino	a9c875d746	IndexTask: Use shared groupId when "appendToExisting" is on. (#4582 ) This allows the tasks to run concurrently. Additionally, rework the partition-determining code in a couple ways: - Use a task-id based sequenceName so concurrently running append tasks do not clobber each others' segments. - Make the list of shardSpecs empty when rollup is non-guaranteed, and let allocators handle the creation of incremental shardSpecs.	2017-07-24 09:57:23 -07:00
Roman Leventov	c0beb78ffd	Enforce brace formatting with Checkstyle (#4564 )	2017-07-21 10:26:59 -05:00
Gian Merlino	38b03f56b4	IndexTask: Raise default maxTotalRows. (#4579 ) 150k is very low, given that it should only be limited by disk space rather than JVM heap.	2017-07-20 13:55:07 -07:00
Akash Dwivedi	0b85c60869	Fix issue-4539 (#4546 ) * Protect double URI encoding * removeExtraLine	2017-07-19 09:38:29 -07:00
Roman Leventov	60cdf94677	Add PMD and prohibit unnecessary fully qualified class names in code (#4350 ) * Add PMD and prohibit unnecessary fully qualified class names in code * Extra fixes * Remove extra unnecessary fully-qualified names * Remove qualifiers * Remove qualifier	2017-07-17 22:22:29 +09:00
Roman Leventov	b7203510b8	Fix RemoteTaskRunner's auto-scaling (#3768 ) * Rename ResourceManagementStrategy to ProvisioningStrategy, similarly for related classes. Make ProvisioningService non-global, created per RemoteTaskRunner instead. Add OverlordBlinkLeadershipTest. * Fix RemoteTaskRunnerFactoryTest.testExecNotSharedBetweenRunners() * Small fix * Make SimpleProvisioner and PendingProvisioner more similar in details * Fix executor name * Style fixes * Use LifecycleLock in RemoteTaskRunner	2017-07-14 09:11:39 +09:00
Chris Gavin	960cb07ea6	Fix some unnecessary use of boxed types and incorrect format strings spotted by lgtm. (#4474 ) * Remove some unnecessary use of boxed types. * Fix some incorrect format strings. * Enable IDEA's MalformedFormatString inspection. * Add a Checkstyle check for finding uses of incorrect logging packages. * Fix some incorrect usages of the metamx logger. * Bypass incorrect logger Checkstyle check where using the correct logger is not simple. * Fix some more places where the wrong number of arguments are provided to format strings. * Suppress `MalformedFormatString` inspection on legacy logging test. * Use @SuppressWarnings rather than a noinspection suppression comment. * Fix some more incorrect format strings. * Suppress some more incorrect format string warnings where the incorrect string is intentional. * Log the aggregator when closing it fails. * Remove some unneeded log lines.	2017-07-13 12:15:32 -07:00
Roman Leventov	b2865b7c7b	Make possible to start Peon without DI loading of any querying-related stuff (#4516 ) * Make QueryRunnerFactoryConglomerate injection lazy in TaskToolbox/TaskToolboxFactory * Extract QueryablePeonModule and add druid.modules.excludeList config * Typo	2017-07-12 13:18:25 -05:00
Jihoon Son	6d2df2a542	Fix duplicated locks after sync from storage (#4521 ) * Fix duplicated locks after sync from storage * Remove unnecessary table creation	2017-07-11 10:10:11 -07:00
Akash Dwivedi	5f411f14af	Timeout for LockAcquireAction (#4461 ) * Timeout for LockAcquireAction * Static inner class. * Rebase changes. * makeAlert and throw exception incase of overlapping interval. * Addressed comments. * remove unused import. * Addressed comments	2017-07-11 18:59:32 +09:00
Jihoon Son	cc20260078	Early publishing segments in the middle of data ingestion (#4238 ) * Early publishing segments in the middle of data ingestion * Remove unnecessary logs * Address comments * Refactoring the patch according to #4292 and address comments * Set the total shard number of NumberedShardSpec to 0 * refactoring * Address comments * Fix tests * Address comments * Fix sync problem of committer and retry push only * Fix doc * Fix build failure * Address comments * Fix compilation failure * Fix transient test failure	2017-07-10 22:35:36 -07:00
Jihoon Son	8ed25acc15	Fix a bug for CSVParser/DelimitedParser when empty column exists in the header row (#4504 ) * Fix a bug when empty column exists in header row * Address comments	2017-07-07 16:19:25 -07:00
Parag Jain	6e2f78f552	TLS support (#4270 )	2017-07-06 17:40:12 -07:00
Roman Leventov	9ae457f7ad	Avoid using the default system Locale and printing to System.out in production code (#4409 ) * Avoid usages of Default system Locale and printing to System.out or System.err in production code * Fix Charset in DruidKerberosUtil * Remove redundant string format in GenericIndexed * Rename StringUtils.safeFormat() to unimportantSafeFormat(); add StringUtils.format() which fails as well as String.format() * Fix testSafeFormat() * More fixes of redundant StringUtils.format() inside ISE * Rename unimportantSafeFormat() to nonStrictFormat()	2017-06-29 14:06:19 -07:00
Roman Leventov	ae900a4934	Update versions to 0.11.0-SNAPSHOT (#4483 )	2017-06-28 17:05:58 -07:00
Jihoon Son	e3c13c246a	Respect reportParseExceptions option in IndexTask.determineShardSpecs() (#4467 ) * Respect reportParseExceptions option in IndexTask.determineShardSpecs() * Fix typo	2017-06-27 10:28:22 -07:00
Roman Leventov	05d58689ad	Remove the ability to create segments in v8 format (#4420 ) * Remove ability to create segments in v8 format * Fix IndexGeneratorJobTest * Fix parameterized test name in IndexMergerTest * Remove extra legacy merging stuff * Remove legacy serializer builders * Remove ConciseBitmapIndexMergerTest and RoaringBitmapIndexMergerTest	2017-06-26 13:21:39 -07:00
Jihoon Son	b37c9b5fe0	Fix a bug of CSV/TSV parsers when extracting columns from header (#4443 ) * Reset fieldNames whenever a new file begins * Fix test failure * Fix test failure	2017-06-23 14:29:26 -07:00
Goh Wei Xiang	f68a0693f3	Allow use of non-threadsafe ObjectCachingColumnSelectorFactory (#4397 ) * Adding a flag to indicate when ObjectCachingColumnSelectorFactory need not be threadsafe. * - Use of computeIfAbsent over putIfAbsent - Replace Maps.newXXXMap() with normal instantiation - Documentations on when is thread-safe required. - Use Builders for On/OffheapIncrementalIndex * - Optimization on computeIfAbsent - Constant EMPTY DimensionsSpec - Improvement on IncrementalIndexSchema.Builder - Remove setting of default values - Use var args for metrics - Correction on On/OffheapIncrementalIndex Builders - Combine On/OffheapIncrementalIndex Builders * - Removing unused imports. * - Helper method for testing with IncrementalIndex.Builder * - Correction on javadoc. * Style fix	2017-06-16 16:04:19 -05:00
Gian Merlino	1f2afccdf8	Expressions: Add ExprMacros. (#4365 ) * Expressions: Add ExprMacros, which have the same syntax as functions, but can convert themselves to any kind of Expr at parse-time. ExprMacroTable is an extension point for adding new ExprMacros. Anything that might need to parse expressions needs an ExprMacroTable, which can be injected through Guice. * Address code review comments.	2017-06-08 09:32:10 -04:00
Roman Leventov	63a897c278	Enable most IntelliJ 'Probable bugs' inspections (#4353 ) * Enable most IntelliJ 'Probable bugs' inspections * Fix in RemoteTestNG * Fix IndexSpec's equals() and hashCode() to include longEncoding * Fix inspection errors * Extract global isntance of natural().nullsFirst(); address comments * Fix * Use noinspection comments instead of SuppressWarnings on method for IntelliJ-specific inspections * Prohibit Ordering.natural().nullsFirst() using Checkstyle	2017-06-07 09:54:25 -07:00
Roman Leventov	31d33b333e	Make using implicit system Charset an error (#4326 ) * Make using implicit system charset an error * Use StringUtils.toUtf8() and fromUtf8() instead of String.getBytes() and new String() * Use English locale in StringUtils.safeFormat() * Restore comment	2017-06-05 23:57:25 -07:00
David Lim	13ecf90923	Report Kafka lag information in supervisor status report (#4314 ) * refactor lag reporting and report lag at status endpoint * refactor offset reporting logic to fetch offsets periodically vs. at request time * remove JavaCompatUtils * code review changes * code review changes	2017-06-05 13:26:25 -07:00
Slim	a2584d214a	Delagate creation of segmentPath/LoadSpec to DataSegmentPushers and add S3a support (#4116 ) * Adding s3a schema and s3a implem to hdfs storage module. * use 2.7.3 * use segment pusher to make loadspec * move getStorageDir and makeLoad spec under DataSegmentPusher * fix uts * fix comment part1 * move to hadoop 2.8 * inject deep storage properties * set version to 2.7.3 * fix build issue about static class * fix comments * fix default hadoop default coordinate * fix create filesytem * downgrade aws sdk * bump the version	2017-06-04 00:55:09 -06:00
Jihoon Son	f876246af7	Rename FiniteAppenderatorDriver to AppenderatorDriver (#4356 )	2017-06-03 00:48:44 +09:00
Jihoon Son	1150bf7a2c	Refactoring Appenderator Driver (#4292 ) * Refactoring Appenderator 1) Added publishExecutor and handoffExecutor for background publishing and handing segments off 2) Change add() to not move segments out in it * Address comments 1) Remove publishTimeout for KafkaIndexTask 2) Simplifying registerHandoff() 3) Add increamental handoff test * Remove unused variable * Add persist() to Appenderator and more tests for AppenderatorDriver * Remove unused imports * Fix strict build * Address comments	2017-06-02 07:09:11 +09:00
chaoqiang	5fc4abcf71	fix equalDistribution worker select strategy (#4318 ) * fix equalDistribution worker select strategy * replace anonymous Comparator * keep previous version sorting comment * fix code style * update comment * move JsonProperty	2017-05-25 13:30:42 +09:00
Gian Merlino	adeecc0e72	Add /isLeader call to overlord and coordinator. (#4282 ) This is useful for putting them behind load balancers or proxies, as it lets the load balancer know which server is currently active through an http health check. Also makes the method naming a little more consistent between coordinator and overlord code.	2017-05-18 20:46:13 -05:00
Jihoon Son	733dfc9b30	Add PrefetchableTextFilesFirehoseFactory for cloud storage types (#4193 ) * Add PrefetcheableTextFilesFirehoseFactory * fix comment * exception handling * Fix wrong json property * Remove ReplayableFirehoseFactory and fix misspelling * Defer object initialization * Add a temporaryDirectory parameter to FirehoseFactory.connect() * fix when cache and fetch are disabled * Address comments * Add more test * Increase timeout for test * Add wrapObjectStream * Move methods to Firehose from PrefetchableFirehoseFactory * Cleanup comment * add directory listing to s3 firehose * Rename a variable * Addressing comments * Update document * Support disabling prefetch * Fix race condition * Add fetchLock * Remove ReplayableFirehoseFactoryTest * Fix compilation error * Fix test failure * Address comments * Add default implementation for new method	2017-05-18 15:37:18 +09:00
Himanshu	daa8ef8658	Optional long-polling based segment announcement via HTTP instead of Zookeeper (#3902 ) * Optional long-polling based segment announcement via HTTP instead of Zookeeper * address review comments * make endpoint /druid-internal/v1 instead of /druid/internal so that jetty qos filters can be configured easily when needed * update segment callback initialization to be called only after first segment list fetch has been succeeded from all servers * address review comments * remove size check not required anymore as only segment servers announce themselves and not all peon processes * annouce segment server on historical only after cached segments are loaded * fix checkstyle errors	2017-05-17 16:31:58 -05:00
Roman Leventov	b7a52286e8	Make @Override annotation obligatory (#4274 ) * Make MissingOverride an error * Make travis stript to fail fast * Add missing Override annotations * Comment	2017-05-16 13:30:30 -05:00
Benedict Jin	e823085866	Improve `collection` related things that reusing a immutable object instead of creating a new object (#4135 )	2017-05-17 01:38:51 +09:00
Jihoon Son	50a4ec2b0b	Add support for headers and skipping thereof for CSV and TSV (#4254 ) * initial commit * small fixes * fix bug * fix bug * address code review * more cr * more cr * more cr * fix * Skip head rows for CSV and TSV * Move checking skipHeadRows to FileIteratingFirehose * Remove checking null iterators * Remove unused imports * Address comments * Fix compilation error * Address comments * Add more tests * Add a comment to ReplayableFirehose * Addressing comments * Add docs and fix typos	2017-05-15 22:57:31 -07:00
Roman Leventov	1ebfa22955	Update Error prone configuration; Fix bugs (#4252 ) * Make Errorprone the default compiler * Address comments * Make Error Prone's ClassCanBeStatic rule a error * Preconditions allow only %s pattern * Fix DruidCoordinatorBalancerTester * Try to give the compiler more memory * Remove distribution module activation on jdk 1.8 because only jdk 1.8 is used now * Don't show compiler warnings * Try different travis script * Fix travis.yml * Make Error Prone optional again * For error-prone compiler * Increase compiler's maxmem * Don't run Error Prone for benchmarks because of OOM * Skip install step in Travis * Remove MetricHolder.writeToChannel() * In travis.yml, check compilation before tests, because it may fail faster	2017-05-12 15:55:17 +09:00
Himanshu	462f6482df	optionally add extensions to explicitly specified hadoopContainerClassPath (#4230 ) * optionally add extensions to explicitly specified hadoopContainerClassPath * note extensions always pushed in hadoop container when druid.extensions.hadoopContainerDruidClasspath is not provided explicitly	2017-05-08 14:24:14 -05:00
Gian Merlino	f0fd8ba191	Add supervisors to overlord console. (#4248 )	2017-05-04 11:13:12 -07:00
Gian Merlino	2ca7b00346	Update versions to 0.10.1-SNAPSHOT. (#4191 )	2017-04-20 18:12:28 -07:00
Roman Leventov	15f3a94474	Copy closer into Druid codebase (fixes #3652 ) (#4153 )	2017-04-10 09:38:45 +09:00
Parag Jain	7e0d4c9555	secure supervisor endpoints (#3985 )	2017-04-05 16:42:32 -07:00
JackyWoo	a0f2cf05d5	Add EqualDistributionWithAffinityWorkerSelectStrategy which balance w… (#3998 ) * Add EqualDistributionWithAffinityWorkerSelectStrategy which balance work load within affinity workers. * add docs to equalDistributionWithAffinity	2017-03-25 19:15:49 -07:00
Himanshu	de081c711b	RealtimeIndexTask to support alertTimeout in context (#4089 ) * RealtimeIndexTask to support alertTimeout in context and raise alert if task process exists after the timeout * move alertTimeout config to tuningConfig and document	2017-03-24 12:48:12 -07:00
Gian Merlino	b4289c0004	Remove "granularity" from IngestSegmentFirehose. (#4110 ) It wasn't doing anything useful (the sequences were being concatted, and cursor.getTime() wasn't being called) and it defaulted to Granularities.NONE. Changing it to Granularities.ALL gave me a 700x+ performance boost on a small dataset I was reindexing (2m27s to 365ms). Most of that was from avoiding making a lot of unnecessary column selectors.	2017-03-24 10:28:54 -07:00
Zhihui Jiao	6febcd9f24	Fix IngestSegmentFirehoseFactory (#4069 )	2017-03-17 16:57:25 -06:00
Parag Jain	c155d9a5e9	increase kill timeout (#4002 )	2017-03-08 09:00:34 -08:00
kaijianding	19ac1c7c2c	Add SameIntervalMergeTask for easier usage of MergeTask (#3981 ) * Add SameIntervalMergeTask for easier usage of MergeTask * fix a bug and add ut * remove same_interval_merge_sub from Task.java and remove other no needed code	2017-03-06 11:21:32 -06:00
Roman Leventov	ea1f5b7954	LifecycleLock for better synchronization in lifecycled objects (#3964 ) * Introduce LifecycleLock * Add LifecycleLockTest * Rename LifecycleLock.release() to exitStart() * Rewrite LifecycleLock using AbstractQueuedSynchronizer for more safety, added tests * Add LifecycleLock.exitStop() and reset() * Add LifecycleLock.awaitStarted(timeout) * Braces * Fix	2017-03-02 12:22:57 -08:00
Akash Dwivedi	94da5e80f9	Namespace optimization for hdfs data segments. (#3877 ) * NN optimization for hdfs data segments. * HdfsDataSegmentKiller, HdfsDataSegment finder changes to use new storage format.Docs update. * Common utility function in DataSegmentPusherUtil. * new static method `makeSegmentOutputPathUptoVersionForHdfs` in JobHelper * reuse getHdfsStorageDirUptoVersion in DataSegmentPusherUtil.getHdfsStorageDir() * Addressed comments. * Review comments. * HdfsDataSegmentKiller requested changes. * extra newline * Add maprfs.	2017-03-01 09:51:20 -08:00
praveev	5ccfdcc48b	Fix testDeadlock timeout delay (#3979 ) * No more singleton. Reduce iterations * Granularities * Fix the delay in the test * Add license header * Remove unused imports * Lot more unused imports from all the rearranging * CR feedback * Move javadoc to constructor	2017-02-28 12:51:41 -06:00
kaijianding	ef6a19c81b	buildV9Directly in MergeTask and AppendTask (#3976 ) * buildV9Directly in MergeTask and AppendTask * add doc	2017-02-28 10:04:32 -08:00
Parag Jain	469ae374a3	add kill task link on console (#3974 ) * add kill task link on console * refresh after kill	2017-02-25 14:58:16 +05:30
praveev	c3bf40108d	One granularity (#3850 ) * Refactor Segment Granularity * Beginning of one granularity * Copy the fix for custom periods in segment-grunalrity over here. * Remove the custom serialization for now. * Compilation cleanup * Reformat code * Fixing unit tests * Unify to use a single iterable * Backward compatibility for rolling upgrade * Minor check style. Cosmetic changes. * Rename length and millis to duration * CR feedback * Minor changes.	2017-02-25 01:02:29 -06:00
David Lim	3c54fc912a	fix numShards = -1 not being handled correctly (#3937 )	2017-02-14 18:45:38 -08:00
Himanshu	9dfcf0763a	disable javascript execution by default (#3818 )	2017-02-13 15:11:18 -08:00
Himanshu	8cf7ad1e3a	druid.coordinator.asOverlord.enabled flag at coordinator to make it an overlord too (#3711 )	2017-02-13 15:03:59 -08:00
Parag Jain	8e31a465ad	report hand off count finite appenderator driver (#3925 )	2017-02-13 10:41:24 -08:00
Gian Merlino	12317fd001	Bump version to 0.10.0-SNAPSHOT. (#3913 )	2017-02-06 17:54:35 -08:00
DaimonPl	93b71e265e	Extract HLL related code to separate module (#3900 )	2017-02-03 09:45:11 -08:00
Parag Jain	1aabb45a09	auto reset option for Kafka Indexing service (#3842 ) * auto reset option for Kafka Indexing service in case message at the offset being fetched is not present anymore at kafka brokers * review comments * review comments * reverted last change * review comments * review comments * fix typo	2017-02-02 14:57:45 -06:00
David Lim	ff52581bd3	IndexTask improvements (#3611 ) * index task improvements * code review changes * add null check	2017-01-18 14:24:37 -08:00
Jihoon Son	d80bec83cc	Enable auto license checking (#3836 ) * Enable license checking * Clean duplicated license headers	2017-01-10 18:13:47 -08:00
Charles Allen	229559b46a	Make TaskLockbox's ReentrantLock fair (#3828 )	2017-01-07 12:34:47 -08:00
Himanshu	4ca3b7f1e4	overlord helpers framework and tasklog auto cleanup (#3677 ) * overlord helpers framework and tasklog auto cleanup * review comment changes * further review comments addressed	2016-12-21 15:18:55 -08:00
Gian Merlino	6440ddcbca	Fix #3795 (Java 7 compatibility). (#3796 ) * Fix #3795 (Java 7 compatibility). Also introduce Animal Sniffer checks during build, which would have caught the original problems. * Add Animal Sniffer on caffeine-cache for JDK8.	2016-12-21 10:19:13 -08:00
Roman Leventov	70e83bea6d	Fix PathChildrenCache's ExecutorService leak (#3726 ) * Fix PathChildrenCache's executorService leak in Announcer, CuratorInventoryManager and RemoteTaskRunner * Use a single ExecutorService for all workerStatusPathChildrenCaches in RemoteTaskRunner	2016-12-07 13:00:10 -08:00
Gian Merlino	4e67dd28c0	RemoteTaskRunnerConfig: Fix Guice error on startup. (#3737 )	2016-12-06 00:19:53 +05:30
Charles Allen	27ab23ef44	Don't update segment metadata if archive doesn't move anything (#3476 ) * Don't update segment metadata if archive doesn't move anything * Fix restore task to handle potential null values * Don't try to update empty metadata * Address review comments * Move to druid-io java-util	2016-12-01 07:49:28 -08:00

... 4 5 6 7 8 ...

1800 Commits