druid

Commit Graph

Author	SHA1	Message	Date
Jihoon Son	ab5b3be6c6	Add shuffleSegmentPusher for data shuffle (#8115 ) * Fix race between canHandle() and addSegment() in StorageLocation * add comment * Add shuffleSegmentPusher which is a dataSegmentPusher used for writing shuffle data in local storage. * add comments * unused import * add comments * fix test * address comments * remove <p> tag from javadoc * address comments * comparingLong * Address comments * fix test	2019-08-05 13:38:35 -07:00
Eugene Sevastianov	3f3162b85e	Enum of ResponseContext keys (#8157 ) * Refactored ResponseContext and aggregated its keys into Enum * Added unit tests for ResponseContext and refactored the serialization * Removed unused methods * Fixed code style * Fixed code style * Fixed code style * Made SerializationResult static * Updated according to the PR discussion: Renamed an argument Updated comparator Replaced Pair usage with Map.Entry Added a comment about quadratic complexity Removed boolean field with an expression Renamed SerializationResult field Renamed the method merge to add and renamed several context keys Renamed field and method related to scanRowsLimit Updated a comment Simplified a block of code Renamed a variable * Added JsonProperty annotation to renamed ScanQuery field * Extension-friendly context key implementation * Refactored ResponseContext: updated delegate type, comments and exceptions Reducing serialized context length by removing some of its collection elements * Fixed tests * Simplified response context truncation during serialization * Extracted a method of removing elements from a response context and added some comments * Fixed typos and updated comments	2019-08-03 12:05:21 +03:00
Fokko Driesprong	91743eeebe	Spotbugs: NP_NONNULL_PARAM_VIOLATION (#8129 )	2019-08-02 19:20:22 +03:00
Chi Cao Minh	7783b31846	Add IPv4 druid expressions (#8197 ) * Add IPv4 druid expressions New druid expressions for filtering IPv4 addresses: - ipv4address_match: Check if IP address belongs to a subnet - ipv4address_parse: Convert string IP address to long - ipv4address_stringify: Convert long IP address to string These expressions operate on IP addresses represented as either strings or longs, so that they can be applied to dimensions with mixed representation of IP addresses. The filtering is more efficient when operating on IP addresses as longs. In other words, the intended use case is: 1) Use ipv4address_parse to convert to long at ingestion time 2) Use ipv4address_match to filter (on longs) at query time 3) Use ipv4address_stringify to convert to (readable) string at query time * Fix licenses and null handling * Simplify IPv4 expressions * Fix tests * Fix check for valid ipv4 address string	2019-08-01 11:45:04 -07:00
Jonathan Wei	41893d4647	Simple memory allocation for CliIndexer tasks (#8201 ) * Simple memory allocation for CliIndexer * PR comments * Checkstyle	2019-08-01 10:22:41 +08:00
Gian Merlino	77297f4e6f	GroupBy array-based result rows. (#8196 ) * GroupBy array-based result rows. Fixes #8118; see that proposal for details. Other than the GroupBy changes, the main other "interesting" classes are: - ResultRow: The array-based result type. - BaseQuery: T is no longer required to be Comparable. - QueryToolChest: Adds "decorateObjectMapper" to enable query-aware serialization and deserialization of result rows (necessary due to their positional nature). - QueryResource: Uses the new decoration functionality. - DirectDruidClient: Also uses the new decoration functionality. - QueryMaker (in Druid SQL): Modifications to read ResultRows. These classes weren't changed, but got some new javadocs: - BySegmentQueryRunner - FinalizeResultsQueryRunner - Query * Adjustments for TC stuff.	2019-07-31 16:15:12 -07:00
Fokko Driesprong	faf51107d5	Add SuppressWarnings SS_SHOULD_BE_STATIC (#8138 ) * Spotbugs: SS_SHOULD_BE_STATIC (#8073) * Add SuppressWarnings SS_SHOULD_BE_STATIC Fixes #8073 * Fix the violation * Make them non-final * Remove @Nonnull	2019-07-31 19:44:42 +03:00
Jihoon Son	385f492a55	Use PartitionsSpec for all task types (#8141 ) * Use partitionsSpec for all task types * fix doc * fix typos and revert to use isPushRequired * address comments * move partitionsSpec to core * remove hadoopPartitionsSpec	2019-07-30 17:24:39 -07:00
Clint Wylie	653b558134	sql firehose and firehose doc adjustments (#8067 ) * firehose doc adjustments * fix typo * additional information on parser types in ingestion docs * clarify ingest segment firehose docs, add sql firehose examples to sql extension pages * fixit * make sql firehose more forgiving by always constructing a MapInputRowParser from the parseSpec of whatever actual InputRowParser impl is provided, remove doc references to map based parsers * transforms * fix tests	2019-07-30 15:28:10 -07:00
Fokko Driesprong	e016995d1f	Enable Spotbugs: WMI_WRONG_MAP_ITERATOR (#8005 ) * WMI_WRONG_MAP_ITERATOR * Fixed missing loop	2019-07-30 19:51:53 +03:00
Jonathan Wei	640b7afc1c	Add CliIndexer process type and initial task runner implementation (#8107 ) * Add CliIndexer process type and initial task runner implementation * Fix HttpRemoteTaskRunnerTest * Remove batch sanity check on PeonAppenderatorsManager * Fix parallel index tests * PR comments * Adjust Jersey resource logging * Additional cleanup * Fix SystemSchemaTest * Add comment to LocalDataSegmentPusherTest absolute path test * More PR comments * Use Server annotated with RemoteChatHandler * More PR comments * Checkstyle * PR comments * Add task shutdown to stopGracefully * Small cleanup * Compile fix * Address PR comments * Adjust TaskReportFileWriter and fix nits * Remove unnecessary closer * More PR comments * Minor adjustments * PR comments * ThreadingTaskRunner: cancel task run future not shutdownFuture and remove thread from workitem	2019-07-29 17:06:33 -07:00
Chi Cao Minh	ab71a2e1e4	Revert "Fix dependency analyze warnings (#8128 )" (#8189 ) This reverts commit `5dd0d8e873`.	2019-07-29 11:42:16 -07:00
Jihoon Son	adf7bafb9f	Fix race between canHandle() and addSegment() in StorageLocation (#8114 ) * Fix race between canHandle() and addSegment() in StorageLocation * add comment * add comments * fix test * address comments * remove <p> tag from javadoc * address comments * comparingLong	2019-07-27 11:11:06 +03:00
Chi Cao Minh	5dd0d8e873	Fix dependency analyze warnings (#8128 ) * Fix dependency analyze warnings Update the maven dependency plugin to the latest version and fix all warnings for unused declared and used undeclared dependencies in the compile scope. Added new travis job to add the check to CI. Also fixed some source code files to use the correct packages for their imports. * Fix licenses and dependencies * Fix licenses and dependencies again * Fix integration test dependency * Address review comments * Fix unit test dependencies * Fix integration test dependency * Fix integration test dependency again * Fix integration test dependency third time * Fix integration test dependency fourth time * Fix compile error * Fix assert package	2019-07-26 10:49:03 -07:00
Parag Jain	31a29d8883	add noop type name to prevent jackson exception when setting type to noop (#8133 )	2019-07-25 16:07:08 -07:00
Jihoon Son	db14946207	Add support minor compaction with segment locking (#7547 ) * Segment locking * Allow both timeChunk and segment lock in the same group * fix it test * Fix adding same chunk to atomicUpdateGroup * resolving todos * Fix segments to lock * fix segments to lock * fix kill task * resolving todos * resolving todos * fix teamcity * remove unused class * fix single map * resolving todos * fix build * fix SQLMetadataSegmentManager * fix findInputSegments * adding more tests * fixing task lock checks * add SegmentTransactionalOverwriteAction * changing publisher * fixing something * fix for perfect rollup * fix test * adjust package-lock.json * fix test * fix style * adding javadocs * remove unused classes * add more javadocs * unused import * fix test * fix test * Support forceTimeChunk context and force timeChunk lock for parallel index task if intervals are missing * fix travis * fix travis * unused import * spotbug * revert getMaxVersion * address comments * fix tc * add missing error handling * fix backward compatibility * unused import * Fix perf of versionedIntervalTimeline * fix timeline * fix tc * remove remaining todos * add comment for parallel index * fix javadoc and typos * typo * address comments	2019-07-24 17:35:46 -07:00
Clint Wylie	0695e487e7	fix issue with CuratorLoadQueuePeon shutting down executors it does not own (#8140 ) * fix issue with CuratorLoadQueuePeon shutting down executors it does not own * use lifecycled executors * maybe this	2019-07-24 10:59:43 -07:00
Eugene Sevastianov	799d20249f	Response context refactoring (#8110 ) * Response context refactoring * Serialization/Deserialization of ResponseContext * Added java doc comments * Renamed vars related to ResponseContext * Renamed empty() methods to createEmpty() * Fixed ResponseContext usage * Renamed multiple ResponseContext static fields * Added PublicApi annotations * Renamed QueryResponseContext class to ResourceIOReaderWriter * Moved the protected method below public static constants * Added createEmpty method to ResponseContext with DefaultResponseContext creation * Fixed inspection error * Added comments to the ResponseContext length limit and ResponseContext http header name * Added a comment of possible future refactoring * Removed .gitignore file of indexing-service * Removed a never-used method * VisibleForTesting method reducing boilerplate Co-Authored-By: Clint Wylie <cjwylie@gmail.com> * Reduced boilerplate * Renamed the method serialize to serializeWith * Removed unused import * Fixed incorrectly refactored test method * Added comments for ResponseContext keys * Fixed incorrectly refactored test method * Fixed IntervalChunkingQueryRunnerTest mocks	2019-07-24 18:29:03 +03:00
Clint Wylie	0388581493	Revert "Spotbugs: SS_SHOULD_BE_STATIC (#8073 )" (#8145 ) This reverts commit `04a180a5fb`.	2019-07-23 22:57:19 -07:00
Clint Wylie	83514958db	remove unnecessary lock in ForegroundCachePopulator leading to a lot of contention (#8116 ) * remove unnecessary lock in ForegroundCachePopulator leading to a lot of contention * mutableboolean, javadocs,document some cache configs that were missing * more doc stuff * adjustments * remove background documentation	2019-07-23 10:57:59 -07:00
Fokko Driesprong	04a180a5fb	Spotbugs: SS_SHOULD_BE_STATIC (#8073 )	2019-07-23 18:18:49 +08:00
Fokko Driesprong	e1a745717e	Spotbugs: NP_STORE_INTO_NONNULL_FIELD (#8021 )	2019-07-21 21:23:47 +08:00
Clint Wylie	f24e2f16af	fix npe with sql metadata manager polling and empty database (#8106 ) * fix npe with sql metadata manager polling and empty database * treat null segments separately * use preconditions check * add test	2019-07-20 19:09:02 -07:00
Himanshu	54a7b54d2d	avoid 'must return non-void type' warning (#8105 )	2019-07-18 15:02:27 -07:00
Jihoon Son	c7eb7cd018	Add intermediary data server for shuffle (#8088 ) * Add intermediary data server for shuffle * javadoc * adjust timeout * resolved todo * fix test * style * address comments * rename to shuffleDataLocations * Address comments * bit adjustment StorageLocation * fix test * address comment & fix test * handle interrupted exception	2019-07-18 14:46:47 -07:00
Clint Wylie	03e55d30eb	add CachingClusteredClient benchmark, refactor some stuff (#8089 ) * add CachingClusteredClient benchmark, refactor some stuff * revert WeightedServerSelectorStrategy to ConnectionCountServerSelectorStrategy and remove getWeight since felt artificial, default mergeResults in toolchest implementation for topn, search, select * adjust javadoc * adjustments * oops * use it * use BinaryOperator, remove CombiningFunction, use Comparator instead of Ordering, other review adjustments * rename createComparator to createResultComparator, fix typo, firstNonNull nullable parameters	2019-07-18 13:16:28 -07:00
Sashidhar Thallam	72496d3712	#7858 Throwing UnsupportedOperationException from ImmutableDrui… (#7933 ) * #7858 Throwing UnsupportedOperationException from ImmutableDruidDataSource's equals() and hashCode() methods. * 1. Turning ImmutableDruidDataSource into a data container. 2. Adding a Util method to be used in tests for checking equality of ImmutableDruidDataSource objects. * Removing unused method * Fixing assert equals * Fixing assert equals in TestUtils.java * Adding java doc comments, Using ExpectedException in tests * Fixing test cases * Fixed expected exception message in tests * fixed line width * line width fix * code style fixes * code indentation fixes * fixing method name	2019-07-18 22:35:19 +03:00
Surekha	da16144495	Refactoring to use `CollectionUtils.mapValues` (#8059 ) * doc updates and changes to use the CollectionUtils.mapValues utility method * Add Structural Search patterns to intelliJ * refactoring from PR comments * put -> putIfAbsent * do single key lookup	2019-07-17 23:02:22 -07:00
Roman Leventov	ceb969903f	Refactor SQLMetadataSegmentManager; Change contract of REST met… (#7653 ) * Refactor SQLMetadataSegmentManager; Change contract of REST methods in DataSourcesResource * Style fixes * Unused imports * Fix tests * Fix style * Comments * Comment fix * Remove unresolvable Javadoc references; address comments * Add comments to ImmutableDruidDataSource * Merge with master * Fix bad web-console merge * Fixes in api-reference.md * Rename in DruidCoordinatorRuntimeParams * Fix compilation * Residual changes	2019-07-17 17:18:48 +03:00
Clint Wylie	15fbf5983d	add Class.getCanonicalName to forbidden-apis (#8086 ) * add checkstyle to forbid unnecessary use of Class.getCanonicalName * use forbidden-apis instead of checkstyle * add space	2019-07-16 15:21:50 -07:00
Chi Cao Minh	da3d141dd2	Add inline firehose (#8056 ) * Add inline firehose To allow users to quickly test parsing and schema, add a firehose that reads data that is inlined in its spec. * Address review comments * Remove suppression of sonar warnings	2019-07-11 21:43:46 -07:00
Atul Mohan	631cda649b	Include replicated segment size property for datasources endpoint (#8039 ) * Add replication size * Summon comma	2019-07-11 01:10:38 -07:00
Himanshu	14aec7fcec	add config to optionally disable all compression in intermediate segment persists while ingestion (#7919 ) * disable all compression in intermediate segment persists while ingestion * more changes and build fix * by default retain existing indexingSpec for intermediate persisted segments * document indexSpecForIntermediatePersists index tuning config * fix build issues * update serde tests	2019-07-10 12:22:24 -07:00
Gian Merlino	338b8b3fef	SupervisorManager: Add authorization checks to bulk endpoints. (#8044 ) The endpoints added in #6272 were missing authorization checks. This patch removes the bulk methods from SupervisorManager, and instead has SupervisorResource run the full list through filterAuthorizedSupervisorIds before calling resume/suspend/terminate one by one.	2019-07-09 13:16:54 -07:00
Parag Jain	027291a90d	set DRUID_AUTHORIZATION_CHECKED attribute for router endpoints (#8026 ) * add state resource filter to router endpoints * add RouterResource to ResourceFilter test framework	2019-07-09 00:51:36 -07:00
Sashidhar Thallam	6701dc08fe	Making StatusResponseHandler singleton and fixing all its instantiation invocations (#7969 ) * Making StatusResponseHandler singleton and fixing all its instantiation invocations * Using StatusResponseHandler.getInstance() where applicable	2019-07-08 13:33:00 +05:30
Chi Cao Minh	1166bbcb75	Remove static imports from tests (#8036 ) Make static imports forbidden in tests and remove all occurrences to be consistent with the non-test code. Also, various changes to files affected by above: - Reformat to adhere to druid style guide - Fix various IntelliJ warnings - Fix various SonarLint warnings (e.g., the expected/actual args to Assert.assertEquals() were flipped)	2019-07-06 09:33:12 -07:00
Clint Wylie	42a7b8849a	remove FirehoseV2 and realtime node extensions (#8020 ) * remove firehosev2 and realtime node extensions * revert intellij stuff * rat exclusion	2019-07-04 15:40:22 -07:00
Clint Wylie	f7283378ac	remove deprecated standalone realtime node (#7915 ) * remove CliRealtime, RealtimeManager, etc * add redirects for deleted page to page that explains the deleted thing * adjust docs	2019-07-02 18:12:17 -07:00
Fokko Driesprong	c6baa59f77	Enable DLS_DEAD_LOCAL_STORE (#7967 )	2019-06-28 04:39:42 +08:00
Fokko Driesprong	82b248cc17	Spotbugs: Enable MS_SHOULD_BE_FINAL (#7946 )	2019-06-23 15:42:18 -07:00
SandishKumarHN	e80297efef	Set Test timeout higher for robust performance (#7890 )	2019-06-17 22:01:54 -07:00
SandishKumarHN	01881e3a98	Use only com.google.errorprone.annotations.concurrent.GuardedBy, not javax.annotations.concurrent.GuardedBy (#7889 )	2019-06-17 15:58:51 +02:00
Himanshu	b3328b2785	endpoint to delete lookup tier and remove tier on last lookup deletion (#7852 )	2019-06-15 17:55:50 -07:00
Sashidhar Thallam	3bee6adcf7	Use map.putIfAbsent() or map.computeIfAbsent() as appropriate instead of containsKey() + put() (#7764 ) * https://github.com/apache/incubator-druid/issues/7316 Use Map.putIfAbsent() instead of containsKey() + put() * fixing indentation * Using map.computeIfAbsent() instead of map.putIfAbsent() where appropriate * fixing checkstyle * Changing the recommendation text * Reverting auto changes made by IDE * Implementing recommendation: A ConcurrentHashMap on which computeIfAbsent() is called should be assigned into variables of ConcurrentHashMap type, not ConcurrentMap * Removing unused import	2019-06-14 17:59:36 +02:00
Clint Wylie	3fbb0a5e00	Supervisor list api with states and health (#7839 ) * allow optionally listing all supervisors with their state and health * docs * add state to full * clean * casing * format * spelling	2019-06-07 16:26:33 -07:00
Surekha	ea752ef562	Optimize overshadowed segments computation (#7595 ) * Move the overshadowed segment computation to SQLMetadataSegmentManager's poll * rename method in MetadataSegmentManager * Fix tests * PR comments * PR comments * PR comments * fix indentation * fix tests * fix test * add test for SegmentWithOvershadowedStatus serde format * PR comments * PR comments * fix test * remove snapshot updates outside poll * PR comments * PR comments * PR comments * removed unused import	2019-06-07 19:15:54 +02:00
Xue Yu	d482da6e9b	fix timestamp ceil lower bound bug (#7823 )	2019-06-04 01:16:31 -07:00
Fokko Driesprong	f2b00023f8	Bump Checkstyle to 8.21 (#7826 )	2019-06-04 01:02:46 -07:00
Jihoon Son	61ec521135	Remove keepSegmentGranularity option for compaction (#7747 ) * Remove keepSegmentGranularity option from compaction * fix it test * clean up * remove from web console * fix test	2019-06-03 12:59:15 -07:00
Justin Borromeo	8032c4add8	Add errors and state to stream supervisor status API endpoint (#7428 ) * Add state and error tracking for seekable stream supervisors * Fixed nits in docs * Made inner class static and updated spec test with jackson inject * Review changes * Remove redundant config param in supervisor * Style * Applied some of Jon's recommendations * Add transience field * write test * implement code review changes except for reconsidering logic of markRunFinishedAndEvaluateHealth() * remove transience reporting and fix SeekableStreamSupervisorStateManager impl * move call to stateManager.markRunFinished() from RunNotice to runInternal() for tests * remove stateHistory because it wasn't adding much value, some fixes, and add more tests * fix tests * code review changes and add HTTP health check status * fix test failure * refactor to split into a generic SupervisorStateManager and a specific SeekableStreamSupervisorStateManager * fixup after merge * code review changes - add additional docs * cleanup KafkaIndexTaskTest * add additional documentation for Kinesis indexing * remove unused throws class	2019-05-31 17:16:01 -07:00
Jihoon Son	7abfbb066a	Bump up snapshot version to 0.16.0 (#7802 )	2019-05-30 17:17:33 -07:00
Roman Leventov	782863ed0f	Fix some problems reported by PVS-Studio (#7738 ) * Fix some problems reported by PVS-Studio * Address comments	2019-05-29 11:20:45 -07:00
Gian Merlino	7ec7257e1d	Fix lookup serde on node types that don't load lookups. (#7752 ) This includes the router, overlord, middleManager, and coordinator. Does the following things: - Loads LookupSerdeModule on MM, overlord, and coordinator. - Adds LookupExprMacro to LookupSerdeModule, which allows these node types to understand that the 'lookup' function exists. - Adds a test to make sure that LookupSerdeModule works for virtual columns, filters, transforms, and dimension specs. This is implementing the technique discussed on these two issues: - https://github.com/apache/incubator-druid/issues/7724#issuecomment-494723333 - https://github.com/apache/incubator-druid/pull/7082#discussion_r264888771	2019-05-24 12:30:49 -07:00
Merlin Lee	26fad7e06a	Add checkstyle for "Local variable names shouldn't start with capital" (#7681 ) * Add checkstyle for "Local variable names shouldn't start with capital" * Adjust some local variables to constants * Replace StringUtils.LINE_SEPARATOR with System.lineSeparator()	2019-05-23 18:40:28 +02:00
Jonathan Wei	6901123a53	Fix compareAndSwap() in SQLMetadataConnector (#7661 ) * Fix compareAndSwap() in SQLMetadataConnector * Catch serialization_failure and retry for Postgres	2019-05-15 14:53:04 -07:00
Clint Wylie	b87c8f0314	fix lookup editor to use lookup tiers instead of historical tiers (#7647 ) * fix lookup editor to use lookup tiers instead of historical tiers * use default tier if empty response, fix if configured lookups is null * fixes * fix typo	2019-05-14 13:30:51 -07:00
Fokko Driesprong	2aa9613bed	Bump Checkstyle to 8.20 (#7651 ) * Bump Checkstyle to 8.20 Moderate severity vulnerability that affects: com.puppycrawl.tools:checkstyle Checkstyle prior to 8.18 loads external DTDs by default, which can potentially lead to denial of service attacks or the leaking of confidential information. Affected versions: < 8.18 * Oops, missed one * Oops, missed a few	2019-05-14 11:53:37 -07:00
Xavier Léauté	1d49364d08	Set direct memory if unable to detect JVM config (#7606 ) * Set direct memory if unable to detect JVM config Java 9 and above prevents us from detecting the maximum available direct memory. This change adds a fallback method to use at most 25% of maximum heap size, which should be a reasonable default. Unless -XX:MaxDirectMemorySize is set, recent JVMs will default maximum direct memory to match the maximum heap size, so this should work out of the box in most cases. For completeness we print instructions in the log to explain how to adjust settings if necessary. * skip test rather than succeeding * reword log message Co-Authored-By: Himanshu <g.himanshu@gmail.com>	2019-05-09 22:30:42 -07:00
Jihoon Son	18e0d6acb4	Fix resultLevelCache for timeseries with grandTotal (#7624 ) * Fix resultLevelCache for timeseries with grandTotal * Address comment * fix test	2019-05-09 18:11:04 -07:00
Jonathan Wei	dadf6a2f11	Add tool for migrating from local deep storage/Derby metadata (#7598 ) * Add tool for migrating from local deep storage/Derby metadata * Split deep storage and metadata migration docs * Support import into Derby * Fix create tables cmd * Fix create tables cmd * Fix commands * PR comment * Add -p	2019-05-06 23:39:40 -07:00
Xavier Léauté	c58aa2f2ab	Remove unnecessary cast to URLClassLoader (#7603 ) Java 9 and above will fail trying to cast the system classloader	2019-05-06 20:17:22 -07:00
Xavier Léauté	f7bfe8f269	Update mocking libraries for Java 11 support (#7596 ) * update easymock / powermock to 4.0.2 / 2.0.2 for JDK11 support * update tests to use new easymock interfaces * fix tests failing due to easymock fixes * remove dependency on jmockit * fix race condition in ResourcePoolTest	2019-05-06 12:28:56 -07:00
Samarth Jain	afbcb9c07f	Improve parallelism of zookeeper based segment change processing (#7088 ) * V1 - improve parallelism of zookeeper based segment change processing * Create zk nodes in batches. Address code review comments. Introduce various configs. * Add documentation for the newly added configs * Fix test failures * Fix more test failures * Remove printStackTrace statements * Address code review comments * Use a single queue * Address code review comments Since we have a separate load peon for every historical, just having a single SegmentChangeProcessor task per historical is enough. This commit also gets rid of the associated config druid.coordinator.loadqueuepeon.curator.numCreateThreads * Resolve merge conflict * Fix compilation failure * Remove batching since we already have a dynamic config maxSegmentsInNodeLoadingQueue that provides that control * Fix NPE in test * Remove documentation for configs that are no longer needed * Address code review comments * Address more code review comments * Fix checkstyle issue * Address code review comments * Code review comments * Add back monitor node remove executor * Cleanup code to isolate null checks and minor refactoring * Change param name since it conflicts with member variable name	2019-05-03 15:58:42 +02:00
Jonathan Wei	a013350018	Adjust required permissions for system schema (#7579 ) * Adjust required permissions for system schema * PR comments, fix current_size handling * Checkstyle * Set curr_size instead of current_size * Adjust information schema docs * Fix merge conflict * Update tests	2019-05-02 07:18:02 -07:00
David Lim	ec8562c885	Data loader (sampler component) (#7531 ) * sampler initial check-in fix checkstyle issues add sampler fix to process CSV files from cache properly change to composition and rename some classes add tests and report num rows read and indexed remove excludedByFilter flag and don't send filtered out data fix tests to handle both settings for druid.generic.useDefaultValueForNull * wrap sampler firehose in TimedShutoffFirehoseFactory to support timeouts * code review changes - add additional comments, limit maxRows	2019-05-01 22:37:14 -07:00
Surekha	15d19f3059	Add is_overshadowed column to sys.segments table (#7425 ) * Add is_overshadowed column to sys.segments table * update docs * Rename class and variables * PR comments * PR comments * remove unused variables in MetadataResource * move constants together * add getFullyOvershadowedSegments method to ImmutableDruidDataSource * Fix compareTo of SegmentWithOvershadowedStatus * PR comment * PR comments * PR comments * PR comments * PR comments * fix issue with already consumed stream * minor refactoring * PR comments	2019-05-01 18:00:57 +02:00
Xavier Léauté	6d4181191f	replace jdk internal exceptions with closest publicly available one	2019-04-30 14:21:45 -07:00
Gian Merlino	7b8bc9a5ef	EmitterModule: Throw an error on invalid emitter types. (#7328 ) * EmitterModule: Throw an error on invalid emitter types. The current behavior of silently using the "noop" emitter is unhelpful and makes it difficult to debug config typos. * Add comments.	2019-04-29 19:23:53 +02:00
Gian Merlino	ce7298b51e	BaseAppenderatorDriver: Fix potentially overeager segment cleanup. (#7558 ) * BaseAppenderatorDriver: Fix potentially overeager segment cleanup. Here is a thing that I think can go wrong: 1. We push some segments, then try to publish them transactionally. 2. The segments are actually published, but the 200 OK response gets lost (connection dropped, whatever). 3. We try again, and on the second try, the publish fails (because the transaction baseline start metadata no longer matches). 4. Because the publish failed, we delete the pushed segments. 5. But this is bad, because the publish didn't really fail, it actually succeeded in step 2. I haven't seen this in the wild, but thought about it while reviewing #7537. This patch also cleans up logging a bit, making it more accurate and somewhat less chatty. * Avoid wrapping exceptions when not necessary.	2019-04-29 09:55:04 -07:00
Justin Borromeo	07dd742e35	Fix time-ordered scan queries on realtime segments (#7546 ) * Initial commit * Added test for int to long conversion * Add appenderator test for realtime scan query * get rid of todo * Fix forbidden apis * Jon's recommendations * Formatting	2019-04-26 16:12:10 -07:00
Adam Peck	ebdf07b69f	Add reload by interval API (#7490 ) * Add reload by interval API Implements the reload proposal of #7439 Added tests and updated docs * PR updates * Only build timeline with required segments Use 404 with message when a segmentId is not found Fix typo in doc Return number of segments modified. * Fix checkstyle errors * Replace String.format with StringUtils.format * Remove return value * Expand timeline to segments that overlap for intervals Restrict update call to only segments that need updating. * Only add overlapping enabled segments to the timeline * Some renames for clarity Added comments * Don't rely on cached poll data Only fetch required information from DB * Match error style * Merge and cleanup doc * Fix String.format call * Add unit tests * Fix unit tests that check for overshadowing	2019-04-26 16:01:50 -07:00
Surekha	8308ffef1f	API to drop data by interval (#7494 ) * Add api to drop data by interval * update to address comments * unused imports * PR comments + add tests in SQLMetadataSegmentManagerTest * update tests and docs	2019-04-25 14:24:40 -07:00
Jihoon Son	c60e7feab8	Fix encoded taskId check in chatHandlerResource (#7520 ) * Fix encoded taskId check in chatHandlerResource * fix tests	2019-04-20 18:08:34 -07:00
Surekha	c2a42e05bb	Fix result-level cache for queries (#7325 ) * Add SegmentDescriptor interval in the hash while calculating Etag * Add computeResultLevelCacheKey to CacheStrategy Make HavingSpec cacheable and implement getCacheKey for subclasses Add unit tests for computeResultLevelCacheKey * Add more tests * Use CacheKeyBuilder for HavingSpec's getCacheKey * Initialize aggregators map to avoid NPE * adjust cachekey builder for HavingSpec to ignore aggregators * unused import * PR comments	2019-04-18 13:31:29 -07:00
Xavier Léauté	4322ce3303	Java 9 compatible cleaner operations (#7487 ) Java 9 removed support for sun.misc.Cleaner in favor of java.lang.ref.Cleaner. This change adds a thin abstraction to switch between Cleaner implementations based on JDK version at runtime	2019-04-17 08:04:52 -07:00
Lucas Capistrant	8acad27d99	Enhance the Http Firehose to work with URIs requiring basic authentication (#7145 ) * Enhance the HttpFirehose to work with both insecure URIs and URIs requiring basic authentication * Improve security of enhanced HttpFirehoseFactory by not logging auth credentials * Fix checkstyle failure in HttpFirehoseFactory.java * Update docs and fix TeamCity build with required noinspection * Indentation cleanup and logic modification for HttpFirehose object stream * Remove default Empty string password provider in http firehose * Add JavaDoc for MixIn describing its intended use * Reverting documentation notation for json code to be inline with rest of doc * Improve instantiation of ObjectMappers that require MixIn for redacting password from task logs * Add comment to clarify fully qualified references of Objects in SQLMetadataStorageActionHandler	2019-04-15 14:29:01 -07:00
Surekha	4654e1e851	Remove unnecessary collection (#7350 ) From the discussion [here](https://github.com/apache/incubator-druid/pull/6901#discussion_r265741002) Remove the collection and filter datasources from the stream. Also remove StreamingOutput and JsonFactory constructs.	2019-04-15 19:49:21 +02:00
Gian Merlino	3854cfd15e	SQLMetadataSegmentManager: Comments, formatting adjustments (#7452 ) Follow up to #7447.	2019-04-11 21:57:50 -07:00
Gian Merlino	a517f8ce49	Coordinator: Allow dropping all segments. (#7447 ) Removes the coordinator sanity check that prevents it from dropping all segments. It's useful to get rid of this, since the behavior is unintuitive for dev/testing clusters where users might regularly want to drop all their data to get back to a clean slate. But the sanity check was there for a reason: to prevent a race condition where the coordinator might drop all segments if it ran before the first metadata store poll finished. This patch addresses that concern differently, by allowing methods in MetadataSegmentManager to return null if a poll has not happened yet, and canceling coordinator runs in that case. This patch also makes the "dataSources" reference in SQLMetadataSegmentManager volatile. I'm not sure why it wasn't volatile before, but it seems necessary to me: it's not final, and it's dereferenced from multiple threads without synchronization.	2019-04-11 08:45:38 -07:00
Justin Borromeo	2771ed50b0	Support Kafka supervisor adopting running tasks between versions (#7212 ) * Recompute hash in isTaskCurrent() and added tests * Fixed checkstyle stuff * Fixed failing tests * Make TestableKafkaSupervisorWithCustomIsTaskCurrent static * Add doc * baseSequenceName change * Added comment * WIP * Fixed imports * Undid lambda change for diff sake * Cleanup * Added comment * Reinsert Kafka tests * Readded kinesis test * Readd bad partition assignment in kinesis supervisor test * Nit * Misnamed var	2019-04-10 18:16:38 -07:00
Clint Wylie	76b4a5c62e	refactor lookups to be more chill to router (#7222 ) * refactor lookups to be more chill to router * remove accidental change * fix and combine LookupIntrospectionResourceTest * fix inspection * rename RouterLookupModule to LookupSerdeModule and RouterLookupExtractorFactoryContainerProvider to NoopLookupExtractorFactoryContainerProvider * make comment generic * use ConfigResourceFilter instead of StateResourceFilter * fix indentation * unused import * another unused import * refactor some stuff into processing module, split up LookupModule.java classes into their own files	2019-04-05 14:49:41 -07:00
Gian Merlino	78745fea84	Fix two issues with Coordinator -> Overlord communication. (#7412 ) * Fix two issues with Coordinator -> Overlord communication. 1) ClientCompactQuery needs to recognize the potential for 'intervals' to be set instead of 'segments'. The lack of this led to a NullPointerException on DruidCoordinatorSegmentCompactor.java:102. 2) In two locations (DruidCoordinatorSegmentCompactor, DruidCoordinatorCleanupPendingSegments) tasks were being retrieved using waiting/pending/running tasks in the wrong order: by checking 'running' first and then 'pending', tasks could be missed if they moved from 'pending' to 'running' in between the two calls. Replaced these methods with calls to 'getActiveTasks', a new method that does the calls in the right order. * Remove unused import.	2019-04-04 10:25:18 -07:00
David Glasser	4e23c11345	Make IngestSegmentFirehoseFactory splittable for parallel ingestion (#7048 ) * Make IngestSegmentFirehoseFactory splittable for parallel ingestion * Code review feedback - Get rid of WindowedSegment - Don't document 'segments' parameter or support splitting firehoses that use it - Require 'intervals' in WindowedSegmentId (since it won't be written by hand) * Add missing @JsonProperty * Integration test passes * Add unit test * Remove two FIXME comments from CompactionTask I'd like to leave this PR in a potentially mergeable state, but I still would appreciate reviewer eyes on the questions I'm removing here. * Updates from code review	2019-04-02 14:59:17 -07:00
Michael Trelinski	347779b17a	Zookeeper loss (#6740 ) * Update init Fix bin/init to source from proper directory. * Fix for Proposal #6518: Shutdown druid processes upon complete loss of ZK connectivity * Zookeeper Loss: - Add feature documentation - Cosmetic refactors - Variable extractions - Remove getter * - Change config key name and reword documentation - Switch from Function<Void,Void> to Runnable/Lambda - try { … } finally { … } * Fix line length too long * - change to formatted string for logging - use System.err.println after lifecycle stops * commenting on makeEnsembleProvider()-created Zookeeper termination * Add javadoc * added java doc reference back to apache discussion thread. * move comment to other class * favor two-slash comments instead of multiline comments	2019-03-29 15:10:42 -07:00
Jihoon Son	62c3e89266	maxTotalRows should be checked in DataSourceCompactionConfig before setting targetCompactionSizeBytes (#7368 ) * maxTotalRows should be checked in DataSourceCompactionConfig before setting targetCompactionSizeBytes * remove unnecessary default values * remove flaky test * fix build * Add comments	2019-03-28 20:25:10 -07:00
Justin Borromeo	ad7862c58a	Time Ordering On Scans (#7133 ) * Moved Scan Builder to Druids class and started on Scan Benchmark setup * Need to form queries * It runs. * Stuff for time-ordered scan query * Move ScanResultValue timestamp comparator to a separate class for testing * Licensing stuff * Change benchmark * Remove todos * Added TimestampComparator tests * Change number of benchmark iterations * Added time ordering to the scan benchmark * Changed benchmark params * More param changes * Benchmark param change * Made Jon's changes and removed TODOs * Broke some long lines into two lines * nit * Decrease segment size for less memory usage * Wrote tests for heapsort scan result values and fixed bug where iterator wasn't returning elements in correct order * Wrote more tests for scan result value sort * Committing a param change to kick teamcity * Fixed codestyle and forbidden API errors * . * Improved conciseness * nit * Created an error message for when someone tries to time order a result set > threshold limit * Set to spaces over tabs * Fixing tests WIP * Fixed failing calcite tests * Kicking travis with change to benchmark param * added all query types to scan benchmark * Fixed benchmark queries * Renamed sort function * Added javadoc on ScanResultValueTimestampComparator * Unused import * Added more javadoc * improved doc * Removed unused import to satisfy PMD check * Small changes * Changes based on Gian's comments * Fixed failing test due to null resultFormat * Added config and get # of segments * Set up time ordering strategy decision tree * Refactor and pQueue works * Cleanup * Ordering is correct on n-way merge -> still need to batch events into ScanResultValues * WIP * Sequence stuff is so dirty :( * Fixed bug introduced by replacing deque with list * Wrote docs * Multi-historical setup works * WIP * Change so batching only occurs on broker for time-ordered scans Restricted batching to broker for time-ordered queries and adjusted tests Formatting Cleanup * Fixed mistakes in merge * Fixed failing tests * Reset config * Wrote tests and added Javadoc * Nit-change on javadoc * Checkstyle fix * Improved test and appeased TeamCity * Sorry, checkstyle * Applied Jon's recommended changes * Checkstyle fix * Optimization * Fixed tests * Updated error message * Added error message for UOE * Renaming * Finish rename * Smarter limiting for pQueue method * Optimized n-way merge strategy * Rename segment limit -> segment partitions limit * Added a bit of docs * More comments * Fix checkstyle and test * Nit comment * Fixed failing tests -> allow usage of all types of segment spec * Fixed failing tests -> allow usage of all types of segment spec * Revert "Fixed failing tests -> allow usage of all types of segment spec" This reverts commit `ec470288c7`. * Revert "Merge branch '6088-Time-Ordering-On-Scans-N-Way-Merge' of github.com:justinborromeo/incubator-druid into 6088-Time-Ordering-On-Scans-N-Way-Merge" This reverts commit `57033f36df`, reversing changes made to `8f01d8dd16`. * Check type of segment spec before using for time ordering * Fix bug in numRowsScanned * Fix bug messing up count of rows * Fix docs and flipped boolean in ScanQueryLimitRowIterator * Refactor n-way merge * Added test for n-way merge * Refixed regression * Checkstyle and doc update * Modified sequence limit to accept longs and added test for long limits * doc fix * Implemented Clint's recommendations	2019-03-28 14:37:09 -07:00
Charles Allen	eeb3dbe79d	Move GCP to a core extension (#6953 ) * Move GCP to a core extension * Don't provide druid-core >.< * Keep AWS and GCP modules separate * Move AWSModule to its own module * Add aws ec2 extension and more modules in more places * Fix bad imports * Fix test jackson module * Include AWS and GCP core in server * Add simple empty method comment * Update version to 15 * One more 0.13.0-->0.15.0 change * Fix multi-binding problem * Grep for s3-extensions and update docs * Update extensions.md	2019-03-27 09:00:43 -07:00
Jihoon Son	543324f8a9	Fix logging in IndexerSQLMetadataStorageCoordinator (#7349 )	2019-03-26 20:36:19 -07:00
Jihoon Son	4d37edac1e	Suppress stack trace in warning (#7348 )	2019-03-26 17:27:29 -07:00
Jihoon Son	5294277cb4	Fix exclusive start partitions for sequenceMetadata (#7339 ) * Fix exclusive start partitions for sequenceMetadata * add empty check	2019-03-26 14:39:07 -07:00
Roman Leventov	bca40dcdaf	Fix some IntelliJ inspections (#7273 ) Prepare TeamCity for IntelliJ 2018.3.1 upgrade. Mostly removed redundant exceptions declarations in `throws` clauses.	2019-03-25 21:11:01 -03:00
Jihoon Son	f410c28af6	Always convert start metadata to start (#7332 )	2019-03-22 21:12:15 -07:00
Jihoon Son	0c5dcf5586	Fix exclusivity for start offset in kinesis indexing service & check exclusivity properly in IndexerSQLMetadataStorageCoordinator (#7291 ) * Fix exclusivity for start offset in kinesis indexing service * some adjustment * Fix SeekableStreamDataSourceMetadata * Add missing javadocs * Add missing comments and unit test * fix SeekableStreamStartSequenceNumbers.plus and add comments * remove extra exclusivePartitions in KafkaIOConfig and fix downgrade issue * Add javadocs * fix compilation * fix test * remove unused variable	2019-03-21 13:12:22 -07:00
Roman Leventov	dfd27e00c0	Avoid many unnecessary materializations of collections of 'all segments in cluster' cardinality (#7185 ) * Avoid many unnecessary materializations of collections of 'all segments in cluster' cardinality * Fix DruidCoordinatorTest; Renamed DruidCoordinator.getReplicationStatus() to computeUnderReplicationCountsPerDataSourcePerTier() * More Javadocs, typos, refactor DruidCoordinatorRuntimeParams.createAvailableSegmentsSet() * Style * typo * Disable StaticPseudoFunctionalStyleMethod inspection because of too much false positives * Fixes	2019-03-19 18:22:56 -03:00
Jihoon Son	e18d5d96d9	Ignore bad JSON entries in SQLMetadataSupervisorManager.getAll() (#7278 )	2019-03-18 14:28:11 +08:00
Jihoon Son	892d1d35d6	Deprecate NoneShardSpec and drop support for automatic segment merge (#6883 ) * Deprecate noneShardSpec * clean up noneShardSpec constructor * revert unnecessary change * Deprecate mergeTask * add more doc * remove convert from indexMerger * Remove mergeTask * remove HadoopDruidConverterConfig * fix build * fix build * fix teamcity * fix teamcity * fix ServerModule * fix compilation * fix compilation	2019-03-15 23:29:25 -07:00
Atul Mohan	2daeb50008	Add support for optional client authentication on TLS (#7250 ) * Add optional client auth * Add docs	2019-03-15 15:14:34 -07:00
Furkan KAMACI	7ada1c49f9	Prohibit Throwables.propagate() (#7121 ) * Throw caught exception. * Throw caught exceptions. * Related checkstyle rule is added to prevent further bugs. * RuntimeException() is used instead of Throwables.propagate(). * Missing import is added. * Throwables are propagated if possible. * Throwables are propagated if possible. * Throwables are propagated if possible. * Throwables are propagated if possible. * * Checkstyle definition is improved. * Throwables.propagate() usages are removed. * Checkstyle pattern is changed for only scanning "Throwables.propagate(" instead of checking lookbehind. * Throwable is kept before firing a Runtime Exception. * Fix unused assignments.	2019-03-14 18:28:33 -03:00
Hongze Zhang	f9d99b245b	Add missing doc link for operations/http-compression.html; Fix magic numbers in test cases using JettyServerInitUtils.wrapWithDefaultGzipHandler (#7110 )	2019-03-13 14:09:19 -07:00
Clint Wylie	3895914aa2	consolidate CompressionUtils.java since now in the same jar (#6908 )	2019-03-13 11:02:44 -04:00
Clint Wylie	4d3987c1dd	lifecycle stage refactor to ensure proper start and stop ordering of servers and announcements (#7234 ) * lifecycle stage refactor to ensure proper ordering of servers and announcements * move DerivativeDataSourceManager to Lifecycle.Stage.NORMAL	2019-03-12 07:09:03 -07:00
Jihoon Son	e240fba247	Fix logs in SegmentLoaderLocalCacheManager (#7229 )	2019-03-11 21:16:03 -07:00
Gian Merlino	dcfca03718	More accurate RealtimeMetricsMonitor messages. (#7230 ) The old messages did not reflect the full range of reasons why messages could be thrown away.	2019-03-11 19:50:32 -04:00
Samarth Jain	8804bd0dc1	Remove unnecessary check for contains() in LoadRule (#7073 ) See https://github.com/apache/incubator-druid/issues/7072	2019-03-11 13:52:46 -03:00
Clint Wylie	5cc171419c	move jetty module to Lifecycle.Stage.LAST to allow graceful shutdown to work with lookups and stuff, put http-clint on lifecycle modules lifecycle (#7215 )	2019-03-09 15:14:09 -08:00
Jihoon Son	9bebf113ba	Fix race in historical when loading segments in parallel (#7203 ) * Fix race in historical when loading segments in parallel * revert unnecessary change * remove synchronized * add reference counting locking * fix build * fix comment	2019-03-08 17:54:05 -08:00
Clint Wylie	a44df6522c	rename maintenance mode to decommission (#7154 ) * rename maintenance mode to decommission * review changes * missed one * fix straggler, add doc about decommissioning stalling if no active servers * fix missed typo, docs * refine docs * doc changes, replace generals * add explicit comment to mention suppressed stats for balanceTier * rename decommissioningVelocity to decommissioningMaxSegmentsToMovePercent and update docs * fix precondition check * decommissioningMaxPercentOfMaxSegmentsToMove * fix test * fix test * fixes	2019-03-08 16:33:51 -08:00
Roman Leventov	10c9f6d708	Fix and document concurrency of EventReceiverFirehose and TimedShutoffFirehose; Refine concurrency specification of Firehose (#7038 ) #### `EventReceiverFirehoseFactory` Fixed several concurrency bugs in `EventReceiverFirehoseFactory`: - Race condition over putting an entry into `producerSequences` in `checkProducerSequence()`. - `Stopwatch` used to measure time across threads, but it's a non-thread-safe class. - Use `System.nanoTime()` instead of `System.currentTimeMillis()` because the latter is [not suitable](https://stackoverflow.com/a/351571/648955) for measuring time intervals. - `close()` was not synchronized but could be called from multiple threads concurrently. Removed unnecessary `readLock` (protecting `hasMore()` and `nextRow()` which are always called from a single thread). Removed unnecessary `volatile` modifiers. Documented threading model and concurrent control flow of `EventReceiverFirehose` instances. Important: please read the updated Javadoc for `EventReceiverFirehose.addAll()`. It allows events from different requests (batches) to be interleaved in the buffer. Is this OK? #### `TimedShutoffFirehoseFactory` - Fixed a race condition that was possible because `close()` was not properly synchronized. Documented threading model and concurrent control flow of `TimedShutoffFirehose` instances. #### `Firehose` Refined concurrency contract of `Firehose` based on `EventReceiverFirehose` implementation. Importantly, now it states that `close()` doesn't affect `hasMore()` and `nextRow()` and could be called concurrently with them. In other words, specified that `close()` is for "row supply" side rather than "row consume" side. However, I didn't check that other `Firehose` implementations adhere to this contract. <hr> This issue is the result of reviewing `EventReceiverFirehose` and `TimedShutoffFirehose` using [this checklist](https://medium.com/@leventov/code-review-checklist-java-concurrency-49398c326154).	2019-03-04 18:50:03 -03:00
Himanshu Pandey	8b803cbc22	Added checkstyle for "Methods starting with Capital Letters" (#7118 ) * Added checkstyle for "Methods starting with Capital Letters" and changed the method names violating this. * Un-abbreviate the method names in the calcite tests * Fixed checkstyle errors * Changed asserts position in the code	2019-02-23 20:10:31 -08:00
David Glasser	1c2753ab90	ParallelIndexSubTask: support ingestSegment in delegating factories (#7089 ) IndexTask had special-cased code to properly send a TaskToolbox to a IngestSegmentFirehoseFactory that's nested inside a CombiningFirehoseFactory, but ParallelIndexSubTask didn't. This change refactors IngestSegmentFirehoseFactory so that it doesn't need a TaskToolbox; it instead gets a CoordinatorClient and a SegmentLoaderFactory directly injected into it. This also refactors SegmentLoaderFactory so it doesn't depend on an injectable SegmentLoaderConfig, since its only method always replaces the preconfigured SegmentLoaderConfig anyway. This makes it possible to use SegmentLoaderFactory without setting druid.segmentCaches.locations to some dummy value. Another goal of this PR is to make it possible for IngestSegmentFirehoseFactory to list data segments outside of connect() --- specifically, to make it a FiniteFirehoseFactory which can query the coordinator in order to calculate its splits. See #7048. This also adds missing datasource name URL-encoding to an API used by CoordinatorBasedSegmentHandoffNotifier.	2019-02-23 17:02:56 -08:00
Jihoon Son	4e2b085201	Remove DataSegmentFinder, InsertSegmentToDb, and descriptor.json file in deep storage (#6911 ) * Remove DataSegmentFinder, InsertSegmentToDb, and descriptor.json file * delete descriptor.file when killing segments * fix test * Add doc for ha * improve warning	2019-02-20 15:10:29 -08:00
Mingming Qiu	dd34691004	Coordinator await initialization before finishing startup (#6847 ) * Curator server inventory await initialization * address comments * print exception object in log * remove throws ISE * cachingCost awaitInitialization default to false	2019-02-20 11:56:23 -08:00
Justin Borromeo	c7eeeabf45	2528 Replace Incremental Index Global Flags with Getters (#7043 ) * Eliminated reportParseExceptions and deserializeComplexMetrics * Removed more global flags * Cleanup * Addressed Surekha's recommendations	2019-02-15 13:36:46 -08:00
Jihoon Son	1701fbcad3	Improve error message for revoked locks (#7035 ) * Improve error message for revoked locks * fix test * fix test * fix test * fix toString	2019-02-13 11:22:48 -08:00
Jihoon Son	d42de574d6	Add an api to get all lookup specs (#7025 ) * Add an api to get all lookup specs * add doc	2019-02-08 11:05:59 -08:00
Jonathan Wei	fafbc4a80e	Set version to 0.15.0-incubating-SNAPSHOT (#7014 )	2019-02-07 14:02:52 -08:00
Jonathan Wei	8bc5eaa908	Set version to 0.14.0-incubating-SNAPSHOT (#7003 )	2019-02-04 19:36:20 -08:00
Egor Riashin	97b6407983	maintenance mode for Historical (#6349 ) * maintenance mode for Historical forbidden api fix, config deserialization fix logging fix, unit tests * addressed comments * addressed comments * a style fix * addressed comments * a unit-test fix due to recent code-refactoring * docs & refactoring * addressed comments * addressed a LoadRule drop flaw * post merge cleaning up	2019-02-04 18:11:00 -08:00
Roman Leventov	0e926e8652	Prohibit assigning concurrent maps into Map-typed variables and fields and fix a race condition in CoordinatorRuleManager (#6898 ) * Prohibit assigning concurrent maps into Map-types variables and fields; Fix a race condition in CoordinatorRuleManager; improve logic in DirectDruidClient and ResourcePool * Enforce that if compute(), computeIfAbsent(), computeIfPresent() or merge() is called on a ConcurrentHashMap, it's stored in a ConcurrentHashMap-typed variable, not ConcurrentMap; add comments explaining get()-before-computeIfAbsent() optimization; refactor Counters; fix a race condition in Intialization.java * Remove unnecessary comment * Checkstyle * Fix getFromExtensions() * Add a reference to the comment about guarded computeIfAbsent() optimization; IdentityHashMap optimization * Fix UriCacheGeneratorTest * Workaround issue with MaterializedViewQueryQueryToolChest * Strengthen Appenderator's contract regarding concurrency	2019-02-04 09:18:12 -08:00
Surekha	7baa33049c	Introduce published segment cache in broker (#6901 ) * Add published segment cache in broker * Change the DataSegment interner so it's not based on DataSEgment's equals only and size is preserved if set * Added a trueEquals to DataSegment class * Use separate interner for realtime and historical segments * Remove trueEquals as it's not used anymore, change log message * PR comments * PR comments * Fix tests * PR comments * Few more modification to * change the coordinator api * removeall segments at once from MetadataSegmentView in order to serve a more consistent view of published segments * Change the poll behaviour to avoid multiple poll execution at same time * minor changes * PR comments * PR comments * Make the segment cache in broker off by default * Added a config to PlannerConfig * Moved MetadataSegmentView to sql module * Add doc for new planner config * Update documentation * PR comments * some more changes * PR comments * fix test * remove unintentional change, whether to synchronize on lifecycleLock is still in discussion in PR * minor changes * some changes to initialization * use pollPeriodInMS * Add boolean cachePopulated to check if first poll succeeds * Remove poll from start() * take the log message out of condition in stop()	2019-02-02 22:27:13 -08:00
Vadim Ogievetsky	7f1b19bfb1	Adding a Unified web console. (#6923 ) * Adding new web console. * fixed css * fix form height * fix typo * do import custom react-table css * added repo field so npm does not complain * ask travis for node 10 * move indexing-service/src/main/resources/indexer_static into web-console * fix resource names and paths * add licenses * fix exclude file * add licenses to misc files and tidy up * remove rebase marker * fix link * updated env variable name * tidy up licenses and surface errors * cleanup * remove unused code, fix missing await * TeamCity does not like the name aux * add more links to tasks view * rm pages * update gitignore * update readme to be accurate * make clean script * removed old console dependency * update Jetty routes * add a comment for welcome files for coordinator * do not show initial notification for now * renamed overlord console back to console.html * fix coordinator console * rename coordinator-console.html to index.html	2019-01-31 17:26:41 -08:00
Jihoon Son	e56c598cc1	Fall back to the old coordinator API for checking segment handoff if new one is not supported (#6966 )	2019-01-31 08:50:46 -08:00
Benedict Jin	72a571fbf7	For performance reasons, use `java.util.Base64` instead of Base64 in Apache Commons Codec and Guava (#6913 ) * * Add few methods about base64 into StringUtils * Use `java.util.Base64` instead of others * Add org.apache.commons.codec.binary.Base64 & com.google.common.io.BaseEncoding into druid-forbidden-apis * Rename encodeBase64String & decodeBase64String * Update druid-forbidden-apis	2019-01-25 17:32:29 -08:00
Roman Leventov	8eae26fd4e	Introduce SegmentId class (#6370 ) * Introduce SegmentId class * tmp * Fix SelectQueryRunnerTest * Fix indentation * Fixes * Remove Comparators.inverse() tests * Refinements * Fix tests * Fix more tests * Remove duplicate DataSegmentTest, fixes #6064 * SegmentDescriptor doc * Fix SQLMetadataStorageUpdaterJobHandler * Fix DataSegment deserialization for ignoring id * Add comments * More comments * Address more comments * Fix compilation * Restore segment2 in SystemSchemaTest according to a comment * Fix style * fix testServerSegmentsTable * Fix compilation * Add comments about why SegmentId and SegmentIdWithShardSpec are separate classes * Fix SystemSchemaTest * Fix style * Compare SegmentDescriptor with SegmentId in Javadoc and comments rather than with DataSegment * Remove a link, see https://youtrack.jetbrains.com/issue/IDEA-205164 * Fix compilation	2019-01-21 11:11:10 -08:00
Clint Wylie	8ba33b2505	add 'init' lifecycle stage for finer control over startup and shutdown (#6864 ) * add Lifecycle.Stage.INIT, put log shutter downer in init stage, tests, rad startup banner * log cleanup * log changes * add task-master lifecycle to module lifecycle to gracefully stop task-master stuff * fix it the right way * remove announce spam * unused import * one more log * updated comments * wrap leadership lifecycle stop to prevent exceptions from wrecking rest of task master stop * add precondition check	2019-01-21 09:01:36 -08:00
Mingming Qiu	b704ebfa37	Let cachingCost balancer strategy only consider segment replicatable nodes (#6879 )	2019-01-17 09:26:33 -08:00
Jihoon Son	a07e66c540	Fix auto compaction to compact only same or abutting intervals (#6808 ) * Fix auto compaction to compact only same or abutting intervals * fix test	2019-01-16 14:54:11 -08:00
Dayue Gao	5b8a221713	Add SQL id, request logs, and metrics (#6302 ) * use SqlLifecyle to manage sql execution, add sqlId * add sql request logger * fix UT * rename sqlId to sqlQueryId, sql/time to sqlQuery/time, etc * add docs and more sql request logger impls * add UT for http and jdbc * fix forbidden use of com.google.common.base.Charsets * fix UT in QuantileSqlAggregatorTest, suppressed unused warning of getSqlQueryId * do not use default method in QueryMetrics interface * capitalize 'sql' everywhere in the non-property parts of the docs * use RequestLogger interface to log sql query * minor bugfixes and add switching request logger * add filePattern configs for FileRequestLogger * address review comments, adjust sql request log format * fix inspection error * try SuppressWarnings("RedundantThrows") to fix inspection error on ComposingRequestLoggerProvider	2019-01-15 23:12:59 -08:00
Jonathan Wei	8537a771b0	Some fixes and tests for spaces/non-ASCII chars in datasource names (#6761 ) * Fixes and tests for spaces/non-ASCII datasource names * Some unit test fixes * Fix ITRealtimeIndexTaskTest * Checkstyle * TeamCity * PR comments	2019-01-15 08:33:31 -08:00
Surekha	f72f33f84a	Fix num_replicas count in sys.segments table (#6804 ) * Fix num_replicas count from sys.segments * Adjust unit test for num_replica > 1 * Pass named arguments instead of passing boolean constants * Address PR comments * PR comments	2019-01-15 08:31:29 -08:00
Charles Allen	5d2947cd52	Use Guava Compatible immediate executor service (#6815 ) * Use multi-guava version friendly direct executor implementation * Don't use a singleton * Fix strict compilation complaints * Copy Guava's DirectExecutor * Fix javadoc * Imports are the devil	2019-01-11 10:42:19 -08:00
Jihoon Son	c35a39d70b	Add support maxRowsPerSegment for auto compaction (#6780 ) * Add support maxRowsPerSegment for auto compaction * fix build * fix build * fix teamcity * add test * fix test * address comment	2019-01-10 09:50:14 -08:00
Mingming Qiu	8ebb7b558b	Handoff should ignore segments that are dropped by drop rules (#6676 ) * Handoff should ignore segments that are dropped by drop rules * fix travis-ci * fix tests * address comments * remove line added by accident * address comments * add javadoc and logging the full stack trace of exception * add error message	2019-01-07 14:43:11 -08:00
Mingming Qiu	636964fcb5	Fix issue that tasks failed because of no sink for identifier (#6724 ) * Fix issue that tasks failed because of no sink for identifier * make find sinks to persist run in one callable together with the actual persist work * Revert "make find sinks to persist run in one callable together with the actual persist work" This reverts commit `a24a2d80ae`.	2019-01-04 17:09:11 -08:00
elloooooo	832a3b16ed	Improve slfj logger input for MDC field:datasource (#6787 ) * improve slfj logger MDC datasource input * add some UT and isNested field	2019-01-03 18:00:04 -08:00
Jihoon Son	9ad6a733a5	Add support segmentGranularity for CompactionTask (#6758 ) * Add support segmentGranularity * add doc and fix combination of options * improve doc	2019-01-03 17:50:45 -08:00
Mingming Qiu	114a9fc38f	change propertyBase in ServerViewModule (#6774 )	2019-01-02 16:44:02 +08:00
Jihoon Son	fa7cb906e4	Fix auto compaction to consider intervals of running tasks (#6767 ) * Fix auto compaction to consider intervals of running tasks * adjust initial collection size	2018-12-27 18:03:44 -08:00
Gian Merlino	7a09cde4de	Broker: Await initialization before finishing startup. (#6742 ) * Broker: Await initialization before finishing startup. In particular, hold off on announcing the service and starting the HTTP server until the server view and SQL metadata cache are finished initializing. This closes a window of time where a Broker could return partial results shortly after startup. As part of this, some simplification of server-lifecycle service announcements. This helps ensure that the two different kinds of announcements we do (legacy and new-style) stay in sync. * Remove unused imports. * Fix NPE in ServerRunnable.	2018-12-18 20:32:31 -08:00
Clint Wylie	9505074530	fix log typo (#6755 ) * fix log typo, add DataSegmentUtils.getIdentifiersString util method * fix indecisive oops	2018-12-18 15:10:25 -08:00
Jihoon Son	f0ee6bf898	Fix auto compaction when the firstSegment is in skipOffset (#6738 ) * Fix auto compaction when the firstSegment is in skipOffset * remove duplicate	2018-12-18 19:10:46 +08:00
Clint Wylie	486c6f3cf9	emit logs that are only useful for debugging at debug level (#6741 ) * make logs that are only useful for debugging be at debug level so log volume is much more chill * info level messages for total merge buffer allocated/free * more chill compaction logs	2018-12-17 14:20:28 +08:00
Jonathan Wei	c713116a75	Use @Coordinator leader client in CoordinatorRuleManager (#6729 )	2018-12-16 15:18:09 -08:00
Gian Merlino	04e7c7fbdc	FilteredRequestLogger: Fix start/stop, invalid delegate behavior. (#6637 ) * FilteredRequestLogger: Fix start/stop, invalid delegate behavior. Fixes two bugs: 1) FilteredRequestLogger did not start/stop the delegate. 2) FilteredRequestLogger would ignore an invalid delegate type, and instead silently substitute the "noop" logger. This was due to a larger problem with RequestLoggerProvider setup in general; the fix here is to remove "defaultImpl" from the RequestLoggerProvider interface, and instead have JsonConfigurator be responsible for creating the default implementations. It is stricter about things than the old system was, and is only willing to make a noop logger if it doesn't see any request logger configs. Otherwise, it'll raise a provision error. * Remove unneeded annotations.	2018-12-14 16:55:44 +08:00
dongyifeng	91e3cf7196	add charset UTF-8 to log api (#6709 ) When I retrieve the task log in browser, the Chinese characters all end up as garbage. ![image](https://user-images.githubusercontent.com/1322134/49502749-bd614080-f8b0-11e8-839e-07f7117eebfd.png) After adding charset UTF-8, it was correct. ![image](https://user-images.githubusercontent.com/1322134/49502804-dc5fd280-f8b0-11e8-916b-bda8f1e7f318.png)	2018-12-12 16:31:04 +01:00
Atul Mohan	86e3ae5b48	Add fail message (#6720 )	2018-12-11 08:05:50 -08:00
Mingming Qiu	e8dd3716b8	add close method in Cache interface (#6540 ) * add close method in Cache interface * address comments * address comments and fix travis-ci * use try-finally	2018-12-06 17:28:41 +08:00
Mingming Qiu	607339003b	Add TaskCountStatsMonitor to monitor task count stats (#6657 ) * Add TaskCountStatsMonitor to monitor task count stats * address comments * add file header * tweak test	2018-12-04 13:37:17 -08:00
Clint Wylie	a1c9d0add2	autosize processing buffers based on direct memory sizing by default (#6588 ) * autosize processing buffers based on direct memory sizing * remove oops, more test * max 1gb autosize buffers, test, start of docs * fix oops * revert accidental change * print buffer size in exception * change the things	2018-12-03 18:40:02 -07:00
Clint Wylie	43adb391c2	remove AbstractResourceFilter.isApplicable because it is not (#6691 ) * remove AbstractResourceFilter.isApplicable because it is not, add tests for OverlordResource.doShutdown and OverlordResource.shutdownTasksForDatasource * cleanup	2018-12-01 21:52:31 +08:00
Roman Leventov	ec38df7575	Simplify DruidNodeDiscoveryProvider; add DruidNodeDiscovery.Listener.nodeViewInitialized() (#6606 ) * Simplify DruidNodeDiscoveryProvider; add DruidNodeDiscovery.Listener.nodeViewInitialized() method; prohibit and eliminate some suboptimal Java 8 patterns * Fix style * Fix HttpEmitterTest.timeoutEmptyQueue() * Add DruidNodeDiscovery.Listener.nodeViewInitialized() calls in tests * Clarify code	2018-12-01 01:12:56 +01:00
Jihoon Son	d6539abd0a	Fix overlord api and console (#6686 ) * Fix overlord APIs and console * remove getRunningTasksByDataSource * add missing path to isApplicable	2018-11-29 23:45:28 -08:00
hate13	f4b49f01ff	add rule count on log (#6467 ) * add rule count on log * add final	2018-11-28 16:08:38 +08:00
Mingming Qiu	9a89200607	Emit query metrics even if the ETags are equal (#6663 )	2018-11-27 15:18:01 -08:00
Jihoon Son	219f0965dc	Remove duplicate DataSegmentTest (#6669 )	2018-11-27 15:13:39 -08:00
seoeun	22a5bf97a2	Fix issue that tasks tables in metadata storage are not cleared (#6592 ) * tasks tables in metadata storage are not cleared * address comments. remove tasklogs and revert obsolete changes * address comments. change comment and update doc. * address comments. update doc more detailed * address comments. remove redundant log and update doc more detailed. * address comments. update document	2018-11-22 11:50:31 +08:00
Gian Merlino	92cce04165	Fix missing default config in some calls to coordinator dynamic configs. (#6652 ) * Fix missing default config in some calls to coordinator dynamic configs. The lack of a default config meant that if someone called an API _without_ a default config before one _with_ a default config, then the default value would get stuck at null instead of the intended default value. I noticed this in a cluster where calling /druid/coordinator/v1/config before a coordinator had fully started up would lead to NPEs during DruidCoordinatorRuleRunner. This patch makes the default configs consistent across all calls. * Remove unnecessary null check.	2018-11-22 10:25:39 +08:00
Roman Leventov	87b96fb1fd	Add checkstyle rules about imports and empty lines between members (#6543 ) * Add checkstyle rules about imports and empty lines between members * Add suppressions * Update Eclipse import order * Add empty line * Fix StatsDEmitter	2018-11-20 12:42:15 +01:00
Roman Leventov	8f3fe9cd02	Prohibit String.replace() and String.replaceAll(), fix and prohibit some toString()-related redundancies (#6607 ) * Prohibit String.replace() and String.replaceAll(), fix and prohibit some toString()-related redundancies * Fix bug * Replace checkstyle regexp with IntelliJ inspection	2018-11-15 13:21:34 -08:00
Jihoon Son	0395d554e1	Properly reset total size of segmentsToCompact in NewestSegmentFirstIterator (#6622 ) * Properly reset total size of segmentsToCompact in NewestSegmentFirstIterator * add test	2018-11-15 01:00:51 -08:00
Jihoon Son	7b262b7123	Remove unnecessary path param from auto compaction api (#6594 ) * Remove unnecessary path param from auto compaction api * fix ci	2018-11-13 09:46:13 -08:00
David Lim	afb239b17a	add missing license headers, in particular to MD files; clean up RAT … (#6563 ) * add missing license headers, in particular to MD files; clean up RAT exclusions * revert inadvertent doc changes * docs * cr changes * fix modified druid-production.svg	2018-11-13 09:38:37 -08:00
Roman Leventov	54351a5c75	Fix various bugs; Enable more IntelliJ inspections and update error-prone (#6490 ) * Fix various bugs; Enable more IntelliJ inspections and update error-prone * Fix NPE * Fix inspections * Remove unused imports	2018-11-06 14:38:08 -08:00
Surekha	bcb754d066	Use current coordinator leader instead of cached one (#6551 ) (#6552 ) * Use current coordinator leader instead of cached one (#6551) Check the response status and throw exception if not OK * Modify tests * PR comment * Add the correct check for status of BytesAccumulatingResponseHandler * Move the status check into JsonParserIterator so sql query outputs meaningful message on failure * Fix tests	2018-11-06 13:09:51 -08:00
QiuMM	7b34662462	Period load/drop/broadcast rules should include the future by default (#6414 ) * Period load/drop/broadcast rules should include the future by default * address comments * adjust coordinator console and tweak docs * address comments * fix travis-ci	2018-11-01 09:43:34 -07:00
Roman Leventov	2cdce2e2a6	Add RequestLogEventBuilderFactory (#6477 ) This PR allows controlling the fields in `RequestLogEvent`, emitted in `EmittingRequestLogger`. In our case, we want to get rid of the `intervals` fields of the query objects that are a part of `DefaultRequestLogEvent`. They are enormous (thousands of segments) and not useful. Related to #5522, FYI @a2l007.	2018-10-31 22:24:37 +01:00
QiuMM	676f5e6d7f	Prohibit some guava collection APIs and use JDK collection APIs directly (#6511 ) * Prohibit some guava collection APIs and use JDK APIs directly * reset files that changed by accident * sort codestyle/druid-forbidden-apis.txt alphabetically	2018-10-29 13:02:43 +01:00
Jonathan Wei	b2d9b6f23d	Allow custom TLS cert checks (#6432 ) * Allow custom TLS cert checks * PR comment * Checkstyle, PR comment	2018-10-24 16:31:52 -07:00
QiuMM	601183b4c7	Add period drop before rule (#6415 ) * Add period drop before rule * add license header * support period drop before rule in coordinator console * address comments	2018-10-24 12:44:30 -07:00
Roman Leventov	84ac18dc1b	Catch some incorrect method parameter or call argument formatting patterns with checkstyle (#6461 ) * Catch some incorrect method parameter or call argument formatting patterns with checkstyle * Fix DiscoveryModule * Inline parameters_and_arguments.txt * Fix a bug in PolyBind * Fix formatting	2018-10-23 07:17:38 -03:00
Faxian Zhao	c5bf4e7503	update insert pending segments logic to synchronous (#6336 ) * 1. MySQL's default transaction isolation is REPEATABLE_READ; treating it as READ_COMMITTED will reduce insert id conflicts. 2. Adding an index on 'dataSource used end' works well for most scenarios (getting recent segments), and it will speed up synchronously adding pending segments in the DB. 3. 'select and insert' does not need to be within a transaction. * Use TaskLockbox.doInCriticalSection instead of synchronized syntax to speed up inserting pending segments. * fix typo for NullPointerException	2018-10-22 19:48:20 -07:00
Samarth Jain	359576a80b	Implement force push down for nested group by query (#5471 ) * Force nested query push down * Code review changes	2018-10-22 13:43:47 -07:00
QiuMM	f5f4171a45	QueryCountStatsMonitor: emit query/count (#6473 ) Let `QueryCountStatsMonitor` emit `query/count` so that I can monitor the QPS of my services; otherwise I have to count it myself.	2018-10-19 10:15:02 -03:00
patelh	c780aacc03	Add ability to specify dbcp properties file (#6419 ) * Add ability to specify dbcp properties file * Address PR comments, use mock config, remove setter * Add documentation * APRC, updated docs with example file contents * APRC, add @Nullable, @VisibleForTesting, update doc * APRC, remove error log, use props directly as jackson binding * Remove unused files	2018-10-16 12:27:19 -07:00
QiuMM	85a89e2703	make druid node bind address configurable (#6464 ) * make druid node bind address configurable * fix tests * fix travis-ci	2018-10-15 14:19:40 -07:00
Roman Leventov	aa121da25f	Use NodeType enum instead of Strings (#6377 ) * Use NodeType enum instead of Strings * Make NodeType constants uppercase * Fix CommonCacheNotifier and NodeType/ServerType comments * Reconsidering comment * Fix import * Add a comment to CommonCacheNotifier.NODE_TYPES	2018-10-14 20:49:38 -07:00
Clint Wylie	84598fba3b	combine druid-api, druid-common, java-util into druid-core (#6443 ) * combine druid-api, druid-common, java-util * spacing	2018-10-14 20:37:37 -07:00
vishnu rao	6567fff9e7	Query Response format to be based on http 'accept' header & Query Payload content type to be based on 'content-type' header (#4033 ) * o- Query Response format to be based on http 'accept' header & Query Payload content type to be based on 'content-type' header * o- Query Response format to be based on http 'accept' header & Query Payload content type to be based on 'content-type' header o- if Accept header is absent, it defaults to Content-Type header * Feature: Query Response format to be based on http 'accept' header & Query Payload content type to be based on 'content-type' PR #4033 Minor change to a comment - restoring to previous wording * Feature: Query Response format to be based on http 'accept' header & Query Payload content type to be based on 'content-type' PR #4033 o- minor change to check for empty string	2018-10-12 14:29:14 -07:00
Surekha	3be4a97150	Fix inconsistent segment size(#6448 ) (#6451 ) * Fix inconsistent segment size(#6448) * Fix the segment size for published segments * Changes to get numReplicas * Make coordinator segments API truly streaming * Changes to store partial segment data * Simplify SegmentMetadataHolder * Store partial the columns from available segments * Address comments	2018-10-12 12:55:20 -07:00
Clint Wylie	39d61b9ae5	update druid-console to 0.0.4 (#6450 )	2018-10-11 22:37:08 -06:00
David Lim	20ab213ba6	change project versions to 0.13.0-incubating-SNAPSHOT (#6453 )	2018-10-11 19:28:01 -07:00
Clint Wylie	f7775d1db3	fixes for LookupReferencesManagerTest (#6444 ) * some fixes for LookupReferencesManagerTest * docs * formatting * more formatting fixes	2018-10-10 18:02:11 -07:00
Surekha	3a0a667fe0	Introduce SystemSchema tables (#5989 ) (#6094 ) * Added SystemSchema with following tables (#5989) * SEGMENTS table provides details on served and published segments * SERVERS table provides details on data servers * SERVERSEGMENTS table is the JOIN of SEGMENTS and SERVERS * TASKS table provides details on tasks * Add documentation for system schema * Fix static-analysis warnings * Address PR comments Add unit tests Fix a test * Try to fix a test * Fix a bug around replica count * rename io.druid to org.apache.druid * Major change is to make tasks and segment queries streaming * Made tasks/segments stream to calcite instead of storing it in memory * Add num_rows to segments table * Refactor JsonParserIterator * Replace with closeable iterator * Fix docs, make num_rows column nullable, some unit test changes * make num_rows column type long, allow it to be null fix a compile error after merge, add TrafficCop param to InputStreamResponseHandler * Filter null rows for segments table from Linq4j enumerable * change num_replicas datatype to long in segments table * Fix some tests and address comments * Doc updates, other PR comments * Update tests * Address comments * Add auth check * Update docs * Refactoring * Fix teamcity warning, change the getQueryableServer in TimelineServerView * Fix compilation after rebase * Use the stream API from AuthorizationUtils * Added LeaderClient interface and NoopDruidLeaderClient class * Revert "Added LeaderClient interface and NoopDruidLeaderClient class" This reverts commit `100fa46e39`. * Make the naming consistent to server_segments for the join table * Add ForbiddenException on auth check failure * Remove static block from SystemSchema * Try to fix a test in CalciteQueryTest due to rename of server_segments * Fix the json output format in the coordinator API * Add auth check in the segments API * Add null check to avoid NPE * Use anonymous class object instead of mock for DruidLeaderClient in SqlBenchmark * Fix test failures, type long/BIGINT can be nullable * Revert long nullability to fix tests * Fix style for tests * PR comments * Address PR comments * Add the missing BytesAccumulatingResponseHandler class * Use Sequences.withBaggage in DruidPlanner * Fix docs, add comments * Close the iterator if hasNext returns false	2018-10-10 17:17:29 -07:00
Jihoon Son	88d23b77b7	Add support keepSegmentGranularity for automatic compaction (#6407 ) * Add support keepSegmentGranularity for automatic compaction * skip unknown dataSource * ignore single segment to compact * add doc * address comments * address comment	2018-10-07 16:48:58 -07:00
Jihoon Son	45aa51a00c	Add support hash partitioning by a subset of dimensions to indexTask (#6326 ) * Add support hash partitioning by a subset of dimensions to indexTask * add doc * fix style * fix test * fix doc * fix build	2018-10-06 16:45:07 -07:00
QiuMM	0b8085aff7	Prohibit jackson ObjectMapper#reader methods which are deprecated (#6386 ) * Prohibit jackson ObjectMapper#reader methods which are deprecated * address comments	2018-10-03 17:55:20 -03:00
Roman Leventov	3ae563263a	Renamed 'Generic Column' -> 'Numeric Column'; Fixed a few resource leaks in processing; misc refinements (#5957 ) This PR accumulates many refactorings and small improvements that I did while preparing the next change set of https://github.com/druid-io/druid/projects/2. I finally decided to make them a separate PR to minimize the volume of the main PR. Some of the changes: - Renamed confusing "Generic Column" term to "Numeric Column" (what it actually implies) in many class names. - Generified `ComplexMetricExtractor`	2018-10-02 14:50:22 -03:00
Jihoon Son	cb14a43038	Remove ConvertSegmentTask, HadoopConverterTask, and ConvertSegmentBackwardsCompatibleTask (#6393 ) * Remove ConvertSegmentTask, HadoopConverterTask, and ConvertSegmentBackwardsCompatibleTask * update doc and remove auto conversion * remove remaining doc * fix teamcity	2018-10-01 12:03:35 -07:00
Gian Merlino	9fa4afdb8e	URL encode datasources, task ids, authenticator names. (#5938 ) * URL encode datasources, task ids, authenticator names. * Fix URL encoding for router forwarding servlets. * Fix log-with-offset API. * Fix test. * Test adjustments. * Task client fixes. * Remove unused import.	2018-09-30 12:29:51 -07:00
Shiv Toolsidass	5a894f830b	Added backpressure metric (#6335 ) * Added backpressure metric * Updated channelReadable to AtomicBoolean and fixed broken test * Moved backpressure metric logic to NettyHttpClient * Fix placement of calculating backPressureDuration	2018-09-29 14:24:04 -07:00
Jihoon Son	122caec7b1	Add support targetCompactionSizeBytes for compactionTask (#6203 ) * Add support targetCompactionSizeBytes for compactionTask * fix test * fix a bug in keepSegmentGranularity * fix wrong noinspection comment * address comments	2018-09-28 11:16:35 -07:00
Jihoon Son	aef022de98	Fix race in taskMaster (#6388 )	2018-09-26 21:48:02 -07:00
Clint Wylie	fc1d5795c1	remove wikipedia irc firehose and dependencies from core server module to examples (#6391 )	2018-09-26 21:46:37 -07:00
Roman Leventov	8978d3751b	Don't convert DruidServer to ImmutableDruidServers multiple times in CoordinatorHistoricalManagerRunnable (#6385 )	2018-09-26 09:14:14 -07:00
Gian Merlino	a92a20e037	Fix indexes introduced in #6348 . (#6356 ) The indexes introduced in #6348 were on the wrong table. The tests did not catch them due to retries on the create table steps (the first try created the table but not the bogus indexes; the second try noticed that the table already existed and did nothing). This patch doesn't fix the issue with the tests, since the best way to do that would be to do the table and index creation in a transaction; but, this is not supported by all of our supported database engines.	2018-09-25 20:49:13 -07:00
Jihoon Son	d08c2c5eba	Make JvmThreadsMonitor injectable (#6369 )	2018-09-24 20:41:17 -07:00
Jihoon Son	99428e20d2	Deprecate dimensions / metrics APIs on brokers (#6361 ) * Deprecate dimensions / metrics APIs on brokers * add segmentMetadataQuery link * add more doc	2018-09-24 17:56:38 -07:00
Roman Leventov	9a3195e98c	Improve interning in SQLMetadataSegmentManager (#6357 ) * Improve interning in SQLMetadataSegmentManager * typo	2018-09-22 13:23:30 -07:00
Jonathan Wei	364bf9d1f9	Fix non org.apache.druid files and add package name checkstyle rule (#6367 ) * Fix non org.apache.druid files and add package name checkstyle rule * PR comment	2018-09-21 17:58:19 -07:00
Gian Merlino	e1c649e906	Add metadata indexes to help with segment allocation. (#6348 ) Segment allocation queries can take a long time (10s of seconds) when you have a lot of segments. Adding these indexes helps greatly.	2018-09-19 15:54:13 -07:00
Jonathan Wei	8972244c68	Mutual TLS support (#6076 ) * Mutual TLS support * Kafka test fixes * TeamCity fix * Split integration tests * Use localhost DOCKER_IP * Increase server thread count * Increase SSL handshake timeouts * Add broken pipe retries, use injected client config params * PR comments, Rat license check exclusion	2018-09-19 09:56:15 -07:00
Slim Bouguerra	028354eea8	Adding licenses and enable apache-rat-plugin. (#6215 ) * Adding licenses and enable apache-rat-plugin. Change-Id: I4685a2d9f1e147855dba69329b286f2d5bee3c18 * restore the copyright of demo_table and add it to the list of allowed ones Change-Id: I2a9efde6f4b984bc1ac90483e90d98e71f818a14 * review comments Change-Id: I0256c930b7f9a5bb09b44b5e7a149e6ec48cb0ca * more fixup Change-Id: I1355e8a2549e76cd44487abec142be79bec59de2 * align Change-Id: I70bc47ecb577bdf6b91639dd91b6f5642aa6b02f	2018-09-18 08:39:26 -07:00
Hongze Zhang	2fac6743d4	Add maxIdleTime option to EventReceiverFirehose (#5997 )	2018-09-17 13:50:56 -07:00
Roman Leventov	0c4bd2b57b	Prohibit some Random usage patterns (#6226 ) * Prohibit Random usage patterns * Fix FlattenJSONBenchmarkUtil	2018-09-14 13:35:51 -07:00
QiuMM	87ccee05f7	Add ability to specify list of task ports and port range (#6263 ) * support specify list of task ports * fix typos * address comments * remove druid.indexer.runner.separateIngestionEndpoint config * tweak doc * fix doc * code cleanup * keep some useful comments	2018-09-13 19:36:04 -07:00
Roman Leventov	d50b69e6d4	Prohibit LinkedList (#6112 ) * Prohibit LinkedList * Fix tests * Fix * Remove unused import	2018-09-13 18:07:06 -07:00
Clint Wylie	91a37c692d	'suspend' and 'resume' support for supervisors (kafka indexing service, materialized views) (#6234 ) * 'suspend' and 'resume' support for kafka indexing service changes: * introduces `SuspendableSupervisorSpec` interface to describe supervisors which support suspend/resume functionality controlled through the `SupervisorManager`, which will gracefully shutdown the supervisor and its tasks, update its `SupervisorSpec` with either a suspended or running state, and update with the toggled spec. Spec updates are provided by `SuspendableSupervisorSpec.createSuspendedSpec` and `SuspendableSupervisorSpec.createRunningSpec` respectively. * `KafkaSupervisorSpec` extends `SuspendableSupervisorSpec` and now supports suspend/resume functionality. The difference in behavior between 'running' and 'suspended' state is whether the supervisor will attempt to ensure that indexing tasks are or are not running respectively. Behavior is identical otherwise. * `SupervisorResource` now provides `/druid/indexer/v1/supervisor/{id}/suspend` and `/druid/indexer/v1/supervisor/{id}/resume` which are used to suspend/resume suspendable supervisors * Deprecated `/druid/indexer/v1/supervisor/{id}/shutdown` and moved its functionality to `/druid/indexer/v1/supervisor/{id}/terminate` since 'shutdown' is ambiguous verbiage for something that effectively stops a supervisor forever * Added ability to get all supervisor specs from `/druid/indexer/v1/supervisor` by supplying the 'full' query parameter `/druid/indexer/v1/supervisor?full` which will return a list of json objects of the form `{"id":<id>, "spec":<SupervisorSpec>}` * Updated overlord console ui to enable suspend/resume, and changed 'shutdown' to 'terminate' * move overlord console status to own column in supervisor table so does not look like garbage * spacing * padding * other kind of spacing * fix rebase fail * fix more better * all supervisors now suspendable, updated materialized view supervisor to support suspend, more tests * fix log	2018-09-13 14:42:18 -07:00
Gian Merlino	d6cbdf86c2	Broker backpressure. (#6313 ) * Broker backpressure. Adds a new property "druid.broker.http.maxQueuedBytes" and a new context parameter "maxQueuedBytes". Both represent a maximum number of bytes queued per query before exerting backpressure on the channel to the data server. Fixes #4933. * Fix query context doc.	2018-09-10 09:33:29 -07:00
Clint Wylie	e6e068ce60	Add support for 'maxTotalRows' to incremental publishing kafka indexing task and appenderator based realtime task (#6129 ) * resolves #5898 by adding maxTotalRows to incremental publishing kafka index task and appenderator based realtime indexing task, as available in IndexTask * address review comments * changes due to review * merge fail	2018-09-07 13:17:49 -07:00
Gian Merlino	431d3d8497	Rename io.druid to org.apache.druid. (#6266 ) * Rename io.druid to org.apache.druid. * Fix META-INF files and remove some benchmark results. * MonitorsConfig update for metrics package migration. * Reorder some dimensions in inner queries for some reason. * Fix protobuf tests.	2018-08-30 09:56:26 -07:00
Gian Merlino	cb40b6d369	Fix all inspection errors currently reported. (#6236 ) * Fix all inspection errors currently reported. TeamCity builds on master are reporting inspection errors, possibly because there was a while where it was not running due to the Apache migration, and there was some drift. * Fix one more location. * Fix tests. * Another fix.	2018-08-26 18:36:01 -06:00
Benedict Jin	3647d4c94a	Make time-related variables more readable (#6158 ) * Make time-related variables more readable * Patch some improvements from the code reviewer * Remove unnecessary boxing of Long type variables	2018-08-21 15:29:40 -07:00
Jihoon Son	2bfe1b6a5a	Fix NPE for taskGroupId when rolling update (#6168 ) * Fix NPE for taskGroupId * missing changes * fix wrong annotation * fix potential race * keep baseSequenceName * make deprecated old param	2018-08-17 10:15:45 -07:00
Samarth Jain	1c8032f9f3	Composite request logger doesn't invoke @LifeCycleStart and @LifeCycleStop methods on its dependencies (#6173 )	2018-08-17 12:34:25 -04:00
Gian Merlino	5ce3185b9c	Fix three bugs with segment publishing. (#6155 ) * Fix three bugs with segment publishing. 1. In AppenderatorImpl: always use a unique path if requested, even if the segment was already pushed. This is important because if we don't do this, it causes the issue mentioned in #6124. 2. In IndexerSQLMetadataStorageCoordinator: Fix a bug that could cause it to return a "not published" result instead of throwing an exception, when there was one metadata update failure, followed by some random exception. This is done by resetting the AtomicBoolean that tracks what case we're in, each time the callback runs. 3. In BaseAppenderatorDriver: Only kill segments if we get an affirmative false publish result. Skip killing if we just got some exception. The reason for this is that we want to avoid killing segments if they are in an unknown state. Two other changes to clarify the contracts a bit and hopefully prevent future bugs: 1. Return SegmentPublishResult from TransactionalSegmentPublisher, to make it more similar to announceHistoricalSegments. 2. Make it explicit, at multiple levels of javadocs, that a "false" publish result must indicate that the publish _definitely_ did not happen. Unknown states must be exceptions. This helps BaseAppenderatorDriver do the right thing. * Remove javadoc-only import. * Updates. * Fix test. * Fix tests.	2018-08-15 13:55:53 -07:00
Jihoon Son	ecee3e0a24	Further optimize memory for Travis jobs (#6150 ) * Further optimize memory for Travis jobs * fix build * sudo false	2018-08-10 22:03:36 -07:00
Christoph Hösler	1a37dfdcd1	Fetch unhandled curator exceptions (#6131 ) * fix: stop druid on unhandled curator exceptions * catch exceptions when stopping lifecycle	2018-08-09 21:47:42 -07:00
Jihoon Son	d6a02de5b5	Add support 'keepSegmentGranularity' for compactionTask (#6095 ) * Add keepSegmentGranularity for compactionTask * fix build * createIoConfig method * fix build * fix build * address comments * fix build	2018-08-09 13:51:20 -07:00
Gian Merlino	3525d4059e	Cache: Add maxEntrySize config, make groupBy cacheable by default. (#5108 ) * Cache: Add maxEntrySize config. The idea is this makes it more feasible to cache query types that can potentially generate large result sets, like groupBy and select, without fear of writing too much to the cache per query. Includes a refactor of cache population code in CachingQueryRunner and CachingClusteredClient, such that they now use the same CachePopulator interface with two implementations: one for foreground and one for background. The main reason for splitting the foreground / background impls is that the foreground impl can have a more effective implementation of maxEntrySize. It can stop retaining subvalues for the cache early. * Add CachePopulatorStats. * Fix whitespace. * Fix docs. * Fix various tests. * Add tests. * Fix tests. * Better tests * Remove conflict markers. * Fix licenses.	2018-08-07 10:23:15 -07:00
Jihoon Son	56ab4363ea	Native parallel batch indexing without shuffle (#5492 ) * Native parallel indexing without shuffle * fix build * fix ci * fix ingestion without intervals * fix retry * fix retry * add it test * use chat handler * fix build * add docs * fix ITUnionQueryTest * fix failures * disable metrics reporting * working * Fix split of static-s3 firehose * Add endpoints to supervisor task and a unit test for endpoints * increase timeout in test * Added doc * Address comments * Fix overlapping locks * address comments * Fix static s3 firehose * Fix test * fix build * fix test * fix typo in docs * add missing maxBytesInMemory to doc * address comments * fix race in test * fix test * Rename to ParallelIndexSupervisorTask * fix teamcity * address comments * Fix license * addressing comments * addressing comments * indexTaskClient-based segmentAllocator instead of CountingActionBasedSegmentAllocator * Fix race in TaskMonitor and move HTTP endpoints to supervisorTask from runner * Add more javadocs * use StringUtils.nonStrictFormat for logging * fix typo and remove unused class * fix tests * change package * fix strict build * tmp * Fix overlord api according to the recent change in master * Fix it test	2018-08-06 23:59:42 -07:00
Nishant Bangarwa	75c8a87ce1	Part 2 of changes for SQL Compatible Null Handling (#5958 ) * Part 2 of changes for SQL Compatible Null Handling * Review comments - break lines longer than 120 characters * review comments * review comments * fix license * fix test failure * fix CalciteQueryTest failure * Null Handling - Review comments * review comments * review comments * fix checkstyle * fix checkstyle * remove unrelated change * fix test failure * fix failing test * fix travis failures * Make StringLast and StringFirst aggregators nullable and fix travis failures	2018-08-02 08:20:25 -07:00
Jonathan Wei	b9c445c780	Optimize filtered aggs with interval filters in per-segment queries (#5857 ) * Optimize per-segment queries * Always optimize, add unit test * PR comments * Only run IntervalDimFilter optimization on __time column * PR comments * Checkstyle fix * Add test for non __time column	2018-08-01 14:39:38 -07:00
Clint Wylie	297810e7a4	log correct moved count on balance instead of snapshot of currently moving (#6032 )	2018-08-01 03:36:10 -07:00
Roman Leventov	0754d78a2e	Prohibit Lists.newArrayList() with a single argument (#6068 ) * Prohibit Lists.newArrayList() with a single argument * Test fixes * Add Javadoc to Node constructor	2018-07-31 20:09:10 -07:00
Gian Merlino	3aa7017975	Remove some unnecessary task storage internal APIs. (#6058 ) * Remove some unnecessary task storage internal APIs. - Remove MetadataStorageActionHandler's getInactiveStatusesSince and getActiveEntriesWithStatus. - Remove TaskStorage's getCreatedDateTimeAndDataSource. - Remove TaskStorageQueryAdapter's getCreatedTime, and getCreatedDateAndDataSource. - Migrated all callers to getActiveTaskInfo and getCompletedTaskInfo. This has one side effect: since getActiveTaskInfo (new) warns and continues when it sees unreadable tasks, but getActiveEntriesWithStatus threw an exception when it encountered those, it means that after this patch bad tasks will be ignored when syncing from metadata storage rather than causing an exception to be thrown. IMO, this is an improvement, since the most likely reason for bad tasks is either: - A new version introduced an additional validation, and a pre-existing task doesn't pass it. - You are rolling back from a newer version to an older version. In both cases, I believe you would want to skip tasks that can't be deserialized, rather than blocking overlord startup. * Remove unused import. * Fix formatting. * Fix formatting.	2018-07-30 18:35:06 -07:00
Benedict Jin	331a0afb98	Remove redundant type parameters and enforce some other style and inspection rules (#5980 ) * Various changes about druid-services module * Patch improvements from reviewer * Add ToArrayCallWithZeroLengthArrayArgument & ArraysAsListWithZeroOrOneArgument into inspection profile * Fix ArraysAsListWithZeroOrOneArgument * Fix conflict * Fix ToArrayCallWithZeroLengthArrayArgument * Fix AliEqualsAvoidNull * Remove blank line * Remove unused import clauses * Fix code style in TopNQueryRunnerTest * Fix conflict * Don't use Collections.singletonList when converting the type of array type * Add argLine into maven-surefire-plugin in druid-process module & increase the timeout value for testMoveSegment testcase * Roll back the latest commit * Add java.io.File#toURL() into druid-forbidden-apis * Using Boolean.parseBoolean instead of Boolean.valueOf for CliCoordinator#isOverlord * Add a new regexp element into stylecode xml file * Fix style error for new regexp * Set the level of ArraysAsListWithZeroOrOneArgument as WARNING * Fix style error for new regexp * Add option BY_LEVEL for ToArrayCallWithZeroLengthArrayArgument in inspection profile * Roll back the level as ToArrayCallWithZeroLengthArrayArgument as ERROR * Add toArray(new Object[0]) regexp into checkstyle config file & fix them * Set the level of ArraysAsListWithZeroOrOneArgument as ERROR & Roll back the level of ToArrayCallWithZeroLengthArrayArgument as WARNING until Youtrack fix it * Add a comment for string equals regexp in checkstyle config * Fix code format * Add RedundantTypeArguments as ERROR level inspection * Fix cannot resolve symbol datasource	2018-07-27 16:56:49 -05:00
Jihoon Son	1524af703d	Fix IllegalArgumentException in TaskLockBox.syncFromStorage() (#6050 )	2018-07-27 10:43:32 -07:00
kaijianding	7919e4d5df	move rangeSet compare into shardspec (#5688 )	2018-07-26 14:17:57 -07:00
Jihoon Son	5ee7b0cada	Synchronize scheduled poll() calls in SQLMetadataSegmentManager (#6041 ) Similar issue to https://github.com/apache/incubator-druid/issues/6028.	2018-07-24 22:57:30 -05:00
Roman Leventov	7d5eb0c21a	Synchronize scheduled poll() calls in SQLMetadataRuleManager to prevent flakiness in SqlMetadataRuleManagerTest (#6033 )	2018-07-24 12:00:48 -07:00
Surekha	414487a78e	Add support to filter on datasource for active tasks (#5998 ) * Add support to filter on datasource for active tasks * Added datasource filter to sql query for active tasks * Fixed unit tests * Address PR comments	2018-07-19 16:33:46 -07:00
Jihoon Son	4a2df2b23a	Log the full stack trace when an HTTP request fails (#6022 )	2018-07-19 12:05:46 -07:00
Jihoon Son	c48aa74a30	Fix NPE while handling CheckpointNotice in KafkaSupervisor (#5996 ) * Fix NPE while handling CheckpointNotice * fix code style * Fix test * fix test * add a log for creating a new taskGroup * fix backward compatibility in KafkaIOConfig	2018-07-13 17:14:57 -07:00
Clint Wylie	31c2179fe1	Coordinator fix balancer stuck (#5987 ) * this will fix it * filter destinations to not consider servers already serving segment * fix it * cleanup * fix opposite day in ImmutableDruidServer.equals * simplify	2018-07-11 20:19:11 -07:00
Clint Wylie	ac194cc082	Coordinator fix exception caused by additional logging (#5988 ) * fix explosion in curator load queue peon caused by additional logging, as well as annoying chatty log * remove log message	2018-07-11 16:13:32 -07:00
Gian Merlino	04ea3c9f8c	Update license headers. (#5976 ) * Update license headers. For compliance with http://www.apache.org/legal/src-headers.html. * More license adjustments. * Fix mistakenly edited package line.	2018-07-11 09:55:18 -07:00
Gian Merlino	948e73da77	Extend various test timeouts. (#5978 ) False failures on Travis due to spurious timeouts (in turn due to noisy neighbors) are a bigger problem than legitimate failures taking too long to time out. So it makes sense to extend timeouts.	2018-07-10 13:02:14 -07:00
Gian Merlino	24c20b4734	Forbid slashes in datasource names. (#5937 ) They are bad because datasources are used as paths on filesystems, and slashes invariably make things get stored improperly.	2018-07-05 09:49:16 -07:00
Clint Wylie	aa4987b871	change default compaction task target size from 800MB to 400MB to fall within range of what docs recommend for segment sizing (#5930 )	2018-07-05 00:12:31 -07:00
Jihoon Son	4cd14e8158	Proper handling of the exceptions from auto persisting in AppenderatorImpl.add() (#5932 )	2018-07-04 23:42:41 -07:00
Clint Wylie	39371b0ff8	More coordinator logging to help give context to load queue peon log messages (#5929 ) * more coordinator logging to help give context to load queue peon log messages * fix style * more chill load queue peon log messages	2018-07-04 23:40:25 -07:00
Clint Wylie	0a472d3fa0	Coordinator: slightly optimize load rule to skip drop if numToDrop is 0 (#5928 )	2018-07-03 17:56:11 -07:00
Clint Wylie	d5a3871864	Coordinator fix balance to try to move max segments instead of up to max segments (#5927 ) * fix move to try to move max segments instead of "up to" max segments * fix * fix oops	2018-07-03 17:06:38 -07:00
Jihoon Son	1ccabab98e	Fix the broken Appenderator contract in KafkaIndexTask (#5905 ) * Fix broken Appenderator contract in KafkaIndexTask * fix build * add publishFuture * reuse sequenceToUse if possible	2018-07-03 13:31:29 -07:00
mhshimul	867f6a9e2b	Fix SQL Server select query in createInactiveStatusesSinceQuery() method. (#5901 ) * Fix SQL Server select query in createInactiveStatusesSinceQuery() method. SQL Server does not support LIMIT N in select queries. Instead, it uses TOP N to limit the number of query results, and TOP N is already added in the select statement as per the maxNumStatuses value. * Add parentheses for TOP in SELECT statement as SQL Server no longer supports TOP without parentheses.	2018-07-03 23:16:47 +05:30
Jihoon Son	b6c957b0d2	Allow reordered segment allocation in kafka indexing service (#5805 ) * Allow reordered segment allocation in kafka indexing service * address comments * fix a bug	2018-07-02 15:09:12 -07:00
Surekha	933b25416c	Handle task deserialization failure in the tasks api (#5911 ) If task payload fails to deserialize json to Java, make the task null and handle null task in OverlordResource	2018-06-29 11:57:48 -07:00
Gian Merlino	a28314349c	Fix spelling of "propagate" in various places. (#5896 ) One of these is a configuration parameter (introduced in #5429), but it's never been in a release, so I think it's ok to rename it.	2018-06-25 09:18:08 -07:00
George Paraskevas	4b111929ec	Fix typo lage->large , improve warning message (#5890 )	2018-06-22 17:33:02 -07:00
Clint Wylie	1a7adabf57	Coordinator segment balancer max load queue fix (#5888 ) * Coordinator segment balancer will now respect "maxSegmentsInNodeLoadingQueue" config * allow moves from full load queues * better variable names	2018-06-20 23:04:41 -07:00
Niketh Sabbineni	0982472c90	Use historical node instead of realtime for querying (#4764 ) * Use historical node instead of realtime for querying * Incorporated code review comments * Incorporate code review comments * Remove artifact comment * Consider non-historical nodes as realtime	2018-06-20 22:53:56 -07:00
Surekha	8619adb5b9	Improve task retrieval APIs on Overlord (#5801 ) * Add the new tasks api in overlordResource It takes 4 optional query params * state(pending/running/waiting/compelte) * dataSource * interval (applies to completed tasks) * maxCompletedTasks (applies to completed tasks) If all params are null, the api returns all the tasks * Add the state to each task returned by tasks endpoint * divide active tasks into waiting, pending or running * Add more unit tests * Add UNKNOWN state to TaskState * Fix the authorization calls * WIP: PR comments Added new class to capture task info for caching Other refactoring * Refactoring : move TaskStatus class to druid-api so it can be accessed within server And other related classes like TaskState and TaskStatusPlus are in api * Remove unused class and apis accessing it * Add a separate cache for recently completed tasks This is to mainly capture the task type from payload * Ignore a test * Add a RuntimeTaskState to encompass all states a task can be in * Revert "Add a RuntimeTaskState to encompass all states a task can be in" This reverts commit `2a527a0731`. 
* Fix wrong api call * Fix and unignore tests * Remove waiting,pending state from TaskState * Add RunnerTaskState * Missed the annotation runnerStatusCode * Fix the creationTime * Fix the createdTime and queueInsertionTime for running/active tasks * Clean up tests * Add javadocs * Potentially fix the teamcity build * Address PR comments Get rid of TaskInfoBuilder Make TaskInfoMapper static nested class Other changes fix import in MaterializedViewSupervisor after merge * Address PR comments on * Replace global cache with local map * combine multiple queries into one * Removed unused code * Fix unit tests Fix a bug in securedTaskStatusPlus * Remove getRecentlyFinishedTaskStatuses method Change TaskInfoMapper signature to add generic type * Address PR comments * Passed datasource as argument to be used in sql query * Other minor fixes * Address PR comments Some minor changes, rename method, spacing changes Add early auth check if datasource is not null * Fix test case * Add max limit to getRecentlyFinishedTaskInfo in HeapMemoryTaskStorage * Add TaskLocation to Anytask object * Address PR comments * Fix a bug in test case causing ClassCastException	2018-06-19 11:34:59 -07:00
varaga	b4b1b2a020	Provisioning support for ZooKeeper Authorization (#5701 ) Review comments implemented	2018-06-15 14:02:01 -07:00
Jonathan Wei	dc67b77ec2	Immediately send 401 on basic HTTP authentication failure (#5856 ) * Immediately send 401 on basic HTTP authentication failure * Add unit tests	2018-06-14 10:23:10 -07:00
Jonathan Wei	24efbb054c	Fix inefficient available segment cache population in SQLMetadataSegmentManager (#5878 )	2018-06-12 18:53:30 -07:00
zhangxinyu	e43e5ebbcd	Materialized view implementation (#5556 ) * implement materialized view * modify code according to jihoonson's comments * modify code according to jihoonson's comments - 2 * add documentation about materialized view * use new HadoopTuningConfig in pr 5583 * add minDataLag and fix optimizer bug * correct value of DEFAULT_MIN_DATA_LAG_MS * modify code according to jihoonson's comments - 3 * use the boolean expression instead of if-else	2018-06-09 12:24:54 -07:00
awelsh93	6f0aedd6ab	Fix defaultQueryTimeout (#5807 ) * Fix defaultQueryTimeout - set default timeout in query context before query fail time is evaluated Remove unused import * Address failing checks * Addressing code review comments * Removed line that was no longer used	2018-06-08 15:34:10 -07:00
Hongze Zhang	cfa94b747b	Update to jetty 9.4; Enable request decompression (#5624 ) * Update to jetty 9.4; Enable request decompression; Add http compression config options * Fix BadMessageException from jetty server at HttpGenerator.generateHeaders(...)	2018-06-08 14:53:08 -07:00
awelsh93	adbe22c05b	Security - add anonymous authenticator (#5842 ) * Anonymous authenticator that authenticates all requests and then directs them to an authorizer. * Adding documentation * Removed some fields from class AnonymousAuthenticator * Updating docs	2018-06-07 10:17:54 -07:00
Jonathan Wei	684b5d18c1	Moving averages for ingestion row stats (#5748 ) * Moving averages for ingestion row stats * PR comments * Make RowIngestionMeters extensible * test and checkstyle fixes * More PR comments * Fix metrics * Add some comments * PR comments * Comments	2018-06-05 09:08:57 -07:00
Michael Schnupp	33b4eb624d	fix freeSpacePercent in segmentCache.locations (#5765 ) * fix freeSpacePercent in segmentCache.locations * the check should probably test the other way around * documentation should put the option in the right place * examples have a superfluous backslash * add test to verify correct behavior * switch to Path and test with jimfs Path allows to use different filesystems. Jimfs provides an actual (in memory) filesystem. This also allows more complex test scenarios. The behavior should be unchanged by this commit. * Revert "switch to Path and test with jimfs" This reverts commit `8b9a418d65`.	2018-05-24 11:15:30 +09:00
Atul Mohan	1b9611a60e	Local indexing from RDBMS (#5441 ) * Local indexing from RDBMS * Fix content * Remove pom changes * Remove extraneous space * Add tests and update documentation * Fix comments * Fix docs * Fix build related issue * Handle invalid strings * Make target database independent of metadata storage * Add firehose connector * Fix accessibility * Add docs * Remove unused def * Remove lazy instantiation of jsoniterator * Move unused changes * Move unused changes * Fix build * Make Sqlfirehose method private	2018-05-22 12:33:01 +09:00
Dylan Wylie	c537ea56f6	Validate dataschema datasource (#5785 ) * Validate dataschema has a datasource * Fix tests * Use Guava Strings.isNullOrEmpty * Inverse nullempty check, whoops	2018-05-18 16:29:06 -07:00
Gian Merlino	f2cc6ce4d5	VersionedIntervalTimeline: Optimize construction with heavily populated holders. (#5777 ) * VersionedIntervalTimeline: Optimize construction with heavily populated holders. Each time a segment is "add"ed to a timeline, "isComplete" is called on the holder that it is added to. "isComplete" is an O(segments per chunk) operation, meaning that adding N segments to a chunk is an O(N^2) operation. This blows up badly if we have thousands of segments per chunk. The patch defers the "isComplete" check until after all segments have been inserted. * Fix imports.	2018-05-16 09:16:59 -07:00
Jihoon Son	9dca5ec76b	Simple cleanup for ThreadPoolTaskRunner and SetAndVerifyContextQueryRunner / Add ThreadPoolTaskRunnerTest (#5557 ) * Simple fix for ThreadPoolTaskRunner * fix build * address comments * update javadoc * fix build * fix test * add dependency	2018-05-15 22:53:11 +05:30
Surekha	2f8904e25f	Check against the real default of maxBytes (1/6 max mem) in AppenderatorImpl's add (#5758 ) * The check for maxBytesInMemory should be >= 0 instead of > 0 * if the default value is 0, the actual check could be skipped * fix the message for persistReasons * Address PR comments * if maxBytes is set to -1, make it Long.MAX_VALUE, so we do not need to check if it's 0 or -1 * set the maxBytesTuningconfig in AppenderatorImpl constructor to avoid duplicate code * fix the failing test cases * Address PR comments	2018-05-09 13:41:51 -07:00
Jihoon Son	c7a59394e0	Consider waiting and pending compaction tasks as well as running tasks in DruidCoordinatorSegmentCompactor (#5704 ) * Consider waiting and pending compaction tasks as well as running tasks in DruidCoordinatorSegmentCompactor * fix build * fix logging	2018-05-08 19:03:54 -07:00
Kirill Kozlov	67d0b0ee42	Add taskType dimension to task metrics (#5664 )	2018-05-07 09:42:26 -07:00
Fokko Driesprong	a95ec92296	Move to the org.lz4 dependency (#5746 ) The net.jpountz.lz4 moved to org.lz4	2018-05-07 08:16:45 -07:00
Slim Bouguerra	8aa8d9fa5b	Kerberos Spnego Authentication Router Issue (#5706 ) * Adding decoration method to proxy servlet Change-Id: I872f9282fb60bfa20524271535980a36a87b9621 * moving the proxy request decoration to authenticators Change-Id: I7f94b9ff5ecf08e8abf7169b58bc410f33148448 * added docs Change-Id: I901543e52f0faf4666bfea6256a7c05593b1ae70 * use the authentication result to decorate request Change-Id: I052650de9cd02b4faefdbcdaf2332dd3b2966af5 * adding authenticated by name Change-Id: I074d2933460165feeddb19352eac9bd0f96f42ca * ensure that authenticator is not null Change-Id: Idb58e308f90db88224a06f3759114872165b24f5 * fix types and minor bug Change-Id: I6801d49a05d5d8324406fc0280286954eb66db10 * fix typo Change-Id: I390b12af74f44d760d0812a519125fbf0df4e97b * use actual type names Change-Id: I62c3ee763363781e52809ec912aafd50b8486b8e * set authenitcatedBy to null for AutheticationResults created by Escalator. Change-Id: I4a675c372f59ebd8a8d19c61b85a1e4bf227a8ba	2018-05-05 20:33:51 -07:00
kaijianding	c12c16385e	Support throwing on duplicate rows during realtime ingestion in RealtimePlumber (#5693 )	2018-05-04 10:12:25 -07:00
Stuart McLean	c2b5e5ec95	Default caffeine cache size (#5738 ) * add default caffeine cache size based on runtime Xmx or max 1GB * update docs for caffeine cache * fix formatting * test caffeine size should never be less than 0 * set caffeine max default size to 1G not 1M * fix caffeine cache tests	2018-05-04 09:29:11 -07:00
Surekha	13c616ba24	'maxBytesInMemory' tuningConfig introduced for ingestion tasks (#5583 ) * This commit introduces a new tuning config called 'maxBytesInMemory' for ingestion tasks Currently a config called 'maxRowsInMemory' is present which affects how much memory gets used for indexing.If this value is not optimal for your JVM heap size, it could lead to OutOfMemoryError sometimes. A lower value will lead to frequent persists which might be bad for query performance and a higher value will limit number of persists but require more jvm heap space and could lead to OOM. 'maxBytesInMemory' is an attempt to solve this problem. It limits the total number of bytes kept in memory before persisting. * The default value is 1/3(Runtime.maxMemory()) * To maintain the current behaviour set 'maxBytesInMemory' to -1 * If both 'maxRowsInMemory' and 'maxBytesInMemory' are present, both of them will be respected i.e. the first one to go above threshold will trigger persist * Fix check style and remove a comment * Add overlord unsecured paths to coordinator when using combined service (#5579) * Add overlord unsecured paths to coordinator when using combined service * PR comment * More error reporting and stats for ingestion tasks (#5418) * Add more indexing task status and error reporting * PR comments, add support in AppenderatorDriverRealtimeIndexTask * Use TaskReport instead of metrics/context * Fix tests * Use TaskReport uploads * Refactor fire department metrics retrieval * Refactor input row serde in hadoop task * Refactor hadoop task loader names * Truncate error message in TaskStatus, add errorMsg to task report * PR comments * Allow getDomain to return disjointed intervals (#5570) * Allow getDomain to return disjointed intervals * Indentation issues * Adding feature thetaSketchConstant to do some set operation in PostAgg (#5551) * Adding feature thetaSketchConstant to do some set operation in PostAggregator * Updated review comments for PR #5551 - Adding 
thetaSketchConstant * Fixed CI build issue * Updated review comments 2 for PR #5551 - Adding thetaSketchConstant * Fix taskDuration docs for KafkaIndexingService (#5572) * With incremental handoff the changed line is no longer true. * Add doc for automatic pendingSegments (#5565) * Add missing doc for automatic pendingSegments * address comments * Fix indexTask to respect forceExtendableShardSpecs (#5509) * Fix indexTask to respect forceExtendableShardSpecs * add comments * Deprecate spark2 profile in pom.xml (#5581) Deprecated due to https://github.com/druid-io/druid/pull/5382 * CompressionUtils: Add support for decompressing xz, bz2, zip. (#5586) Also switch various firehoses to the new method. Fixes #5585. * This commit introduces a new tuning config called 'maxBytesInMemory' for ingestion tasks Currently a config called 'maxRowsInMemory' is present which affects how much memory gets used for indexing.If this value is not optimal for your JVM heap size, it could lead to OutOfMemoryError sometimes. A lower value will lead to frequent persists which might be bad for query performance and a higher value will limit number of persists but require more jvm heap space and could lead to OOM. 'maxBytesInMemory' is an attempt to solve this problem. It limits the total number of bytes kept in memory before persisting. * The default value is 1/3(Runtime.maxMemory()) * To maintain the current behaviour set 'maxBytesInMemory' to -1 * If both 'maxRowsInMemory' and 'maxBytesInMemory' are present, both of them will be respected i.e. 
the first one to go above threshold will trigger persist * Address code review comments * Fix the coding style according to druid conventions * Add more javadocs * Rename some variables/methods * Other minor issues * Address more code review comments * Some refactoring to put defaults in IndexTaskUtils * Added check for maxBytesInMemory in AppenderatorImpl * Decrement bytes in abandonSegment * Test unit test for multiple sinks in single appenderator * Fix some merge conflicts after rebase * Fix some style checks * Merge conflicts * Fix failing tests Add back check for 0 maxBytesInMemory in OnHeapIncrementalIndex * Address PR comments * Put defaults for maxRows and maxBytes in TuningConfig * Change/add javadocs * Refactoring and renaming some variables/methods * Fix TeamCity inspection warnings * Added maxBytesInMemory config to HadoopTuningConfig * Updated the docs and examples * Added maxBytesInMemory config in docs * Removed references to maxRowsInMemory under tuningConfig in examples * Set maxBytesInMemory to 0 until used Set the maxBytesInMemory to 0 if user does not set it as part of tuningConfing and set to part of max jvm memory when ingestion task starts * Update toString in KafkaSupervisorTuningConfig * Use correct maxBytesInMemory value in AppenderatorImpl * Update DEFAULT_MAX_BYTES_IN_MEMORY to 1/6 max jvm memory Experimenting with various defaults, 1/3 jvm memory causes OOM * Update docs to correct maxBytesInMemory default value * Minor to rename and add comment * Add more details in docs * Address new PR comments * Address PR comments * Fix spelling typo	2018-05-03 16:25:58 -07:00
Gian Merlino	df01998213	SegmentLoadDropHandler: Fix deadlock when segments have errors loading on startup. (#5735 ) The "lock" object was used to synchronize start/stop as well as synchronize removals from segmentsToDelete (when a segment is done dropping). This could cause a deadlock if a segment-load throws an exception during loadLocalCache. loadLocalCache is run by start() while it holds the lock, but then it spawns loading threads, and those threads will try to acquire the "segmentsToDelete" lock if they want to drop a corrupt segment. I don't see any reason for these two locks to be the same lock, so I split them.	2018-05-03 09:59:01 -07:00
Jihoon Son	2c8296f94d	Fix Appenderator.push() to commit the metadata of all segments (#5730 ) * Remove persist from Appenderator * fix javadoc	2018-05-02 13:17:54 -07:00
Jihoon Son	d4311b4a5a	Support enablePathStyleAccess, disableChunkedEncoding, and forceGlobalBucketAccessEnabled for aws client (#5702 ) * Support enablePathStyleAccess and disableChunkedEncoding for aws client * add an option for forceGlobalBucketAccessEnabled * add missing doc	2018-05-02 10:45:38 -07:00
David Lim	8ec2d2fe18	Use unique segment paths for Kafka indexing (#5692 ) * support unique segment file paths * forbiddenapis * code review changes * code review changes * code review changes * checkstyle fix	2018-04-29 21:59:48 -07:00
Roman Leventov	9be000758d	Refactor index merging, replace Rowboats with RowIterators and RowPointers (#5335 ) * Refactor index merging, replace Rowboats with RowIterators and RowPointers * Add javadocs * Fix a bug in QueryableIndexIndexableAdapter * Fixes * Remove unused declarations * Remove unused GenericColumn.isNull() method * Fix test * Address comments * Rearrange some code in MergingRowIterator for more clarity * Self-review * Fix style * Improve docs * Fix docs * Rename IndexMergerV9.writeDimValueAndSetupDimConversion to setUpDimConversion() * Update Javadocs * Minor fixes * Doc fixes, more code comments, cleanup of RowCombiningTimeAndDimsIterator * Fix doc link	2018-04-27 17:34:32 -07:00
David Lim	55b003e5e8	Fix loadstatus?full double counting expected segments (#5667 ) * fix loadstatus?full double counting expected segments * remove possible flakiness from Thread.sleep() in test	2018-04-24 01:11:16 +05:30
Roman Leventov	a3a9ada843	Add GenericWhitespace checkstyle check (#5668 )	2018-04-24 01:09:14 +05:30
Jihoon Son	ca3f833426	Fix coordinator's dataSource api with full parameter (#5662 ) * Fix coordinator's dataSource api with full parameter * address comment * Add a constructor for json serde and fix result order * Change to immutableSortedMap * Revert immutableSortedMap to treeMap	2018-04-19 17:41:53 -07:00
Kirill Kozlov	a7ba2bf275	Detailed error message when unable to create temp dir (#5648 )	2018-04-17 15:12:46 -07:00
Jonathan Wei	d0b66a6af5	Fix HTTP OPTIONS request auth handling (#5638 ) * Fix HTTP OPTIONS request auth handling * PR comment * More PR comments * Fix * PR comment	2018-04-16 18:09:56 -07:00
Jonathan Wei	882b172318	Revert "Fix HTTP OPTIONS request auth handling (#5615 )" (#5637 ) This reverts commit `df51a7bcb7`.	2018-04-12 16:43:54 -07:00
Jonathan Wei	e91add6843	Fix coordinator loadStatus performance (#5632 ) * Optimize coordinator loadStatus * Add comment * Fix teamcity * Checkstyle * More checkstyle * Checkstyle	2018-04-12 15:07:52 -07:00
Jonathan Wei	df51a7bcb7	Fix HTTP OPTIONS request auth handling (#5615 ) * Fix HTTP OPTIONS request auth handling * Flip configuration boolean	2018-04-12 14:02:20 -07:00
Gian Merlino	d0400a0688	SegmentWithState: Add toString method. (#5635 ) The class appears in log messages, and the default toString method isn't very informative.	2018-04-12 14:01:09 -05:00
palanieppan-m	dbea5cb9b7	Load rules should honor partial overlap (#5595 ) Load rules should load segments that partially overlap with rule window, instead of loading only segments that fully overlap.	2018-04-12 09:46:00 -07:00
Atul Mohan	19f359957f	Add getters for AlertEvent (#5522 ) * Add getters for AlertEvent * Move PublicApi and ExtensionPoint to java-util * Fix publicapi annotation usage * Add publicapi annotations to ServiceMetricEvent and RequestLogEvent	2018-04-12 23:38:20 +07:00
Nishant Bangarwa	e6efd75a3d	Add config to allow setting up custom unsecured paths for druid nodes. (#5614 ) * Add config to allow setting up custom unsecured paths for druid nodes. * return all resources for Unsecured paths * review comment - Add test * fix tests * fix test	2018-04-11 17:10:07 -07:00
Clint Wylie	ea4f8544fb	revert lambda conversion to fix occasional jvm error (#5591 )	2018-04-06 14:18:55 -07:00
Gian Merlino	5ab17668c0	CompressionUtils: Add support for decompressing xz, bz2, zip. (#5586 ) Also switch various firehoses to the new method. Fixes #5585.	2018-04-06 08:06:45 -07:00
Niketh Sabbineni	270fd1ea15	Allow getDomain to return disjointed intervals (#5570 ) * Allow getDomain to return disjointed intervals * Indentation issues	2018-04-05 22:12:30 -07:00
Jonathan Wei	969342cd28	More error reporting and stats for ingestion tasks (#5418 ) * Add more indexing task status and error reporting * PR comments, add support in AppenderatorDriverRealtimeIndexTask * Use TaskReport instead of metrics/context * Fix tests * Use TaskReport uploads * Refactor fire department metrics retrieval * Refactor input row serde in hadoop task * Refactor hadoop task loader names * Truncate error message in TaskStatus, add errorMsg to task report * PR comments	2018-04-05 21:38:57 -07:00
Niketh Sabbineni	f0a94f5035	Remove unused config (#5564 ) * Remove unused config * Fix failing tests	2018-04-03 13:23:46 -07:00
Clint Wylie	f31dba6c5b	Coordinator drop segment selection through cost balancer (#5529 ) * drop selection through cost balancer * use collections.emptyIterator * add test to ensure does not drop from server with larger loading queue with cost balancer * javadocs and comments to clear things up * random drop for completeness	2018-04-03 11:22:51 -07:00
Clint Wylie	a81ae99021	add 'stopped' check and handling to HttpLoadQueuePeon load and drop segment methods (#5555 ) * add stopped check and handling to HttpLoadQueuePeon load and drop segment methods * fix unrelated timeout :( * revert unintended change * PR feedback: change logging * fix dumb	2018-04-03 11:21:52 -07:00
Clint Wylie	6feac204e3	Coordinator primary segment assignment fix (#5532 ) * fix issue where assign primary assigns segments to all historical servers in cluster * fix test * add test to ensure primary assignment will not assign to another server while loading is in progress	2018-04-02 09:40:20 -07:00
Jihoon Son	05547e29b2	Fix SQLMetadataSegmentManager to allow successive start and stop (#5554 ) * Fix SQLMetadataSegmentManager to allow successive start and stop * address comment * add synchronization	2018-03-30 12:43:19 -07:00

3648 Commits