druid

mirror of https://github.com/apache/druid.git synced 2025-02-08 02:58:30 +00:00

Author	SHA1	Message	Date
Jihoon Son	ecee3e0a24	Further optimize memory for Travis jobs (#6150 ) * Further optimize memory for Travis jobs * fix build * sudo false	2018-08-10 22:03:36 -07:00
Christoph Hösler	1a37dfdcd1	Fetch unhandled curator exceptions (#6131 ) * fix: stop druid on unhandled curator exceptions * catch exceptions when stopping lifecycle	2018-08-09 21:47:42 -07:00
Jihoon Son	d6a02de5b5	Add support 'keepSegmentGranularity' for compactionTask (#6095 ) * Add keepSegmentGranularity for compactionTask * fix build * createIoConfig method * fix build * fix build * address comments * fix build	2018-08-09 13:51:20 -07:00
Gian Merlino	3525d4059e	Cache: Add maxEntrySize config, make groupBy cacheable by default. (#5108 ) * Cache: Add maxEntrySize config. The idea is this makes it more feasible to cache query types that can potentially generate large result sets, like groupBy and select, without fear of writing too much to the cache per query. Includes a refactor of cache population code in CachingQueryRunner and CachingClusteredClient, such that they now use the same CachePopulator interface with two implementations: one for foreground and one for background. The main reason for splitting the foreground / background impls is that the foreground impl can have a more effective implementation of maxEntrySize. It can stop retaining subvalues for the cache early. * Add CachePopulatorStats. * Fix whitespace. * Fix docs. * Fix various tests. * Add tests. * Fix tests. * Better tests * Remove conflict markers. * Fix licenses.	2018-08-07 10:23:15 -07:00
Jihoon Son	56ab4363ea	Native parallel batch indexing without shuffle (#5492 ) * Native parallel indexing without shuffle * fix build * fix ci * fix ingestion without intervals * fix retry * fix retry * add it test * use chat handler * fix build * add docs * fix ITUnionQueryTest * fix failures * disable metrics reporting * working * Fix split of static-s3 firehose * Add endpoints to supervisor task and a unit test for endpoints * increase timeout in test * Added doc * Address comments * Fix overlapping locks * address comments * Fix static s3 firehose * Fix test * fix build * fix test * fix typo in docs * add missing maxBytesInMemory to doc * address comments * fix race in test * fix test * Rename to ParallelIndexSupervisorTask * fix teamcity * address comments * Fix license * addressing comments * addressing comments * indexTaskClient-based segmentAllocator instead of CountingActionBasedSegmentAllocator * Fix race in TaskMonitor and move HTTP endpoints to supervisorTask from runner * Add more javadocs * use StringUtils.nonStrictFormat for logging * fix typo and remove unused class * fix tests * change package * fix strict build * tmp * Fix overlord api according to the recent change in master * Fix it test	2018-08-06 23:59:42 -07:00
Nishant Bangarwa	75c8a87ce1	Part 2 of changes for SQL Compatible Null Handling (#5958 ) * Part 2 of changes for SQL Compatible Null Handling * Review comments - break lines longer than 120 characters * review comments * review comments * fix license * fix test failure * fix CalciteQueryTest failure * Null Handling - Review comments * review comments * review comments * fix checkstyle * fix checkstyle * remove unrelated change * fix test failure * fix failing test * fix travis failures * Make StringLast and StringFirst aggregators nullable and fix travis failures	2018-08-02 08:20:25 -07:00
Jonathan Wei	b9c445c780	Optimize filtered aggs with interval filters in per-segment queries (#5857 ) * Optimize per-segment queries * Always optimize, add unit test * PR comments * Only run IntervalDimFilter optimization on __time column * PR comments * Checkstyle fix * Add test for non __time column	2018-08-01 14:39:38 -07:00
Clint Wylie	297810e7a4	log correct moved count on balance instead of snapshot of currently moving (#6032 )	2018-08-01 03:36:10 -07:00
Roman Leventov	0754d78a2e	Prohibit Lists.newArrayList() with a single argument (#6068 ) * Prohibit Lists.newArrayList() with a single argument * Test fixes * Add Javadoc to Node constructor	2018-07-31 20:09:10 -07:00
Gian Merlino	3aa7017975	Remove some unnecessary task storage internal APIs. (#6058 ) * Remove some unnecessary task storage internal APIs. - Remove MetadataStorageActionHandler's getInactiveStatusesSince and getActiveEntriesWithStatus. - Remove TaskStorage's getCreatedDateTimeAndDataSource. - Remove TaskStorageQueryAdapter's getCreatedTime, and getCreatedDateAndDataSource. - Migrated all callers to getActiveTaskInfo and getCompletedTaskInfo. This has one side effect: since getActiveTaskInfo (new) warns and continues when it sees unreadable tasks, but getActiveEntriesWithStatus threw an exception when it encountered those, it means that after this patch bad tasks will be ignored when syncing from metadata storage rather than causing an exception to be thrown. IMO, this is an improvement, since the most likely reason for bad tasks is either: - A new version introduced an additional validation, and a pre-existing task doesn't pass it. - You are rolling back from a newer version to an older version. In both cases, I believe you would want to skip tasks that can't be deserialized, rather than blocking overlord startup. * Remove unused import. * Fix formatting. * Fix formatting.	2018-07-30 18:35:06 -07:00
Benedict Jin	331a0afb98	Remove redundant type parameters and enforce some other style and inspection rules (#5980 ) * Various changes about druid-services module * Patch improvements from reviewer * Add ToArrayCallWithZeroLengthArrayArgument & ArraysAsListWithZeroOrOneArgument into inspection profile * Fix ArraysAsListWithZeroOrOneArgument * Fix conflict * Fix ToArrayCallWithZeroLengthArrayArgument * Fix AliEqualsAvoidNull * Remove blank line * Remove unused import clauses * Fix code style in TopNQueryRunnerTest * Fix conflict * Don't use Collections.singletonList when converting the type of array type * Add argLine into maven-surefire-plugin in druid-process module & increase the timeout value for testMoveSegment testcase * Roll back the latest commit * Add java.io.File#toURL() into druid-forbidden-apis * Using Boolean.parseBoolean instead of Boolean.valueOf for CliCoordinator#isOverlord * Add a new regexp element into stylecode xml file * Fix style error for new regexp * Set the level of ArraysAsListWithZeroOrOneArgument as WARNING * Fix style error for new regexp * Add option BY_LEVEL for ToArrayCallWithZeroLengthArrayArgument in inspection profile * Roll back the level as ToArrayCallWithZeroLengthArrayArgument as ERROR * Add toArray(new Object[0]) regexp into checkstyle config file & fix them * Set the level of ArraysAsListWithZeroOrOneArgument as ERROR & Roll back the level of ToArrayCallWithZeroLengthArrayArgument as WARNING until Youtrack fix it * Add a comment for string equals regexp in checkstyle config * Fix code format * Add RedundantTypeArguments as ERROR level inspection * Fix cannot resolve symbol datasource	2018-07-27 16:56:49 -05:00
Jihoon Son	1524af703d	Fix IllegalArgumentException in TaskLockBox.syncFromStorage() (#6050 )	2018-07-27 10:43:32 -07:00
kaijianding	7919e4d5df	move rangeSet compare into shardspec (#5688 )	2018-07-26 14:17:57 -07:00
Jihoon Son	5ee7b0cada	Synchronize scheduled poll() calls in SQLMetadataSegmentManager (#6041 ) Similar issue to https://github.com/apache/incubator-druid/issues/6028.	2018-07-24 22:57:30 -05:00
Roman Leventov	7d5eb0c21a	Synchronize scheduled poll() calls in SQLMetadataRuleManager to prevent flakiness in SqlMetadataRuleManagerTest (#6033 )	2018-07-24 12:00:48 -07:00
Surekha	414487a78e	Add support to filter on datasource for active tasks (#5998 ) * Add support to filter on datasource for active tasks * Added datasource filter to sql query for active tasks * Fixed unit tests * Address PR comments	2018-07-19 16:33:46 -07:00
Jihoon Son	4a2df2b23a	Log the full stack trace when an HTTP request fails (#6022 )	2018-07-19 12:05:46 -07:00
Jihoon Son	c48aa74a30	Fix NPE while handling CheckpointNotice in KafkaSupervisor (#5996 ) * Fix NPE while handling CheckpointNotice * fix code style * Fix test * fix test * add a log for creating a new taskGroup * fix backward compatibility in KafkaIOConfig	2018-07-13 17:14:57 -07:00
Clint Wylie	31c2179fe1	Coordinator fix balancer stuck (#5987 ) * this will fix it * filter destinations to not consider servers already serving segment * fix it * cleanup * fix opposite day in ImmutableDruidServer.equals * simplify	2018-07-11 20:19:11 -07:00
Clint Wylie	ac194cc082	Coordinator fix exception caused by additional logging (#5988 ) * fix explosion in curator load queue peon caused by additional logging, as well as annoying chatty log * remove log message	2018-07-11 16:13:32 -07:00
Gian Merlino	04ea3c9f8c	Update license headers. (#5976 ) * Update license headers. For compliance with http://www.apache.org/legal/src-headers.html. * More license adjustments. * Fix mistakenly edited package line.	2018-07-11 09:55:18 -07:00
Gian Merlino	948e73da77	Extend various test timeouts. (#5978 ) False failures on Travis due to spurious timeout (in turn due to noisy neighbors) is a bigger problem than legitimate failures taking too long to time out. So it makes sense to extend timeouts.	2018-07-10 13:02:14 -07:00
Gian Merlino	24c20b4734	Forbid slashes in datasource names. (#5937 ) They are bad because datasources are used as paths on filesystems, and slashes invariably make things get stored improperly.	2018-07-05 09:49:16 -07:00
Clint Wylie	aa4987b871	change default compaction task target size from 800MB to 400MB to fall within range of what docs recommend for segment sizing (#5930 )	2018-07-05 00:12:31 -07:00
Jihoon Son	4cd14e8158	Proper handling of the exceptions from auto persisting in AppenderatorImpl.add() (#5932 )	2018-07-04 23:42:41 -07:00
Clint Wylie	39371b0ff8	More coordinator logging to help give context to load queue peon log messages (#5929 ) * more coordinator logging to help give context to load queue peon log messages * fix style * more chill load queue peon log messages	2018-07-04 23:40:25 -07:00
Clint Wylie	0a472d3fa0	coordinator slight optimize load rule to skip drop if numToDrop is 0 (#5928 )	2018-07-03 17:56:11 -07:00
Clint Wylie	d5a3871864	Coordinator fix balance to try to move max segments instead of up to max segments (#5927 ) * fix move to try to move max segments instead of "up to" max segments * fix * fix oops	2018-07-03 17:06:38 -07:00
Jihoon Son	1ccabab98e	Fix the broken Appenderator contract in KafkaIndexTask (#5905 ) * Fix broken Appenderator contract in KafkaIndexTask * fix build * add publishFuture * reuse sequenceToUse if possible	2018-07-03 13:31:29 -07:00
mhshimul	867f6a9e2b	Fix SQL Server select query in createInactiveStatusesSinceQuery() method. (#5901 ) * Fix SQL Server select query in createInactiveStatusesSinceQuery() method. SQL Server does not support LIMIT N in select queries. Instead it has TOP N to limit the number of query results. And TOP N is already added in the select statement as per maxNumStatuses value. * Add parentheses for TOP in SELECT statement as SQL Server no longer supports TOP without parentheses.	2018-07-03 23:16:47 +05:30
Jihoon Son	b6c957b0d2	Allow reordered segment allocation in kafka indexing service (#5805 ) * Allow reordered segment allocation in kafka indexing service * address comments * fix a bug	2018-07-02 15:09:12 -07:00
Surekha	933b25416c	Handle task deserialization failure in the tasks api (#5911 ) If task payload fails to deserialize json to Java, make the task null and handle null task in OverlordResource	2018-06-29 11:57:48 -07:00
Gian Merlino	a28314349c	Fix spelling of "propagate" in various places. (#5896 ) One of these is a configuration parameter (introduced in #5429), but it's never been in a release, so I think it's ok to rename it.	2018-06-25 09:18:08 -07:00
George Paraskevas	4b111929ec	Fix typo lage->large , improve warning message (#5890 )	2018-06-22 17:33:02 -07:00
Clint Wylie	1a7adabf57	Coordinator segment balancer max load queue fix (#5888 ) * Coordinator segment balancer will now respect "maxSegmentsInNodeLoadingQueue" config * allow moves from full load queues * better variable names	2018-06-20 23:04:41 -07:00
Niketh Sabbineni	0982472c90	Use historical node instead of realtime for querying (#4764 ) * Use historical node instead of realtime for querying * Incorporated code review comments * Incorporate code review comments * Remove artifact comment * Consider non-historical nodes as realtime	2018-06-20 22:53:56 -07:00
Surekha	8619adb5b9	Improve task retrieval APIs on Overlord (#5801 ) * Add the new tasks api in overlordResource It takes 4 optional query params * state(pending/running/waiting/complete) * dataSource * interval (applies to completed tasks) * maxCompletedTasks (applies to completed tasks) If all params are null, the api returns all the tasks * Add the state to each task returned by tasks endpoint * divide active tasks into waiting, pending or running * Add more unit tests * Add UNKNOWN state to TaskState * Fix the authorization calls * WIP: PR comments Added new class to capture task info for caching Other refactoring * Refactoring : move TaskStatus class to druid-api so it can be accessed within server And other related classes like TaskState and TaskStatusPlus are in api * Remove unused class and apis accessing it * Add a separate cache for recently completed tasks This is to mainly capture the task type from payload * Ignore a test * Add a RuntimeTaskState to encompass all states a task can be in * Revert "Add a RuntimeTaskState to encompass all states a task can be in" This reverts commit 2a527a0731a064dc0f15cf2ba3dfc5f639c6e182. * Fix wrong api call * Fix and unignore tests * Remove waiting,pending state from TaskState * Add RunnerTaskState * Missed the annotation runnerStatusCode * Fix the creationTime * Fix the createdTime and queueInsertionTime for running/active tasks * Clean up tests * Add javadocs * Potentially fix the teamcity build * Address PR comments Get rid of TaskInfoBuilder Make TaskInfoMapper static nested class Other changes fix import in MaterializedViewSupervisor after merge * Address PR comments on * Replace global cache with local map * combine multiple queries into one * Removed unused code * Fix unit tests Fix a bug in securedTaskStatusPlus * Remove getRecentlyFinishedTaskStatuses method Change TaskInfoMapper signature to add generic type * Address PR comments * Passed datasource as argument to be used in sql query * Other minor fixes * Address PR comments Some minor changes, rename method, spacing changes Add early auth check if datasource is not null * Fix test case * Add max limit to getRecentlyFinishedTaskInfo in HeapMemoryTaskStorage * Add TaskLocation to Anytask object * Address PR comments * Fix a bug in test case causing ClassCastException	2018-06-19 11:34:59 -07:00
varaga	b4b1b2a020	Provisioning support for ZooKeeper Authorization (#5701 ) Review comments implemented	2018-06-15 14:02:01 -07:00
Jonathan Wei	dc67b77ec2	Immediately send 401 on basic HTTP authentication failure (#5856 ) * Immediately send 401 on basic HTTP authentication failure * Add unit tests	2018-06-14 10:23:10 -07:00
Jonathan Wei	24efbb054c	Fix inefficient available segment cache population in SQLMetadataSegmentManager (#5878 )	2018-06-12 18:53:30 -07:00
zhangxinyu	e43e5ebbcd	Materialized view implementation (#5556 ) * implement materialized view * modify code according to jihoonson's comments * modify code according to jihoonson's comments - 2 * add documentation about materialized view * use new HadoopTuningConfig in pr 5583 * add minDataLag and fix optimizer bug * correct value of DEFAULT_MIN_DATA_LAG_MS * modify code according to jihoonson's comments - 3 * use the boolean expression instead of if-else	2018-06-09 12:24:54 -07:00
awelsh93	6f0aedd6ab	Fix defaultQueryTimeout (#5807 ) * Fix defaultQueryTimeout - set default timeout in query context before query fail time is evaluated Remove unused import * Address failing checks * Addressing code review comments * Removed line that was no longer used	2018-06-08 15:34:10 -07:00
Hongze Zhang	cfa94b747b	Update to jetty 9.4; Enable request decompression (#5624 ) * Update to jetty 9.4; Enable request decompression; Add http compression config options * Fix BadMessageException from jetty server at HttpGenerator.generateHeaders(...)	2018-06-08 14:53:08 -07:00
awelsh93	adbe22c05b	Security - add anonymous authenticator (#5842 ) * Anonymous authenticator that authenticates all requests and then directs them to an authorizer. * Adding documentation * Removed some fields from class AnonymousAuthenticator * Updating docs	2018-06-07 10:17:54 -07:00
Jonathan Wei	684b5d18c1	Moving averages for ingestion row stats (#5748 ) * Moving averages for ingestion row stats * PR comments * Make RowIngestionMeters extensible * test and checkstyle fixes * More PR comments * Fix metrics * Add some comments * PR comments * Comments	2018-06-05 09:08:57 -07:00
Michael Schnupp	33b4eb624d	fix freeSpacePercent in segmentCache.locations (#5765 ) * fix freeSpacePercent in segmentCache.locations * the check should probably test the other way around * documentation should put the option in the right place * examples have a superfluous backslash * add test to verify correct behavior * switch to Path and test with jimfs Path allows to use different filesystems. Jimfs provides an actual (in memory) filesystem. This also allows more complex test scenarios. The behavior should be unchanged by this commit. * Revert "switch to Path and test with jimfs" This reverts commit 8b9a418d65a42a3adb87756967e780442484a9d9.	2018-05-24 11:15:30 +09:00
Atul Mohan	1b9611a60e	Local indexing from RDBMS (#5441 ) * Local indexing from RDBMS * Fix content * Remove pom changes * Remove extraneous space * Add tests and update documentation * Fix comments * Fix docs * Fix build related issue * Handle invalid strings * Make target database independent of metadata storage * Add firehose connector * Fix accessibility * Add docs * Remove unused def * Remove lazy instantiation of jsoniterator * Move unused changes * Move unused changes * Fix build * Make Sqlfirehose method private	2018-05-22 12:33:01 +09:00
Dylan Wylie	c537ea56f6	Validate dataschema datasource (#5785 ) * Validate dataschema has a datasource * Fix tests * Use Guava Strings.isNullOrEmpty * Inverse nullempty check, whoops	2018-05-18 16:29:06 -07:00
Gian Merlino	f2cc6ce4d5	VersionedIntervalTimeline: Optimize construction with heavily populated holders. (#5777 ) * VersionedIntervalTimeline: Optimize construction with heavily populated holders. Each time a segment is "add"ed to a timeline, "isComplete" is called on the holder that it is added to. "isComplete" is an O(segments per chunk) operation, meaning that adding N segments to a chunk is an O(N^2) operation. This blows up badly if we have thousands of segments per chunk. The patch defers the "isComplete" check until after all segments have been inserted. * Fix imports.	2018-05-16 09:16:59 -07:00
Jihoon Son	9dca5ec76b	Simple cleanup for ThreadPoolTaskRunner and SetAndVerifyContextQueryRunner / Add ThreadPoolTaskRunnerTest (#5557 ) * Simple fix for ThreadPoolTaskRunner * fix build * address comments * update javadoc * fix build * fix test * add dependency	2018-05-15 22:53:11 +05:30

3182 Commits