druid


Author	SHA1	Message	Date
Roman Leventov	3ae563263a	Renamed 'Generic Column' -> 'Numeric Column'; Fixed a few resource leaks in processing; misc refinements (#5957 ) This PR accumulates many refactorings and small improvements that I did while preparing the next change set of https://github.com/druid-io/druid/projects/2. I finally decided to make them a separate PR to minimize the volume of the main PR. Some of the changes: - Renamed confusing "Generic Column" term to "Numeric Column" (what it actually implies) in many class names. - Generified `ComplexMetricExtractor`	2018-10-02 14:50:22 -03:00
Jihoon Son	122caec7b1	Add support targetCompactionSizeBytes for compactionTask (#6203 ) * Add support targetCompactionSizeBytes for compactionTask * fix test * fix a bug in keepSegmentGranularity * fix wrong noinspection comment * address comments	2018-09-28 11:16:35 -07:00
Roman Leventov	0c4bd2b57b	Prohibit some Random usage patterns (#6226 ) * Prohibit Random usage patterns * Fix FlattenJSONBenchmarkUtil	2018-09-14 13:35:51 -07:00
Gian Merlino	431d3d8497	Rename io.druid to org.apache.druid. (#6266 ) * Rename io.druid to org.apache.druid. * Fix META-INF files and remove some benchmark results. * MonitorsConfig update for metrics package migration. * Reorder some dimensions in inner queries for some reason. * Fix protobuf tests.	2018-08-30 09:56:26 -07:00
Benedict Jin	3647d4c94a	Make time-related variables more readable (#6158 ) * Make time-related variables more readable * Patch some improvements from the code reviewer * Remove unnecessary boxing of Long type variables	2018-08-21 15:29:40 -07:00
Gian Merlino	4d2ff0f6c7	Serde test for JdbcExtractionNamespace. (#6186 )	2018-08-17 11:54:06 -04:00
Jihoon Son	ecee3e0a24	Further optimize memory for Travis jobs (#6150 ) * Further optimize memory for Travis jobs * fix build * sudo false	2018-08-10 22:03:36 -07:00
Jihoon Son	56ab4363ea	Native parallel batch indexing without shuffle (#5492 ) * Native parallel indexing without shuffle * fix build * fix ci * fix ingestion without intervals * fix retry * fix retry * add it test * use chat handler * fix build * add docs * fix ITUnionQueryTest * fix failures * disable metrics reporting * working * Fix split of static-s3 firehose * Add endpoints to supervisor task and a unit test for endpoints * increase timeout in test * Added doc * Address comments * Fix overlapping locks * address comments * Fix static s3 firehose * Fix test * fix build * fix test * fix typo in docs * add missing maxBytesInMemory to doc * address comments * fix race in test * fix test * Rename to ParallelIndexSupervisorTask * fix teamcity * address comments * Fix license * addressing comments * addressing comments * indexTaskClient-based segmentAllocator instead of CountingActionBasedSegmentAllocator * Fix race in TaskMonitor and move HTTP endpoints to supervisorTask from runner * Add more javadocs * use StringUtils.nonStrictFormat for logging * fix typo and remove unused class * fix tests * change package * fix strict build * tmp * Fix overlord api according to the recent change in master * Fix it test	2018-08-06 23:59:42 -07:00
Nishant Bangarwa	75c8a87ce1	Part 2 of changes for SQL Compatible Null Handling (#5958 ) * Part 2 of changes for SQL Compatible Null Handling * Review comments - break lines longer than 120 characters * review comments * review comments * fix license * fix test failure * fix CalciteQueryTest failure * Null Handling - Review comments * review comments * review comments * fix checkstyle * fix checkstyle * remove unrelated change * fix test failure * fix failing test * fix travis failures * Make StringLast and StringFirst aggregators nullable and fix travis failures	2018-08-02 08:20:25 -07:00
Roman Leventov	0754d78a2e	Prohibit Lists.newArrayList() with a single argument (#6068 ) * Prohibit Lists.newArrayList() with a single argument * Test fixes * Add Javadoc to Node constructor	2018-07-31 20:09:10 -07:00
Gian Merlino	3aa7017975	Remove some unnecessary task storage internal APIs. (#6058 ) * Remove some unnecessary task storage internal APIs. - Remove MetadataStorageActionHandler's getInactiveStatusesSince and getActiveEntriesWithStatus. - Remove TaskStorage's getCreatedDateTimeAndDataSource. - Remove TaskStorageQueryAdapter's getCreatedTime, and getCreatedDateAndDataSource. - Migrated all callers to getActiveTaskInfo and getCompletedTaskInfo. This has one side effect: since getActiveTaskInfo (new) warns and continues when it sees unreadable tasks, but getActiveEntriesWithStatus threw an exception when it encountered those, it means that after this patch bad tasks will be ignored when syncing from metadata storage rather than causing an exception to be thrown. IMO, this is an improvement, since the most likely reason for bad tasks is either: - A new version introduced an additional validation, and a pre-existing task doesn't pass it. - You are rolling back from a newer version to an older version. In both cases, I believe you would want to skip tasks that can't be deserialized, rather than blocking overlord startup. * Remove unused import. * Fix formatting. * Fix formatting.	2018-07-30 18:35:06 -07:00
Benedict Jin	331a0afb98	Remove redundant type parameters and enforce some other style and inspection rules (#5980 ) * Various changes about druid-services module * Patch improvements from reviewer * Add ToArrayCallWithZeroLengthArrayArgument & ArraysAsListWithZeroOrOneArgument into inspection profile * Fix ArraysAsListWithZeroOrOneArgument * Fix conflict * Fix ToArrayCallWithZeroLengthArrayArgument * Fix AliEqualsAvoidNull * Remove blank line * Remove unused import clauses * Fix code style in TopNQueryRunnerTest * Fix conflict * Don't use Collections.singletonList when converting the type of array type * Add argLine into maven-surefire-plugin in druid-process module & increase the timeout value for testMoveSegment testcase * Roll back the latest commit * Add java.io.File#toURL() into druid-forbidden-apis * Using Boolean.parseBoolean instead of Boolean.valueOf for CliCoordinator#isOverlord * Add a new regexp element into stylecode xml file * Fix style error for new regexp * Set the level of ArraysAsListWithZeroOrOneArgument as WARNING * Fix style error for new regexp * Add option BY_LEVEL for ToArrayCallWithZeroLengthArrayArgument in inspection profile * Roll back the level as ToArrayCallWithZeroLengthArrayArgument as ERROR * Add toArray(new Object[0]) regexp into checkstyle config file & fix them * Set the level of ArraysAsListWithZeroOrOneArgument as ERROR & Roll back the level of ToArrayCallWithZeroLengthArrayArgument as WARNING until Youtrack fix it * Add a comment for string equals regexp in checkstyle config * Fix code format * Add RedundantTypeArguments as ERROR level inspection * Fix cannot resolve symbol datasource	2018-07-27 16:56:49 -05:00
Surekha	414487a78e	Add support to filter on datasource for active tasks (#5998 ) * Add support to filter on datasource for active tasks * Added datasource filter to sql query for active tasks * Fixed unit tests * Address PR comments	2018-07-19 16:33:46 -07:00
Gian Merlino	04ea3c9f8c	Update license headers. (#5976 ) * Update license headers. For compliance with http://www.apache.org/legal/src-headers.html. * More license adjustments. * Fix mistakenly edited package line.	2018-07-11 09:55:18 -07:00
Gian Merlino	948e73da77	Extend various test timeouts. (#5978 ) False failures on Travis due to spurious timeout (in turn due to noisy neighbors) is a bigger problem than legitimate failures taking too long to time out. So it makes sense to extend timeouts.	2018-07-10 13:02:14 -07:00
Jihoon Son	d1d9358274	Increase timeout for BlockingPoolTest (#5959 )	2018-07-06 16:34:53 -07:00
Jihoon Son	4cd14e8158	Proper handling of the exceptions from auto persisting in AppenderatorImpl.add() (#5932 )	2018-07-04 23:42:41 -07:00
Surekha	8619adb5b9	Improve task retrieval APIs on Overlord (#5801 ) * Add the new tasks api in overlordResource It takes 4 optional query params * state(pending/running/waiting/compelte) * dataSource * interval (applies to completed tasks) * maxCompletedTasks (applies to completed tasks) If all params are null, the api returns all the tasks * Add the state to each task returned by tasks endpoint * divide active tasks into waiting, pending or running * Add more unit tests * Add UNKNOWN state to TaskState * Fix the authorization calls * WIP: PR comments Added new class to capture task info for caching Other refactoring * Refactoring : move TaskStatus class to druid-api so it can be accessed within server And other related classes like TaskState and TaskStatusPlus are in api * Remove unused class and apis accessing it * Add a separate cache for recently completed tasks This is to mainly capture the task type from payload * Ignore a test * Add a RuntimeTaskState to encompass all states a task can be in * Revert "Add a RuntimeTaskState to encompass all states a task can be in" This reverts commit `2a527a0731`. * Fix wrong api call * Fix and unignore tests * Remove waiting,pending state from TaskState * Add RunnerTaskState * Missed the annotation runnerStatusCode * Fix the creationTime * Fix the createdTime and queueInsertionTime for running/active tasks * Clean up tests * Add javadocs * Potentially fix the teamcity build * Address PR comments Get rid of TaskInfoBuilder Make TaskInfoMapper static nested class Other changes fix import in MaterializedViewSupervisor after merge * Address PR comments on * Replace global cache with local map * combine multiple queries into one * Removed unused code * Fix unit tests Fix a bug in securedTaskStatusPlus * Remove getRecentlyFinishedTaskStatuses method Change TaskInfoMapper signature to add generic type * Address PR comments * Passed datasource as argument to be used in sql query * Other minor fixes * Address PR comments Some minor changes, rename method, spacing changes Add early auth check if datasource is not null * Fix test case * Add max limit to getRecentlyFinishedTaskInfo in HeapMemoryTaskStorage * Add TaskLocation to Anytask object * Address PR comments * Fix a bug in test case causing ClassCastException	2018-06-19 11:34:59 -07:00
Jonathan Wei	684b5d18c1	Moving averages for ingestion row stats (#5748 ) * Moving averages for ingestion row stats * PR comments * Make RowIngestionMeters extensible * test and checkstyle fixes * More PR comments * Fix metrics * Add some comments * PR comments * Comments	2018-06-05 09:08:57 -07:00
Gian Merlino	f2cc6ce4d5	VersionedIntervalTimeline: Optimize construction with heavily populated holders. (#5777 ) * VersionedIntervalTimeline: Optimize construction with heavily populated holders. Each time a segment is "add"ed to a timeline, "isComplete" is called on the holder that it is added to. "isComplete" is an O(segments per chunk) operation, meaning that adding N segments to a chunk is an O(N^2) operation. This blows up badly if we have thousands of segments per chunk. The patch defers the "isComplete" check until after all segments have been inserted. * Fix imports.	2018-05-16 09:16:59 -07:00
Gian Merlino	d8effff30b	PartitionHolder: Early return from isComplete when we find an end. (#5778 ) * PartitionHolder: Early return from isComplete when we find an end. Holders are complete if they have a start, sequence of abutting objects, and then an end. There isn't any reason to check whether or not the objects _after_ the end are abutting (the extensible set). This is really a performance patch, since behavior shouldn't be changing. The extensible shardSpecs (where we could have shards after the end) are always abutting each other anyway. Performance doesn't usually matter much in this function, but it can when there are thousands of segments per time chunk. * Remove endSeen	2018-05-16 09:16:50 -07:00
Jihoon Son	86746f82d8	Use mergeBuffer instead of processingBuffer in parallelCombiner (#5634 ) * Use mergeBuffer instead of processingBuffer in parallelCombiner * Fix test * address comments * fix test * Fix test * Update comment * address comments * fix build * Fix test failure	2018-04-27 18:14:37 -07:00
Jonathan Wei	969342cd28	More error reporting and stats for ingestion tasks (#5418 ) * Add more indexing task status and error reporting * PR comments, add support in AppenderatorDriverRealtimeIndexTask * Use TaskReport instead of metrics/context * Fix tests * Use TaskReport uploads * Refactor fire department metrics retrieval * Refactor input row serde in hadoop task * Refactor hadoop task loader names * Truncate error message in TaskStatus, add errorMsg to task report * PR comments	2018-04-05 21:38:57 -07:00
Jihoon Son	05547e29b2	Fix SQLMetadataSegmentManager to allow succesive start and stop (#5554 ) * Fix SQLMetadataSegmentManager to allow succesive start and stop * address comment * add synchronization	2018-03-30 12:43:19 -07:00
Jihoon Son	1ad898bde2	Use the official aws-sdk instead of jet3t (#5382 ) * Use the official aws-sdk instead of jet3t * fix compile and serde tests * address comments and fix test * add http version string * remove redundant dependencies, fix potential NPE, and fix test * resolve TODOs * fix build * downgrade jackson version to 2.6.7 * fix test * resolve the last TODO * support proxy and endpoint configurations * fix build * remove debugging log * downgrade hadoop version to 2.8.3 * fix tests * remove unused log * fix it test * revert KerberosAuthenticator change * change hadoop-aws scope to provided in hdfs-storage * address comments * address comments	2018-03-21 15:36:54 -07:00
Charles Allen	58f110f7f8	Future-proof some Guava usage (#5414 ) * Future-proof some Guava usage * Use a java-util EmptyIterator instead of Guava's * Change some of the guava future handling to do manual async transforms. Guava changes transform into transformAsync by deprecating transform in ONLY Guava 19. Then its gone in 20 * Use `Collections.emptyIterator()` * Pretty formatting * Make listenable future transforms a thing in default druid * Format fix * Add forbidden guava apis * Make the ListenableFutrues.transformAsync have comments * Undo intellij bad pattern matching in comments * Futrues --> Futures * Add empty iterators forbidding * Fix extra `A` * Correct method signature * Address review comments * Finish Gian review comments * Proper syntax from https://github.com/policeman-tools/forbidden-apis/wiki/SignaturesSyntax	2018-03-20 08:59:33 -07:00
Roman Leventov	693e3575f9	Remove unused code and exception declarations (#5461 ) * Remove unused code and exception declarations * Address comments * Remove redundant Exception declarations * Make FirehoseFactoryV2.connect() to throw IOException again	2018-03-16 22:11:12 +01:00
Nishant Bangarwa	219e77aeac	SQL compatible Null Handling Part - Expressions and Storage Changes (#5278 ) * SQL compatible Null Handling Part - Expressions, Storage and Dimension Selector Changes fix travis strict compilation * fix teamcity error - remove unused method * review comments * review comments * more comments * review comments * review comments * Optimize isNull method * Optimize isNull in ColumnarFloats/Longs/Doubles * review comment - separate classes for null and non-null columns fix intellij inspection * remove unused import * More Review comments * improve comment * More review comments * fix checkstyle * more review comments * review comments. fix javadoc links remove Nullable from ConstantColumnValueSelector * review comments. * satisfy teamcity inspections	2018-02-21 13:27:26 +01:00
Roman Leventov	e64ffb10c2	Standartize on using Integer.BYTES instead of Ints.BYTES from Guava, same for other primitives (#5366 )	2018-02-07 13:24:30 -08:00
Jihoon Son	2099b43e5f	Add a new config object for compactConfig (#5264 ) * add a new config object for compactConfig * fix test * address comments * Update doc	2018-02-06 12:13:52 -08:00
Gian Merlino	7e02408510	Update versions to 0.13.0-SNAPSHOT. (#5323 )	2018-02-02 12:06:38 -06:00
Jihoon Son	241efafbb2	Automatic compaction by coordinators (#5102 ) * Automatic compaction by coordinator * add links * skip compaction for very recent segments if they are small * fix finding search interval * fix finding search interval * fix TimelineHolder iteration * add test for newestSegmentFirstPolicy * add CompactionSegmentIterator * add numTargetCompactionSegments * add missing config * fix skipping huge shards * fix handling large number of segments per shard * fix test failure * change recursive call to loop * fix logging * fix build * fix test failure * address comments * change dataSources type * check running pendingTasks at each run * fix test * address comments * fix build * fix test * address comments * address comments * add doc for segment size optimization * address comment	2018-01-13 13:52:37 +09:00
Roman Leventov	8877ce38d6	Enforce modifier order with Checkstyle (#5246 )	2018-01-11 09:50:42 +01:00
Roman Leventov	579f9fbedf	Add IndexedInts.debugToString() and AbstractIndex.toString(); Add Sequence.toList() and limit() (#5175 ) * Add IndexedInts.debugToString() and AbstractIndex.toString() * Fix AppenderatorTest	2018-01-04 09:56:47 +09:00
Jihoon Son	9199d61389	Automatic pendingSegments cleanup (#5149 ) * PendingSegments cleanup * fix build * address comments * address comments * fix potential npe * address comments * fix build * fix test * fix test	2017-12-20 14:46:34 -08:00
Roman Leventov	5787d04fad	Bump Druid version to 0.12.0 (#5138 )	2017-12-15 07:37:01 -08:00
Jonathan Wei	f48c9d7be1	Basic auth extension (#5099 ) * Basic auth extension * Add auth configuration integration test * Fix missing authorizerName property * PR comments * Fix missing @JsonProperty annotation * PR comments * more PR comments	2017-12-14 10:36:04 -08:00
Roman Leventov	a7a6a0487e	Replace IOPeon with SegmentWriteOutMedium; Improve buffer compression (#4762 ) * Replace IOPeon with OutputMedium; Improve compression * Fix test * Cleanup CompressionStrategy * Javadocs * Add OutputBytesTest * Address comments * Random access in OutputBytes and GenericIndexedWriter * Fix bugs * Fixes * Test OutputBytes.readFully() * Address comments * Rename OutputMedium to SegmentWriteOutMedium and OutputBytes to WriteOutBytes * Add comments to ByteBufferInputStream * Remove unused declarations	2017-12-04 18:04:27 -08:00
Gian Merlino	5f6bdd940b	SQL: Improve translation of time floor expressions. (#5107 ) * SQL: Improve translation of time floor expressions. The main change is to TimeFloorOperatorConversion.applyTimestampFloor. - Prefer timestamp_floor expressions to timeFormat extractionFns, to avoid turning things into strings when it isn't necessary. - Collapse CAST(FLOOR(X TO Y) AS DATE) to FLOOR(X TO Y) if appropriate. * Fix tests.	2017-11-29 12:06:03 -08:00
Roman Leventov	3541b7544b	Prohibit and remove unused declarations in the processing module (#4930 ) * Prohibit and remove unused declarations in the processing module * Fix tests * Fix integration tests * Suppress unused * Try to remove SuppressWarnings unused in VirtualColumn * Remove reset 'false positives' * Annotate CliCommandCreator as ExtensionPoint * Unused import warning instead of error in IntelliJ * Fixes * Add comment * Fix AzureBlob * Fix CloudFilesBlob * Address comments * Add Project SDK section to INTELLIJ_SETUP.md * Fix image	2017-11-09 09:27:27 -08:00
Jihoon Son	e96daa2593	Fix SQLMetadataSegmentManager (#5001 )	2017-10-31 08:02:41 -07:00
Gian Merlino	5fc6891404	Reduce code duplication between test ExprMacroTables. (#4979 )	2017-10-18 15:57:49 -05:00
Jihoon Son	52d7f74226	Add streaming aggregation as the last step of ConcurrentGrouper if data are spilled (#4704 ) * Add steaming grouper * Fix doc * Use a single dictionary while combining * Revert GroupByBenchmark * Removed unused code * More cleanup * Remove unused config * Fix some typos and bugs * Refactor Groupers.mergeIterators() * Add comments for combining tree * Refactor buildCombineTree * Refactor iterator * Add ParallelCombiner * Add ParallelCombinerTest * Handle InterruptedException * use AbstractPrioritizedCallable * Address comments * [maven-release-plugin] prepare release druid-0.11.0-sg * [maven-release-plugin] prepare for next development iteration * Address comments * Revert "[maven-release-plugin] prepare for next development iteration" This reverts commit `5c6b31e488`. * Revert "[maven-release-plugin] prepare release druid-0.11.0-sg" This reverts commit `0f5c3a8b82`. * Fix build failure * Change list to array * rename sortableIds * Address comments * change to foreach loop * Fix comment * Revert keyEquals() * Remove loop * Address comments * Fix build fail * Address comments * Remove unused imports * Fix method name * Split intermediate and leaf combine degrees * Add comments to StreamingMergeSortedGrouper * Add more comments and fix overflow * Address comments * ConcurrentGrouperTest cleanup * add thread number configuration for parallel combining * improve doc * address comments * fix build	2017-10-17 23:24:08 -07:00
Roman Leventov	dc7cb117a1	Refactor ColumnSelectorFactory; Rely on ColumnValueSelector's polymorphism (#4886 ) * Refactor ColumnSelectorFactory; Rely on ColumnValueSelector's polymorphism * Fix MapVirtualColumn.makeColumnValueSelector() * Minor fixes * Fix IndexGeneratorCombinerTest * DimensionSelector to return zeros when treated as numeric ColumnValueSelector * Fix IncrementalIndexTest * Fix IncrementalIndex.makeColumnSelectorFactory() * Optimize MapBasedRow.getMetric() * Fix VarianceAggregatorTest * Simplify IncrementalIndex.makeColumnSelectorFactory() * Address comments * More comments * Test	2017-10-13 21:44:17 -05:00
Jihoon Son	8d9902831e	Refactoring PrefetchableTextFilesFirehoseFactory (#4836 ) * Refactoring prefetchable firehose * Fix to read cache when prefetch is disabled * More tests * Cleanup codes * Add Fetcher * Fix test failure * Count file size * Fix test * rename generic parameter * address comments * address comments * reuse buffer * move Execs to java-util * use execs * Fix build	2017-10-13 21:39:28 -05:00
Jihoon Son	675c6c00dd	Add checkstyle and intellij rule to prohibit unnecessary qualifiers in interfaces (#4958 ) * add checkstyle and intellij rule * fix tc fail	2017-10-13 07:56:19 -07:00
Jihoon Son	dfa9cdc982	Prioritized locking (#4550 ) * Implementation of prioritized locking * Fix build failure * Fix tc fail * Fix typos * Fix IndexTaskTest * Addressed comments * Fix test * Fix spacing * Fix build error * Fix build error * Add lock status * Cleanup suspicious method * Add nullables * add doInCriticalSection to TaskLockBox and revert return type of task actions * fix build * refactor CriticalAction * make replaceLock transactional * fix formatting * fix javadoc * fix build	2017-10-11 23:16:31 -07:00
Gian Merlino	b20e3038b6	SQL: Upgrade to Calcite 1.14.0, some refactoring of internals. (#4889 ) * SQL: Upgrade to Calcite 1.14.0, some refactoring of internals. This brings benefits: - Ability to do GROUP BY and ORDER BY with ordinals. - Ability to support IN filters beyond 19 elements (fixes #4203). Some refactoring of druid-sql internals: - Builtin aggregators and operators are implemented as SqlAggregators and SqlOperatorConversions rather being special cases. This simplifies the Expressions and GroupByRules code, which were becoming complex. - SqlAggregator implementations are no longer responsible for filtering. Added new functions: - Expressions: strpos. - SQL: TRUNCATE, TRUNC, LENGTH, CHAR_LENGTH, STRLEN, STRPOS, SUBSTR, and DATE_TRUNC. * Add missing @Override annotation. * Adjustments for forbidden APIs. * Adjustments for forbidden APIs. * Disable GROUP BY alias. * Doc reword.	2017-10-10 12:44:05 -07:00
Roman Leventov	3f1009aaa1	Make Overlord auto-scaling and provisioning extensible (#4730 ) * Make AutoScaler, ProvisioningStrategy and BaseWorkerBehaviorConfig extension points; More logging in PendingTaskBasedWorkerProvisioningStrategy * Address comments and fix a bug * Extract method * debug logging * Rename BaseWorkerBehaviorConfig to WorkerBehaviorConfig and WorkerBehaviorConfig to DefaultWorkerBehaviorConfig * Fixes	2017-10-02 20:12:23 -05:00
Gian Merlino	1f2074c247	Bump versions in master to 0.11.1-SNAPSHOT. (#4878 ) * Bump versions in master to 0.11.1-SNAPSHOT. * Missed a few.	2017-09-28 17:09:51 -05:00


1111 Commits