Commit Graph

38 Commits

Author SHA1 Message Date
Jihoon Son 56ab4363ea
Native parallel batch indexing without shuffle (#5492)
* Native parallel indexing without shuffle

* fix build

* fix ci

* fix ingestion without intervals

* fix retry

* fix retry

* add it test

* use chat handler

* fix build

* add docs

* fix ITUnionQueryTest

* fix failures

* disable metrics reporting

* working

* Fix split of static-s3 firehose

* Add endpoints to supervisor task and a unit test for endpoints

* increase timeout in test

* Added doc

* Address comments

* Fix overlapping locks

* address comments

* Fix static s3 firehose

* Fix test

* fix build

* fix test

* fix typo in docs

* add missing maxBytesInMemory to doc

* address comments

* fix race in test

* fix test

* Rename to ParallelIndexSupervisorTask

* fix teamcity

* address comments

* Fix license

* addressing comments

* addressing comments

* indexTaskClient-based segmentAllocator instead of CountingActionBasedSegmentAllocator

* Fix race in TaskMonitor and move HTTP endpoints to supervisorTask from runner

* Add more javadocs

* use StringUtils.nonStrictFormat for logging

* fix typo and remove unused class

* fix tests

* change package

* fix strict build

* tmp

* Fix overlord api according to the recent change in master

* Fix it test
2018-08-06 23:59:42 -07:00
Benedict Jin 331a0afb98 Remove redundant type parameters and enforce some other style and inspection rules (#5980)
* Various changes about druid-services module

* Patch improvements from reviewer

* Add ToArrayCallWithZeroLengthArrayArgument & ArraysAsListWithZeroOrOneArgument into inspection profile

* Fix ArraysAsListWithZeroOrOneArgument

* Fix conflict

* Fix ToArrayCallWithZeroLengthArrayArgument

* Fix AliEqualsAvoidNull

* Remove blank line

* Remove unused import clauses

* Fix code style in TopNQueryRunnerTest

* Fix conflict

* Don't use Collections.singletonList when converting the type of array type

* Add argLine into maven-surefire-plugin in druid-process module & increase the timeout value for testMoveSegment testcase

* Roll back the latest commit

* Add java.io.File#toURL() into druid-forbidden-apis

* Using Boolean.parseBoolean instead of Boolean.valueOf for CliCoordinator#isOverlord

* Add a new regexp element into stylecode xml file

* Fix style error for new regexp

* Set the level of ArraysAsListWithZeroOrOneArgument as WARNING

* Fix style error for new regexp

* Add option BY_LEVEL for ToArrayCallWithZeroLengthArrayArgument in inspection profile

* Roll back the level as ToArrayCallWithZeroLengthArrayArgument as ERROR

* Add toArray(new Object[0]) regexp into checkstyle config file & fix them

* Set the level of ArraysAsListWithZeroOrOneArgument as ERROR & Roll back the level of ToArrayCallWithZeroLengthArrayArgument as WARNING until Youtrack fix it

* Add a comment for string equals regexp in checkstyle config

* Fix code format

* Add RedundantTypeArguments as ERROR level inspection

* Fix cannot resolve symbol datasource
2018-07-27 16:56:49 -05:00
Gian Merlino 04ea3c9f8c
Update license headers. (#5976)
* Update license headers.

For compliance with http://www.apache.org/legal/src-headers.html.

* More license adjustments.

* Fix mistakenly edited package line.
2018-07-11 09:55:18 -07:00
Charles Allen 0fd42af8d6 Make the google storage extension friendlier to 429 and 5XX responses (#5750)
* Make the google extension friendlier to 429 responses

* Lots of trouble for a little space

* Add in better tests and fix formatting

* Add 500 errors as well as some basic unit tests

* Add IOException to test

* Add some more stuff to killer test

* Change error code in puller test

* fix tests and make errors more generic handling
2018-06-07 13:19:35 -07:00
David Lim 8ec2d2fe18 Use unique segment paths for Kafka indexing (#5692)
* support unique segment file paths

* forbiddenapis

* code review changes

* code review changes

* code review changes

* checkstyle fix
2018-04-29 21:59:48 -07:00
Erik Dubbelboer 71ecb71dcf Add DataSegmentFinder for Google Storage adapter (#5686)
Partially fixes #5628
2018-04-26 14:00:49 -07:00
Charles Allen 2e76012aca Allow GCS data segment killer to delete if present (#5675) 2018-04-24 07:16:54 -07:00
Gian Merlino 5ab17668c0 CompressionUtils: Add support for decompressing xz, bz2, zip. (#5586)
Also switch various firehoses to the new method.

Fixes #5585.
2018-04-06 08:06:45 -07:00
Jonathan Wei 723f7ac550
Add support for task reports, upload reports to deep storage (#5524)
* Add support for task reports, upload reports to deep storage

* PR comments

* Better name for method

* Fix report file upload

* Use TaskReportFileWriter

* Checkstyle

* More PR comments
2018-04-02 12:10:56 -07:00
Roman Leventov 693e3575f9
Remove unused code and exception declarations (#5461)
* Remove unused code and exception declarations

* Address comments

* Remove redundant Exception declarations

* Make FirehoseFactoryV2.connect() to throw IOException again
2018-03-16 22:11:12 +01:00
Gian Merlino 7e02408510 Update versions to 0.13.0-SNAPSHOT. (#5323) 2018-02-02 12:06:38 -06:00
Jihoon Son 5d0619f5ce Support retrying for PrefetchableTextFilesFirehoseFactory when prefetch is disabled (#5162)
* Add RetryingInputStream

* unnecessary exception

* fix PrefetchableTextFilesFirehoseFactoryTest

* Fix retrying on connection reset

* fix start offset

* fix checkstyle

* fix check connection reset

* address comments

* fix compile

* address comments

* address comments
2018-01-10 17:37:19 +01:00
David Lim a7967ade4d Support replaceExisting parameter for segments pushers (#5187)
* support replaceExisting parameter for segments pushers

* code review changes

* code review changes
2018-01-03 16:13:21 -08:00
Roman Leventov 5787d04fad Bump Druid version to 0.12.0 (#5138) 2017-12-15 07:37:01 -08:00
Jihoon Son 8d9902831e Refactoring PrefetchableTextFilesFirehoseFactory (#4836)
* Refactoring prefetchable firehose

* Fix to read cache when prefetch is disabled

* More tests

* Cleanup codes

* Add Fetcher

* Fix test failure

* Count file size

* Fix test

* rename generic parameter

* address comments

* address comments

* reuse buffer

* move Execs to java-util

* use execs

* Fix build
2017-10-13 21:39:28 -05:00
Jihoon Son d95915f8d2 Implement get methods for PrefetchableFirehose (#4948) 2017-10-12 16:14:45 +09:00
Gian Merlino 1f2074c247 Bump versions in master to 0.11.1-SNAPSHOT. (#4878)
* Bump versions in master to 0.11.1-SNAPSHOT.

* Missed a few.
2017-09-28 17:09:51 -05:00
Roman Leventov e267f3901b Enforce Indentation with Checkstyle (#4799) 2017-09-21 13:06:48 -07:00
Roman Leventov cbd1902db8 Add forbidden-apis plugin; prohibit using system time zone (#4611)
* Forbidden APIs WIP

* Remove some tests

* Restore io.druid.math.expr.Function

* Integration tests fix

* Add comments

* Fix in SimpleWorkerProvisioningStrategy

* Formatting

* Replace String.format() with StringUtils.format() in RemoteTaskRunnerTest

* Address comments

* Fix GroupByMultiSegmentTest
2017-08-21 13:02:42 -07:00
Roman Leventov aa7e4ae5e4 Enforce correct spacing with Checkstyle (#4651) 2017-08-05 10:18:25 -07:00
Roman Leventov c0beb78ffd Enforce brace formatting with Checkstyle (#4564) 2017-07-21 10:26:59 -05:00
Gian Merlino 441ee56ba9 DataSegmentPusher: Add allowed hadoop property prefixes. (#4562)
* DataSegmentPusher: Add allowed hadoop property prefixes.

* Fix dots.
2017-07-18 10:16:12 -07:00
Chris Gavin 960cb07ea6 Fix some unnecessary use of boxed types and incorrect format strings spotted by lgtm. (#4474)
* Remove some unnecessary use of boxed types.

* Fix some incorrect format strings.

* Enable IDEA's MalformedFormatString inspection.

* Add a Checkstyle check for finding uses of incorrect logging packages.

* Fix some incorrect usages of the metamx logger.

* Bypass incorrect logger Checkstyle check where using the correct logger is not simple.

* Fix some more places where the wrong number of arguments are provided to format strings.

* Suppress `MalformedFormatString` inspection on legacy logging test.

* Use @SuppressWarnings rather than a noinspection suppression comment.

* Fix some more incorrect format strings.

* Suppress some more incorrect format string warnings where the incorrect string is intentional.

* Log the aggregator when closing it fails.

* Remove some unneeded log lines.
2017-07-13 12:15:32 -07:00
Roman Leventov 9ae457f7ad Avoid using the default system Locale and printing to System.out in production code (#4409)
* Avoid usages of Default system Locale and printing to System.out or System.err in production code

* Fix Charset in DruidKerberosUtil

* Remove redundant string format in GenericIndexed

* Rename StringUtils.safeFormat() to unimportantSafeFormat(); add StringUtils.format() which fails as well as String.format()

* Fix testSafeFormat()

* More fixes of redundant StringUtils.format() inside ISE

* Rename unimportantSafeFormat() to nonStrictFormat()
2017-06-29 14:06:19 -07:00
Roman Leventov ae900a4934 Update versions to 0.11.0-SNAPSHOT (#4483) 2017-06-28 17:05:58 -07:00
Roman Leventov c121845102 Avoid using Guava in DataSegmentPushers because of incompatibilities (#4391)
* Avoid using Guava in DataSegmentPushers because of Hadoop incompatibilities

* Clarify comments
2017-06-12 09:58:34 -07:00
Niketh Sabbineni 2cd91b64d0 Uncompress streams without having to download to tmp first (#4364)
* Uncompress streams without having to download to tmp first

* Remove unused file
2017-06-08 18:08:38 -07:00
Roman Leventov 31d33b333e Make using implicit system Charset an error (#4326)
* Make using implicit system charset an error

* Use StringUtils.toUtf8() and fromUtf8() instead of String.getBytes() and new String()

* Use English locale in StringUtils.safeFormat()

* Restore comment
2017-06-05 23:57:25 -07:00
Slim a2584d214a Delagate creation of segmentPath/LoadSpec to DataSegmentPushers and add S3a support (#4116)
* Adding s3a schema and s3a implem to hdfs storage module.

* use 2.7.3

* use segment pusher to make loadspec

* move getStorageDir and makeLoad spec under DataSegmentPusher

* fix uts

* fix comment part1

* move to hadoop 2.8

* inject deep storage properties

* set version to 2.7.3

* fix build issue about static class

* fix comments

* fix default hadoop default coordinate

* fix create filesytem

* downgrade aws sdk

* bump the version
2017-06-04 00:55:09 -06:00
Roman Leventov 9625993c9a Fix bugs in Google extensions and RocketMQ extension (#4340) 2017-05-30 14:25:35 -07:00
Jihoon Son 733dfc9b30 Add PrefetchableTextFilesFirehoseFactory for cloud storage types (#4193)
* Add PrefetcheableTextFilesFirehoseFactory

* fix comment

* exception handling

* Fix wrong json property

* Remove ReplayableFirehoseFactory and fix misspelling

* Defer object initialization

* Add a temporaryDirectory parameter to FirehoseFactory.connect()

* fix when cache and fetch are disabled

* Address comments

* Add more test

* Increase timeout for test

* Add wrapObjectStream

* Move methods to Firehose from PrefetchableFirehoseFactory

* Cleanup comment

* add directory listing to s3 firehose

* Rename a variable

* Addressing comments

* Update document

* Support disabling prefetch

* Fix race condition

* Add fetchLock

* Remove ReplayableFirehoseFactoryTest

* Fix compilation error

* Fix test failure

* Address comments

* Add default implementation for new method
2017-05-18 15:37:18 +09:00
Gian Merlino 2ca7b00346 Update versions to 0.10.1-SNAPSHOT. (#4191) 2017-04-20 18:12:28 -07:00
Roman Leventov 81a5f9851f TmpFileIOPeons to create files under the merging output directory, instead of java.io.tmpdir (#3990)
* In IndexMerger and IndexMergerV9, create temporary files under the output directory/tmpPeonFiles, instead of java.io.tmpdir

* Use FileUtils.forceMkdir() across the codebase and remove some unused code

* Fix test

* Fix PullDependencies.run()

* Unused import
2017-03-02 14:05:12 -08:00
Gian Merlino cc20133e70 Checkstyle rule to outlaw tabs. (#3988)
Tabs are the worst.
2017-02-28 23:52:53 -08:00
Akash Dwivedi 8854ce018e File.deleteOnExit() (#3923)
* Less use of File.deleteOnExit()
 * removed deleteOnExit from most of the tests/benchmarks/iopeon
 * Made IOpeon closable

* Formatting.

* Revert DeterminePartitionsJobTest, remove cleanup method from IOPeon
2017-02-13 15:12:14 -08:00
Gian Merlino 12317fd001 Bump version to 0.10.0-SNAPSHOT. (#3913) 2017-02-06 17:54:35 -08:00
Himanshu 4ca3b7f1e4 overlord helpers framework and tasklog auto cleanup (#3677)
* overlord helpers framework and tasklog auto cleanup

* review comment changes

* further review comments addressed
2016-12-21 15:18:55 -08:00
Erik Dubbelboer 7d36f540e8 WIP: Add Google Storage support (#2458)
Also excludes the correct artifacts from #2741
2016-11-16 14:06:45 +05:30