Commit Graph

95 Commits

Author SHA1 Message Date
Gian Merlino 5da0241ac8
Kafka: Fixes needlessly low interpretation of maxRowsInMemory. (#5034)
AppenderatorImpl already applies maxRowsInMemory across all sinks. So dividing by
the number of Kafka partitions is pointless and effectively makes the interpretation
of maxRowsInMemory lower than expected.

This undoes one of the two changes from #3284, which fixed the original bug twice.
In this, that's worse than fixing it once.
2017-11-02 13:45:04 -06:00
Gian Merlino 6c725a7e06 Fix havingSpec on complex aggregators. (#5024)
* Fix havingSpec on complex aggregators.

- Uses the technique from #4883 on DimFilterHavingSpec too.
- Also uses Transformers from #4890, necessitating a move of that and other
  related classes from druid-server to druid-processing. They probably make
  more sense there anyway.
- Adds a SQL query test.

Fixes #4957.

* Remove unused import.
2017-11-01 12:58:08 -04:00
Gian Merlino 0ce406bdf1
Introduce "transformSpec" at ingest-time. (#4890)
* Introduce "transformSpec" at ingest-time.

It accepts a "filter" (standard query filter object) and "transforms" (a
list of objects with "name" and "expression"). These can be used to do
filtering and single-row transforms without need for a separate data
processing job.

The "expression" fields use the same expression language as other
expression-based feature.

* Remove forbidden api.

* Fix compile error.

* Fix tests.

* Some more changes.

- Add nullable annotation to Firehose.nextRow.
- Add tests for index task, realtime task, kafka task, hadoop mapper,
  and ingestSegment firehose.

* Fix bad merge.

* Adjust imports.

* Adjust whitespace.

* Make Transform into an interface.

* Add missing annotation.

* Switch logger.

* Switch logger.

* Adjust test.

* Adjustment to handling for DatasourceIngestionSpec.

* Fix test.

* CR comments.

* Remove unused method.

* Add javadocs.

* More javadocs, and always decorate.

* Fix bug in TransformingStringInputRowParser.

* Fix bad merge.

* Fix ISFF tests.

* Fix DORC test.
2017-10-30 17:38:52 -07:00
elloooooo 52a162e302 define earlyMessegeRejectPeriod as the period after the taskduration (#4990) 2017-10-27 01:13:46 +05:30
Jihoon Son 8d9902831e Refactoring PrefetchableTextFilesFirehoseFactory (#4836)
* Refactoring prefetchable firehose

* Fix to read cache when prefetch is disabled

* More tests

* Cleanup codes

* Add Fetcher

* Fix test failure

* Count file size

* Fix test

* rename generic parameter

* address comments

* address comments

* reuse buffer

* move Execs to java-util

* use execs

* Fix build
2017-10-13 21:39:28 -05:00
Jihoon Son dfa9cdc982 Prioritized locking (#4550)
* Implementation of prioritized locking

* Fix build failure

* Fix tc fail

* Fix typos

* Fix IndexTaskTest

* Addressed comments

* Fix test

* Fix spacing

* Fix build error

* Fix build error

* Add lock status

* Cleanup suspicious method

* Add nullables

*  add doInCriticalSection to TaskLockBox and revert return type of task actions

* fix build

* refactor CriticalAction

* make replaceLock transactional

* fix formatting

* fix javadoc

* fix build
2017-10-11 23:16:31 -07:00
Jihoon Son 56fb11ce0b Lazy initialization for JavaScript functions (#4871)
* Lazy initialization of JavaScript functions

* Fix test failure

* Fix thread-safety and postpone js conf check

* Fix test fail

* Fix test

* Fix KafkaIndexTaskTest

* Move config check
2017-10-10 21:52:42 -07:00
Gian Merlino 1f2074c247 Bump versions in master to 0.11.1-SNAPSHOT. (#4878)
* Bump versions in master to 0.11.1-SNAPSHOT.

* Missed a few.
2017-09-28 17:09:51 -05:00
Gian Merlino bf8fd4c203 Add flattenSpec support to the Avro parser. (#4832)
* Add flattenSpec support to the Avro parser.

Also:

- Refactor the JSONPathParser a bit so it can share flattening code
  with Avro (see ObjectFlatteners).
- Remove the JSONParser. It was only used in two places: by
  UriNamespaceExtractor, and as a base for JSONToLowerParser. Migrated
  the former to JSONPathParser and made the latter a standalone.
- Move GenericRecordAsMap to the Parquet extension, since the Avro
  extension no longer uses it.

* Fix indentation.

* Fix equals/hashCode.
2017-09-26 09:26:06 -07:00
Parag Jain 07446ef32c warn if topic not found (#4834) 2017-09-25 12:21:46 +09:00
Roman Leventov e267f3901b Enforce Indentation with Checkstyle (#4799) 2017-09-21 13:06:48 -07:00
Jonathan Wei c2a0e753b6 Extension points for authentication/authorization (#4271)
* Extension points for authentication/authorization

* Address some PR comments

* Authorization result caching

* Add unit tests for SecuritySanityCheckFilter and PreResponseAuthorizationCheckFilter

* Use Set for auth caching, close outputstreams in filters

* Don't close output stream on success in sanity check filter

* Add ConfigResourceFilter to coordinator lookups

* Fix filtering authorization check for empty resource list

* HttpClient users must explicitly escalate the client

* Remove response modification from PreResponseAuthorizationCheckFilter

* Remove extraneous pom.xml

* Fix unit test

* Better lifecycle management

* Rename AuthorizationManager to Authorizer

* Fix authorization denials for empty supervisor list

* Address some PR comments

* Address more PR comments

* Small cleanup

* Add Jetty HttpClient wrapper to Authenticator

* Remove Authorizer start/stop

* Restore immutable context map in DruidConnection, UT fix

* Fix/update docs

* Add authorization checks to EventReceiverFirehose

* Fix router authorization check failure, restore PreResponseAuthorizationFilter changes

* Compile fixes

* Test fixes

* Update Authenticator/Authorizer doc comments

* Merge fixes

* PR comments

* Fix test

* Fix IT

* More PR comments

* PR comments

* SSL fix
2017-09-15 23:45:48 -07:00
Himanshu 4c04083926 kafkaIndexTask unannounce service in final block (#4736) 2017-09-01 09:31:15 -07:00
Roman Leventov cbd1902db8 Add forbidden-apis plugin; prohibit using system time zone (#4611)
* Forbidden APIs WIP

* Remove some tests

* Restore io.druid.math.expr.Function

* Integration tests fix

* Add comments

* Fix in SimpleWorkerProvisioningStrategy

* Formatting

* Replace String.format() with StringUtils.format() in RemoteTaskRunnerTest

* Address comments

* Fix GroupByMultiSegmentTest
2017-08-21 13:02:42 -07:00
Himanshu 74a64c88ab internal-discovery: interfaces for announcement/discovery, curator based impls (#4634)
* internal-discovery: interfaces for announcement/discovery, curator impls

* more tests

* address some review comments

* more fixes

* address more review comments

* simplify ObjectMapper setup in CuratorDruidNodeAnnouncerAndDiscoveryTest

* fix KafkaIndexTaskTest

* make lookupTier overridable via RealtimeIndexTask and KafkaIndexTask context

* make teamcity build happy
2017-08-16 13:07:16 -07:00
Parag Jain 725a144096 add localhost as advertised hostname (#4689)
* add localhost as advertised hostname

* set advertised.host.name to localhost for test kafka broker
2017-08-14 16:59:26 -07:00
Roman Leventov bf28d0775b Remove QueryRunner.run(Query, responseContext) and related legacy methods (#4482)
* Remove QueryRunner.run(Query, responseContext) and related legacy methods

* Remove local var
2017-08-11 09:12:38 +09:00
Yuewen Wang c821bc9a5a Implement "earlyMessageRejectionPeriod" config discussed in issue #4599 (#4607)
* Implement "earlyMessageRejectionPeriod" config discussed in issue #4599
    * implement the logics of this param
    * Added doc of this config
    * Added unit tests of it

* Update KafkaSupervisor.java

ameliorate comment

* fix format

* fix bug when rebasing
2017-08-11 09:12:08 +09:00
Jihoon Son d5606bc558 Passing lockTimeout as a parameter for TaskLockbox.lock() (#4549)
* Passing lockTimeout as a parameter for TaskLockbox.lock()

* Remove TIME_UNIT

* Fix tc fail

* Add taskLockTimeout to TaskContext

* Add caution
2017-08-08 18:21:07 -07:00
Roman Leventov f5d4171459 Prohibit for loops which could be foreach with IntelliJ (#4653)
* Replace for with foreach

* Replace for with for-each in GroupByQueryEngineV2

* Remove io.druid.collections.IntList
2017-08-08 18:05:33 -07:00
Roman Leventov aa7e4ae5e4 Enforce correct spacing with Checkstyle (#4651) 2017-08-05 10:18:25 -07:00
Roman Leventov c0beb78ffd Enforce brace formatting with Checkstyle (#4564) 2017-07-21 10:26:59 -05:00
Chris Gavin 960cb07ea6 Fix some unnecessary use of boxed types and incorrect format strings spotted by lgtm. (#4474)
* Remove some unnecessary use of boxed types.

* Fix some incorrect format strings.

* Enable IDEA's MalformedFormatString inspection.

* Add a Checkstyle check for finding uses of incorrect logging packages.

* Fix some incorrect usages of the metamx logger.

* Bypass incorrect logger Checkstyle check where using the correct logger is not simple.

* Fix some more places where the wrong number of arguments are provided to format strings.

* Suppress `MalformedFormatString` inspection on legacy logging test.

* Use @SuppressWarnings rather than a noinspection suppression comment.

* Fix some more incorrect format strings.

* Suppress some more incorrect format string warnings where the incorrect string is intentional.

* Log the aggregator when closing it fails.

* Remove some unneeded log lines.
2017-07-13 12:15:32 -07:00
Roman Leventov b2865b7c7b Make possible to start Peon without DI loading of any querying-related stuff (#4516)
* Make QueryRunnerFactoryConglomerate injection lazy in TaskToolbox/TaskToolboxFactory

* Extract QueryablePeonModule and add druid.modules.excludeList config

* Typo
2017-07-12 13:18:25 -05:00
Akash Dwivedi 5f411f14af Timeout for LockAcquireAction (#4461)
* Timeout for LockAcquireAction

* Static inner class.

* Rebase changes.

* makeAlert and throw exception incase of overlapping interval.

* Addressed comments.

* remove unused import.

* Addressed comments
2017-07-11 18:59:32 +09:00
Jihoon Son cc20260078 Early publishing segments in the middle of data ingestion (#4238)
* Early publishing segments in the middle of data ingestion

* Remove unnecessary logs

* Address comments

* Refactoring the patch according to #4292 and address comments

* Set the total shard number of NumberedShardSpec to 0

* refactoring

* Address comments

* Fix tests

* Address comments

* Fix sync problem of committer and retry push only

* Fix doc

* Fix build failure

* Address comments

* Fix compilation failure

* Fix transient test failure
2017-07-10 22:35:36 -07:00
Parag Jain 6e2f78f552 TLS support (#4270) 2017-07-06 17:40:12 -07:00
Roman Leventov 9ae457f7ad Avoid using the default system Locale and printing to System.out in production code (#4409)
* Avoid usages of Default system Locale and printing to System.out or System.err in production code

* Fix Charset in DruidKerberosUtil

* Remove redundant string format in GenericIndexed

* Rename StringUtils.safeFormat() to unimportantSafeFormat(); add StringUtils.format() which fails as well as String.format()

* Fix testSafeFormat()

* More fixes of redundant StringUtils.format() inside ISE

* Rename unimportantSafeFormat() to nonStrictFormat()
2017-06-29 14:06:19 -07:00
Roman Leventov ae900a4934 Update versions to 0.11.0-SNAPSHOT (#4483) 2017-06-28 17:05:58 -07:00
Roman Leventov 05d58689ad Remove the ability to create segments in v8 format (#4420)
* Remove ability to create segments in v8 format

* Fix IndexGeneratorJobTest

* Fix parameterized test name in IndexMergerTest

* Remove extra legacy merging stuff

* Remove legacy serializer builders

* Remove ConciseBitmapIndexMergerTest and RoaringBitmapIndexMergerTest
2017-06-26 13:21:39 -07:00
Roman Leventov 5285eb961b Update dependencies (#4313)
* Update dependencies

* Downgrade curator

* Rollback aws-java-sdk dependency to 1.10.77

* Revert exclusions in integration-tests

* Depend only on aws-java-sdk-ec2 instead of umbrella aws-java-sdk (fixes #4382)
2017-06-09 14:32:07 -07:00
Gian Merlino 1f2afccdf8 Expressions: Add ExprMacros. (#4365)
* Expressions: Add ExprMacros, which have the same syntax as functions, but
can convert themselves to any kind of Expr at parse-time.

ExprMacroTable is an extension point for adding new ExprMacros. Anything
that might need to parse expressions needs an ExprMacroTable, which can
be injected through Guice.

* Address code review comments.
2017-06-08 09:32:10 -04:00
Roman Leventov 31d33b333e Make using implicit system Charset an error (#4326)
* Make using implicit system charset an error

* Use StringUtils.toUtf8() and fromUtf8() instead of String.getBytes() and new String()

* Use English locale in StringUtils.safeFormat()

* Restore comment
2017-06-05 23:57:25 -07:00
David Lim 13ecf90923 Report Kafka lag information in supervisor status report (#4314)
* refactor lag reporting and report lag at status endpoint

* refactor offset reporting logic to fetch offsets periodically vs. at request time

* remove JavaCompatUtils

* code review changes

* code review changes
2017-06-05 13:26:25 -07:00
Jihoon Son da32e1ae53 Reducing testing time for KafkaIndexTaskTest and KafkaSupervisorTest (#4352) 2017-06-03 00:53:07 +09:00
Jihoon Son f876246af7 Rename FiniteAppenderatorDriver to AppenderatorDriver (#4356) 2017-06-03 00:48:44 +09:00
Jihoon Son 1150bf7a2c Refactoring Appenderator Driver (#4292)
* Refactoring Appenderator

1) Added publishExecutor and handoffExecutor for background publishing and handing segments off
2) Change add() to not move segments out in it

* Address comments

1) Remove publishTimeout for KafkaIndexTask
2) Simplifying registerHandoff()
3) Add increamental handoff test

* Remove unused variable

* Add persist() to Appenderator and more tests for AppenderatorDriver

* Remove unused imports

* Fix strict build

* Address comments
2017-06-02 07:09:11 +09:00
Jihoon Son 7889891bd3 Fix integration tests (#4337)
* Fix integration tests

1) Use the same version of kafka
2) Change ServiceEmitter from LazySingleton to ManageLifecycle

* Revert unnecessary change
2017-05-28 08:48:39 -07:00
Jihoon Son 733dfc9b30 Add PrefetchableTextFilesFirehoseFactory for cloud storage types (#4193)
* Add PrefetcheableTextFilesFirehoseFactory

* fix comment

* exception handling

* Fix wrong json property

* Remove ReplayableFirehoseFactory and fix misspelling

* Defer object initialization

* Add a temporaryDirectory parameter to FirehoseFactory.connect()

* fix when cache and fetch are disabled

* Address comments

* Add more test

* Increase timeout for test

* Add wrapObjectStream

* Move methods to Firehose from PrefetchableFirehoseFactory

* Cleanup comment

* add directory listing to s3 firehose

* Rename a variable

* Addressing comments

* Update document

* Support disabling prefetch

* Fix race condition

* Add fetchLock

* Remove ReplayableFirehoseFactoryTest

* Fix compilation error

* Fix test failure

* Address comments

* Add default implementation for new method
2017-05-18 15:37:18 +09:00
Himanshu daa8ef8658 Optional long-polling based segment announcement via HTTP instead of Zookeeper (#3902)
* Optional long-polling based segment announcement via HTTP instead of Zookeeper

* address review comments

* make endpoint /druid-internal/v1 instead of /druid/internal so that jetty qos filters can be configured easily when needed

* update segment callback initialization to be called only after first segment list fetch has been succeeded from all servers

* address review comments

* remove size check not required anymore as only segment servers announce themselves and not all peon processes

* annouce segment server on historical only after cached segments are loaded

* fix checkstyle errors
2017-05-17 16:31:58 -05:00
Roman Leventov b7a52286e8 Make @Override annotation obligatory (#4274)
* Make MissingOverride an error

* Make travis stript to fail fast

* Add missing Override annotations

* Comment
2017-05-16 13:30:30 -05:00
David Lim 8333043b7b add skipOffsetGaps flag (#4256) 2017-05-16 12:19:28 -06:00
Roman Leventov 1ebfa22955 Update Error prone configuration; Fix bugs (#4252)
* Make Errorprone the default compiler

* Address comments

* Make Error Prone's ClassCanBeStatic rule a error

* Preconditions allow only %s pattern

* Fix DruidCoordinatorBalancerTester

* Try to give the compiler more memory

* Remove distribution module activation on jdk 1.8 because only jdk 1.8 is used now

* Don't show compiler warnings

* Try different travis script

* Fix travis.yml

* Make Error Prone optional again

* For error-prone compiler

* Increase compiler's maxmem

* Don't run Error Prone for benchmarks because of OOM

* Skip install step in Travis

* Remove MetricHolder.writeToChannel()

* In travis.yml, check compilation before tests, because it may fail faster
2017-05-12 15:55:17 +09:00
Roman Leventov e09e892477 Refactor QueryRunner to accept QueryPlus: Query + QueryMetrics (part of #3798) (#4184)
* Add QueryPlus. Add QueryRunner.run(QueryPlus, Map) method with default implementation, to replace QueryRunner.run(Query, Map).

* Fix GroupByMergingQueryRunnerV2

* Fix QueryResourceTest

* Expand the comment to Query.run(walker, context)

* Remove legacy version of BySegmentSkippingQueryRunner.doRun()

* Add LegacyApiQueryRunnerTest and be more specific about legacy API removal plans in Druid 0.11 in Javadocs
2017-05-10 12:25:00 -07:00
Parag Jain 1fd177039d fix auto reset - pause task instead of putting thread to sleep (#4244) 2017-05-08 15:08:25 -07:00
Parag Jain eb8e1b0a97 Prevent interrupted exception from polluting log during supervisor shutdown (#4253)
* Prevent interrupted exception from polluting log during supervisor shutdown

* do nothing in case of InterruptedException
2017-05-08 15:05:25 -07:00
Parag Jain 4502c207af fix injection bug and documentation (#4243) 2017-05-03 15:07:43 -05:00
Parag Jain f9a61ea2ba Kafka lag emitter - Kafka Indexing Service (#4194)
* Kafka lag emitter

* enforce minimum emit period to a minute

* fixed comment
2017-05-02 17:30:07 -06:00
satishbhor d51097c809 Fix lz4 library incompatibility in kafka-indexing-service extension (#4115)
* Fix lz4 library incompatibility in kafka-indexing-service extension #3266

* Bumped Kafka version to 0.10.2.0 for : Fix lz4 library incompatibility in kafka-indexing-service extension #3266

* Replaced Lists.newArrayList() with Collections.singletonList() For Fix lz4 library incompatibility in kafka-indexing-service extension #4115
2017-04-25 12:23:51 +09:00
Gian Merlino 2ca7b00346 Update versions to 0.10.1-SNAPSHOT. (#4191) 2017-04-20 18:12:28 -07:00