* Add flattenSpec support to the Avro parser.
Also:
- Refactor the JSONPathParser a bit so it can share flattening code
with Avro (see ObjectFlatteners).
- Remove the JSONParser. It was only used in two places: by
UriNamespaceExtractor, and as a base for JSONToLowerParser. Migrated
the former to JSONPathParser and made the latter a standalone.
- Move GenericRecordAsMap to the Parquet extension, since the Avro
extension no longer uses it.
* Fix indentation.
* Fix equals/hashCode.
* internal-discovery: interfaces for announcement/discovery, curator impls
* more tests
* address some review comments
* more fixes
* address more review comments
* simplify ObjectMapper setup in CuratorDruidNodeAnnouncerAndDiscoveryTest
* fix KafkaIndexTaskTest
* make lookupTier overridable via RealtimeIndexTask and KafkaIndexTask context
* make teamcity build happy
* Implement "earlyMessageRejectionPeriod" config discussed in issue #4599
* implement the logics of this param
* Added doc of this config
* Added unit tests of it
* Update KafkaSupervisor.java
ameliorate comment
* fix format
* fix bug when rebasing
* Remove some unnecessary use of boxed types.
* Fix some incorrect format strings.
* Enable IDEA's MalformedFormatString inspection.
* Add a Checkstyle check for finding uses of incorrect logging packages.
* Fix some incorrect usages of the metamx logger.
* Bypass incorrect logger Checkstyle check where using the correct logger is not simple.
* Fix some more places where the wrong number of arguments are provided to format strings.
* Suppress `MalformedFormatString` inspection on legacy logging test.
* Use @SuppressWarnings rather than a noinspection suppression comment.
* Fix some more incorrect format strings.
* Suppress some more incorrect format string warnings where the incorrect string is intentional.
* Log the aggregator when closing it fails.
* Remove some unneeded log lines.
* Early publishing segments in the middle of data ingestion
* Remove unnecessary logs
* Address comments
* Refactoring the patch according to #4292 and address comments
* Set the total shard number of NumberedShardSpec to 0
* refactoring
* Address comments
* Fix tests
* Address comments
* Fix sync problem of committer and retry push only
* Fix doc
* Fix build failure
* Address comments
* Fix compilation failure
* Fix transient test failure
* Avoid usages of Default system Locale and printing to System.out or System.err in production code
* Fix Charset in DruidKerberosUtil
* Remove redundant string format in GenericIndexed
* Rename StringUtils.safeFormat() to unimportantSafeFormat(); add StringUtils.format() which fails as well as String.format()
* Fix testSafeFormat()
* More fixes of redundant StringUtils.format() inside ISE
* Rename unimportantSafeFormat() to nonStrictFormat()
* Remove ability to create segments in v8 format
* Fix IndexGeneratorJobTest
* Fix parameterized test name in IndexMergerTest
* Remove extra legacy merging stuff
* Remove legacy serializer builders
* Remove ConciseBitmapIndexMergerTest and RoaringBitmapIndexMergerTest
* Expressions: Add ExprMacros, which have the same syntax as functions, but
can convert themselves to any kind of Expr at parse-time.
ExprMacroTable is an extension point for adding new ExprMacros. Anything
that might need to parse expressions needs an ExprMacroTable, which can
be injected through Guice.
* Address code review comments.
* Make using implicit system charset an error
* Use StringUtils.toUtf8() and fromUtf8() instead of String.getBytes() and new String()
* Use English locale in StringUtils.safeFormat()
* Restore comment
* refactor lag reporting and report lag at status endpoint
* refactor offset reporting logic to fetch offsets periodically vs. at request time
* remove JavaCompatUtils
* code review changes
* code review changes
* Refactoring Appenderator
1) Added publishExecutor and handoffExecutor for background publishing and handing segments off
2) Change add() to not move segments out in it
* Address comments
1) Remove publishTimeout for KafkaIndexTask
2) Simplifying registerHandoff()
3) Add increamental handoff test
* Remove unused variable
* Add persist() to Appenderator and more tests for AppenderatorDriver
* Remove unused imports
* Fix strict build
* Address comments
* Optional long-polling based segment announcement via HTTP instead of Zookeeper
* address review comments
* make endpoint /druid-internal/v1 instead of /druid/internal so that jetty qos filters can be configured easily when needed
* update segment callback initialization to be called only after first segment list fetch has been succeeded from all servers
* address review comments
* remove size check not required anymore as only segment servers announce themselves and not all peon processes
* annouce segment server on historical only after cached segments are loaded
* fix checkstyle errors
* Make Errorprone the default compiler
* Address comments
* Make Error Prone's ClassCanBeStatic rule a error
* Preconditions allow only %s pattern
* Fix DruidCoordinatorBalancerTester
* Try to give the compiler more memory
* Remove distribution module activation on jdk 1.8 because only jdk 1.8 is used now
* Don't show compiler warnings
* Try different travis script
* Fix travis.yml
* Make Error Prone optional again
* For error-prone compiler
* Increase compiler's maxmem
* Don't run Error Prone for benchmarks because of OOM
* Skip install step in Travis
* Remove MetricHolder.writeToChannel()
* In travis.yml, check compilation before tests, because it may fail faster
* Add QueryPlus. Add QueryRunner.run(QueryPlus, Map) method with default implementation, to replace QueryRunner.run(Query, Map).
* Fix GroupByMergingQueryRunnerV2
* Fix QueryResourceTest
* Expand the comment to Query.run(walker, context)
* Remove legacy version of BySegmentSkippingQueryRunner.doRun()
* Add LegacyApiQueryRunnerTest and be more specific about legacy API removal plans in Druid 0.11 in Javadocs
* Fix lz4 library incompatibility in kafka-indexing-service extension #3266
* Bumped Kafka version to 0.10.2.0 for : Fix lz4 library incompatibility in kafka-indexing-service extension #3266
* Replaced Lists.newArrayList() with Collections.singletonList() For Fix lz4 library incompatibility in kafka-indexing-service extension #4115
task.pause(0) can return early before the task is actually paused.
Exception for failure -
java.lang.AssertionError: expected:<PAUSED> but was:<READING>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:144)
at
io.druid.indexing.kafka.KafkaIndexTaskTest.testRunWithOffsetOutOfRangeEx
ceptionAndPause(KafkaIndexTaskTest.java:1229)
To reproduce add Thread.sleep(10000) in beginning of
KafkaIndexTask.possiblypause method.
* No more singleton. Reduce iterations
* Granularities
* Fix the delay in the test
* Add license header
* Remove unused imports
* Lot more unused imports from all the rearranging
* CR feedback
* Move javadoc to constructor
* Refactor Segment Granularity
* Beginning of one granularity
* Copy the fix for custom periods in segment-grunalrity over here.
* Remove the custom serialization for now.
* Compilation cleanup
* Reformat code
* Fixing unit tests
* Unify to use a single iterable
* Backward compatibility for rolling upgrade
* Minor check style. Cosmetic changes.
* Rename length and millis to duration
* CR feedback
* Minor changes.
* auto reset option for Kafka Indexing service in case message at the offset being fetched is not present anymore at kafka brokers
* review comments
* review comments
* reverted last change
* review comments
* review comments
* fix typo
* Fix#3795 (Java 7 compatibility).
Also introduce Animal Sniffer checks during build, which would
have caught the original problems.
* Add Animal Sniffer on caffeine-cache for JDK8.
* option to reset offset automatically in case of OffsetOutOfRangeException
if the next offset is less than the earliest available offset for that partition
* review comments
* refactoring
* refactor
* review comments
* validate X-Druid-Task-Id header in request and add header to response
* modify KafkaIndexTaskClient to take a TaskLocationProvider as the TaskLocation may not remain constant
segment creation deterministic.
This means that each segment will contain data from just one Kafka
partition. So, users will probably not want to have a super high number
of Kafka partitions...
Fixes#2703.
Reads a specific offset range from specific partitions, and can use dataSource metadata
transactions to guarantee exactly-once ingestion.
Each task has a finite lifecycle, so it is expected that some process will be supervising
existing tasks and creating new ones when needed.