Currently, NiFi Kafka consumer processors have following issue.
While downstream connections are full, ConsumeKafka is not scheduled to run onTrigger.
It stopps executing poll to tell Kafka server that this client is alive.
Thus, after a while in that situation, Kafka server rebalances the client.
When downstream connections back to normal, although ConsumeKafka is scheduled again,
the client is no longer a part of a consumer group.
If this happens, Kafka client succeeds polling messages when ConsumeKafka processor resumes, but fails to commit offset.
Received messages are already committed into NiFi flow, but since consumer offset is not updated, those will be consumed again, duplicated.
In order to address above issue:
- For ConsumeKafka_0_10, use latest client library
Above issue has been addressed by KIP-62.
The latest Kafka consumer poll checks if the client instance is still valid, and rejoin the group if not, before consuming messages.
- For ConsumeKafka (0.9), added manual retention logic using pause/resume
Kafka client 0.9 doesn't have background thread heartbeat, so similar machanism is added manually.
Use Kafka pause/resume consumer API to tell Kafka server that the client stops consuming messages but is still alive.
Another internal thread is used to perform paused poll periodically based on the time passed since the last onTrigger(poll) is executed.
This closes#1527.
Signed-off-by: Bryan Bende <bbende@apache.org>
- Improved StreamScanner for better performance
- Renamed StreamScanner to StreamDemarcator as suggested by Joe
- Added failure handling logic to ensure both processors can be reset to their initial state (as if they were just started)
- Provided comprehensive test suite to validate various aspects of both Publish and Consume from Kafka
- Added relevant javadocs
- Added initial additionalDetails docs
- Addressed NPE reported by NIFI-1764
- Life-cycle refactoring for the existing PutKafka to ensure producer restart after errors
- Incorporated code changes contributed by Ralph Perko (see NIFI-1837)
- Addressed partition issue in RoundRobinPartitioner discussed in NIFI-1827
- Updated PropertyDescriptor descriptions to reflect their purpose
NIFI-1296 added @Ignore on some Kafka tests to improve test time
NIFI-1296 reworked tests to avoid dependency on embedded Kafka
NIFI-1296 fixed spelling error
NIFI-1296 fixed trailing whitespaces in non-java files
This closes#366