druid/processing
Gian Merlino 764f41d959
Clear "lineSplittable" for JSON when using KafkaInputFormat. (#15692)
* Clear "lineSplittable" for JSON when using KafkaInputFormat.

JsonInputFormat has a "withLineSplittable" method that controls whether
JSON is read line by line or as a whole. The intent is that
"lineSplittable" is false in streaming ingestion (although it can be
overridden by "assumeNewlineDelimited") and true in batch ingestion.

When a "json" format is wrapped by a "kafka" format, "lineSplittable"
isn't set properly on the wrapped format. This patch updates
KafkaInputFormat to clear it on an underlying "json" format.

The tests for KafkaInputFormat were setting the "lineSplittable"
parameter explicitly, which made them unrepresentative of what happens
in production. Now they omit the parameter and get the production
behavior.

* Add test.

* Fix test coverage.
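
The behavior described in this commit message can be illustrated with a
minimal sketch. It is not Druid's code: SketchInputFormat,
SketchJsonFormat, SketchKafkaFormat, and the split() method are
hypothetical stand-ins, and no real JSON parsing takes place. The sketch
only assumes what the message states: a JSON format carries a
"lineSplittable" flag, has a "withLineSplittable" method, and a wrapping
Kafka format should clear the flag on the JSON format it wraps.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT Druid's classes.
interface SketchInputFormat {
    // Returns the chunks of text that would each be handed to a parser.
    List<String> split(String payload);
}

final class SketchJsonFormat implements SketchInputFormat {
    private final boolean lineSplittable;

    SketchJsonFormat(boolean lineSplittable) {
        this.lineSplittable = lineSplittable;
    }

    boolean isLineSplittable() {
        return lineSplittable;
    }

    // Mirrors the idea of "withLineSplittable": return a copy with the flag changed.
    SketchJsonFormat withLineSplittable(boolean newValue) {
        return new SketchJsonFormat(newValue);
    }

    @Override
    public List<String> split(String payload) {
        List<String> chunks = new ArrayList<>();
        if (lineSplittable) {
            // Line by line: every non-blank line is treated as its own record.
            for (String line : payload.split("\n")) {
                if (!line.trim().isEmpty()) {
                    chunks.add(line);
                }
            }
        } else {
            // As a whole: the entire payload is treated as one unit.
            chunks.add(payload);
        }
        return chunks;
    }
}

final class SketchKafkaFormat implements SketchInputFormat {
    private final SketchInputFormat valueFormat;

    SketchKafkaFormat(SketchInputFormat valueFormat) {
        // Clear the flag on a wrapped JSON format; leave other formats untouched.
        if (valueFormat instanceof SketchJsonFormat) {
            this.valueFormat = ((SketchJsonFormat) valueFormat).withLineSplittable(false);
        } else {
            this.valueFormat = valueFormat;
        }
    }

    SketchInputFormat getValueFormat() {
        return valueFormat;
    }

    @Override
    public List<String> split(String payload) {
        return valueFormat.split(payload);
    }
}

final class LineSplittableSketch {
    public static void main(String[] args) {
        // A single pretty-printed JSON message that spans three lines.
        String message = "{\n  \"a\": 1\n}";

        SketchJsonFormat json = new SketchJsonFormat(true);
        System.out.println(json.split(message).size());   // prints 3

        SketchKafkaFormat kafka = new SketchKafkaFormat(json);
        System.out.println(kafka.split(message).size());  // prints 1
    }
}
```

With the flag left set, a single multi-line message is cut into three
fragments, none of which is valid JSON on its own. Once the wrapper
clears the flag, the message is handed over as one unit.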
2024-01-18 03:22:41 -08:00
src Clear "lineSplittable" for JSON when using KafkaInputFormat. (#15692) 2024-01-18 03:22:41 -08:00
pom.xml Faster parsing: reduce String usage, list-based input rows. (#15681) 2024-01-18 19:18:46 +08:00