* Fix the potential race SplittableInputSource.getNumSplits() and SplittableInputSource.createSplits() in TaskMonitor
* Fix docs and javadoc
* Add unit tests for large or small estimated num splits
* add override
Since it hasn't received updates or community interest in a while, it makes sense
to de-emphasize it in the distribution and most documentation (outside of simple
mentions of its existence).
* Stateful auto compaction
* javaodc
* add removed test back
* fix test
* adding indexSpec to compactionState
* fix build
* add lastCompactionState
* address comments
* extract CompactionState
* fix doc
* fix build and test
* Add a task context to store compaction state; add javadoc
* fix it test
* IOConfig for compaction task
* add javadoc, doc, unit test
* fix webconsole test
* add spelling
* address comments
* fix build and test
* address comments
* Added live reports for Kafka and Native batch task
* Removed unused local variables
* Added the missing unit test
* Refine unit test logic, add implementation for HttpRemoteTaskRunner
* checksytle fixes
* Update doc descriptions for updated API
* remove unnecessary files
* Fix spellcheck complaints
* More details for api descriptions
* Adjust defaults for hashed partitioning
If neither the partition size nor the number of shards are specified,
default to partitions of 5,000,000 rows (similar to the behavior of
dynamic partitions). Previously, both could be null and cause incorrect
behavior.
Specifying both a partition size and a number of shards now results in
an error instead of ignoring the partition size in favor of using the
number of shards. This is a behavior change that makes it more apparent
to the user that only one of the two properties will be honored
(previously, a message was just logged when the specified partition size
was ignored).
* Fix test
* Handle -1 as null
* Add -1 as null tests for single dim partitioning
* Simplify logic to handle -1 as null
* Address review comments
* Rename partition spec fields
Rename partition spec fields to be consistent across the various types
(hashed, single_dim, dynamic). Specifically, use targetNumRowsPerSegment
and maxRowsPerSegment in favor of targetPartitionSize and
maxSegmentSize. Consistent and clearer names are easier for users to
understand and use.
Also fix various IntelliJ inspection warnings and doc spelling mistakes.
* Fix test
* Improve docs
* Add targetRowsPerSegment to HashedPartitionsSpec
* move google ext docs from contrib to core
* fix links
* revert unintended change
* more links, add note to example ext doc that it was removed, unlink from sidebar