This PR splits current SegmentLoader into SegmentLoader and SegmentCacheManager.
SegmentLoader - this class is responsible for building the segment object but does not expose any methods for downloading, cache space management, etc. Default implementation delegates the download operations to SegmentCacheManager and only contains the logic for building segments once downloaded. . This class will be used in SegmentManager to construct Segment objects.
SegmentCacheManager - this class manages the segment cache on the local disk. It fetches the segment files to the local disk, can clean up the cache, and in the future, support reserve and release on cache space. [See https://github.com/Make SegmentLoader extensible and customizable #11398]. This class will be used in ingestion tasks such as compaction, re-indexing where segment files need to be downloaded locally.
* Doc update for new input source and input format.
- The input source and input format are promoted in all docs under docs/ingestion
- All input sources including core extension ones are located in docs/ingestion/native-batch.md
- All input formats and parsers including core extension ones are localted in docs/ingestion/data-formats.md
- New behavior of the parallel task with different partitionsSpecs are documented in docs/ingestion/native-batch.md
* parquet
* add warning for range partitioning with sequential mode
* hdfs + s3, gs
* add fs impl for gs
* address comments
* address comments
* gcs