mirror of https://github.com/apache/druid.git
b13f07a057
* Make LocalInputSource.files a List instead of Set and adjust wikipedia_index_task to use file list. Rationale: the behavior of wikipedia_index_task.json is order-dependent with regard to its input files; some orders produce 4 segments and some produce 5 segments. Some integration tests, like ITSystemTableBatchIndexTaskTest and ITAutoCompactionTest, are written assuming that the 4-segment case will always happen. Providing the file list in a specific order ensures that this will happen as expected by the tests. I didn't see a specific reason why the LocalInputSource.files parameter needed to be a Set, so changing it to a List was the simplest way to achieve the consistent ordering. I think it will also make the behavior make more sense if someone does specify the same input file multiple times in a spec: I think they'd expect it to be loaded multiple times instead of deduped. This is consistent with the behavior of other input sources like S3, GCS, HTTP. * Sort files in LocalFirehoseFactory. |
||
---|---|---|
.. | ||
src | ||
pom.xml |