hadoop/hadoop-tools
Steve Loughran 7a45ef4164
MAPREDUCE-7435. Manifest Committer OOM on abfs (#5519)
This modifies the manifest committer so that the list of files
to rename is passed between stages as a file of
Writable entries on the local filesystem, not as an in-memory list.

The map of directories to create is still passed in memory.
This map is built across all tasks, so however many tasks
create files, if they all write into the same set of directories
the memory needed is O(directories), independent of the
task count.
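
As an illustration only (this is not the committer's code, which is Java and
uses Hadoop's own Writable serialization), the idea can be sketched with the
Python standard library. The function names, the length-prefixed record layout
and the spill file path are all assumptions made for this sketch: rename
entries are appended to a local file as they arrive, while only the set of
destination directories stays on the heap.

```python
import posixpath
import struct


def write_entry(out, source, dest):
    """Append one (source, dest) rename entry as two length-prefixed UTF-8 strings."""
    for text in (source, dest):
        data = text.encode("utf-8")
        out.write(struct.pack(">I", len(data)))  # 4-byte big-endian length
        out.write(data)


def _read_string(f):
    """Read one length-prefixed string, or return None at a clean end of file."""
    header = f.read(4)
    if not header:
        return None
    (length,) = struct.unpack(">I", header)
    return f.read(length).decode("utf-8")


def read_entries(path):
    """Stream entries back one at a time, so the reader never holds the full list."""
    with open(path, "rb") as f:
        while True:
            source = _read_string(f)
            if source is None:
                return
            dest = _read_string(f)
            yield source, dest


def merge_task_manifests(manifests, spill_path):
    """Spill every rename entry to spill_path; keep only directories in memory.

    manifests is an iterable of per-task iterables of (source, dest) pairs.
    Returns (directories, file_count). The heap cost is O(directories),
    however many tasks or files there are.
    """
    directories = set()
    file_count = 0
    with open(spill_path, "wb") as out:
        for manifest in manifests:
            for source, dest in manifest:
                write_entry(out, source, dest)
                directories.add(posixpath.dirname(dest))
                file_count += 1
    return directories, file_count
```

A later stage would iterate over `read_entries(spill_path)` to perform the
renames, reading one entry at a time from local disk.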

The _SUCCESS file reports heap size through gauges,
which should give a warning if there are memory problems.

Contributed by Steve Loughran
2023-06-09 17:00:59 +01:00
hadoop-aliyun HADOOP-18458: AliyunOSSBlockOutputStream to support heap/off-heap buffer before uploading data to OSS (#4912) 2023-03-28 14:27:01 +08:00
hadoop-archive-logs HADOOP-18206 Cleanup the commons-logging references and restrict its usage in future (#5315) 2023-02-14 03:24:06 +08:00
hadoop-archives HADOOP-18548. Hadoop Archive tool (HAR) should acquire delegation tokens from source and destination file systems (#5355) 2023-03-30 07:12:02 +08:00
hadoop-aws MAPREDUCE-7435. Manifest Committer OOM on abfs (#5519) 2023-06-09 17:00:59 +01:00
hadoop-azure MAPREDUCE-7435. Manifest Committer OOM on abfs (#5519) 2023-06-09 17:00:59 +01:00
hadoop-azure-datalake HADOOP-18641. Cloud connector dependency and LICENSE fixup. (#5429) 2023-02-28 10:48:54 +00:00
hadoop-benchmark HADOOP-18507. VectorIO FileRange type to support a "reference" field (#5076) 2022-10-31 21:12:13 +00:00
hadoop-datajoin HADOOP-18131. Upgrade maven enforcer plugin and relevant dependencies (#4000) 2022-03-08 17:27:04 +09:00
hadoop-distcp Revert "HADOOP-18207. Introduce hadoop-logging module (#5503)" 2023-06-05 09:34:40 +05:30
hadoop-dynamometer HADOOP-18359. Update commons-cli from 1.2 to 1.5. (#5095). Contributed by Shilun Fan. 2023-05-10 01:42:12 +05:30
hadoop-extras HADOOP-18131. Upgrade maven enforcer plugin and relevant dependencies (#4000) 2022-03-08 17:27:04 +09:00
hadoop-federation-balance HDFS-16256. Minor fix in HDFS Fedbalance document (#4192) 2022-05-02 08:08:12 +08:00
hadoop-fs2img HADOOP-18131. Upgrade maven enforcer plugin and relevant dependencies (#4000) 2022-03-08 17:27:04 +09:00
hadoop-gridmix HADOOP-18131. Upgrade maven enforcer plugin and relevant dependencies (#4000) 2022-03-08 17:27:04 +09:00
hadoop-kafka HADOOP-17753. Keep restrict-imports-enforcer-rule for Guava Lists in top level hadoop-main pom (#3087) 2021-06-11 12:15:52 +09:00
hadoop-openstack HADOOP-18442. Remove openstack support (#4855) 2022-10-06 11:49:38 +01:00
hadoop-pipes Preparing for 3.4.0 development 2020-03-29 23:24:25 +05:30
hadoop-resourceestimator HADOOP-15983. Use jersey-json that is built to use jackson2 (#3988) 2022-04-28 14:18:19 +09:00
hadoop-rumen Revert "HADOOP-18207. Introduce hadoop-logging module (#5503)" 2023-06-05 09:34:40 +05:30
hadoop-sls YARN-10680. Revisit try blocks without catch blocks but having finally blocks. Contributed by Susheel Gupta 2022-10-15 21:51:08 +02:00
hadoop-streaming HADOOP-18359. Update commons-cli from 1.2 to 1.5. (#5095). Contributed by Shilun Fan. 2023-05-10 01:42:12 +05:30
hadoop-tools-dist HADOOP-18442. Remove openstack support (#4855) 2022-10-06 11:49:38 +01:00
pom.xml HADOOP-11867. Add a high-performance vectored read API. (#3904) 2022-06-22 17:29:32 +01:00