From b4d11337c9271f29dd403d2393812f2ab6f35b35 Mon Sep 17 00:00:00 2001
From: Arpit Agarwal
Date: Tue, 2 Jan 2018 12:54:40 -0800
Subject: [PATCH] HDFS-12351. Explicitly describe the minimal number of DataNodes required to support an EC policy in EC document.. Contributed by Hanisha Koneru.

---
 .../hadoop-hdfs/src/site/markdown/HDFSErasureCoding.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSErasureCoding.md b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSErasureCoding.md
index 4459c9475f7..60fd3abf184 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSErasureCoding.md
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSErasureCoding.md
@@ -99,6 +99,10 @@ Deployment

 Encoding and decoding work consumes additional CPU on both HDFS clients and DataNodes.

+ Erasure coding requires a minimum of as many DataNodes in the cluster as
+ the configured EC stripe width. For EC policy RS (6,3), this means
+ a minimum of 9 DataNodes.
+
 Erasure coded files are also spread across racks for rack fault-tolerance. This means that when reading and writing striped files, most operations are off-rack. Network bisection bandwidth is thus very important.
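
The rule added by this patch is simple arithmetic: the stripe width is the number of data units plus the number of parity units, so RS (6,3) needs 6 + 3 = 9 DataNodes. The following is a minimal sketch of that calculation, not part of the patch or of Hadoop itself. The helper name `min_datanodes` is hypothetical, and the sketch assumes policy names follow the `CODEC-data-parity-cellsize` pattern (for example `RS-6-3-1024k`).

```python
def min_datanodes(policy_name: str) -> int:
    """Return the stripe width (data units + parity units) for an EC policy name.

    Assumes the name ends in '<data>-<parity>-<cellsize>', e.g. 'RS-6-3-1024k'
    or 'RS-LEGACY-6-3-1024k'. This helper is illustrative, not a Hadoop API.
    """
    parts = policy_name.split("-")
    if len(parts) < 4:
        raise ValueError(f"unexpected policy name: {policy_name!r}")
    # The last field is the cell size; the two before it are data and parity units.
    data_units = int(parts[-3])
    parity_units = int(parts[-2])
    return data_units + parity_units


if __name__ == "__main__":
    for name in ("RS-6-3-1024k", "RS-10-4-1024k", "XOR-2-1-1024k"):
        print(f"{name}: at least {min_datanodes(name)} DataNodes")
```

Under these assumptions the result is a lower bound on cluster size for the policy; it does not account for rack placement or for spare capacity to recover from DataNode failures.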