From 077bc4af3c63c3c40d1d3e312484189a0332c77c Mon Sep 17 00:00:00 2001
From: Konstantin Shvachko
Date: Fri, 28 Jan 2011 22:40:42 +0000
Subject: [PATCH] HADOOP-6812. Change documentation for correct placement of configuration variables: mapreduce.reduce.input.buffer.percent, mapreduce.task.io.sort.factor, mapreduce.task.io.sort.mb. Contributed by Chris Douglas.

git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1064917 13f79535-47bb-0310-9956-ffa450edef68
---
 CHANGES.txt | 5 ++++
 .../content/xdocs/cluster_setup.xml | 25 +++++++++++++------
 .../content/xdocs/hod_scheduler.xml | 2 +-
 3 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index f41d6f862fb..490a417d740 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -268,6 +268,11 @@ Release 0.22.0 - Unreleased
 
     HADOOP-7110. Implement chmod with JNI. (todd)
 
+    HADOOP-6812. Change documentation for correct placement of configuration
+    variables: mapreduce.reduce.input.buffer.percent,
+    mapreduce.task.io.sort.factor, mapreduce.task.io.sort.mb
+    (Chris Douglas via shv)
+
   OPTIMIZATIONS
 
     HADOOP-6884. Add LOG.isDebugEnabled() guard for each LOG.debug(..).
diff --git a/src/docs/src/documentation/content/xdocs/cluster_setup.xml b/src/docs/src/documentation/content/xdocs/cluster_setup.xml
index ddb75062d2e..8f51c38ab3c 100644
--- a/src/docs/src/documentation/content/xdocs/cluster_setup.xml
+++ b/src/docs/src/documentation/content/xdocs/cluster_setup.xml
@@ -666,22 +666,33 @@
- conf/core-site.xml
- fs.inmemory.size.mb
- 200
+ conf/mapred-site.xml
+ mapreduce.reduce.shuffle.input.buffer.percent
+ 0.80
- Larger amount of memory allocated for the in-memory
- file-system used to merge map-outputs at the reduces.
+ Larger amount of memory allocated for merging map output
+ in memory during the shuffle. Expressed as a fraction of
+ the total heap.
- conf/core-site.xml
+ conf/mapred-site.xml
+ mapreduce.reduce.input.buffer.percent
+ 0.80
+
+ Larger amount of memory allocated for retaining map output
+ in memory during the reduce. Expressed as a fraction of
+ the total heap.
+
+
+
+ conf/mapred-site.xml
  mapreduce.task.io.sort.factor
  100
  More streams merged at once while sorting files.
- conf/core-site.xml
+ conf/mapred-site.xml
  mapreduce.task.io.sort.mb
  200
  Higher memory-limit while sorting data.
diff --git a/src/docs/src/documentation/content/xdocs/hod_scheduler.xml b/src/docs/src/documentation/content/xdocs/hod_scheduler.xml
index db0e94bf2d6..bc262d669f1 100644
--- a/src/docs/src/documentation/content/xdocs/hod_scheduler.xml
+++ b/src/docs/src/documentation/content/xdocs/hod_scheduler.xml
@@ -250,7 +250,7 @@ info_port = Port number of the HDFS NameNode web UI

For example:

 server-params = mapred.reduce.parallel.copies=20,io.sort.factor=100,io.sort.mb=128,io.file.buffer.size=131072
-final-server-params = mapred.child.java.opts=-Xmx512m,dfs.block.size=134217728,fs.inmemory.size.mb=128
+final-server-params = mapred.child.java.opts=-Xmx512m,dfs.block.size=134217728

In order to provide the options from command line, you can use the following syntax:

For configuring the MapReduce daemons use: