svn merge -c 1299045. Merge MAPREDUCE-3991 into branch-0.23. (harsh)

git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.23@1299053 13f79535-47bb-0310-9956-ffa450edef68
2012-03-09 21:21:30 +00:00 · 95e8912bc0
parent 07e7ad5e86
commit 95e8912bc0
2 changed files with 3 additions and 1 deletions
--- a/hadoop-mapreduce-project/CHANGES.txt
+++ b/hadoop-mapreduce-project/CHANGES.txt
@@ -28,6 +28,8 @@ Release 0.23.3 - UNRELEASED
 	MAPREDUCE-3935. Annotate Counters.Counter and Counters.Group as @Public.
    (tomwhite)

+    MAPREDUCE-3991. Streaming FAQ has some wrong instructions about input files splitting. (harsh)
+
  OPTIMIZATIONS

  BUG FIXES
--- a/hadoop-mapreduce-project/src/docs/src/documentation/content/xdocs/streaming.xml
+++ b/hadoop-mapreduce-project/src/docs/src/documentation/content/xdocs/streaming.xml
@@ -750,7 +750,7 @@ You can use Hadoop Streaming to do this.
 As an example, consider the problem of zipping (compressing) a set of files across the hadoop cluster. You can achieve this using either of these methods:
 </p><ol>
 <li> Hadoop Streaming and custom mapper script:<ul>
-  <li> Generate a file containing the full HDFS path of the input files. Each map task would get one file name as input.</li>
+  <li> Generate files listing the full HDFS paths of the files to be processed. Each list file is the input for an individual map task which processes the files listed.</li>
  <li> Create a mapper script which, given a filename, will get the file to local disk, gzip the file and put it back in the desired output directory</li>
 </ul></li>
 <li>The existing Hadoop Framework:<ul>