HDFS-9048. DistCp documentation is out-of-dated (Daisuke Kobayashi via iwasakims)

(cherry picked from commit 33a412e8a4)

 Conflicts:
	hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt

(cherry picked from commit 55f7ceb0db)
Masatake Iwasaki 2016-03-03 18:57:23 +09:00
parent 06d9a245fe
commit 08fc048018
2 changed files with 10 additions and 6 deletions

@@ -41,6 +41,9 @@ Release 2.7.3 - UNRELEASED
 HDFS-8791. block ID-based DN storage layout can be very slow for datanode
 on ext4 (Chris Trezzo via kihwal)
+HDFS-9048. DistCp documentation is out-of-dated
+(Daisuke Kobayashi via iwasakims)
 OPTIMIZATIONS
 HDFS-8845. DiskChecker should not traverse the entire tree (Chang Li via

@@ -406,12 +406,13 @@ $H3 Map sizing
 $H3 Copying Between Versions of HDFS
-For copying between two different versions of Hadoop, one will usually use
-HftpFileSystem. This is a read-only FileSystem, so DistCp must be run on the
-destination cluster (more specifically, on NodeManagers that can write to the
-destination cluster). Each source is specified as
-`hftp://<dfs.http.address>/<path>` (the default `dfs.http.address` is
-`<namenode>:50070`).
+For copying between two different major versions of Hadoop (e.g. between 1.X
+and 2.X), one will usually use WebHdfsFileSystem. Unlike the previous
+HftpFileSystem, as webhdfs is available for both read and write operations,
+DistCp can be run on both source and destination cluster.
+Remote cluster is specified as `webhdfs://<namenode_hostname>:<http_port>`.
+When copying between same major versions of Hadoop cluster (e.g. between 2.X
+and 2.X), use hdfs protocol for better performance.
 $H3 MapReduce and other side-effects
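
The rule the new paragraph describes (webhdfs across major versions, hdfs within the same major version) can be sketched as a small helper that builds the remote URI and the `hadoop distcp` argument list. This is only an illustration, not part of the commit: the function names, the `example.com` hostnames, and the RPC port 8020 are assumptions, and 50070 is taken from the default NameNode HTTP address mentioned in the removed text.

```python
# Illustrative sketch only: helper names, hostnames and ports are hypothetical.

def remote_uri(local_major, remote_major, host, http_port, rpc_port, path):
    """Build the URI for the remote cluster following the updated guidance.

    Same major version  -> hdfs:// (native protocol, better performance).
    Different major     -> webhdfs:// (works for both read and write).
    """
    if local_major == remote_major:
        return f"hdfs://{host}:{rpc_port}{path}"
    return f"webhdfs://{host}:{http_port}{path}"


def distcp_argv(source_uri, dest_uri):
    """Return the argument list for a basic `hadoop distcp <src> <dst>` run."""
    return ["hadoop", "distcp", source_uri, dest_uri]


# Example: running on a 2.X cluster and pulling from a remote 1.X cluster.
src = remote_uri(2, 1, "nn1.example.com", 50070, 8020, "/data")
dst = "hdfs://nn2.example.com:8020/data"  # assumed local destination
print(" ".join(distcp_argv(src, dst)))
```

The argument list could be handed to `subprocess.run` on a host with the Hadoop client installed; the sketch only prints the command so it can be inspected first.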