HDFS-9048. DistCp documentation is out-of-dated (Daisuke Kobayashi via iwasakims)

Masatake Iwasaki 2016-03-03 18:57:23 +09:00
parent eb864d35d6
commit 33a412e8a4
2 changed files with 10 additions and 6 deletions

@@ -2916,6 +2916,9 @@ Release 2.7.3 - UNRELEASED
 HDFS-8791. block ID-based DN storage layout can be very slow for datanode
 on ext4 (Chris Trezzo via kihwal)
+HDFS-9048. DistCp documentation is out-of-dated
+(Daisuke Kobayashi via iwasakims)
 OPTIMIZATIONS
 BUG FIXES

@@ -412,12 +412,13 @@ $H3 Map sizing
 $H3 Copying Between Versions of HDFS
-For copying between two different versions of Hadoop, one will usually use
-HftpFileSystem. This is a read-only FileSystem, so DistCp must be run on the
-destination cluster (more specifically, on NodeManagers that can write to the
-destination cluster). Each source is specified as
-`hftp://<dfs.http.address>/<path>` (the default `dfs.http.address` is
-`<namenode>:50070`).
+For copying between two different major versions of Hadoop (e.g. between 1.X
+and 2.X), one will usually use WebHdfsFileSystem. Unlike the previous
+HftpFileSystem, as webhdfs is available for both read and write operations,
+DistCp can be run on both source and destination cluster.
+Remote cluster is specified as `webhdfs://<namenode_hostname>:<http_port>`.
+When copying between same major versions of Hadoop cluster (e.g. between 2.X
+and 2.X), use hdfs protocol for better performance.
 $H3 MapReduce and other side-effects
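
The updated guidance in the documentation hunk above could translate into command lines like the following. This is only a sketch: the hostnames, paths, and the `distcp_uri` helper are hypothetical, the HTTP port 50070 is the default quoted in the removed text, and the RPC port 8020 is an assumption that depends on each cluster's configuration. The script only builds and prints the commands, so it can be read or run without a Hadoop installation.

```shell
#!/bin/sh
# Hypothetical NameNode hosts and ports; substitute the real values.
SRC_NN="nn1.example.com"
DST_NN="nn2.example.com"
SRC_HTTP_PORT=50070   # NameNode HTTP port (default cited in the old hftp text)
SRC_RPC_PORT=8020     # assumed NameNode RPC port on the source cluster
DST_RPC_PORT=8020     # assumed NameNode RPC port on the destination cluster

# Hypothetical helper: assemble <scheme>://<host>:<port><path>.
distcp_uri() {
  printf '%s://%s:%s%s' "$1" "$2" "$3" "$4"
}

# Different major versions (e.g. 1.X to 2.X): address the remote cluster via webhdfs.
CROSS_VERSION_CMD="hadoop distcp $(distcp_uri webhdfs "$SRC_NN" "$SRC_HTTP_PORT" /src/data) $(distcp_uri hdfs "$DST_NN" "$DST_RPC_PORT" /dst/data)"

# Same major version (e.g. 2.X to 2.X): use the hdfs protocol on both sides.
SAME_VERSION_CMD="hadoop distcp $(distcp_uri hdfs "$SRC_NN" "$SRC_RPC_PORT" /src/data) $(distcp_uri hdfs "$DST_NN" "$DST_RPC_PORT" /dst/data)"

# Print the commands for review instead of executing them.
echo "$CROSS_VERSION_CMD"
echo "$SAME_VERSION_CMD"
```

In the cross-version case the job here is assumed to run on the destination cluster, so only the source is addressed through `webhdfs://`; since webhdfs supports writes as well, the roles could be reversed.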