HDFS-9048. DistCp documentation is out-of-dated (Daisuke Kobayashi via iwasakims)
(cherry picked from commit 33a412e8a4)
Conflicts:
hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
parent 85a62dcb5b
commit 55f7ceb0db
@@ -1866,6 +1866,8 @@ Release 2.7.3 - UNRELEASED
 HDFS-8791. block ID-based DN storage layout can be very slow for datanode
 on ext4 (Chris Trezzo via kihwal)

+HDFS-9048. DistCp documentation is out-of-dated
+(Daisuke Kobayashi via iwasakims)

 OPTIMIZATIONS

@@ -412,12 +412,13 @@ $H3 Map sizing

 $H3 Copying Between Versions of HDFS

-For copying between two different versions of Hadoop, one will usually use
-HftpFileSystem. This is a read-only FileSystem, so DistCp must be run on the
-destination cluster (more specifically, on NodeManagers that can write to the
-destination cluster). Each source is specified as
-`hftp://<dfs.http.address>/<path>` (the default `dfs.http.address` is
-`<namenode>:50070`).
+For copying between two different major versions of Hadoop (e.g. between 1.X
+and 2.X), one will usually use WebHdfsFileSystem. Unlike the previous
+HftpFileSystem, as webhdfs is available for both read and write operations,
+DistCp can be run on both source and destination cluster.
+Remote cluster is specified as `webhdfs://<namenode_hostname>:<http_port>`.
+When copying between same major versions of Hadoop cluster (e.g. between 2.X
+and 2.X), use hdfs protocol for better performance.

 $H3 MapReduce and other side-effects

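The updated paragraph in the second hunk picks a URI scheme by version relationship: `webhdfs://` across major versions, `hdfs://` within the same major version. A minimal shell sketch of that choice follows. It only prints a `hadoop distcp <source> <destination>` command line and does not run it. The helper names `distcp_source_uri` and `distcp_command`, the host names, the paths, and the RPC port 8020 are illustrative assumptions, not part of this commit; 50070 is the default NameNode HTTP port quoted in the removed paragraph.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Hadoop): they only build and print a
# DistCp command line so it can be reviewed before being run by hand.

# Build a source URI following the documented rule:
#   cross -> different major versions (e.g. 1.X to 2.X): webhdfs://
#   same  -> same major version (e.g. 2.X to 2.X): hdfs://
# $1 = mode, $2 = NameNode host, $3 = port, $4 = absolute path
distcp_source_uri() {
    case "$1" in
        cross) printf 'webhdfs://%s:%s%s\n' "$2" "$3" "$4" ;;
        same)  printf 'hdfs://%s:%s%s\n' "$2" "$3" "$4" ;;
        *)     echo "unknown mode: $1" >&2; return 1 ;;
    esac
}

# Assemble the command line: hadoop distcp <source> <destination>
# $1 = source URI, $2 = destination URI
distcp_command() {
    printf 'hadoop distcp %s %s\n' "$1" "$2"
}

# Placeholder hosts and paths (assumptions for illustration only).
src=$(distcp_source_uri cross nn1.example.com 50070 /data/logs)
cmd=$(distcp_command "$src" "hdfs://nn2.example.com:8020/data/logs")
printf '%s\n' "$cmd"
```

Because webhdfs supports both reads and writes, the printed command could be run from either cluster; with the older hftp scheme it had to run on the destination side.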