From 6dffad028e7beef34937df031f7293d752168917 Mon Sep 17 00:00:00 2001
From: Takanobu Asanuma
Date: Thu, 20 Jun 2019 20:16:48 -0700
Subject: [PATCH] HDFS-12564. Add the documents of swebhdfs configurations on
 the client side. Contributed by Takanobu Asanuma.

Signed-off-by: Wei-Chiu Chuang
(cherry picked from commit 98d20656433cdec76c2108d24ff3b935657c1e80)
---
 .../src/site/markdown/ServerSetup.md.vm      |  3 +-
 .../hadoop-hdfs/src/site/markdown/WebHDFS.md | 51 +++++++++++++++++++
 .../src/site/markdown/DistCp.md.vm           |  4 +-
 3 files changed, 56 insertions(+), 2 deletions(-)

diff --git a/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/markdown/ServerSetup.md.vm b/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/markdown/ServerSetup.md.vm
index 072c067b5d8..2d0a5b8cd2e 100644
--- a/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/markdown/ServerSetup.md.vm
+++ b/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/markdown/ServerSetup.md.vm
@@ -114,7 +114,7 @@ Configure `etc/hadoop/ssl-server.xml` with proper values, for example:
 ```
 
 The SSL passwords can be secured by a credential provider. See
-[Credential Provider API](../../../hadoop-project-dist/hadoop-common/CredentialProviderAPI.html).
+[Credential Provider API](../hadoop-project-dist/hadoop-common/CredentialProviderAPI.html).
 
 You need to create an SSL certificate for the HttpFS server. As the `httpfs`
 Unix user, using the Java `keytool` command to create the SSL certificate:
@@ -131,6 +131,7 @@ The answer to "What is your first and last name?" (i.e. "CN") must be the hostna
 Start HttpFS. It should work over HTTPS.
 
 Using the Hadoop `FileSystem` API or the Hadoop FS shell, use the `swebhdfs://` scheme. Make sure the JVM is picking up the truststore containing the public key of the SSL certificate if using a self-signed certificate.
+For more information about the client-side settings, see [SSL Configurations for SWebHDFS](../hadoop-project-dist/hadoop-hdfs/WebHDFS.html#SSL_Configurations_for_SWebHDFS).
 
 NOTE: Some old SSL clients may use weak ciphers that are not supported by the HttpFS server. It is recommended to upgrade the SSL client.
 
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/WebHDFS.md b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/WebHDFS.md
index 1c696344ebe..d3669b88ad1 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/WebHDFS.md
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/WebHDFS.md
@@ -101,6 +101,7 @@ In the REST API, the prefix "`/webhdfs/v1`" is inserted in the path and a query
 
         swebhdfs://<HOST>:<HTTP_PORT>/<PATH>
 
+See also: [SSL Configurations for SWebHDFS](#SSL_Configurations_for_SWebHDFS)
 
 ### HDFS Configuration Options
 
@@ -161,6 +162,56 @@ The following properties control OAuth2 authentication.
 | `dfs.webhdfs.oauth2.refresh.token.expires.ms.since.epoch` | (required if using ConfRefreshTokenBasedAccessTokenProvider) Access token expiration measured in milliseconds since Jan 1, 1970. *Note this is a different value than provided by OAuth providers and has been munged as described in interface to be suitable for a client application* |
 | `dfs.webhdfs.oauth2.credential` | (required if using ConfCredentialBasedAccessTokenProvider). Credential used to obtain initial and subsequent access tokens. |
 
+SSL Configurations for SWebHDFS
+-------------------------------------------------------
+
+To use SWebHDFS FileSystem (i.e. using the swebhdfs protocol), an SSL configuration
+file needs to be specified on the client side. This must specify 3 parameters:
+
+| SSL property | Description |
+|:---- |:---- |
+| `ssl.client.truststore.location` | The local-filesystem location of the trust-store file, containing the certificate for the NameNode. |
+| `ssl.client.truststore.type` | (Optional) The format of the trust-store file. |
+| `ssl.client.truststore.password` | (Optional) Password for the trust-store file. |
+
+The following is an example SSL configuration file (**ssl-client.xml**):
+
+```xml
+<configuration>
+
+  <property>
+    <name>ssl.client.truststore.location</name>
+    <value>/work/keystore.jks</value>
+    <description>Truststore to be used by clients. Must be specified.</description>
+  </property>
+
+  <property>
+    <name>ssl.client.truststore.password</name>
+    <value>changeme</value>
+    <description>Optional. Default value is "".</description>
+  </property>
+
+  <property>
+    <name>ssl.client.truststore.type</name>
+    <value>jks</value>
+    <description>Optional. Default value is "jks".</description>
+  </property>
+
+</configuration>
+```
+
+The SSL configuration file must be in the class-path of the client program and the filename needs to be specified in **core-site.xml**:
+
+```xml
+<property>
+  <name>hadoop.ssl.client.conf</name>
+  <value>ssl-client.xml</value>
+  <description>
+    Resource file from which ssl client keystore information will be extracted.
+    This file is looked up in the classpath, typically it should be in Hadoop
+    conf/ directory. Default value is "ssl-client.xml".
+  </description>
+</property>
+```
+
 Proxy Users
 -----------
 
diff --git a/hadoop-tools/hadoop-distcp/src/site/markdown/DistCp.md.vm b/hadoop-tools/hadoop-distcp/src/site/markdown/DistCp.md.vm
index c678090908d..a5c40115aed 100644
--- a/hadoop-tools/hadoop-distcp/src/site/markdown/DistCp.md.vm
+++ b/hadoop-tools/hadoop-distcp/src/site/markdown/DistCp.md.vm
@@ -542,10 +542,12 @@ $H3 Copying Between Versions of HDFS
   HftpFileSystem, as webhdfs is available for both read and write operations,
   DistCp can be run on both source and destination cluster.
   Remote cluster is specified as `webhdfs://<namenode_hostname>:<http_port>`.
-  (Use the "`swebhdfs://`" scheme when webhdfs is secured with SSL).
   When copying between same major versions of Hadoop cluster (e.g. between 2.X
   and 2.X), use hdfs protocol for better performance.
 
+$H3 Secure Copy over the wire with distcp
+
+  Use the "`swebhdfs://`" scheme when webhdfs is secured with SSL. For more information, see [SSL Configurations for SWebHDFS](../hadoop-project-dist/hadoop-hdfs/WebHDFS.html#SSL_Configurations_for_SWebHDFS).
+
 $H3 MapReduce and other side-effects
 
 As has been mentioned in the preceding, should a map fail to copy one of its
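
For illustration of the client-side flow the new WebHDFS.md section describes: a `Configuration` loaded on the client picks up `core-site.xml` from the classpath, `hadoop.ssl.client.conf` there names `ssl-client.xml`, and the truststore settings in that file are used when a `swebhdfs://` URI is opened through the Hadoop `FileSystem` API. The sketch below is only a minimal illustration of that usage; the class name, hostname, and port are hypothetical placeholders, and it assumes both `core-site.xml` and `ssl-client.xml` are on the client classpath.

```java
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SWebHdfsListing {

  public static void main(String[] args) throws Exception {
    // Loads core-default.xml and core-site.xml from the classpath;
    // hadoop.ssl.client.conf in core-site.xml points at ssl-client.xml,
    // which names the truststore holding the NameNode certificate.
    Configuration conf = new Configuration();

    // Placeholder host and port; use the NameNode's actual HTTPS address.
    URI namenode = URI.create("swebhdfs://namenode.example.com:9871/");

    try (FileSystem fs = FileSystem.get(namenode, conf)) {
      for (FileStatus status : fs.listStatus(new Path("/"))) {
        System.out.println(status.getPath());
      }
    }
  }
}
```

The shell equivalent is `hadoop fs -ls swebhdfs://<HOST>:<HTTP_PORT>/`. In both cases, if the server certificate is self-signed, the truststore referenced by `ssl.client.truststore.location` must contain it, as noted in the HttpFS server setup text above.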