HDFS-12564. Add the documents of swebhdfs configurations on the client side. Contributed by Takanobu Asanuma.

Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
Takanobu Asanuma 2019-06-20 20:16:48 -07:00 committed by Wei-Chiu Chuang
parent 840d02ca5b
commit 98d2065643
3 changed files with 56 additions and 2 deletions


@@ -114,7 +114,7 @@ Configure `etc/hadoop/ssl-server.xml` with proper values, for example:
The SSL passwords can be secured by a credential provider. See
[Credential Provider API](../hadoop-project-dist/hadoop-common/CredentialProviderAPI.html).
You need to create an SSL certificate for the HttpFS server. As the `httpfs` Unix user, use the Java `keytool` command to create the SSL certificate:
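A typical invocation looks like the following (the alias is illustrative; answer the prompts as described below):

```
keytool -genkey -alias jetty -keyalg RSA
```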
@@ -131,6 +131,7 @@ The answer to "What is your first and last name?" (i.e. "CN") must be the hostna
Start HttpFS. It should work over HTTPS.
Using the Hadoop `FileSystem` API or the Hadoop FS shell, use the `swebhdfs://` scheme. Make sure the JVM is picking up the truststore containing the public key of the SSL certificate if using a self-signed certificate.
For more information about the client side settings, see [SSL Configurations for SWebHDFS](../hadoop-project-dist/hadoop-hdfs/WebHDFS.html#SSL_Configurations_for_SWebHDFS).
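With a self-signed certificate, one common way to make the client JVM pick up the truststore is through the standard JSSE system properties, e.g. via `HADOOP_OPTS` (the path, password, host, and port below are placeholders):

```
HADOOP_OPTS="-Djavax.net.ssl.trustStore=/path/to/truststore.jks \
  -Djavax.net.ssl.trustStorePassword=<password>" \
  hadoop fs -ls swebhdfs://<HOST>:<HTTP_PORT>/
```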
NOTE: Some old SSL clients may use weak ciphers that are not supported by the HttpFS server. It is recommended to upgrade the SSL client.


@@ -104,6 +104,7 @@ In the REST API, the prefix "`/webhdfs/v1`" is inserted in the path and a query
swebhdfs://<HOST>:<HTTP_PORT>/<PATH>
See also: [SSL Configurations for SWebHDFS](#SSL_Configurations_for_SWebHDFS)
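As described above, an `swebhdfs://` URI maps onto the HTTPS REST endpoint by inserting the `/webhdfs/v1` prefix into the path. The following Python sketch illustrates that mapping (the hostname and port are placeholders, not real endpoints):

```python
from urllib.parse import urlparse

def swebhdfs_to_rest_url(uri: str) -> str:
    """Translate an swebhdfs:// filesystem URI into the corresponding
    HTTPS REST URL by inserting the /webhdfs/v1 prefix into the path."""
    parts = urlparse(uri)
    if parts.scheme != "swebhdfs":
        raise ValueError("expected an swebhdfs:// URI")
    return f"https://{parts.netloc}/webhdfs/v1{parts.path}"

print(swebhdfs_to_rest_url("swebhdfs://namenode.example.com:9871/user/alice/file"))
# https://namenode.example.com:9871/webhdfs/v1/user/alice/file
```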
### HDFS Configuration Options
@@ -164,6 +165,56 @@ The following properties control OAuth2 authentication.
| `dfs.webhdfs.oauth2.refresh.token.expires.ms.since.epoch` | (required if using ConfRefreshTokenBasedAccessTokenProvider) Access token expiration measured in milliseconds since Jan 1, 1970. *Note this is a different value than provided by OAuth providers and has been munged as described in the interface documentation to be suitable for a client application* |
| `dfs.webhdfs.oauth2.credential` | (required if using ConfCredentialBasedAccessTokenProvider). Credential used to obtain initial and subsequent access tokens. |
SSL Configurations for SWebHDFS
-------------------------------------------------------
To use the SWebHDFS FileSystem (i.e. the swebhdfs protocol), an SSL configuration
file needs to be specified on the client side. It must define the following three parameters:
| SSL property | Description |
|:---- |:---- |
| `ssl.client.truststore.location` | The local-filesystem location of the trust-store file, containing the certificate for the NameNode. |
| `ssl.client.truststore.type` | (Optional) The format of the trust-store file. |
| `ssl.client.truststore.password` | (Optional) Password for the trust-store file. |
The following is an example SSL configuration file (**ssl-client.xml**):
```xml
<configuration>
<property>
<name>ssl.client.truststore.location</name>
<value>/work/keystore.jks</value>
<description>Truststore to be used by clients. Must be specified.</description>
</property>
<property>
<name>ssl.client.truststore.password</name>
<value>changeme</value>
<description>Optional. Default value is "".</description>
</property>
<property>
<name>ssl.client.truststore.type</name>
<value>jks</value>
<description>Optional. Default value is "jks".</description>
</property>
</configuration>
```
The SSL configuration file must be in the class-path of the client program and the filename needs to be specified in **core-site.xml**:
```xml
<property>
<name>hadoop.ssl.client.conf</name>
<value>ssl-client.xml</value>
<description>
Resource file from which ssl client keystore information will be extracted.
This file is looked up in the classpath, typically it should be in Hadoop
conf/ directory. Default value is "ssl-client.xml".
</description>
</property>
```
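On the client side, Hadoop's `Configuration` machinery loads this file from the classpath and resolves the name/value pairs. Purely as an illustration of the file format (not how Hadoop itself should be invoked), the following Python sketch shows that resolution, including the documented defaults for the optional properties:

```python
import xml.etree.ElementTree as ET

SSL_CLIENT_XML = """
<configuration>
  <property>
    <name>ssl.client.truststore.location</name>
    <value>/work/keystore.jks</value>
  </property>
  <property>
    <name>ssl.client.truststore.password</name>
    <value>changeme</value>
  </property>
</configuration>
"""

def load_properties(xml_text):
    """Parse Hadoop-style <configuration> XML into a name -> value dict."""
    root = ET.fromstring(xml_text)
    return {prop.findtext("name"): prop.findtext("value")
            for prop in root.findall("property")}

props = load_properties(SSL_CLIENT_XML)
location = props["ssl.client.truststore.location"]           # required
store_type = props.get("ssl.client.truststore.type", "jks")  # optional, default "jks"
password = props.get("ssl.client.truststore.password", "")   # optional, default ""
print(location, store_type)
# /work/keystore.jks jks
```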
Proxy Users
-----------


@@ -542,10 +542,12 @@ $H3 Copying Between Versions of HDFS
HftpFileSystem; as webhdfs is available for both read and write operations,
DistCp can be run on both the source and destination clusters.
Remote cluster is specified as `webhdfs://<namenode_hostname>:<http_port>`.
(Use the "`swebhdfs://`" scheme when webhdfs is secured with SSL).
When copying between the same major version of Hadoop clusters (e.g. between 2.X
and 2.X), use the hdfs protocol for better performance.
$H3 Secure Copy over the wire with distcp
Use the "`swebhdfs://`" scheme when webhdfs is secured with SSL. For more information see [SSL Configurations for SWebHDFS](../hadoop-project-dist/hadoop-hdfs/WebHDFS.html#SSL_Configurations_for_SWebHDFS).
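For example, a secure DistCp invocation might look like the following (hostnames and ports are placeholders):

```
hadoop distcp swebhdfs://<source_namenode>:<https_port>/src \
    swebhdfs://<destination_namenode>:<https_port>/dest
```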
$H3 MapReduce and other side-effects
As has been mentioned in the preceding, should a map fail to copy one of its