HDFS-10860. Switch HttpFS from Tomcat to Jetty. Contributed by John Zhuge.

This commit is contained in:
Xiao Chen 2017-02-06 13:33:32 -08:00
parent 7a8f3f237e
commit 69b23632c4
23 changed files with 733 additions and 702 deletions

View File

@ -21,6 +21,14 @@
</formats>
<includeBaseDirectory>false</includeBaseDirectory>
<fileSets>
<!-- Jar file -->
<fileSet>
<directory>target</directory>
<outputDirectory>/share/hadoop/hdfs</outputDirectory>
<includes>
<include>${project.artifactId}-${project.version}.jar</include>
</includes>
</fileSet>
<!-- Configuration files -->
<fileSet>
<directory>${basedir}/src/main/conf</directory>
@ -41,7 +49,7 @@
<directory>${basedir}/src/main/libexec</directory>
<outputDirectory>/libexec</outputDirectory>
<includes>
<include>*</include>
<include>**/*</include>
</includes>
<fileMode>0755</fileMode>
</fileSet>
@ -51,4 +59,19 @@
<outputDirectory>/share/doc/hadoop/httpfs</outputDirectory>
</fileSet>
</fileSets>
<dependencySets>
<dependencySet>
<useProjectArtifact>false</useProjectArtifact>
<outputDirectory>/share/hadoop/hdfs/lib</outputDirectory>
<!-- Exclude hadoop artifacts. They will be found via HADOOP* env -->
<excludes>
<exclude>org.apache.hadoop:hadoop-common</exclude>
<exclude>org.apache.hadoop:hadoop-hdfs</exclude>
<!-- use slf4j from common to avoid multiple binding warnings -->
<exclude>org.slf4j:slf4j-api</exclude>
<exclude>org.slf4j:slf4j-log4j12</exclude>
<exclude>org.hsqldb:hsqldb</exclude>
</excludes>
</dependencySet>
</dependencySets>
</assembly>

View File

@ -263,11 +263,14 @@ Example:
Note that the setting is not permanent and will be reset when the daemon is restarted.
This command works by sending a HTTP/HTTPS request to the daemon's internal Jetty servlet, so it supports the following daemons:
* Common
* key management server
* HDFS
* name node
* secondary name node
* data node
* journal node
* HttpFS server
* YARN
* resource manager
* node manager

View File

@ -101,6 +101,7 @@ In summary, first, provision the credentials into a provider then configure the
|HDFS |DFSUtil leverages Configuration.getPassword method to use the credential provider API and/or fallback to the clear text value stored in ssl-server.xml.|TODO|
|YARN |WebAppUtils uptakes the use of the credential provider API through the new method on Configuration called getPassword. This provides an alternative to storing the passwords in clear text within the ssl-server.xml file while maintaining backward compatibility.|TODO|
|KMS |Uses HttpServer2.loadSSLConfiguration that leverages Configuration.getPassword to read SSL related credentials. They may be resolved through Credential Provider and/or from the clear text in the config when allowed.|[KMS](../../hadoop-kms/index.html)|
|HttpFS |Uses HttpServer2.loadSSLConfiguration that leverages Configuration.getPassword to read SSL related credentials. They may be resolved through Credential Provider and/or from the clear text in the config when allowed.|[HttpFS Server Setup](../../hadoop-hdfs-httpfs/ServerSetup.html)|
|AWS <br/> S3/S3A |Uses Configuration.getPassword to get the S3 credentials. They may be resolved through the credential provider API or from the config for backward compatibility.|[AWS S3/S3A Usage](../../hadoop-aws/tools/hadoop-aws/index.html)|
|Azure <br/> WASB |Uses Configuration.getPassword to get the WASB credentials. They may be resolved through the credential provider API or from the config for backward compatibility.|[Azure WASB Usage](../../hadoop-azure/index.html)|
|Azure <br/> ADLS |Uses Configuration.getPassword to get the ADLS credentials. They may be resolved through the credential provider API or from the config for backward compatibility.|[Azure ADLS Usage](../../hadoop-azure-datalake/index.html)|

View File

@ -196,7 +196,7 @@ AES offers the greatest cryptographic strength and the best performance. At this
Data transfer between Web-console and clients are protected by using SSL(HTTPS). SSL configuration is recommended but not required to configure Hadoop security with Kerberos.
To enable SSL for web console of HDFS daemons, set `dfs.http.policy` to either `HTTPS_ONLY` or `HTTP_AND_HTTPS` in hdfs-site.xml.
Note that this does not affect KMS nor HttpFS, as they are implemented on top of Tomcat and do not respect this parameter. See [Hadoop KMS](../../hadoop-kms/index.html) and [Hadoop HDFS over HTTP - Server Setup](../../hadoop-hdfs-httpfs/ServerSetup.html) for instructions on enabling KMS over HTTPS and HttpFS over HTTPS, respectively.
Note KMS and HttpFS do not respect this parameter. See [Hadoop KMS](../../hadoop-kms/index.html) and [Hadoop HDFS over HTTP - Server Setup](../../hadoop-hdfs-httpfs/ServerSetup.html) for instructions on enabling KMS over HTTPS and HttpFS over HTTPS, respectively.
To enable SSL for web console of YARN daemons, set `yarn.http.policy` to `HTTPS_ONLY` in yarn-site.xml.

View File

@ -27,23 +27,18 @@
</parent>
<artifactId>hadoop-hdfs-httpfs</artifactId>
<version>3.0.0-alpha3-SNAPSHOT</version>
<packaging>war</packaging>
<packaging>jar</packaging>
<name>Apache Hadoop HttpFS</name>
<description>Apache Hadoop HttpFS</description>
<properties>
<httpfs.source.repository>REPO NOT AVAIL</httpfs.source.repository>
<httpfs.source.repository>REPO NOT AVAIL</httpfs.source.repository>
<httpfs.source.revision>REVISION NOT AVAIL</httpfs.source.revision>
<maven.build.timestamp.format>yyyy-MM-dd'T'HH:mm:ssZ</maven.build.timestamp.format>
<httpfs.build.timestamp>${maven.build.timestamp}</httpfs.build.timestamp>
<httpfs.tomcat.dist.dir>
${project.build.directory}/${project.artifactId}-${project.version}/share/hadoop/httpfs/tomcat
</httpfs.tomcat.dist.dir>
<kerberos.realm>LOCALHOST</kerberos.realm>
<test.exclude.kerberos.test>**/TestHttpFSWithKerberos.java</test.exclude.kerberos.test>
<tomcat.download.url>http://archive.apache.org/dist/tomcat/tomcat-6/v${tomcat.version}/bin/apache-tomcat-${tomcat.version}.tar.gz</tomcat.download.url>
</properties>
<dependencies>
@ -75,7 +70,6 @@
<dependency>
<groupId>javax.servlet</groupId>
<artifactId>javax.servlet-api</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>com.google.guava</groupId>
@ -90,7 +84,10 @@
<dependency>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-server</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-webapp</artifactId>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
@ -373,23 +370,6 @@
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-war-plugin</artifactId>
<executions>
<execution>
<id>default-war</id>
<phase>package</phase>
<goals>
<goal>war</goal>
</goals>
<configuration>
<warName>webhdfs</warName>
<webappDirectory>${project.build.directory}/webhdfs</webappDirectory>
</configuration>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>findbugs-maven-plugin</artifactId>
@ -490,79 +470,6 @@
</execution>
</executions>
</plugin>
<!-- Downloading Tomcat TAR.GZ, using downloads/ dir to avoid downloading over an over -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-antrun-plugin</artifactId>
<executions>
<execution>
<id>dist</id>
<goals>
<goal>run</goal>
</goals>
<phase>package</phase>
<configuration>
<target>
<mkdir dir="downloads"/>
<get
src="${tomcat.download.url}"
dest="downloads/apache-tomcat-${tomcat.version}.tar.gz" verbose="true" skipexisting="true"/>
<delete dir="${project.build.directory}/tomcat.exp"/>
<mkdir dir="${project.build.directory}/tomcat.exp"/>
<!-- Using Unix script to preserve file permissions -->
<echo file="${project.build.directory}/tomcat-untar.sh">
cd "${project.build.directory}/tomcat.exp"
gzip -cd ../../downloads/apache-tomcat-${tomcat.version}.tar.gz | tar xf -
</echo>
<exec executable="${shell-executable}" dir="${project.build.directory}" failonerror="true">
<arg line="./tomcat-untar.sh"/>
</exec>
<move file="${project.build.directory}/tomcat.exp/apache-tomcat-${tomcat.version}"
tofile="${httpfs.tomcat.dist.dir}"/>
<delete dir="${project.build.directory}/tomcat.exp"/>
<delete dir="${httpfs.tomcat.dist.dir}/webapps"/>
<mkdir dir="${httpfs.tomcat.dist.dir}/webapps"/>
<delete file="${httpfs.tomcat.dist.dir}/conf/server.xml"/>
<copy file="${basedir}/src/main/tomcat/server.xml"
toDir="${httpfs.tomcat.dist.dir}/conf"/>
<delete file="${httpfs.tomcat.dist.dir}/conf/ssl-server.xml"/>
<copy file="${basedir}/src/main/tomcat/ssl-server.xml.conf"
toDir="${httpfs.tomcat.dist.dir}/conf"/>
<delete file="${httpfs.tomcat.dist.dir}/conf/logging.properties"/>
<copy file="${basedir}/src/main/tomcat/logging.properties"
toDir="${httpfs.tomcat.dist.dir}/conf"/>
<copy toDir="${httpfs.tomcat.dist.dir}/webapps/ROOT">
<fileset dir="${basedir}/src/main/tomcat/ROOT"/>
</copy>
<copy toDir="${httpfs.tomcat.dist.dir}/webapps/webhdfs">
<fileset dir="${project.build.directory}/webhdfs"/>
</copy>
</target>
</configuration>
</execution>
<execution>
<id>tar</id>
<phase>package</phase>
<goals>
<goal>run</goal>
</goals>
<configuration>
<target if="tar">
<!-- Using Unix script to preserve symlinks -->
<echo file="${project.build.directory}/dist-maketar.sh">
cd "${project.build.directory}"
tar cf - ${project.artifactId}-${project.version} | gzip > ${project.artifactId}-${project.version}.tar.gz
</echo>
<exec executable="${shell-executable}" dir="${project.build.directory}" failonerror="true">
<arg line="./dist-maketar.sh"/>
</exec>
</target>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</build>
</profile>

View File

@ -18,6 +18,14 @@
# hadoop-env.sh is read prior to this file.
#
# HTTPFS config directory
#
# export HTTPFS_CONFIG=${HADOOP_CONF_DIR}
# HTTPFS log directory
#
# export HTTPFS_LOG=${HADOOP_LOG_DIR}
# HTTPFS temporary directory
#
# export HTTPFS_TEMP=${HADOOP_HOME}/temp
@ -26,11 +34,7 @@
#
# export HTTPFS_HTTP_PORT=14000
# The Admin port used by HTTPFS
#
# export HTTPFS_ADMIN_PORT=$((HTTPFS_HTTP_PORT + 1))
# The maximum number of Tomcat handler threads
# The maximum number of HTTP handler threads
#
# export HTTPFS_MAX_THREADS=1000
@ -38,39 +42,18 @@
#
# export HTTPFS_HTTP_HOSTNAME=$(hostname -f)
# The maximum size of Tomcat HTTP header
# The maximum size of HTTP header
#
# export HTTPFS_MAX_HTTP_HEADER_SIZE=65536
# Whether SSL is enabled
#
# export HTTPFS_SSL_ENABLED=false
# The location of the SSL keystore if using SSL
#
# export HTTPFS_SSL_KEYSTORE_FILE=${HOME}/.keystore
#
# The password of the SSL keystore if using SSL
#
# export HTTPFS_SSL_KEYSTORE_PASS=password
##
## Tomcat specific settings
##
#
# Location of tomcat
#
# export HTTPFS_CATALINA_HOME=${HADOOP_HOME}/share/hadoop/httpfs/tomcat
# Java System properties for HTTPFS should be specified in this variable.
# The java.library.path and hadoop.home.dir properties are automatically
# configured. In order to supplement java.library.path,
# one should add to the JAVA_LIBRARY_PATH env var.
#
# export CATALINA_OPTS=
# PID file
#
# export CATALINA_PID=${HADOOP_PID_DIR}/hadoop-${HADOOP_IDENT_STRING}-httpfs.pid
# Output file
#
# export CATALINA_OUT=${HTTPFS_LOG}/hadoop-${HADOOP_IDENT_STRING}-httpfs-${HOSTNAME}.out
# export HTTPFS_SSL_KEYSTORE_PASS=password

View File

@ -43,7 +43,7 @@ import java.util.Properties;
public class HttpFSAuthenticationFilter
extends DelegationTokenAuthenticationFilter {
private static final String CONF_PREFIX = "httpfs.authentication.";
static final String CONF_PREFIX = "httpfs.authentication.";
private static final String SIGNATURE_SECRET_FILE = SIGNATURE_SECRET + ".file";

View File

@ -0,0 +1,170 @@
/**
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.hadoop.fs.http.server;
import static org.apache.hadoop.util.StringUtils.startupShutdownMessage;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import org.apache.hadoop.classification.InterfaceAudience;
import org.apache.hadoop.conf.ConfigurationWithLogging;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.http.HttpServer2;
import org.apache.hadoop.security.authorize.AccessControlList;
import org.apache.hadoop.security.ssl.SSLFactory;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
/**
* The HttpFS web server.
*/
@InterfaceAudience.Private
public class HttpFSServerWebServer {
private static final Logger LOG =
LoggerFactory.getLogger(HttpFSServerWebServer.class);
private static final String HTTPFS_DEFAULT_XML = "httpfs-default.xml";
private static final String HTTPFS_SITE_XML = "httpfs-site.xml";
// HTTP properties
static final String HTTP_PORT_KEY = "hadoop.httpfs.http.port";
private static final int HTTP_PORT_DEFAULT = 14000;
static final String HTTP_HOST_KEY = "hadoop.httpfs.http.host";
private static final String HTTP_HOST_DEFAULT = "0.0.0.0";
// SSL properties
private static final String SSL_ENABLED_KEY = "hadoop.httpfs.ssl.enabled";
private static final boolean SSL_ENABLED_DEFAULT = false;
private static final String HTTP_ADMINS_KEY =
"hadoop.httpfs.http.administrators";
private static final String NAME = "webhdfs";
private static final String SERVLET_PATH = "/webhdfs";
static {
Configuration.addDefaultResource(HTTPFS_DEFAULT_XML);
Configuration.addDefaultResource(HTTPFS_SITE_XML);
}
private final HttpServer2 httpServer;
private final String scheme;
HttpFSServerWebServer(Configuration conf, Configuration sslConf) throws
Exception {
// Override configuration with deprecated environment variables.
deprecateEnv("HTTPFS_TEMP", conf, HttpServer2.HTTP_TEMP_DIR_KEY,
HTTPFS_SITE_XML);
deprecateEnv("HTTPFS_HTTP_PORT", conf, HTTP_PORT_KEY,
HTTPFS_SITE_XML);
deprecateEnv("HTTPFS_MAX_THREADS", conf,
HttpServer2.HTTP_MAX_THREADS_KEY, HTTPFS_SITE_XML);
deprecateEnv("HTTPFS_MAX_HTTP_HEADER_SIZE", conf,
HttpServer2.HTTP_MAX_REQUEST_HEADER_SIZE_KEY, HTTPFS_SITE_XML);
deprecateEnv("HTTPFS_MAX_HTTP_HEADER_SIZE", conf,
HttpServer2.HTTP_MAX_RESPONSE_HEADER_SIZE_KEY, HTTPFS_SITE_XML);
deprecateEnv("HTTPFS_SSL_ENABLED", conf, SSL_ENABLED_KEY,
HTTPFS_SITE_XML);
deprecateEnv("HTTPFS_SSL_KEYSTORE_FILE", sslConf,
SSLFactory.SSL_SERVER_KEYSTORE_LOCATION,
SSLFactory.SSL_SERVER_CONF_DEFAULT);
deprecateEnv("HTTPFS_SSL_KEYSTORE_PASS", sslConf,
SSLFactory.SSL_SERVER_KEYSTORE_PASSWORD,
SSLFactory.SSL_SERVER_CONF_DEFAULT);
boolean sslEnabled = conf.getBoolean(SSL_ENABLED_KEY,
SSL_ENABLED_DEFAULT);
scheme = sslEnabled ? HttpServer2.HTTPS_SCHEME : HttpServer2.HTTP_SCHEME;
String host = conf.get(HTTP_HOST_KEY, HTTP_HOST_DEFAULT);
int port = conf.getInt(HTTP_PORT_KEY, HTTP_PORT_DEFAULT);
URI endpoint = new URI(scheme, null, host, port, null, null, null);
httpServer = new HttpServer2.Builder()
.setName(NAME)
.setConf(conf)
.setSSLConf(sslConf)
.authFilterConfigurationPrefix(HttpFSAuthenticationFilter.CONF_PREFIX)
.setACL(new AccessControlList(conf.get(HTTP_ADMINS_KEY, " ")))
.addEndpoint(endpoint)
.build();
}
/**
* Load the deprecated environment variable into the configuration.
*
* @param varName the environment variable name
* @param conf the configuration
* @param propName the configuration property name
* @param confFile the configuration file name
*/
private static void deprecateEnv(String varName, Configuration conf,
String propName, String confFile) {
String value = System.getenv(varName);
if (value == null) {
return;
}
String propValue = conf.get(propName);
LOG.warn("Environment variable {} = '{}' is deprecated and overriding"
+ " property {} = '{}', please set the property in {} instead.",
varName, value, propName, propValue, confFile);
conf.set(propName, value, "environment variable " + varName);
}
public void start() throws IOException {
httpServer.start();
}
public void join() throws InterruptedException {
httpServer.join();
}
public void stop() throws Exception {
httpServer.stop();
}
public URL getUrl() {
InetSocketAddress addr = httpServer.getConnectorAddress(0);
if (null == addr) {
return null;
}
try {
return new URL(scheme, addr.getHostName(), addr.getPort(),
SERVLET_PATH);
} catch (MalformedURLException ex) {
throw new RuntimeException("It should never happen: " + ex.getMessage(),
ex);
}
}
public static void main(String[] args) throws Exception {
startupShutdownMessage(HttpFSServerWebServer.class, args, LOG);
Configuration conf = new ConfigurationWithLogging(
new Configuration(true));
Configuration sslConf = new ConfigurationWithLogging(
SSLFactory.readSSLConfiguration(conf, SSLFactory.Mode.SERVER));
HttpFSServerWebServer webServer =
new HttpFSServerWebServer(conf, sslConf);
webServer.start();
webServer.join();
}
}

View File

@ -84,7 +84,9 @@ public class MDCFilter implements Filter {
MDC.put("user", user);
}
MDC.put("method", ((HttpServletRequest) request).getMethod());
MDC.put("path", ((HttpServletRequest) request).getPathInfo());
if (((HttpServletRequest) request).getPathInfo() != null) {
MDC.put("path", ((HttpServletRequest) request).getPathInfo());
}
chain.doFilter(request, response);
} finally {
MDC.clear();

View File

@ -1,76 +0,0 @@
#!/usr/bin/env bash
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
function hadoop_subproject_init
{
local this
local binparent
local varlist
if [[ -z "${HADOOP_HTTPFS_ENV_PROCESSED}" ]]; then
if [[ -e "${HADOOP_CONF_DIR}/httpfs-env.sh" ]]; then
. "${HADOOP_CONF_DIR}/httpfs-env.sh"
export HADOOP_HTTPFS_ENV_PROCESSED=true
fi
fi
export HADOOP_CATALINA_PREFIX=httpfs
export HADOOP_CATALINA_TEMP="${HTTPFS_TEMP:-${HADOOP_HOME}/temp}"
hadoop_deprecate_envvar HTTPFS_CONFIG HADOOP_CONF_DIR
hadoop_deprecate_envvar HTTPFS_LOG HADOOP_LOG_DIR
export HADOOP_CATALINA_CONFIG="${HADOOP_CONF_DIR}"
export HADOOP_CATALINA_LOG="${HADOOP_LOG_DIR}"
export HTTPFS_HTTP_HOSTNAME=${HTTPFS_HTTP_HOSTNAME:-$(hostname -f)}
export HADOOP_CATALINA_HTTP_PORT="${HTTPFS_HTTP_PORT:-14000}"
export HADOOP_CATALINA_ADMIN_PORT="${HTTPFS_ADMIN_PORT:-$((HADOOP_CATALINA_HTTP_PORT+1))}"
export HADOOP_CATALINA_MAX_THREADS="${HTTPFS_MAX_THREADS:-150}"
export HADOOP_CATALINA_MAX_HTTP_HEADER_SIZE="${HTTPFS_MAX_HTTP_HEADER_SIZE:-65536}"
export HTTPFS_SSL_ENABLED=${HTTPFS_SSL_ENABLED:-false}
export HADOOP_CATALINA_SSL_KEYSTORE_FILE="${HTTPFS_SSL_KEYSTORE_FILE:-${HOME}/.keystore}"
export CATALINA_BASE="${CATALINA_BASE:-${HADOOP_HOME}/share/hadoop/httpfs/tomcat}"
export HADOOP_CATALINA_HOME="${HTTPFS_CATALINA_HOME:-${CATALINA_BASE}}"
export CATALINA_OUT="${CATALINA_OUT:-${HADOOP_LOG_DIR}/hadoop-${HADOOP_IDENT_STRING}-httpfs-${HOSTNAME}.out}"
export CATALINA_PID="${CATALINA_PID:-${HADOOP_PID_DIR}/hadoop-${HADOOP_IDENT_STRING}-httpfs.pid}"
if [[ -n "${HADOOP_SHELL_SCRIPT_DEBUG}" ]]; then
varlist=$(env | egrep '(^HTTPFS|^CATALINA)' | cut -f1 -d= | grep -v _PASS)
for i in ${varlist}; do
hadoop_debug "Setting ${i} to ${!i}"
done
fi
}
if [[ -n "${HADOOP_COMMON_HOME}" ]] &&
[[ -e "${HADOOP_COMMON_HOME}/libexec/hadoop-config.sh" ]]; then
. "${HADOOP_COMMON_HOME}/libexec/hadoop-config.sh"
elif [[ -e "${HADOOP_LIBEXEC_DIR}/hadoop-config.sh" ]]; then
. "${HADOOP_LIBEXEC_DIR}/hadoop-config.sh"
elif [[ -e "${HADOOP_HOME}/libexec/hadoop-config.sh" ]]; then
. "${HADOOP_HOME}/libexec/hadoop-config.sh"
else
echo "ERROR: Hadoop common not found." 2>&1
exit 1
fi

View File

@ -0,0 +1,67 @@
#!/usr/bin/env bash
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
if [[ "${HADOOP_SHELL_EXECNAME}" = hdfs ]]; then
hadoop_add_subcommand "httpfs" "run HttpFS server, the HDFS HTTP Gateway"
fi
## @description Command handler for httpfs subcommand
## @audience private
## @stability stable
## @replaceable no
function hdfs_subcommand_httpfs
{
if [[ -f "${HADOOP_CONF_DIR}/httpfs-env.sh" ]]; then
# shellcheck disable=SC1090
. "${HADOOP_CONF_DIR}/httpfs-env.sh"
fi
hadoop_deprecate_envvar HTTPFS_CONFIG HADOOP_CONF_DIR
hadoop_deprecate_envvar HTTPFS_LOG HADOOP_LOG_DIR
hadoop_using_envvar HTTPFS_HTTP_HOSTNAME
hadoop_using_envvar HTTPFS_HTTP_PORT
hadoop_using_envvar HTTPFS_MAX_HTTP_HEADER_SIZE
hadoop_using_envvar HTTPFS_MAX_THREADS
hadoop_using_envvar HTTPFS_SSL_ENABLED
hadoop_using_envvar HTTPFS_SSL_KEYSTORE_FILE
hadoop_using_envvar HTTPFS_TEMP
# shellcheck disable=SC2034
HADOOP_SUBCMD_SUPPORTDAEMONIZATION=true
# shellcheck disable=SC2034
HADOOP_CLASSNAME=org.apache.hadoop.fs.http.server.HttpFSServerWebServer
# shellcheck disable=SC2034
hadoop_add_param HADOOP_OPTS "-Dhttpfs.home.dir" \
"-Dhttpfs.home.dir=${HADOOP_HOME}"
hadoop_add_param HADOOP_OPTS "-Dhttpfs.config.dir" \
"-Dhttpfs.config.dir=${HTTPFS_CONFIG:-${HADOOP_CONF_DIR}}"
hadoop_add_param HADOOP_OPTS "-Dhttpfs.log.dir" \
"-Dhttpfs.log.dir=${HTTPFS_LOG:-${HADOOP_LOG_DIR}}"
hadoop_add_param HADOOP_OPTS "-Dhttpfs.http.hostname" \
"-Dhttpfs.http.hostname=${HTTPFS_HOST_NAME:-$(hostname -f)}"
if [[ -n "${HTTPFS_SSL_ENABLED}" ]]; then
hadoop_add_param HADOOP_OPTS "-Dhttpfs.ssl.enabled" \
"-Dhttpfs.ssl.enabled=${HTTPFS_SSL_ENABLED}"
fi
if [[ "${HADOOP_DAEMON_MODE}" == "default" ]] ||
[[ "${HADOOP_DAEMON_MODE}" == "start" ]]; then
hadoop_mkdir "${HTTPFS_TEMP:-${HADOOP_HOME}/temp}"
fi
}

View File

@ -15,6 +15,78 @@
-->
<configuration>
<property>
<name>hadoop.httpfs.http.port</name>
<value>14000</value>
<description>
The HTTP port for HttpFS REST API.
</description>
</property>
<property>
<name>hadoop.httpfs.http.host</name>
<value>0.0.0.0</value>
<description>
The bind host for HttpFS REST API.
</description>
</property>
<property>
<name>hadoop.httpfs.http.administrators</name>
<value></value>
<description>ACL for the admins, this configuration is used to control
who can access the default servlets for HttpFS server. The value
should be a comma separated list of users and groups. The user list
comes first and is separated by a space followed by the group list,
e.g. "user1,user2 group1,group2". Both users and groups are optional,
so "user1", " group1", "", "user1 group1", "user1,user2 group1,group2"
are all valid (note the leading space in " group1"). '*' grants access
to all users and groups, e.g. '*', '* ' and ' *' are all valid.
</description>
</property>
<property>
<name>hadoop.httpfs.ssl.enabled</name>
<value>false</value>
<description>
Whether SSL is enabled. Default is false, i.e. disabled.
</description>
</property>
<!-- HTTP properties -->
<property>
<name>hadoop.http.max.threads</name>
<value>1000</value>
<description>
The maxmimum number of threads.
</description>
</property>
<property>
<name>hadoop.http.max.request.header.size</name>
<value>65536</value>
<description>
The maxmimum HTTP request header size.
</description>
</property>
<property>
<name>hadoop.http.max.response.header.size</name>
<value>65536</value>
<description>
The maxmimum HTTP response header size.
</description>
</property>
<property>
<name>hadoop.http.temp.dir</name>
<value>${hadoop.tmp.dir}/httpfs</value>
<description>
HttpFS temp directory.
</description>
</property>
<!-- HttpFSServer Server -->
<property>

View File

@ -15,7 +15,22 @@
-->
<html>
<head>
<title>Hadoop HttpFS Server</title>
</head>
<body>
<b>HttpFs service</b>, service base URL at /webhdfs/v1.
<h1>Hadoop HttpFS Server</h1>
<ul>
<li>HttpFS Server service base URL at <b>/webhdfs/v1/</b></li>
<ul>
<li><a href="/webhdfs/v1/?op=LISTSTATUS">
/webhdfs/v1/?op=LISTSTATUS</a> to list root directory</li>
</ul>
<li><a href="/conf">HttpFS configuration properties</a></li>
<li><a href="/jmx">HttpFS JMX</a></li>
<li><a href="/logLevel">HttpFS log level</a></li>
<li><a href="/logs">HttpFS log files</a></li>
<li><a href="/stacks">HttpFS stacks</a></li>
</ul>
</body>
</html>
</html>

View File

@ -0,0 +1,98 @@
<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<web-app version="2.4" xmlns="http://java.sun.com/xml/ns/j2ee">
<listener>
<listener-class>org.apache.hadoop.fs.http.server.HttpFSServerWebApp</listener-class>
</listener>
<servlet>
<servlet-name>webservices-driver</servlet-name>
<servlet-class>com.sun.jersey.spi.container.servlet.ServletContainer</servlet-class>
<init-param>
<param-name>com.sun.jersey.config.property.packages</param-name>
<param-value>org.apache.hadoop.fs.http.server,org.apache.hadoop.lib.wsrs</param-value>
</init-param>
<!-- Enables detailed Jersey request/response logging -->
<!--
<init-param>
<param-name>com.sun.jersey.spi.container.ContainerRequestFilters</param-name>
<param-value>com.sun.jersey.api.container.filter.LoggingFilter</param-value>
</init-param>
<init-param>
<param-name>com.sun.jersey.spi.container.ContainerResponseFilters</param-name>
<param-value>com.sun.jersey.api.container.filter.LoggingFilter</param-value>
</init-param>
-->
<load-on-startup>1</load-on-startup>
</servlet>
<servlet-mapping>
<servlet-name>webservices-driver</servlet-name>
<url-pattern>/webhdfs/*</url-pattern>
</servlet-mapping>
<filter>
<filter-name>authFilter</filter-name>
<filter-class>org.apache.hadoop.fs.http.server.HttpFSAuthenticationFilter</filter-class>
</filter>
<filter>
<filter-name>MDCFilter</filter-name>
<filter-class>org.apache.hadoop.lib.servlet.MDCFilter</filter-class>
</filter>
<filter>
<filter-name>hostnameFilter</filter-name>
<filter-class>org.apache.hadoop.lib.servlet.HostnameFilter</filter-class>
</filter>
<filter>
<filter-name>checkUploadContentType</filter-name>
<filter-class>org.apache.hadoop.fs.http.server.CheckUploadContentTypeFilter</filter-class>
</filter>
<filter>
<filter-name>fsReleaseFilter</filter-name>
<filter-class>org.apache.hadoop.fs.http.server.HttpFSReleaseFilter</filter-class>
</filter>
<filter-mapping>
<filter-name>authFilter</filter-name>
<url-pattern>*</url-pattern>
</filter-mapping>
<filter-mapping>
<filter-name>MDCFilter</filter-name>
<url-pattern>*</url-pattern>
</filter-mapping>
<filter-mapping>
<filter-name>hostnameFilter</filter-name>
<url-pattern>*</url-pattern>
</filter-mapping>
<filter-mapping>
<filter-name>checkUploadContentType</filter-name>
<url-pattern>*</url-pattern>
</filter-mapping>
<filter-mapping>
<filter-name>fsReleaseFilter</filter-name>
<url-pattern>*</url-pattern>
</filter-mapping>
</web-app>

View File

@ -13,102 +13,52 @@
# limitations under the License.
#
MYNAME="${BASH_SOURCE-$0}"
MYNAME="${0##*/}"
function hadoop_usage
## @description Print usage
## @audience private
## @stability stable
## @replaceable no
function print_usage
{
hadoop_add_subcommand "run" "Start HttpFS in the current window"
hadoop_add_subcommand "run -security" "Start in the current window with security manager"
hadoop_add_subcommand "start" "Start HttpFS in a separate window"
hadoop_add_subcommand "start -security" "Start in a separate window with security manager"
hadoop_add_subcommand "status" "Return the LSB compliant status"
hadoop_add_subcommand "stop" "Stop HttpFS, waiting up to 5 seconds for the process to end"
hadoop_add_subcommand "stop n" "Stop HttpFS, waiting up to n seconds for the process to end"
hadoop_add_subcommand "stop -force" "Stop HttpFS, wait up to 5 seconds and then use kill -KILL if still running"
hadoop_add_subcommand "stop n -force" "Stop HttpFS, wait up to n seconds and then use kill -KILL if still running"
hadoop_generate_usage "${MYNAME}" false
cat <<EOF
Usage: ${MYNAME} run|start|status|stop
commands:
run Run HttpFS server, the HDFS HTTP Gateway
start Start HttpFS server as a daemon
status Return the status of the HttpFS server daemon
stop Stop the HttpFS server daemon
EOF
}
# let's locate libexec...
if [[ -n "${HADOOP_HOME}" ]]; then
HADOOP_DEFAULT_LIBEXEC_DIR="${HADOOP_HOME}/libexec"
else
bin=$(cd -P -- "$(dirname -- "${MYNAME}")" >/dev/null && pwd -P)
HADOOP_DEFAULT_LIBEXEC_DIR="${bin}/../libexec"
fi
HADOOP_LIBEXEC_DIR="${HADOOP_LIBEXEC_DIR:-$HADOOP_DEFAULT_LIBEXEC_DIR}"
# shellcheck disable=SC2034
HADOOP_NEW_CONFIG=true
if [[ -f "${HADOOP_LIBEXEC_DIR}/httpfs-config.sh" ]]; then
. "${HADOOP_LIBEXEC_DIR}/httpfs-config.sh"
else
echo "ERROR: Cannot execute ${HADOOP_LIBEXEC_DIR}/httpfs-config.sh." 2>&1
exit 1
fi
# The Java System property 'httpfs.http.port' it is not used by HttpFS,
# it is used in Tomcat's server.xml configuration file
#
# Mask the trustStorePassword
# shellcheck disable=SC2086
CATALINA_OPTS_DISP="$(echo ${CATALINA_OPTS} | sed -e 's/trustStorePassword=[^ ]*/trustStorePassword=***/')"
hadoop_debug "Using CATALINA_OPTS: ${CATALINA_OPTS_DISP}"
# We're using hadoop-common, so set up some stuff it might need:
hadoop_finalize
hadoop_verify_logdir
echo "WARNING: ${MYNAME} is deprecated," \
"please use 'hdfs [--daemon start|status|stop] httpfs'." >&2
if [[ $# = 0 ]]; then
case "${HADOOP_DAEMON_MODE}" in
status)
hadoop_status_daemon "${CATALINA_PID}"
exit
;;
start)
set -- "start"
;;
stop)
set -- "stop"
;;
esac
print_usage
exit
fi
hadoop_finalize_catalina_opts
export CATALINA_OPTS
case $1 in
run)
args=("httpfs")
;;
start|stop|status)
args=("--daemon" "$1" "httpfs")
;;
*)
echo "Unknown sub-command \"$1\"."
print_usage
exit 1
;;
esac
# A bug in catalina.sh script does not use CATALINA_OPTS for stopping the server
#
if [[ "${1}" = "stop" ]]; then
export JAVA_OPTS=${CATALINA_OPTS}
# Locate bin
if [[ -n "${HADOOP_HOME}" ]]; then
bin="${HADOOP_HOME}/bin"
else
sbin=$(cd -P -- "$(dirname -- "$0")" >/dev/null && pwd -P)
bin=$(cd -P -- "${sbin}/../bin" >/dev/null && pwd -P)
fi
# If ssl, the populate the passwords into ssl-server.xml before starting tomcat
#
# HTTPFS_SSL_KEYSTORE_PASS is a bit odd.
# if undefined, then the if test will not enable ssl on its own
# if "", set it to "password".
# if custom, use provided password
#
if [[ -f "${HADOOP_CATALINA_HOME}/conf/ssl-server.xml.conf" ]]; then
if [[ -n "${HTTPFS_SSL_KEYSTORE_PASS+x}" ]] || [[ -n "${HTTPFS_SSL_TRUSTSTORE_PASS}" ]]; then
export HTTPFS_SSL_KEYSTORE_PASS=${HTTPFS_SSL_KEYSTORE_PASS:-password}
HTTPFS_SSL_KEYSTORE_PASS_ESCAPED=$(hadoop_xml_escape \
"$(hadoop_sed_escape "$HTTPFS_SSL_KEYSTORE_PASS")")
HTTPFS_SSL_TRUSTSTORE_PASS_ESCAPED=$(hadoop_xml_escape \
"$(hadoop_sed_escape "$HTTPFS_SSL_TRUSTSTORE_PASS")")
sed -e 's/"_httpfs_ssl_keystore_pass_"/'"\"${HTTPFS_SSL_KEYSTORE_PASS_ESCAPED}\""'/g' \
-e 's/"_httpfs_ssl_truststore_pass_"/'"\"${HTTPFS_SSL_TRUSTSTORE_PASS_ESCAPED}\""'/g' \
"${HADOOP_CATALINA_HOME}/conf/ssl-server.xml.conf" \
> "${HADOOP_CATALINA_HOME}/conf/ssl-server.xml"
chmod 700 "${HADOOP_CATALINA_HOME}/conf/ssl-server.xml" >/dev/null 2>&1
fi
fi
hadoop_add_param CATALINA_OPTS -Dhttpfs.http.hostname "-Dhttpfs.http.hostname=${HTTPFS_HOST_NAME}"
hadoop_add_param CATALINA_OPTS -Dhttpfs.ssl.enabled "-Dhttpfs.ssl.enabled=${HTTPFS_SSL_ENABLED}"
exec "${HADOOP_CATALINA_HOME}/bin/catalina.sh" "$@"
exec "${bin}/hdfs" "${args[@]}"

View File

@ -1,16 +0,0 @@
<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<web-app version="2.4" xmlns="http://java.sun.com/xml/ns/j2ee">
</web-app>

View File

@ -1,67 +0,0 @@
#
# All Rights Reserved.
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
handlers = 1catalina.org.apache.juli.FileHandler, 2localhost.org.apache.juli.FileHandler, 3manager.org.apache.juli.FileHandler, 4host-manager.org.apache.juli.FileHandler, java.util.logging.ConsoleHandler
.handlers = 1catalina.org.apache.juli.FileHandler, java.util.logging.ConsoleHandler
############################################################
# Handler specific properties.
# Describes specific configuration info for Handlers.
############################################################
1catalina.org.apache.juli.FileHandler.level = FINE
1catalina.org.apache.juli.FileHandler.directory = ${httpfs.log.dir}
1catalina.org.apache.juli.FileHandler.prefix = httpfs-catalina.
2localhost.org.apache.juli.FileHandler.level = FINE
2localhost.org.apache.juli.FileHandler.directory = ${httpfs.log.dir}
2localhost.org.apache.juli.FileHandler.prefix = httpfs-localhost.
3manager.org.apache.juli.FileHandler.level = FINE
3manager.org.apache.juli.FileHandler.directory = ${httpfs.log.dir}
3manager.org.apache.juli.FileHandler.prefix = httpfs-manager.
4host-manager.org.apache.juli.FileHandler.level = FINE
4host-manager.org.apache.juli.FileHandler.directory = ${httpfs.log.dir}
4host-manager.org.apache.juli.FileHandler.prefix = httpfs-host-manager.
java.util.logging.ConsoleHandler.level = FINE
java.util.logging.ConsoleHandler.formatter = java.util.logging.SimpleFormatter
############################################################
# Facility specific properties.
# Provides extra control for each logger.
############################################################
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].level = INFO
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].handlers = 2localhost.org.apache.juli.FileHandler
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/manager].level = INFO
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/manager].handlers = 3manager.org.apache.juli.FileHandler
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/host-manager].level = INFO
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/host-manager].handlers = 4host-manager.org.apache.juli.FileHandler
# For example, set the com.xyz.foo logger to only log SEVERE
# messages:
#org.apache.catalina.startup.ContextConfig.level = FINE
#org.apache.catalina.startup.HostConfig.level = FINE
#org.apache.catalina.session.ManagerBase.level = FINE
#org.apache.catalina.core.AprLifecycleListener.level=FINE

View File

@ -1,151 +0,0 @@
<?xml version='1.0' encoding='utf-8'?>
<!--
All Rights Reserved.
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<!-- Note: A "Server" is not itself a "Container", so you may not
define subcomponents such as "Valves" at this level.
Documentation at /docs/config/server.html
-->
<Server port="${httpfs.admin.port}" shutdown="SHUTDOWN">
<!--APR library loader. Documentation at /docs/apr.html -->
<Listener className="org.apache.catalina.core.AprLifecycleListener" SSLEngine="on"/>
<!--Initialize Jasper prior to webapps are loaded. Documentation at /docs/jasper-howto.html -->
<Listener className="org.apache.catalina.core.JasperListener"/>
<!-- Prevent memory leaks due to use of particular java/javax APIs-->
<Listener className="org.apache.catalina.core.JreMemoryLeakPreventionListener"/>
<!-- JMX Support for the Tomcat server. Documentation at /docs/non-existent.html -->
<Listener className="org.apache.catalina.mbeans.ServerLifecycleListener"/>
<Listener className="org.apache.catalina.mbeans.GlobalResourcesLifecycleListener"/>
<!-- Global JNDI resources
Documentation at /docs/jndi-resources-howto.html
-->
<GlobalNamingResources>
<!-- Editable user database that can also be used by
UserDatabaseRealm to authenticate users
-->
<Resource name="UserDatabase" auth="Container"
type="org.apache.catalina.UserDatabase"
description="User database that can be updated and saved"
factory="org.apache.catalina.users.MemoryUserDatabaseFactory"
pathname="conf/tomcat-users.xml"/>
</GlobalNamingResources>
<!-- A "Service" is a collection of one or more "Connectors" that share
a single "Container" Note: A "Service" is not itself a "Container",
so you may not define subcomponents such as "Valves" at this level.
Documentation at /docs/config/service.html
-->
<Service name="Catalina">
<!--The connectors can use a shared executor, you can define one or more named thread pools-->
<!--
<Executor name="tomcatThreadPool" namePrefix="catalina-exec-"
maxThreads="150" minSpareThreads="4"/>
-->
<!-- A "Connector" represents an endpoint by which requests are received
and responses are returned. Documentation at :
Java HTTP Connector: /docs/config/http.html (blocking & non-blocking)
Java AJP Connector: /docs/config/ajp.html
APR (HTTP/AJP) Connector: /docs/apr.html
Define a non-SSL HTTP/1.1 Connector on port ${httpfs.http.port}
-->
<Connector port="${httpfs.http.port}" protocol="HTTP/1.1"
connectionTimeout="20000"
maxHttpHeaderSize="${httpfs.max.http.header.size}"
redirectPort="8443"/>
<!-- A "Connector" using the shared thread pool-->
<!--
<Connector executor="tomcatThreadPool"
port="${httpfs.http.port}" protocol="HTTP/1.1"
connectionTimeout="20000"
redirectPort="8443" />
-->
<!-- Define a SSL HTTP/1.1 Connector on port 8443
This connector uses the JSSE configuration, when using APR, the
connector should be using the OpenSSL style configuration
described in the APR documentation -->
<!--
<Connector port="8443" protocol="HTTP/1.1" SSLEnabled="true"
maxThreads="150" scheme="https" secure="true"
clientAuth="false" sslProtocol="TLS" />
-->
<!-- Define an AJP 1.3 Connector on port 8009 -->
<!-- An Engine represents the entry point (within Catalina) that processes
every request. The Engine implementation for Tomcat stand alone
analyzes the HTTP headers included with the request, and passes them
on to the appropriate Host (virtual host).
Documentation at /docs/config/engine.html -->
<!-- You should set jvmRoute to support load-balancing via AJP ie :
<Engine name="Catalina" defaultHost="localhost" jvmRoute="jvm1">
-->
<Engine name="Catalina" defaultHost="localhost">
<!--For clustering, please take a look at documentation at:
/docs/cluster-howto.html (simple how to)
/docs/config/cluster.html (reference documentation) -->
<!--
<Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"/>
-->
<!-- The request dumper valve dumps useful debugging information about
the request and response data received and sent by Tomcat.
Documentation at: /docs/config/valve.html -->
<!--
<Valve className="org.apache.catalina.valves.RequestDumperValve"/>
-->
<!-- This Realm uses the UserDatabase configured in the global JNDI
resources under the key "UserDatabase". Any edits
that are performed against this UserDatabase are immediately
available for use by the Realm. -->
<Realm className="org.apache.catalina.realm.UserDatabaseRealm"
resourceName="UserDatabase"/>
<!-- Define the default virtual host
Note: XML Schema validation will not work with Xerces 2.2.
-->
<Host name="localhost" appBase="webapps"
unpackWARs="true" autoDeploy="true"
xmlValidation="false" xmlNamespaceAware="false">
<!-- SingleSignOn valve, share authentication between web applications
Documentation at: /docs/config/valve.html -->
<!--
<Valve className="org.apache.catalina.authenticator.SingleSignOn" />
-->
<!-- Access log processes all example.
Documentation at: /docs/config/valve.html -->
<!--
<Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs"
prefix="localhost_access_log." suffix=".txt" pattern="common" resolveHosts="false"/>
-->
</Host>
</Engine>
</Service>
</Server>

View File

@ -1,136 +0,0 @@
<?xml version='1.0' encoding='utf-8'?>
<!--
All Rights Reserved.
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<!-- Note: A "Server" is not itself a "Container", so you may not
define subcomponents such as "Valves" at this level.
Documentation at /docs/config/server.html
-->
<Server port="${httpfs.admin.port}" shutdown="SHUTDOWN">
<!--APR library loader. Documentation at /docs/apr.html -->
<Listener className="org.apache.catalina.core.AprLifecycleListener"
SSLEngine="on"/>
<!--Initialize Jasper prior to webapps are loaded. Documentation at /docs/jasper-howto.html -->
<Listener className="org.apache.catalina.core.JasperListener"/>
<!-- Prevent memory leaks due to use of particular java/javax APIs-->
<Listener
className="org.apache.catalina.core.JreMemoryLeakPreventionListener"/>
<!-- JMX Support for the Tomcat server. Documentation at /docs/non-existent.html -->
<Listener className="org.apache.catalina.mbeans.ServerLifecycleListener"/>
<Listener
className="org.apache.catalina.mbeans.GlobalResourcesLifecycleListener"/>
<!-- Global JNDI resources
Documentation at /docs/jndi-resources-howto.html
-->
<GlobalNamingResources>
<!-- Editable user database that can also be used by
UserDatabaseRealm to authenticate users
-->
<Resource name="UserDatabase" auth="Container"
type="org.apache.catalina.UserDatabase"
description="User database that can be updated and saved"
factory="org.apache.catalina.users.MemoryUserDatabaseFactory"
pathname="conf/tomcat-users.xml"/>
</GlobalNamingResources>
<!-- A "Service" is a collection of one or more "Connectors" that share
a single "Container" Note: A "Service" is not itself a "Container",
so you may not define subcomponents such as "Valves" at this level.
Documentation at /docs/config/service.html
-->
<Service name="Catalina">
<!--The connectors can use a shared executor, you can define one or more named thread pools-->
<!--
<Executor name="tomcatThreadPool" namePrefix="catalina-exec-"
maxThreads="httpfs.max.threads" minSpareThreads="4"/>
-->
<!-- Define a SSL HTTP/1.1 Connector on port 8443
This connector uses the JSSE configuration, when using APR, the
connector should be using the OpenSSL style configuration
described in the APR documentation -->
<Connector port="${httpfs.http.port}" protocol="HTTP/1.1" SSLEnabled="true"
maxThreads="150" scheme="https" secure="true"
maxHttpHeaderSize="${httpfs.max.http.header.size}"
clientAuth="false" sslEnabledProtocols="TLSv1,TLSv1.1,TLSv1.2,SSLv2Hello"
keystoreFile="${httpfs.ssl.keystore.file}"
keystorePass="_httpfs_ssl_keystore_pass_"/>
<!-- Define an AJP 1.3 Connector on port 8009 -->
<!-- An Engine represents the entry point (within Catalina) that processes
every request. The Engine implementation for Tomcat stand alone
analyzes the HTTP headers included with the request, and passes them
on to the appropriate Host (virtual host).
Documentation at /docs/config/engine.html -->
<!-- You should set jvmRoute to support load-balancing via AJP ie :
<Engine name="Catalina" defaultHost="localhost" jvmRoute="jvm1">
-->
<Engine name="Catalina" defaultHost="localhost">
<!--For clustering, please take a look at documentation at:
/docs/cluster-howto.html (simple how to)
/docs/config/cluster.html (reference documentation) -->
<!--
<Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"/>
-->
<!-- The request dumper valve dumps useful debugging information about
the request and response data received and sent by Tomcat.
Documentation at: /docs/config/valve.html -->
<!--
<Valve className="org.apache.catalina.valves.RequestDumperValve"/>
-->
<!-- This Realm uses the UserDatabase configured in the global JNDI
resources under the key "UserDatabase". Any edits
that are performed against this UserDatabase are immediately
available for use by the Realm. -->
<Realm className="org.apache.catalina.realm.UserDatabaseRealm"
resourceName="UserDatabase"/>
<!-- Define the default virtual host
Note: XML Schema validation will not work with Xerces 2.2.
-->
<Host name="localhost" appBase="webapps"
unpackWARs="true" autoDeploy="true"
xmlValidation="false" xmlNamespaceAware="false">
<!-- SingleSignOn valve, share authentication between web applications
Documentation at: /docs/config/valve.html -->
<!--
<Valve className="org.apache.catalina.authenticator.SingleSignOn" />
-->
<!-- Access log processes all example.
Documentation at: /docs/config/valve.html -->
<!--
<Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs"
prefix="localhost_access_log." suffix=".txt" pattern="common" resolveHosts="false"/>
-->
</Host>
</Engine>
</Service>
</Server>

View File

@ -55,11 +55,12 @@ You need to restart Hadoop for the proxyuser configuration to become active.
Start/Stop HttpFS
-----------------
To start/stop HttpFS use HttpFS's sbin/httpfs.sh script. For example:
To start/stop HttpFS, use `hdfs --daemon start|stop httpfs`. For example:
$ sbin/httpfs.sh start
hadoop-${project.version} $ hdfs --daemon start httpfs
NOTE: Invoking the script without any parameters list all possible parameters (start, stop, run, etc.). The `httpfs.sh` script is a wrapper for Tomcat's `catalina.sh` script that sets the environment variables and Java System properties required to run HttpFS server.
NOTE: The script `httpfs.sh` is deprecated. It is now just a wrapper of
`hdfs httpfs`.
Test HttpFS is working
----------------------
@ -67,52 +68,63 @@ Test HttpFS is working
$ curl -sS 'http://<HTTPFSHOSTNAME>:14000/webhdfs/v1?op=gethomedirectory&user.name=hdfs'
{"Path":"\/user\/hdfs"}
Embedded Tomcat Configuration
-----------------------------
To configure the embedded Tomcat go to the `tomcat/conf`.
HttpFS preconfigures the HTTP and Admin ports in Tomcat's `server.xml` to 14000 and 14001.
Tomcat logs are also preconfigured to go to HttpFS's `logs/` directory.
HttpFS default value for the maxHttpHeaderSize parameter in Tomcat's `server.xml` is set to 65536 by default.
The following environment variables (which can be set in HttpFS's `etc/hadoop/httpfs-env.sh` script) can be used to alter those values:
* HTTPFS\_HTTP\_PORT
* HTTPFS\_ADMIN\_PORT
* HADOOP\_LOG\_DIR
* HTTPFS\_MAX\_HTTP\_HEADER\_SIZE
HttpFS Configuration
--------------------
HttpFS preconfigures the HTTP port to 14000.
HttpFS supports the following [configuration properties](./httpfs-default.html) in the HttpFS's `etc/hadoop/httpfs-site.xml` configuration file.
HttpFS over HTTPS (SSL)
-----------------------
To configure HttpFS to work over SSL edit the [httpfs-env.sh](#httpfs-env.sh) script in the configuration directory setting the [HTTPFS\_SSL\_ENABLED](#HTTPFS_SSL_ENABLED) to [true](#true).
Enable SSL in `etc/hadoop/httpfs-site.xml`:
In addition, the following 2 properties may be defined (shown with default values):
```xml
<property>
<name>hadoop.httpfs.ssl.enabled</name>
<value>true</value>
<description>
Whether SSL is enabled. Default is false, i.e. disabled.
</description>
</property>
```
* HTTPFS\_SSL\_KEYSTORE\_FILE=$HOME/.keystore
Configure `etc/hadoop/ssl-server.xml` with proper values, for example:
* HTTPFS\_SSL\_KEYSTORE\_PASS=password
```xml
<property>
<name>ssl.server.keystore.location</name>
<value>${user.home}/.keystore</value>
<description>Keystore to be used. Must be specified.
</description>
</property>
In the HttpFS `tomcat/conf` directory, replace the `server.xml` file with the `ssl-server.xml` file.
<property>
<name>ssl.server.keystore.password</name>
<value></value>
<description>Must be specified.</description>
</property>
<property>
<name>ssl.server.keystore.keypassword</name>
<value></value>
<description>Must be specified.</description>
</property>
```
The SSL passwords can be secured by a credential provider. See
[Credential Provider API](../../../hadoop-project-dist/hadoop-common/CredentialProviderAPI.html).
You need to create an SSL certificate for the HttpFS server. As the `httpfs` Unix user, using the Java `keytool` command to create the SSL certificate:
$ keytool -genkey -alias tomcat -keyalg RSA
$ keytool -genkey -alias jetty -keyalg RSA
You will be asked a series of questions in an interactive prompt. It will create the keystore file, which will be named **.keystore** and located in the `httpfs` user home directory.
The password you enter for "keystore password" must match the value of the `HTTPFS_SSL_KEYSTORE_PASS` environment variable set in the `httpfs-env.sh` script in the configuration directory.
The password you enter for "keystore password" must match the value of the
property `ssl.server.keystore.password` set in the `ssl-server.xml` in the
configuration directory.
The answer to "What is your first and last name?" (i.e. "CN") must be the hostname of the machine where the HttpFS Server will be running.
@ -121,3 +133,65 @@ Start HttpFS. It should work over HTTPS.
Using the Hadoop `FileSystem` API or the Hadoop FS shell, use the `swebhdfs://` scheme. Make sure the JVM is picking up the truststore containing the public key of the SSL certificate if using a self-signed certificate.
NOTE: Some old SSL clients may use weak ciphers that are not supported by the HttpFS server. It is recommended to upgrade the SSL client.
Deprecated Environment Variables
--------------------------------
The following environment variables are deprecated. Set the corresponding
configuration properties instead.
Environment Variable | Configuration Property | Configuration File
----------------------------|------------------------------|--------------------
HTTPFS_TEMP | hadoop.http.temp.dir | httpfs-site.xml
HTTPFS_HTTP_PORT | hadoop.httpfs.http.port | httpfs-site.xml
HTTPFS_MAX_HTTP_HEADER_SIZE | hadoop.http.max.request.header.size and hadoop.http.max.response.header.size | httpfs-site.xml
HTTPFS_MAX_THREADS | hadoop.http.max.threads | httpfs-site.xml
HTTPFS_SSL_ENABLED | hadoop.httpfs.ssl.enabled | httpfs-site.xml
HTTPFS_SSL_KEYSTORE_FILE | ssl.server.keystore.location | ssl-server.xml
HTTPFS_SSL_KEYSTORE_PASS | ssl.server.keystore.password | ssl-server.xml
HTTP Default Services
---------------------
Name | Description
-------------------|------------------------------------
/conf | Display configuration properties
/jmx | Java JMX management interface
/logLevel | Get or set log level per class
/logs | Display log files
/stacks | Display JVM stacks
/static/index.html | The static home page
To control the access to servlet `/conf`, `/jmx`, `/logLevel`, `/logs`,
and `/stacks`, configure the following properties in `httpfs-site.xml`:
```xml
<property>
<name>hadoop.security.authorization</name>
<value>true</value>
<description>Is service-level authorization enabled?</description>
</property>
<property>
<name>hadoop.security.instrumentation.requires.admin</name>
<value>true</value>
<description>
Indicates if administrator ACLs are required to access
instrumentation servlets (JMX, METRICS, CONF, STACKS).
</description>
</property>
<property>
<name>hadoop.httpfs.http.administrators</name>
<value></value>
<description>ACL for the admins, this configuration is used to control
who can access the default servlets for HttpFS server. The value
should be a comma separated list of users and groups. The user list
comes first and is separated by a space followed by the group list,
e.g. "user1,user2 group1,group2". Both users and groups are optional,
so "user1", " group1", "", "user1 group1", "user1,user2 group1,group2"
are all valid (note the leading space in " group1"). '*' grants access
to all users and groups, e.g. '*', '* ' and ' *' are all valid.
</description>
</property>
```

View File

@ -32,7 +32,7 @@ How Does HttpFS Works?
HttpFS is a separate service from Hadoop NameNode.
HttpFS itself is Java web-application and it runs using a preconfigured Tomcat bundled with HttpFS binary distribution.
HttpFS itself is Java Jetty web-application.
HttpFS HTTP web-service API calls are HTTP REST calls that map to a HDFS file system operation. For example, using the `curl` Unix command:

View File

@ -0,0 +1,106 @@
/**
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.hadoop.fs.http.server;
import java.io.BufferedReader;
import java.io.File;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.text.MessageFormat;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.test.GenericTestUtils;
import org.apache.hadoop.test.HadoopUsersConfTestHelper;
import org.junit.Assert;
import org.junit.Before;
import org.junit.BeforeClass;
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.Timeout;
/**
* Test {@link HttpFSServerWebServer}.
*/
public class TestHttpFSServerWebServer {
@Rule
public Timeout timeout = new Timeout(30000);
private HttpFSServerWebServer webServer;
@BeforeClass
public static void beforeClass() throws Exception {
File homeDir = GenericTestUtils.getTestDir();
File confDir = new File(homeDir, "etc/hadoop");
File logsDir = new File(homeDir, "logs");
File tempDir = new File(homeDir, "temp");
confDir.mkdirs();
logsDir.mkdirs();
tempDir.mkdirs();
System.setProperty("hadoop.home.dir", homeDir.getAbsolutePath());
System.setProperty("hadoop.log.dir", logsDir.getAbsolutePath());
System.setProperty("httpfs.home.dir", homeDir.getAbsolutePath());
System.setProperty("httpfs.log.dir", logsDir.getAbsolutePath());
System.setProperty("httpfs.config.dir", confDir.getAbsolutePath());
new File(confDir, "httpfs-signature.secret").createNewFile();
}
@Before
public void setUp() throws Exception {
Configuration conf = new Configuration();
conf.set(HttpFSServerWebServer.HTTP_HOST_KEY, "localhost");
conf.setInt(HttpFSServerWebServer.HTTP_PORT_KEY, 0);
Configuration sslConf = new Configuration();
webServer = new HttpFSServerWebServer(conf, sslConf);
}
@Test
public void testStartStop() throws Exception {
webServer.start();
String user = HadoopUsersConfTestHelper.getHadoopUsers()[0];
URL url = new URL(webServer.getUrl(), MessageFormat.format(
"/webhdfs/v1/?user.name={0}&op=liststatus", user));
HttpURLConnection conn = (HttpURLConnection) url.openConnection();
Assert.assertEquals(conn.getResponseCode(), HttpURLConnection.HTTP_OK);
BufferedReader reader = new BufferedReader(
new InputStreamReader(conn.getInputStream()));
reader.readLine();
reader.close();
webServer.stop();
}
@Test
public void testJustStop() throws Exception {
webServer.stop();
}
@Test
public void testDoubleStop() throws Exception {
webServer.start();
webServer.stop();
webServer.stop();
}
@Test
public void testDoubleStart() throws Exception {
webServer.start();
webServer.start();
webServer.stop();
}
}

View File

@ -139,6 +139,12 @@ Usage: `hdfs groups [username ...]`
Returns the group information given one or more usernames.
### `httpfs`
Usage: `hdfs httpfs`
Run HttpFS server, the HDFS HTTP Gateway.
### `lsSnapshottableDir`
Usage: `hdfs lsSnapshottableDir [-help]`