Merge trunk to HDFS-4685.

git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/HDFS-4685@1562670 13f79535-47bb-0310-9956-ffa450edef68
Chris Nauroth 2014-01-30 01:55:14 +00:00
commit 10ef8a4b56
117 changed files with 2667 additions and 1614 deletions

View File

@ -24,8 +24,7 @@ Configuration
* Server Side Configuration Setup
The {{{./apidocs/org/apache/hadoop/auth/server/AuthenticationFilter.html}
AuthenticationFilter filter}} is Hadoop Auth's server side component.
The AuthenticationFilter filter is Hadoop Auth's server side component.
This filter must be configured in front of all the web application resources
that require authenticated requests. For example:
@ -46,9 +45,7 @@ Configuration
must start with the prefix. The default value is no prefix.
* <<<[PREFIX.]type>>>: the authentication type keyword (<<<simple>>> or
<<<kerberos>>>) or a
{{{./apidocs/org/apache/hadoop/auth/server/AuthenticationHandler.html}
Authentication handler implementation}}.
<<<kerberos>>>) or an Authentication handler implementation.
* <<<[PREFIX.]signature.secret>>>: The secret to SHA-sign the generated
authentication tokens. If a secret is not provided a random secret is

View File

@ -52,7 +52,3 @@ Hadoop Auth, Java HTTP SPNEGO ${project.version}
* {{{./BuildingIt.html}Building It}}
* {{{./apidocs/index.html}JavaDocs}}
* {{{./dependencies.html}Dependencies}}

View File

@ -285,9 +285,6 @@ Trunk (Unreleased)
HADOOP-9740. Fix FsShell '-text' command to be able to read Avro
files stored in HDFS and other filesystems. (Allan Yan via cutting)
HDFS-5471. CacheAdmin -listPools fails when user lacks permissions to view
all pools (Andrew Wang via Colin Patrick McCabe)
HADOOP-10044 Improve the javadoc of rpc code (sanjay Radia)
OPTIMIZATIONS
@ -302,11 +299,44 @@ Release 2.4.0 - UNRELEASED
NEW FEATURES
IMPROVEMENTS
OPTIMIZATIONS
BUG FIXES
Release 2.3.0 - UNRELEASED
INCOMPATIBLE CHANGES
HADOOP-8545. Filesystem Implementation for OpenStack Swift
(Dmitry Mezhensky, David Dobbins, Stevel via stevel)
NEW FEATURES
IMPROVEMENTS
HADOOP-10046. Print a log message when SSL is enabled.
(David S. Wang via wang)
HADOOP-10079. log a warning message if group resolution takes too long.
(cmccabe)
HADOOP-9623 Update jets3t dependency to 0.9.0. (Amandeep Khurana via Colin
Patrick McCabe)
HADOOP-10132. RPC#stopProxy() should log the class of proxy when IllegalArgumentException
is encountered (Ted yu via umamahesh)
HADOOP-10248. Property name should be included in the exception where property value
is null (Akira AJISAKA via umamahesh)
HADOOP-10086. User document for authentication in secure cluster.
(Masatake Iwasaki via Arpit Agarwal)
HADOOP-10274 Lower the logging level from ERROR to WARN for UGI.doAs method
(Takeshi Miao via stack)
HADOOP-9784. Add a builder for HttpServer. (Junping Du via llu)
HADOOP 9871. Fix intermittent findbugs warnings in DefaultMetricsSystem.
@ -427,8 +457,15 @@ Release 2.4.0 - UNRELEASED
HADOOP-9652. Allow RawLocalFs#getFileLinkStatus to fill in the link owner
and mode if requested. (Andrew Wang via Colin Patrick McCabe)
HADOOP-10305. Add "rpc.metrics.quantile.enable" and
"rpc.metrics.percentiles.intervals" to core-default.xml.
(Akira Ajisaka via wang)
OPTIMIZATIONS
HADOOP-10142. Avoid groups lookup for unprivileged users such as "dr.who"
(vinay via cmccabe)
HADOOP-9748. Reduce blocking on UGI.ensureInitialized (daryn)
HADOOP-10047. Add a direct-buffer based apis for compression. (Gopal V
@ -444,6 +481,90 @@ Release 2.4.0 - UNRELEASED
BUG FIXES
HADOOP-10028. Malformed ssl-server.xml.example. (Haohui Mai via jing9)
HADOOP-10030. FsShell -put/copyFromLocal should support Windows local path.
(Chuan Liu via cnauroth)
HADOOP-10031. FsShell -get/copyToLocal/moveFromLocal should support Windows
local path. (Chuan Liu via cnauroth)
HADOOP-10039. Add Hive to the list of projects using
AbstractDelegationTokenSecretManager. (Haohui Mai via jing9)
HADOOP-10040. hadoop.cmd in UNIX format and would not run by default on
Windows. (cnauroth)
HADOOP-10055. FileSystemShell.apt.vm doc has typo "numRepicas".
(Akira Ajisaka via cnauroth)
HADOOP-10072. TestNfsExports#testMultiMatchers fails due to non-deterministic
timing around cache expiry check. (cnauroth)
HADOOP-9898. Set SO_KEEPALIVE on all our sockets. (todd via wang)
HADOOP-9478. Fix race conditions during the initialization of Configuration
related to deprecatedKeyMap (cmccabe)
HADOOP-9660. [WINDOWS] Powershell / cmd parses -Dkey=value from command line
as [-Dkey, value] which breaks GenericsOptionParser.
(Enis Soztutar via cnauroth)
HADOOP-10078. KerberosAuthenticator always does SPNEGO. (rkanter via tucu)
HADOOP-10110. hadoop-auth has a build break due to missing dependency.
(Chuan Liu via arp)
HADOOP-9114. After defined the dfs.checksum.type as the NULL, write file and hflush will
through java.lang.ArrayIndexOutOfBoundsException (Sathish via umamahesh)
HADOOP-10130. RawLocalFS::LocalFSFileInputStream.pread does not track
FS::Statistics (Binglin Chang via Colin Patrick McCabe)
HDFS-5560. Trash configuration log statements prints incorrect units.
(Josh Elser via Andrew Wang)
HADOOP-10081. Client.setupIOStreams can leak socket resources on exception
or error (Tsuyoshi OZAWA via jlowe)
HADOOP-10087. UserGroupInformation.getGroupNames() fails to return primary
group first when JniBasedUnixGroupsMappingWithFallback is used (cmccabe)
HADOOP-10175. Har files system authority should preserve userinfo.
(Chuan Liu via cnauroth)
HADOOP-10090. Jobtracker metrics not updated properly after execution
of a mapreduce job. (ivanmi)
HADOOP-10193. hadoop-auth's PseudoAuthenticationHandler can consume getInputStream.
(gchanan via tucu)
HADOOP-10178. Configuration deprecation always emit "deprecated" warnings
when a new key is used. (Shanyu Zhao via cnauroth)
HADOOP-10234. "hadoop.cmd jar" does not propagate exit code. (cnauroth)
HADOOP-10240. Windows build instructions incorrectly state requirement of
protoc 2.4.1 instead of 2.5.0. (cnauroth)
HADOOP-10167. Mark hadoop-common source as UTF-8 in Maven pom files / refactoring
(Mikhail Antonov via cos)
HADOOP-9982. Fix dead links in hadoop site docs. (Akira Ajisaka via Arpit
Agarwal)
HADOOP-10212. Incorrect compile command in Native Library document.
(Akira Ajisaka via Arpit Agarwal)
HADOOP-9830. Fix typo at http://hadoop.apache.org/docs/current/
(Kousuke Saruta via Arpit Agarwal)
HADOOP-10255. Rename HttpServer to HttpServer2 to retain older
HttpServer in branch-2 for compatibility. (Haohui Mai via suresh)
HADOOP-10291. TestSecurityUtil#testSocketAddrWithIP fails due to test
order dependency. (Mit Desai via Arpit Agarwal)
HADOOP-9964. Fix deadlocks in TestHttpServer by synchronize
ReflectionUtils.printThreadInfo. (Junping Du via llu)
@ -459,7 +580,6 @@ Release 2.4.0 - UNRELEASED
HADOOP-9865. FileContext#globStatus has a regression with respect to
relative path. (Chuan Lin via Colin Patrick McCabe)
HADOOP-9909. org.apache.hadoop.fs.Stat should permit other LANG.
(Shinichi Yamashita via Andrew Wang)
@ -539,106 +659,11 @@ Release 2.4.0 - UNRELEASED
HADOOP-10203. Connection leak in
Jets3tNativeFileSystemStore#retrieveMetadata. (Andrei Savu via atm)
Release 2.3.0 - UNRELEASED
HADOOP-10250. VersionUtil returns wrong value when comparing two versions.
(Yongjun Zhang via atm)
INCOMPATIBLE CHANGES
NEW FEATURES
IMPROVEMENTS
HADOOP-10046. Print a log message when SSL is enabled.
(David S. Wang via wang)
HADOOP-10079. log a warning message if group resolution takes too long.
(cmccabe)
HADOOP-9623 Update jets3t dependency to 0.9.0. (Amandeep Khurana via Colin
Patrick McCabe)
HADOOP-10132. RPC#stopProxy() should log the class of proxy when IllegalArgumentException
is encountered (Ted yu via umamahesh)
HADOOP-10248. Property name should be included in the exception where property value
is null (Akira AJISAKA via umamahesh)
OPTIMIZATIONS
HADOOP-10142. Avoid groups lookup for unprivileged users such as "dr.who"
(vinay via cmccabe)
BUG FIXES
HADOOP-10028. Malformed ssl-server.xml.example. (Haohui Mai via jing9)
HADOOP-10030. FsShell -put/copyFromLocal should support Windows local path.
(Chuan Liu via cnauroth)
HADOOP-10031. FsShell -get/copyToLocal/moveFromLocal should support Windows
local path. (Chuan Liu via cnauroth)
HADOOP-10039. Add Hive to the list of projects using
AbstractDelegationTokenSecretManager. (Haohui Mai via jing9)
HADOOP-10040. hadoop.cmd in UNIX format and would not run by default on
Windows. (cnauroth)
HADOOP-10055. FileSystemShell.apt.vm doc has typo "numRepicas".
(Akira Ajisaka via cnauroth)
HADOOP-10072. TestNfsExports#testMultiMatchers fails due to non-deterministic
timing around cache expiry check. (cnauroth)
HADOOP-9898. Set SO_KEEPALIVE on all our sockets. (todd via wang)
HADOOP-9478. Fix race conditions during the initialization of Configuration
related to deprecatedKeyMap (cmccabe)
HADOOP-9660. [WINDOWS] Powershell / cmd parses -Dkey=value from command line
as [-Dkey, value] which breaks GenericsOptionParser.
(Enis Soztutar via cnauroth)
HADOOP-10078. KerberosAuthenticator always does SPNEGO. (rkanter via tucu)
HADOOP-10110. hadoop-auth has a build break due to missing dependency.
(Chuan Liu via arp)
HADOOP-9114. After defined the dfs.checksum.type as the NULL, write file and hflush will
through java.lang.ArrayIndexOutOfBoundsException (Sathish via umamahesh)
HADOOP-10130. RawLocalFS::LocalFSFileInputStream.pread does not track
FS::Statistics (Binglin Chang via Colin Patrick McCabe)
HDFS-5560. Trash configuration log statements prints incorrect units.
(Josh Elser via Andrew Wang)
HADOOP-10081. Client.setupIOStreams can leak socket resources on exception
or error (Tsuyoshi OZAWA via jlowe)
HADOOP-10087. UserGroupInformation.getGroupNames() fails to return primary
group first when JniBasedUnixGroupsMappingWithFallback is used (cmccabe)
HADOOP-10175. Har files system authority should preserve userinfo.
(Chuan Liu via cnauroth)
HADOOP-10090. Jobtracker metrics not updated properly after execution
of a mapreduce job. (ivanmi)
HADOOP-10193. hadoop-auth's PseudoAuthenticationHandler can consume getInputStream.
(gchanan via tucu)
HADOOP-10178. Configuration deprecation always emit "deprecated" warnings
when a new key is used. (Shanyu Zhao via cnauroth)
HADOOP-10234. "hadoop.cmd jar" does not propagate exit code. (cnauroth)
HADOOP-10240. Windows build instructions incorrectly state requirement of
protoc 2.4.1 instead of 2.5.0. (cnauroth)
HADOOP-10112. har file listing doesn't work with wild card. (brandonli)
HADOOP-10167. Mark hadoop-common source as UTF-8 in Maven pom files / refactoring
(Mikhail Antonov via cos)
HADOOP-10288. Explicit reference to Log4JLogger breaks non-log4j users
(todd)
Release 2.2.0 - 2013-10-13

View File

@ -364,4 +364,11 @@
<Bug pattern="OBL_UNSATISFIED_OBLIGATION"/>
</Match>
<!-- code from maven source, null value is checked at callee side. -->
<Match>
<Class name="org.apache.hadoop.util.ComparableVersion$ListItem" />
<Method name="compareTo" />
<Bug code="NP" />
</Match>
</FindBugsFilter>

View File

@ -27,7 +27,7 @@ import javax.servlet.http.HttpServletResponse;
import org.apache.hadoop.classification.InterfaceAudience;
import org.apache.hadoop.classification.InterfaceStability;
import org.apache.hadoop.http.HttpServer;
import org.apache.hadoop.http.HttpServer2;
/**
* A servlet to print out the running configuration data.
@ -47,7 +47,7 @@ public class ConfServlet extends HttpServlet {
*/
private Configuration getConfFromContext() {
Configuration conf = (Configuration)getServletContext().getAttribute(
HttpServer.CONF_CONTEXT_ATTRIBUTE);
HttpServer2.CONF_CONTEXT_ATTRIBUTE);
assert conf != null;
return conf;
}
@ -56,7 +56,7 @@ public class ConfServlet extends HttpServlet {
public void doGet(HttpServletRequest request, HttpServletResponse response)
throws ServletException, IOException {
if (!HttpServer.isInstrumentationAccessAllowed(getServletContext(),
if (!HttpServer2.isInstrumentationAccessAllowed(getServletContext(),
request, response)) {
return;
}

View File

@ -245,6 +245,7 @@ public class CommonConfigurationKeys extends CommonConfigurationKeysPublic {
public static final String RPC_METRICS_QUANTILE_ENABLE =
"rpc.metrics.quantile.enable";
public static final boolean RPC_METRICS_QUANTILE_ENABLE_DEFAULT = false;
public static final String RPC_METRICS_PERCENTILES_INTERVALS_KEY =
"rpc.metrics.percentiles.intervals";
}

View File

@ -37,7 +37,7 @@ public class AdminAuthorizedServlet extends DefaultServlet {
protected void doGet(HttpServletRequest request, HttpServletResponse response)
throws ServletException, IOException {
// Do the authorization
if (HttpServer.hasAdministratorAccess(getServletContext(), request,
if (HttpServer2.hasAdministratorAccess(getServletContext(), request,
response)) {
// Authorization is done. Just call super.
super.doGet(request, response);

View File

@ -53,7 +53,17 @@ public class HttpRequestLog {
String appenderName = name + "requestlog";
Log logger = LogFactory.getLog(loggerName);
if (logger instanceof Log4JLogger) {
boolean isLog4JLogger;
try {
isLog4JLogger = logger instanceof Log4JLogger;
} catch (NoClassDefFoundError err) {
// In some dependent projects, log4j may not even be on the classpath at
// runtime, in which case the above instanceof check will throw
// NoClassDefFoundError.
LOG.debug("Could not load Log4JLogger class", err);
isLog4JLogger = false;
}
if (isLog4JLogger) {
Log4JLogger httpLog4JLog = (Log4JLogger)logger;
Logger httpLogger = httpLog4JLog.getLogger();
Appender appender = null;

View File

@ -24,7 +24,6 @@ import java.io.PrintWriter;
import java.net.BindException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Collections;
@ -89,17 +88,19 @@ import com.google.common.collect.Lists;
import com.sun.jersey.spi.container.servlet.ServletContainer;
/**
* Create a Jetty embedded server to answer http requests. The primary goal
* is to serve up status information for the server.
* There are three contexts:
* "/logs/" -> points to the log directory
* "/static/" -> points to common static files (src/webapps/static)
* "/" -> the jsp server code from (src/webapps/<name>)
* Create a Jetty embedded server to answer http requests. The primary goal is
* to serve up status information for the server. There are three contexts:
* "/logs/" -> points to the log directory "/static/" -> points to common static
* files (src/webapps/static) "/" -> the jsp server code from
* (src/webapps/<name>)
*
* This class is a fork of the old HttpServer. HttpServer exists for
* compatibility reasons. See HBASE-10336 for more details.
*/
@InterfaceAudience.LimitedPrivate({"HDFS", "MapReduce", "HBase"})
@InterfaceAudience.Private
@InterfaceStability.Evolving
public class HttpServer implements FilterContainer {
public static final Log LOG = LogFactory.getLog(HttpServer.class);
public final class HttpServer2 implements FilterContainer {
public static final Log LOG = LogFactory.getLog(HttpServer2.class);
static final String FILTER_INITIALIZER_PROPERTY
= "hadoop.http.filter.initializers";
@ -166,11 +167,6 @@ public class HttpServer implements FilterContainer {
// The -keypass option in keytool
private String keyPassword;
@Deprecated
private String bindAddress;
@Deprecated
private int port = -1;
private boolean findPort;
private String hostName;
@ -233,24 +229,6 @@ public class HttpServer implements FilterContainer {
return this;
}
/**
* Use addEndpoint() instead.
*/
@Deprecated
public Builder setBindAddress(String bindAddress){
this.bindAddress = bindAddress;
return this;
}
/**
* Use addEndpoint() instead.
*/
@Deprecated
public Builder setPort(int port) {
this.port = port;
return this;
}
public Builder setFindPort(boolean findPort) {
this.findPort = findPort;
return this;
@ -291,20 +269,11 @@ public class HttpServer implements FilterContainer {
return this;
}
public HttpServer build() throws IOException {
public HttpServer2 build() throws IOException {
if (this.name == null) {
throw new HadoopIllegalArgumentException("name is not set");
}
// Make the behavior compatible with deprecated interfaces
if (bindAddress != null && port != -1) {
try {
endpoints.add(0, new URI("http", "", bindAddress, port, "", "", ""));
} catch (URISyntaxException e) {
throw new HadoopIllegalArgumentException("Invalid endpoint: "+ e);
}
}
if (endpoints.size() == 0 && connector == null) {
throw new HadoopIllegalArgumentException("No endpoints specified");
}
@ -318,7 +287,7 @@ public class HttpServer implements FilterContainer {
conf = new Configuration();
}
HttpServer server = new HttpServer(this);
HttpServer2 server = new HttpServer2(this);
if (this.securityEnabled) {
server.initSpnego(conf, hostName, usernameConfKey, keytabConfKey);
@ -332,7 +301,7 @@ public class HttpServer implements FilterContainer {
Connector listener = null;
String scheme = ep.getScheme();
if ("http".equals(scheme)) {
listener = HttpServer.createDefaultChannelConnector();
listener = HttpServer2.createDefaultChannelConnector();
} else if ("https".equals(scheme)) {
SslSocketConnector c = new SslSocketConnector();
c.setNeedClientAuth(needsClientAuth);
@ -364,104 +333,7 @@ public class HttpServer implements FilterContainer {
}
}
/** Same as this(name, bindAddress, port, findPort, null); */
@Deprecated
public HttpServer(String name, String bindAddress, int port, boolean findPort
) throws IOException {
this(name, bindAddress, port, findPort, new Configuration());
}
@Deprecated
public HttpServer(String name, String bindAddress, int port,
boolean findPort, Configuration conf, Connector connector) throws IOException {
this(name, bindAddress, port, findPort, conf, null, connector, null);
}
/**
* Create a status server on the given port. Allows you to specify the
* path specifications that this server will be serving so that they will be
* added to the filters properly.
*
* @param name The name of the server
* @param bindAddress The address for this server
* @param port The port to use on the server
* @param findPort whether the server should start at the given port and
* increment by 1 until it finds a free port.
* @param conf Configuration
* @param pathSpecs Path specifications that this httpserver will be serving.
* These will be added to any filters.
*/
@Deprecated
public HttpServer(String name, String bindAddress, int port,
boolean findPort, Configuration conf, String[] pathSpecs) throws IOException {
this(name, bindAddress, port, findPort, conf, null, null, pathSpecs);
}
/**
* Create a status server on the given port.
* The jsp scripts are taken from src/webapps/<name>.
* @param name The name of the server
* @param port The port to use on the server
* @param findPort whether the server should start at the given port and
* increment by 1 until it finds a free port.
* @param conf Configuration
*/
@Deprecated
public HttpServer(String name, String bindAddress, int port,
boolean findPort, Configuration conf) throws IOException {
this(name, bindAddress, port, findPort, conf, null, null, null);
}
@Deprecated
public HttpServer(String name, String bindAddress, int port,
boolean findPort, Configuration conf, AccessControlList adminsAcl)
throws IOException {
this(name, bindAddress, port, findPort, conf, adminsAcl, null, null);
}
/**
* Create a status server on the given port.
* The jsp scripts are taken from src/webapps/<name>.
* @param name The name of the server
* @param bindAddress The address for this server
* @param port The port to use on the server
* @param findPort whether the server should start at the given port and
* increment by 1 until it finds a free port.
* @param conf Configuration
* @param adminsAcl {@link AccessControlList} of the admins
*/
@Deprecated
public HttpServer(String name, String bindAddress, int port,
boolean findPort, Configuration conf, AccessControlList adminsAcl,
Connector connector) throws IOException {
this(name, bindAddress, port, findPort, conf, adminsAcl, connector, null);
}
/**
* Create a status server on the given port.
* The jsp scripts are taken from src/webapps/<name>.
* @param name The name of the server
* @param bindAddress The address for this server
* @param port The port to use on the server
* @param findPort whether the server should start at the given port and
* increment by 1 until it finds a free port.
* @param conf Configuration
* @param adminsAcl {@link AccessControlList} of the admins
* @param connector A jetty connection listener
* @param pathSpecs Path specifications that this httpserver will be serving.
* These will be added to any filters.
*/
@Deprecated
public HttpServer(String name, String bindAddress, int port,
boolean findPort, Configuration conf, AccessControlList adminsAcl,
Connector connector, String[] pathSpecs) throws IOException {
this(new Builder().setName(name).hostName(bindAddress)
.addEndpoint(URI.create("http://" + bindAddress + ":" + port))
.setFindPort(findPort).setConf(conf).setACL(adminsAcl)
.setConnector(connector).setPathSpec(pathSpecs));
}
private HttpServer(final Builder b) throws IOException {
private HttpServer2(final Builder b) throws IOException {
final String appDir = getWebAppsPath(b.name);
this.webServer = new Server();
this.adminsAcl = b.adminsAcl;
@ -554,7 +426,7 @@ public class HttpServer implements FilterContainer {
* listener.
*/
public Connector createBaseListener(Configuration conf) throws IOException {
return HttpServer.createDefaultChannelConnector();
return HttpServer2.createDefaultChannelConnector();
}
@InterfaceAudience.Private
@ -1171,7 +1043,7 @@ public class HttpServer implements FilterContainer {
@Override
public void doGet(HttpServletRequest request, HttpServletResponse response)
throws ServletException, IOException {
if (!HttpServer.isInstrumentationAccessAllowed(getServletContext(),
if (!HttpServer2.isInstrumentationAccessAllowed(getServletContext(),
request, response)) {
return;
}
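The hunks above remove the deprecated HttpServer constructors and rename the class to HttpServer2, leaving the Builder as the only way to construct a server. A minimal usage sketch under that assumption follows; the server name, endpoint, and ephemeral port are illustrative, and start()/stop() follow the lifecycle HttpServer2 inherits from the old HttpServer.

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.http.HttpServer2;

public class HttpServer2BuilderExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Endpoints replace the deprecated setBindAddress()/setPort() pair; the
    // URI scheme ("http" or "https") selects the connector, and port 0 plus
    // setFindPort(true) lets the server probe for a free port.
    HttpServer2 server = new HttpServer2.Builder()
        .setName("example")                            // webapp name (illustrative)
        .hostName("localhost")
        .addEndpoint(URI.create("http://localhost:0"))
        .setFindPort(true)
        .setConf(conf)
        .build();
    server.start();
    // ... serves /logs, /static and servlet contexts as described in the javadoc ...
    server.stop();
  }
}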

View File

@ -54,7 +54,8 @@ public class RpcMetrics {
int[] intervals = conf.getInts(
CommonConfigurationKeys.RPC_METRICS_PERCENTILES_INTERVALS_KEY);
rpcQuantileEnable = (intervals.length > 0) && conf.getBoolean(
CommonConfigurationKeys.RPC_METRICS_QUANTILE_ENABLE, false);
CommonConfigurationKeys.RPC_METRICS_QUANTILE_ENABLE,
CommonConfigurationKeys.RPC_METRICS_QUANTILE_ENABLE_DEFAULT);
if (rpcQuantileEnable) {
rpcQueueTimeMillisQuantiles =
new MutableQuantiles[intervals.length];

View File

@ -46,7 +46,7 @@ import javax.servlet.http.HttpServletResponse;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.http.HttpServer;
import org.apache.hadoop.http.HttpServer2;
import org.codehaus.jackson.JsonFactory;
import org.codehaus.jackson.JsonGenerator;
@ -154,7 +154,7 @@ public class JMXJsonServlet extends HttpServlet {
@Override
public void doGet(HttpServletRequest request, HttpServletResponse response) {
try {
if (!HttpServer.isInstrumentationAccessAllowed(getServletContext(),
if (!HttpServer2.isInstrumentationAccessAllowed(getServletContext(),
request, response)) {
return;
}

View File

@ -28,7 +28,7 @@ import org.apache.commons.logging.*;
import org.apache.commons.logging.impl.*;
import org.apache.hadoop.classification.InterfaceAudience;
import org.apache.hadoop.classification.InterfaceStability;
import org.apache.hadoop.http.HttpServer;
import org.apache.hadoop.http.HttpServer2;
import org.apache.hadoop.util.ServletUtil;
/**
@ -93,7 +93,7 @@ public class LogLevel {
) throws ServletException, IOException {
// Do the authorization
if (!HttpServer.hasAdministratorAccess(getServletContext(), request,
if (!HttpServer2.hasAdministratorAccess(getServletContext(), request,
response)) {
return;
}

View File

@ -32,7 +32,7 @@ import javax.servlet.http.HttpServletResponse;
import org.apache.hadoop.classification.InterfaceAudience;
import org.apache.hadoop.classification.InterfaceStability;
import org.apache.hadoop.http.HttpServer;
import org.apache.hadoop.http.HttpServer2;
import org.apache.hadoop.metrics.spi.OutputRecord;
import org.apache.hadoop.metrics.spi.AbstractMetricsContext.MetricMap;
import org.apache.hadoop.metrics.spi.AbstractMetricsContext.TagMap;
@ -108,7 +108,7 @@ public class MetricsServlet extends HttpServlet {
public void doGet(HttpServletRequest request, HttpServletResponse response)
throws ServletException, IOException {
if (!HttpServer.isInstrumentationAccessAllowed(getServletContext(),
if (!HttpServer2.isInstrumentationAccessAllowed(getServletContext(),
request, response)) {
return;
}

View File

@ -17,7 +17,7 @@
*/
package org.apache.hadoop.security;
import org.apache.hadoop.http.HttpServer;
import org.apache.hadoop.http.HttpServer2;
import org.apache.hadoop.security.authentication.server.AuthenticationFilter;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.http.FilterContainer;
@ -94,7 +94,7 @@ public class AuthenticationFilterInitializer extends FilterInitializer {
}
//Resolve _HOST into bind address
String bindAddress = conf.get(HttpServer.BIND_ADDRESS);
String bindAddress = conf.get(HttpServer2.BIND_ADDRESS);
String principal = filterConfig.get(KerberosAuthenticationHandler.PRINCIPAL);
if (principal != null) {
try {

View File

@ -1560,7 +1560,7 @@ public class UserGroupInformation {
return Subject.doAs(subject, action);
} catch (PrivilegedActionException pae) {
Throwable cause = pae.getCause();
LOG.error("PriviledgedActionException as:"+this+" cause:"+cause);
LOG.warn("PriviledgedActionException as:"+this+" cause:"+cause);
if (cause instanceof IOException) {
throw (IOException) cause;
} else if (cause instanceof Error) {

View File

@ -0,0 +1,479 @@
// Code source of this file:
// http://grepcode.com/file/repo1.maven.org/maven2/
// org.apache.maven/maven-artifact/3.1.1/
// org/apache/maven/artifact/versioning/ComparableVersion.java/
//
// Modifications made on top of the source:
// 1. Changed
// package org.apache.maven.artifact.versioning;
// to
// package org.apache.hadoop.util;
// 2. Removed author tags to clear hadoop author tag warning
// author <a href="mailto:kenney@apache.org">Kenney Westerhof</a>
// author <a href="mailto:hboutemy@apache.org">Hervé Boutemy</a>
//
package org.apache.hadoop.util;
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.ListIterator;
import java.util.Locale;
import java.util.Properties;
import java.util.Stack;
/**
* Generic implementation of version comparison.
*
* <p>Features:
* <ul>
* <li>mixing of '<code>-</code>' (dash) and '<code>.</code>' (dot) separators,</li>
* <li>transition between characters and digits also constitutes a separator:
* <code>1.0alpha1 =&gt; [1, 0, alpha, 1]</code></li>
* <li>unlimited number of version components,</li>
* <li>version components in the text can be digits or strings,</li>
* <li>strings are checked for well-known qualifiers and the qualifier ordering is used for version ordering.
* Well-known qualifiers (case insensitive) are:<ul>
* <li><code>alpha</code> or <code>a</code></li>
* <li><code>beta</code> or <code>b</code></li>
* <li><code>milestone</code> or <code>m</code></li>
* <li><code>rc</code> or <code>cr</code></li>
* <li><code>snapshot</code></li>
* <li><code>(the empty string)</code> or <code>ga</code> or <code>final</code></li>
* <li><code>sp</code></li>
* </ul>
* Unknown qualifiers are considered after known qualifiers, with lexical order (always case insensitive),
* </li>
* <li>a dash usually precedes a qualifier, and is always less important than something preceded with a dot.</li>
* </ul></p>
*
* @see <a href="https://cwiki.apache.org/confluence/display/MAVENOLD/Versioning">"Versioning" on Maven Wiki</a>
*/
public class ComparableVersion
implements Comparable<ComparableVersion>
{
private String value;
private String canonical;
private ListItem items;
private interface Item
{
int INTEGER_ITEM = 0;
int STRING_ITEM = 1;
int LIST_ITEM = 2;
int compareTo( Item item );
int getType();
boolean isNull();
}
/**
* Represents a numeric item in the version item list.
*/
private static class IntegerItem
implements Item
{
private static final BigInteger BIG_INTEGER_ZERO = new BigInteger( "0" );
private final BigInteger value;
public static final IntegerItem ZERO = new IntegerItem();
private IntegerItem()
{
this.value = BIG_INTEGER_ZERO;
}
public IntegerItem( String str )
{
this.value = new BigInteger( str );
}
public int getType()
{
return INTEGER_ITEM;
}
public boolean isNull()
{
return BIG_INTEGER_ZERO.equals( value );
}
public int compareTo( Item item )
{
if ( item == null )
{
return BIG_INTEGER_ZERO.equals( value ) ? 0 : 1; // 1.0 == 1, 1.1 > 1
}
switch ( item.getType() )
{
case INTEGER_ITEM:
return value.compareTo( ( (IntegerItem) item ).value );
case STRING_ITEM:
return 1; // 1.1 > 1-sp
case LIST_ITEM:
return 1; // 1.1 > 1-1
default:
throw new RuntimeException( "invalid item: " + item.getClass() );
}
}
public String toString()
{
return value.toString();
}
}
/**
* Represents a string in the version item list, usually a qualifier.
*/
private static class StringItem
implements Item
{
private static final String[] QUALIFIERS = { "alpha", "beta", "milestone", "rc", "snapshot", "", "sp" };
private static final List<String> _QUALIFIERS = Arrays.asList( QUALIFIERS );
private static final Properties ALIASES = new Properties();
static
{
ALIASES.put( "ga", "" );
ALIASES.put( "final", "" );
ALIASES.put( "cr", "rc" );
}
/**
* A comparable value for the empty-string qualifier. This one is used to determine if a given qualifier makes
* the version older than one without a qualifier, or more recent.
*/
private static final String RELEASE_VERSION_INDEX = String.valueOf( _QUALIFIERS.indexOf( "" ) );
private String value;
public StringItem( String value, boolean followedByDigit )
{
if ( followedByDigit && value.length() == 1 )
{
// a1 = alpha-1, b1 = beta-1, m1 = milestone-1
switch ( value.charAt( 0 ) )
{
case 'a':
value = "alpha";
break;
case 'b':
value = "beta";
break;
case 'm':
value = "milestone";
break;
}
}
this.value = ALIASES.getProperty( value , value );
}
public int getType()
{
return STRING_ITEM;
}
public boolean isNull()
{
return ( comparableQualifier( value ).compareTo( RELEASE_VERSION_INDEX ) == 0 );
}
/**
* Returns a comparable value for a qualifier.
*
* This method takes into account the ordering of known qualifiers then unknown qualifiers with lexical ordering.
*
* just returning an Integer with the index here is faster, but requires a lot of if/then/else to check for -1
* or QUALIFIERS.size and then resort to lexical ordering. Most comparisons are decided by the first character,
* so this is still fast. If more characters are needed then it requires a lexical sort anyway.
*
* @param qualifier
* @return an equivalent value that can be used with lexical comparison
*/
public static String comparableQualifier( String qualifier )
{
int i = _QUALIFIERS.indexOf( qualifier );
return i == -1 ? ( _QUALIFIERS.size() + "-" + qualifier ) : String.valueOf( i );
}
public int compareTo( Item item )
{
if ( item == null )
{
// 1-rc < 1, 1-ga > 1
return comparableQualifier( value ).compareTo( RELEASE_VERSION_INDEX );
}
switch ( item.getType() )
{
case INTEGER_ITEM:
return -1; // 1.any < 1.1 ?
case STRING_ITEM:
return comparableQualifier( value ).compareTo( comparableQualifier( ( (StringItem) item ).value ) );
case LIST_ITEM:
return -1; // 1.any < 1-1
default:
throw new RuntimeException( "invalid item: " + item.getClass() );
}
}
public String toString()
{
return value;
}
}
/**
* Represents a version list item. This class is used both for the global item list and for sub-lists (which start
* with '-(number)' in the version specification).
*/
private static class ListItem
extends ArrayList<Item>
implements Item
{
public int getType()
{
return LIST_ITEM;
}
public boolean isNull()
{
return ( size() == 0 );
}
void normalize()
{
for ( ListIterator<Item> iterator = listIterator( size() ); iterator.hasPrevious(); )
{
Item item = iterator.previous();
if ( item.isNull() )
{
iterator.remove(); // remove null trailing items: 0, "", empty list
}
else
{
break;
}
}
}
public int compareTo( Item item )
{
if ( item == null )
{
if ( size() == 0 )
{
return 0; // 1-0 = 1- (normalize) = 1
}
Item first = get( 0 );
return first.compareTo( null );
}
switch ( item.getType() )
{
case INTEGER_ITEM:
return -1; // 1-1 < 1.0.x
case STRING_ITEM:
return 1; // 1-1 > 1-sp
case LIST_ITEM:
Iterator<Item> left = iterator();
Iterator<Item> right = ( (ListItem) item ).iterator();
while ( left.hasNext() || right.hasNext() )
{
Item l = left.hasNext() ? left.next() : null;
Item r = right.hasNext() ? right.next() : null;
// if this is shorter, then invert the compare and mul with -1
int result = l == null ? -1 * r.compareTo( l ) : l.compareTo( r );
if ( result != 0 )
{
return result;
}
}
return 0;
default:
throw new RuntimeException( "invalid item: " + item.getClass() );
}
}
public String toString()
{
StringBuilder buffer = new StringBuilder( "(" );
for ( Iterator<Item> iter = iterator(); iter.hasNext(); )
{
buffer.append( iter.next() );
if ( iter.hasNext() )
{
buffer.append( ',' );
}
}
buffer.append( ')' );
return buffer.toString();
}
}
public ComparableVersion( String version )
{
parseVersion( version );
}
public final void parseVersion( String version )
{
this.value = version;
items = new ListItem();
version = version.toLowerCase( Locale.ENGLISH );
ListItem list = items;
Stack<Item> stack = new Stack<Item>();
stack.push( list );
boolean isDigit = false;
int startIndex = 0;
for ( int i = 0; i < version.length(); i++ )
{
char c = version.charAt( i );
if ( c == '.' )
{
if ( i == startIndex )
{
list.add( IntegerItem.ZERO );
}
else
{
list.add( parseItem( isDigit, version.substring( startIndex, i ) ) );
}
startIndex = i + 1;
}
else if ( c == '-' )
{
if ( i == startIndex )
{
list.add( IntegerItem.ZERO );
}
else
{
list.add( parseItem( isDigit, version.substring( startIndex, i ) ) );
}
startIndex = i + 1;
if ( isDigit )
{
list.normalize(); // 1.0-* = 1-*
if ( ( i + 1 < version.length() ) && Character.isDigit( version.charAt( i + 1 ) ) )
{
// new ListItem only if previous were digits and new char is a digit,
// ie need to differentiate only 1.1 from 1-1
list.add( list = new ListItem() );
stack.push( list );
}
}
}
else if ( Character.isDigit( c ) )
{
if ( !isDigit && i > startIndex )
{
list.add( new StringItem( version.substring( startIndex, i ), true ) );
startIndex = i;
}
isDigit = true;
}
else
{
if ( isDigit && i > startIndex )
{
list.add( parseItem( true, version.substring( startIndex, i ) ) );
startIndex = i;
}
isDigit = false;
}
}
if ( version.length() > startIndex )
{
list.add( parseItem( isDigit, version.substring( startIndex ) ) );
}
while ( !stack.isEmpty() )
{
list = (ListItem) stack.pop();
list.normalize();
}
canonical = items.toString();
}
private static Item parseItem( boolean isDigit, String buf )
{
return isDigit ? new IntegerItem( buf ) : new StringItem( buf, false );
}
public int compareTo( ComparableVersion o )
{
return items.compareTo( o.items );
}
public String toString()
{
return value;
}
public boolean equals( Object o )
{
return ( o instanceof ComparableVersion ) && canonical.equals( ( (ComparableVersion) o ).canonical );
}
public int hashCode()
{
return canonical.hashCode();
}
}
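A short usage sketch of the class above, illustrating the ordering rules its javadoc describes (the sample version strings are arbitrary):

import org.apache.hadoop.util.ComparableVersion;

public class ComparableVersionExample {
  public static void main(String[] args) {
    // Numeric components compare numerically, so 0.3 precedes 0.20.
    System.out.println(new ComparableVersion("0.3")
        .compareTo(new ComparableVersion("0.20")) < 0);            // true

    // Known qualifiers such as "snapshot" order before the bare release.
    System.out.println(new ComparableVersion("2.0-SNAPSHOT")
        .compareTo(new ComparableVersion("2.0")) < 0);             // true

    // "cr" is aliased to "rc", so the two spellings are equal.
    System.out.println(new ComparableVersion("1.0-cr1")
        .equals(new ComparableVersion("1.0-rc1")));                // true
  }
}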

View File

@ -17,54 +17,16 @@
*/
package org.apache.hadoop.util;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.apache.hadoop.classification.InterfaceAudience;
import com.google.common.collect.ComparisonChain;
/**
* A wrapper class to maven's ComparableVersion class, to comply
* with maven's version name string convention
*/
@InterfaceAudience.Private
public abstract class VersionUtil {
private static final Pattern COMPONENT_GROUPS = Pattern.compile("(\\d+)|(\\D+)");
/**
* Suffix added by maven for nightly builds and other snapshot releases.
* These releases are considered to precede the non-SNAPSHOT version
* with the same version number.
*/
private static final String SNAPSHOT_SUFFIX = "-SNAPSHOT";
/**
* This function splits the two versions on &quot;.&quot; and performs a
* naturally-ordered comparison of the resulting components. For example, the
* version string "0.3" is considered to precede "0.20", despite the fact that
* lexical comparison would consider "0.20" to precede "0.3". This method of
* comparison is similar to the method used by package versioning systems like
* deb and RPM.
*
* Version components are compared numerically whenever possible, however a
* version component can contain non-numeric characters. When a non-numeric
* group of characters is found in a version component, this group is compared
* with the similarly-indexed group in the other version component. If the
* other group is numeric, then the numeric group is considered to precede the
* non-numeric group. If both groups are non-numeric, then a lexical
* comparison is performed.
*
* If two versions have a different number of components, then only the lower
* number of components are compared. If those components are identical
* between the two versions, then the version with fewer components is
* considered to precede the version with more components.
*
* In addition to the above rules, there is one special case: maven SNAPSHOT
* releases are considered to precede a non-SNAPSHOT release with an
* otherwise identical version number. For example, 2.0-SNAPSHOT precedes
* 2.0.
*
* This function returns a negative integer if version1 precedes version2, a
* positive integer if version2 precedes version1, and 0 if and only if the
* two versions' components are identical in value and cardinality.
* Compares two version name strings using maven's ComparableVersion class.
*
* @param version1
* the first version to compare
@ -75,58 +37,8 @@ public abstract class VersionUtil {
* versions are equal.
*/
public static int compareVersions(String version1, String version2) {
boolean isSnapshot1 = version1.endsWith(SNAPSHOT_SUFFIX);
boolean isSnapshot2 = version2.endsWith(SNAPSHOT_SUFFIX);
version1 = stripSnapshotSuffix(version1);
version2 = stripSnapshotSuffix(version2);
String[] version1Parts = version1.split("\\.");
String[] version2Parts = version2.split("\\.");
for (int i = 0; i < version1Parts.length && i < version2Parts.length; i++) {
String component1 = version1Parts[i];
String component2 = version2Parts[i];
if (!component1.equals(component2)) {
Matcher matcher1 = COMPONENT_GROUPS.matcher(component1);
Matcher matcher2 = COMPONENT_GROUPS.matcher(component2);
while (matcher1.find() && matcher2.find()) {
String group1 = matcher1.group();
String group2 = matcher2.group();
if (!group1.equals(group2)) {
if (isNumeric(group1) && isNumeric(group2)) {
return Integer.parseInt(group1) - Integer.parseInt(group2);
} else if (!isNumeric(group1) && !isNumeric(group2)) {
return group1.compareTo(group2);
} else {
return isNumeric(group1) ? -1 : 1;
}
}
}
return component1.length() - component2.length();
}
}
return ComparisonChain.start()
.compare(version1Parts.length, version2Parts.length)
.compare(isSnapshot2, isSnapshot1)
.result();
}
private static String stripSnapshotSuffix(String version) {
if (version.endsWith(SNAPSHOT_SUFFIX)) {
return version.substring(0, version.length() - SNAPSHOT_SUFFIX.length());
} else {
return version;
}
}
private static boolean isNumeric(String s) {
try {
Integer.parseInt(s);
return true;
} catch (NumberFormatException nfe) {
return false;
}
ComparableVersion v1 = new ComparableVersion(version1);
ComparableVersion v2 = new ComparableVersion(version2);
return v1.compareTo(v2);
}
}
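A brief sketch of the rewritten compareVersions(), assuming the ComparableVersion semantics shown earlier (the version strings are illustrative):

import org.apache.hadoop.util.VersionUtil;

public class VersionUtilExample {
  public static void main(String[] args) {
    // Components still compare numerically, so 0.3 precedes 0.20.
    System.out.println(VersionUtil.compareVersions("0.3", "0.20") < 0);          // true

    // SNAPSHOT builds still precede the corresponding release.
    System.out.println(VersionUtil.compareVersions("2.0-SNAPSHOT", "2.0") < 0);  // true

    // Identical version strings compare equal.
    System.out.println(VersionUtil.compareVersions("2.3.0", "2.3.0") == 0);      // true
  }
}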

View File

@ -1275,4 +1275,26 @@
Default, "dr.who=;" will consider "dr.who" as user without groups.
</description>
</property>
<property>
<name>rpc.metrics.quantile.enable</name>
<value>false</value>
<description>
When this property is set to true and rpc.metrics.percentiles.intervals
is set to a comma-separated list of granularities in seconds, the
50/75/90/95/99th percentile latencies for rpc queue/processing time, in
milliseconds, are added to rpc metrics.
</description>
</property>
<property>
<name>rpc.metrics.percentiles.intervals</name>
<value></value>
<description>
A comma-separated list of the granularity in seconds for the metrics which
describe the 50/75/90/95/99th percentile latency for rpc queue/processing
time. The metrics are emitted only if rpc.metrics.quantile.enable is set
to true.
</description>
</property>
</configuration>
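These two properties are the ones read through CommonConfigurationKeys in the RpcMetrics hunk earlier in this diff. A hedged sketch of setting them programmatically on a Configuration object; the 60- and 300-second intervals are arbitrary sample values.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.CommonConfigurationKeys;

public class RpcQuantileMetricsConfigExample {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Enable the 50/75/90/95/99th percentile latency metrics ...
    conf.setBoolean(CommonConfigurationKeys.RPC_METRICS_QUANTILE_ENABLE, true);
    // ... and roll them over every 60 and 300 seconds (sample intervals).
    conf.set(CommonConfigurationKeys.RPC_METRICS_PERCENTILES_INTERVALS_KEY, "60,300");

    // RpcMetrics enables the quantiles only when the interval list is non-empty
    // and the boolean flag is true, mirroring the check in the RpcMetrics hunk.
    int[] intervals = conf.getInts(
        CommonConfigurationKeys.RPC_METRICS_PERCENTILES_INTERVALS_KEY);
    System.out.println(intervals.length);  // 2
  }
}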

View File

@ -18,8 +18,6 @@
Hadoop MapReduce Next Generation - CLI MiniCluster.
\[ {{{./index.html}Go Back}} \]
%{toc|section=1|fromDepth=0}
* {Purpose}
@ -42,7 +40,8 @@ Hadoop MapReduce Next Generation - CLI MiniCluster.
$ mvn clean install -DskipTests
$ mvn package -Pdist -Dtar -DskipTests -Dmaven.javadoc.skip
+---+
<<NOTE:>> You will need protoc 2.5.0 installed.
<<NOTE:>> You will need {{{http://code.google.com/p/protobuf/}protoc 2.5.0}}
installed.
The tarball should be available in <<<hadoop-dist/target/>>> directory.

View File

@ -16,8 +16,6 @@
---
${maven.build.timestamp}
\[ {{{../index.html}Go Back}} \]
%{toc|section=1|fromDepth=0}
Hadoop MapReduce Next Generation - Cluster Setup
@ -29,7 +27,7 @@ Hadoop MapReduce Next Generation - Cluster Setup
with thousands of nodes.
To play with Hadoop, you may first want to install it on a single
machine (see {{{SingleCluster}Single Node Setup}}).
machine (see {{{./SingleCluster.html}Single Node Setup}}).
* {Prerequisites}
@ -571,440 +569,6 @@ $ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh stop proxyserver --config $HADOOP_CONF_D
$ $HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh stop historyserver --config $HADOOP_CONF_DIR
----
* {Running Hadoop in Secure Mode}
This section deals with important parameters to be specified in order
to run Hadoop in <<secure mode>> with strong, Kerberos-based
authentication.
* <<<User Accounts for Hadoop Daemons>>>
Ensure that HDFS and YARN daemons run as different Unix users, e.g.
<<<hdfs>>> and <<<yarn>>>. Also, ensure that the MapReduce JobHistory
server runs as user <<<mapred>>>.
It's recommended to have them share a Unix group, e.g. <<<hadoop>>>.
*---------------+----------------------------------------------------------------------+
|| User:Group || Daemons |
*---------------+----------------------------------------------------------------------+
| hdfs:hadoop | NameNode, Secondary NameNode, Checkpoint Node, Backup Node, DataNode |
*---------------+----------------------------------------------------------------------+
| yarn:hadoop | ResourceManager, NodeManager |
*---------------+----------------------------------------------------------------------+
| mapred:hadoop | MapReduce JobHistory Server |
*---------------+----------------------------------------------------------------------+
* <<<Permissions for both HDFS and local fileSystem paths>>>
The following table lists various paths on HDFS and local filesystems (on
all nodes) and recommended permissions:
*-------------------+-------------------+------------------+------------------+
|| Filesystem || Path || User:Group || Permissions |
*-------------------+-------------------+------------------+------------------+
| local | <<<dfs.namenode.name.dir>>> | hdfs:hadoop | drwx------ |
*-------------------+-------------------+------------------+------------------+
| local | <<<dfs.datanode.data.dir>>> | hdfs:hadoop | drwx------ |
*-------------------+-------------------+------------------+------------------+
| local | $HADOOP_LOG_DIR | hdfs:hadoop | drwxrwxr-x |
*-------------------+-------------------+------------------+------------------+
| local | $YARN_LOG_DIR | yarn:hadoop | drwxrwxr-x |
*-------------------+-------------------+------------------+------------------+
| local | <<<yarn.nodemanager.local-dirs>>> | yarn:hadoop | drwxr-xr-x |
*-------------------+-------------------+------------------+------------------+
| local | <<<yarn.nodemanager.log-dirs>>> | yarn:hadoop | drwxr-xr-x |
*-------------------+-------------------+------------------+------------------+
| local | container-executor | root:hadoop | --Sr-s--- |
*-------------------+-------------------+------------------+------------------+
| local | <<<conf/container-executor.cfg>>> | root:hadoop | r-------- |
*-------------------+-------------------+------------------+------------------+
| hdfs | / | hdfs:hadoop | drwxr-xr-x |
*-------------------+-------------------+------------------+------------------+
| hdfs | /tmp | hdfs:hadoop | drwxrwxrwxt |
*-------------------+-------------------+------------------+------------------+
| hdfs | /user | hdfs:hadoop | drwxr-xr-x |
*-------------------+-------------------+------------------+------------------+
| hdfs | <<<yarn.nodemanager.remote-app-log-dir>>> | yarn:hadoop | drwxrwxrwxt |
*-------------------+-------------------+------------------+------------------+
| hdfs | <<<mapreduce.jobhistory.intermediate-done-dir>>> | mapred:hadoop | |
| | | | drwxrwxrwxt |
*-------------------+-------------------+------------------+------------------+
| hdfs | <<<mapreduce.jobhistory.done-dir>>> | mapred:hadoop | |
| | | | drwxr-x--- |
*-------------------+-------------------+------------------+------------------+
* Kerberos Keytab files
* HDFS
The NameNode keytab file, on the NameNode host, should look like the
following:
----
$ /usr/kerberos/bin/klist -e -k -t /etc/security/keytab/nn.service.keytab
Keytab name: FILE:/etc/security/keytab/nn.service.keytab
KVNO Timestamp Principal
4 07/18/11 21:08:09 nn/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 nn/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 nn/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
----
The Secondary NameNode keytab file, on that host, should look like the
following:
----
$ /usr/kerberos/bin/klist -e -k -t /etc/security/keytab/sn.service.keytab
Keytab name: FILE:/etc/security/keytab/sn.service.keytab
KVNO Timestamp Principal
4 07/18/11 21:08:09 sn/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 sn/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 sn/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
----
The DataNode keytab file, on each host, should look like the following:
----
$ /usr/kerberos/bin/klist -e -k -t /etc/security/keytab/dn.service.keytab
Keytab name: FILE:/etc/security/keytab/dn.service.keytab
KVNO Timestamp Principal
4 07/18/11 21:08:09 dn/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 dn/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 dn/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
----
* YARN
The ResourceManager keytab file, on the ResourceManager host, should look
like the following:
----
$ /usr/kerberos/bin/klist -e -k -t /etc/security/keytab/rm.service.keytab
Keytab name: FILE:/etc/security/keytab/rm.service.keytab
KVNO Timestamp Principal
4 07/18/11 21:08:09 rm/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 rm/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 rm/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
----
The NodeManager keytab file, on each host, should look like the following:
----
$ /usr/kerberos/bin/klist -e -k -t /etc/security/keytab/nm.service.keytab
Keytab name: FILE:/etc/security/keytab/nm.service.keytab
KVNO Timestamp Principal
4 07/18/11 21:08:09 nm/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 nm/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 nm/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
----
* MapReduce JobHistory Server
The MapReduce JobHistory Server keytab file, on that host, should look
like the following:
----
$ /usr/kerberos/bin/klist -e -k -t /etc/security/keytab/jhs.service.keytab
Keytab name: FILE:/etc/security/keytab/jhs.service.keytab
KVNO Timestamp Principal
4 07/18/11 21:08:09 jhs/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 jhs/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 jhs/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
----
** Configuration in Secure Mode
* <<<conf/core-site.xml>>>
*-------------------------+-------------------------+------------------------+
|| Parameter || Value || Notes |
*-------------------------+-------------------------+------------------------+
| <<<hadoop.security.authentication>>> | <kerberos> | <simple> is non-secure. |
*-------------------------+-------------------------+------------------------+
| <<<hadoop.security.authorization>>> | <true> | |
| | | Enable RPC service-level authorization. |
*-------------------------+-------------------------+------------------------+
* <<<conf/hdfs-site.xml>>>
* Configurations for NameNode:
*-------------------------+-------------------------+------------------------+
|| Parameter || Value || Notes |
*-------------------------+-------------------------+------------------------+
| <<<dfs.block.access.token.enable>>> | <true> | |
| | | Enable HDFS block access tokens for secure operations. |
*-------------------------+-------------------------+------------------------+
| <<<dfs.https.enable>>> | <true> | |
| | | This value is deprecated. Use dfs.http.policy |
*-------------------------+-------------------------+------------------------+
| <<<dfs.http.policy>>> | <HTTP_ONLY> or <HTTPS_ONLY> or <HTTP_AND_HTTPS> | |
| | | HTTPS_ONLY turns off http access |
*-------------------------+-------------------------+------------------------+
| <<<dfs.namenode.https-address>>> | <nn_host_fqdn:50470> | |
*-------------------------+-------------------------+------------------------+
| <<<dfs.https.port>>> | <50470> | |
*-------------------------+-------------------------+------------------------+
| <<<dfs.namenode.keytab.file>>> | </etc/security/keytab/nn.service.keytab> | |
| | | Kerberos keytab file for the NameNode. |
*-------------------------+-------------------------+------------------------+
| <<<dfs.namenode.kerberos.principal>>> | nn/_HOST@REALM.TLD | |
| | | Kerberos principal name for the NameNode. |
*-------------------------+-------------------------+------------------------+
| <<<dfs.namenode.kerberos.https.principal>>> | host/_HOST@REALM.TLD | |
| | | HTTPS Kerberos principal name for the NameNode. |
*-------------------------+-------------------------+------------------------+
* Configurations for Secondary NameNode:
*-------------------------+-------------------------+------------------------+
|| Parameter || Value || Notes |
*-------------------------+-------------------------+------------------------+
| <<<dfs.namenode.secondary.http-address>>> | <c_nn_host_fqdn:50090> | |
*-------------------------+-------------------------+------------------------+
| <<<dfs.namenode.secondary.https-port>>> | <50470> | |
*-------------------------+-------------------------+------------------------+
| <<<dfs.namenode.secondary.keytab.file>>> | | |
| | </etc/security/keytab/sn.service.keytab> | |
| | | Kerberos keytab file for the NameNode. |
*-------------------------+-------------------------+------------------------+
| <<<dfs.namenode.secondary.kerberos.principal>>> | sn/_HOST@REALM.TLD | |
| | | Kerberos principal name for the Secondary NameNode. |
*-------------------------+-------------------------+------------------------+
| <<<dfs.namenode.secondary.kerberos.https.principal>>> | | |
| | host/_HOST@REALM.TLD | |
| | | HTTPS Kerberos principal name for the Secondary NameNode. |
*-------------------------+-------------------------+------------------------+
* Configurations for DataNode:
*-------------------------+-------------------------+------------------------+
|| Parameter || Value || Notes |
*-------------------------+-------------------------+------------------------+
| <<<dfs.datanode.data.dir.perm>>> | 700 | |
*-------------------------+-------------------------+------------------------+
| <<<dfs.datanode.address>>> | <0.0.0.0:2003> | |
*-------------------------+-------------------------+------------------------+
| <<<dfs.datanode.https.address>>> | <0.0.0.0:2005> | |
*-------------------------+-------------------------+------------------------+
| <<<dfs.datanode.keytab.file>>> | </etc/security/keytab/dn.service.keytab> | |
| | | Kerberos keytab file for the DataNode. |
*-------------------------+-------------------------+------------------------+
| <<<dfs.datanode.kerberos.principal>>> | dn/_HOST@REALM.TLD | |
| | | Kerberos principal name for the DataNode. |
*-------------------------+-------------------------+------------------------+
| <<<dfs.datanode.kerberos.https.principal>>> | | |
| | host/_HOST@REALM.TLD | |
| | | HTTPS Kerberos principal name for the DataNode. |
*-------------------------+-------------------------+------------------------+
* <<<conf/yarn-site.xml>>>
* WebAppProxy
The <<<WebAppProxy>>> provides a proxy between the web applications
exported by an application and an end user. If security is enabled
it will warn users before accessing a potentially unsafe web application.
Authentication and authorization using the proxy is handled just like
any other privileged web application.
*-------------------------+-------------------------+------------------------+
|| Parameter || Value || Notes |
*-------------------------+-------------------------+------------------------+
| <<<yarn.web-proxy.address>>> | | |
| | <<<WebAppProxy>>> host:port for proxy to AM web apps. | |
| | | <host:port> if this is the same as <<<yarn.resourcemanager.webapp.address>>>|
| | | or it is not defined then the <<<ResourceManager>>> will run the proxy|
| | | otherwise a standalone proxy server will need to be launched.|
*-------------------------+-------------------------+------------------------+
| <<<yarn.web-proxy.keytab>>> | | |
| | </etc/security/keytab/web-app.service.keytab> | |
| | | Kerberos keytab file for the WebAppProxy. |
*-------------------------+-------------------------+------------------------+
| <<<yarn.web-proxy.principal>>> | wap/_HOST@REALM.TLD | |
| | | Kerberos principal name for the WebAppProxy. |
*-------------------------+-------------------------+------------------------+
* LinuxContainerExecutor
  A <<<ContainerExecutor>>> is used by the YARN framework to define how a
  <container> is launched and controlled.

  The following container executors are available in Hadoop YARN:
*--------------------------------------+--------------------------------------+
|| ContainerExecutor || Description |
*--------------------------------------+--------------------------------------+
| <<<DefaultContainerExecutor>>> | |
| | The default executor which YARN uses to manage container execution. |
| | The container process has the same Unix user as the NodeManager. |
*--------------------------------------+--------------------------------------+
| <<<LinuxContainerExecutor>>> | |
| | Supported only on GNU/Linux, this executor runs the containers as either the |
| | YARN user who submitted the application (when full security is enabled) or |
| | as a dedicated user (defaults to nobody) when full security is not enabled. |
| | When full security is enabled, this executor requires all user accounts to be |
| | created on the cluster nodes where the containers are launched. It uses |
| | a <setuid> executable that is included in the Hadoop distribution. |
| | The NodeManager uses this executable to launch and kill containers. |
| | The setuid executable switches to the user who has submitted the |
| | application and launches or kills the containers. For maximum security, |
| | this executor sets up restricted permissions and user/group ownership of |
| | local files and directories used by the containers such as the shared |
| | objects, jars, intermediate files, log files etc. Particularly note that, |
| | because of this, except the application owner and NodeManager, no other |
| | user can access any of the local files/directories including those |
| | localized as part of the distributed cache. |
*--------------------------------------+--------------------------------------+
To build the LinuxContainerExecutor executable run:
----
$ mvn package -Dcontainer-executor.conf.dir=/etc/hadoop/
----
The path passed in <<<-Dcontainer-executor.conf.dir>>> should be the
path on the cluster nodes where a configuration file for the setuid
executable should be located. The executable should be installed in
$HADOOP_YARN_HOME/bin.
The executable must have specific permissions: 6050 or --Sr-s---
permissions user-owned by <root> (super-user) and group-owned by a
special group (e.g. <<<hadoop>>>) of which the NodeManager Unix user is
the group member and no ordinary application user is. If any application
user belongs to this special group, security will be compromised. This
special group name should be specified for the configuration property
<<<yarn.nodemanager.linux-container-executor.group>>> in both
<<<conf/yarn-site.xml>>> and <<<conf/container-executor.cfg>>>.
  For example, let's say that the NodeManager is run as user <yarn>, who is
  part of the groups <users> and <hadoop>, either of which may be its primary group.
  Suppose also that <users> has both <yarn> and another user
  (the application submitter) <alice> as its members, and <alice> does not
  belong to <hadoop>. Going by the above description, the setuid/setgid
  executable should be set to 6050 or --Sr-s--- with user-owner <root> and
  group-owner <hadoop>, which has <yarn> as its member (and not <users>,
  which also has <alice> as a member besides <yarn>).

  The <<<LinuxContainerExecutor>>> requires that the paths leading up to, and
  including, the directories specified in <<<yarn.nodemanager.local-dirs>>> and
  <<<yarn.nodemanager.log-dirs>>> be set to 755 permissions, as described
  above in the table on permissions on directories.
* <<<conf/container-executor.cfg>>>
The executable requires a configuration file called
<<<container-executor.cfg>>> to be present in the configuration
directory passed to the mvn target mentioned above.
The configuration file must be owned by the user running NodeManager
(user <<<yarn>>> in the above example), group-owned by anyone and
should have the permissions 0400 or r--------.
  The executable requires the following configuration items to be present
  in the <<<conf/container-executor.cfg>>> file. The items should be
  specified as simple key=value pairs, one per line:
*-------------------------+-------------------------+------------------------+
|| Parameter || Value || Notes |
*-------------------------+-------------------------+------------------------+
| <<<yarn.nodemanager.linux-container-executor.group>>> | <hadoop> | |
| | | Unix group of the NodeManager. The group owner of the |
| | |<container-executor> binary should be this group. Should be same as the |
| | | value with which the NodeManager is configured. This configuration is |
| | | required for validating the secure access of the <container-executor> |
| | | binary. |
*-------------------------+-------------------------+------------------------+
| <<<banned.users>>> | hdfs,yarn,mapred,bin | Banned users. |
*-------------------------+-------------------------+------------------------+
| <<<allowed.system.users>>> | foo,bar | Allowed system users. |
*-------------------------+-------------------------+------------------------+
| <<<min.user.id>>> | 1000 | Prevent other super-users. |
*-------------------------+-------------------------+------------------------+
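  Putting the rows above together, a sketch of a complete
  <<<conf/container-executor.cfg>>> might look like the following; the
  <<<allowed.system.users>>> entries are placeholders taken from the table,
  not real accounts:

----
yarn.nodemanager.linux-container-executor.group=hadoop
banned.users=hdfs,yarn,mapred,bin
allowed.system.users=foo,bar
min.user.id=1000
----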
  To recap, here are the local file-system permissions required for the
  various paths related to the <<<LinuxContainerExecutor>>>:
*-------------------+-------------------+------------------+------------------+
|| Filesystem || Path || User:Group || Permissions |
*-------------------+-------------------+------------------+------------------+
| local | container-executor | root:hadoop | --Sr-s--- |
*-------------------+-------------------+------------------+------------------+
| local | <<<conf/container-executor.cfg>>> | root:hadoop | r-------- |
*-------------------+-------------------+------------------+------------------+
| local | <<<yarn.nodemanager.local-dirs>>> | yarn:hadoop | drwxr-xr-x |
*-------------------+-------------------+------------------+------------------+
| local | <<<yarn.nodemanager.log-dirs>>> | yarn:hadoop | drwxr-xr-x |
*-------------------+-------------------+------------------+------------------+
* Configurations for ResourceManager:
*-------------------------+-------------------------+------------------------+
|| Parameter || Value || Notes |
*-------------------------+-------------------------+------------------------+
| <<<yarn.resourcemanager.keytab>>> | | |
| | </etc/security/keytab/rm.service.keytab> | |
| | | Kerberos keytab file for the ResourceManager. |
*-------------------------+-------------------------+------------------------+
| <<<yarn.resourcemanager.principal>>> | rm/_HOST@REALM.TLD | |
| | | Kerberos principal name for the ResourceManager. |
*-------------------------+-------------------------+------------------------+
* Configurations for NodeManager:
*-------------------------+-------------------------+------------------------+
|| Parameter || Value || Notes |
*-------------------------+-------------------------+------------------------+
| <<<yarn.nodemanager.keytab>>> | </etc/security/keytab/nm.service.keytab> | |
| | | Kerberos keytab file for the NodeManager. |
*-------------------------+-------------------------+------------------------+
| <<<yarn.nodemanager.principal>>> | nm/_HOST@REALM.TLD | |
| | | Kerberos principal name for the NodeManager. |
*-------------------------+-------------------------+------------------------+
| <<<yarn.nodemanager.container-executor.class>>> | | |
| | <<<org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor>>> |
| | | Use LinuxContainerExecutor. |
*-------------------------+-------------------------+------------------------+
| <<<yarn.nodemanager.linux-container-executor.group>>> | <hadoop> | |
| | | Unix group of the NodeManager. |
*-------------------------+-------------------------+------------------------+
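  As a sketch only, the ResourceManager and NodeManager settings above map
  onto a <<<conf/yarn-site.xml>>> fragment such as the following (keytab
  paths and principals reuse the examples from the tables):

----
<property>
  <name>yarn.resourcemanager.keytab</name>
  <value>/etc/security/keytab/rm.service.keytab</value>
</property>
<property>
  <name>yarn.resourcemanager.principal</name>
  <value>rm/_HOST@REALM.TLD</value>
</property>
<property>
  <name>yarn.nodemanager.keytab</name>
  <value>/etc/security/keytab/nm.service.keytab</value>
</property>
<property>
  <name>yarn.nodemanager.principal</name>
  <value>nm/_HOST@REALM.TLD</value>
</property>
<property>
  <name>yarn.nodemanager.container-executor.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
</property>
<property>
  <name>yarn.nodemanager.linux-container-executor.group</name>
  <value>hadoop</value>
</property>
----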
* <<<conf/mapred-site.xml>>>
* Configurations for MapReduce JobHistory Server:
*-------------------------+-------------------------+------------------------+
|| Parameter || Value || Notes |
*-------------------------+-------------------------+------------------------+
| <<<mapreduce.jobhistory.address>>> | | |
| | MapReduce JobHistory Server <host:port> | Default port is 10020. |
*-------------------------+-------------------------+------------------------+
| <<<mapreduce.jobhistory.keytab>>> | |
| | </etc/security/keytab/jhs.service.keytab> | |
| | | Kerberos keytab file for the MapReduce JobHistory Server. |
*-------------------------+-------------------------+------------------------+
| <<<mapreduce.jobhistory.principal>>> | jhs/_HOST@REALM.TLD | |
| | | Kerberos principal name for the MapReduce JobHistory Server. |
*-------------------------+-------------------------+------------------------+
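  For example, a minimal <<<conf/mapred-site.xml>>> fragment for a secured
  JobHistory Server could look like this; the host name <<<jhs.example.com>>>
  is a placeholder and the port is the default mentioned above:

----
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>jhs.example.com:10020</value>
</property>
<property>
  <name>mapreduce.jobhistory.keytab</name>
  <value>/etc/security/keytab/jhs.service.keytab</value>
</property>
<property>
  <name>mapreduce.jobhistory.principal</name>
  <value>jhs/_HOST@REALM.TLD</value>
</property>
----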
* {Operating the Hadoop Cluster}

View File

@ -44,8 +44,9 @@ Overview
Generic Options
The following options are supported by {{dfsadmin}}, {{fs}}, {{fsck}},
{{job}} and {{fetchdt}}. Applications should implement {{{some_useful_url}Tool}} to support
{{{another_useful_url}GenericOptions}}.
{{job}} and {{fetchdt}}. Applications should implement
{{{../../api/org/apache/hadoop/util/Tool.html}Tool}} to support
GenericOptions.
*------------------------------------------------+-----------------------------+
|| GENERIC_OPTION || Description
@ -123,7 +124,8 @@ User Commands
* <<<fsck>>>
Runs a HDFS filesystem checking utility. See {{Fsck}} for more info.
Runs a HDFS filesystem checking utility.
See {{{../hadoop-hdfs/HdfsUserGuide.html#fsck}fsck}} for more info.
Usage: <<<hadoop fsck [GENERIC_OPTIONS] <path> [-move | -delete | -openforwrite] [-files [-blocks [-locations | -racks]]]>>>
@ -149,7 +151,8 @@ User Commands
* <<<fetchdt>>>
Gets Delegation Token from a NameNode. See {{fetchdt}} for more info.
Gets Delegation Token from a NameNode.
See {{{../hadoop-hdfs/HdfsUserGuide.html#fetchdt}fetchdt}} for more info.
Usage: <<<hadoop fetchdt [GENERIC_OPTIONS] [--webservice <namenode_http_addr>] <path> >>>
@ -302,7 +305,8 @@ Administration Commands
* <<<balancer>>>
Runs a cluster balancing utility. An administrator can simply press Ctrl-C
to stop the rebalancing process. See Rebalancer for more details.
to stop the rebalancing process. See
{{{../hadoop-hdfs/HdfsUserGuide.html#Rebalancer}Rebalancer}} for more details.
Usage: <<<hadoop balancer [-threshold <threshold>]>>>
@ -445,7 +449,7 @@ Administration Commands
* <<<namenode>>>
Runs the namenode. More info about the upgrade, rollback and finalize is
at Upgrade Rollback
at {{{../hadoop-hdfs/HdfsUserGuide.html#Upgrade_and_Rollback}Upgrade Rollback}}.
Usage: <<<hadoop namenode [-format] | [-upgrade] | [-rollback] | [-finalize] | [-importCheckpoint]>>>
@ -474,8 +478,9 @@ Administration Commands
* <<<secondarynamenode>>>
Runs the HDFS secondary namenode. See Secondary Namenode for more
info.
Runs the HDFS secondary namenode.
See {{{../hadoop-hdfs/HdfsUserGuide.html#Secondary_NameNode}Secondary Namenode}}
for more info.
Usage: <<<hadoop secondarynamenode [-checkpoint [force]] | [-geteditsize]>>>

View File

@ -233,9 +233,10 @@ hand-in-hand to address this.
* In particular for MapReduce applications, the developer community will
try our best to support provide binary compatibility across major
releases e.g. applications using org.apache.hadop.mapred.* APIs are
supported compatibly across hadoop-1.x and hadoop-2.x. See
{{{../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduce_Compatibility_Hadoop1_Hadoop2.html}
releases e.g. applications using org.apache.hadoop.mapred.
* APIs are supported compatibly across hadoop-1.x and hadoop-2.x. See
{{{../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduce_Compatibility_Hadoop1_Hadoop2.html}
Compatibility for MapReduce applications between hadoop-1.x and hadoop-2.x}}
for more details.
@ -248,13 +249,13 @@ hand-in-hand to address this.
* {{{../hadoop-hdfs/WebHDFS.html}WebHDFS}} - Stable
* {{{../hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html}ResourceManager}}
* {{{../../hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html}ResourceManager}}
* {{{../hadoop-yarn/hadoop-yarn-site/NodeManagerRest.html}NodeManager}}
* {{{../../hadoop-yarn/hadoop-yarn-site/NodeManagerRest.html}NodeManager}}
* {{{../hadoop-yarn/hadoop-yarn-site/MapredAppMasterRest.html}MR Application Master}}
* {{{../../hadoop-yarn/hadoop-yarn-site/MapredAppMasterRest.html}MR Application Master}}
* {{{../hadoop-yarn/hadoop-yarn-site/HistoryServerRest.html}History Server}}
* {{{../../hadoop-yarn/hadoop-yarn-site/HistoryServerRest.html}History Server}}
*** Policy
@ -512,7 +513,8 @@ hand-in-hand to address this.
{{{https://issues.apache.org/jira/browse/HADOOP-9517}HADOOP-9517}}
* Binary compatibility for MapReduce end-user applications between hadoop-1.x and hadoop-2.x -
{{{../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduce_Compatibility_Hadoop1_Hadoop2.html}MapReduce Compatibility between hadoop-1.x and hadoop-2.x}}
{{{../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduce_Compatibility_Hadoop1_Hadoop2.html}
MapReduce Compatibility between hadoop-1.x and hadoop-2.x}}
* Annotations for interfaces as per interface classification
schedule -

View File

@ -88,7 +88,7 @@ chgrp
Change group association of files. The user must be the owner of files, or
else a super-user. Additional information is in the
{{{betterurl}Permissions Guide}}.
{{{../hadoop-hdfs/HdfsPermissionsGuide.html}Permissions Guide}}.
Options
@ -101,7 +101,7 @@ chmod
Change the permissions of files. With -R, make the change recursively
through the directory structure. The user must be the owner of the file, or
else a super-user. Additional information is in the
{{{betterurl}Permissions Guide}}.
{{{../hadoop-hdfs/HdfsPermissionsGuide.html}Permissions Guide}}.
Options
@ -112,7 +112,7 @@ chown
Usage: <<<hdfs dfs -chown [-R] [OWNER][:[GROUP]] URI [URI ]>>>
Change the owner of files. The user must be a super-user. Additional information
is in the {{{betterurl}Permissions Guide}}.
is in the {{{../hadoop-hdfs/HdfsPermissionsGuide.html}Permissions Guide}}.
Options
@ -210,8 +210,8 @@ expunge
Usage: <<<hdfs dfs -expunge>>>
Empty the Trash. Refer to the {{{betterurl}HDFS Architecture Guide}} for
more information on the Trash feature.
Empty the Trash. Refer to the {{{../hadoop-hdfs/HdfsDesign.html}
HDFS Architecture Guide}} for more information on the Trash feature.
get
@ -439,7 +439,9 @@ test
Options:
* The -e option will check to see if the file exists, returning 0 if true.
* The -z option will check to see if the file is zero length, returning 0 if true.
* The -d option will check to see if the path is directory, returning 0 if true.
Example:

View File

@ -18,8 +18,6 @@
Hadoop Interface Taxonomy: Audience and Stability Classification
\[ {{{./index.html}Go Back}} \]
%{toc|section=1|fromDepth=0}
* Motivation

View File

@ -117,23 +117,19 @@ Native Libraries Guide
* zlib-development package (stable version >= 1.2.0)
Once you installed the prerequisite packages use the standard hadoop
build.xml file and pass along the compile.native flag (set to true) to
build the native hadoop library:
pom.xml file and pass along the native flag to build the native hadoop
library:
----
$ ant -Dcompile.native=true <target>
   $ mvn package -Pdist,native -DskipTests -Dtar
----
You should see the newly-built library in:
----
$ build/native/<platform>/lib
$ hadoop-dist/target/hadoop-${project.version}/lib/native
----
where <platform> is a combination of the system-properties:
${os.name}-${os.arch}-${sun.arch.data.model} (for example,
Linux-i386-32).
Please note the following:
* It is mandatory to install both the zlib and gzip development

View File

@ -0,0 +1,637 @@
~~ Licensed under the Apache License, Version 2.0 (the "License");
~~ you may not use this file except in compliance with the License.
~~ You may obtain a copy of the License at
~~
~~ http://www.apache.org/licenses/LICENSE-2.0
~~
~~ Unless required by applicable law or agreed to in writing, software
~~ distributed under the License is distributed on an "AS IS" BASIS,
~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~~ See the License for the specific language governing permissions and
~~ limitations under the License. See accompanying LICENSE file.
---
Hadoop in Secure Mode
---
---
${maven.build.timestamp}
%{toc|section=0|fromDepth=0|toDepth=3}
Hadoop in Secure Mode
* Introduction
This document describes how to configure authentication for Hadoop in
secure mode.
  By default, Hadoop runs in non-secure mode in which no actual
  authentication is required.
  By configuring Hadoop to run in secure mode,
  each user and service must be authenticated by Kerberos
  in order to use Hadoop services.
Security features of Hadoop consist of
{{{Authentication}authentication}},
{{{./ServiceLevelAuth.html}service level authorization}},
{{{./HttpAuthentication.html}authentication for Web consoles}}
  and {{{Data confidentiality}data confidentiality}}.
* Authentication
** End User Accounts
  When service level authentication is turned on,
  end users must authenticate themselves with Kerberos before using Hadoop services.
  The simplest way to authenticate is with the <<<kinit>>> command of Kerberos.
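  As a sketch, assuming a user principal <<<alice@REALM.TLD>>> already exists
  in the KDC, obtaining and inspecting a ticket looks like:

----
$ kinit alice@REALM.TLD
Password for alice@REALM.TLD:
$ klist -e
----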
** User Accounts for Hadoop Daemons
Ensure that HDFS and YARN daemons run as different Unix users,
e.g. <<<hdfs>>> and <<<yarn>>>.
  Also, ensure that the MapReduce JobHistory server runs as a
  different user, such as <<<mapred>>>.
  It is recommended that they share a Unix group, e.g. <<<hadoop>>>.
See also "{{Mapping from user to group}}" for group management.
*---------------+----------------------------------------------------------------------+
|| User:Group || Daemons |
*---------------+----------------------------------------------------------------------+
| hdfs:hadoop | NameNode, Secondary NameNode, JournalNode, DataNode |
*---------------+----------------------------------------------------------------------+
| yarn:hadoop | ResourceManager, NodeManager |
*---------------+----------------------------------------------------------------------+
| mapred:hadoop | MapReduce JobHistory Server |
*---------------+----------------------------------------------------------------------+
** Kerberos principals for Hadoop Daemons and Users
  To run Hadoop service daemons in secure mode,
  Kerberos principals are required.
  Each service reads its authentication information from a keytab file with appropriate permissions.

  HTTP web consoles should be served by a principal different from the one used for RPC.

  The subsections below show example credentials for the Hadoop services.
*** HDFS
The NameNode keytab file, on the NameNode host, should look like the
following:
----
$ klist -e -k -t /etc/security/keytab/nn.service.keytab
Keytab name: FILE:/etc/security/keytab/nn.service.keytab
KVNO Timestamp Principal
4 07/18/11 21:08:09 nn/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 nn/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 nn/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
----
The Secondary NameNode keytab file, on that host, should look like the
following:
----
$ klist -e -k -t /etc/security/keytab/sn.service.keytab
Keytab name: FILE:/etc/security/keytab/sn.service.keytab
KVNO Timestamp Principal
4 07/18/11 21:08:09 sn/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 sn/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 sn/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
----
The DataNode keytab file, on each host, should look like the following:
----
$ klist -e -k -t /etc/security/keytab/dn.service.keytab
Keytab name: FILE:/etc/security/keytab/dn.service.keytab
KVNO Timestamp Principal
4 07/18/11 21:08:09 dn/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 dn/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 dn/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
----
*** YARN
The ResourceManager keytab file, on the ResourceManager host, should look
like the following:
----
$ klist -e -k -t /etc/security/keytab/rm.service.keytab
Keytab name: FILE:/etc/security/keytab/rm.service.keytab
KVNO Timestamp Principal
4 07/18/11 21:08:09 rm/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 rm/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 rm/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
----
The NodeManager keytab file, on each host, should look like the following:
----
$ klist -e -k -t /etc/security/keytab/nm.service.keytab
Keytab name: FILE:/etc/security/keytab/nm.service.keytab
KVNO Timestamp Principal
4 07/18/11 21:08:09 nm/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 nm/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 nm/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
----
*** MapReduce JobHistory Server
The MapReduce JobHistory Server keytab file, on that host, should look
like the following:
----
$ klist -e -k -t /etc/security/keytab/jhs.service.keytab
Keytab name: FILE:/etc/security/keytab/jhs.service.keytab
KVNO Timestamp Principal
4 07/18/11 21:08:09 jhs/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 jhs/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 jhs/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
----
** Mapping from Kerberos principal to OS user account
  Hadoop maps a Kerberos principal to an OS user account using
  the rules specified by <<<hadoop.security.auth_to_local>>>,
  which work in the same way as the <<<auth_to_local>>> setting in the
  {{{http://web.mit.edu/Kerberos/krb5-latest/doc/admin/conf_files/krb5_conf.html}Kerberos configuration file (krb5.conf)}}.
  By default, it picks the first component of the principal name as the user name
  if the realm matches the <<<default_realm>>> (usually defined in /etc/krb5.conf).
  For example, <<<host/full.qualified.domain.name@REALM.TLD>>> is mapped to <<<host>>>
  by the default rule.
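  The following <<<core-site.xml>>> fragment is an illustrative sketch only
  (the rules are assumptions, not recommendations); it maps the <<<rm>>> and
  <<<nm>>> service principals to the <<<yarn>>> account and otherwise falls
  back to the default rule:

----
<property>
  <name>hadoop.security.auth_to_local</name>
  <value>
    RULE:[2:$1@$0](rm@.*REALM.TLD)s/.*/yarn/
    RULE:[2:$1@$0](nm@.*REALM.TLD)s/.*/yarn/
    DEFAULT
  </value>
</property>
----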
** Mapping from user to group
  Though files on HDFS are associated with an owner and a group,
  Hadoop has no notion of groups of its own.
  The mapping from user to group is done by the OS or LDAP.

  You can change the mapping mechanism by
  specifying the name of a mapping provider as the value of
  <<<hadoop.security.group.mapping>>>.
  See the {{{../hadoop-hdfs/HdfsPermissionsGuide.html}HDFS Permissions Guide}} for details.

  In practice, you need to manage a single sign-on environment using Kerberos with LDAP
  for Hadoop in secure mode.
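  For example (a sketch only; the default shell-based provider is usually
  sufficient), the LDAP-based provider shipped with Hadoop can be selected in
  <<<core-site.xml>>> as below. Additional
  <<<hadoop.security.group.mapping.ldap.*>>> properties are then required to
  point at the LDAP server:

----
<property>
  <name>hadoop.security.group.mapping</name>
  <value>org.apache.hadoop.security.LdapGroupsMapping</value>
</property>
----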
** Proxy user
  Some products, such as Apache Oozie, access Hadoop services
  on behalf of end users and need to be able to impersonate them.
  You can configure a proxy user using the properties
  <<<hadoop.proxyuser.${superuser}.hosts>>> and <<<hadoop.proxyuser.${superuser}.groups>>>.

  For example, with the configuration below in core-site.xml,
  the user named <<<oozie>>> accessing from any host
  can impersonate any user belonging to any group.
----
<property>
<name>hadoop.proxyuser.oozie.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.oozie.groups</name>
<value>*</value>
</property>
----
** Secure DataNode
  Because the data transfer protocol of the DataNode
  does not use the RPC framework of Hadoop,
  a DataNode must authenticate itself by
  using privileged ports, which are specified by
  <<<dfs.datanode.address>>> and <<<dfs.datanode.http.address>>>.
  This authentication is based on the assumption
  that an attacker won't be able to get root privileges.

  When you execute the <<<hdfs datanode>>> command as root,
  the server process binds the privileged ports first,
  then drops privileges and runs as the user account specified by
  <<<HADOOP_SECURE_DN_USER>>>.
  This startup process uses jsvc installed in <<<JSVC_HOME>>>.
  You must specify <<<HADOOP_SECURE_DN_USER>>> and <<<JSVC_HOME>>>
  as environment variables at startup (in hadoop-env.sh).
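  A minimal sketch of the corresponding <<<hadoop-env.sh>>> entries; the jsvc
  path is an assumption and will differ per installation:

----
# Run the DataNode as this user after the privileged ports are bound
export HADOOP_SECURE_DN_USER=hdfs

# Directory containing the jsvc binary (placeholder path)
export JSVC_HOME=/usr/lib/jsvc
----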
* Data confidentiality
** Data Encryption on RPC
  The data transferred between Hadoop services and clients can be encrypted on the wire.
  Setting <<<hadoop.rpc.protection>>> to <<<"privacy">>> in core-site.xml
  activates data encryption.
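  For instance, encryption of RPC traffic can be turned on in
  <<<core-site.xml>>> as follows:

----
<property>
  <name>hadoop.rpc.protection</name>
  <value>privacy</value>
</property>
----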
** Data Encryption on Block Data Transfer

  You need to set <<<dfs.encrypt.data.transfer>>> to <<<"true">>> in hdfs-site.xml
  in order to activate data encryption for the data transfer protocol of the DataNode.
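  For example, the corresponding <<<hdfs-site.xml>>> entry would be:

----
<property>
  <name>dfs.encrypt.data.transfer</name>
  <value>true</value>
</property>
----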
** Data Encryption on HTTP
  Data transfer between the web console and clients is protected by using SSL (HTTPS).
* Configuration
** Permissions for both HDFS and local filesystem paths
The following table lists various paths on HDFS and local filesystems (on
all nodes) and recommended permissions:
*-------------------+-------------------+------------------+------------------+
|| Filesystem || Path || User:Group || Permissions |
*-------------------+-------------------+------------------+------------------+
| local | <<<dfs.namenode.name.dir>>> | hdfs:hadoop | drwx------ |
*-------------------+-------------------+------------------+------------------+
| local | <<<dfs.datanode.data.dir>>> | hdfs:hadoop | drwx------ |
*-------------------+-------------------+------------------+------------------+
| local | $HADOOP_LOG_DIR | hdfs:hadoop | drwxrwxr-x |
*-------------------+-------------------+------------------+------------------+
| local | $YARN_LOG_DIR | yarn:hadoop | drwxrwxr-x |
*-------------------+-------------------+------------------+------------------+
| local | <<<yarn.nodemanager.local-dirs>>> | yarn:hadoop | drwxr-xr-x |
*-------------------+-------------------+------------------+------------------+
| local | <<<yarn.nodemanager.log-dirs>>> | yarn:hadoop | drwxr-xr-x |
*-------------------+-------------------+------------------+------------------+
| local | container-executor | root:hadoop | --Sr-s--- |
*-------------------+-------------------+------------------+------------------+
| local | <<<conf/container-executor.cfg>>> | root:hadoop | r-------- |
*-------------------+-------------------+------------------+------------------+
| hdfs | / | hdfs:hadoop | drwxr-xr-x |
*-------------------+-------------------+------------------+------------------+
| hdfs | /tmp | hdfs:hadoop | drwxrwxrwxt |
*-------------------+-------------------+------------------+------------------+
| hdfs | /user | hdfs:hadoop | drwxr-xr-x |
*-------------------+-------------------+------------------+------------------+
| hdfs | <<<yarn.nodemanager.remote-app-log-dir>>> | yarn:hadoop | drwxrwxrwxt |
*-------------------+-------------------+------------------+------------------+
| hdfs | <<<mapreduce.jobhistory.intermediate-done-dir>>> | mapred:hadoop | |
| | | | drwxrwxrwxt |
*-------------------+-------------------+------------------+------------------+
| hdfs | <<<mapreduce.jobhistory.done-dir>>> | mapred:hadoop | |
| | | | drwxr-x--- |
*-------------------+-------------------+------------------+------------------+
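  The shell commands below sketch how the local entries in this table might
  be applied on a single node; the directory paths are placeholders, so
  substitute the values actually configured for <<<dfs.namenode.name.dir>>>,
  <<<dfs.datanode.data.dir>>>, <<<yarn.nodemanager.local-dirs>>> and
  <<<yarn.nodemanager.log-dirs>>>:

----
# Placeholder paths for dfs.namenode.name.dir and dfs.datanode.data.dir
$ chown -R hdfs:hadoop /data/hadoop/namenode /data/hadoop/datanode
$ chmod 700 /data/hadoop/namenode /data/hadoop/datanode

# Placeholder paths for yarn.nodemanager.local-dirs and yarn.nodemanager.log-dirs
$ chown -R yarn:hadoop /data/hadoop/yarn-local /data/hadoop/yarn-logs
$ chmod 755 /data/hadoop/yarn-local /data/hadoop/yarn-logs
----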
** Common Configurations
  In order to turn on RPC authentication in Hadoop,
  set the value of the <<<hadoop.security.authentication>>> property to
  <<<"kerberos">>>, and set the security-related settings listed below appropriately.
The following properties should be in the <<<core-site.xml>>> of all the
nodes in the cluster.
*-------------------------+-------------------------+------------------------+
|| Parameter || Value || Notes |
*-------------------------+-------------------------+------------------------+
| <<<hadoop.security.authentication>>> | <kerberos> | |
| | | <<<simple>>> : No authentication. (default) \
| | | <<<kerberos>>> : Enable authentication by Kerberos. |
*-------------------------+-------------------------+------------------------+
| <<<hadoop.security.authorization>>> | <true> | |
| | | Enable {{{./ServiceLevelAuth.html}RPC service-level authorization}}. |
*-------------------------+-------------------------+------------------------+
| <<<hadoop.rpc.protection>>> | <authentication> |
| | | <authentication> : authentication only (default) \
| | | <integrity> : integrity check in addition to authentication \
| | | <privacy> : data encryption in addition to integrity |
*-------------------------+-------------------------+------------------------+
| <<<hadoop.security.auth_to_local>>> | | |
| | <<<RULE:>>><exp1>\
| | <<<RULE:>>><exp2>\
| | <...>\
| | DEFAULT |
| | | The value is string containing new line characters.
| | | See
| | | {{{http://web.mit.edu/Kerberos/krb5-latest/doc/admin/conf_files/krb5_conf.html}Kerberos documentation}}
| | | for format for <exp>.
*-------------------------+-------------------------+------------------------+
| <<<hadoop.proxyuser.>>><superuser><<<.hosts>>> | | |
| | | Comma-separated list of hosts from which the <superuser> is allowed to perform impersonation. |
| | | <<<*>>> means wildcard. |
*-------------------------+-------------------------+------------------------+
| <<<hadoop.proxyuser.>>><superuser><<<.groups>>> | | |
| | | Comma-separated list of groups whose users may be impersonated by the <superuser>. |
| | | <<<*>>> means wildcard. |
*-------------------------+-------------------------+------------------------+
Configuration for <<<conf/core-site.xml>>>
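  Pulling the first two rows of the table together, a minimal secure
  <<<conf/core-site.xml>>> fragment is sketched below; the proxy user and
  <<<hadoop.security.auth_to_local>>> settings from the earlier sections
  would be added here as needed:

----
<property>
  <name>hadoop.security.authentication</name>
  <value>kerberos</value>
</property>
<property>
  <name>hadoop.security.authorization</name>
  <value>true</value>
</property>
----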
** NameNode
*-------------------------+-------------------------+------------------------+
|| Parameter || Value || Notes |
*-------------------------+-------------------------+------------------------+
| <<<dfs.block.access.token.enable>>> | <true> | |
| | | Enable HDFS block access tokens for secure operations. |
*-------------------------+-------------------------+------------------------+
| <<<dfs.https.enable>>> | <true> | |
| | | This value is deprecated. Use dfs.http.policy |
*-------------------------+-------------------------+------------------------+
| <<<dfs.http.policy>>> | <HTTP_ONLY> or <HTTPS_ONLY> or <HTTP_AND_HTTPS> | |
| | | HTTPS_ONLY turns off http access |
*-------------------------+-------------------------+------------------------+
| <<<dfs.namenode.https-address>>> | <nn_host_fqdn:50470> | |
*-------------------------+-------------------------+------------------------+
| <<<dfs.https.port>>> | <50470> | |
*-------------------------+-------------------------+------------------------+
| <<<dfs.namenode.keytab.file>>> | </etc/security/keytab/nn.service.keytab> | |
| | | Kerberos keytab file for the NameNode. |
*-------------------------+-------------------------+------------------------+
| <<<dfs.namenode.kerberos.principal>>> | nn/_HOST@REALM.TLD | |
| | | Kerberos principal name for the NameNode. |
*-------------------------+-------------------------+------------------------+
| <<<dfs.namenode.kerberos.https.principal>>> | host/_HOST@REALM.TLD | |
| | | HTTPS Kerberos principal name for the NameNode. |
*-------------------------+-------------------------+------------------------+
Configuration for <<<conf/hdfs-site.xml>>>
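  As an example only, the NameNode rows above translate into a
  <<<conf/hdfs-site.xml>>> fragment such as the following; the host name
  <<<nn.example.com>>> is a placeholder:

----
<property>
  <name>dfs.block.access.token.enable</name>
  <value>true</value>
</property>
<property>
  <name>dfs.http.policy</name>
  <value>HTTPS_ONLY</value>
</property>
<property>
  <name>dfs.namenode.https-address</name>
  <value>nn.example.com:50470</value>
</property>
<property>
  <name>dfs.namenode.keytab.file</name>
  <value>/etc/security/keytab/nn.service.keytab</value>
</property>
<property>
  <name>dfs.namenode.kerberos.principal</name>
  <value>nn/_HOST@REALM.TLD</value>
</property>
----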
** Secondary NameNode
*-------------------------+-------------------------+------------------------+
|| Parameter || Value || Notes |
*-------------------------+-------------------------+------------------------+
| <<<dfs.namenode.secondary.http-address>>> | <c_nn_host_fqdn:50090> | |
*-------------------------+-------------------------+------------------------+
| <<<dfs.namenode.secondary.https-port>>> | <50470> | |
*-------------------------+-------------------------+------------------------+
| <<<dfs.namenode.secondary.keytab.file>>> | | |
| | </etc/security/keytab/sn.service.keytab> | |
| | | Kerberos keytab file for the Secondary NameNode. |
*-------------------------+-------------------------+------------------------+
| <<<dfs.namenode.secondary.kerberos.principal>>> | sn/_HOST@REALM.TLD | |
| | | Kerberos principal name for the Secondary NameNode. |
*-------------------------+-------------------------+------------------------+
| <<<dfs.namenode.secondary.kerberos.https.principal>>> | | |
| | host/_HOST@REALM.TLD | |
| | | HTTPS Kerberos principal name for the Secondary NameNode. |
*-------------------------+-------------------------+------------------------+
Configuration for <<<conf/hdfs-site.xml>>>
** DataNode
*-------------------------+-------------------------+------------------------+
|| Parameter || Value || Notes |
*-------------------------+-------------------------+------------------------+
| <<<dfs.datanode.data.dir.perm>>> | 700 | |
*-------------------------+-------------------------+------------------------+
| <<<dfs.datanode.address>>> | <0.0.0.0:1004> | |
| | | Secure DataNode must use privileged port |
| | | in order to assure that the server was started securely. |
| | | This means that the server must be started via jsvc. |
*-------------------------+-------------------------+------------------------+
| <<<dfs.datanode.http.address>>> | <0.0.0.0:1006> | |
| | | Secure DataNode must use privileged port |
| | | in order to assure that the server was started securely. |
| | | This means that the server must be started via jsvc. |
*-------------------------+-------------------------+------------------------+
| <<<dfs.datanode.https.address>>> | <0.0.0.0:50470> | |
*-------------------------+-------------------------+------------------------+
| <<<dfs.datanode.keytab.file>>> | </etc/security/keytab/dn.service.keytab> | |
| | | Kerberos keytab file for the DataNode. |
*-------------------------+-------------------------+------------------------+
| <<<dfs.datanode.kerberos.principal>>> | dn/_HOST@REALM.TLD | |
| | | Kerberos principal name for the DataNode. |
*-------------------------+-------------------------+------------------------+
| <<<dfs.datanode.kerberos.https.principal>>> | | |
| | host/_HOST@REALM.TLD | |
| | | HTTPS Kerberos principal name for the DataNode. |
*-------------------------+-------------------------+------------------------+
| <<<dfs.encrypt.data.transfer>>> | <false> | |
| | | set to <<<true>>> when using data encryption |
*-------------------------+-------------------------+------------------------+
Configuration for <<<conf/hdfs-site.xml>>>
** WebHDFS
*-------------------------+-------------------------+------------------------+
|| Parameter || Value || Notes |
*-------------------------+-------------------------+------------------------+
| <<<dfs.webhdfs.enabled>>> | <true> | |
| | | Enable security on WebHDFS. |
*-------------------------+-------------------------+------------------------+
| <<<dfs.web.authentication.kerberos.principal>>> | http/_HOST@REALM.TLD | |
| | | Kerberos principal name for WebHDFS. |
*-------------------------+-------------------------+------------------------+
| <<<dfs.web.authentication.kerberos.keytab>>> | </etc/security/keytab/http.service.keytab> | |
| | | Kerberos keytab file for WebHDFS. |
*-------------------------+-------------------------+------------------------+
Configuration for <<<conf/hdfs-site.xml>>>
** ResourceManager
*-------------------------+-------------------------+------------------------+
|| Parameter || Value || Notes |
*-------------------------+-------------------------+------------------------+
| <<<yarn.resourcemanager.keytab>>> | | |
| | </etc/security/keytab/rm.service.keytab> | |
| | | Kerberos keytab file for the ResourceManager. |
*-------------------------+-------------------------+------------------------+
| <<<yarn.resourcemanager.principal>>> | rm/_HOST@REALM.TLD | |
| | | Kerberos principal name for the ResourceManager. |
*-------------------------+-------------------------+------------------------+
Configuration for <<<conf/yarn-site.xml>>>
** NodeManager
*-------------------------+-------------------------+------------------------+
|| Parameter || Value || Notes |
*-------------------------+-------------------------+------------------------+
| <<<yarn.nodemanager.keytab>>> | </etc/security/keytab/nm.service.keytab> | |
| | | Kerberos keytab file for the NodeManager. |
*-------------------------+-------------------------+------------------------+
| <<<yarn.nodemanager.principal>>> | nm/_HOST@REALM.TLD | |
| | | Kerberos principal name for the NodeManager. |
*-------------------------+-------------------------+------------------------+
| <<<yarn.nodemanager.container-executor.class>>> | | |
| | <<<org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor>>> |
| | | Use LinuxContainerExecutor. |
*-------------------------+-------------------------+------------------------+
| <<<yarn.nodemanager.linux-container-executor.group>>> | <hadoop> | |
| | | Unix group of the NodeManager. |
*-------------------------+-------------------------+------------------------+
| <<<yarn.nodemanager.linux-container-executor.path>>> | </path/to/bin/container-executor> | |
| | | The path to the executable of Linux container executor. |
*-------------------------+-------------------------+------------------------+
Configuration for <<<conf/yarn-site.xml>>>
** Configuration for WebAppProxy
The <<<WebAppProxy>>> provides a proxy between the web applications
exported by an application and an end user. If security is enabled
it will warn users before accessing a potentially unsafe web application.
Authentication and authorization using the proxy is handled just like
any other privileged web application.
*-------------------------+-------------------------+------------------------+
|| Parameter || Value || Notes |
*-------------------------+-------------------------+------------------------+
| <<<yarn.web-proxy.address>>> | | |
| | <<<WebAppProxy>>> host:port for proxy to AM web apps. | |
| | | <host:port> if this is the same as <<<yarn.resourcemanager.webapp.address>>>|
| | | or it is not defined then the <<<ResourceManager>>> will run the proxy|
| | | otherwise a standalone proxy server will need to be launched.|
*-------------------------+-------------------------+------------------------+
| <<<yarn.web-proxy.keytab>>> | | |
| | </etc/security/keytab/web-app.service.keytab> | |
| | | Kerberos keytab file for the WebAppProxy. |
*-------------------------+-------------------------+------------------------+
| <<<yarn.web-proxy.principal>>> | wap/_HOST@REALM.TLD | |
| | | Kerberos principal name for the WebAppProxy. |
*-------------------------+-------------------------+------------------------+
Configuration for <<<conf/yarn-site.xml>>>
** LinuxContainerExecutor
  A <<<ContainerExecutor>>> is used by the YARN framework to define how a
  <container> is launched and controlled.

  The following container executors are available in Hadoop YARN:
*--------------------------------------+--------------------------------------+
|| ContainerExecutor || Description |
*--------------------------------------+--------------------------------------+
| <<<DefaultContainerExecutor>>> | |
| | The default executor which YARN uses to manage container execution. |
| | The container process has the same Unix user as the NodeManager. |
*--------------------------------------+--------------------------------------+
| <<<LinuxContainerExecutor>>> | |
| | Supported only on GNU/Linux, this executor runs the containers as either the |
| | YARN user who submitted the application (when full security is enabled) or |
| | as a dedicated user (defaults to nobody) when full security is not enabled. |
| | When full security is enabled, this executor requires all user accounts to be |
| | created on the cluster nodes where the containers are launched. It uses |
| | a <setuid> executable that is included in the Hadoop distribution. |
| | The NodeManager uses this executable to launch and kill containers. |
| | The setuid executable switches to the user who has submitted the |
| | application and launches or kills the containers. For maximum security, |
| | this executor sets up restricted permissions and user/group ownership of |
| | local files and directories used by the containers such as the shared |
| | objects, jars, intermediate files, log files etc. Particularly note that, |
| | because of this, except the application owner and NodeManager, no other |
| | user can access any of the local files/directories including those |
| | localized as part of the distributed cache. |
*--------------------------------------+--------------------------------------+
To build the LinuxContainerExecutor executable run:
----
$ mvn package -Dcontainer-executor.conf.dir=/etc/hadoop/
----
The path passed in <<<-Dcontainer-executor.conf.dir>>> should be the
path on the cluster nodes where a configuration file for the setuid
executable should be located. The executable should be installed in
$HADOOP_YARN_HOME/bin.
The executable must have specific permissions: 6050 or --Sr-s---
permissions user-owned by <root> (super-user) and group-owned by a
special group (e.g. <<<hadoop>>>) of which the NodeManager Unix user is
the group member and no ordinary application user is. If any application
user belongs to this special group, security will be compromised. This
special group name should be specified for the configuration property
<<<yarn.nodemanager.linux-container-executor.group>>> in both
<<<conf/yarn-site.xml>>> and <<<conf/container-executor.cfg>>>.
  For example, let's say that the NodeManager is run as user <yarn>, who is
  part of the groups <users> and <hadoop>, either of which may be its primary group.
  Suppose also that <users> has both <yarn> and another user
  (the application submitter) <alice> as its members, and <alice> does not
  belong to <hadoop>. Going by the above description, the setuid/setgid
  executable should be set to 6050 or --Sr-s--- with user-owner <root> and
  group-owner <hadoop>, which has <yarn> as its member (and not <users>,
  which also has <alice> as a member besides <yarn>).
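  Continuing this example, the required ownership and permissions could be
  applied as follows; the install location under <<<$HADOOP_YARN_HOME/bin>>>
  is taken from the build instructions above:

----
# Resulting mode is --Sr-s---: setuid/setgid, readable and executable only by group hadoop
$ chown root:hadoop $HADOOP_YARN_HOME/bin/container-executor
$ chmod 6050 $HADOOP_YARN_HOME/bin/container-executor
----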
  The <<<LinuxContainerExecutor>>> requires that the paths leading up to, and
  including, the directories specified in <<<yarn.nodemanager.local-dirs>>> and
  <<<yarn.nodemanager.log-dirs>>> be set to 755 permissions, as described
  above in the table on permissions on directories.
* <<<conf/container-executor.cfg>>>
The executable requires a configuration file called
<<<container-executor.cfg>>> to be present in the configuration
directory passed to the mvn target mentioned above.
The configuration file must be owned by the user running NodeManager
(user <<<yarn>>> in the above example), group-owned by anyone and
should have the permissions 0400 or r--------.
  The executable requires the following configuration items to be present
  in the <<<conf/container-executor.cfg>>> file. The items should be
  specified as simple key=value pairs, one per line:
*-------------------------+-------------------------+------------------------+
|| Parameter || Value || Notes |
*-------------------------+-------------------------+------------------------+
| <<<yarn.nodemanager.linux-container-executor.group>>> | <hadoop> | |
| | | Unix group of the NodeManager. The group owner of the |
| | |<container-executor> binary should be this group. Should be same as the |
| | | value with which the NodeManager is configured. This configuration is |
| | | required for validating the secure access of the <container-executor> |
| | | binary. |
*-------------------------+-------------------------+------------------------+
| <<<banned.users>>> | hdfs,yarn,mapred,bin | Banned users. |
*-------------------------+-------------------------+------------------------+
| <<<allowed.system.users>>> | foo,bar | Allowed system users. |
*-------------------------+-------------------------+------------------------+
| <<<min.user.id>>> | 1000 | Prevent other super-users. |
*-------------------------+-------------------------+------------------------+
  Configuration for <<<conf/container-executor.cfg>>>
  To recap, here are the local file-system permissions required for the
  various paths related to the <<<LinuxContainerExecutor>>>:
*-------------------+-------------------+------------------+------------------+
|| Filesystem || Path || User:Group || Permissions |
*-------------------+-------------------+------------------+------------------+
| local | container-executor | root:hadoop | --Sr-s--- |
*-------------------+-------------------+------------------+------------------+
| local | <<<conf/container-executor.cfg>>> | root:hadoop | r-------- |
*-------------------+-------------------+------------------+------------------+
| local | <<<yarn.nodemanager.local-dirs>>> | yarn:hadoop | drwxr-xr-x |
*-------------------+-------------------+------------------+------------------+
| local | <<<yarn.nodemanager.log-dirs>>> | yarn:hadoop | drwxr-xr-x |
*-------------------+-------------------+------------------+------------------+
** MapReduce JobHistory Server
*-------------------------+-------------------------+------------------------+
|| Parameter || Value || Notes |
*-------------------------+-------------------------+------------------------+
| <<<mapreduce.jobhistory.address>>> | | |
| | MapReduce JobHistory Server <host:port> | Default port is 10020. |
*-------------------------+-------------------------+------------------------+
| <<<mapreduce.jobhistory.keytab>>> | |
| | </etc/security/keytab/jhs.service.keytab> | |
| | | Kerberos keytab file for the MapReduce JobHistory Server. |
*-------------------------+-------------------------+------------------------+
| <<<mapreduce.jobhistory.principal>>> | jhs/_HOST@REALM.TLD | |
| | | Kerberos principal name for the MapReduce JobHistory Server. |
*-------------------------+-------------------------+------------------------+
Configuration for <<<conf/mapred-site.xml>>>

View File

@ -29,8 +29,10 @@ Service Level Authorization Guide
Make sure Hadoop is installed, configured and setup correctly. For more
information see:
* Single Node Setup for first-time users.
* Cluster Setup for large, distributed clusters.
* {{{./SingleCluster.html}Single Node Setup}} for first-time users.
* {{{./ClusterSetup.html}Cluster Setup}} for large, distributed clusters.
* Overview

View File

@ -18,8 +18,6 @@
Hadoop MapReduce Next Generation - Setting up a Single Node Cluster.
\[ {{{./index.html}Go Back}} \]
%{toc|section=1|fromDepth=0}
* Mapreduce Tarball
@ -32,7 +30,8 @@ $ mvn clean install -DskipTests
$ cd hadoop-mapreduce-project
$ mvn clean install assembly:assembly -Pnative
+---+
<<NOTE:>> You will need protoc 2.5.0 installed.
<<NOTE:>> You will need {{{http://code.google.com/p/protobuf}protoc 2.5.0}}
installed.
To ignore the native builds in mapreduce you can omit the <<<-Pnative>>> argument
for maven. The tarball should be available in <<<target/>>> directory.

View File

@ -23,7 +23,7 @@ import org.apache.hadoop.net.NetUtils;
import org.apache.hadoop.security.authorize.AccessControlList;
import org.junit.Assert;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.http.HttpServer.Builder;
import org.apache.hadoop.http.HttpServer2.Builder;
import java.io.File;
import java.io.IOException;
@ -33,7 +33,7 @@ import java.net.URL;
import java.net.MalformedURLException;
/**
* This is a base class for functional tests of the {@link HttpServer}.
* This is a base class for functional tests of the {@link HttpServer2}.
* The methods are static for other classes to import statically.
*/
public class HttpServerFunctionalTest extends Assert {
@ -54,7 +54,7 @@ public class HttpServerFunctionalTest extends Assert {
* @throws IOException if a problem occurs
* @throws AssertionError if a condition was not met
*/
public static HttpServer createTestServer() throws IOException {
public static HttpServer2 createTestServer() throws IOException {
prepareTestWebapp();
return createServer(TEST);
}
@ -68,13 +68,13 @@ public class HttpServerFunctionalTest extends Assert {
* @throws IOException if a problem occurs
* @throws AssertionError if a condition was not met
*/
public static HttpServer createTestServer(Configuration conf)
public static HttpServer2 createTestServer(Configuration conf)
throws IOException {
prepareTestWebapp();
return createServer(TEST, conf);
}
public static HttpServer createTestServer(Configuration conf, AccessControlList adminsAcl)
public static HttpServer2 createTestServer(Configuration conf, AccessControlList adminsAcl)
throws IOException {
prepareTestWebapp();
return createServer(TEST, conf, adminsAcl);
@ -89,7 +89,7 @@ public class HttpServerFunctionalTest extends Assert {
* @throws IOException if a problem occurs
* @throws AssertionError if a condition was not met
*/
public static HttpServer createTestServer(Configuration conf,
public static HttpServer2 createTestServer(Configuration conf,
String[] pathSpecs) throws IOException {
prepareTestWebapp();
return createServer(TEST, conf, pathSpecs);
@ -120,10 +120,10 @@ public class HttpServerFunctionalTest extends Assert {
* @return the server
* @throws IOException if it could not be created
*/
public static HttpServer createServer(String host, int port)
public static HttpServer2 createServer(String host, int port)
throws IOException {
prepareTestWebapp();
return new HttpServer.Builder().setName(TEST)
return new HttpServer2.Builder().setName(TEST)
.addEndpoint(URI.create("http://" + host + ":" + port))
.setFindPort(true).build();
}
@ -134,7 +134,7 @@ public class HttpServerFunctionalTest extends Assert {
* @return the server
* @throws IOException if it could not be created
*/
public static HttpServer createServer(String webapp) throws IOException {
public static HttpServer2 createServer(String webapp) throws IOException {
return localServerBuilder(webapp).setFindPort(true).build();
}
/**
@ -144,18 +144,18 @@ public class HttpServerFunctionalTest extends Assert {
* @return the server
* @throws IOException if it could not be created
*/
public static HttpServer createServer(String webapp, Configuration conf)
public static HttpServer2 createServer(String webapp, Configuration conf)
throws IOException {
return localServerBuilder(webapp).setFindPort(true).setConf(conf).build();
}
public static HttpServer createServer(String webapp, Configuration conf, AccessControlList adminsAcl)
public static HttpServer2 createServer(String webapp, Configuration conf, AccessControlList adminsAcl)
throws IOException {
return localServerBuilder(webapp).setFindPort(true).setConf(conf).setACL(adminsAcl).build();
}
private static Builder localServerBuilder(String webapp) {
return new HttpServer.Builder().setName(webapp).addEndpoint(
return new HttpServer2.Builder().setName(webapp).addEndpoint(
URI.create("http://localhost:0"));
}
@ -167,7 +167,7 @@ public class HttpServerFunctionalTest extends Assert {
* @return the server
* @throws IOException if it could not be created
*/
public static HttpServer createServer(String webapp, Configuration conf,
public static HttpServer2 createServer(String webapp, Configuration conf,
String[] pathSpecs) throws IOException {
return localServerBuilder(webapp).setFindPort(true).setConf(conf).setPathSpec(pathSpecs).build();
}
@ -180,8 +180,8 @@ public class HttpServerFunctionalTest extends Assert {
* @throws IOException on any failure
* @throws AssertionError if a condition was not met
*/
public static HttpServer createAndStartTestServer() throws IOException {
HttpServer server = createTestServer();
public static HttpServer2 createAndStartTestServer() throws IOException {
HttpServer2 server = createTestServer();
server.start();
return server;
}
@ -191,7 +191,7 @@ public class HttpServerFunctionalTest extends Assert {
* @param server to stop
* @throws Exception on any failure
*/
public static void stop(HttpServer server) throws Exception {
public static void stop(HttpServer2 server) throws Exception {
if (server != null) {
server.stop();
}
@ -203,7 +203,7 @@ public class HttpServerFunctionalTest extends Assert {
* @return a URL bonded to the base of the server
* @throws MalformedURLException if the URL cannot be created.
*/
public static URL getServerURL(HttpServer server)
public static URL getServerURL(HttpServer2 server)
throws MalformedURLException {
assertNotNull("No server", server);
return new URL("http://"

View File

@ -40,7 +40,7 @@ import org.apache.hadoop.net.NetUtils;
import org.junit.Test;
public class TestGlobalFilter extends HttpServerFunctionalTest {
static final Log LOG = LogFactory.getLog(HttpServer.class);
static final Log LOG = LogFactory.getLog(HttpServer2.class);
static final Set<String> RECORDS = new TreeSet<String>();
/** A very simple filter that records accessed uri's */
@ -106,9 +106,9 @@ public class TestGlobalFilter extends HttpServerFunctionalTest {
Configuration conf = new Configuration();
//start a http server with CountingFilter
conf.set(HttpServer.FILTER_INITIALIZER_PROPERTY,
conf.set(HttpServer2.FILTER_INITIALIZER_PROPERTY,
RecordingFilter.Initializer.class.getName());
HttpServer http = createTestServer(conf);
HttpServer2 http = createTestServer(conf);
http.start();
final String fsckURL = "/fsck";

View File

@ -68,8 +68,8 @@ public class TestHtmlQuoting {
@Test
public void testRequestQuoting() throws Exception {
HttpServletRequest mockReq = Mockito.mock(HttpServletRequest.class);
HttpServer.QuotingInputFilter.RequestQuoter quoter =
new HttpServer.QuotingInputFilter.RequestQuoter(mockReq);
HttpServer2.QuotingInputFilter.RequestQuoter quoter =
new HttpServer2.QuotingInputFilter.RequestQuoter(mockReq);
Mockito.doReturn("a<b").when(mockReq).getParameter("x");
assertEquals("Test simple param quoting",

View File

@ -51,7 +51,7 @@ import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.CommonConfigurationKeys;
import org.apache.hadoop.http.HttpServer.QuotingInputFilter.RequestQuoter;
import org.apache.hadoop.http.HttpServer2.QuotingInputFilter.RequestQuoter;
import org.apache.hadoop.http.resource.JerseyResource;
import org.apache.hadoop.net.NetUtils;
import org.apache.hadoop.security.Groups;
@ -70,7 +70,7 @@ import static org.mockito.Mockito.*;
public class TestHttpServer extends HttpServerFunctionalTest {
static final Log LOG = LogFactory.getLog(TestHttpServer.class);
private static HttpServer server;
private static HttpServer2 server;
private static URL baseUrl;
private static final int MAX_THREADS = 10;
@ -150,7 +150,7 @@ public class TestHttpServer extends HttpServerFunctionalTest {
@BeforeClass public static void setup() throws Exception {
Configuration conf = new Configuration();
conf.setInt(HttpServer.HTTP_MAX_THREADS, 10);
conf.setInt(HttpServer2.HTTP_MAX_THREADS, 10);
server = createTestServer(conf);
server.addServlet("echo", "/echo", EchoServlet.class);
server.addServlet("echomap", "/echomap", EchoMapServlet.class);
@ -357,7 +357,7 @@ public class TestHttpServer extends HttpServerFunctionalTest {
Configuration conf = new Configuration();
// Authorization is disabled by default
conf.set(HttpServer.FILTER_INITIALIZER_PROPERTY,
conf.set(HttpServer2.FILTER_INITIALIZER_PROPERTY,
DummyFilterInitializer.class.getName());
conf.set(CommonConfigurationKeys.HADOOP_SECURITY_GROUP_MAPPING,
MyGroupsProvider.class.getName());
@ -366,9 +366,9 @@ public class TestHttpServer extends HttpServerFunctionalTest {
MyGroupsProvider.mapping.put("userA", Arrays.asList("groupA"));
MyGroupsProvider.mapping.put("userB", Arrays.asList("groupB"));
HttpServer myServer = new HttpServer.Builder().setName("test")
HttpServer2 myServer = new HttpServer2.Builder().setName("test")
.addEndpoint(new URI("http://localhost:0")).setFindPort(true).build();
myServer.setAttribute(HttpServer.CONF_CONTEXT_ATTRIBUTE, conf);
myServer.setAttribute(HttpServer2.CONF_CONTEXT_ATTRIBUTE, conf);
myServer.start();
String serverURL = "http://" + NetUtils.getHostPortString(myServer.getConnectorAddress(0)) + "/";
for (String servlet : new String[] { "conf", "logs", "stacks",
@ -394,7 +394,7 @@ public class TestHttpServer extends HttpServerFunctionalTest {
true);
conf.setBoolean(CommonConfigurationKeys.HADOOP_SECURITY_INSTRUMENTATION_REQUIRES_ADMIN,
true);
conf.set(HttpServer.FILTER_INITIALIZER_PROPERTY,
conf.set(HttpServer2.FILTER_INITIALIZER_PROPERTY,
DummyFilterInitializer.class.getName());
conf.set(CommonConfigurationKeys.HADOOP_SECURITY_GROUP_MAPPING,
@ -407,10 +407,10 @@ public class TestHttpServer extends HttpServerFunctionalTest {
MyGroupsProvider.mapping.put("userD", Arrays.asList("groupD"));
MyGroupsProvider.mapping.put("userE", Arrays.asList("groupE"));
HttpServer myServer = new HttpServer.Builder().setName("test")
HttpServer2 myServer = new HttpServer2.Builder().setName("test")
.addEndpoint(new URI("http://localhost:0")).setFindPort(true).setConf(conf)
.setACL(new AccessControlList("userA,userB groupC,groupD")).build();
myServer.setAttribute(HttpServer.CONF_CONTEXT_ATTRIBUTE, conf);
myServer.setAttribute(HttpServer2.CONF_CONTEXT_ATTRIBUTE, conf);
myServer.start();
String serverURL = "http://"
@ -468,39 +468,39 @@ public class TestHttpServer extends HttpServerFunctionalTest {
Configuration conf = new Configuration();
conf.setBoolean(CommonConfigurationKeys.HADOOP_SECURITY_AUTHORIZATION, false);
ServletContext context = Mockito.mock(ServletContext.class);
Mockito.when(context.getAttribute(HttpServer.CONF_CONTEXT_ATTRIBUTE)).thenReturn(conf);
Mockito.when(context.getAttribute(HttpServer.ADMINS_ACL)).thenReturn(null);
Mockito.when(context.getAttribute(HttpServer2.CONF_CONTEXT_ATTRIBUTE)).thenReturn(conf);
Mockito.when(context.getAttribute(HttpServer2.ADMINS_ACL)).thenReturn(null);
HttpServletRequest request = Mockito.mock(HttpServletRequest.class);
Mockito.when(request.getRemoteUser()).thenReturn(null);
HttpServletResponse response = Mockito.mock(HttpServletResponse.class);
//authorization OFF
Assert.assertTrue(HttpServer.hasAdministratorAccess(context, request, response));
Assert.assertTrue(HttpServer2.hasAdministratorAccess(context, request, response));
//authorization ON & user NULL
response = Mockito.mock(HttpServletResponse.class);
conf.setBoolean(CommonConfigurationKeys.HADOOP_SECURITY_AUTHORIZATION, true);
Assert.assertFalse(HttpServer.hasAdministratorAccess(context, request, response));
Assert.assertFalse(HttpServer2.hasAdministratorAccess(context, request, response));
Mockito.verify(response).sendError(Mockito.eq(HttpServletResponse.SC_UNAUTHORIZED), Mockito.anyString());
//authorization ON & user NOT NULL & ACLs NULL
response = Mockito.mock(HttpServletResponse.class);
Mockito.when(request.getRemoteUser()).thenReturn("foo");
Assert.assertTrue(HttpServer.hasAdministratorAccess(context, request, response));
Assert.assertTrue(HttpServer2.hasAdministratorAccess(context, request, response));
//authorization ON & user NOT NULL & ACLs NOT NULL & user not in ACLs
response = Mockito.mock(HttpServletResponse.class);
AccessControlList acls = Mockito.mock(AccessControlList.class);
Mockito.when(acls.isUserAllowed(Mockito.<UserGroupInformation>any())).thenReturn(false);
Mockito.when(context.getAttribute(HttpServer.ADMINS_ACL)).thenReturn(acls);
Assert.assertFalse(HttpServer.hasAdministratorAccess(context, request, response));
Mockito.when(context.getAttribute(HttpServer2.ADMINS_ACL)).thenReturn(acls);
Assert.assertFalse(HttpServer2.hasAdministratorAccess(context, request, response));
Mockito.verify(response).sendError(Mockito.eq(HttpServletResponse.SC_UNAUTHORIZED), Mockito.anyString());
//authorization ON & user NOT NULL & ACLs NOT NULL & user in ACLs
response = Mockito.mock(HttpServletResponse.class);
Mockito.when(acls.isUserAllowed(Mockito.<UserGroupInformation>any())).thenReturn(true);
Mockito.when(context.getAttribute(HttpServer.ADMINS_ACL)).thenReturn(acls);
Assert.assertTrue(HttpServer.hasAdministratorAccess(context, request, response));
Mockito.when(context.getAttribute(HttpServer2.ADMINS_ACL)).thenReturn(acls);
Assert.assertTrue(HttpServer2.hasAdministratorAccess(context, request, response));
}
@ -508,38 +508,27 @@ public class TestHttpServer extends HttpServerFunctionalTest {
public void testRequiresAuthorizationAccess() throws Exception {
Configuration conf = new Configuration();
ServletContext context = Mockito.mock(ServletContext.class);
Mockito.when(context.getAttribute(HttpServer.CONF_CONTEXT_ATTRIBUTE)).thenReturn(conf);
Mockito.when(context.getAttribute(HttpServer2.CONF_CONTEXT_ATTRIBUTE)).thenReturn(conf);
HttpServletRequest request = Mockito.mock(HttpServletRequest.class);
HttpServletResponse response = Mockito.mock(HttpServletResponse.class);
//requires admin access to instrumentation, FALSE by default
Assert.assertTrue(HttpServer.isInstrumentationAccessAllowed(context, request, response));
Assert.assertTrue(HttpServer2.isInstrumentationAccessAllowed(context, request, response));
//requires admin access to instrumentation, TRUE
conf.setBoolean(CommonConfigurationKeys.HADOOP_SECURITY_INSTRUMENTATION_REQUIRES_ADMIN, true);
conf.setBoolean(CommonConfigurationKeys.HADOOP_SECURITY_AUTHORIZATION, true);
AccessControlList acls = Mockito.mock(AccessControlList.class);
Mockito.when(acls.isUserAllowed(Mockito.<UserGroupInformation>any())).thenReturn(false);
Mockito.when(context.getAttribute(HttpServer.ADMINS_ACL)).thenReturn(acls);
Assert.assertFalse(HttpServer.isInstrumentationAccessAllowed(context, request, response));
}
@Test
@SuppressWarnings("deprecation")
public void testOldConstructor() throws Exception {
HttpServer server = new HttpServer("test", "0.0.0.0", 0, false);
try {
server.start();
} finally {
server.stop();
}
Mockito.when(context.getAttribute(HttpServer2.ADMINS_ACL)).thenReturn(acls);
Assert.assertFalse(HttpServer2.isInstrumentationAccessAllowed(context, request, response));
}
@Test public void testBindAddress() throws Exception {
checkBindAddress("localhost", 0, false).stop();
// hang onto this one for a bit more testing
HttpServer myServer = checkBindAddress("localhost", 0, false);
HttpServer myServer2 = null;
HttpServer2 myServer = checkBindAddress("localhost", 0, false);
HttpServer2 myServer2 = null;
try {
int port = myServer.getConnectorAddress(0).getPort();
// it's already in use, true = expect a higher port
@ -558,9 +547,9 @@ public class TestHttpServer extends HttpServerFunctionalTest {
}
}
private HttpServer checkBindAddress(String host, int port, boolean findPort)
private HttpServer2 checkBindAddress(String host, int port, boolean findPort)
throws Exception {
HttpServer server = createServer(host, port);
HttpServer2 server = createServer(host, port);
try {
// not bound, ephemeral should return requested port (0 for ephemeral)
List<?> listeners = (List<?>) Whitebox.getInternalState(server,
@ -608,7 +597,7 @@ public class TestHttpServer extends HttpServerFunctionalTest {
public void testHttpServerBuilderWithExternalConnector() throws Exception {
Connector c = mock(Connector.class);
doReturn("localhost").when(c).getHost();
HttpServer s = new HttpServer.Builder().setName("test").setConnector(c)
HttpServer2 s = new HttpServer2.Builder().setName("test").setConnector(c)
.build();
s.stop();
}

View File

@ -23,18 +23,18 @@ import org.junit.Test;
public class TestHttpServerLifecycle extends HttpServerFunctionalTest {
/**
* Check that a server is alive by probing the {@link HttpServer#isAlive()} method
* Check that a server is alive by probing the {@link HttpServer2#isAlive()} method
* and the text of its toString() description
* @param server server
*/
private void assertAlive(HttpServer server) {
private void assertAlive(HttpServer2 server) {
assertTrue("Server is not alive", server.isAlive());
assertToStringContains(server, HttpServer.STATE_DESCRIPTION_ALIVE);
assertToStringContains(server, HttpServer2.STATE_DESCRIPTION_ALIVE);
}
private void assertNotLive(HttpServer server) {
private void assertNotLive(HttpServer2 server) {
assertTrue("Server should not be live", !server.isAlive());
assertToStringContains(server, HttpServer.STATE_DESCRIPTION_NOT_LIVE);
assertToStringContains(server, HttpServer2.STATE_DESCRIPTION_NOT_LIVE);
}
/**
@ -43,12 +43,12 @@ public class TestHttpServerLifecycle extends HttpServerFunctionalTest {
* @throws Throwable on failure
*/
@Test public void testCreatedServerIsNotAlive() throws Throwable {
HttpServer server = createTestServer();
HttpServer2 server = createTestServer();
assertNotLive(server);
}
@Test public void testStopUnstartedServer() throws Throwable {
HttpServer server = createTestServer();
HttpServer2 server = createTestServer();
stop(server);
}
@ -59,7 +59,7 @@ public class TestHttpServerLifecycle extends HttpServerFunctionalTest {
*/
@Test
public void testStartedServerIsAlive() throws Throwable {
HttpServer server = null;
HttpServer2 server = null;
server = createTestServer();
assertNotLive(server);
server.start();
@ -78,22 +78,22 @@ public class TestHttpServerLifecycle extends HttpServerFunctionalTest {
requestLogAppender.setName("httprequestlog");
requestLogAppender.setFilename(System.getProperty("test.build.data", "/tmp/")
+ "jetty-name-yyyy_mm_dd.log");
Logger.getLogger(HttpServer.class.getName() + ".test").addAppender(requestLogAppender);
HttpServer server = null;
Logger.getLogger(HttpServer2.class.getName() + ".test").addAppender(requestLogAppender);
HttpServer2 server = null;
server = createTestServer();
assertNotLive(server);
server.start();
assertAlive(server);
stop(server);
Logger.getLogger(HttpServer.class.getName() + ".test").removeAppender(requestLogAppender);
Logger.getLogger(HttpServer2.class.getName() + ".test").removeAppender(requestLogAppender);
}
/**
* Assert that the result of {@link HttpServer#toString()} contains the specific text
* Assert that the result of {@link HttpServer2#toString()} contains the specific text
* @param server server to examine
* @param text text to search for
*/
private void assertToStringContains(HttpServer server, String text) {
private void assertToStringContains(HttpServer2 server, String text) {
String description = server.toString();
assertTrue("Did not find \"" + text + "\" in \"" + description + "\"",
description.contains(text));
@ -105,7 +105,7 @@ public class TestHttpServerLifecycle extends HttpServerFunctionalTest {
* @throws Throwable on failure
*/
@Test public void testStoppedServerIsNotAlive() throws Throwable {
HttpServer server = createAndStartTestServer();
HttpServer2 server = createAndStartTestServer();
assertAlive(server);
stop(server);
assertNotLive(server);
@ -117,7 +117,7 @@ public class TestHttpServerLifecycle extends HttpServerFunctionalTest {
* @throws Throwable on failure
*/
@Test public void testStoppingTwiceServerIsAllowed() throws Throwable {
HttpServer server = createAndStartTestServer();
HttpServer2 server = createAndStartTestServer();
assertAlive(server);
stop(server);
assertNotLive(server);
@ -133,7 +133,7 @@ public class TestHttpServerLifecycle extends HttpServerFunctionalTest {
*/
@Test
public void testWepAppContextAfterServerStop() throws Throwable {
HttpServer server = null;
HttpServer2 server = null;
String key = "test.attribute.key";
String value = "test.attribute.value";
server = createTestServer();

View File

@ -36,7 +36,7 @@ public class TestHttpServerWebapps extends HttpServerFunctionalTest {
*/
@Test
public void testValidServerResource() throws Throwable {
HttpServer server = null;
HttpServer2 server = null;
try {
server = createServer("test");
} finally {
@ -51,7 +51,7 @@ public class TestHttpServerWebapps extends HttpServerFunctionalTest {
@Test
public void testMissingServerResource() throws Throwable {
try {
HttpServer server = createServer("NoSuchWebapp");
HttpServer2 server = createServer("NoSuchWebapp");
//should not have got here.
//close the server
String serverDescription = server.toString();

View File

@ -40,7 +40,7 @@ import org.apache.hadoop.net.NetUtils;
import org.junit.Test;
public class TestPathFilter extends HttpServerFunctionalTest {
static final Log LOG = LogFactory.getLog(HttpServer.class);
static final Log LOG = LogFactory.getLog(HttpServer2.class);
static final Set<String> RECORDS = new TreeSet<String>();
/** A very simple filter that records accessed uri's */
@ -107,10 +107,10 @@ public class TestPathFilter extends HttpServerFunctionalTest {
Configuration conf = new Configuration();
//start a http server with CountingFilter
conf.set(HttpServer.FILTER_INITIALIZER_PROPERTY,
conf.set(HttpServer2.FILTER_INITIALIZER_PROPERTY,
RecordingFilter.Initializer.class.getName());
String[] pathSpecs = { "/path", "/path/*" };
HttpServer http = createTestServer(conf, pathSpecs);
HttpServer2 http = createTestServer(conf, pathSpecs);
http.start();
final String baseURL = "/path";

View File

@ -48,7 +48,7 @@ public class TestSSLHttpServer extends HttpServerFunctionalTest {
private static final Log LOG = LogFactory.getLog(TestSSLHttpServer.class);
private static Configuration conf;
private static HttpServer server;
private static HttpServer2 server;
private static URL baseUrl;
private static String keystoresDir;
private static String sslConfDir;
@ -57,7 +57,7 @@ public class TestSSLHttpServer extends HttpServerFunctionalTest {
@BeforeClass
public static void setup() throws Exception {
conf = new Configuration();
conf.setInt(HttpServer.HTTP_MAX_THREADS, 10);
conf.setInt(HttpServer2.HTTP_MAX_THREADS, 10);
File base = new File(BASEDIR);
FileUtil.fullyDelete(base);
@ -73,7 +73,7 @@ public class TestSSLHttpServer extends HttpServerFunctionalTest {
clientSslFactory = new SSLFactory(SSLFactory.Mode.CLIENT, sslConf);
clientSslFactory.init();
server = new HttpServer.Builder()
server = new HttpServer2.Builder()
.setName("test")
.addEndpoint(new URI("https://localhost"))
.setConf(conf)

View File

@ -40,7 +40,7 @@ import org.apache.hadoop.test.GenericTestUtils;
import org.junit.Test;
public class TestServletFilter extends HttpServerFunctionalTest {
static final Log LOG = LogFactory.getLog(HttpServer.class);
static final Log LOG = LogFactory.getLog(HttpServer2.class);
static volatile String uri = null;
/** A very simple filter which records the filtered uri. */
@ -105,9 +105,9 @@ public class TestServletFilter extends HttpServerFunctionalTest {
Configuration conf = new Configuration();
//start a http server with CountingFilter
conf.set(HttpServer.FILTER_INITIALIZER_PROPERTY,
conf.set(HttpServer2.FILTER_INITIALIZER_PROPERTY,
SimpleFilter.Initializer.class.getName());
HttpServer http = createTestServer(conf);
HttpServer2 http = createTestServer(conf);
http.start();
final String fsckURL = "/fsck";
@ -167,9 +167,9 @@ public class TestServletFilter extends HttpServerFunctionalTest {
public void testServletFilterWhenInitThrowsException() throws Exception {
Configuration conf = new Configuration();
// start a http server with ErrorFilter
conf.set(HttpServer.FILTER_INITIALIZER_PROPERTY,
conf.set(HttpServer2.FILTER_INITIALIZER_PROPERTY,
ErrorFilter.Initializer.class.getName());
HttpServer http = createTestServer(conf);
HttpServer2 http = createTestServer(conf);
try {
http.start();
fail("expecting exception");
@ -186,8 +186,8 @@ public class TestServletFilter extends HttpServerFunctionalTest {
public void testContextSpecificServletFilterWhenInitThrowsException()
throws Exception {
Configuration conf = new Configuration();
HttpServer http = createTestServer(conf);
HttpServer.defineFilter(http.webAppContext,
HttpServer2 http = createTestServer(conf);
HttpServer2.defineFilter(http.webAppContext,
"ErrorFilter", ErrorFilter.class.getName(),
null, null);
try {

View File

@ -24,7 +24,7 @@ import java.util.regex.Pattern;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.http.HttpServer;
import org.apache.hadoop.http.HttpServer2;
import org.apache.hadoop.http.HttpServerFunctionalTest;
import org.junit.AfterClass;
import org.junit.BeforeClass;
@ -32,7 +32,7 @@ import org.junit.Test;
public class TestJMXJsonServlet extends HttpServerFunctionalTest {
private static final Log LOG = LogFactory.getLog(TestJMXJsonServlet.class);
private static HttpServer server;
private static HttpServer2 server;
private static URL baseUrl;
@BeforeClass public static void setup() throws Exception {

View File

@ -20,7 +20,7 @@ package org.apache.hadoop.log;
import java.io.*;
import java.net.*;
import org.apache.hadoop.http.HttpServer;
import org.apache.hadoop.http.HttpServer2;
import org.apache.hadoop.net.NetUtils;
import junit.framework.TestCase;
@ -44,7 +44,7 @@ public class TestLogLevel extends TestCase {
log.error("log.error1");
assertTrue(!Level.ERROR.equals(log.getEffectiveLevel()));
HttpServer server = new HttpServer.Builder().setName("..")
HttpServer2 server = new HttpServer2.Builder().setName("..")
.addEndpoint(new URI("http://localhost:0")).setFindPort(true)
.build();

View File

@ -18,7 +18,7 @@ package org.apache.hadoop.security;
import junit.framework.TestCase;
import org.apache.hadoop.http.HttpServer;
import org.apache.hadoop.http.HttpServer2;
import org.apache.hadoop.security.authentication.server.AuthenticationFilter;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.http.FilterContainer;
@ -49,7 +49,7 @@ public class TestAuthenticationFilter extends TestCase {
AuthenticationFilterInitializer.SIGNATURE_SECRET_FILE,
secretFile.getAbsolutePath());
conf.set(HttpServer.BIND_ADDRESS, "barhost");
conf.set(HttpServer2.BIND_ADDRESS, "barhost");
FilterContainer container = Mockito.mock(FilterContainer.class);
Mockito.doAnswer(

View File

@ -331,7 +331,9 @@ public class TestSecurityUtil {
@Test
public void testSocketAddrWithIP() {
verifyServiceAddr("127.0.0.1", "127.0.0.1");
String staticHost = "127.0.0.1";
NetUtils.addStaticResolution(staticHost, "localhost");
verifyServiceAddr(staticHost, "127.0.0.1");
}
@Test

View File

@ -28,10 +28,30 @@ public class TestVersionUtil {
// Equal versions are equal.
assertEquals(0, VersionUtil.compareVersions("2.0.0", "2.0.0"));
assertEquals(0, VersionUtil.compareVersions("2.0.0a", "2.0.0a"));
assertEquals(0, VersionUtil.compareVersions("1", "1"));
assertEquals(0, VersionUtil.compareVersions(
"2.0.0-SNAPSHOT", "2.0.0-SNAPSHOT"));
assertEquals(0, VersionUtil.compareVersions("1", "1"));
assertEquals(0, VersionUtil.compareVersions("1", "1.0"));
assertEquals(0, VersionUtil.compareVersions("1", "1.0.0"));
assertEquals(0, VersionUtil.compareVersions("1.0", "1"));
assertEquals(0, VersionUtil.compareVersions("1.0", "1.0"));
assertEquals(0, VersionUtil.compareVersions("1.0", "1.0.0"));
assertEquals(0, VersionUtil.compareVersions("1.0.0", "1"));
assertEquals(0, VersionUtil.compareVersions("1.0.0", "1.0"));
assertEquals(0, VersionUtil.compareVersions("1.0.0", "1.0.0"));
assertEquals(0, VersionUtil.compareVersions("1.0.0-alpha-1", "1.0.0-a1"));
assertEquals(0, VersionUtil.compareVersions("1.0.0-alpha-2", "1.0.0-a2"));
assertEquals(0, VersionUtil.compareVersions("1.0.0-alpha1", "1.0.0-alpha-1"));
assertEquals(0, VersionUtil.compareVersions("1a0", "1.0.0-alpha-0"));
assertEquals(0, VersionUtil.compareVersions("1a0", "1-a0"));
assertEquals(0, VersionUtil.compareVersions("1.a0", "1-a0"));
assertEquals(0, VersionUtil.compareVersions("1.a0", "1.0.0-alpha-0"));
// Assert that lower versions are lower, and higher versions are higher.
assertExpectedValues("1", "2.0.0");
assertExpectedValues("1.0.0", "2");
@ -51,15 +71,27 @@ public class TestVersionUtil {
assertExpectedValues("1.0.2a", "1.0.2ab");
assertExpectedValues("1.0.0a1", "1.0.0a2");
assertExpectedValues("1.0.0a2", "1.0.0a10");
// The 'a' in "1.a" is not followed by digit, thus not treated as "alpha",
// and treated larger than "1.0", per maven's ComparableVersion class
// implementation.
assertExpectedValues("1.0", "1.a");
assertExpectedValues("1.0", "1.a0");
//The 'a' in "1.a0" is followed by digit, thus treated as "alpha-<digit>"
assertExpectedValues("1.a0", "1.0");
assertExpectedValues("1a0", "1.0");
assertExpectedValues("1.0.1-alpha-1", "1.0.1-alpha-2");
assertExpectedValues("1.0.1-beta-1", "1.0.1-beta-2");
// Snapshot builds precede their eventual releases.
assertExpectedValues("1.0-SNAPSHOT", "1.0");
assertExpectedValues("1.0", "1.0.0-SNAPSHOT");
assertExpectedValues("1.0.0-SNAPSHOT", "1.0");
assertExpectedValues("1.0.0-SNAPSHOT", "1.0.0");
assertExpectedValues("1.0.0", "1.0.1-SNAPSHOT");
assertExpectedValues("1.0.1-SNAPSHOT", "1.0.1");
assertExpectedValues("1.0.1-SNAPSHOT", "1.0.2");
assertExpectedValues("1.0.1-alpha-1", "1.0.1-SNAPSHOT");
assertExpectedValues("1.0.1-beta-1", "1.0.1-SNAPSHOT");
assertExpectedValues("1.0.1-beta-2", "1.0.1-SNAPSHOT");
}
private static void assertExpectedValues(String lower, String higher) {

View File

@ -290,6 +290,21 @@ Release 2.4.0 - UNRELEASED
INCOMPATIBLE CHANGES
NEW FEATURES
IMPROVEMENTS
HDFS-5781. Use an array to record the mapping between FSEditLogOpCode and
the corresponding byte value. (jing9)
OPTIMIZATIONS
BUG FIXES
Release 2.3.0 - UNRELEASED
INCOMPATIBLE CHANGES
NEW FEATURES
HDFS-5122. Support failover and retry in WebHdfsFileSystem for NN HA.
@ -329,6 +344,43 @@ Release 2.4.0 - UNRELEASED
IMPROVEMENTS
HDFS-5360. Improvement of usage message of renameSnapshot and
deleteSnapshot. (Shinichi Yamashita via wang)
HDFS-5331. make SnapshotDiff.java to a o.a.h.util.Tool interface implementation.
(Vinayakumar B via umamahesh)
HDFS-4657. Limit the number of blocks logged by the NN after a block
report to a configurable value. (Aaron T. Myers via Colin Patrick
McCabe)
HDFS-5344. Make LsSnapshottableDir as Tool interface implementation. (Sathish via umamahesh)
HDFS-5544. Adding Test case For Checking dfs.checksum type as NULL value. (Sathish via umamahesh)
HDFS-5568. Support includeSnapshots option with Fsck command. (Vinayakumar B via umamahesh)
HDFS-4983. Numeric usernames do not work with WebHDFS FS. (Yongjun Zhang via
jing9)
HDFS-5592. statechangeLog of completeFile should be logged only in case of success.
(Vinayakumar via umamahesh)
HDFS-5662. Can't decommission a DataNode due to file's replication factor
larger than the rest of the cluster size. (brandonli)
HDFS-5068. Convert NNThroughputBenchmark to a Tool to allow generic options.
(shv)
HDFS-5675. Add Mkdirs operation to NNThroughputBenchmark.
(Plamen Jeliazkov via shv)
HDFS-5677. Need error checking for HA cluster configuration.
(Vincent Sheffer via cos)
HDFS-5825. Use FileUtils.copyFile() to implement DFSTestUtils.copyFile().
(Haohui Mai via Arpit Agarwal)
HDFS-5267. Remove volatile from LightWeightHashSet. (Junping Du via llu)
HDFS-4278. Log an ERROR when DFS_BLOCK_ACCESS_TOKEN_ENABLE config is
@ -504,6 +556,8 @@ Release 2.4.0 - UNRELEASED
HDFS-5788. listLocatedStatus response can be very large. (Nathan Roberts
via kihwal)
HDFS-5841. Update HDFS caching documentation with new changes. (wang)
OPTIMIZATIONS
HDFS-5239. Allow FSNamesystem lock fairness to be configurable (daryn)
@ -518,6 +572,177 @@ Release 2.4.0 - UNRELEASED
BUG FIXES
HDFS-5307. Support both HTTP and HTTPS in jsp pages (Haohui Mai via
brandonli)
HDFS-5291. Standby namenode after transition to active goes into safemode.
(jing9)
HDFS-5317. Go back to DFS Home link does not work on datanode webUI
(Haohui Mai via brandonli)
HDFS-5316. Namenode ignores the default https port (Haohui Mai via
brandonli)
HDFS-5281. COMMIT request should not block. (brandonli)
HDFS-5337. should do hsync for a commit request even there is no pending
writes (brandonli)
HDFS-5335. Hive query failed with possible race in dfs output stream.
(Haohui Mai via suresh)
HDFS-5322. HDFS delegation token not found in cache errors seen on secure HA
clusters. (jing9)
HDFS-5329. Update FSNamesystem#getListing() to handle inode path in startAfter
token. (brandonli)
HDFS-5330. fix readdir and readdirplus for large directories (brandonli)
HDFS-5370. Typo in Error Message: different between range in condition
and range in error message. (Kousuke Saruta via suresh)
HDFS-5365. Fix libhdfs compile error on FreeBSD9. (Radim Kolar via cnauroth)
HDFS-5347. Add HDFS NFS user guide. (brandonli)
HDFS-5403. WebHdfs client cannot communicate with older WebHdfs servers
post HDFS-5306. (atm)
HDFS-5171. NFS should create input stream for a file and try to share it
with multiple read requests. (Haohui Mai via brandonli)
HDFS-5413. hdfs.cmd does not support passthrough to any arbitrary class.
(cnauroth)
HDFS-5433. When reloading fsimage during checkpointing, we should clear
existing snapshottable directories. (Aaron T. Myers via wang)
HDFS-5432. TestDatanodeJsp fails on Windows due to assumption that loopback
address resolves to host name localhost. (cnauroth)
HDFS-5065. TestSymlinkHdfsDisable fails on Windows. (ivanmi)
HDFS-4633 TestDFSClientExcludedNodes fails sporadically if excluded nodes
cache expires too quickly (Chris Nauroth via Sanjay)
HDFS-5037. Active NN should trigger its own edit log rolls (wang)
HDFS-5035. getFileLinkStatus and rename do not correctly check permissions
of symlinks. (Andrew Wang via Colin Patrick McCabe)
HDFS-5456. NameNode startup progress creates new steps if caller attempts to
create a counter for a step that doesn't already exist. (cnauroth)
HDFS-5458. Datanode failed volume threshold ignored if exception is thrown
in getDataDirsFromURIs. (Mike Mellenthin via wang)
HDFS-5252. Stable write is not handled correctly in someplace. (brandonli)
HDFS-5364. Add OpenFileCtx cache. (brandonli)
HDFS-5469. Add configuration property for the sub-directory export path
(brandonli)
HDFS-5519. COMMIT handler should update the commit status after sync
(brandonli)
HDFS-5372. In FSNamesystem, hasReadLock() returns false if the current thread
holds the write lock (VinayaKumar B via umamahesh)
HDFS-4516. Client crash after block allocation and NN switch before lease recovery for
the same file can cause readers to fail forever (VinaayKumar B via umamahesh)
HDFS-5014. Process register commands without holding BPOfferService lock.
(Vinaykumar B via umamahesh)
HDFS-5288. Close idle connections in portmap (Haohui Mai via brandonli)
HDFS-5407. Fix typos in DFSClientCache (Haohui Mai via brandonli)
HDFS-5548. Use ConcurrentHashMap in portmap (Haohui Mai via brandonli)
HDFS-5577. NFS user guide update (brandonli)
HDFS-5563. NFS gateway should commit the buffered data when read request comes
after write to the same file (brandonli)
HDFS-4997. libhdfs doesn't return correct error codes in most cases (cmccabe)
HDFS-5587. add debug information when NFS fails to start with duplicate user
or group names (brandonli)
HDFS-5590. Block ID and generation stamp may be reused when persistBlocks is
set to false. (jing9)
HDFS-5353. Short circuit reads fail when dfs.encrypt.data.transfer is
enabled. (Colin Patrick McCabe via jing9)
HDFS-5283. Under construction blocks only inside snapshots should not be
counted in safemode threshold. (Vinay via szetszwo)
HDFS-5257. addBlock() retry should return LocatedBlock with locations else client
will get AIOBE. (Vinay via jing9)
HDFS-5427. Not able to read deleted files from snapshot directly under
snapshottable dir after checkpoint and NN restart. (Vinay via jing9)
HDFS-5443. Delete 0-sized block when deleting an under-construction file that
is included in snapshot. (jing9)
HDFS-5476. Snapshot: clean the blocks/files/directories under a renamed
file/directory while deletion. (jing9)
HDFS-5425. Renaming underconstruction file with snapshots can make NN failure on
restart. (jing9 and Vinay)
HDFS-5474. Deletesnapshot can make Namenode in safemode on NN restarts.
(Sathish via jing9)
HDFS-5504. In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold,
leads to NN safemode. (Vinay via jing9)
HDFS-5428. Under construction files deletion after snapshot+checkpoint+nn restart
leads nn safemode. (jing9)
HDFS-5074. Allow starting up from an fsimage checkpoint in the middle of a
segment. (Todd Lipcon via atm)
HDFS-4201. NPE in BPServiceActor#sendHeartBeat. (jxiang via cmccabe)
HDFS-5666. Fix inconsistent synchronization in BPOfferService (jxiang via cmccabe)
HDFS-5657. race condition causes writeback state error in NFS gateway (brandonli)
HDFS-5661. Browsing FileSystem via web ui, should use datanode's fqdn instead of ip
address. (Benoy Antony via jing9)
HDFS-5582. hdfs getconf -excludeFile or -includeFile always failed (sathish
via cmccabe)
HDFS-5671. Fix socket leak in DFSInputStream#getBlockReader. (JamesLi via umamahesh)
HDFS-5649. Unregister NFS and Mount service when NFS gateway is shutting down.
(brandonli)
HDFS-5789. Some of snapshot APIs missing checkOperation double check in fsn. (umamahesh)
HDFS-5343. When cat command is issued on snapshot files getting unexpected result.
(Sathish via umamahesh)
HDFS-5297. Fix dead links in HDFS site documents. (Akira Ajisaka via
Arpit Agarwal)
HDFS-5830. WebHdfsFileSystem.getFileBlockLocations throws
IllegalArgumentException when accessing another cluster. (Yongjun Zhang via
Colin Patrick McCabe)
HDFS-5833. Fix SecondaryNameNode javadoc. (Bangtao Zhou via Arpit Agarwal)
HDFS-5844. Fix broken link in WebHDFS.apt.vm. (Akira Ajisaka via
Arpit Agarwal)
HDFS-5034. Remove debug prints from GetFileLinkInfo (Andrew Wang via Colin
Patrick McCabe)
@ -599,6 +824,12 @@ Release 2.4.0 - UNRELEASED
HDFS-5728. Block recovery will fail if the metafile does not have crc
for all chunks of the block (Vinay via kihwal)
HDFS-5845. SecondaryNameNode dies when checkpointing with cache pools.
(wang)
HDFS-5842. Cannot create hftp filesystem when using a proxy user ugi and a doAs
on a secure cluster. (jing9)
BREAKDOWN OF HDFS-2832 SUBTASKS AND RELATED JIRAS
HDFS-4985. Add storage type to the protocol and expose it in block report
@ -936,212 +1167,6 @@ Release 2.4.0 - UNRELEASED
HDFS-5724. modifyCacheDirective logging audit log command wrongly as
addCacheDirective (Uma Maheswara Rao G via Colin Patrick McCabe)
Release 2.3.0 - UNRELEASED
INCOMPATIBLE CHANGES
NEW FEATURES
IMPROVEMENTS
HDFS-5360. Improvement of usage message of renameSnapshot and
deleteSnapshot. (Shinichi Yamashita via wang)
HDFS-5331. make SnapshotDiff.java to a o.a.h.util.Tool interface implementation.
(Vinayakumar B via umamahesh)
HDFS-4657. Limit the number of blocks logged by the NN after a block
report to a configurable value. (Aaron T. Myers via Colin Patrick
McCabe)
HDFS-5344. Make LsSnapshottableDir as Tool interface implementation. (Sathish via umamahesh)
HDFS-5544. Adding Test case For Checking dfs.checksum type as NULL value. (Sathish via umamahesh)
HDFS-5568. Support includeSnapshots option with Fsck command. (Vinayakumar B via umamahesh)
HDFS-4983. Numeric usernames do not work with WebHDFS FS. (Yongjun Zhang via
jing9)
HDFS-5592. statechangeLog of completeFile should be logged only in case of success.
(Vinayakumar via umamahesh)
HDFS-5662. Can't decommission a DataNode due to file's replication factor
larger than the rest of the cluster size. (brandonli)
HDFS-5068. Convert NNThroughputBenchmark to a Tool to allow generic options.
(shv)
HDFS-5675. Add Mkdirs operation to NNThroughputBenchmark.
(Plamen Jeliazkov via shv)
HDFS-5677. Need error checking for HA cluster configuration.
(Vincent Sheffer via cos)
OPTIMIZATIONS
BUG FIXES
HDFS-5307. Support both HTTP and HTTPS in jsp pages (Haohui Mai via
brandonli)
HDFS-5291. Standby namenode after transition to active goes into safemode.
(jing9)
HDFS-5317. Go back to DFS Home link does not work on datanode webUI
(Haohui Mai via brandonli)
HDFS-5316. Namenode ignores the default https port (Haohui Mai via
brandonli)
HDFS-5281. COMMIT request should not block. (brandonli)
HDFS-5337. should do hsync for a commit request even there is no pending
writes (brandonli)
HDFS-5335. Hive query failed with possible race in dfs output stream.
(Haohui Mai via suresh)
HDFS-5322. HDFS delegation token not found in cache errors seen on secure HA
clusters. (jing9)
HDFS-5329. Update FSNamesystem#getListing() to handle inode path in startAfter
token. (brandonli)
HDFS-5330. fix readdir and readdirplus for large directories (brandonli)
HDFS-5370. Typo in Error Message: different between range in condition
and range in error message. (Kousuke Saruta via suresh)
HDFS-5365. Fix libhdfs compile error on FreeBSD9. (Radim Kolar via cnauroth)
HDFS-5347. Add HDFS NFS user guide. (brandonli)
HDFS-5403. WebHdfs client cannot communicate with older WebHdfs servers
post HDFS-5306. (atm)
HDFS-5171. NFS should create input stream for a file and try to share it
with multiple read requests. (Haohui Mai via brandonli)
HDFS-5413. hdfs.cmd does not support passthrough to any arbitrary class.
(cnauroth)
HDFS-5433. When reloading fsimage during checkpointing, we should clear
existing snapshottable directories. (Aaron T. Myers via wang)
HDFS-5432. TestDatanodeJsp fails on Windows due to assumption that loopback
address resolves to host name localhost. (cnauroth)
HDFS-5065. TestSymlinkHdfsDisable fails on Windows. (ivanmi)
HDFS-4633 TestDFSClientExcludedNodes fails sporadically if excluded nodes
cache expires too quickly (Chris Nauroth via Sanjay)
HDFS-5037. Active NN should trigger its own edit log rolls (wang)
HDFS-5035. getFileLinkStatus and rename do not correctly check permissions
of symlinks. (Andrew Wang via Colin Patrick McCabe)
HDFS-5456. NameNode startup progress creates new steps if caller attempts to
create a counter for a step that doesn't already exist. (cnauroth)
HDFS-5458. Datanode failed volume threshold ignored if exception is thrown
in getDataDirsFromURIs. (Mike Mellenthin via wang)
HDFS-5252. Stable write is not handled correctly in someplace. (brandonli)
HDFS-5364. Add OpenFileCtx cache. (brandonli)
HDFS-5469. Add configuration property for the sub-directory export path
(brandonli)
HDFS-5519. COMMIT handler should update the commit status after sync
(brandonli)
HDFS-5372. In FSNamesystem, hasReadLock() returns false if the current thread
holds the write lock (VinayaKumar B via umamahesh)
HDFS-4516. Client crash after block allocation and NN switch before lease recovery for
the same file can cause readers to fail forever (VinaayKumar B via umamahesh)
HDFS-5014. Process register commands without holding BPOfferService lock.
(Vinaykumar B via umamahesh)
HDFS-5288. Close idle connections in portmap (Haohui Mai via brandonli)
HDFS-5407. Fix typos in DFSClientCache (Haohui Mai via brandonli)
HDFS-5548. Use ConcurrentHashMap in portmap (Haohui Mai via brandonli)
HDFS-5577. NFS user guide update (brandonli)
HDFS-5563. NFS gateway should commit the buffered data when read request comes
after write to the same file (brandonli)
HDFS-4997. libhdfs doesn't return correct error codes in most cases (cmccabe)
HDFS-5587. add debug information when NFS fails to start with duplicate user
or group names (brandonli)
HDFS-5590. Block ID and generation stamp may be reused when persistBlocks is
set to false. (jing9)
HDFS-5353. Short circuit reads fail when dfs.encrypt.data.transfer is
enabled. (Colin Patrick McCabe via jing9)
HDFS-5283. Under construction blocks only inside snapshots should not be
counted in safemode threshold. (Vinay via szetszwo)
HDFS-5257. addBlock() retry should return LocatedBlock with locations else client
will get AIOBE. (Vinay via jing9)
HDFS-5427. Not able to read deleted files from snapshot directly under
snapshottable dir after checkpoint and NN restart. (Vinay via jing9)
HDFS-5443. Delete 0-sized block when deleting an under-construction file that
is included in snapshot. (jing9)
HDFS-5476. Snapshot: clean the blocks/files/directories under a renamed
file/directory while deletion. (jing9)
HDFS-5425. Renaming underconstruction file with snapshots can make NN failure on
restart. (jing9 and Vinay)
HDFS-5474. Deletesnapshot can make Namenode in safemode on NN restarts.
(Sathish via jing9)
HDFS-5504. In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold,
leads to NN safemode. (Vinay via jing9)
HDFS-5428. Under construction files deletion after snapshot+checkpoint+nn restart
leads nn safemode. (jing9)
HDFS-5074. Allow starting up from an fsimage checkpoint in the middle of a
segment. (Todd Lipcon via atm)
HDFS-4201. NPE in BPServiceActor#sendHeartBeat. (jxiang via cmccabe)
HDFS-5666. Fix inconsistent synchronization in BPOfferService (jxiang via cmccabe)
HDFS-5657. race condition causes writeback state error in NFS gateway (brandonli)
HDFS-5661. Browsing FileSystem via web ui, should use datanode's fqdn instead of ip
address. (Benoy Antony via jing9)
HDFS-5582. hdfs getconf -excludeFile or -includeFile always failed (sathish
via cmccabe)
HDFS-5671. Fix socket leak in DFSInputStream#getBlockReader. (JamesLi via umamahesh)
HDFS-5649. Unregister NFS and Mount service when NFS gateway is shutting down.
(brandonli)
HDFS-5789. Some of snapshot APIs missing checkOperation double check in fsn. (umamahesh)
HDFS-5343. When cat command is issued on snapshot files getting unexpected result.
(Sathish via umamahesh)
Release 2.2.0 - 2013-10-13
INCOMPATIBLE CHANGES

View File

@ -84,7 +84,7 @@ import org.apache.hadoop.hdfs.server.namenode.NameNode;
import org.apache.hadoop.hdfs.web.SWebHdfsFileSystem;
import org.apache.hadoop.hdfs.web.WebHdfsFileSystem;
import org.apache.hadoop.http.HttpConfig;
import org.apache.hadoop.http.HttpServer;
import org.apache.hadoop.http.HttpServer2;
import org.apache.hadoop.ipc.ProtobufRpcEngine;
import org.apache.hadoop.ipc.RPC;
import org.apache.hadoop.net.NetUtils;
@ -1539,7 +1539,7 @@ public class DFSUtil {
return policy;
}
public static HttpServer.Builder loadSslConfToHttpServerBuilder(HttpServer.Builder builder,
public static HttpServer2.Builder loadSslConfToHttpServerBuilder(HttpServer2.Builder builder,
Configuration sslConf) {
return builder
.needsClientAuth(
@ -1644,13 +1644,13 @@ public class DFSUtil {
* namenode can use to initialize their HTTP / HTTPS server.
*
*/
public static HttpServer.Builder httpServerTemplateForNNAndJN(
public static HttpServer2.Builder httpServerTemplateForNNAndJN(
Configuration conf, final InetSocketAddress httpAddr,
final InetSocketAddress httpsAddr, String name, String spnegoUserNameKey,
String spnegoKeytabFileKey) throws IOException {
HttpConfig.Policy policy = getHttpPolicy(conf);
HttpServer.Builder builder = new HttpServer.Builder().setName(name)
HttpServer2.Builder builder = new HttpServer2.Builder().setName(name)
.setConf(conf).setACL(new AccessControlList(conf.get(DFS_ADMIN, " ")))
.setSecurityEnabled(UserGroupInformation.isSecurityEnabled())
.setUsernameConfKey(spnegoUserNameKey)

View File

@ -98,9 +98,8 @@ public class LocatedBlock {
}
this.storageIDs = storageIDs;
this.storageTypes = storageTypes;
Preconditions.checkArgument(cachedLocs != null,
"cachedLocs should not be null, use a different constructor");
if (cachedLocs.length == 0) {
if (cachedLocs == null || cachedLocs.length == 0) {
this.cachedLocs = EMPTY_LOCS;
} else {
this.cachedLocs = cachedLocs;

View File

@ -28,7 +28,7 @@ import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSConfigKeys;
import org.apache.hadoop.hdfs.DFSUtil;
import org.apache.hadoop.hdfs.server.common.JspHelper;
import org.apache.hadoop.http.HttpServer;
import org.apache.hadoop.http.HttpServer2;
import org.apache.hadoop.net.NetUtils;
/**
@ -38,7 +38,7 @@ import org.apache.hadoop.net.NetUtils;
public class JournalNodeHttpServer {
public static final String JN_ATTRIBUTE_KEY = "localjournal";
private HttpServer httpServer;
private HttpServer2 httpServer;
private JournalNode localJournalNode;
private final Configuration conf;
@ -56,7 +56,7 @@ public class JournalNodeHttpServer {
DFSConfigKeys.DFS_JOURNALNODE_HTTPS_ADDRESS_DEFAULT);
InetSocketAddress httpsAddr = NetUtils.createSocketAddr(httpsAddrString);
HttpServer.Builder builder = DFSUtil.httpServerTemplateForNNAndJN(conf,
HttpServer2.Builder builder = DFSUtil.httpServerTemplateForNNAndJN(conf,
httpAddr, httpsAddr, "journal",
DFSConfigKeys.DFS_JOURNALNODE_INTERNAL_SPNEGO_USER_NAME_KEY,
DFSConfigKeys.DFS_JOURNALNODE_KEYTAB_FILE_KEY);

View File

@ -120,7 +120,7 @@ import org.apache.hadoop.hdfs.server.protocol.ReplicaRecoveryInfo;
import org.apache.hadoop.hdfs.web.WebHdfsFileSystem;
import org.apache.hadoop.hdfs.web.resources.Param;
import org.apache.hadoop.http.HttpConfig;
import org.apache.hadoop.http.HttpServer;
import org.apache.hadoop.http.HttpServer2;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.ReadaheadPool;
import org.apache.hadoop.io.nativeio.NativeIO;
@ -235,7 +235,7 @@ public class DataNode extends Configured
private volatile boolean heartbeatsDisabledForTests = false;
private DataStorage storage = null;
private HttpServer infoServer = null;
private HttpServer2 infoServer = null;
private int infoPort;
private int infoSecurePort;
@ -358,7 +358,7 @@ public class DataNode extends Configured
* Http Policy is decided.
*/
private void startInfoServer(Configuration conf) throws IOException {
HttpServer.Builder builder = new HttpServer.Builder().setName("datanode")
HttpServer2.Builder builder = new HttpServer2.Builder().setName("datanode")
.setConf(conf).setACL(new AccessControlList(conf.get(DFS_ADMIN, " ")));
HttpConfig.Policy policy = DFSUtil.getHttpPolicy(conf);

View File

@ -27,7 +27,7 @@ import org.apache.hadoop.hdfs.DFSConfigKeys;
import org.apache.hadoop.hdfs.DFSUtil;
import org.apache.hadoop.hdfs.server.common.HdfsServerConstants;
import org.apache.hadoop.http.HttpConfig;
import org.apache.hadoop.http.HttpServer;
import org.apache.hadoop.http.HttpServer2;
import org.apache.hadoop.security.UserGroupInformation;
import org.mortbay.jetty.Connector;
@ -119,7 +119,7 @@ public class SecureDataNodeStarter implements Daemon {
// certificates if they are communicating through SSL.
Connector listener = null;
if (policy.isHttpEnabled()) {
listener = HttpServer.createDefaultChannelConnector();
listener = HttpServer2.createDefaultChannelConnector();
InetSocketAddress infoSocAddr = DataNode.getInfoAddr(conf);
listener.setHost(infoSocAddr.getHostName());
listener.setPort(infoSocAddr.getPort());

View File

@ -195,6 +195,17 @@ public final class CacheManager {
}
/**
* Resets all tracked directives and pools. Called during 2NN checkpointing to
* reset FSNamesystem state. See {@link FSNamesystem#clear()}.
*/
void clear() {
directivesById.clear();
directivesByPath.clear();
cachePools.clear();
nextDirectiveId = 1;
}
public void startMonitorThread() {
crmLock.lock();
try {

View File

@ -69,7 +69,7 @@ public enum FSEditLogOpCodes {
OP_MODIFY_CACHE_DIRECTIVE ((byte) 39),
OP_SET_ACL ((byte) 40),
// Note that fromByte(..) depends on OP_INVALID being at the last position.
// Note that the current range of the valid OP code is 0~127
OP_INVALID ((byte) -1);
private final byte opCode;
@ -92,7 +92,22 @@ public enum FSEditLogOpCodes {
return opCode;
}
private static final FSEditLogOpCodes[] VALUES = FSEditLogOpCodes.values();
private static FSEditLogOpCodes[] VALUES;
static {
byte max = 0;
for (FSEditLogOpCodes code : FSEditLogOpCodes.values()) {
if (code.getOpCode() > max) {
max = code.getOpCode();
}
}
VALUES = new FSEditLogOpCodes[max + 1];
for (FSEditLogOpCodes code : FSEditLogOpCodes.values()) {
if (code.getOpCode() >= 0) {
VALUES[code.getOpCode()] = code;
}
}
}
/**
* Converts byte to FSEditLogOpCodes enum value
@ -101,12 +116,9 @@ public enum FSEditLogOpCodes {
* @return enum with byte value of opCode
*/
public static FSEditLogOpCodes fromByte(byte opCode) {
if (opCode == -1) {
return OP_INVALID;
}
if (opCode >= 0 && opCode < OP_INVALID.ordinal()) {
if (opCode >= 0 && opCode < VALUES.length) {
return VALUES[opCode];
}
return null;
return opCode == -1 ? OP_INVALID : null;
}
}
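For context (not part of the patch), a small sketch of how the reworked lookup behaves, assuming the opcode assignments shown in the hunk above (OP_SET_ACL is byte 40) and that the enum is visible to the caller; byte 127 is only assumed to be unassigned:
import org.apache.hadoop.hdfs.server.namenode.FSEditLogOpCodes;
public class OpCodeLookupSketch {
  public static void main(String[] args) {
    // A byte with an assigned opcode resolves through the VALUES lookup array.
    System.out.println(FSEditLogOpCodes.fromByte((byte) 40));  // OP_SET_ACL, per the hunk above
    // -1 remains special-cased.
    System.out.println(FSEditLogOpCodes.fromByte((byte) -1));  // OP_INVALID
    // Any other byte with no assigned opcode now returns null instead of
    // relying on OP_INVALID's ordinal as the upper bound.
    System.out.println(FSEditLogOpCodes.fromByte((byte) 127)); // null, assuming 127 is unassigned
  }
}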

View File

@ -544,6 +544,7 @@ public class FSNamesystem implements Namesystem, FSClusterStats,
leaseManager.removeAllLeases();
inodeId.setCurrentValue(INodeId.LAST_RESERVED_ID);
snapshotManager.clearSnapshottableDirs();
cacheManager.clear();
}
@VisibleForTesting

View File

@ -47,7 +47,7 @@ import org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics;
import org.apache.hadoop.hdfs.server.protocol.RemoteEditLog;
import org.apache.hadoop.hdfs.util.DataTransferThrottler;
import org.apache.hadoop.hdfs.util.MD5FileUtils;
import org.apache.hadoop.http.HttpServer;
import org.apache.hadoop.http.HttpServer2;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.MD5Hash;
import org.apache.hadoop.security.UserGroupInformation;
@ -287,7 +287,7 @@ public class GetImageServlet extends HttpServlet {
}
}
if (HttpServer.userHasAdministratorAccess(context, remoteUser)) {
if (HttpServer2.userHasAdministratorAccess(context, remoteUser)) {
LOG.info("GetImageServlet allowing administrator: " + remoteUser);
return true;
}

View File

@ -37,7 +37,7 @@ import org.apache.hadoop.hdfs.web.WebHdfsFileSystem;
import org.apache.hadoop.hdfs.web.resources.Param;
import org.apache.hadoop.hdfs.web.resources.UserParam;
import org.apache.hadoop.http.HttpConfig;
import org.apache.hadoop.http.HttpServer;
import org.apache.hadoop.http.HttpServer2;
import org.apache.hadoop.net.NetUtils;
import org.apache.hadoop.security.SecurityUtil;
import org.apache.hadoop.security.UserGroupInformation;
@ -47,7 +47,7 @@ import org.apache.hadoop.security.UserGroupInformation;
*/
@InterfaceAudience.Private
public class NameNodeHttpServer {
private HttpServer httpServer;
private HttpServer2 httpServer;
private final Configuration conf;
private final NameNode nn;
@ -68,7 +68,7 @@ public class NameNodeHttpServer {
}
private void initWebHdfs(Configuration conf) throws IOException {
if (WebHdfsFileSystem.isEnabled(conf, HttpServer.LOG)) {
if (WebHdfsFileSystem.isEnabled(conf, HttpServer2.LOG)) {
// set user pattern based on configuration file
UserParam.setUserPattern(conf.get(DFSConfigKeys.DFS_WEBHDFS_USER_PATTERN_KEY, DFSConfigKeys.DFS_WEBHDFS_USER_PATTERN_DEFAULT));
@ -77,9 +77,9 @@ public class NameNodeHttpServer {
final String classname = AuthFilter.class.getName();
final String pathSpec = WebHdfsFileSystem.PATH_PREFIX + "/*";
Map<String, String> params = getAuthFilterParams(conf);
HttpServer.defineFilter(httpServer.getWebAppContext(), name, classname, params,
HttpServer2.defineFilter(httpServer.getWebAppContext(), name, classname, params,
new String[]{pathSpec});
HttpServer.LOG.info("Added filter '" + name + "' (class=" + classname + ")");
HttpServer2.LOG.info("Added filter '" + name + "' (class=" + classname + ")");
// add webhdfs packages
httpServer.addJerseyResourcePackage(
@ -103,7 +103,7 @@ public class NameNodeHttpServer {
DFSConfigKeys.DFS_NAMENODE_HTTPS_ADDRESS_DEFAULT);
InetSocketAddress httpsAddr = NetUtils.createSocketAddr(httpsAddrString);
HttpServer.Builder builder = DFSUtil.httpServerTemplateForNNAndJN(conf,
HttpServer2.Builder builder = DFSUtil.httpServerTemplateForNNAndJN(conf,
httpAddr, httpsAddr, "hdfs",
DFSConfigKeys.DFS_NAMENODE_INTERNAL_SPNEGO_USER_NAME_KEY,
DFSConfigKeys.DFS_NAMENODE_KEYTAB_FILE_KEY);
@ -152,7 +152,7 @@ public class NameNodeHttpServer {
SecurityUtil.getServerPrincipal(principalInConf,
bindAddress.getHostName()));
} else if (UserGroupInformation.isSecurityEnabled()) {
HttpServer.LOG.error(
HttpServer2.LOG.error(
"WebHDFS and security are enabled, but configuration property '" +
DFSConfigKeys.DFS_WEB_AUTHENTICATION_KERBEROS_PRINCIPAL_KEY +
"' is not set.");
@ -164,7 +164,7 @@ public class NameNodeHttpServer {
DFSConfigKeys.DFS_WEB_AUTHENTICATION_KERBEROS_KEYTAB_KEY,
httpKeytab);
} else if (UserGroupInformation.isSecurityEnabled()) {
HttpServer.LOG.error(
HttpServer2.LOG.error(
"WebHDFS and security are enabled, but configuration property '" +
DFSConfigKeys.DFS_WEB_AUTHENTICATION_KERBEROS_KEYTAB_KEY +
"' is not set.");
@ -214,7 +214,7 @@ public class NameNodeHttpServer {
httpServer.setAttribute(STARTUP_PROGRESS_ATTRIBUTE_KEY, prog);
}
private static void setupServlets(HttpServer httpServer, Configuration conf) {
private static void setupServlets(HttpServer2 httpServer, Configuration conf) {
httpServer.addInternalServlet("startupProgress",
StartupProgressServlet.PATH_SPEC, StartupProgressServlet.class);
httpServer.addInternalServlet("getDelegationToken",

View File

@ -65,7 +65,7 @@ import org.apache.hadoop.hdfs.server.protocol.NamenodeProtocol;
import org.apache.hadoop.hdfs.server.protocol.RemoteEditLog;
import org.apache.hadoop.hdfs.server.protocol.RemoteEditLogManifest;
import org.apache.hadoop.http.HttpConfig;
import org.apache.hadoop.http.HttpServer;
import org.apache.hadoop.http.HttpServer2;
import org.apache.hadoop.io.MD5Hash;
import org.apache.hadoop.ipc.RemoteException;
import org.apache.hadoop.metrics2.lib.DefaultMetricsSystem;
@ -90,7 +90,7 @@ import com.google.common.collect.ImmutableList;
* The Secondary NameNode is a daemon that periodically wakes
* up (determined by the schedule specified in the configuration),
* triggers a periodic checkpoint and then goes back to sleep.
* The Secondary NameNode uses the ClientProtocol to talk to the
* The Secondary NameNode uses the NamenodeProtocol to talk to the
* primary NameNode.
*
**********************************************************/
@ -113,7 +113,7 @@ public class SecondaryNameNode implements Runnable {
private Configuration conf;
private InetSocketAddress nameNodeAddr;
private volatile boolean shouldRun;
private HttpServer infoServer;
private HttpServer2 infoServer;
private URL imageListenURL;
private Collection<URI> checkpointDirs;
@ -257,7 +257,7 @@ public class SecondaryNameNode implements Runnable {
DFSConfigKeys.DFS_NAMENODE_SECONDARY_HTTPS_ADDRESS_DEFAULT);
InetSocketAddress httpsAddr = NetUtils.createSocketAddr(httpsAddrString);
HttpServer.Builder builder = DFSUtil.httpServerTemplateForNNAndJN(conf,
HttpServer2.Builder builder = DFSUtil.httpServerTemplateForNNAndJN(conf,
httpAddr, httpsAddr, "secondary",
DFSConfigKeys.DFS_SECONDARY_NAMENODE_INTERNAL_SPNEGO_USER_NAME_KEY,
DFSConfigKeys.DFS_SECONDARY_NAMENODE_KEYTAB_FILE_KEY);
@ -1001,7 +1001,12 @@ public class SecondaryNameNode implements Runnable {
sig.mostRecentCheckpointTxId + " even though it should have " +
"just been downloaded");
}
dstNamesystem.writeLock();
try {
dstImage.reloadFromImageFile(file, dstNamesystem);
} finally {
dstNamesystem.writeUnlock();
}
dstNamesystem.dir.imageLoadComplete();
}
// error simulation code for junit test

View File

@ -620,7 +620,7 @@ public class CacheAdmin extends Configured implements Tool {
"directives being added to the pool. This can be specified in " +
"seconds, minutes, hours, and days, e.g. 120s, 30m, 4h, 2d. " +
"Valid units are [smhd]. By default, no maximum is set. " +
"This can also be manually specified by \"never\".");
"A value of \"never\" specifies that there is no limit.");
return getShortUsage() + "\n" +
"Add a new cache pool.\n\n" +
listing.toString();

View File

@ -185,8 +185,8 @@ public class DelegationTokenFetcher {
} else {
// otherwise we are fetching
if (webUrl != null) {
Credentials creds = getDTfromRemote(connectionFactory, new URI(webUrl),
renewer);
Credentials creds = getDTfromRemote(connectionFactory, new URI(
webUrl), renewer, null);
creds.writeTokenStorageFile(tokenFile, conf);
for (Token<?> token : creds.getAllTokens()) {
if(LOG.isDebugEnabled()) {
@ -213,12 +213,17 @@ public class DelegationTokenFetcher {
}
static public Credentials getDTfromRemote(URLConnectionFactory factory,
URI nnUri, String renewer) throws IOException {
URI nnUri, String renewer, String proxyUser) throws IOException {
StringBuilder buf = new StringBuilder(nnUri.toString())
.append(GetDelegationTokenServlet.PATH_SPEC);
String separator = "?";
if (renewer != null) {
buf.append("?").append(GetDelegationTokenServlet.RENEWER).append("=")
.append(renewer);
separator = "&";
}
if (proxyUser != null) {
buf.append(separator).append("doas=").append(proxyUser);
}
boolean isHttps = nnUri.getScheme().equals("https");
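As a side note, a rough sketch of the token-fetch URL this method now builds once a proxy user is supplied; the servlet path constant and all host/user names below are assumptions for illustration only, not taken from the patch:
public class DelegationTokenUrlSketch {
  public static void main(String[] args) {
    // All values below are made up for illustration.
    String nnUri = "http://nn.example.com:50070";   // NameNode HTTP address (placeholder)
    String pathSpec = "/getDelegationToken";        // assumed servlet path constant
    String renewer = "yarn";
    String proxyUser = "alice";                     // short name of the proxied (effective) user
    StringBuilder buf = new StringBuilder(nnUri).append(pathSpec);
    String separator = "?";
    if (renewer != null) {
      buf.append(separator).append("renewer=").append(renewer);
      separator = "&";
    }
    if (proxyUser != null) {
      buf.append(separator).append("doas=").append(proxyUser);
    }
    // Prints: http://nn.example.com:50070/getDelegationToken?renewer=yarn&doas=alice
    System.out.println(buf);
  }
}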

View File

@ -57,7 +57,6 @@ import org.apache.hadoop.net.NetUtils;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.security.SecurityUtil;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.authentication.client.AuthenticationException;
import org.apache.hadoop.security.token.Token;
import org.apache.hadoop.security.token.TokenIdentifier;
import org.apache.hadoop.util.Progressable;
@ -234,17 +233,23 @@ public class HftpFileSystem extends FileSystem
}
@Override
public synchronized Token<?> getDelegationToken(final String renewer
) throws IOException {
public synchronized Token<?> getDelegationToken(final String renewer)
throws IOException {
try {
//Renew TGT if needed
ugi.checkTGTAndReloginFromKeytab();
return ugi.doAs(new PrivilegedExceptionAction<Token<?>>() {
// Renew TGT if needed
UserGroupInformation connectUgi = ugi.getRealUser();
final String proxyUser = connectUgi == null ? null : ugi
.getShortUserName();
if (connectUgi == null) {
connectUgi = ugi;
}
return connectUgi.doAs(new PrivilegedExceptionAction<Token<?>>() {
@Override
public Token<?> run() throws IOException {
Credentials c;
try {
c = DelegationTokenFetcher.getDTfromRemote(connectionFactory, nnUri, renewer);
c = DelegationTokenFetcher.getDTfromRemote(connectionFactory,
nnUri, renewer, proxyUser);
} catch (IOException e) {
if (e.getCause() instanceof ConnectException) {
LOG.warn("Couldn't connect to " + nnUri +
@ -299,13 +304,13 @@ public class HftpFileSystem extends FileSystem
* @return user_shortname,group1,group2...
*/
private String getEncodedUgiParameter() {
StringBuilder ugiParamenter = new StringBuilder(
StringBuilder ugiParameter = new StringBuilder(
ServletUtil.encodeQueryValue(ugi.getShortUserName()));
for(String g: ugi.getGroupNames()) {
ugiParamenter.append(",");
ugiParamenter.append(ServletUtil.encodeQueryValue(g));
ugiParameter.append(",");
ugiParameter.append(ServletUtil.encodeQueryValue(g));
}
return ugiParamenter.toString();
return ugiParameter.toString();
}
/**
@ -675,30 +680,48 @@ public class HftpFileSystem extends FileSystem
@SuppressWarnings("unchecked")
@Override
public long renewDelegationToken(Token<?> token) throws IOException {
public long renewDelegationToken(final Token<?> token) throws IOException {
// update the kerberos credentials, if they are coming from a keytab
UserGroupInformation.getLoginUser().checkTGTAndReloginFromKeytab();
InetSocketAddress serviceAddr = SecurityUtil.getTokenServiceAddr(token);
UserGroupInformation connectUgi = ugi.getRealUser();
if (connectUgi == null) {
connectUgi = ugi;
}
try {
return connectUgi.doAs(new PrivilegedExceptionAction<Long>() {
@Override
public Long run() throws Exception {
InetSocketAddress serviceAddr = SecurityUtil
.getTokenServiceAddr(token);
return DelegationTokenFetcher.renewDelegationToken(connectionFactory,
DFSUtil.createUri(getUnderlyingProtocol(), serviceAddr),
(Token<DelegationTokenIdentifier>) token);
} catch (AuthenticationException e) {
}
});
} catch (InterruptedException e) {
throw new IOException(e);
}
}
@SuppressWarnings("unchecked")
@Override
public void cancelDelegationToken(Token<?> token) throws IOException {
// update the kerberos credentials, if they are coming from a keytab
UserGroupInformation.getLoginUser().checkTGTAndReloginFromKeytab();
InetSocketAddress serviceAddr = SecurityUtil.getTokenServiceAddr(token);
public void cancelDelegationToken(final Token<?> token) throws IOException {
UserGroupInformation connectUgi = ugi.getRealUser();
if (connectUgi == null) {
connectUgi = ugi;
}
try {
DelegationTokenFetcher.cancelDelegationToken(connectionFactory, DFSUtil
.createUri(getUnderlyingProtocol(), serviceAddr),
connectUgi.doAs(new PrivilegedExceptionAction<Void>() {
@Override
public Void run() throws Exception {
InetSocketAddress serviceAddr = SecurityUtil
.getTokenServiceAddr(token);
DelegationTokenFetcher.cancelDelegationToken(connectionFactory,
DFSUtil.createUri(getUnderlyingProtocol(), serviceAddr),
(Token<DelegationTokenIdentifier>) token);
} catch (AuthenticationException e) {
return null;
}
});
} catch (InterruptedException e) {
throw new IOException(e);
}
}

View File

@ -22,110 +22,140 @@ Centralized Cache Management in HDFS
%{toc|section=1|fromDepth=2|toDepth=4}
* {Background}
* {Overview}
Normally, HDFS relies on the operating system to cache data it reads from disk.
However, HDFS can also be configured to use centralized cache management. Under
centralized cache management, the HDFS NameNode itself decides which blocks
should be cached, and where they should be cached.
<Centralized cache management> in HDFS is an explicit caching mechanism that
allows users to specify <paths> to be cached by HDFS. The NameNode will
communicate with DataNodes that have the desired blocks on disk, and instruct
them to cache the blocks in off-heap caches.
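As an aside, a minimal sketch of what specifying a path to be cached can look like programmatically, assuming the DistributedFileSystem cache-pool and cache-directive API; the cluster URI, pool name, and path are placeholders:
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.CacheDirectiveInfo;
import org.apache.hadoop.hdfs.protocol.CachePoolInfo;
public class CacheDirectiveSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // "hdfs://nn.example.com:8020" and the pool/path names are placeholders.
    DistributedFileSystem dfs = (DistributedFileSystem)
        FileSystem.get(URI.create("hdfs://nn.example.com:8020"), conf);
    // A cache pool groups directives and carries ownership and permissions.
    dfs.addCachePool(new CachePoolInfo("sales"));
    // A directive asks the NameNode to keep this path cached on the DataNodes.
    long id = dfs.addCacheDirective(new CacheDirectiveInfo.Builder()
        .setPath(new Path("/warehouse/fact_table"))
        .setPool("sales")
        .build());
    System.out.println("added cache directive " + id);
  }
}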
Centralized cache management has several advantages. First of all, it
prevents frequently used block files from being evicted from memory. This is
particularly important when the size of the working set exceeds the size of
main memory, which is true for many big data applications. Secondly, when
HDFS decides what should be cached, it can let clients know about this
information through the getFileBlockLocations API. Finally, when the DataNode
knows a block is locked into memory, it can provide access to that block via
mmap.
Centralized cache management in HDFS has many significant advantages.
[[1]] Explicit pinning prevents frequently used data from being evicted from
memory. This is particularly important when the size of the working set
exceeds the size of main memory, which is common for many HDFS workloads.
[[1]] Because DataNode caches are managed by the NameNode, applications can
query the set of cached block locations when making task placement decisions.
Co-locating a task with a cached block replica improves read performance.
[[1]] When a block has been cached by a DataNode, clients can use a new,
more-efficient, zero-copy read API (a minimal sketch follows this list). Since
checksum verification of cached data is done once by the DataNode, clients can
incur essentially zero overhead when using this new API.
[[1]] Centralized caching can improve overall cluster memory utilization.
When relying on the OS buffer cache at each DataNode, repeated reads of
a block will result in all <n> replicas of the block being pulled into
buffer cache. With centralized cache management, a user can explicitly pin
only <m> of the <n> replicas, saving <n-m> memory.
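The following is a minimal, illustrative sketch of the zero-copy read path
mentioned in the list above. It is not part of this patch; the path
<<</cached/file>>> and the 4 MB request size are arbitrary assumptions, and it
relies on the enhanced byte-buffer read API exposed through
<<<FSDataInputStream>>>.

  import java.nio.ByteBuffer;
  import java.util.EnumSet;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FSDataInputStream;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.fs.ReadOption;
  import org.apache.hadoop.io.ElasticByteBufferPool;

  public class ZeroCopyReadSketch {
    public static void main(String[] args) throws Exception {
      Configuration conf = new Configuration();    // assumes fs.defaultFS points at HDFS
      FileSystem fs = FileSystem.get(conf);
      FSDataInputStream in = fs.open(new Path("/cached/file"));  // hypothetical cached file
      ElasticByteBufferPool pool = new ElasticByteBufferPool();
      try {
        // Request up to 4 MB; SKIP_CHECKSUMS is reasonable here because the
        // DataNode verified checksums when it cached the block.
        ByteBuffer buf = in.read(pool, 4 * 1024 * 1024,
            EnumSet.of(ReadOption.SKIP_CHECKSUMS));
        if (buf != null) {
          // ... consume buf here ...
          in.releaseBuffer(buf);   // zero-copy buffers must be returned explicitly
        }
      } finally {
        in.close();
      }
    }
  }

If a zero-copy mmap of the block is not possible, this call is expected to fall
back to an ordinary copying read using the supplied buffer pool.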
* {Use Cases}
Centralized cache management is most useful for files which are accessed very
often. For example, a "fact table" in Hive which is often used in joins is a
good candidate for caching. On the other hand, when running a classic
"word count" MapReduce job which counts the number of words in each
document, there may not be any good candidates for caching, since all the
files may be accessed exactly once.
Centralized cache management is useful for files that are accessed repeatedly.
For example, a small <fact table> in Hive which is often used for joins is a
good candidate for caching. On the other hand, caching the input of a
<one year reporting query> is probably less useful, since the
historical data might only be read once.
Centralized cache management is also useful for mixed workloads with
performance SLAs. Caching the working set of a high-priority workload
ensures that it does not contend for disk I/O with a low-priority workload.
* {Architecture}
[images/caching.png] Caching Architecture
With centralized cache management, the NameNode coordinates all caching
across the cluster. It receives cache information from each DataNode via the
cache report, a periodic message that describes all the blocks IDs cached on
a given DataNode. The NameNode will reply to DataNode heartbeat messages
with commands telling it which blocks to cache and which to uncache.
In this architecture, the NameNode is responsible for coordinating all the
DataNode off-heap caches in the cluster. The NameNode periodically receives
a <cache report> from each DataNode which describes all the blocks cached
on a given DN. The NameNode manages DataNode caches by piggybacking cache and
uncache commands on the DataNode heartbeat.
The NameNode stores a set of path cache directives, which tell it which files
to cache. The NameNode also stores a set of cache pools, which are groups of
cache directives. These directives and pools are persisted to the edit log
and fsimage, and will be loaded if the cluster is restarted.
The NameNode queries its set of <cache directives> to determine
which paths should be cached. Cache directives are persistently stored in the
fsimage and edit log, and can be added, removed, and modified via Java and
command-line APIs. The NameNode also stores a set of <cache pools>,
which are administrative entities used to group cache directives together for
resource management and enforcing permissions.
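As a rough illustration of the Java API mentioned above, the sketch below
creates a pool and a directive through <<<DistributedFileSystem>>>. The pool
name <<<hive-tables>>> and the path <<</warehouse/fact>>> are hypothetical, and
only calls that also appear in this patch's tests are used.

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.hdfs.DistributedFileSystem;
  import org.apache.hadoop.hdfs.protocol.CacheDirectiveInfo;
  import org.apache.hadoop.hdfs.protocol.CachePoolInfo;

  public class CacheDirectiveSketch {
    public static void main(String[] args) throws Exception {
      Configuration conf = new Configuration();   // assumes fs.defaultFS points at HDFS
      DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);
      // Create an administrative pool to group the directives below.
      dfs.addCachePool(new CachePoolInfo("hive-tables"));
      // Ask the NameNode to cache the files directly under /warehouse/fact.
      long directiveId = dfs.addCacheDirective(new CacheDirectiveInfo.Builder()
          .setPath(new Path("/warehouse/fact"))
          .setPool("hive-tables")
          .build());
      System.out.println("Added cache directive with id " + directiveId);
    }
  }

The equivalent <<<hdfs cacheadmin -addPool>>> and <<<-addDirective>>> commands
are documented in the command reference below.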
Periodically, the NameNode rescans the namespace, to see which blocks need to
be cached based on the current set of path cache directives. Rescans are also
triggered by relevant user actions, such as adding or removing a cache
directive or removing a cache pool.
Cache directives also may specify a numeric cache replication, which is the
number of replicas to cache. This number may be equal to or smaller than the
file's block replication. If multiple cache directives cover the same file
with different cache replication settings, then the highest cache replication
setting is applied.
The NameNode periodically rescans the namespace and active cache directives
to determine which blocks need to be cached or uncached and assign caching
work to DataNodes. Rescans can also be triggered by user actions like adding
or removing a cache directive or removing a cache pool.
We do not currently cache blocks which are under construction, corrupt, or
otherwise incomplete. If a cache directive covers a symlink, the symlink
target is not cached.
Caching is currently done on a per-file basis, although we would like to add
block-level granularity in the future.
Caching is currently done at the file or directory level. Block and sub-block
caching is an item of future work.
* {Interface}
* {Concepts}
The NameNode stores a list of "cache directives." These directives contain a
path as well as the number of times blocks in that path should be replicated.
** {Cache directive}
Paths can be either directories or files. If the path specifies a file, that
file is cached. If the path specifies a directory, all the files in the
directory will be cached. However, this process is not recursive-- only the
direct children of the directory will be cached.
A <cache directive> defines a path that should be cached. Paths can be either
directories or files. Directories are cached non-recursively, meaning only
files in the first-level listing of the directory are cached.
** {hdfs cacheadmin Shell}
Directives also specify additional parameters, such as the cache replication
factor and expiration time. The replication factor specifies the number of
block replicas to cache. If multiple cache directives refer to the same file,
the maximum cache replication factor is applied.
Path cache directives can be created by the <<<hdfs cacheadmin
-addDirective>>> command and removed via the <<<hdfs cacheadmin
-removeDirective>>> command. To list the current path cache directives, use
<<<hdfs cacheadmin -listDirectives>>>. Each path cache directive has a
unique 64-bit ID number which will not be reused if it is deleted. To remove
all path cache directives with a specified path, use <<<hdfs cacheadmin
-removeDirectives>>>.
The expiration time is specified on the command line as a <time-to-live
(TTL)>, a relative expiration time in the future. After a cache directive
expires, it is no longer considered by the NameNode when making caching
decisions.
Directives are grouped into "cache pools." Each cache pool gets a share of
the cluster's resources. Additionally, cache pools are used for
authentication. Cache pools have a mode, user, and group, similar to regular
files. The same authentication rules are applied as for normal files. So, for
example, if the mode is 0777, any user can add or remove directives from the
cache pool. If the mode is 0644, only the owner can write to the cache pool,
but anyone can read from it. And so forth.
** {Cache pool}
Cache pools are identified by name. They can be created by the <<<hdfs
cacheAdmin -addPool>>> command, modified by the <<<hdfs cacheadmin
-modifyPool>>> command, and removed via the <<<hdfs cacheadmin
-removePool>>> command. To list the current cache pools, use <<<hdfs
cacheAdmin -listPools>>>
A <cache pool> is an administrative entity used to manage groups of cache
directives. Cache pools have UNIX-like <permissions>, which restrict which
users and groups have access to the pool. Write permissions allow users to
add and remove cache directives to the pool. Read permissions allow users to
list the cache directives in a pool, as well as additional metadata. Execute
permissions are unused.
Cache pools are also used for resource management. Pools can enforce a
maximum <limit>, which restricts the number of bytes that can be cached in
aggregate by directives in the pool. Normally, the sum of the pool limits
will approximately equal the amount of aggregate memory reserved for
HDFS caching on the cluster. Cache pools also track a number of statistics
to help cluster users determine what is and should be cached.
Pools also can enforce a maximum time-to-live. This restricts the maximum
expiration time of directives being added to the pool.
* {<<<cacheadmin>>> command-line interface}
On the command-line, administrators and users can interact with cache pools
and directives via the <<<hdfs cacheadmin>>> subcommand.
Cache directives are identified by a unique, non-repeating 64-bit integer ID.
IDs will not be reused even if a cache directive is later removed.
Cache pools are identified by a unique string name.
** {Cache directive commands}
*** {addDirective}
Usage: <<<hdfs cacheadmin -addDirective -path <path> -replication <replication> -pool <pool-name> >>>
Usage: <<<hdfs cacheadmin -addDirective -path <path> -pool <pool-name> [-force] [-replication <replication>] [-ttl <time-to-live>]>>>
Add a new cache directive.
*--+--+
\<path\> | A path to cache. The path can be a directory or a file.
*--+--+
\<pool-name\> | The pool to which the directive will be added. You must have write permission on the cache pool in order to add new directives.
*--+--+
-force | Skips checking of cache pool resource limits.
*--+--+
\<replication\> | The cache replication factor to use. Defaults to 1.
*--+--+
\<pool-name\> | The pool to which the directive will be added. You must have write permission on the cache pool in order to add new directives.
\<time-to-live\> | How long the directive is valid. Can be specified in minutes, hours, and days, e.g. 30m, 4h, 2d. Valid units are [smhd]. "never" indicates a directive that never expires. If unspecified, the directive never expires.
*--+--+
*** {removeDirective}
@ -150,7 +180,7 @@ Centralized Cache Management in HDFS
*** {listDirectives}
Usage: <<<hdfs cacheadmin -listDirectives [-path <path>] [-pool <pool>] >>>
Usage: <<<hdfs cacheadmin -listDirectives [-stats] [-path <path>] [-pool <pool>]>>>
List cache directives.
@ -159,10 +189,14 @@ Centralized Cache Management in HDFS
*--+--+
\<pool\> | List only path cache directives in that pool.
*--+--+
-stats | List path-based cache directive statistics.
*--+--+
** {Cache pool commands}
*** {addPool}
Usage: <<<hdfs cacheadmin -addPool <name> [-owner <owner>] [-group <group>] [-mode <mode>] [-weight <weight>] >>>
Usage: <<<hdfs cacheadmin -addPool <name> [-owner <owner>] [-group <group>] [-mode <mode>] [-limit <limit>] [-maxTtl <maxTtl>]>>>
Add a new cache pool.
@ -175,12 +209,14 @@ Centralized Cache Management in HDFS
*--+--+
\<mode\> | UNIX-style permissions for the pool. Permissions are specified in octal, e.g. 0755. By default, this is set to 0755.
*--+--+
\<weight\> | Weight of the pool. This is a relative measure of the importance of the pool used during cache resource management. By default, it is set to 100.
\<limit\> | The maximum number of bytes that can be cached by directives in this pool, in aggregate. By default, no limit is set.
*--+--+
\<maxTtl\> | The maximum allowed time-to-live for directives being added to the pool. This can be specified in seconds, minutes, hours, and days, e.g. 120s, 30m, 4h, 2d. Valid units are [smhd]. By default, no maximum is set. A value of \"never\" specifies that there is no limit.
*--+--+
*** {modifyPool}
Usage: <<<hdfs cacheadmin -modifyPool <name> [-owner <owner>] [-group <group>] [-mode <mode>] [-weight <weight>] >>>
Usage: <<<hdfs cacheadmin -modifyPool <name> [-owner <owner>] [-group <group>] [-mode <mode>] [-limit <limit>] [-maxTtl <maxTtl>]>>>
Modifies the metadata of an existing cache pool.
@ -193,7 +229,9 @@ Centralized Cache Management in HDFS
*--+--+
\<mode\> | Unix-style permissions of the pool in octal.
*--+--+
\<weight\> | Weight of the pool.
\<limit\> | Maximum number of bytes that can be cached by this pool.
*--+--+
\<maxTtl\> | The maximum allowed time-to-live for directives being added to the pool.
*--+--+
*** {removePool}
@ -208,11 +246,13 @@ Centralized Cache Management in HDFS
*** {listPools}
Usage: <<<hdfs cacheadmin -listPools [name] >>>
Usage: <<<hdfs cacheadmin -listPools [-stats] [<name>]>>>
Display information about one or more cache pools, e.g. name, owner, group,
permissions, etc.
*--+--+
-stats | Display additional cache pool statistics.
*--+--+
\<name\> | If specified, list only the named cache pool.
*--+--+
@ -244,10 +284,12 @@ Centralized Cache Management in HDFS
* dfs.datanode.max.locked.memory
The DataNode will treat this as the maximum amount of memory it can use for
its cache. When setting this value, please remember that you will need space
in memory for other things, such as the Java virtual machine (JVM) itself
and the operating system's page cache.
This determines the maximum amount of memory a DataNode will use for caching.
The "locked-in-memory size" ulimit (<<<ulimit -l>>>) of the DataNode user
also needs to be increased to match this parameter (see the section below on
{{OS Limits}}). When setting this value, please remember that you will need
space in memory for other things as well, such as the DataNode and
application JVM heaps and the operating system page cache.
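As an illustrative fragment only, the property can also be set programmatically
from Java; the 512 MB figure below is an arbitrary assumption, and in practice
the value is normally configured in hdfs-site.xml on each DataNode.

  // Fragment; assumes org.apache.hadoop.conf.Configuration is imported.
  Configuration conf = new Configuration();
  // Let the DataNode lock up to 512 MB of memory for cached block replicas.
  // The DataNode user's locked-memory ulimit (ulimit -l) must be at least this large.
  conf.setLong("dfs.datanode.max.locked.memory", 512L * 1024 * 1024);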
*** Optional

View File

@ -19,8 +19,6 @@
HDFS Federation
\[ {{{./index.html}Go Back}} \]
%{toc|section=1|fromDepth=0}
This guide provides an overview of the HDFS Federation feature and

View File

@ -18,8 +18,6 @@
HDFS High Availability
\[ {{{./index.html}Go Back}} \]
%{toc|section=1|fromDepth=0}
* {Purpose}

View File

@ -18,8 +18,6 @@
HDFS High Availability Using the Quorum Journal Manager
\[ {{{./index.html}Go Back}} \]
%{toc|section=1|fromDepth=0}
* {Purpose}

View File

@ -20,8 +20,6 @@
Offline Edits Viewer Guide
\[ {{{./index.html}Go Back}} \]
%{toc|section=1|fromDepth=0}
* Overview

View File

@ -18,8 +18,6 @@
Offline Image Viewer Guide
\[ {{{./index.html}Go Back}} \]
%{toc|section=1|fromDepth=0}
* Overview
@ -64,9 +62,9 @@ Offline Image Viewer Guide
but no data recorded. The default record delimiter is a tab, but
this may be changed via the -delimiter command line argument. This
processor is designed to create output that is easily analyzed by
other tools, such as [36]Apache Pig. See the [37]Analyzing Results
section for further information on using this processor to analyze
the contents of fsimage files.
other tools, such as {{{http://pig.apache.org}Apache Pig}}. See
the {{Analyzing Results}} section for further information on using
this processor to analyze the contents of fsimage files.
[[4]] XML creates an XML document of the fsimage and includes all of the
information within the fsimage, similar to the lsr processor. The

View File

@ -18,8 +18,6 @@
HDFS Permissions Guide
\[ {{{./index.html}Go Back}} \]
%{toc|section=1|fromDepth=0}
* Overview
@ -55,8 +53,10 @@ HDFS Permissions Guide
* If the user name matches the owner of foo, then the owner
permissions are tested;
* Else if the group of foo matches any member of the groups list,
then the group permissions are tested;
* Otherwise the other permissions of foo are tested.
If a permissions check fails, the client operation fails.

View File

@ -18,8 +18,6 @@
HDFS Quotas Guide
\[ {{{./index.html}Go Back}} \]
%{toc|section=1|fromDepth=0}
* Overview

View File

@ -108,9 +108,11 @@ HDFS Users Guide
The following documents describe how to install and set up a Hadoop
cluster:
* {{Single Node Setup}} for first-time users.
* {{{../hadoop-common/SingleCluster.html}Single Node Setup}}
for first-time users.
* {{Cluster Setup}} for large, distributed clusters.
* {{{../hadoop-common/ClusterSetup.html}Cluster Setup}}
for large, distributed clusters.
The rest of this document assumes the user is able to set up and run a
HDFS with at least one DataNode. For the purpose of this document, both
@ -136,7 +138,8 @@ HDFS Users Guide
for a command. These commands support most of the normal files system
operations like copying files, changing file permissions, etc. It also
supports a few HDFS specific operations like changing replication of
files. For more information see {{{File System Shell Guide}}}.
files. For more information see {{{../hadoop-common/FileSystemShell.html}
File System Shell Guide}}.
** DFSAdmin Command
@ -169,7 +172,7 @@ HDFS Users Guide
of racks and datanodes attached to the tracks as viewed by the
NameNode.
For command usage, see {{{dfsadmin}}}.
For command usage, see {{{../hadoop-common/CommandsManual.html#dfsadmin}dfsadmin}}.
* Secondary NameNode
@ -203,7 +206,8 @@ HDFS Users Guide
So that the check pointed image is always ready to be read by the
primary NameNode if necessary.
For command usage, see {{{secondarynamenode}}}.
For command usage,
see {{{../hadoop-common/CommandsManual.html#secondarynamenode}secondarynamenode}}.
* Checkpoint Node
@ -245,7 +249,7 @@ HDFS Users Guide
Multiple checkpoint nodes may be specified in the cluster configuration
file.
For command usage, see {{{namenode}}}.
For command usage, see {{{../hadoop-common/CommandsManual.html#namenode}namenode}}.
* Backup Node
@ -287,7 +291,7 @@ HDFS Users Guide
For a complete discussion of the motivation behind the creation of the
Backup node and Checkpoint node, see {{{https://issues.apache.org/jira/browse/HADOOP-4539}HADOOP-4539}}.
For command usage, see {{{namenode}}}.
For command usage, see {{{../hadoop-common/CommandsManual.html#namenode}namenode}}.
* Import Checkpoint
@ -310,7 +314,7 @@ HDFS Users Guide
verifies that the image in <<<dfs.namenode.checkpoint.dir>>> is consistent,
but does not modify it in any way.
For command usage, see {{{namenode}}}.
For command usage, see {{{../hadoop-common/CommandsManual.html#namenode}namenode}}.
* Rebalancer
@ -337,7 +341,7 @@ HDFS Users Guide
A brief administrator's guide for rebalancer as a PDF is attached to
{{{https://issues.apache.org/jira/browse/HADOOP-1652}HADOOP-1652}}.
For command usage, see {{{balancer}}}.
For command usage, see {{{../hadoop-common/CommandsManual.html#balancer}balancer}}.
* Rack Awareness
@ -379,8 +383,9 @@ HDFS Users Guide
most of the recoverable failures. By default fsck ignores open files
but provides an option to select all files during reporting. The HDFS
fsck command is not a Hadoop shell command. It can be run as
<<<bin/hadoop fsck>>>. For command usage, see {{{fsck}}}. fsck can be run on the
whole file system or on a subset of files.
<<<bin/hadoop fsck>>>. For command usage, see
{{{../hadoop-common/CommandsManual.html#fsck}fsck}}. fsck can be run on
the whole file system or on a subset of files.
* fetchdt
@ -393,7 +398,8 @@ HDFS Users Guide
command. It can be run as <<<bin/hadoop fetchdt DTfile>>>. After you got
the token you can run an HDFS command without having Kerberos tickets,
by pointing <<<HADOOP_TOKEN_FILE_LOCATION>>> environmental variable to the
delegation token file. For command usage, see {{{fetchdt}}} command.
delegation token file. For command usage, see
{{{../hadoop-common/CommandsManual.html#fetchdt}fetchdt}} command.
* Recovery Mode
@ -427,10 +433,11 @@ HDFS Users Guide
let alone to restart HDFS from scratch. HDFS allows administrators to
go back to earlier version of Hadoop and rollback the cluster to the
state it was in before the upgrade. HDFS upgrade is described in more
detail in {{{Hadoop Upgrade}}} Wiki page. HDFS can have one such backup at a
time. Before upgrading, administrators need to remove existing backup
using bin/hadoop dfsadmin <<<-finalizeUpgrade>>> command. The following
briefly describes the typical upgrade procedure:
detail in the {{{http://wiki.apache.org/hadoop/Hadoop_Upgrade}Hadoop Upgrade}}
Wiki page. HDFS can have one such backup at a time. Before upgrading,
administrators need to remove the existing backup using the bin/hadoop dfsadmin
<<<-finalizeUpgrade>>> command. The following briefly describes the
typical upgrade procedure:
* Before upgrading Hadoop software, finalize if there is an existing
backup. <<<dfsadmin -upgradeProgress>>> status can tell if the cluster
@ -450,7 +457,7 @@ HDFS Users Guide
* stop the cluster and distribute earlier version of Hadoop.
* start the cluster with rollback option. (<<<bin/start-dfs.h -rollback>>>).
* start the cluster with rollback option. (<<<bin/start-dfs.sh -rollback>>>).
* File Permissions and Security
@ -465,14 +472,15 @@ HDFS Users Guide
* Scalability
Hadoop currently runs on clusters with thousands of nodes. The
{{{PoweredBy}}} Wiki page lists some of the organizations that deploy Hadoop
on large clusters. HDFS has one NameNode for each cluster. Currently
the total memory available on NameNode is the primary scalability
limitation. On very large clusters, increasing average size of files
stored in HDFS helps with increasing cluster size without increasing
memory requirements on NameNode. The default configuration may not
suit very large clusters. The {{{FAQ}}} Wiki page lists suggested
configuration improvements for large Hadoop clusters.
{{{http://wiki.apache.org/hadoop/PoweredBy}PoweredBy}} Wiki page lists
some of the organizations that deploy Hadoop on large clusters.
HDFS has one NameNode for each cluster. Currently the total memory
available on NameNode is the primary scalability limitation.
On very large clusters, increasing average size of files stored in
HDFS helps with increasing cluster size without increasing memory
requirements on NameNode. The default configuration may not suit
very large clusters. The {{{http://wiki.apache.org/hadoop/FAQ}FAQ}}
Wiki page lists suggested configuration improvements for large Hadoop clusters.
* Related Documentation
@ -481,19 +489,22 @@ HDFS Users Guide
documentation about Hadoop and HDFS. The following list is a starting
point for further exploration:
* {{{Hadoop Site}}}: The home page for the Apache Hadoop site.
* {{{http://hadoop.apache.org}Hadoop Site}}: The home page for
the Apache Hadoop site.
* {{{Hadoop Wiki}}}: The home page (FrontPage) for the Hadoop Wiki. Unlike
* {{{http://wiki.apache.org/hadoop/FrontPage}Hadoop Wiki}}:
The home page (FrontPage) for the Hadoop Wiki. Unlike
the released documentation, which is part of Hadoop source tree,
Hadoop Wiki is regularly edited by Hadoop Community.
* {{{FAQ}}}: The FAQ Wiki page.
* {{{http://wiki.apache.org/hadoop/FAQ}FAQ}}: The FAQ Wiki page.
* {{{Hadoop JavaDoc API}}}.
* {{{../../api/index.html}Hadoop JavaDoc API}}.
* {{{Hadoop User Mailing List}}}: core-user[at]hadoop.apache.org.
* Hadoop User Mailing List: user[at]hadoop.apache.org.
* Explore {{{src/hdfs/hdfs-default.xml}}}. It includes brief description of
most of the configuration variables available.
* Explore {{{./hdfs-default.xml}hdfs-default.xml}}. It includes
a brief description of most of the configuration variables available.
* {{{Hadoop Commands Guide}}}: Hadoop commands usage.
* {{{../hadoop-common/CommandsManual.html}Hadoop Commands Guide}}:
Hadoop commands usage.

View File

@ -18,8 +18,6 @@
HFTP Guide
\[ {{{./index.html}Go Back}} \]
%{toc|section=1|fromDepth=0}
* Introduction

View File

@ -19,8 +19,6 @@
HDFS Short-Circuit Local Reads
\[ {{{./index.html}Go Back}} \]
%{toc|section=1|fromDepth=0}
* {Background}

View File

@ -18,8 +18,6 @@
WebHDFS REST API
\[ {{{./index.html}Go Back}} \]
%{toc|section=1|fromDepth=0}
* {Document Conventions}
@ -54,7 +52,7 @@ WebHDFS REST API
* {{{Status of a File/Directory}<<<GETFILESTATUS>>>}}
(see {{{../../api/org/apache/hadoop/fs/FileSystem.html}FileSystem}}.getFileStatus)
* {{<<<LISTSTATUS>>>}}
* {{{List a Directory}<<<LISTSTATUS>>>}}
(see {{{../../api/org/apache/hadoop/fs/FileSystem.html}FileSystem}}.listStatus)
* {{{Get Content Summary of a Directory}<<<GETCONTENTSUMMARY>>>}}
@ -109,7 +107,7 @@ WebHDFS REST API
* {{{Append to a File}<<<APPEND>>>}}
(see {{{../../api/org/apache/hadoop/fs/FileSystem.html}FileSystem}}.append)
* {{{Concatenate Files}<<<CONCAT>>>}}
* {{{Concat File(s)}<<<CONCAT>>>}}
(see {{{../../api/org/apache/hadoop/fs/FileSystem.html}FileSystem}}.concat)
* HTTP DELETE
@ -871,7 +869,7 @@ Content-Length: 0
* {Error Responses}
When an operation fails, the server may throw an exception.
The JSON schema of error responses is defined in {{<<<RemoteException>>> JSON schema}}.
The JSON schema of error responses is defined in {{RemoteException JSON Schema}}.
The table below shows the mapping from exceptions to HTTP response codes.
** {HTTP Response Codes}
@ -1119,7 +1117,7 @@ Transfer-Encoding: chunked
See also:
{{{FileStatus Properties}<<<FileStatus>>> Properties}},
{{{Status of a File/Directory}<<<GETFILESTATUS>>>}},
{{{../../api/org/apache/hadoop/fs/FileStatus}FileStatus}}
{{{../../api/org/apache/hadoop/fs/FileStatus.html}FileStatus}}
*** {FileStatus Properties}
@ -1232,7 +1230,7 @@ var fileStatusProperties =
See also:
{{{FileStatus Properties}<<<FileStatus>>> Properties}},
{{{List a Directory}<<<LISTSTATUS>>>}},
{{{../../api/org/apache/hadoop/fs/FileStatus}FileStatus}}
{{{../../api/org/apache/hadoop/fs/FileStatus.html}FileStatus}}
** {Long JSON Schema}
@ -1275,7 +1273,7 @@ var fileStatusProperties =
See also:
{{{Get Home Directory}<<<GETHOMEDIRECTORY>>>}},
{{{../../api/org/apache/hadoop/fs/Path}Path}}
{{{../../api/org/apache/hadoop/fs/Path.html}Path}}
** {RemoteException JSON Schema}

View File

@ -22,6 +22,7 @@ import com.google.common.base.Charsets;
import com.google.common.base.Joiner;
import com.google.common.collect.Lists;
import org.apache.commons.io.FileUtils;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.conf.Configuration;
@ -891,21 +892,7 @@ public class DFSTestUtil {
/** Copy one file's contents into the other **/
public static void copyFile(File src, File dest) throws IOException {
InputStream in = null;
OutputStream out = null;
try {
in = new FileInputStream(src);
out = new FileOutputStream(dest);
byte [] b = new byte[1024];
while( in.read(b) > 0 ) {
out.write(b);
}
} finally {
if(in != null) in.close();
if(out != null) out.close();
}
FileUtils.copyFile(src, dest);
}
public static class Builder {

View File

@ -118,6 +118,20 @@ public class TestDFSUtil {
assertEquals(0, bs.length);
}
/**
* Test constructing LocatedBlock with null cachedLocs
*/
@Test
public void testLocatedBlockConstructorWithNullCachedLocs() {
DatanodeInfo d = DFSTestUtil.getLocalDatanodeInfo();
DatanodeInfo[] ds = new DatanodeInfo[1];
ds[0] = d;
ExtendedBlock b1 = new ExtendedBlock("bpid", 1, 1, 1);
LocatedBlock l1 = new LocatedBlock(b1, ds, null, null, 0, false, null);
final DatanodeInfo[] cachedLocs = l1.getCachedLocations();
assertTrue(cachedLocs.length == 0);
}
private Configuration setupAddress(String key) {
HdfsConfiguration conf = new HdfsConfiguration();

View File

@ -69,6 +69,7 @@ import org.apache.hadoop.hdfs.protocol.CachePoolInfo;
import org.apache.hadoop.hdfs.protocol.CachePoolStats;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;
import org.apache.hadoop.hdfs.protocol.HdfsConstants.DatanodeReportType;
import org.apache.hadoop.hdfs.protocol.HdfsConstants.SafeModeAction;
import org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor;
import org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor.CachedBlocksList.Type;
import org.apache.hadoop.hdfs.server.datanode.DataNode;
@ -528,6 +529,13 @@ public class TestCacheDirectives {
@Test(timeout=60000)
public void testCacheManagerRestart() throws Exception {
SecondaryNameNode secondary = null;
try {
// Start a secondary namenode
conf.set(DFSConfigKeys.DFS_NAMENODE_SECONDARY_HTTP_ADDRESS_KEY,
"0.0.0.0:0");
secondary = new SecondaryNameNode(conf);
// Create and validate a pool
final String pool = "poolparty";
String groupName = "partygroup";
@ -570,6 +578,28 @@ public class TestCacheDirectives {
}
assertFalse("Unexpected # of cache directives found", dit.hasNext());
// Checkpoint once to set some cache pools and directives on 2NN side
secondary.doCheckpoint();
// Add some more CacheManager state
final String imagePool = "imagePool";
dfs.addCachePool(new CachePoolInfo(imagePool));
prevId = dfs.addCacheDirective(new CacheDirectiveInfo.Builder()
.setPath(new Path("/image")).setPool(imagePool).build());
// Save a new image to force a fresh fsimage download
dfs.setSafeMode(SafeModeAction.SAFEMODE_ENTER);
dfs.saveNamespace();
dfs.setSafeMode(SafeModeAction.SAFEMODE_LEAVE);
// Checkpoint again forcing a reload of FSN state
boolean fetchImage = secondary.doCheckpoint();
assertTrue("Secondary should have fetched a new fsimage from NameNode",
fetchImage);
// Remove temp pool and directive
dfs.removeCachePool(imagePool);
// Restart namenode
cluster.restartNameNode();
@ -599,6 +629,11 @@ public class TestCacheDirectives {
new CacheDirectiveInfo.Builder().
setPath(new Path("/foobar")).setPool(pool).build());
assertEquals(prevId + 1, nextId);
} finally {
if (secondary != null) {
secondary.shutdown();
}
}
}
/**

View File

@ -1634,7 +1634,7 @@ public class TestCheckpoint {
* Test that the secondary namenode correctly deletes temporary edits
* on startup.
*/
@Test(timeout = 30000)
@Test(timeout = 60000)
public void testDeleteTemporaryEditsOnStartup() throws IOException {
Configuration conf = new HdfsConfiguration();
SecondaryNameNode secondary = null;

View File

@ -28,7 +28,7 @@ import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSConfigKeys;
import org.apache.hadoop.hdfs.DFSUtil;
import org.apache.hadoop.hdfs.HdfsConfiguration;
import org.apache.hadoop.http.HttpServer;
import org.apache.hadoop.http.HttpServer2;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.authentication.util.KerberosName;
import org.apache.hadoop.security.authorize.AccessControlList;
@ -66,7 +66,7 @@ public class TestGetImageServlet {
AccessControlList acls = Mockito.mock(AccessControlList.class);
Mockito.when(acls.isUserAllowed(Mockito.<UserGroupInformation>any())).thenReturn(false);
ServletContext context = Mockito.mock(ServletContext.class);
Mockito.when(context.getAttribute(HttpServer.ADMINS_ACL)).thenReturn(acls);
Mockito.when(context.getAttribute(HttpServer2.ADMINS_ACL)).thenReturn(acls);
// Make sure that NN2 is considered a valid fsimage/edits requestor.
assertTrue(GetImageServlet.isValidRequestor(context,

View File

@ -37,7 +37,7 @@ import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSUtil;
import org.apache.hadoop.hdfs.HdfsConfiguration;
import org.apache.hadoop.hdfs.MiniDFSCluster;
import org.apache.hadoop.http.HttpServer;
import org.apache.hadoop.http.HttpServer2;
import org.apache.hadoop.http.HttpServerFunctionalTest;
import org.apache.hadoop.test.PathUtils;
import org.apache.hadoop.util.StringUtils;
@ -119,7 +119,7 @@ public class TestTransferFsImage {
*/
@Test(timeout = 5000)
public void testImageTransferTimeout() throws Exception {
HttpServer testServer = HttpServerFunctionalTest.createServer("hdfs");
HttpServer2 testServer = HttpServerFunctionalTest.createServer("hdfs");
try {
testServer.addServlet("GetImage", "/getimage", TestGetImageServlet.class);
testServer.start();

View File

@ -58,7 +58,7 @@ import org.apache.hadoop.hdfs.server.namenode.INodeDirectory;
import org.apache.hadoop.hdfs.server.namenode.INodeFile;
import org.apache.hadoop.hdfs.server.namenode.LeaseManager;
import org.apache.hadoop.hdfs.server.namenode.NameNode;
import org.apache.hadoop.http.HttpServer;
import org.apache.hadoop.http.HttpServer2;
import org.apache.hadoop.ipc.ProtobufRpcEngine.Server;
import org.apache.hadoop.metrics2.impl.MetricsSystemImpl;
import org.apache.hadoop.security.UserGroupInformation;
@ -89,7 +89,7 @@ public class SnapshotTestHelper {
setLevel2OFF(LogFactory.getLog(MetricsSystemImpl.class));
setLevel2OFF(DataBlockScanner.LOG);
setLevel2OFF(HttpServer.LOG);
setLevel2OFF(HttpServer2.LOG);
setLevel2OFF(DataNode.LOG);
setLevel2OFF(BlockPoolSliceStorage.LOG);
setLevel2OFF(LeaseManager.LOG);

View File

@ -206,6 +206,12 @@ Release 2.4.0 - UNRELEASED
MAPREDUCE-5725. Make explicit that TestNetworkedJob relies on the Capacity
Scheduler (Sandy Ryza)
MAPREDUCE-5464. Add analogs of the SLOTS_MILLIS counters that jive with the
YARN resource model (Sandy Ryza)
MAPREDUCE-5732. Report proper queue when job has been automatically placed
(Sandy Ryza)
OPTIMIZATIONS
MAPREDUCE-5484. YarnChild unnecessarily loads job conf twice (Sandy Ryza)

View File

@ -526,6 +526,12 @@ public class JobHistoryEventHandler extends AbstractService
mi.getJobIndexInfo().setJobStartTime(jie.getLaunchTime());
}
if (event.getHistoryEvent().getEventType() == EventType.JOB_QUEUE_CHANGED) {
JobQueueChangeEvent jQueueEvent =
(JobQueueChangeEvent) event.getHistoryEvent();
mi.getJobIndexInfo().setQueueName(jQueueEvent.getJobQueueName());
}
// If this is JobFinishedEvent, close the writer and setup the job-index
if (event.getHistoryEvent().getEventType() == EventType.JOB_FINISHED) {
try {

View File

@ -39,7 +39,7 @@ import org.apache.hadoop.security.authorize.AccessControlList;
/**
* Main interface to interact with the job. Provides only getters.
* Main interface to interact with the job.
*/
public interface Job {
@ -98,4 +98,6 @@ public interface Job {
List<AMInfo> getAMInfos();
boolean checkAccess(UserGroupInformation callerUGI, JobACL jobOperation);
public void setQueueName(String queueName);
}

View File

@ -59,6 +59,7 @@ import org.apache.hadoop.mapreduce.jobhistory.JobHistoryEvent;
import org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.TaskInfo;
import org.apache.hadoop.mapreduce.jobhistory.JobInfoChangeEvent;
import org.apache.hadoop.mapreduce.jobhistory.JobInitedEvent;
import org.apache.hadoop.mapreduce.jobhistory.JobQueueChangeEvent;
import org.apache.hadoop.mapreduce.jobhistory.JobSubmittedEvent;
import org.apache.hadoop.mapreduce.jobhistory.JobUnsuccessfulCompletionEvent;
import org.apache.hadoop.mapreduce.lib.chain.ChainMapper;
@ -181,7 +182,7 @@ public class JobImpl implements org.apache.hadoop.mapreduce.v2.app.job.Job,
private final EventHandler eventHandler;
private final MRAppMetrics metrics;
private final String userName;
private final String queueName;
private String queueName;
private final long appSubmitTime;
private final AppContext appContext;
@ -1123,6 +1124,13 @@ public class JobImpl implements org.apache.hadoop.mapreduce.v2.app.job.Job,
return queueName;
}
@Override
public void setQueueName(String queueName) {
this.queueName = queueName;
JobQueueChangeEvent jqce = new JobQueueChangeEvent(oldJobId, queueName);
eventHandler.handle(new JobHistoryEvent(jobId, jqce));
}
/*
* (non-Javadoc)
* @see org.apache.hadoop.mapreduce.v2.app.job.Job#getConfFile()

View File

@ -1266,34 +1266,40 @@ public abstract class TaskAttemptImpl implements
}
}
private static long computeSlotMillis(TaskAttemptImpl taskAttempt) {
private static void updateMillisCounters(JobCounterUpdateEvent jce,
TaskAttemptImpl taskAttempt) {
TaskType taskType = taskAttempt.getID().getTaskId().getTaskType();
int slotMemoryReq =
long duration = (taskAttempt.getFinishTime() - taskAttempt.getLaunchTime());
int mbRequired =
taskAttempt.getMemoryRequired(taskAttempt.conf, taskType);
int vcoresRequired = taskAttempt.getCpuRequired(taskAttempt.conf, taskType);
int minSlotMemSize = taskAttempt.conf.getInt(
YarnConfiguration.RM_SCHEDULER_MINIMUM_ALLOCATION_MB,
YarnConfiguration.DEFAULT_RM_SCHEDULER_MINIMUM_ALLOCATION_MB);
int simSlotsRequired =
minSlotMemSize == 0 ? 0 : (int) Math.ceil((float) slotMemoryReq
minSlotMemSize == 0 ? 0 : (int) Math.ceil((float) mbRequired
/ minSlotMemSize);
long slotMillisIncrement =
simSlotsRequired
* (taskAttempt.getFinishTime() - taskAttempt.getLaunchTime());
return slotMillisIncrement;
if (taskType == TaskType.MAP) {
jce.addCounterUpdate(JobCounter.SLOTS_MILLIS_MAPS, simSlotsRequired * duration);
jce.addCounterUpdate(JobCounter.MB_MILLIS_MAPS, duration * mbRequired);
jce.addCounterUpdate(JobCounter.VCORES_MILLIS_MAPS, duration * vcoresRequired);
jce.addCounterUpdate(JobCounter.MILLIS_MAPS, duration);
} else {
jce.addCounterUpdate(JobCounter.SLOTS_MILLIS_REDUCES, simSlotsRequired * duration);
jce.addCounterUpdate(JobCounter.MB_MILLIS_REDUCES, duration * mbRequired);
jce.addCounterUpdate(JobCounter.VCORES_MILLIS_REDUCES, duration * vcoresRequired);
jce.addCounterUpdate(JobCounter.MILLIS_REDUCES, duration);
}
}
private static JobCounterUpdateEvent createJobCounterUpdateEventTASucceeded(
TaskAttemptImpl taskAttempt) {
long slotMillis = computeSlotMillis(taskAttempt);
TaskId taskId = taskAttempt.attemptId.getTaskId();
JobCounterUpdateEvent jce = new JobCounterUpdateEvent(taskId.getJobId());
jce.addCounterUpdate(
taskId.getTaskType() == TaskType.MAP ?
JobCounter.SLOTS_MILLIS_MAPS : JobCounter.SLOTS_MILLIS_REDUCES,
slotMillis);
updateMillisCounters(jce, taskAttempt);
return jce;
}
@ -1302,20 +1308,13 @@ public abstract class TaskAttemptImpl implements
TaskType taskType = taskAttempt.getID().getTaskId().getTaskType();
JobCounterUpdateEvent jce = new JobCounterUpdateEvent(taskAttempt.getID().getTaskId().getJobId());
long slotMillisIncrement = computeSlotMillis(taskAttempt);
if (taskType == TaskType.MAP) {
jce.addCounterUpdate(JobCounter.NUM_FAILED_MAPS, 1);
if(!taskAlreadyCompleted) {
// dont double count the elapsed time
jce.addCounterUpdate(JobCounter.SLOTS_MILLIS_MAPS, slotMillisIncrement);
}
} else {
jce.addCounterUpdate(JobCounter.NUM_FAILED_REDUCES, 1);
if(!taskAlreadyCompleted) {
// dont double count the elapsed time
jce.addCounterUpdate(JobCounter.SLOTS_MILLIS_REDUCES, slotMillisIncrement);
}
if (!taskAlreadyCompleted) {
updateMillisCounters(jce, taskAttempt);
}
return jce;
}
@ -1325,20 +1324,13 @@ public abstract class TaskAttemptImpl implements
TaskType taskType = taskAttempt.getID().getTaskId().getTaskType();
JobCounterUpdateEvent jce = new JobCounterUpdateEvent(taskAttempt.getID().getTaskId().getJobId());
long slotMillisIncrement = computeSlotMillis(taskAttempt);
if (taskType == TaskType.MAP) {
jce.addCounterUpdate(JobCounter.NUM_KILLED_MAPS, 1);
if(!taskAlreadyCompleted) {
// dont double count the elapsed time
jce.addCounterUpdate(JobCounter.SLOTS_MILLIS_MAPS, slotMillisIncrement);
}
} else {
jce.addCounterUpdate(JobCounter.NUM_KILLED_REDUCES, 1);
if(!taskAlreadyCompleted) {
// dont double count the elapsed time
jce.addCounterUpdate(JobCounter.SLOTS_MILLIS_REDUCES, slotMillisIncrement);
}
if (!taskAlreadyCompleted) {
updateMillisCounters(jce, taskAttempt);
}
return jce;
}

View File

@ -109,11 +109,11 @@ public abstract class RMCommunicator extends AbstractService
@Override
protected void serviceStart() throws Exception {
scheduler= createSchedulerProxy();
register();
startAllocatorThread();
JobID id = TypeConverter.fromYarn(this.applicationId);
JobId jobId = TypeConverter.toYarn(id);
job = context.getJob(jobId);
register();
startAllocatorThread();
super.serviceStart();
}
@ -161,6 +161,9 @@ public abstract class RMCommunicator extends AbstractService
}
this.applicationACLs = response.getApplicationACLs();
LOG.info("maxContainerCapability: " + maxContainerCapability.getMemory());
String queue = response.getQueue();
LOG.info("queue: " + queue);
job.setQueueName(queue);
} catch (Exception are) {
LOG.error("Exception while registering", are);
throw new YarnRuntimeException(are);

View File

@ -82,6 +82,15 @@ public class TestEvents {
}
@Test(timeout = 10000)
public void testJobQueueChange() throws Exception {
org.apache.hadoop.mapreduce.JobID jid = new JobID("001", 1);
JobQueueChangeEvent test = new JobQueueChangeEvent(jid,
"newqueue");
assertEquals(test.getJobId().toString(), jid.toString());
assertEquals(test.getJobQueueName(), "newqueue");
}
/**
* simple test TaskUpdatedEvent and TaskUpdated
*

View File

@ -118,6 +118,9 @@ public class MRApp extends MRAppMaster {
private Path testAbsPath;
private ClusterInfo clusterInfo;
// Queue to pretend the RM assigned us
private String assignedQueue;
public static String NM_HOST = "localhost";
public static int NM_PORT = 1234;
public static int NM_HTTP_PORT = 8042;
@ -133,7 +136,7 @@ public class MRApp extends MRAppMaster {
public MRApp(int maps, int reduces, boolean autoComplete, String testName,
boolean cleanOnStart, Clock clock) {
this(maps, reduces, autoComplete, testName, cleanOnStart, 1, clock);
this(maps, reduces, autoComplete, testName, cleanOnStart, 1, clock, null);
}
public MRApp(int maps, int reduces, boolean autoComplete, String testName,
@ -147,6 +150,12 @@ public class MRApp extends MRAppMaster {
this(maps, reduces, autoComplete, testName, cleanOnStart, 1);
}
public MRApp(int maps, int reduces, boolean autoComplete, String testName,
boolean cleanOnStart, String assignedQueue) {
this(maps, reduces, autoComplete, testName, cleanOnStart, 1,
new SystemClock(), assignedQueue);
}
public MRApp(int maps, int reduces, boolean autoComplete, String testName,
boolean cleanOnStart, boolean unregistered) {
this(maps, reduces, autoComplete, testName, cleanOnStart, 1, unregistered);
@ -178,7 +187,7 @@ public class MRApp extends MRAppMaster {
public MRApp(int maps, int reduces, boolean autoComplete, String testName,
boolean cleanOnStart, int startCount) {
this(maps, reduces, autoComplete, testName, cleanOnStart, startCount,
new SystemClock());
new SystemClock(), null);
}
public MRApp(int maps, int reduces, boolean autoComplete, String testName,
@ -191,33 +200,34 @@ public class MRApp extends MRAppMaster {
boolean cleanOnStart, int startCount, Clock clock, boolean unregistered) {
this(getApplicationAttemptId(applicationId, startCount), getContainerId(
applicationId, startCount), maps, reduces, autoComplete, testName,
cleanOnStart, startCount, clock, unregistered);
cleanOnStart, startCount, clock, unregistered, null);
}
public MRApp(int maps, int reduces, boolean autoComplete, String testName,
boolean cleanOnStart, int startCount, Clock clock) {
boolean cleanOnStart, int startCount, Clock clock, String assignedQueue) {
this(getApplicationAttemptId(applicationId, startCount), getContainerId(
applicationId, startCount), maps, reduces, autoComplete, testName,
cleanOnStart, startCount, clock, true);
cleanOnStart, startCount, clock, true, assignedQueue);
}
public MRApp(ApplicationAttemptId appAttemptId, ContainerId amContainerId,
int maps, int reduces, boolean autoComplete, String testName,
boolean cleanOnStart, int startCount, boolean unregistered) {
this(appAttemptId, amContainerId, maps, reduces, autoComplete, testName,
cleanOnStart, startCount, new SystemClock(), unregistered);
cleanOnStart, startCount, new SystemClock(), unregistered, null);
}
public MRApp(ApplicationAttemptId appAttemptId, ContainerId amContainerId,
int maps, int reduces, boolean autoComplete, String testName,
boolean cleanOnStart, int startCount) {
this(appAttemptId, amContainerId, maps, reduces, autoComplete, testName,
cleanOnStart, startCount, new SystemClock(), true);
cleanOnStart, startCount, new SystemClock(), true, null);
}
public MRApp(ApplicationAttemptId appAttemptId, ContainerId amContainerId,
int maps, int reduces, boolean autoComplete, String testName,
boolean cleanOnStart, int startCount, Clock clock, boolean unregistered) {
boolean cleanOnStart, int startCount, Clock clock, boolean unregistered,
String assignedQueue) {
super(appAttemptId, amContainerId, NM_HOST, NM_PORT, NM_HTTP_PORT, clock, System
.currentTimeMillis(), MRJobConfig.DEFAULT_MR_AM_MAX_ATTEMPTS);
this.testWorkDir = new File("target", testName);
@ -239,6 +249,7 @@ public class MRApp extends MRAppMaster {
// If safeToReportTerminationToUser is set to true, we can verify whether
// the job can reaches the final state when MRAppMaster shuts down.
this.successfullyUnregistered.set(unregistered);
this.assignedQueue = assignedQueue;
}
@Override
@ -285,6 +296,9 @@ public class MRApp extends MRAppMaster {
start();
DefaultMetricsSystem.shutdown();
Job job = getContext().getAllJobs().values().iterator().next();
if (assignedQueue != null) {
job.setQueueName(assignedQueue);
}
// Write job.xml
String jobFile = MRApps.getJobFile(conf, user,

View File

@ -39,6 +39,7 @@ public class MockAppContext implements AppContext {
final Map<JobId, Job> jobs;
final long startTime = System.currentTimeMillis();
Set<String> blacklistedNodes;
String queue;
public MockAppContext(int appid) {
appID = MockJobs.newAppID(appid);

View File

@ -629,6 +629,11 @@ public class MockJobs extends MockApps {
jobConf.addResource(fc.open(configFile), configFile.toString());
return jobConf;
}
@Override
public void setQueueName(String queueName) {
// do nothing
}
};
}

View File

@ -37,7 +37,7 @@ import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.http.HttpServer;
import org.apache.hadoop.http.HttpServer2;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.JobContext;
import org.apache.hadoop.mapreduce.MRJobConfig;
@ -199,7 +199,7 @@ public class TestJobEndNotifier extends JobEndNotifier {
@Test
public void testNotificationOnLastRetryNormalShutdown() throws Exception {
HttpServer server = startHttpServer();
HttpServer2 server = startHttpServer();
// Act like it is the second attempt. Default max attempts is 2
MRApp app = spy(new MRAppWithCustomContainerAllocator(
2, 2, true, this.getClass().getName(), true, 2, true));
@ -223,7 +223,7 @@ public class TestJobEndNotifier extends JobEndNotifier {
@Test
public void testAbsentNotificationOnNotLastRetryUnregistrationFailure()
throws Exception {
HttpServer server = startHttpServer();
HttpServer2 server = startHttpServer();
MRApp app = spy(new MRAppWithCustomContainerAllocator(2, 2, false,
this.getClass().getName(), true, 1, false));
doNothing().when(app).sysexit();
@ -250,7 +250,7 @@ public class TestJobEndNotifier extends JobEndNotifier {
@Test
public void testNotificationOnLastRetryUnregistrationFailure()
throws Exception {
HttpServer server = startHttpServer();
HttpServer2 server = startHttpServer();
MRApp app = spy(new MRAppWithCustomContainerAllocator(2, 2, false,
this.getClass().getName(), true, 2, false));
doNothing().when(app).sysexit();
@ -274,10 +274,10 @@ public class TestJobEndNotifier extends JobEndNotifier {
server.stop();
}
private static HttpServer startHttpServer() throws Exception {
private static HttpServer2 startHttpServer() throws Exception {
new File(System.getProperty(
"build.webapps", "build/webapps") + "/test").mkdirs();
HttpServer server = new HttpServer.Builder().setName("test")
HttpServer2 server = new HttpServer2.Builder().setName("test")
.addEndpoint(URI.create("http://localhost:0"))
.setFindPort(true).build();
server.addServlet("jobend", "/jobend", JobEndServlet.class);

View File

@ -505,6 +505,11 @@ public class TestRuntimeEstimators {
public Configuration loadConfFile() {
throw new UnsupportedOperationException();
}
@Override
public void setQueueName(String queueName) {
// do nothing
}
}
/*

View File

@ -41,6 +41,7 @@ import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RawLocalFileSystem;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapTaskAttemptImpl;
import org.apache.hadoop.mapreduce.Counters;
import org.apache.hadoop.mapreduce.JobCounter;
import org.apache.hadoop.mapreduce.MRJobConfig;
import org.apache.hadoop.mapreduce.jobhistory.JobHistoryEvent;
@ -182,13 +183,13 @@ public class TestTaskAttempt{
}
@Test
public void testSlotMillisCounterUpdate() throws Exception {
verifySlotMillis(2048, 2048, 1024);
verifySlotMillis(2048, 1024, 1024);
verifySlotMillis(10240, 1024, 2048);
public void testMillisCountersUpdate() throws Exception {
verifyMillisCounters(2048, 2048, 1024);
verifyMillisCounters(2048, 1024, 1024);
verifyMillisCounters(10240, 1024, 2048);
}
public void verifySlotMillis(int mapMemMb, int reduceMemMb,
public void verifyMillisCounters(int mapMemMb, int reduceMemMb,
int minContainerSize) throws Exception {
Clock actualClock = new SystemClock();
ControlledClock clock = new ControlledClock(actualClock);
@ -232,13 +233,23 @@ public class TestTaskAttempt{
Assert.assertEquals(mta.getLaunchTime(), 10);
Assert.assertEquals(rta.getFinishTime(), 11);
Assert.assertEquals(rta.getLaunchTime(), 10);
Counters counters = job.getAllCounters();
Assert.assertEquals((int) Math.ceil((float) mapMemMb / minContainerSize),
job.getAllCounters().findCounter(JobCounter.SLOTS_MILLIS_MAPS)
.getValue());
Assert.assertEquals(
(int) Math.ceil((float) reduceMemMb / minContainerSize), job
.getAllCounters().findCounter(JobCounter.SLOTS_MILLIS_REDUCES)
.getValue());
counters.findCounter(JobCounter.SLOTS_MILLIS_MAPS).getValue());
Assert.assertEquals((int) Math.ceil((float) reduceMemMb / minContainerSize),
counters.findCounter(JobCounter.SLOTS_MILLIS_REDUCES).getValue());
Assert.assertEquals(1,
counters.findCounter(JobCounter.MILLIS_MAPS).getValue());
Assert.assertEquals(1,
counters.findCounter(JobCounter.MILLIS_REDUCES).getValue());
Assert.assertEquals(mapMemMb,
counters.findCounter(JobCounter.MB_MILLIS_MAPS).getValue());
Assert.assertEquals(reduceMemMb,
counters.findCounter(JobCounter.MB_MILLIS_REDUCES).getValue());
Assert.assertEquals(1,
counters.findCounter(JobCounter.VCORES_MILLIS_MAPS).getValue());
Assert.assertEquals(1,
counters.findCounter(JobCounter.VCORES_MILLIS_REDUCES).getValue());
}
private TaskAttemptImpl createMapTaskAttemptImplForTest(

View File

@ -122,6 +122,13 @@
]
},
{"type": "record", "name": "JobQueueChange",
"fields": [
{"name": "jobid", "type": "string"},
{"name": "jobQueueName", "type": "string"}
]
},
{"type": "record", "name": "JobUnsuccessfulCompletion",
"fields": [
{"name": "jobid", "type": "string"},
@ -267,6 +274,7 @@
"JOB_FINISHED",
"JOB_PRIORITY_CHANGED",
"JOB_STATUS_CHANGED",
"JOB_QUEUE_CHANGED",
"JOB_FAILED",
"JOB_KILLED",
"JOB_ERROR",
@ -306,6 +314,7 @@
"JobInited",
"AMStarted",
"JobPriorityChange",
"JobQueueChange",
"JobStatusChanged",
"JobSubmitted",
"JobUnsuccessfulCompletion",

View File

@ -49,5 +49,11 @@ public enum JobCounter {
TASKS_REQ_PREEMPT,
CHECKPOINTS,
CHECKPOINT_BYTES,
CHECKPOINT_TIME
CHECKPOINT_TIME,
MILLIS_MAPS,
MILLIS_REDUCES,
VCORES_MILLIS_MAPS,
VCORES_MILLIS_REDUCES,
MB_MILLIS_MAPS,
MB_MILLIS_REDUCES
}

View File

@ -98,6 +98,8 @@ public class EventReader implements Closeable {
result = new JobFinishedEvent(); break;
case JOB_PRIORITY_CHANGED:
result = new JobPriorityChangeEvent(); break;
case JOB_QUEUE_CHANGED:
result = new JobQueueChangeEvent(); break;
case JOB_STATUS_CHANGED:
result = new JobStatusChangedEvent(); break;
case JOB_FAILED:

View File

@ -183,6 +183,9 @@ public class JobHistoryParser implements HistoryEventHandler {
case JOB_PRIORITY_CHANGED:
handleJobPriorityChangeEvent((JobPriorityChangeEvent) event);
break;
case JOB_QUEUE_CHANGED:
handleJobQueueChangeEvent((JobQueueChangeEvent) event);
break;
case JOB_FAILED:
case JOB_KILLED:
case JOB_ERROR:
@ -386,6 +389,10 @@ public class JobHistoryParser implements HistoryEventHandler {
info.priority = event.getPriority();
}
private void handleJobQueueChangeEvent(JobQueueChangeEvent event) {
info.jobQueueName = event.getJobQueueName();
}
private void handleJobInitedEvent(JobInitedEvent event) {
info.launchTime = event.getLaunchTime();
info.totalMaps = event.getTotalMaps();

View File

@ -0,0 +1,63 @@
/**
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.hadoop.mapreduce.jobhistory;
import org.apache.avro.util.Utf8;
import org.apache.hadoop.mapreduce.JobID;
@SuppressWarnings("deprecation")
public class JobQueueChangeEvent implements HistoryEvent {
private JobQueueChange datum = new JobQueueChange();
public JobQueueChangeEvent(JobID id, String queueName) {
datum.jobid = new Utf8(id.toString());
datum.jobQueueName = new Utf8(queueName);
}
JobQueueChangeEvent() { }
@Override
public EventType getEventType() {
return EventType.JOB_QUEUE_CHANGED;
}
@Override
public Object getDatum() {
return datum;
}
@Override
public void setDatum(Object datum) {
this.datum = (JobQueueChange) datum;
}
/** Get the Job ID */
public JobID getJobId() {
return JobID.forName(datum.jobid.toString());
}
/** Get the new Job queue name */
public String getJobQueueName() {
if (datum.jobQueueName != null) {
return datum.jobQueueName.toString();
}
return null;
}
}

View File

@ -25,6 +25,12 @@ DATA_LOCAL_MAPS.name= Data-local map tasks
RACK_LOCAL_MAPS.name= Rack-local map tasks
SLOTS_MILLIS_MAPS.name= Total time spent by all maps in occupied slots (ms)
SLOTS_MILLIS_REDUCES.name= Total time spent by all reduces in occupied slots (ms)
MILLIS_MAPS.name= Total time spent by all map tasks (ms)
MILLIS_REDUCES.name= Total time spent by all reduce tasks (ms)
MB_MILLIS_MAPS.name= Total megabyte-seconds taken by all map tasks
MB_MILLIS_REDUCES.name= Total megabyte-seconds taken by all reduce tasks
VCORES_MILLIS_MAPS.name= Total vcore-seconds taken by all map tasks
VCORES_MILLIS_REDUCES.name= Total vcore-seconds taken by all reduce tasks
FALLOW_SLOTS_MILLIS_MAPS.name= Total time spent by all maps waiting after reserving slots (ms)
FALLOW_SLOTS_MILLIS_REDUCES.name= Total time spent by all reduces waiting after reserving slots (ms)
TASKS_REQ_PREEMPT.name= Tasks that have been asked to preempt

View File

@ -34,10 +34,10 @@ import javax.servlet.http.HttpServletResponse;
import junit.framework.TestCase;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.http.HttpServer;
import org.apache.hadoop.http.HttpServer2;
public class TestJobEndNotifier extends TestCase {
HttpServer server;
HttpServer2 server;
URL baseUrl;
@SuppressWarnings("serial")
@ -102,7 +102,7 @@ public class TestJobEndNotifier extends TestCase {
public void setUp() throws Exception {
new File(System.getProperty("build.webapps", "build/webapps") + "/test"
).mkdirs();
server = new HttpServer.Builder().setName("test")
server = new HttpServer2.Builder().setName("test")
.addEndpoint(URI.create("http://localhost:0"))
.setFindPort(true).build();
server.addServlet("delay", "/delay", DelayServlet.class);

View File

@ -453,4 +453,9 @@ public class CompletedJob implements org.apache.hadoop.mapreduce.v2.app.job.Job
}
return amInfos;
}
@Override
public void setQueueName(String queueName) {
throw new UnsupportedOperationException("Can't set job's queue name in history");
}
}

View File

@ -191,4 +191,9 @@ public class PartialJob implements org.apache.hadoop.mapreduce.v2.app.job.Job {
return null;
}
@Override
public void setQueueName(String queueName) {
throw new UnsupportedOperationException("Can't set job's queue name in history");
}
}

View File

@ -156,6 +156,41 @@ public class TestJobHistoryEvents {
services[services.length - 1].getName());
}
@Test
public void testAssignedQueue() throws Exception {
Configuration conf = new Configuration();
MRApp app = new MRAppWithHistory(2, 1, true, this.getClass().getName(),
true, "assignedQueue");
app.submit(conf);
Job job = app.getContext().getAllJobs().values().iterator().next();
JobId jobId = job.getID();
LOG.info("JOBID is " + TypeConverter.fromYarn(jobId).toString());
app.waitForState(job, JobState.SUCCEEDED);
//make sure all events are flushed
app.waitForState(Service.STATE.STOPPED);
/*
* Use HistoryContext to read logged events and verify the number of
* completed maps
*/
HistoryContext context = new JobHistory();
// test start and stop states
((JobHistory)context).init(conf);
((JobHistory)context).start();
Assert.assertTrue( context.getStartTime()>0);
Assert.assertEquals(((JobHistory)context).getServiceState(),Service.STATE.STARTED);
// get job before stopping JobHistory
Job parsedJob = context.getJob(jobId);
// stop JobHistory
((JobHistory)context).stop();
Assert.assertEquals(((JobHistory)context).getServiceState(),Service.STATE.STOPPED);
Assert.assertEquals("QueueName not correct", "assignedQueue",
parsedJob.getQueueName());
}
private void verifyTask(Task task) {
Assert.assertEquals("Task state not currect", TaskState.SUCCEEDED,
task.getState());
@ -184,6 +219,11 @@ public class TestJobHistoryEvents {
super(maps, reduces, autoComplete, testName, cleanOnStart);
}
public MRAppWithHistory(int maps, int reduces, boolean autoComplete,
String testName, boolean cleanOnStart, String assignedQueue) {
super(maps, reduces, autoComplete, testName, cleanOnStart, assignedQueue);
}
@Override
protected EventHandler<JobHistoryEvent> createJobHistoryHandler(
AppContext context) {

View File

@ -415,5 +415,9 @@ public class TestHsWebServicesAcls {
return aclsMgr.checkAccess(callerUGI, jobOperation,
this.getUserName(), jobAcls.get(jobOperation));
}
@Override
public void setQueueName(String queueName) {
}
}
}

Some files were not shown because too many files have changed in this diff.